Training
Data Augmentation
Quick Answer
Techniques for creating variations of training data to improve model robustness and generalization.
Data augmentation creates new training examples by modifying existing ones. For text, common techniques include paraphrasing, back-translation (translating a sentence into another language and back), and light noise injection (word deletion, swapping, or synonym replacement). By increasing the volume and diversity of training data, augmentation improves robustness to input variations and reduces overfitting. The key constraint is label preservation: a transformation must not change the example's correct label (e.g., deleting the word "not" can flip a sentiment label). Well-designed augmentation improves generalization. It is less critical with large pretrained models, which already see diverse data during pretraining, but it remains valuable, especially in low-resource settings.
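To make the idea concrete, here is a minimal sketch of noise-based text augmentation using random word deletion and random word swap (two of the light-noise operations mentioned above). Function names and parameter values are illustrative, not from any particular library; real pipelines would also filter out variants that break the label.

```python
import random


def random_deletion(words, p=0.1, rng=None):
    """Drop each word with probability p; always keep at least one word."""
    rng = rng or random.Random(0)
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]


def random_swap(words, n_swaps=1, rng=None):
    """Swap n_swaps randomly chosen pairs of word positions."""
    rng = rng or random.Random(0)
    words = list(words)
    for _ in range(n_swaps):
        i = rng.randrange(len(words))
        j = rng.randrange(len(words))
        words[i], words[j] = words[j], words[i]
    return words


def augment(sentence, n_variants=3, seed=0):
    """Generate noisy, label-preserving variants of a sentence."""
    rng = random.Random(seed)
    words = sentence.split()
    variants = []
    for _ in range(n_variants):
        op = rng.choice([random_deletion, random_swap])
        variants.append(" ".join(op(words, rng=rng)))
    return variants


variants = augment("the quick brown fox jumps over the lazy dog", n_variants=3, seed=42)
for v in variants:
    print(v)
```

Each call produces slightly perturbed copies of the input; pairing these with the original label multiplies the effective training set size while keeping the meaning largely intact.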
Last verified: 2026-04-08