Function X tweet -
🧠 This new paper from researchers at @GoogleDeepMind demonstrates that "models fine-tuned on weaker & cheaper generated data consistently outperform those trained on stronger & more-expensive generated data across multiple benchmarks"
What does this mean?
When training models, a more diverse dataset (even if parts of it are lower quality) is more valuable than a smaller but near-perfect one. For the same generation budget, a weaker, cheaper model can produce far more examples, so the breadth and variety of the training signal go up.
In other words, don't obsess too much about 100% perfect data quality!
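A tiny back-of-the-envelope sketch of that budget trade-off (the costs and model names below are made-up placeholders, not numbers from the paper):

```python
# Sketch of the weak-but-cheap vs. strong-but-expensive trade-off described above.
# All costs and model names are hypothetical placeholders, not figures from the paper.

generation_budget = 10_000  # total budget for synthetic data, in arbitrary cost units

cost_per_example = {
    "weak_cheap_model": 1,         # hypothetical: 1 unit per generated example
    "strong_expensive_model": 10,  # hypothetical: 10x pricier per generated example
}

for model, cost in cost_per_example.items():
    n_examples = generation_budget // cost
    print(f"{model}: {n_examples} synthetic training examples for the same budget")

# Same budget, ~10x more examples from the cheaper model -> a broader, more varied
# fine-tuning set, which is the breadth-over-polish point the tweet is making.
```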