Elon Musk, the billionaire founder of AI startup xAI, believes the industry is in crisis because the data available for training models is almost exhausted.
Musk agrees with AI experts who say there is little real-world data left to train models on.
"Right now, we've basically exhausted the total amount of human knowledge in training AI. That happened basically last year," Musk said during a conversation with Mark Penn, head of the marketing group Stagwell.
Musk suggested that synthetic data, meaning data created by AI models themselves, could be used to train models in the future.
"The only way to supplement [real-world data] is with synthetic data, where the AI creates [training data]. With synthetic data, the AI will sort of grade itself and go through this process of self-learning," Musk added.
As TechCrunch writes, this echoes recent statements by OpenAI co-founder Ilya Sutskever, who said at the NeurIPS conference that the lack of training data will force AI developers to change the way they build models.
Microsoft's Phi-4 model was trained on synthetic data along with real-world data. So were Google's Gemma models and Anthropic's Claude 3.5 Sonnet. Meta has improved its latest Llama series of models using AI-generated data.
Training on synthetic data has its advantages, such as lower costs. But there are also disadvantages. Some research suggests that synthetic data can lead to model collapse, in which a model becomes less creative and more biased in its outputs.