A small Chinese AI company called DeepSeek has surprised many by sharing how it built its advanced AI model, R1. The company was founded by Liang Wenfeng, who also runs a trading fund named High-Flyer. DeepSeek explained in detail how it created a large language model that can learn and improve on its own without human help. This is notable because big U.S. companies such as OpenAI and Google DeepMind have led this field but often keep their methods secret.

Liang drew on his experience at High-Flyer, where he used Nvidia chips to analyze stock markets, to help build DeepSeek’s AI. Some people did not take him seriously at first, believing that only big companies could do this work, but his team became very good at using these chips efficiently. After the U.S. restricted the export of powerful Nvidia chips to China, DeepSeek found ways to get more out of the chips it already had, even though they were not the newest models.
DeepSeek focuses mainly on research and shares its findings openly, unlike many companies that keep their discoveries private for business reasons. Liang funds DeepSeek using money from his trading fund, allowing him to pay high salaries to attract top talent. The company employs graduates from leading Chinese universities and has offices in Hangzhou and Beijing.
DeepSeek says it used 2,048 Nvidia H800 chips and spent $5.6 million to train a model with 671 billion parameters, less than companies such as OpenAI and Google have spent on similar models. Some experts believe this shows that, with the right knowledge, building advanced AI can be more affordable.
However, it is uncertain whether DeepSeek can keep up as the AI industry grows. Big U.S. companies are investing heavily in new technology, which could widen the gap between them and competitors like DeepSeek. For now, DeepSeek has enough resources, but this might change as the industry advances.