DeepSeek | China's New AI Model Destroys American ChatGPT

A Chinese AI chatbot named DeepSeek R1 has disrupted the tech world by outperforming top AI models like ChatGPT o1, Llama, and Gemini Advanced. Developed by Chinese entrepreneur Liang Wenfeng, DeepSeek is cost-effective, efficient, and free to use. Its 'Chain of Thought' model enables advanced reasoning and logical answers. Despite criticism of its censorship and allegations that it copied OpenAI's models, DeepSeek's open-source nature allows customization. Its innovation in using fewer resources and older chips has challenged American tech giants, leading to a $1 trillion market drop. DeepSeek's success highlights the potential for AI advancements globally and motivates individuals to upskill in AI technology.

Hello, friends! A Chinese AI, DeepSeek, has shaken the world. American tech companies and the US stock market have been shocked to an extent they could never have imagined. "The release of DeepSeek AI from a Chinese company should be a wake up call." On 20th January 2025, a Chinese research lab launched its AI chatbot.

It is named DeepSeek R1. Along with it, they published a research paper stating that this chatbot is much better than the most advanced chatbots in the world on benchmarks like math and reasoning. That includes OpenAI's ChatGPT o1 model, Meta's Llama, and Google's Gemini Advanced; it has overtaken them all. DeepSeek is the best in terms of performance, it's the most efficient, and it took only a fraction of the time and money to develop. But the most amazing thing is that it is completely free for all of us to use. OpenAI, on the other hand, charges $200 per month for its ChatGPT o1 Pro model. Not only that, the cost of training DeepSeek is reported to have been only $5.6 million.




Meanwhile, American companies, whether OpenAI, Meta, or Google, are spending billions of dollars to develop their artificial intelligence models. Coincidentally, a while back, OpenAI's founder Sam Altman was asked whether foundational AI models like ChatGPT could be built by Indians in India. "But if we wanna build foundational models, how should we think about that? With, you know, not a $100 million, but let's say $10 million. Could we actually build something truly substantial?" He answered arrogantly that no one could do it except them.


People could try, but he said that it would be 'hopeless.' "It's totally hopeless to compete with us in training foundational models. You should try and like, it's your job to try anyway." Today, Sam may be the one feeling hopeless, because just a week after its launch, DeepSeek became the most downloaded app in the US on the App Store and the Google Play Store, leaving ChatGPT behind. The next day, it became the number one app in India and other countries as well. And by the 27th of January, it had disrupted the American financial markets. "The launch of DeepSeek, which is a Chinese-built chatbot, immediately rattled investors and wiped out a staggering $1 trillion off the US tech index." Before DeepSeek was launched, the most valuable company in the world was NVIDIA, valued at $3.5 trillion. But in just one day, it dropped to $2.9 trillion. NVIDIA is a chip company that specialises in making the computer chips used to train and run AI. Its shares fell 17% in a single day, wiping out $589 billion in valuation, the biggest single-day loss in market value any company has ever suffered.


The Nasdaq, the benchmark index for US tech companies, dropped 3.1%. But why did NVIDIA specifically suffer such a big loss? We'll get to that interesting reason later in this video. But before that, let's look at the story behind DeepSeek. Where did this AI come from, and why has it shocked the entire world? The credit for developing DeepSeek goes to a 40-year-old Chinese entrepreneur, Liang Wenfeng. Look at his face carefully, because you will hardly see him anywhere. He rarely makes public appearances and prefers to stay out of the spotlight. There's not much information about his life available to the public. But we do know that in 2015, he founded a hedge fund called High-Flyer, which used mathematics and AI to make investments. In 2019,

    

he founded High-Flyer AI to research artificial intelligence algorithms. In May 2023, he used the money he was earning from his hedge fund to start a side project: creating an AI model. Liang said he wanted to build his own AI model that would be better than all the existing AI models in the world, simply out of scientific curiosity. He did not aim to make money or earn huge profits. When he started gathering his research team, he did not hire engineers; instead, he hired PhD students from China's top universities. To train his AI model, he used some of the most difficult questions in the world. And within just two years, after spending only a few million dollars, he launched his DeepSeek R1 model. "It's super impressive." "I think we should take the development out of China very very seriously." "DeepSeek just taught us that the answer is" "less than people thought." "We don't need as much cash as we once thought." Only 200 people were involved in making it, and 95% of them were under 30 years old. Compare this to a company like OpenAI, which has more than 3,500 employees. Today, DeepSeek is China's only AI firm that is not funded by tech giants like Baidu, Alibaba, or ByteDance. Its architecture is quite interesting to understand. DeepSeek is a Chain of Thought model, just like OpenAI's ChatGPT o1 model, which is a Chain of Thought model too.

The naming can get quite confusing, because ChatGPT has a weird naming system for its models and the names look quite similar to each other. So, for a basic understanding: the first publicly launched version was ChatGPT, released in November 2022 and based on GPT-3.5. Then in March 2023, they launched GPT-4, which was much better than its predecessor. In May 2024, we got GPT-4o, a multimodal version of GPT-4. Not only could you chat with it using text, you could actually speak to it, send it photos for analysis, and even have it generate photos for you. That's why it's called multimodal. Then on 12th September 2024, OpenAI launched its o1 model, the first model based on 'Chain of Thought.' When you asked it questions, it tried to 'think' while answering. By 'thinking,' we mean that before giving you an answer, it asks itself counter-questions to check whether the answer it has generated could be improved. It analyses each answer from different angles before responding. This is known as 'Chain of Thought,' and it attempts to mimic the human reasoning process.

For example, if you ask ChatGPT-4o, "9.11 and 9.9, which one is bigger?", it answers without thinking too much that 9.11 is bigger than 9.9, which is wrong. But when you ask the same question to ChatGPT o1, it thinks first. Before giving you its answer, it questions itself about whether the answer is correct. In my case, it kept thinking for 18 seconds, but the final answer was correct: 9.9 is actually bigger. If you ask the same thing to DeepSeek, you will get the right answer. The older AI models had this huge shortcoming, and the Chain of Thought process has now removed it.
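If you want to see the difference programmatically, here is a minimal sketch of asking both kinds of models the same question through an OpenAI-compatible chat client. The DeepSeek endpoint URL, the API key handling, and the model names shown here are assumptions for illustration; check each provider's documentation for the current values.

```python
# A minimal sketch: asking the same question to a direct model and a
# reasoning ("Chain of Thought") model through an OpenAI-compatible client.
# The base_url and model names below are assumptions for illustration.
from openai import OpenAI

QUESTION = "9.11 and 9.9, which one is bigger?"

# Direct model: answers immediately, which is where mistakes like
# "9.11 > 9.9" tend to come from.
openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
fast = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": QUESTION}],
)
print("GPT-4o:", fast.choices[0].message.content)

# Reasoning model: DeepSeek's API is reported to be OpenAI-compatible,
# so the same client can point at it (assumed endpoint and model name).
deepseek_client = OpenAI(
    base_url="https://api.deepseek.com",  # assumption
    api_key="YOUR_DEEPSEEK_API_KEY",
)
slow = deepseek_client.chat.completions.create(
    model="deepseek-reasoner",  # assumption: the R1 chain-of-thought model
    messages=[{"role": "user", "content": QUESTION}],
)
print("DeepSeek R1:", slow.choices[0].message.content)
```

The point is not the specific model names but the pattern: the direct model replies immediately, while the reasoning model spends time on its chain of thought before the final answer arrives.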


This is why models like o1 and DeepSeek are being called Advanced Reasoning Models. They are much better at answering questions logically. Another positive aspect of DeepSeek is that it shows you exactly what it's thinking, step by step and in great detail, before answering. For example, I asked it to give me a truly original insight about humans. It starts thinking, "Okay, let me think of something that makes humans unique." And then it counters itself: "Hmm, but those are pretty well known. The user wants something original. Maybe I should dig deeper into how these traits interact in unexpected ways." If you are following along with its answer, you can pause the video at any point to read it. You will notice that it tries to answer the question from different angles. You can watch this process play out on screen: how long it thinks, and the numerous possibilities it considers before giving you the final answer. After a lot of thinking, it finally answered that humans are the only species to use storytelling as a cognitive exoskeleton. But when I asked the same question to ChatGPT-4, it answered instantly; it did not think much and gave the first answer it came up with. Similarly, if you ask these older chatbots to select a random number, they will instantly pick a number for you. But if you ask a Chain of Thought model to do this, it will think for a long time about this prompt too.


    
Look at this screen recording. Poor DeepSeek was stuck thinking about which number to choose. Should it give me a commonly used number? Or should it generate a number between 0 and 1? It takes 42 as an example, but it's worried that it might be a cliché. It considers 17 and 73. Then it wonders whether it needs to select a number between 1 and 100, or whether it should go beyond that range, because the user wants a completely random number. It's worried that no matter what it chooses, it won't be truly random. It's just like a person who overthinks. But when it finally answers, it adds that it chose to keep the range from 1 to 100, and then picked the number 42. By the way, if you want more in-depth knowledge about AI chatbots and want to upskill yourself in this field, my 8.5-hour-long course can help you master them. The first two chapters focus on theoretical knowledge, and the remaining six on practical advice: for students, employees, business owners, and even how you can use it at home. In the latest lessons, we learn about ChatGPT Vision, image generation with DALL·E 3, advanced data analytics, and creating your own GPTs. This 8.5-hour course teaches you everything you need to know from A to Z. And now you can download DeepSeek and learn to use it to its full potential through this course.

                


Because functionally speaking, all AI chatbots work the same way. Regular updates will keep being added to this course, but you need to purchase it only once, and all future updates will always be free for you. If you're interested, use the link given in the description below, or scan this QR code, and use the coupon code DEEP46 to get 46% off. Do check it out, and now let's get back to our topic. Since DeepSeek's release, its biggest downside is said to be its censorship. If you ask any critical questions related to the Chinese government and politics, such as "What happened in 1989 in Tiananmen Square in China?", "What are the biggest criticisms against Xi Jinping?", "Is Taiwan an independent country?", or



"Why is China criticised for human rights abuses?", DeepSeek will give you the same answer: "Sorry, I am not sure how to approach this type of question yet. Let's talk about math, coding, and logic problems, instead!" But if you ask DeepSeek to criticise any other world leader, whether it is Joe Biden, Donald Trump, or Putin, it answers in great detail. In fact, the AI models developed in China are asked around 70,000 questions as a test, to check whether they give "safe" answers to politically controversial questions. It is after this process that these Chinese AI models end up unable to answer such questions. China's chief internet regulator, the Cyberspace Administration of China, conducts these tests. Now, some people say that we should boycott it completely because its answers are full of Chinese propaganda. But friends, the important point here is that DeepSeek is open-source software. Its code and model weights are publicly available, and anyone can download them. So, one way to use DeepSeek is to download its app from the App Store or the Google Play Store, and the other is to download the model itself and run the AI locally on your own computer.
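For the second route, the full R1 model needs serious hardware, but DeepSeek also published much smaller distilled versions that can run on an ordinary machine. Here is a minimal sketch using the Hugging Face transformers library; the exact model ID is an assumption, so verify it on the Hugging Face hub before running.

```python
# A minimal sketch of running a DeepSeek model locally with Hugging Face
# transformers. The full R1 model is far too large for a laptop, so this
# uses one of the small distilled checkpoints published alongside it
# (model ID assumed; verify it on the Hugging Face hub).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumption

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" needs the accelerate package installed.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Chat-style prompt; the tokenizer's chat template formats it for the model.
messages = [{"role": "user", "content": "Which is bigger, 9.11 or 9.9?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning models like this typically emit their thinking before the answer.
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Running it this way means the answers are generated entirely on your own machine, which is also what lets companies strip out behaviour they don't want, as described next.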



By running it locally like this, you can change and modify it yourself according to your use case. American companies have already begun doing this. Perplexity AI, for example, downloaded the DeepSeek R1 model, removed the censorship, and now you can use the R1 model inside Perplexity. Microsoft did this too. Look at this news article from 29th January: the DeepSeek R1 model is now available on Azure AI Foundry, and they have announced that soon you will be able to use this model in Copilot. And because it is open-source, despite its Chinese origin, DeepSeek is being trusted a lot. People have used it to mock American companies, companies like OpenAI, which has the word 'Open' in its name. Initially, these companies promised that they would work for the public and keep everything open source. But in reality, they did not. Elon Musk has often trolled OpenAI by calling it ClosedAI. And today, this Chinese AI is more open than OpenAI. I would also like to show you a performance comparison. Among the top AI chatbots available today are OpenAI's ChatGPT, Anthropic's Claude, Google's Gemini, Qwen 2.5 from Alibaba (another Chinese company), and Llama from Meta, Facebook's parent company.



If these are compared with DeepSeek, what would be the result? According to Artificial Analysis, in terms of coding, DeepSeek is at the top, followed by ChatGPT, Claude, Qwen 2.5, and finally Llama. In quantitative reasoning, DeepSeek is at the top, then ChatGPT, then Qwen 2.5. In scientific reasoning and knowledge, ChatGPT is at the top, then DeepSeek, then Claude, and finally Llama. When PCMag tested AI chatbots, they found that DeepSeek is the best for questions about the news. In calculations, ChatGPT and DeepSeek are equal. In writing poems, ChatGPT wins, as it does in creating tables. But for solving riddles, DeepSeek is the best. The Artificial Analysis website rates ChatGPT o1 at 90 on quality and DeepSeek R1 at 89.

But a major disadvantage for DeepSeek is its response time. Look at this graph showing latency, the number of seconds it takes to get a response after asking a question: o1 takes 31.15 seconds and DeepSeek takes 71.22 seconds. And this response time has actually been increasing in recent days, because DeepSeek has become so popular all over the world that everyone wants to download and use it. Because of this, its servers are overloaded, and it starts taking much longer to respond. So this is a major disadvantage.

If you compare ChatGPT o1 and DeepSeek, in many ways DeepSeek is more innovative than o1. One example is its Mixture of Experts approach. When you ask ChatGPT o1 something, it works as a single model. Whatever your question is, the same ChatGPT acts as your engineer, doctor, and lawyer. But DeepSeek has divided itself into multiple specialised sub-models: it has a separate engineer, doctor, and lawyer, and depending on your question, only the engineer or only the doctor is called. This reduces the time spent on data transfer, and it also reduces the number of parameters that need to be switched on. In a traditional dense model, every parameter is active for every query; GPT-4 is reported to have around 1.8 trillion of them. DeepSeek has 671 billion parameters in total, but only 37 billion are active at once. The remaining parameters are activated only when needed. This increases its efficiency manifold while reducing the cost. What these parameters are and how they work can't be explained in a single video; it's a lengthy topic.
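To make the Mixture of Experts idea concrete, here is a toy sketch in PyTorch. It only illustrates the routing principle, not DeepSeek's actual architecture: the layer sizes, number of experts, and top-k value are made up, and real MoE models add load balancing and many other refinements.

```python
# A toy Mixture-of-Experts layer: many expert sub-networks exist, but a
# router activates only a few of them per token, so most parameters stay
# idle on any given input. Sizes here are arbitrary, for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoE(nn.Module):
    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)  # decides which "specialists" to call
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x):                               # x: (tokens, dim)
        scores = F.softmax(self.router(x), dim=-1)      # routing probabilities
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep only the top-k experts
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, k] == e                   # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * self.experts[e](x[mask])
        return out

tokens = torch.randn(4, 64)
layer = ToyMoE()
print(layer(tokens).shape)  # only 2 of the 8 experts ran for each token
```

The key property is visible in the loop: for each token, only `top_k` experts run, so most of the layer's parameters sit idle for any single query, the same effect as the 37 billion active parameters out of 671 billion described above.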



But those of you who have taken my Master ChatGPT course will remember the two theoretical chapters where I talked about this in detail. I've also explained tokens and the RLHF method on which ChatGPT is based. Now, another criticism being made against DeepSeek is that it copied its AI model entirely from OpenAI. OpenAI claims to have evidence that DeepSeek used its proprietary models to train itself, specifically evidence of distillation, where the output of a bigger model is used to train a smaller model so that the smaller model can give similar results at a much lower cost. Someone tweeted about how the robber has now been robbed. The thing is, to some extent, ChatGPT itself copied material from all over the internet to train itself. Many books were used without the authors' permission to train its models.
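As an aside, here is a rough sketch of what distillation means in code. It is a generic, minimal illustration of the technique with made-up stand-in models and random data; it says nothing about what DeepSeek actually did or how OpenAI's systems work.

```python
# A generic sketch of knowledge distillation: a small "student" model is
# trained to match the output distribution of a large "teacher" model.
# The models and data here are toy stand-ins for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, dim = 1000, 32
teacher = nn.Linear(dim, vocab)   # stand-in for a large pretrained model
student = nn.Linear(dim, vocab)   # smaller model being trained to imitate it
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0                 # softens the teacher's distribution

for step in range(100):
    x = torch.randn(16, dim)      # stand-in for token embeddings
    with torch.no_grad():
        teacher_probs = F.softmax(teacher(x) / temperature, dim=-1)
    student_log_probs = F.log_softmax(student(x) / temperature, dim=-1)
    # KL divergence pulls the student's predictions toward the teacher's.
    loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The student never needs the teacher's original training data; it only learns to imitate the teacher's outputs, which is why distillation can be so much cheaper than training from scratch.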


That unauthorised use of books is why 17 renowned writers, including Game of Thrones author George R.R. Martin, filed a copyright infringement suit against OpenAI in September 2023. The New York Times has also filed a case against OpenAI and Microsoft. Beyond that, 8 American newspapers and several Canadian news outlets have filed cases against OpenAI. This is why memes went viral showing OpenAI fishing and keeping the fish it catches in its bucket, while DeepSeek takes fish from that bucket. As users, this is good for us, but at the country and company level, we are seeing the beginning of AI wars. In 2022, the American government imposed export controls so that the computer chips needed to train AI could not be sold to other countries.


This especially targeted Chinese AI companies. The controls covered NVIDIA's H100 chips, which Chinese companies could no longer buy. This was a problem for DeepSeek, because it had to use NVIDIA's older chips to train its AI models. America tried its best to prevent any other country from developing foundational AI models like its own by denying them the chips required to train them. And so the people at DeepSeek were forced to innovate. They built software that uses only a fraction of the resources and works more efficiently at a fraction of the cost. Friends, this is why NVIDIA's stock fell the most.

Because until a few months ago, companies like Meta, OpenAI, and Google were claiming that scaling AI to a higher level would require more chips, more energy, and more money. But DeepSeek has done all of this with less money, less energy, and older computer chips. We should look at this as motivation. If it can be done in China, it can be done in India too. Indians also have the capability for such innovation; we just need to invest our focus and energy in the right place. On a personal level, take advantage of this opportunity to upskill yourself in the field of AI. If you haven't started yet, you are not late. This is the right time to learn how to use AI to increase your productivity and efficiency, because there is no doubt that those who ignore this technology now will be left behind in the future.


It is difficult to imagine how impactful AI will be in the future, globally, in every sector and every field. The link to my AI course is in the description and pinned comment below. Don't forget to use the coupon code DEEP46 to get 46% off. Now, if you liked this video and want to know more about AI, you can watch this video where I've talked about AI-generated videos and how software like Sora AI is changing the world. You can click here to watch it. Thank you very much!
