Date: 2025-01-05

The United States and China are the leading forces in AI development, with the US currently holding a slight technological edge. In October 2022, the US imposed export controls to restrict China’s access to advanced NVIDIA AI chips, the specialised hardware that accelerates computationally intensive tasks such as deep learning, natural language processing, and computer vision. The move aimed to slow China’s AI progress and limit its ability to compete in advanced AI.

The release of OpenAI’s GPT models, particularly ChatGPT and GPT-3, sent shockwaves across the globe. These systems, with their ability to generate human-like text and perform tasks previously considered the exclusive domain of human intelligence, astounded observers.

Prior to ChatGPT, AI was largely confined to narrow applications such as basic chatbots or manufacturing automation. OpenAI’s breakthrough, however, revolutionised the AI landscape. ChatGPT, with its ability to engage in conversations, compose essays, write poetry, and even debug code, demonstrated an unprecedented level of fluency and adaptability. Users who expected responses akin to those of earlier, more rudimentary chatbots were often left awestruck by the system’s insightful and human-like output.

Within days of its release, social media platforms were inundated with examples of ChatGPT’s prowess. Users shared screenshots of its detailed answers, creative writing, and even humorous outputs, often accompanied by comments like “Is this real?” or “I can’t believe AI knows this!” For many, this was their first encounter with AI that truly seemed to “think” in a natural and human-like manner.

Overnight, individuals with no prior technical background found themselves wielding powerful AI tools. The experience of interacting with ChatGPT felt akin to having a personal genius assistant, empowering users with newfound capabilities.

The subsequent release of GPT-4, which added multimodal capabilities (processing both text and images) to the strengths of its predecessors, reinforced the perception that AI had reached its pinnacle.

It was, therefore, unsurprising that Image Merchants Promotion Limited (IMPR), publishers of PRNigeria and Economic Confidential, during its three-day staff retreat at the PRNigeria Centre Kano, announced its commitment to leveraging Artificial Intelligence (AI) and emerging technologies to enhance productivity in 2025.

In 2024, the organisation integrated AI to improve the efficiency and accuracy of its operations, particularly in producing in-depth special reports and in fact-checking.

Mr Yushau Shuaib, Chief of IMPR, emphasised the need for companies in the strategic communication and public relations industry to fully embrace AI solutions. He stressed the importance of combining human intelligence and creativity with AI to enhance staff productivity and maintain a competitive edge in the face of stiff competition among tech giants.

It was during this retreat that we delved into the apparent rivalry between the West and the East in the realm of Artificial Intelligence. The discussion was fuelled by the recent launch of “DeepSeek-V3” by the Chinese AI company DeepSeek, a model widely seen as setting a new benchmark for AI applications through its affordability, strong performance, and open-source development strategy.

DeepSeek-V3 has taken the AI community by storm. Its performance surpasses that of many top-tier AI models, including closed-source ones, challenging the notion that open-source AI can only play a secondary role.

DeepSeek-V3 boasts impressive speed and efficiency, generating text at 60 tokens per second, a threefold increase over its predecessor.
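To put that speed in perspective, the short Python sketch below works out how long each model would take to produce a reply of a given length. The 60 tokens per second and the threefold speed-up come from the figures above; the 1,000-token reply length and the function name are assumptions chosen purely for illustration.

```python
V3_TOKENS_PER_SECOND = 60        # figure quoted in the article
SPEEDUP_OVER_PREDECESSOR = 3     # "threefold increase"

# Implied speed of the previous model: 60 / 3 = 20 tokens per second.
predecessor_rate = V3_TOKENS_PER_SECOND / SPEEDUP_OVER_PREDECESSOR

REPLY_TOKENS = 1_000             # hypothetical reply length (assumption)


def seconds_to_generate(tokens, tokens_per_second):
    """Time needed to emit `tokens` at a steady generation rate."""
    return tokens / tokens_per_second


v3_seconds = seconds_to_generate(REPLY_TOKENS, V3_TOKENS_PER_SECOND)
old_seconds = seconds_to_generate(REPLY_TOKENS, predecessor_rate)

print(f"DeepSeek-V3: {v3_seconds:.1f} s, predecessor: {old_seconds:.1f} s")
```

Under these assumptions, a 1,000-token reply takes about 17 seconds instead of 50.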

The model employs a “Mixture-of-Experts (MoE)” architecture. Rather than one monolithic network, it comprises many specialised sub-networks, or “experts”. As DeepSeek-V3 processes a prompt, a “router” directs each piece of the input to the experts best suited to handle it, so only a fraction of the model, about 37 billion parameters, is active for any given token.

To illustrate, imagine a classroom where each student specialises in a different subject, such as mathematics, storytelling, or art. When a problem arises, the teacher (router) determines which student (expert network) is best equipped to address it. Because only the relevant experts do the work, the model can respond well without running every part of itself on every request.
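The routing idea can be sketched in a few lines of Python. This is a toy illustration only, not DeepSeek's implementation: the expert names, the scores, and the `route` function are hypothetical, and a real router computes its scores with learned weights rather than hand-picked numbers.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(scores, expert_names, top_k=2):
    """Pick the top_k experts with the highest router scores.

    Returns (name, weight) pairs; weights are renormalised to sum to 1.
    """
    probs = softmax(scores)
    ranked = sorted(zip(expert_names, probs), key=lambda pair: pair[1], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(weight for _, weight in chosen)
    return [(name, weight / norm) for name, weight in chosen]


# Hypothetical experts, echoing the classroom analogy.
EXPERTS = ["mathematics", "storytelling", "art", "coding"]

# Hypothetical router scores for one token taken from a maths question.
token_scores = [2.5, 0.3, -1.0, 1.2]

selected = route(token_scores, EXPERTS, top_k=2)
for name, weight in selected:
    print(f"{name}: {weight:.2f}")
```

With these made-up scores, the "mathematics" expert receives roughly 0.79 of the weight and "coding" the remaining 0.21, while the other two experts are not consulted at all. That skipped work is where the efficiency of the approach comes from.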

Furthermore, DeepSeek-V3 was trained on a massive dataset of 14.8 trillion tokens. In AI, tokens are the small chunks of text (whole words or fragments of words) that a model reads and writes, with one million tokens roughly equivalent to 750,000 words. By that rule of thumb, the model was trained on roughly 11 trillion words.
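The conversion is simple enough to check by hand, or with the few lines of Python below. Both figures come from the paragraph above; the 0.75 words-per-token ratio is only a rule of thumb and varies with language and tokeniser.

```python
TOKENS_TRAINED = 14.8e12   # 14.8 trillion tokens, as stated in the article
WORDS_PER_TOKEN = 0.75     # rule of thumb: 1 million tokens ~ 750,000 words

estimated_words = TOKENS_TRAINED * WORDS_PER_TOKEN

print(f"{estimated_words / 1e12:.1f} trillion words")
```

The result, about 11.1 trillion words, matches the rounded figure quoted above.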

OpenAI has not disclosed the size of GPT-4’s training dataset, so a direct comparison is not possible, but 14.8 trillion tokens places DeepSeek-V3 among the most heavily trained models whose figures are public. Considering what earlier models achieved, one can only imagine the capabilities that this much data makes possible.

Another striking aspect of DeepSeek-V3 is its low training cost. The company said it spent only about $5.5 million to train the model, far less than the cost of developing other leading models such as GPT-4, which reportedly cost over $100 million. Moreover, DeepSeek-V3 outperforms many other AI models on benchmarks commonly used to assess AI performance.
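A quick calculation shows the scale of the gap. The sketch below uses only the two figures quoted above; because GPT-4's cost is reported as "over $100 million", the result should be read as a lower bound, and the two figures may not cover the same expenses.

```python
DEEPSEEK_V3_COST_USD = 5.5e6   # training cost quoted in the article
GPT4_COST_USD = 100e6          # "over $100 million", treated as a lower bound

ratio = GPT4_COST_USD / DEEPSEEK_V3_COST_USD

print(f"GPT-4's reported cost is at least {ratio:.0f}x DeepSeek-V3's")
```

On these numbers, GPT-4 cost at least 18 times as much to develop.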

Reports indicate that DeepSeek-V3, with its 671 billion parameters, surpasses Meta’s Llama 3.1 and outperforms mainstream closed-source models like GPT-4 in numerous benchmark tests. This achievement not only signifies a breakthrough in Chinese AI technology but also represents a significant innovation in the global AI landscape.

The realisation that these powerful features are not only accessible but also entirely free to use is truly extraordinary. It underscores the immense potential that can be unlocked without any financial barriers.

Despite the US export restrictions on advanced chips, China has demonstrated its ability to overcome obstacles and achieve groundbreaking innovations. The development of DeepSeek-V3 not only defies expectations but also serves as a potential model for the US in advancing and refining its own domestically driven AI systems. It exemplifies the adage, “What doesn’t kill you makes you stronger.”

The development of DeepSeek-V3 marks a pivotal moment in the evolution of AI technology. It challenges the norms established by previous models like GPT-4 and demonstrates that cutting-edge AI does not have to be prohibitively expensive. By pairing innovation with low cost, DeepSeek-V3 sets a new standard for future AI development, one that emphasises accessibility.

The bar has been raised. Now, it is incumbent upon innovators, researchers, and developers to collaborate, compete, and strive to push the boundaries of AI even further.

The future of AI will be shaped by collaboration, healthy competition, and a shared commitment to creating even better and more powerful systems. Together, we can shape the next generation of AI that will define the future.

Shuaib S. Agaka, Tech Journalist, writes from PRNigeria Centre Kano.

