{"id":168496,"date":"2025-09-08T20:41:42","date_gmt":"2025-09-08T20:41:42","guid":{"rendered":"https:\/\/teknomers.com\/en\/china-is-distancing-itself-from-nvidia-its-next-move-focuses-on-the-core-of-ai-with-a-groundbreaking-system\/"},"modified":"2025-09-08T20:41:43","modified_gmt":"2025-09-08T20:41:43","slug":"china-is-distancing-itself-from-nvidia-its-next-move-focuses-on-the-core-of-ai-with-a-groundbreaking-system","status":"publish","type":"post","link":"https:\/\/teknomers.com\/en\/china-is-distancing-itself-from-nvidia-its-next-move-focuses-on-the-core-of-ai-with-a-groundbreaking-system\/","title":{"rendered":"China is distancing itself from Nvidia. Its next move focuses on the core of AI with a groundbreaking system."},"content":{"rendered":"\n<p>In 2017, Google&#8217;s paper &#8220;<a rel=\"noopener, noreferrer nofollow\" href=\"https:\/\/arxiv.org\/abs\/1706.03762\" target=\"_blank\">Attention Is All You Need<\/a>&#8221; reshaped the technical foundation of language generation by introducing the Transformer architecture. Transformers process the tokens of a sequence in parallel, which allowed architectures to scale far beyond earlier designs. This advance led to models such as GPT and BERT and established self-attention as a central pillar of contemporary generative AI. However, standard self-attention scales quadratically with sequence length, so its memory and energy costs climb steeply as the context grows, which has prompted researchers to look for alternatives. The newly developed SpikingBrain-1.0 aims to ease these limitations.<\/p>\n<h2>From &#8220;Attention Is All You Need&#8221; to the Brain: A New Commitment to Break Boundaries<\/h2>\n<p>A team from the Institute of Automation of the Chinese Academy of Sciences has recently presented SpikingBrain-1.0, a family of spiking models designed to reduce the data and computational resources required for tasks that involve very long contexts. 
The researchers present two models: SpikingBrain-7B, a linear architecture focused on efficiency, and SpikingBrain-76B, which combines linear attention with Mixture of Experts (MoE) mechanisms for greater capacity.<\/p>\n<p><!-- BREAK 1 --> <\/p>\n<p>The authors report that much of the development and testing was carried out on MetaX C550 GPU clusters, using specialized libraries and operators tailored to that platform. This makes the project not only a promising software advance but also a demonstration of homegrown hardware capabilities. The development carries extra weight given China&#8217;s strategic effort to reduce its dependence on Nvidia, a strategy previously seen with DeepSeek-V3.1.<\/p>\n<p><!-- BREAK 2 --><\/p>\n<div class=\"article-asset-image article-asset-normal article-asset-center\">\n<div class=\"asset-content\"><\/div>\n<\/div>\n<p>SpikingBrain-1.0 draws direct inspiration from how the human brain operates. Rather than relying on neurons that continuously compute values, it uses spiking neurons: units that accumulate input signals until they cross a threshold and then fire a spike. Between spikes, these neurons stay inactive, which saves both computation and energy. A key idea is that information is carried not only by the number of spikes but also by their timing and order, as in the brain.<\/p>\n<p>To fit this design into the existing ecosystem, the team developed methods that convert traditional self-attention blocks into linear versions, which integrate more easily into the spiking system. They also introduced a form of \u201cvirtual time\u201d that simulates temporal processes without sacrificing GPU throughput. 
The SpikingBrain-76B variant also incorporates MoE, which activates only a subset of expert submodels for each input, an approach reportedly also used in GPT-4o and GPT-5.<\/p>\n<p><!-- BREAK 3 -->  <\/p>\n<div class=\"article-asset-image article-asset-normal article-asset-center\">\n<div class=\"asset-content\">\n        <img class=\"centro_sinmarco\" height=\"1532\" width=\"1598\" loading=\"lazy\" decoding=\"async\"  fetchpriority=\"high\"  src=\"https:\/\/teknomers.com\/en\/wp-content\/uploads\/2025\/09\/China-is-distancing-itself-from-Nvidia-Its-next-move-focuses.png\" alt=\"Chinese model architecture 1\"\/>\n    <\/div>\n<\/div>\n<p>The authors of the study suggest applications where context length is crucial, such as the analysis of large legal documents, the evaluation of comprehensive medical records, DNA sequencing, and the management of massive experimental datasets in high-energy physics. The reasoning given in the report is that if the architecture runs efficiently on contexts of millions of tokens, it could lower costs and open up domains currently constrained by expensive computational infrastructure. 
However, the approach has yet to be validated in real-world scenarios outside laboratory conditions.<\/p>\n<p><!-- BREAK 4 --><\/p>\n<p>The team has released the code for the 7-billion-parameter model on <a rel=\"noopener, noreferrer nofollow\" href=\"https:\/\/github.com\/BICLab\/SpikingBrain-7B\" target=\"_blank\">GitHub<\/a>, alongside a detailed technical report. They also provide a ChatGPT-style web interface for interacting with the model, which is deployed entirely on domestic hardware. However, access is currently limited to Chinese users, which complicates its use outside that ecosystem. 
While the proposal is ambitious, its real impact will depend on whether the broader community can reproduce the results and run comparisons in consistent environments that measure accuracy, latency, and energy consumption under real-world conditions.<\/p>\n<p>Images | Xataka with Gemini 2.5 | <a rel=\"noopener, noreferrer nofollow\" href=\"https:\/\/unsplash.com\/es\/fotos\/una-bandera-china-ondea-alto-en-el-cielo-SI8x3-z-Dck\" target=\"_blank\">ABODI VESAKARAN<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In 2017, the paper &#8220;Attention is All You Need&#8221; by Google revolutionized the technical foundation of language generation with the introduction of Transformers. These models enabled parallel processing of long sequences and allowed for the scaling of architectures far beyond earlier capabilities. 
This advancement led to remarkable models like GPT and BERT, establishing self-attention as [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":168497,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[36399],"tags":[2397,9954,41881,7402,7240,1388,20230,3285],"class_list":["post-168496","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology","tag-china","tag-core","tag-distancing","tag-focuses","tag-groundbreaking","tag-move","tag-nvidia","tag-system"],"_links":{"self":[{"href":"https:\/\/teknomers.com\/en\/wp-json\/wp\/v2\/posts\/168496","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/teknomers.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/teknomers.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/teknomers.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/teknomers.com\/en\/wp-json\/wp\/v2\/comments?post=168496"}],"version-history":[{"count":0,"href":"https:\/\/teknomers.com\/en\/wp-json\/wp\/v2\/posts\/168496\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/teknomers.com\/en\/wp-json\/wp\/v2\/media\/168497"}],"wp:attachment":[{"href":"https:\/\/teknomers.com\/en\/wp-json\/wp\/v2\/media?parent=168496"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/teknomers.com\/en\/wp-json\/wp\/v2\/categories?post=168496"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/teknomers.com\/en\/wp-json\/wp\/v2\/tags?post=168496"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}