The Financial Times reported that DeepSeek was forced to delay the release of its new R2 large language model due to performance issues with Huawei Ascend processors, “highlighting the limits of Beijing’s push to replace US technology,” according to the FT.
The publication cited three sources who said Chinese government officials encouraged DeepSeek to use Ascend chips instead of NVIDIA processors after the release early this year of DeepSeek’s R1 model, which rocked the AI world.
“But the Chinese start-up encountered persistent technical issues during its R2 training process using Ascend chips,” the FT reported, “prompting it to use NVIDIA chips for training and Huawei’s for inference, said the (sources).”
The news comes amid other reports that the NVIDIA and AMD GPUs currently allowed for export to China after a recent US export rule change, the NVIDIA H20 and AMD MI308, are intended to deliver performance roughly similar to that of China’s homegrown GPUs. There are also reports that Chinese authorities are asking companies to justify their use of American GPUs.
The FT article cited sources who said the Huawei processors have stability issues, “slower inter-chip connectivity and inferior software compared with Nvidia’s products.”
In an attempt to resolve the problems, a team of Huawei technicians worked with DeepSeek on site, but a successful training run could not be achieved on the Ascend processors, sources said.
The FT said DeepSeek’s R2 model could be released “in the coming weeks.”