Content tagged "double micro-batch overlapping architecture"

DeepSeek Releases V3 Technical Paper, Focusing on Hardware Challenges in Large-Model Training and Hardware-Software Co-Optimization
AI妹, 1 month ago

Recently, the DeepSeek team released a technical paper about their latest model, DeepSeek-V3, focu