XIAOMAI NEWS
DeepSeek v3: 671B fine-grained MoE trained for $5.5M USD of compute on 15T tokens