As for what this means for inference: we barely fit Q8-quantized Qwen3 Coder and Kimi K2 instances on our H200s. Kimi K2 at Q8 left no room for a KV cache for the context. Could these models fit on a single 8xB200 instance? Probably; we'll try this week.
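The squeeze is easy to see with back-of-envelope arithmetic. This sketch assumes approximate public figures (Kimi K2 ~1T total parameters, 141 GB per H200, 192 GB per B200); exact numbers depend on the SKU, the serving stack's overhead, and the quantization format:

```python
# Rough VRAM budget for serving Kimi K2 at Q8 on 8-GPU nodes.
# Assumed figures: ~1T total params (MoE, all experts resident),
# H200 = 141 GB HBM, B200 = 192 GB HBM. Ignores activation/runtime overhead.

def weight_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate weight memory in GB for a quantized model."""
    return params_billions * bits_per_param / 8  # 1B params at 8 bits ~= 1 GB

KIMI_K2_PARAMS_B = 1000  # assumption: ~1T total parameters
H200_GB, B200_GB = 141, 192

weights = weight_gb(KIMI_K2_PARAMS_B, 8)   # ~1000 GB at Q8
h200_headroom = 8 * H200_GB - weights      # what's left for the KV cache
b200_headroom = 8 * B200_GB - weights

print(f"Q8 weights:       {weights:.0f} GB")
print(f"8xH200 headroom:  {h200_headroom:.0f} GB")
print(f"8xB200 headroom:  {b200_headroom:.0f} GB")
```

With only ~128 GB of headroom on 8xH200 before any runtime overhead, a long-context KV cache doesn't fit; the ~536 GB on 8xB200 is why a single node there looks plausible.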