Rotary GPU: Exploring Local Execution for Large MoE Models Under Limited VRAM

by dryarzeg | Hacker News