Two different tricks for fast LLM inference

by swah | View on Hacker News