Incredible possibilities for on-device small models. Here @adrgrondin is running Google’s Gemma 4 E2B on an iPhone 17 Pro at ~40 tok/s, using MLX optimized for Apple Silicon. SOTA coding and math on mobile, with 128K context. Fully offline, with thinking mode.