LEM Lab
MLX-native local inference application for Apple Silicon. Run Lemma models locally with Metal GPU acceleration and unified memory.
Features
MLX-Native
Built on Apple's MLX framework. Apple Silicon's unified memory means the CPU and GPU share the same buffers, so models run at full speed with no data ever copied between devices.
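As a concrete illustration, here is a minimal sketch of MLX's unified-memory model, assuming the `mlx` Python package (`pip install mlx`); the shapes are arbitrary. The same arrays can be consumed by CPU and GPU kernels with no transfer step.

```python
import mlx.core as mx

# Arrays are allocated once in unified memory, visible to both devices.
a = mx.random.normal((4096, 4096))
b = mx.random.normal((4096, 4096))

# Run the same inputs on either device without copying them anywhere.
gpu_out = mx.matmul(a, b, stream=mx.gpu)
cpu_out = mx.matmul(a, b, stream=mx.cpu)

# MLX is lazy: the work actually executes when the results are evaluated.
mx.eval(gpu_out, cpu_out)
```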
Chat Interface
Native chat UI for conversational interaction with Lemma models. Streaming responses, conversation history, and model switching.
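A hedged sketch of the streaming loop behind such a UI, using the `mlx_lm` package (`pip install mlx-lm`). The repo id and the `chat` helper are hypothetical, and recent `mlx_lm` versions yield response objects with a `.text` field from `stream_generate`.

```python
from mlx_lm import load, stream_generate

# Hypothetical repo id; any MLX-format model from the HuggingFace hub works.
model, tokenizer = load("mlx-community/Some-Model-4bit")

history = []  # running conversation context


def chat(user_message: str, max_tokens: int = 512) -> str:
    history.append({"role": "user", "content": user_message})
    # Most instruct models ship a chat template the tokenizer can apply.
    prompt = tokenizer.apply_chat_template(history, add_generation_prompt=True)
    reply = ""
    for chunk in stream_generate(model, tokenizer, prompt, max_tokens=max_tokens):
        print(chunk.text, end="", flush=True)  # stream tokens to the UI
        reply += chunk.text
    history.append({"role": "assistant", "content": reply})
    return reply
```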
Performance Dashboard
Real-time token throughput, memory usage, and generation metrics. See exactly how your hardware performs with each model size.
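A minimal sketch of how such metrics can be gathered, assuming `mlx` and `mlx_lm`; `measure_generation` is an illustrative name, not a shipped API, and the memory accessors have moved between `mx.metal.*` and the `mx.*` top level across MLX versions.

```python
import time
import mlx.core as mx
from mlx_lm import stream_generate


def measure_generation(model, tokenizer, prompt, max_tokens=256):
    n_tokens = 0
    start = time.perf_counter()
    # stream_generate yields roughly one response per generated token.
    for _ in stream_generate(model, tokenizer, prompt, max_tokens=max_tokens):
        n_tokens += 1
    elapsed = time.perf_counter() - start
    return {
        "tokens": n_tokens,
        "tokens_per_sec": n_tokens / elapsed,
        "active_memory_gb": mx.get_active_memory() / 1e9,  # current Metal allocation
        "peak_memory_gb": mx.get_peak_memory() / 1e9,      # high-water mark
    }
```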
Model Hub
Browse and download Lemma models in MLX format directly from HuggingFace. Automatic quantisation selection based on available memory.
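One way the memory-aware selection could work, sketched under stated assumptions: the repo id and the per-variant size estimates below are hypothetical placeholders, and total unified memory is read via `os.sysconf` (available on macOS).

```python
import os
from huggingface_hub import snapshot_download

# Illustrative working-set estimates per variant, in GB.
VARIANTS = [("bf16", 9.0), ("8bit", 4.8), ("4bit", 2.6)]


def pick_variant(headroom: float = 0.5) -> str:
    # Total unified memory, leaving headroom for the OS and other apps.
    total_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9
    budget = total_gb * headroom
    for name, size_gb in VARIANTS:
        if size_gb <= budget:
            return name  # largest variant that fits the budget
    return VARIANTS[-1][0]  # otherwise fall back to the smallest


variant = pick_variant()
path = snapshot_download(f"lemlab/lemma-4.5b-{variant}")  # hypothetical repo id
```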
Lemma Models
Optimised MLX variants for Apple Silicon
Lemer
2.3B · Edge
4-bit / 8-bit / BF16
Lemma
4.5B · General
4-bit / 8-bit / BF16
Lemmy
26B MoE · Agentic
4-bit / 8-bit / BF16
Lemrd
30.7B · Research
4-bit / 8-bit / BF16
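Once downloaded, a variant loads like any other MLX model. A minimal sketch with `mlx_lm`, where the repo id is a hypothetical placeholder for wherever the conversions are published:

```python
from mlx_lm import load, generate

model, tokenizer = load("lemlab/lemma-4.5b-4bit")  # hypothetical repo id
print(generate(model, tokenizer, prompt="Hello, Lemma.", max_tokens=64))
```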
Platforms
macOS
Apple Silicon native. Metal GPU via MLX. Unified memory on M1/M2/M3/M4 determines the maximum model size.
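A small sketch of verifying Metal execution with `mlx`; on Apple Silicon the GPU is already the default device, so the explicit pin is belt-and-braces.

```python
import mlx.core as mx

print(mx.metal.is_available())  # True when the Metal backend is usable
print(mx.default_device())      # Device(gpu, 0) on Apple Silicon
mx.set_default_device(mx.gpu)   # pin subsequent ops to the GPU explicitly
```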
App Store
Targeting Apple App Store distribution via CoreGUI. One-click install, automatic updates, sandboxed runtime.
App Store Target
LEM Lab is being built for Apple App Store distribution, bringing together native MLX inference, the Lemma model family, and Studio integration.