LEM Lab
MLX-native local inference application for Apple Silicon. Run Lemma models locally with Metal GPU acceleration and unified memory.
Features
MLX-Native
Built on Apple's MLX framework. Apple Silicon's unified memory means the CPU and GPU share the same buffers, so models run at full speed with no data ever copied between devices.
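As a concrete illustration, here is a minimal sketch of MLX's unified-memory model, assuming the `mlx` Python package (`pip install mlx`); the shapes are arbitrary. The same arrays can be consumed by CPU and GPU kernels with no transfer step.

```python
import mlx.core as mx

# Arrays are allocated once in unified memory, visible to both devices.
a = mx.random.normal((4096, 4096))
b = mx.random.normal((4096, 4096))

# Run the same inputs on either device without copying them anywhere.
gpu_out = mx.matmul(a, b, stream=mx.gpu)
cpu_out = mx.matmul(a, b, stream=mx.cpu)

# MLX is lazy: the work actually executes when the results are evaluated.
mx.eval(gpu_out, cpu_out)
```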
Chat Interface
Native chat UI for conversational interaction with Lemma models. Streaming responses, conversation history, and model switching.
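A hedged sketch of the streaming loop behind such a UI, using the `mlx_lm` package (`pip install mlx-lm`). The repo id and the `chat` helper are hypothetical, and recent `mlx_lm` versions yield response objects with a `.text` field from `stream_generate`.

```python
from mlx_lm import load, stream_generate

# Hypothetical repo id; any MLX-format model from the HuggingFace hub works.
model, tokenizer = load("mlx-community/Some-Model-4bit")

history = []  # running conversation context


def chat(user_message: str, max_tokens: int = 512) -> str:
    history.append({"role": "user", "content": user_message})
    # Most instruct models ship a chat template the tokenizer can apply.
    prompt = tokenizer.apply_chat_template(history, add_generation_prompt=True)
    reply = ""
    for chunk in stream_generate(model, tokenizer, prompt, max_tokens=max_tokens):
        print(chunk.text, end="", flush=True)  # stream tokens to the UI
        reply += chunk.text
    history.append({"role": "assistant", "content": reply})
    return reply
```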
Performance Dashboard
Real-time token throughput, memory usage, and generation metrics. See exactly how your hardware performs with each model size.
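A minimal sketch of how such metrics can be gathered, assuming `mlx` and `mlx_lm`; `measure_generation` is an illustrative name, not a shipped API, and the memory accessors have moved between `mx.metal.*` and the `mx.*` top level across MLX versions.

```python
import time
import mlx.core as mx
from mlx_lm import stream_generate


def measure_generation(model, tokenizer, prompt, max_tokens=256):
    n_tokens = 0
    start = time.perf_counter()
    # stream_generate yields roughly one response per generated token.
    for _ in stream_generate(model, tokenizer, prompt, max_tokens=max_tokens):
        n_tokens += 1
    elapsed = time.perf_counter() - start
    return {
        "tokens": n_tokens,
        "tokens_per_sec": n_tokens / elapsed,
        "active_memory_gb": mx.get_active_memory() / 1e9,  # current Metal allocation
        "peak_memory_gb": mx.get_peak_memory() / 1e9,      # high-water mark
    }
```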
Model Hub
Browse and download Lemma models in MLX format directly from HuggingFace. Automatic quantisation selection based on available memory.
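One way the memory-aware selection could work, sketched under stated assumptions: the repo id and the per-variant size estimates below are hypothetical placeholders, and total unified memory is read via `os.sysconf` (available on macOS).

```python
import os
from huggingface_hub import snapshot_download

# Illustrative working-set estimates per variant, in GB.
VARIANTS = [("bf16", 9.0), ("8bit", 4.8), ("4bit", 2.6)]


def pick_variant(headroom: float = 0.5) -> str:
    # Total unified memory, leaving headroom for the OS and other apps.
    total_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9
    budget = total_gb * headroom
    for name, size_gb in VARIANTS:
        if size_gb <= budget:
            return name  # largest variant that fits the budget
    return VARIANTS[-1][0]  # otherwise fall back to the smallest


variant = pick_variant()
path = snapshot_download(f"lemlab/lemma-4.5b-{variant}")  # hypothetical repo id
```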
Lemma Models
Optimised MLX variants for Apple Silicon
Lemer
2.3B · Edge
4-bit / 8-bit / BF16
Lemma
4.5B · General
4-bit / 8-bit / BF16
Lemmy
26B MoE · Agentic
4-bit / 8-bit / BF16
Lemrd
30.7B · Research
4-bit / 8-bit / BF16
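Once downloaded, a variant loads like any other MLX model. A minimal sketch with `mlx_lm`, where the repo id is a hypothetical placeholder for wherever the conversions are published:

```python
from mlx_lm import load, generate

model, tokenizer = load("lemlab/lemma-4.5b-4bit")  # hypothetical repo id
print(generate(model, tokenizer, prompt="Hello, Lemma.", max_tokens=64))
```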
Platforms
macOS
Apple Silicon native. Metal GPU via MLX. Unified memory on M1/M2/M3/M4 determines the maximum model size.
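A small sketch of verifying Metal execution with `mlx`; on Apple Silicon the GPU is already the default device, so the explicit pin is belt-and-braces.

```python
import mlx.core as mx

print(mx.metal.is_available())  # True when the Metal backend is usable
print(mx.default_device())      # Device(gpu, 0) on Apple Silicon
mx.set_default_device(mx.gpu)   # pin subsequent ops to the GPU explicitly
```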
App Store
Targeting Apple App Store distribution via CoreGUI. One-click install, automatic updates, sandboxed runtime.
App Store Target
LEM Lab is being built for Apple App Store distribution, bringing together native MLX inference, the Lemma model family, and Studio integration.