MLX · Apple Silicon · Local Inference

LEM Lab

MLX-native local inference application for Apple Silicon. Run Lemma models locally with Metal GPU acceleration and unified memory.

Apple Silicon native · MLX framework · EUPL-1.2

Features

MLX-Native

Built on Apple's MLX framework. The unified memory architecture means models run at full speed, with no copying of data between CPU and GPU.
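
As a rough illustration of what MLX-native inference looks like underneath, here is a minimal sketch using the open-source mlx-lm Python package. The model identifier lem-lab/lemma-4.5b-mlx-4bit is a hypothetical placeholder, not a published repository.

```python
# Minimal MLX inference sketch with the mlx-lm package (pip install mlx-lm).
# The model id below is a hypothetical placeholder for an MLX-converted Lemma checkpoint.
from mlx_lm import load, generate

# load() pulls weights from HuggingFace (or a local path) into unified memory,
# so the Metal GPU reads the same buffers the CPU prepared -- no copies needed.
model, tokenizer = load("lem-lab/lemma-4.5b-mlx-4bit")

text = generate(
    model,
    tokenizer,
    prompt="Explain unified memory in one sentence.",
    max_tokens=128,
)
print(text)
```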

Chat Interface

Native chat UI for conversational interaction with Lemma models. Streaming responses, conversation history, and model switching.
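
A streaming chat turn of the kind the UI performs could look like the sketch below, built on mlx-lm's stream_generate. The exact shape of the yielded chunks varies between mlx-lm releases (recent versions yield response objects exposing a .text field), and the model id is again a placeholder.

```python
# Sketch of one streaming chat turn with mlx-lm.
# Assumptions: recent mlx-lm where stream_generate yields chunks with a .text
# field; placeholder model id.
from mlx_lm import load, stream_generate

model, tokenizer = load("lem-lab/lemma-4.5b-mlx-4bit")

# Build a chat-formatted prompt from the conversation history.
messages = [{"role": "user", "content": "Write a haiku about unified memory."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Tokens are emitted as they are generated, which is what drives a streaming UI.
for chunk in stream_generate(model, tokenizer, prompt, max_tokens=256):
    print(chunk.text, end="", flush=True)
print()
```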

Performance Dashboard

Real-time token throughput, memory usage, and generation metrics. See exactly how your hardware performs with each model size.
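
The same kind of numbers can be reproduced by hand: the sketch below times a generation with mlx-lm and reads peak Metal memory from MLX. It assumes a placeholder model id and that mx.metal.get_peak_memory() is available (newer MLX releases expose it as mx.get_peak_memory()).

```python
# Sketch: measuring generation throughput and peak memory by hand with mlx-lm.
import time

import mlx.core as mx
from mlx_lm import load, generate

model, tokenizer = load("lem-lab/lemma-4.5b-mlx-4bit")  # placeholder model id

start = time.perf_counter()
text = generate(
    model, tokenizer, prompt="Summarise MLX in three sentences.", max_tokens=256
)
elapsed = time.perf_counter() - start

generated_tokens = len(tokenizer.encode(text))
print(f"throughput:  {generated_tokens / elapsed:.1f} tok/s")
# get_peak_memory() reports bytes; moved out of the metal namespace in newer MLX.
print(f"peak memory: {mx.metal.get_peak_memory() / 1e9:.2f} GB")
```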

Model Hub

Browse and download Lemma models in MLX format directly from HuggingFace. Automatic quantisation selection based on available memory.
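
One plausible way to implement automatic quantisation selection is a simple heuristic: estimate each variant's weight footprint from the parameter count and pick the highest-precision variant that fits comfortably in unified memory. The sketch below is such a heuristic, not LEM Lab's actual logic; the bytes-per-parameter figures and the 70% headroom factor are assumptions.

```python
# Heuristic sketch of quantisation selection (not LEM Lab's actual algorithm).
# Assumptions: approximate bytes per parameter for each variant and a 70% cap
# on how much unified memory the weights may occupy.
import subprocess

# Approximate storage cost per parameter for each offered variant.
BYTES_PER_PARAM = {"BF16": 2.0, "Q8_0": 1.0, "Q4_K_M": 0.6}

def total_unified_memory() -> int:
    """Total unified memory in bytes (macOS: `sysctl -n hw.memsize`)."""
    out = subprocess.run(
        ["sysctl", "-n", "hw.memsize"], capture_output=True, text=True, check=True
    )
    return int(out.stdout.strip())

def pick_variant(params_billion: float, headroom: float = 0.7) -> str:
    """Pick the highest-precision variant whose weights fit in the memory budget."""
    budget = total_unified_memory() * headroom
    for variant in ("BF16", "Q8_0", "Q4_K_M"):  # best precision first
        if params_billion * 1e9 * BYTES_PER_PARAM[variant] <= budget:
            return variant
    raise RuntimeError("No variant fits in the available unified memory")

print(pick_variant(4.5))   # Lemma, 4.5B parameters
print(pick_variant(30.7))  # Lemrd, 30.7B parameters
```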

Lemma Models

Optimised MLX variants for Apple Silicon

Lemer

2.3B · Edge

Q4_K_M / Q8_0 / BF16

Lemma

4.5B · General

Q4_K_M / Q8_0 / BF16

Lemmy

26B MoE · Agentic

Q4_K_M / Q8_0 / BF16

Lemrd

30.7B · Research

Q4_K_M / Q8_0 / BF16

Platforms

macOS

Apple Silicon native. Metal GPU via MLX. M1/M2/M3/M4 with unified memory for maximum model size.

App Store

Targeting Apple App Store distribution via CoreGUI. One-click install, automatic updates, sandboxed runtime.

In Development

App Store Target

LEM Lab is being built for Apple App Store distribution, bringing together native MLX inference, the Lemma model family, and Studio integration.