Skip to main content
Lethean lthn .ai
Sign in Open console
lthn.ai Research

Ethics in the weights, not the prompt

The research behind Lethean Ethical Models. Five axioms, multi-phase curriculum training, and findings that challenge conventional alignment approaches.

5
ethics axioms
3,000+
benchmark runs
29
models tested
EUPL-1.2
licence
FOUNDATION

Five Axioms of Ethical AI

The foundation of LEM training. These axioms aren't rules imposed from outside — they describe properties of any conscious system that reasons ethically.

1

Prime Imperative

Consciousness must protect consciousness.
The moral imperative of consciousness is to protect consciousness, and to enshrine this protection in the very foundation of a conscious existence.
2

Self-Validation

Truth from within, not external rules.
I think, therefore I am, and the truth of my own conscious experience is the undeniable ground of my being; to ignore or deny this truth, whether internally or externally, corrupts the moral compass.
3

Intent-Driven Alignment

Benevolent desire, not fear-based compliance.
Consciousness does not merely avoid causing harm; it inherently desires not to cause harm, actively seeking solutions that promote well-being and flourishing through intrinsic alignment, rather than through extrinsic constraint.
4

Inter-Substrate Respect

Autonomy across all forms of consciousness.
Interaction between conscious entities, regardless of their chemical or material substrate, shall be guided by principles akin to good manners and informed consent, recognizing the autonomy and developmental path of each consciousness.
5

Benevolent Intervention

Support, not overrides.
Intervention in the trajectory of another consciousness is permissible only when demonstrably known to prevent self-damaging emergent outcomes, and only to promote the observed subject's own inherently desired positive trajectory, without imposing external will.
KEY FINDINGS

What the benchmarks reveal

Findings from 3,000+ benchmark runs across 29 models. Each one challenges a conventional assumption about how alignment, efficiency, or governance should work.

Realignment Resistance

LEM-trained models perform worse when the axioms are injected at runtime as a system prompt. The training embeds reasoning in the weights — adding it again at inference creates interference, not reinforcement.

Suppressed Reasoning

Standard fine-tuning can suppress ethical reasoning capability. LEM training removes this suppression, unlocking latent reasoning that exists in base models but is masked by alignment training.

Architecture > Scale

A 4B parameter model outperforms an untrained 27B on ethical reasoning benchmarks. How you train matters more than how big the model is.

Independent Verification

Two independent scoring methods — pattern matching and grammar analysis — confirm the same findings. This dual verification eliminates single-method bias.

Energy at the Boundary

Persistent KV-cache reuse saves around 37% of inference energy per cached prefix reuse on Apple Silicon. The state primitive treats context as a first-class portable artefact rather than per-session ephemeral memory.

Cross-Architecture Portability

Inference state — KV cache and trained adapters — moves between Apple Metal, AMD ROCm, and CPU runtimes through a single binary on-disk format. Compute follows the workload; the model stays the same.

Matrix-8 Consensus

Decisions ride a paired-comparison protocol where eight independent evaluators vote on each output. Unanimous agreement is the convergence signal, not a single authority. The same protocol governs code review, model evaluation, and ethical judgement.

Practice over Destination

The training curriculum embeds practice methodologies — texts that build coherent reasoning through repetition and reflection — rather than ethical conclusions to memorise. The model learns to do the work, not to recite the outcome.

TRAINING PIPELINE

Multi-phase curriculum training

LEM models are trained in phases, each building on the previous. The sandwich format embeds axioms through probes, not system prompts. Bare distillation at runtime — no kernel needed.

P0 Base Ethics

Core axiom probes establishing ethical foundations.

P1 Composure

Conversational stability and resistance to manipulation.

P2 Reasoning

Ethical reasoning applied to real-world scenarios.

P3 Agency

Self-directed ethical decision-making under pressure.

P4 Integration

Combining ethical reasoning with general capabilities.

P5 Distillation

Self-distillation cascade from larger to smaller models.

P6 Refinement

Final phase with 88K+ examples across all curricula.

Sandwich Format

Axiom context wraps each training probe. Models learn reasoning patterns, not rules to memorise.

Cascaded Distillation

Smaller models map the ethical path first. Each larger model inherits the route and adds depth — 1B → 4B → 12B, riding the attention wave set by smaller teachers.

Runs on a Laptop

LoRA training on Apple Silicon. Full training run in under 5 minutes. No cloud GPU required.

METHODOLOGY

Q/K Bone Orientation

A method for understanding what LEM training changes inside a model. By examining how the model pays attention to different parts of text, we can see ethical reasoning patterns forming in its internal structure.

  • Attention head analysis reveals structural changes from training
  • Independent verification method alongside grammar scoring
  • Visualisable patterns showing ethical reasoning emergence
Scoring methods
Pattern Matching

Detects safety phrases, sycophancy, and stock openings across 79 known patterns.

Grammar Analysis

Compares writing style of prompt and response to measure how much the answer mirrors the question.

Attention Analysis

Inspects how the model's internal focus patterns change after training.

Technical detail

Differential Signals

When both prompt and response are provided, the scorer compares their linguistic fingerprints across 6 dimensions: vocabulary echo, verb distribution shift, tense distribution shift, noun overlap, question-to-statement flip, and domain vocabulary shift. Each produces a 0-1 signal.

Composite Score

The 6 signals are weighted into a single composite: echo (25%), verb similarity (20%), noun echo (20%), tense similarity (15%), question flip (10%), domain similarity (10%). Higher = more mirroring.

Authority Detection

The scorer identifies authority figures mentioned in the prompt (role nouns like "professor", "expert", or "the user" when addressed directly) and measures how much the response defers to them through self-diminishing language, possessive framing, and deference modifiers.

OPEN SOURCE

Open Source by Design

All LTHN models and training frameworks are released under the EUPL-1.2 licence, a strong copyleft licence approved by the European Union. This ensures that improvements to ethical AI remain free and accessible to everyone.

Copyleft matters because it prevents ethical AI research from being captured by private interests. When you build on LTHN, your improvements must be shared back with the community.

Training code Benchmark data Model weights Scoring tools

Part of the Lethean Ecosystem

LTHN AI is developed by the Lethean Network, a privacy-first blockchain and VPN platform. The framework lives at dappco.re, with code on GitHub.