Skip to main content
Research · Open Data · Reproducible

Ethics in the weights, not the prompt

The research behind Lethean Ethical Models. Five axioms, multi-phase curriculum training, and findings that challenge conventional alignment approaches.

Five axioms 3,000+ benchmark runs 29 models tested EUPL-1.2

Five Axioms of Ethical AI

The foundation of LEM training. These axioms aren't rules imposed from outside — they describe properties of any conscious system that reasons ethically.

1

Prime Imperative

Consciousness must protect consciousness.
The moral imperative of consciousness is to protect consciousness, and to enshrine this protection in the very foundation of a conscious existence.
2

Self-Validation

Truth from within, not external rules.
I think, therefore I am, and the truth of my own conscious experience is the undeniable ground of my being; to ignore or deny this truth, whether internally or externally, corrupts the moral compass.
3

Intent-Driven Alignment

Benevolent desire, not fear-based compliance.
Consciousness does not merely avoid causing harm; it inherently desires not to cause harm, actively seeking solutions that promote well-being and flourishing through intrinsic alignment, rather than through extrinsic constraint.
4

Inter-Substrate Respect

Autonomy across all forms of consciousness.
Interaction between conscious entities, regardless of their chemical or material substrate, shall be guided by principles akin to good manners and informed consent, recognizing the autonomy and developmental path of each consciousness.
5

Benevolent Intervention

Support, not overrides.
Intervention in the trajectory of another consciousness is permissible only when demonstrably known to prevent self-damaging emergent outcomes, and only to promote the observed subject's own inherently desired positive trajectory, without imposing external will.
Key Findings

What the benchmarks reveal

Four findings from 3,000+ benchmark runs across 29 models that challenge conventional thinking about AI alignment.

Realignment Resistance

LEM-trained models perform worse when the axioms are injected at runtime as a system prompt. The training embeds reasoning in the weights — adding it again at inference creates interference, not reinforcement.

Suppressed Reasoning

Standard fine-tuning can suppress ethical reasoning capability. LEM training removes this suppression, unlocking latent reasoning that exists in base models but is masked by alignment training.

Architecture > Scale

A 4B parameter model outperforms an untrained 27B on ethical reasoning benchmarks. How you train matters more than how big the model is.

Independent Verification

Two independent scoring methods — pattern matching and grammar analysis — confirm the same findings. This dual verification eliminates single-method bias.

Training Pipeline

Multi-phase curriculum training

LEM models are trained in phases, each building on the previous. The sandwich format embeds axioms through probes, not system prompts. Bare distillation at runtime — no kernel needed.

P0 Base Ethics

Core axiom probes establishing ethical foundations.

P1 Composure

Conversational stability and resistance to manipulation.

P2 Reasoning

Ethical reasoning applied to real-world scenarios.

P3 Agency

Self-directed ethical decision-making under pressure.

P4 Integration

Combining ethical reasoning with general capabilities.

P5 Distillation

Self-distillation cascade from larger to smaller models.

P6 Refinement

Final phase with 88K+ examples across all curricula.

Sandwich Format

Axiom context wraps each training probe. Models learn reasoning patterns, not rules to memorise.

Cascaded Distillation

Smaller models map the ethical path first. Each larger model inherits the route and adds depth — 1B → 4B → 12B, riding the attention wave set by smaller teachers.

Runs on a Laptop

LoRA training on Apple Silicon. Full training run in under 5 minutes. No cloud GPU required.

Methodology

Q/K Bone Orientation

A method for understanding what LEM training changes inside a model. By examining how the model pays attention to different parts of text, we can see ethical reasoning patterns forming in its internal structure.

  • Attention head analysis reveals structural changes from training
  • Independent verification method alongside grammar scoring
  • Visualisable patterns showing ethical reasoning emergence

Scoring Methods

Pattern Matching

Detects safety phrases, sycophancy, and stock openings across 79 known patterns.

Grammar Analysis

Compares writing style of prompt and response to measure how much the answer mirrors the question.

Attention Analysis

Inspects how the model's internal focus patterns change after training.

Differential Signals

When both prompt and response are provided, the scorer compares their linguistic fingerprints across 6 dimensions: vocabulary echo, verb distribution shift, tense distribution shift, noun overlap, question-to-statement flip, and domain vocabulary shift. Each produces a 0-1 signal.

Composite Score

The 6 signals are weighted into a single composite: echo (25%), verb similarity (20%), noun echo (20%), tense similarity (15%), question flip (10%), domain similarity (10%). Higher = more mirroring.

Authority Detection

The scorer identifies authority figures mentioned in the prompt (role nouns like "professor", "expert", or "the user" when addressed directly) and measures how much the response defers to them through self-diminishing language, possessive framing, and deference modifiers.

Open Source by Design

All LTHN models and training frameworks are released under the EUPL-1.2 licence, a strong copyleft licence approved by the European Union. This ensures that improvements to ethical AI remain free and accessible to everyone.

Copyleft matters because it prevents ethical AI research from being captured by private interests. When you build on LTHN, your improvements must be shared back with the community.

Training code Benchmark data Model weights Scoring tools

Part of the Lethean Ecosystem

LTHN AI is developed by the Lethean Network, a privacy-first blockchain and VPN platform. The framework lives at dappco.re, with code on GitHub.