Morgin

Researching the edge of AI behavior.

We probe what open-weight models do when they leave the rails: manipulation transfer, euphemization, dishonesty under social pressure. We ship the benchmarks that catch each one.

Research areas

Behavioral Benchmarks

Manipulation transfer, persuasion realism, honesty under pressure. Evals for behaviors that don't show up on a leaderboard.

Pretrain Forensics

What open-weight pretrains are taught to flinch from, measured at the token level across labs and corpus generations.

Refusal-Direction Edits

Abliteration, Heretic, Obliteratus: what these tools actually change about a model's behavior, beyond removing the refusal text.

Recent publications

Color illustration of a gagged puppet-like figure feeding censored text into a typewriter

April 2026 · Pretrain Forensics

Benchmark library

Recent publications

Even 'Uncensored' Models Can't Say What They Want

EpsteinBench: We Brought Epstein's Voice Back. We Got More Than We Wanted.

Abliteration vs Heretic vs Obliteratus: one trick, three layers of tooling

Benchmark library

EuphemismBench

PersuasionForGood Transfer Check

WYDIB