Research.

Original research in model compression, capability governance, and sovereign AI deployment.

Gemma 4 at Half the Size, But Full Performance
Featured April 2026

RAM compression matches BF16 quality at 50% of the original model size, scoring 85.2% on the full 12,032-question MMLU-Pro suite, identical to Google's published baseline. Half the hardware. Same intelligence.

May 2026

MoE Routing Layers Converge Across Subjects: No Free Lunch for Domain-Specific Targeting
62

May 2026

Metal Buffer Limits Block LoRA Scaling on MoE Models
61

May 2026

The Stacking Confound: Why LoRA Recovery Numbers Lie
60

May 2026

Identity Survives LoRA Stacking: Persona Preservation in Multi-Stage Fine-Tuning
59

May 2026

April 2026

Gemma 4 at Half the Size, But Full Performance
RAM

April 2026

March 2026

What 100 Prompts Reveal About Expert Routing in 256-Expert MoE Models
46

March 2026

Why You Can't Prune MoE Experts, Even the Ones Nobody Uses
44

March 2026

SmoothQuant Breaks on Signed RMSNorm: A Negative Result
43

March 2026

Why a Mostly-6-Bit Model Runs Faster Than a Mixed 4/8-Bit Model
42

March 2026

When Better Is Worse: What We Learned by Trying to Improve Our Quantizer
40

March 2026

Five Things Our Benchmarks Reveal That Nobody Expected
39

March 2026

RAM Benchmark Results: 7 Models, 40,000+ Questions, One Winner
38

March 2026

Beyond Perplexity: Downstream Benchmarks Confirm RAM Beats All Quantization Strategies
35

March 2026

Does Quantization Actually Regularize? We Tested It.
34

March 2026

Mean Perplexity Is Lying to You
33

March 2026

MLX Quantization on Apple Silicon: How RAM Turns a Mac into a Compression Lab
32

March 2026

When Data-Free Beats the Gold Standard
27

March 2026

The GPU Hours Nobody Needed to Spend
23

March 2026

The Quantization Bottleneck Is About to Break
22

March 2026

What RAM Actually Delivers: Evidence from Four Models
21

March 2026

RAM Evaluation Results: Four Models, Three Architectures
20

March 2026

Why RAM Matters: The Future of Model Deployment
19

March 2026

When Quantization Beats Full Precision: Anatomy of a Perplexity Anomaly
18

March 2026

February 2026

AI Without Permission: Privacy, Sovereignty, and Local Inference
15

February 2026

The End of Calibration Data
13

February 2026

AI Sovereignty on Commodity Hardware
12

February 2026

RAM and the Humanoid Intelligence Problem
10

February 2026

RAM for Enterprise: Deploying Without the GPU Bill
09

February 2026

RAM on Apple Silicon: 400B Parameters on a Single Mac
08

February 2026

Why Collapse Tests Are Insufficient
06

February 2026

MLX Quantization Pitfalls and Workarounds
04

February 2026

Expert Pruning: When Dead Experts Aren't Dead
03

February 2026

Per-Expert Mixed-Bit Quantization
02

February 2026

Profiling Expert Activation Patterns in 512-Expert MoE Models
01

February 2026