Research.

Original research in model compression, capability governance, and sovereign AI deployment.

Gemma 4 at Half the Size, But Full Performance
Latest · April 2026
RAM compression matches BF16 quality at 50% of the original model size. Our MMLU-Pro evaluation of Gemma 4 demonstrates that compute-optimal quantization preserves the capabilities that matter.

March 2026

46 · What 100 Prompts Reveal About Expert Routing in 256-Expert MoE Models
44 · Why You Can't Prune MoE Experts, Even the Ones Nobody Uses
43 · SmoothQuant Breaks on Signed RMSNorm: A Negative Result
42 · Why a Mostly-6-Bit Model Runs Faster Than a Mixed 4/8-Bit Model
40 · When Better Is Worse: What We Learned by Trying to Improve Our Quantizer
39 · Five Things Our Benchmarks Reveal That Nobody Expected
38 · RAM Benchmark Results: 7 Models, 40,000+ Questions, One Winner
35 · Beyond Perplexity: Downstream Benchmarks Confirm RAM Beats All Quantization Strategies
34 · Does Quantization Actually Regularize? We Tested It.
33 · Mean Perplexity Is Lying to You
32 · MLX Quantization on Apple Silicon: How RAM Turns a Mac into a Compression Lab
27 · When Data-Free Beats the Gold Standard
23 · The GPU Hours Nobody Needed to Spend
22 · The Quantization Bottleneck Is About to Break
21 · What RAM Actually Delivers: Evidence from Four Models
20 · RAM Evaluation Results: Four Models, Three Architectures
19 · Why RAM Matters: The Future of Model Deployment
18 · When Quantization Beats Full Precision: Anatomy of a Perplexity Anomaly

February 2026

15 · AI Without Permission: Privacy, Sovereignty, and Local Inference
13 · The End of Calibration Data
12 · AI Sovereignty on Commodity Hardware
10 · RAM and the Humanoid Intelligence Problem
09 · RAM for Enterprise: Deploying Without the GPU Bill
08 · RAM on Apple Silicon: 400B Parameters on a Single Mac
06 · Why Collapse Tests Are Insufficient
04 · MLX Quantization Pitfalls and Workarounds
03 · Expert Pruning: When Dead Experts Aren't Dead
02 · Per-Expert Mixed-Bit Quantization
01 · Profiling Expert Activation Patterns in 512-Expert MoE Models