PRJ-005 Active
AI inference cost curves, 2022–2025
Tracking token pricing, hardware generations, and efficiency gains across major providers. Trying to find the inflection point in cost per token.
PRJ-006 WIP
Reasoning model benchmarks — a skeptic's read
Dissecting ARC-AGI, MMMU, and GPQA scores. What they actually measure, what they don't, and where the goalposts are moving.
PRJ-007 WIP
Mixture-of-experts routing — sparse vs dense tradeoffs
Reading through the Mixtral and Switch Transformer papers. Notes on expert collapse, load balancing, and what this means for inference cost.
PRJ-008 Dormant
On-device inference — the 7B sweet spot
Benchmarking quantized models on M-series and Snapdragon X. Latency, memory pressure, and what tasks are actually viable offline.