14
1
Continual learning and the post monolith AI era (baseten.co)
8
Why is everything so ugly? (nplusonemag.com)
1
DoorDash and Other Food Delivery Apps Are Reshaping Mealtime (nytimes.com)
1
Future leakage in block-quantized attention (matx.com)
2
The Normalization of Deviance in AI (embracethered.com)
4
Bertrand Russell on Apricots (1935) (peterhousehold.blogspot.com)
1
Clawdbot Remembers Everything (twitter.com/manthanguptaa)
39
Exactitude in Science – Borges (1946) [pdf] (kwarc.info)
2
Learning with LLMs (jwuphysics.github.io)
26
Slouching Towards Bethlehem – Joan Didion (1967) (saturdayeveningpost.com)
2
LLM architecture has evolved from GPT-2 to GPT-OSS (2025) (modal.com)
11
Weight Transfer for RL Post-Training in under 2 seconds (perplexity.ai)
16
Keeping 20k GPUs healthy (modal.com)
1
The State of LLM Serving in 2026: Ollama, SGLang, TensorRT, Triton, and vLLM (thecanteenapp.com)
1
What Is Arithmetic Bandwidth? (modal.com)
4
GPU memory snapshots: sub-second startup (2025) (modal.com)
2
Teenygrad (github.com/tinygrad)
1
Meaning in Large Language Models: Form vs. Function (jgehring.net)
1
Shape Suffixes – Good Coding Style (medium.com/noamshazeer)
4
Building Internal Agents (lethain.com)
1
How to Use LLM as a Judge (Without Getting Burned) (twitter.com/manthanguptaa)
42
[flagged] FTX whistleblower Caroline Ellison set for early release next month (invezz.com)
3
Tips for Writing a Technical Book (borischerny.com)
1
Information, complexity, brains and reality (Kolmogorov Manifesto) (2007) (arxiv.org)
4
Dinosaur Food: 100M year old foods we still eat today (borischerny.com)
50
Two kinds of vibe coding (davidbau.com)
2
Statistical Learning Theory and ChatGPT (kamalikachaudhuri.substack.com)
7
I ran out of money, spent my savings on a Hong Kong prostitute, & became a commie (docs.google.com)
5
Why are your models so big? (2023) (pawa.lt)
1
[dupe] The Decline of Deviance (experimental-history.com)
87
Sycophancy is the first LLM "dark pattern" (seangoedecke.com)
2
Contextualization Machines (stochasm.blog)
1
ChaCha has all the answers – unless I'm on the other end (2009) (archive.org)
8
What I don’t like about chains of thoughts (2023) (samsja.github.io)
1
Reframing Impact (turntrout.com)
1
Continuous Batching from First Principles (huggingface.co)
1
Solving Kilordle (hauntsaninja.github.io)
1
Kilordle (jonesnxt.github.io)
1
What makes good reasoning data (huggingface.co)
2
Evaluating the Effectiveness of LLM-Evaluators (a.k.a. LLM-as-Judge) (eugeneyan.com)
1
Compute Forecast (AI 2027) (ai-2027.com)
1
A Realistic AI Timeline (vintagedata.org)
2
Biotech companies I wish existed (eladgil.com)
1
A World of Verifiable Domains (seancai.com)
4
Learning to Model the World with Language (dynalang.github.io)
2
A hitchhiker's guide to CUDA programming (seanzhang.me)
46
Estimating the perceived 'claustrophobia' of New York City's streets (2024) (mfranchi.net)
166
Tinkering is a way to acquire good taste (seated.ro)
2
Modern LLM Training (A Summary) (lesswrong.com)
2
Yes it's just doing compression. No it's not the diss you think it is (blog.wtf.sg)
2
Good developer relations is about being a celebrity for dorks (pfiffer.org)
2
Prompt Baking (arxiv.org)
1
Offline "Studying" Shrinks the Cost of Contextually Aware AI (stanford.edu)
5
The State of Machine Learning Frameworks in 2019 (thegradient.pub)
1
Many AI Safety Orgs Have Tried to Criminalize Open-Source AI (2024) (1a3orn.com)
1
Neural Networks and Deep Learning (neuralnetworksanddeeplearning.com)
105
America's future could hinge on whether AI slightly disappoints (noahpinion.blog)
3
Self-Respect (By Joan Didion) (1961) (gatech.edu)
12
Read your way through Hà Nội (vietnamesetypography.com)
59
How hard do you have to hit a chicken to cook it? (2020) (james-simon.github.io)
2
Start a Blog (guzey.com)
1
Computable Babylonian Diaries Project (christopherwolfram.com)
1
Survival of the Best Fit (survivalofthebestfit.com)
3
Breath of the Wild Decompilation (botw.link)
2
The Politics of Contagion (emilybynight.com)
8
A PhD in Snapshots (rbharath.github.io)
31
Memory access is O(N^[1/3]) (vitalik.eth.limo)
1
Highrises (hythacg.com)
30
How does gradient descent work? (centralflows.github.io)
3
Small Products That Improved My Life (moultano.wordpress.com)
1
Whispers of A.I.'s Modular Future (2023) (newyorker.com)
1
An Age of AI Enlightenment (xiangfu.co)
1
A vision researcher's guide to some RL stuff: PPO and GRPO (yugeten.github.io)
3
LLMs are strangely-shaped tools (near.blog)
1
Learned Structures (nonint.com)
1
LoRA-XS: Low-Rank Adaptation with Small Number of Parameters (arxiv.org)
8
Evals in 2025: going beyond simple benchmarks to build models people can use (github.com/huggingface)
1
Dissecting Batching Effects in GPT Inference (qun.ch)
2
My (speculative) master plan for immortality (maxwellnye.com)
3
Richard Feynman and the Connection Machine (1989) (longnow.org)
98
Defeating Nondeterminism in LLM Inference (thinkingmachines.ai)
12
Perceived Age (2024) (sdan.io)
2
Don't Build an RL Environment Startup (benanderson.work)
1
Shifting Bits in Company History (williamyeny.github.io)
1
ML Systems: Motivating Dense Models (jacobkahn.me)
4
The "it" in AI models is the dataset (nonint.com)
1
The Paradigm (nonint.com)
1
Personalization, measuring with taste, and intrinsic interfaces (thesephist.com)
3
Long Term Memory in AI (Princeton CS 597A) (edoliberty.github.io)
1
Model Merging – A Biased Overview (crisostomi.github.io)
1
Adversarial Examples Are Not Bugs, They Are Superposition (livgorton.com)
1
Sequence Parallelism: Long Sequence Training from System Perspective (2021) (arxiv.org)
10
How many paths of length K are there between A and B? (2021) (horace.io)
2
How A Neuron Learns (rvns.moe)
1
GPT, Fast (pytorch.org)
1
GPT-Fast (github.com/meta-pytorch)
10
Exploring EXIF (2023) (hturan.com)
1
The Practitioner's Guide to the Maximal Update Parameterization (cerebras.ai)
1