477
1
Get 2 months of Codex for your enterprise, free (openai.com)
2
Tau-knowledge: benchmarking agents on real-world knowledge (sierra.ai)
1
Mythos for Offensive Security: XBOW's Evaluation (xbow.com)
3
Why SWE-bench Verified no longer measures frontier coding capabilities (openai.com)
2
METR estimates that GPT-5.2 has a 50%-time-horizon of around 6.6 hrs (twitter.com/metr_evals)
30
GPT-5.1 for Developers (openai.com)
246
GPT-5.1: A smarter, more conversational ChatGPT (openai.com)
6
OpenAI reasoning system scores 12/12 at the 2025 ICPC World Finals (twitter.com/mostafarohani)
12
ChatGPT Sent Me to the ER (benorenstein.substack.com)
4
Google faked Gemini AI output in Super Bowl ad (theverge.com)
459
Stargate Project: SoftBank, OpenAI, Oracle, MGX to build data centers (apnews.com)
1
Evaluating frontier AI R&D capabilities of LLM agents against human experts (metr.org)
1
The Zipper Merge (2017) (tedsanders.com)
1
What Makes Documentation Good (cookbook.openai.com)
1
Apple is putting ChatGPT in Siri for free later this year (theverge.com)
1
What Does the Public in Six Countries Think of Generative AI in News? [pdf] (ox.ac.uk)
2
What Makes Documentation Good (cookbook.openai.com)
1
Roots of Disagreement on AI Risk (forecastingresearch.org)
192
How many legs do ten elephants have, if two of them are legless? (bard.google.com)
90
Is the reversal curse in LLMs real? (andrewmayne.com)
1
I came second out of 999 in the Salem Center prediction market tournament (polybdenum.com)
2
Robotaxis Will Go Through $100B in Losses to Reach Profitable Scale (nextbigfuture.com)
1
Transformative AGI by 2043 is <1% likely (effectivealtruism.org)
3
Transformative AGI by 2043 is <1% likely (arxiv.org)
1
Question answering using embeddings-based search (OpenAI Cookbook) (github.com/openai)
80