tedsanders - Hacker News

353

Advancing the price-performance frontier with GPT‑5.6 (openai.com)

3 hours ago tedsanders openai.com

10

Enabling two settings tripled our scores on the ARC-AGI-3 benchmark (openai.com)

13 hours ago tedsanders openai.com

2

Codex / ChatGPT Work has reached 8M active users (twitter.com/thsottiaux)

2 weeks ago tedsanders twitter.com

522

An OpenAI model has disproved a central conjecture in discrete geometry (openai.com)

2 months ago tedsanders openai.com

1

Get 2 months of Codex for your enterprise, free (openai.com)

2 months ago tedsanders openai.com

2

Tau-knowledge: benchmarking agents on real-world knowledge (sierra.ai)

2 months ago tedsanders sierra.ai

1

Mythos for Offensive Security: XBOW's Evaluation (xbow.com)

2 months ago tedsanders xbow.com

3

Why SWE-bench Verified no longer measures frontier coding capabilities (openai.com)

5 months ago tedsanders openai.com

2

METR estimates that GPT-5.2 has a 50%-time-horizon of around 6.6 hrs (twitter.com/metr_evals)

5 months ago tedsanders twitter.com

30

GPT-5.1 for Developers (openai.com)

8 months ago tedsanders openai.com

246

GPT-5.1: A smarter, more conversational ChatGPT (openai.com)

8 months ago tedsanders openai.com

6

OpenAI reasoning system scores 12/12 at the 2025 ICPC World Finals (twitter.com/mostafarohani)

10 months ago tedsanders twitter.com

12

ChatGPT Sent Me to the ER (benorenstein.substack.com)

10 months ago tedsanders substack.com

4

Google faked Gemini AI output in Super Bowl ad (theverge.com)

a year ago tedsanders theverge.com

459

Stargate Project: SoftBank, OpenAI, Oracle, MGX to build data centers (apnews.com)

a year ago tedsanders apnews.com

1

Evaluating frontier AI R&D capabilities of LLM agents against human experts (metr.org)

a year ago tedsanders metr.org

1

The Zipper Merge (2017) (tedsanders.com)

a year ago tedsanders tedsanders.com

1

What Makes Documentation Good (cookbook.openai.com)

a year ago tedsanders openai.com

1

Apple is putting ChatGPT in Siri for free later this year (theverge.com)

2 years ago tedsanders theverge.com

1

What Does the Public in Six Countries Think of Generative AI in News? [pdf] (ox.ac.uk)

2 years ago tedsanders ox.ac.uk

2

What Makes Documentation Good (cookbook.openai.com)

2 years ago tedsanders openai.com

1

Roots of Disagreement on AI Risk (forecastingresearch.org)

2 years ago tedsanders forecastingresearch.org

192

How many legs do ten elephants have, if two of them are legless? (bard.google.com)

2 years ago tedsanders google.com

90

Is the reversal curse in LLMs real? (andrewmayne.com)

2 years ago tedsanders andrewmayne.com

1

I came second out of 999 in the Salem Center prediction market tournament (polybdenum.com)

2 years ago tedsanders polybdenum.com

2

Robotaxis Will Go Through $100B in Losses to Reach Profitable Scale (nextbigfuture.com)

2 years ago tedsanders nextbigfuture.com

1

Transformative AGI by 2043 is <1% likely (effectivealtruism.org)

3 years ago tedsanders effectivealtruism.org

3

Transformative AGI by 2043 is <1% likely (arxiv.org)

3 years ago tedsanders arxiv.org

1

Question answering using embeddings-based search (OpenAI Cookbook) (github.com/openai)

3 years ago tedsanders github.com

80

Techniques to improve reliability (github.com/openai)

3 years ago tedsanders github.com