Articles by shad42
45

We decreased our LLM costs with Opus (mendral.com)

2

Multi-player agents don't fit in the sandbox (mendral.com)

1

We built our AI agent, for analyzing CI logs (mendral.com)

1

Same LLM, different agent: a CI debugger built on Claude (mendral.com)

3

Agent Harness: Inside vs. Outside the Sandbox (mendral.com)

1

Same LLM but different output: we built a CI specialist (mendral.com)

2

We upgraded our agent to Opus and our costs went down (mendral.com)

1

Same LLM, Different Agent: What Changes When You Specialize for CI (mendral.com)

1

We decreased our LLM costs by switching to Opus (mendral.com)

6

What CI looks like at a 100-person team (PostHog) (mendral.com)

1

We upgraded to a frontier model and our costs went down (mendral.com)

89

We gave terabytes of CI logs to an LLM (mendral.com)

1

Anatomy of a Production AI Agent (mendral.com)

2

What CI Looks Like at a 100-Person Team (mendral.com)

3

What CI looks like at a 100-person team (mendral.com)

3

Evals as Code: CI for LLMs with Dagger (dagger.io)