Articles by shahules
10

PA Bench: Evaluating web agents on real-world personal assistant workflows (vibrantlabs.com)

6

PA Bench: Evaluating Frontier Models on Multi-Tab PA Tasks (vibrantlabs.com)

30

Show HN: Ragas – Open-source library for evaluating RAG pipelines (github.com/explodinggradients)

22

Show HN: Ragas – Open-source library for evals and testing RAG systems (github.com/explodinggradients)

4

Show HN: The rise of open source large language models (explodinggradients.com)

1

Show HN: GPT4 vs. GPT3: What you should know (explodinggradients.com)

1

Show HN: Open-source alternative to Adobe speech enhancer (github.com/shahules786)