All Posts
Browse all blog posts by year and month
2026 6
March 1
- What Is Design of Experiments? Learning It Through a Better Cup of Chai
  8 min read. Using the perfect cup of chai to understand the fundamentals of Design of Experiments (DOE).
February 4
- How2Bench: A Guideline for Benchmark Development
  7 min read. A breakdown of the How2Bench paper, which advocates rigor in benchmark development, with a focus on evaluation reliability and reproducibility.
- An AI survival guide
  25 min read. Some advice and resources I have found helpful so far as a junior AI researcher.
- Two different philosophies of giving an agent hands
  7 min read. A comparison of the CLI and MCP approaches to giving AI agents the ability to interact with systems.
- One place, two views: the core idea behind GeoReasoner
  15 min read. A breakdown of the GeoReasoner paper, which leverages both linguistic and geospatial information to reason on geospatially grounded natural language.
January 1
- Building FineWeb-Legal: A 10B Token Pilot
  2 min read. How I extracted 67 million words of legal text from 10B tokens of web data using heuristics and classifiers.
2025 1
September 1
- Can we turn agent-based models into empathetic stories (without getting poetic)?
  5 min read. We test whether GPT-4 can translate agents’ simulated lives into readable, empathetic narratives, and show that style transfer beats ‘please be empathetic’ prompts.
2024 1
October 1
- Can LLMs learn conceptual modeling from slide decks?
  4 min read. Our new study asks whether LLMs can learn enough conceptual modeling to pass graduate quizzes using the same course materials as students.