All Posts
Browse all blog posts by year and month
2026 11
June 2
-
OSM data analysis for landuse
Published: • 20 min read • Analyzing OpenStreetMap keys and tags to surface what's relevant to landuse, from raw counts down to clustered key families.
-
You should not take Hugging Face language tags at face value
Published: • 3 min read • A short look at why Hugging Face language tags can be useful while still requiring manual investigation.
May 1
-
A small milestone for our empathy and simulation paper
Published: • 1 min read • Our Journal of Simulation paper on LLMs, agent-based models, and empathetic decision-making was selected as the Editor's Pick.
April 2
-
Distilling Agent-Based Models into Textual Explanations via LLMs
Published: • 3 min read • A look at our new research on using LLMs to turn complex ABM simulations into clear textual explanations.
-
I am joining the EVERGREEN research team
Published: • 5 min read • Next month, I will be joining INRIA in Montpellier, within the EVERGREEN research team as a research engineer.
March 1
-
What Is Design of Experiments? Learning It Through a Better Cup of Chai
Published: • 8 min read • Using the perfect cup of chai to understand the fundamentals of Design of Experiments (DOE).
February 4
-
How2Bench: A Guideline for Benchmark Development
Published: • 7 min read • A breakdown of the How2Bench paper, which advocates rigor in benchmark development, with a focus on evaluation reliability and reproducibility.
-
An AI survival guide
Published: • 25 min read • Some advice and resources I have found helpful so far as a junior AI researcher.
-
Two different philosophies of giving an agent hands
Published: • 7 min read • A comparison between CLI and MCP approaches for giving AI agents capabilities to interact with systems.
-
One place, two views: the core idea behind GeoReasoner
Published: • 15 min read • A breakdown of the GeoReasoner paper, which leverages both linguistic and geospatial information to reason on geospatially grounded natural language.
January 1
-
Building FineWeb-Legal: A 10B Token Pilot
Published: • 2 min read • How I extracted 67 million words of legal text from 10B tokens of web data using heuristics and classifiers.
2025 1
September 1
-
Can we turn agent-based models into empathetic stories (without getting poetic)?
Published: • 5 min read • We test whether GPT-4 can translate agents’ simulated lives into readable, empathetic narratives, and show that style transfer beats ‘please be empathetic’ prompts.
2024 1
October 1
-
Can LLMs learn conceptual modeling from slide decks?
Published: • 4 min read • Our new study asks if LLMs can learn enough conceptual modeling to pass graduate quizzes by using the same course materials as students.