Cory - Chronic Tinkering

Senior Engineering Leader | Chronic Tinkerer | Veteran of the Before Times

SWE-bench: LLM Developer Metrics vs. Hype

Development Featured

TL;DR: SWE-bench evaluates current LLMs' ability to complete common developer tasks and shows that, even with task-specific agents, today's LLMs are at best ~12.5% successful. We need more benchmarks like this one. Real-world benchmarks like these are critical for engineering leaders to understand so we can

Field Notes: Real Unreal Landscapes

Prologue: One constant challenge of tinkering with a bunch of different topics is remembering what I read, where I found it, where I was in the process, and all the blind alleys that wasted an evening. A lot of tasks only need to be done once or twice for a

[Image: A seriously failed Marlin firmware update showing broken emojis on a blue LCD screen]

It's Not Me, It's You—or Screwing up With Unreal Engine

Upon being told I was no longer needed at my job of 19 years, I did what any reasonable adult and parent of small children would do. I installed an Unreal development environment and wrote a blog post about it.