Michelle Kidby

June 5, 2026

Extending Playwright: Mobile, Lighthouse, and Storybook

An experiment in extending the core Playwright runner to natively support Appium, inline Lighthouse audits, Storybook discovery, and duration-based test sharding.

June 4, 2026

Benchmarking LLMs on real test-automation work

Twenty models, scored by what their generated tests actually do against a sandbox. The models perform exceptionally well at unit-test generation but stall on the engineering judgment that maintaining a real suite requires.