I trained a machine learning model to differentiate between buttons and links on web pages. Using a dataset of ~3000 button images and ~4000 link images, I trained a convolutional neural network (CNN) with added noise for better generalization. Preprocessing included grayscale conversion, dataset diversification with multilingual sites, and image compression. The model performed well in initial tests, correctly classifying button-like and link-like elements. Next, I'll build a web app for easier testing and a Lighthouse audit for website analysis.
I've created Puppeteer Go, a small JavaScript library to simplify the process of creating CLI utilities with Puppeteer. It handles the boilerplate of launching the browser, opening a tab, navigating to a URL, performing a specified action, and cleaning up. This post demonstrates its usage by taking multiple screenshots of elements on a page, inspired by Ire Aderinokun's work. Examples include capturing screenshots of h1 elements on my blog and feature blocks on caniuse.com.