I built Ask Paul, a generative AI demo that answers front-end web dev questions using my content. It uses Polymath-AI to index my content, find related concepts via embedding vectors and cosine similarity, and generate summaries by querying OpenAI. The implementation has three parts: a UI, a Polymath Client, and a Polymath Host. It's super cool how accessible this tech is now!
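To make the retrieval step concrete, here's a minimal sketch of how a Polymath-style pipeline ranks indexed content against a question embedding using cosine similarity. The `ContentChunk` shape and `topMatches` helper are illustrative, not Polymath's actual API.

```ts
// Rank pre-computed content embeddings against a question embedding and keep
// the best matches to feed into the prompt sent to OpenAI.

interface ContentChunk {
  text: string;
  embedding: number[]; // produced ahead of time, e.g. via an embeddings API
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every indexed chunk and keep the top k for the prompt context.
function topMatches(query: number[], chunks: ContentChunk[], k = 5): ContentChunk[] {
  return chunks
    .map((chunk) => ({chunk, score: cosineSimilarity(query, chunk.embedding)}))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(({chunk}) => chunk);
}
```

The selected chunks are then concatenated into a prompt and sent to OpenAI's completion API to generate the final answer.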
I presented "Aiming for the Future" at Bangor University, exploring computing's evolution from the Difference Engine to the modern era, focusing on content/data delivery shifts. I proposed that Machine Learning, especially Generative AI, is the next major computing wave, akin to the Web's rise in the early 2000s, potentially mechanizing mental labor. The Student Expo showcased many final-year projects incorporating AI, from creative tools to practical problem-solving, indicating the growing importance of AI in various fields.
I created a Lighthouse audit that uses machine learning to detect whether an anchor tag looks like a button. This involved training a TensorFlowJS model, building a custom Lighthouse gatherer to capture high-resolution screenshots, and processing those screenshots to identify anchors styled as buttons. The audit highlights these anchors in the Lighthouse report. The code for the scraper, web app, and Lighthouse audit is available on GitHub. While there are edge cases, this project demonstrates the potential of using ML for visual inspection tasks in web development.
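For context, a custom Lighthouse audit is just a class with a static `meta` descriptor and an `audit` function that runs over gathered artifacts. Below is a rough sketch of that shape, assuming a hypothetical `AnchorElements` artifact produced by the custom gatherer and a hypothetical `classify` helper wrapping the TFJS model; it's not the exact code from the repo, and the import path and table-heading keys vary between Lighthouse versions.

```ts
import {Audit} from 'lighthouse';
import {classify} from './classifier.js'; // hypothetical wrapper around the TFJS model

class AnchorLooksLikeButton extends Audit {
  static get meta() {
    return {
      id: 'anchor-looks-like-button',
      title: 'Anchors do not look like buttons',
      failureTitle: 'Some anchors appear to be styled as buttons',
      description: 'Links styled as buttons can confuse users and assistive technology.',
      requiredArtifacts: ['AnchorElements'], // supplied by the custom gatherer
    };
  }

  static async audit(artifacts: any) {
    // Each artifact entry carries a cropped screenshot of one anchor element.
    const suspects: Array<{node: unknown}> = [];
    for (const anchor of artifacts.AnchorElements) {
      const label = await classify(anchor.imageData); // 'button' | 'link'
      if (label === 'button') suspects.push({node: anchor.node});
    }

    const headings = [{key: 'node', valueType: 'node', label: 'Anchor styled as a button'}];
    return {
      score: suspects.length === 0 ? 1 : 0,
      details: Audit.makeTableDetails(headings, suspects),
    };
  }
}

export default AnchorLooksLikeButton;
```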
I needed higher-resolution screenshots for an ML model to classify elements on a webpage, but the default Lighthouse screenshot was too compressed. So, I created a custom Lighthouse Gatherer using Puppeteer. This gatherer captures a full-page, high-resolution screenshot encoded as base64 and returns it along with the device pixel ratio. This was a fun little project, and the code is surprisingly concise. However, future Lighthouse versions may include higher-resolution screenshots, making this gatherer redundant.
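The core of the gatherer is a short Puppeteer capture step. Here's a minimal sketch of that step, assuming the gatherer has access to a Puppeteer `Page` handle (how you obtain it differs between Lighthouse versions):

```ts
import type {Page} from 'puppeteer';

interface ScreenshotArtifact {
  data: string;            // base64-encoded PNG
  devicePixelRatio: number;
}

async function captureFullPageScreenshot(page: Page): Promise<ScreenshotArtifact> {
  // Ask the browser for a full-page capture rather than the scaled-down
  // screenshot Lighthouse stores by default.
  const data = (await page.screenshot({
    fullPage: true,
    encoding: 'base64',
    type: 'png',
  })) as string;

  // The DPR is needed later to map element bounding boxes (CSS pixels) onto
  // screenshot coordinates (device pixels) when cropping elements.
  const devicePixelRatio = await page.evaluate(() => window.devicePixelRatio);

  return {data, devicePixelRatio};
}
```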
I built a web app using Deno, Fresh, and TensorFlowJS to classify images as links or buttons. The app uses a pre-trained ML model and allows users to drag and drop multiple images for classification. I encountered challenges with server-side rendering and islands, specifically with integrating a file-drop web component. I also documented the process of integrating the TensorFlowJS model, including model loading and prediction handling. The code is available on GitHub.
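The prediction handling boils down to a few TFJS calls in the browser. This is a minimal sketch, assuming the model is exported as a TFJS layers model served from a hypothetical `/model/model.json` path and expects 64×64 grayscale input; the real input shape and label order depend on the trained model.

```ts
import * as tf from '@tensorflow/tfjs';

let model: tf.LayersModel | null = null;

async function loadModel(): Promise<tf.LayersModel> {
  if (!model) {
    model = await tf.loadLayersModel('/model/model.json'); // hypothetical path
  }
  return model;
}

async function classifyImage(img: HTMLImageElement): Promise<'button' | 'link'> {
  const m = await loadModel();
  const scores = tf.tidy(() => {
    const pixels = tf.browser.fromPixels(img, 1);             // grayscale, matching training
    const resized = tf.image.resizeBilinear(pixels, [64, 64]); // assumed training resolution
    const input = resized.toFloat().div(255).expandDims(0);    // normalise, add batch dim
    return m.predict(input) as tf.Tensor;
  });
  const [buttonScore, linkScore] = await scores.data(); // assumed output order
  scores.dispose();
  return buttonScore > linkScore ? 'button' : 'link';
}
```

In the Fresh app this runs inside an island, since the drag-and-drop handling and TFJS both need the browser.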
I trained a machine learning model to differentiate between buttons and links on web pages. Using a dataset of ~3000 button images and ~4000 link images, I trained a convolutional neural network (CNN) with added noise for better generalization. Preprocessing included grayscale conversion, dataset diversification with multilingual sites, and image compression. The model performed well in initial tests, correctly classifying button-like and link-like elements. Next, I'll build a web app for easier testing and a Lighthouse audit for website analysis.
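For a sense of scale, a model like this can be expressed in a handful of TFJS layers. The sketch below is an assumed architecture in that spirit (conv/pool blocks, a Gaussian-noise layer for the added noise, a two-class softmax), not the exact configuration I trained.

```ts
import * as tf from '@tensorflow/tfjs';

function buildClassifier(inputSize = 64): tf.LayersModel {
  const model = tf.sequential();

  // Grayscale input; a little noise during training helps generalisation.
  model.add(tf.layers.gaussianNoise({stddev: 0.05, inputShape: [inputSize, inputSize, 1]}));

  model.add(tf.layers.conv2d({filters: 16, kernelSize: 3, activation: 'relu'}));
  model.add(tf.layers.maxPooling2d({poolSize: 2}));
  model.add(tf.layers.conv2d({filters: 32, kernelSize: 3, activation: 'relu'}));
  model.add(tf.layers.maxPooling2d({poolSize: 2}));

  model.add(tf.layers.flatten());
  model.add(tf.layers.dropout({rate: 0.25}));
  model.add(tf.layers.dense({units: 64, activation: 'relu'}));
  // Two classes: button vs. link.
  model.add(tf.layers.dense({units: 2, activation: 'softmax'}));

  model.compile({
    optimizer: 'adam',
    loss: 'categoricalCrossentropy',
    metrics: ['accuracy'],
  });
  return model;
}
```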
In this project, I'm working on an accessibility tool to detect links styled as buttons, a common issue that can confuse users. My approach involves scraping websites to gather images of buttons and links, and then training a machine learning model to distinguish between them. This post focuses on the scraping process using Puppeteer. I encountered challenges like occluded elements and smooth scrolling, which I addressed by checking for occlusion and disabling smooth scrolling. The next step is training the ML image classifier.
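Two of those scraping fixes are easy to show in isolation. Here's a sketch, assuming Puppeteer: inject a stylesheet to force instant scrolling, then use `document.elementFromPoint` to skip anchors whose centre is covered by another element. File names and selectors are illustrative.

```ts
import puppeteer from 'puppeteer';

async function scrapeAnchors(url: string) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(url, {waitUntil: 'networkidle2'});

  // Disable smooth scrolling so scrollIntoView completes immediately and the
  // screenshot crop matches the element's reported bounding box.
  await page.addStyleTag({content: '* { scroll-behavior: auto !important; }'});

  const anchors = await page.$$('a');
  for (const anchor of anchors) {
    const visible = await anchor.evaluate((el) => {
      el.scrollIntoView({block: 'center'});
      const rect = el.getBoundingClientRect();
      if (rect.width === 0 || rect.height === 0) return false;
      // Occlusion check: is the element (or a descendant) actually what sits
      // at its own centre point, or is something like a banner covering it?
      const hit = document.elementFromPoint(rect.left + rect.width / 2, rect.top + rect.height / 2);
      return hit !== null && (hit === el || el.contains(hit));
    });
    if (!visible) continue;

    // Crop a screenshot of just this element for the training set.
    await anchor.screenshot({path: `anchor-${Date.now()}.png`}).catch(() => {});
  }

  await browser.close();
}
```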