I created a Lighthouse audit that uses machine learning to detect whether an anchor tag looks like a button. This involved training a TensorFlow.js model, building a custom Lighthouse gatherer to capture high-resolution screenshots, and processing those screenshots to identify anchors styled as buttons. The audit highlights these anchors in the Lighthouse report. The code for the scraper, web app, and Lighthouse audit is available on GitHub. While there are edge cases, this project demonstrates the potential of using ML for visual inspection tasks in web development.
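To give a flavour of the gatherer side, something along these lines captures a full-page screenshot over the DevTools Protocol. This is a minimal sketch against Lighthouse's legacy Gatherer API, not the audit's actual code; the class name and artifact shape are my own:

```js
// high-res-screenshot-gatherer.js — illustrative names throughout.
const Gatherer = require('lighthouse').Gatherer;

class HighResScreenshot extends Gatherer {
  // Runs once the page has finished loading; the return value becomes
  // an artifact that the custom audit can consume.
  async afterPass(passContext) {
    const {data} = await passContext.driver.sendCommand(
      'Page.captureScreenshot',
      {format: 'png', captureBeyondViewport: true}
    );
    return {screenshot: `data:image/png;base64,${data}`};
  }
}

module.exports = HighResScreenshot;
```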
In this project, I'm working on an accessibility tool to detect links styled as buttons, a common pattern that can confuse users. My approach is to scrape websites to gather images of buttons and links, then train a machine learning model to distinguish between them. This post focuses on the scraping process using Puppeteer. I ran into challenges such as occluded elements and smooth scrolling; I addressed these by hit-testing each element for occlusion before capturing it and by disabling smooth scrolling, as sketched below. The next step is training the ML image classifier.
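Here is a minimal sketch of those two workarounds in Puppeteer, under my own assumptions: the occlusion test hit-tests the element's centre with `elementFromPoint`, and an injected stylesheet turns smooth scrolling off. The URL and selector are placeholders:

```js
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com'); // placeholder URL

  // Disable smooth scrolling so screenshots aren't taken mid-animation.
  await page.addStyleTag({content: '* { scroll-behavior: auto !important; }'});

  const link = await page.$('a.btn'); // placeholder selector
  await link.evaluate((el) => el.scrollIntoView({block: 'center'}));

  // Treat the element as occluded if the node at its centre point
  // is neither the element itself nor one of its descendants.
  const visible = await link.evaluate((el) => {
    const rect = el.getBoundingClientRect();
    const hit = document.elementFromPoint(
      rect.x + rect.width / 2,
      rect.y + rect.height / 2
    );
    return el === hit || el.contains(hit);
  });

  if (visible) await link.screenshot({path: 'anchor.png'});
  await browser.close();
})();
```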
This blog post explores how machine learning (ML) can enhance the developer experience. Inspired by Corridor Crew's use of ML in VFX, I initially brainstormed ways ML could automate tedious developer tasks, like accessibility improvements and performance optimization. I also considered ML's potential for generating layouts and images. The emergence of tools like GitHub Copilot and DALL·E 2 significantly shifted my thinking, especially about the future of software development and my role as a DevRel lead. Ultimately, the transformative power of ChatGPT, demonstrated through its ability to generate webpage layouts and populate them with images from simple prompts, left me questioning the future of my profession and considering the role I might play in training the next generation of AI tools.
I added dark mode to my blog! Inspired by Jeremy Keith, I used CSS custom properties and media queries to switch between light and dark themes based on the user's preference. I also included a fallback for browsers that don't support custom properties and a temporary CSS class for testing since Chrome DevTools didn't yet have dark mode emulation.
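The gist of the technique, as a minimal sketch (my own property names and colours, not the blog's actual palette): plain declarations act as the fallback, custom properties define each theme, and a `prefers-color-scheme` media query flips the values.

```css
/* Fallback for browsers without custom property support. */
body {
  background-color: #fff;
  color: #222;
}

/* Light theme defaults. */
:root {
  --background: #fff;
  --text: #222;
}

/* Flip the values when the OS asks for a dark theme. A temporary class
   on the root element (name hypothetical) mirrored these overrides for
   testing before DevTools gained dark-mode emulation. */
@media (prefers-color-scheme: dark) {
  :root {
    --background: #222;
    --text: #eee;
  }
}

body {
  background-color: var(--background);
  color: var(--text);
}
```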
During a trip to Llangollen, I noticed that the historical information on local signs wasn't available online. This sparked an idea to make such information accessible on the web, especially for those with reading difficulties. I experimented with my existing image text extraction tool and found it works surprisingly well on these types of images. I'm now considering creating a website dedicated to archiving and indexing the text from informational signs, inspired by Google's Navlekhā project which helps offline Indian publishers digitize their content.
Inspired by a recent trip to India and the emphasis on local language content, I developed a script to translate my Hugo-based blog using Google Cloud Translate. This script processes markdown files, handles code blocks and pull quotes, and outputs translated versions, expanding the potential reach of my content to non-English speakers. While machine translation isn't perfect, the goal is to improve content discoverability and accessibility for a wider audience. I'll share results as data becomes available.
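A trimmed sketch of the approach, assuming the v2 Node.js client for Cloud Translation; front matter, pull quotes, and error handling are elided, and the file path and target language are placeholders:

```js
const fs = require('fs');
const {Translate} = require('@google-cloud/translate').v2;

// Auth comes from GOOGLE_APPLICATION_CREDENTIALS in the environment.
const translate = new Translate();

// Translate a markdown file, passing fenced code blocks through untouched.
async function translateMarkdown(path, target) {
  const source = fs.readFileSync(path, 'utf8');
  // Split on fenced code blocks so only prose reaches the API.
  const segments = source.split(/(```[\s\S]*?```)/g);
  const out = [];
  for (const segment of segments) {
    if (segment.startsWith('```') || segment.trim() === '') {
      out.push(segment); // keep code and whitespace verbatim
    } else {
      const [translated] = await translate.translate(segment, target);
      out.push(translated);
    }
  }
  fs.writeFileSync(path.replace(/\.md$/, `.${target}.md`), out.join(''));
}

translateMarkdown('content/posts/example.md', 'hi').catch(console.error);
```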
We rebuilt Pinterest's mobile web experience as a PWA and the results after one year have exceeded our expectations. Weekly active users on mobile web have increased 103% year-over-year, with even higher growth in Brazil (156%) and India (312%). Engagement metrics also saw incredible growth: session length (+296%), Pins seen (+401%), and Pin saves (+295%). Perhaps most importantly, logins increased by 370% and new signups by a staggering 843% year-over-year, making mobile web our top platform for new signups. We've seen 800,000 weekly users add the PWA to their homescreen in under 6 months. Beyond performance, this new platform supports right-to-left languages and night mode, making it more accessible. We're proud of this user experience and excited to continue building on this foundation.
I'm excited to share the latest addition to the Shape Detection API: the Text Detection API! This API allows you to detect text within images in real-time, right in the browser. It's still experimental and currently works on Chrome Canary for Android, but it opens up amazing possibilities. Imagine real-time translation, assistive technologies for parsing image content, or even grabbing URLs from slides at conferences. I've built a demo where the API detects text, draws a box around it, and reads it aloud when clicked. Check out the code and demo to experiment yourself. I can't wait to see what you build with this!
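In spirit, usage looks like this: a minimal sketch assuming an experimental build with the API available. The image element is a placeholder, and the speech synthesis call mirrors how the demo reads detections aloud:

```js
// TextDetector only exists in experimental builds, so feature-detect first.
if ('TextDetector' in window) {
  const detector = new TextDetector();
  const image = document.querySelector('#photo'); // placeholder element

  detector
    .detect(image)
    .then((detections) => {
      for (const text of detections) {
        // Each detection carries the recognised string and its bounding box.
        console.log(text.rawValue, text.boundingBox);
        // Read the text aloud, as the demo does on click.
        speechSynthesis.speak(new SpeechSynthesisUtterance(text.rawValue));
      }
    })
    .catch((err) => console.error('Text detection failed:', err));
}
```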
In this post, I share my support for Internet Explorer 7's decision to enable ClearType by default. Bill Hill's blog post on the topic highlights research demonstrating ClearType's positive impact on reading, the primary thing people do in IE. Personally, I've found ClearType enhances readability and focus, though IE7 Beta 2 has shown rendering issues on sites like Blogger.