Creating a Lighthouse Gatherer to generate high-res screenshots for your Audit

Lighthouse has an awesome yet little known API. Anything that Lighthouse can do, so can you and as far as I can tell, pretty much every Audit and Gatherer that runs in lighthouse is open.

Lighthouse has two concepts:

  1. Gatherers: Given a page, run a set of scripts to pull information and data out of the page so that the audit can quickly parse it.
  2. Audits: Given all the information gathered, run some tests to see if the page.

Lighthouse has a huge number of Gathers, one of them is called FullPageScreenshot which returns a highly compressed screenshot of the page. (note to self... why didn't I just use this code?? NIH probably).

When I was building the the Audit to identify if an anchor looks like a button (more in a later post), the default screenshot resolution was too low and had too many compression artefacts for the ML model to reliably classify the image. Normal resolution images worked just fine though, so I decided to build a gatherer that took a full resolution and full-size screenshot of the current page which could later be requested by my audit.

In the end it was quite fun and not a much code as I thought it might be. I chose to use Puppeteer because I know the API well and that's about it.

const { Gatherer } = require("lighthouse");
const puppeteer = require("puppeteer");

// Heavily inspired by https://keepinguptodate.com/pages/2021/08/custom-lighthouse-audit/ and https://github.com/GoogleChrome/lighthouse/blob/main/docs/recipes/custom-gatherer-puppeteer/custom-gatherer.js

async function connect(driver) {
  const browser = await puppeteer.connect({
    browserWSEndpoint: await driver.wsEndpoint(),
    defaultViewport: null,
  });
  const { targetInfo } = await driver.sendCommand("Target.getTargetInfo");
  const puppeteerTarget = (await browser.targets()).find(
    (target) => target._targetId === targetInfo.targetId
  );
  const page = await puppeteerTarget.page();
  return { browser, page, executionContext: driver.executionContext };
}

class BigScreenshot extends Gatherer {
  /**
   * @param {LH.Gatherer.PassContext} options
   * @param {LH.Gatherer.LoadData} loadData
   */
  async afterPass(options, loadData) {
    const { driver } = options;
    const { page, executionContext } = await connect(driver);

    const devicePixelRatio = await page.evaluate("window.devicePixelRatio");
    const screenshot = await page.screenshot({
      encoding: "base64",
      fullPage: true,
      captureBeyondViewport: true,
    });

    /**
     * @return {LH.Gatherer.PhaseResult}
     */
    return { screenshot, devicePixelRatio };
  }
}

module.exports = BigScreenshot;

And that's it.

p.s I heard on the grapevine that version 10 of Lighthouse might move away from Jpeg Screenshot so this code might be redundant pretty quickly.

About Me: Paul Kinlan

I lead the Chrome Developer Relations team at Google.

We want people to have the best experience possible on the web without having to install a native app or produce content in a walled garden.

Our team tries to make it easier for developers to build on the web by supporting every Chrome release, creating great content to support developers on web.dev, contributing to MDN, helping to improve browser compatibility, and some of the best developer tools like Lighthouse, Workbox, Squoosh to name just a few.