Building Ask Paul

I've been doing a lot of experimentation with Generative Machine Learning, and one of the demos that I've built is called "Ask Paul". You can ask me nearly any front-end web development question and the software will give you a direct answer if it can, along with links to further reading across the sites that I create content for (this blog, web.dev and developer.chrome.com).


I'm very happy with the results, and I thought it would be nice to document how I built it because I believe that the technology is a lot more accessible than it was just 6 months ago and it can be integrated into any site.

Goals:

I wanted a search function that could easily index content that I gave it.
I wanted the search to understand related concepts, e.g., that "PWA" and "Progressive Web App" are the same thing.
I wanted to see if I could get the machine to generate a summary that would answer the person's question.
I wanted it to work without requiring JavaScript and to render a UI instantly.

I used the Polymath-AI project as the main infrastructure for this project. Polymath-AI is pretty great (I've now contributed some fixes), and it's interesting to me not because of how I integrated it into my site, but because it's a protocol for querying and interrogating a web of knowledge repositories. If you check out the CLI, you can see that you can query any Polymath instance (e.g., npx polymath ask "What happened to Web Intents?" --servers https://paul.kinlan.me/polymath --openai-api-key [YOUR OPEN AI API KEY]). I thought that this was incredibly powerful because I can apply access controls (ACLs) to my own data, giving you complete access to my public data and giving selected people access to private data (I'm a big Logseq user and I have a private repo that I would like to be able to interrogate).

It's worth giving a quick overview of Polymath because it does three things:

It ingests data from the owner of the instance (me in this case), and for each piece of content it creates an embedding vector, which is then stored in a Pinecone database.
It finds content and links that are related to a person's query, first by creating an embedding vector for the query and then by using cosine similarity to compare the query vector against the embedding vectors of all the known content.
It takes the piece of content most similar to the person's query and inserts it directly into a request to the OpenAI API, with a prompt of the form: Answer the question as truthfully as possible using the provided context, and if the answer is not contained within the text below, say "I don\'t know".\n\nContext:{context}\n\nQuestion: {query}\n\nAnswer:. Here, context is the most similar document and query is the person's query.
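
To make the second and third steps a bit more concrete, here is a minimal in-memory sketch in TypeScript. It is not Polymath's actual code: the Bit interface, the function names, the example.com URLs and the toy three-dimensional vectors are all assumptions made for illustration. In the real setup the embeddings come from an embedding model and the nearest-neighbour search is handled by the Pinecone database. Only the prompt wording is taken from the template quoted above.

```typescript
// Hypothetical shape for an ingested piece of content (not Polymath's real types).
interface Bit {
  url: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank content by similarity to the query vector, most similar first.
function rankBySimilarity(queryVector: number[], bits: Bit[]): Bit[] {
  return bits
    .map((bit) => ({ bit, score: cosineSimilarity(queryVector, bit.embedding) }))
    .sort((x, y) => y.score - x.score)
    .map((entry) => entry.bit);
}

// Fill the prompt template with the best-matching content and the query.
function buildPrompt(context: string, query: string): string {
  return (
    "Answer the question as truthfully as possible using the provided context, " +
    'and if the answer is not contained within the text below, say "I don\'t know".' +
    `\n\nContext:${context}\n\nQuestion: ${query}\n\nAnswer:`
  );
}

// Toy data: real embeddings have far more than three dimensions.
const bits: Bit[] = [
  { url: "https://example.com/web-intents", text: "Toy text about Web Intents.", embedding: [1, 0, 0] },
  { url: "https://example.com/pwa", text: "Toy text about Progressive Web Apps.", embedding: [0, 1, 0] },
];

// Pretend this vector is the embedding of the question below.
const queryVector = [0.1, 0.9, 0];
const ranked = rankBySimilarity(queryVector, bits);
const prompt = buildPrompt(ranked[0].text, "What is a PWA?");
console.log(ranked.map((bit) => bit.url));
console.log(prompt);
```

The ranked list is what you would use for the "further reading" links, and the prompt string is what gets sent to the completion API.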

And it works incredibly well. My mind was completely blown the first time that I saw it working.

I won't cover the ingestion process in this post as it's basically npx polymath ingest rss [url] .

It turns out that the plumbing to get this working is achievable and can be hooked up relatively quickly. If you want to host your own instance, you can do something similar to what I've done with the three components that I created.

A UI - renders a streaming HTML response from the Polymath Client, and that is all. [direct link] (about 30 lines of HTML and a fetch request)
A Polymath Client - takes a person's query and interacts with any Polymath host. It's configured to connect to my Polymath Host and, once it has the data, it queries OpenAI with the above request. [direct link]
The Polymath Host - the implementation of the protocol defined by Polymath. It directly queries my configured Pinecone database and returns the embeddings that are closest to your query. [direct link] The great thing is that you don't need to use my UI to query it; you can use your own client (or the CLI): npx polymath ask "Why is Paul so handsome?" --servers https://paul.kinlan.me/polymath --openai-api-key [YOUR OPEN AI API KEY]

Just a note if you are implementing this: I had to split it into three pieces because the Polymath client currently only supports the Node runtime, and my UI needed to be able to stream a response from Vercel (which at the time was only available on the "Edge Server"). If I can get Node streaming working, then the UI and Client code could be merged.
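
If you are wondering how a page can stream an answer without any client-side JavaScript, here is a rough sketch of the idea using the standard web Request, Response and ReadableStream APIs. This is not my actual UI code: the function names, the query parameter name and the placeholder answer parts are all assumptions, and a real version would forward text asynchronously as it arrives from the client rather than reading from a fixed list.

```typescript
// Escape text before placing it inside HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Yield the page in order: the shell with a plain HTML form (so no
// client-side JavaScript is needed), then each answer part, then the close.
function* renderChunks(query: string, answerParts: Iterable<string>): Generator<string> {
  yield `<!doctype html><title>Ask Paul</title>` +
    `<form method="get"><input name="query" value="${escapeHtml(query)}">` +
    `<button>Ask</button></form><p>`;
  for (const part of answerParts) {
    yield escapeHtml(part);
  }
  yield `</p>`;
}

// Wrap the chunks in a ReadableStream so each one can be sent as soon as
// it is produced, instead of waiting for the whole page.
function toStream(chunks: Iterable<string>): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  const iterator = chunks[Symbol.iterator]();
  return new ReadableStream<Uint8Array>({
    pull(controller) {
      const next = iterator.next();
      if (next.done) {
        controller.close();
      } else {
        controller.enqueue(encoder.encode(next.value));
      }
    },
  });
}

// Hypothetical request handler: reads ?query= and streams the page back.
function handleRequest(request: Request): Response {
  const query = new URL(request.url).searchParams.get("query") ?? "";
  // Placeholder parts; a real handler would pass along the generated answer.
  const answerParts = ["This is ", "a placeholder ", "answer."];
  return new Response(toStream(renderChunks(query, answerParts)), {
    headers: { "Content-Type": "text/html; charset=utf-8" },
  });
}
```

Because the form and the opening markup go out in the first chunk, the browser can paint the UI straight away and then append the answer as the rest of the response arrives.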

And that's it. I'm just at the start of an ML-assisted journey and I still struggle to know what I can do with LLMs, but it's been a wild couple of weeks trying to learn how it all works.

It's also wild to me that I can quickly build Generative ML experiences that work directly in my site. You can implement your own site search++ quickly; it's a solved problem. There are database companies that focus just on vector search :mind-blown:



I lead the Chrome Developer Relations team at Google.

We want people to have the best experience possible on the web without having to install a native app or produce content in a walled garden.

Our team tries to make it easier for developers to build on the web by supporting every Chrome release, creating great content to support developers on web.dev, contributing to MDN, helping to improve browser compatibility, and building some of the best developer tools, such as Lighthouse, Workbox and Squoosh, to name just a few.

I love to learn about what you are building, and how I can help with Chrome or Web development in general, so if you want to chat with me directly, please feel free to book a consultation.
