As I started to play with OpenAI and some generative ML ideas, I said "There are database companies that just focus on vector search :mind-blown:". My mind is still blown that this is an industry, but as I play with Polymath and Pinecone it is clear that they are useful services, and, tinkerer that I am, I wanted to tinker with the idea of running this type of database directly in the browser. (If you are wondering "What is a Vector Database and why do I need one?", then this article is a good start.)
The other week I spent some time building "Vector IDB" (source) as an experiment in making something similar in structure to what Pinecone has. The API surface is relatively plain. There are all the standard utilities: insert, delete, update, and query, and they can be used as follows.
import { VectorDB } from "idb-vector";

// vectorPath names the property on each document that holds the vector
const db = new VectorDB({
  vectorPath: "embedding"
});

// insert resolves to the key of the stored document
const key1 = await db.insert({ embedding: [1, 2, 3], "text": "ASDASINDASDASZd" });
const key2 = await db.insert({ embedding: [2, 3, 4], "text": "GTFSDGRG" });
const key3 = await db.insert({ embedding: [73, -213, 3], "text": "hYTRTERFR" });

// update and delete both take the key returned by insert
await db.update(key2, { embedding: [2, 3, 4], "text": "UPDATED" });
await db.delete(key3);

// Query returns a list ordered by the entries closest to the vector (cosine similarity)
console.log(await db.query([1, 2, 3], { limit: 20 }));
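If you are wondering what "ordered by cosine similarity" means in practice, here is a minimal brute-force sketch over a plain in-memory array. It is not the library's code: `cosineSimilarity` and `rankBySimilarity` are hypothetical helper names, and the `{ doc, similarity }` result shape is an assumption made purely for illustration.

```javascript
// Cosine similarity between two equal-length numeric arrays.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  // An all-zero vector has no direction, so treat it as similar to nothing.
  return denominator === 0 ? 0 : dot / denominator;
}

// Hypothetical helper: score every document, sort best-first, keep `limit`.
function rankBySimilarity(docs, queryVector, { vectorPath = "embedding", limit = 10 } = {}) {
  return docs
    .map((doc) => ({ doc, similarity: cosineSimilarity(doc[vectorPath], queryVector) }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}

// Usage: the document whose vector points the same way as the query comes first.
const sampleDocs = [
  { embedding: [73, -213, 3], text: "far away" },
  { embedding: [2, 3, 4], text: "close" },
  { embedding: [1, 2, 3], text: "identical" },
];
const ranked = rankBySimilarity(sampleDocs, [1, 2, 3], { limit: 2 });
console.log(ranked.map((result) => result.doc.text));
```

Scanning every document like this is fine for small collections, and it is the simplest thing that could work when there is no optimised index.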
Because it is just a wrapper over IndexedDB, you can throw JSON documents at it, and as long as each one has an Array on the property referenced by vectorPath, all should just work. It will create an IndexedDB database for you with an objectStore and an index that is based on the defined vector too.
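To make the "Array on the property referenced by vectorPath" requirement concrete, here is a small sketch of the kind of guard you could run before storing a document. `extractVector` is a hypothetical name, and the check that every entry is a finite number is my own assumption, not something the library necessarily enforces.

```javascript
// Hypothetical guard: read the vector from a document and check its shape.
function extractVector(doc, vectorPath) {
  const vector = doc[vectorPath];
  if (!Array.isArray(vector)) {
    throw new TypeError(`Expected an Array on property "${vectorPath}"`);
  }
  // Assumption: embeddings should only contain finite numbers.
  if (!vector.every((value) => typeof value === "number" && Number.isFinite(value))) {
    throw new TypeError(`Property "${vectorPath}" must contain only finite numbers`);
  }
  return vector;
}

// Usage: the rest of the document can be any JSON you like.
const vector = extractVector({ embedding: [1, 2, 3], text: "hello" }, "embedding");
console.log(vector.length);
```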
Now, this is in no way a complete solution. There are no optimisations of the index; it doesn't do any pre-filtering to reduce the size of the query space; it doesn't do post-filtering of results (outside of the [limit] argument), and so on. The goal was to be a simple wrapper to get you started quickly. If you already have a relatively complex IndexedDB integration for your site, you will see by checking out some of the code that this is something you can do yourself without too much hassle.
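Post-filtering is easy enough to bolt on yourself in the meantime. The sketch below assumes results shaped like `{ doc, similarity }`, sorted best-first; that shape, the `postFilter` name, and its option names are all hypothetical, so adapt them to whatever your query actually returns.

```javascript
// Hypothetical post-filter over already-ranked results.
// Assumes each result looks like { doc, similarity } and the list is sorted best-first.
function postFilter(results, { minSimilarity = -1, where = () => true, limit = Infinity } = {}) {
  return results
    .filter((result) => result.similarity >= minSimilarity && where(result.doc))
    .slice(0, limit);
}

// Usage: keep only reasonably close English documents, at most two of them.
const rankedResults = [
  { doc: { text: "hello", lang: "en" }, similarity: 0.98 },
  { doc: { text: "bonjour", lang: "fr" }, similarity: 0.95 },
  { doc: { text: "hi", lang: "en" }, similarity: 0.91 },
  { doc: { text: "unrelated", lang: "en" }, similarity: 0.12 },
];
const filtered = postFilter(rankedResults, {
  minSimilarity: 0.5,
  where: (doc) => doc.lang === "en",
  limit: 2,
});
console.log(filtered.map((result) => result.doc.text));
```

Filtering after ranking is the simple option; the trade-off is that every vector still gets scored, which is exactly the cost that pre-filtering would avoid.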
I enjoyed building this project because I got to learn a bit about Vector Databases and how they can be used for storing and querying embeddings from APIs like OpenAI directly inside a browser, without having to use a hosted solution.
If you have experience building these types of databases, I would love to hear from you and learn what I might be missing.
I lead the Chrome Developer Relations team at Google.
We want people to have the best experience possible on the web without having to install a native app or produce content in a walled garden.
Our team tries to make it easier for developers to build on the web by supporting every Chrome release, creating great content to support developers on web.dev, contributing to MDN, helping to improve browser compatibility, and building some of the best developer tools, such as Lighthouse, Workbox, and Squoosh, to name just a few.
I love to learn about what you are building, and how I can help with Chrome or Web development in general, so if you want to chat with me directly, please feel free to book a consultation.