Semantic Knowledge Search with LLM

January 5, 2024 · 3 min read

CEO at Autohost.ai

In today's fast-paced business environment, the efficient use of knowledge is crucial for any modern company's success. This is why organizations like HubSpot and Atlassian thrive; they've developed products that enhance team collaboration and knowledge sharing. However, the landscape is evolving, with new opportunities arising from recent technological advancements.

Key drivers for innovation in knowledge management include:

The rapid advancement of Large Language Models (LLMs) like ChatGPT.
Shorter development cycles, facilitated by the advancements in LLMs.
Emerging search and indexing technologies, such as Vector Databases, enabling semantic search.
The rise of chat applications like Slack as primary interfaces for knowledge workers.

In this post, we'll delve into how these developments can be harnessed to create a cost-effective, easy-to-build, and maintainable knowledge management system.

These solutions are also called "retrieval-augmented generation" (RAG), which means using a search engine to retrieve relevant documents, then using these documents as context for a language model to generate answers.

The Problem

At Autohost, we store our knowledge across:

Notion
Slack
Google Drive
GitHub

Team members typically search these platforms for required information. Although this method is effective, it has its challenges:

Uncertainty about where to start searching. For instance, it's often unclear whether specific information is in Notion or Slack, necessitating searches in multiple places.
The need to read through several pages to locate the right information.
Difficulty in pinpointing the exact information needed.

The Solution

By leveraging ChatGPT and a Vector Database, we've developed a knowledge management system that swiftly and easily locates the right information. Additionally, this system doubles as a chatbot for answering questions or generating creative ideas for sales and marketing.

To explore the project further, visit the Github repository here.

System components:

An API for indexing and searching knowledge (Serverless)
A web application for interacting with the API (Next.js)

Utilized technologies:

LLM API (OpenAI)
Vector Database (Pinecone)
Development frameworks:

How It Works

The API processes a list of documents, indexing them in a Vector Database after passing them through OpenAI for embedding extraction. The API also includes a search endpoint for querying and retrieving the most relevant documents.

In the web app, user queries are converted into embeddings by OpenAI, then passed to the Vector Database to find relevant documents. The LLM then uses these documents as context to generate answers, which are displayed in the web app alongside document references.

How to Use It

Follow these steps to deploy the solution on AWS and Vercel, as outlined in our Github repository:

Prerequisites:

Pinecone API key
OpenAI API key
Vercel account (for hosting the web app)
AWS account (for hosting the API)

Deployment steps:

Clone the Github repository.
Create and edit configs/dev.yml with your API keys.
Deploy the API with npx sls deploy -s dev.
Set up a new project on Vercel and deploy the web app.

Training steps:

Export your Notion and/or Slack data.
Transfer the files to the documentation/ folder.
Execute one of the upload scripts in the scripts/ folder.

Alternative Solutions

Commercial alternatives include:

Open source alternative:

StanGirard/quivr

Github Repository

For more details, visit: https://github.com/AutohostAI/langchain-vector-search

Autohost

Talk to Sales

The Problem​

The Solution​

How It Works​

How to Use It​

Alternative Solutions​

Github Repository​