
How I built the simplest RAG-based Question-Answering system before ChatGPT, LangChain, or LlamaIndex came out (all for $0!)
A blueprint you can use!
This was the end of 2020 (November 2020, to be precise), before ChatGPT, LlamaIndex, or LangChain existed; all three were released in the latter half of 2022.
I was working as a Research Data Scientist in the Enterprise AI domain at a multinational tech company. The head of the department had a lofty vision: he wanted to create super-intelligence!
His dream was a system that could answer user questions from a huge document database, what we now call Retrieval-Augmented Generation.
BERT and its cousins had only recently been released, and so had GPT and its kindred. I was fully aware of GPT's generative capabilities, and I knew that even with its latest release (GPT-2), the task at hand was quite daunting.
So we broke the whole problem down into milestones, starting with the simplest solution and working our way up. Below is the system architecture of the simplest solution we came up with.
Problem Statement
Our simplest milestone in a database QA system was as follows:
Given a large set of question-answer documents (in any document format), the system must return an accurate answer within 3 seconds of receiving the user's input.
System/Product Requirements
Let’s break this down into requirements. The system had to:
- Consume a large set of question-answer documents
- Handle documents of any format
- Accept a user’s query
- Return an accurate answer
- Respond in under 3 seconds
Model Selection and Design
I came up with a very simple RAG model as follows:
We would compute the similarity between the embedding of the user’s query and the embeddings of every question in our database (which we would create out of all the documents users…
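A minimal sketch of that retrieval step, with toy vectors standing in for the real sentence embeddings a BERT-style encoder would produce (function names and the example data are my own illustration, not the original system):

```python
import numpy as np

def most_similar_answer(query_emb, question_embs, answers):
    """Return the answer whose question embedding is closest (by cosine
    similarity) to the query embedding."""
    # Normalize query vector and each row of the question-embedding matrix.
    q = query_emb / np.linalg.norm(query_emb)
    m = question_embs / np.linalg.norm(question_embs, axis=1, keepdims=True)
    sims = m @ q  # cosine similarity of the query against every question
    return answers[int(np.argmax(sims))]

# Toy 3-d "embeddings"; in practice these come from a sentence encoder.
question_embs = np.array([
    [1.0, 0.0, 0.0],   # e.g. "How do I reset my password?"
    [0.0, 1.0, 0.0],   # e.g. "What are the support hours?"
])
answers = ["Use the password-reset link.", "9am to 5pm on weekdays."]

query_emb = np.array([0.9, 0.1, 0.0])  # closest to the first question
print(most_similar_answer(query_emb, question_embs, answers))
# → Use the password-reset link.
```

Precomputing and caching the question embeddings is what keeps the lookup fast: at query time, only one embedding and one matrix-vector product are needed.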