How I built the simplest RAG-based Question-Answering system before ChatGPT, LangChain or LlamaIndex came out (all for $0!)
A blueprint you can use!
This was the end of 2020 (November 2020, to be precise). There was no ChatGPT, LlamaIndex or LangChain yet; all three were released in the latter half of 2022.
I was working as a Research Data Scientist in the Enterprise AI domain for a multinational tech company. The head of the department had a lofty vision. He wanted to create super-intelligence!
His dream was to create a system that could answer user questions from a huge document database (what we now call Retrieval-Augmented Generation).
BERT and its cousins had only recently been released, and so had GPT and its kindred. I was fully aware of the generative capabilities of GPT, and knew that even with the latest release (GPT-2), the task at hand was quite daunting.
So we broke the whole problem down into milestones, starting with the simplest solution and working our way up. This is the system architecture of the simplest solution we came up with.
Problem Statement
Our simplest milestone in a database QA system was as follows: