For the final-term capstone project of my Master's in Financial Analytics program, we were assigned an AI chatbot project sponsored by Portland General Electric (PGE). PGE wanted to compare different LLMs and gain insight into the nuances and challenges of building an AI chatbot. Their overall goal was to develop an internal AI chatbot that PGE employees could use to ask both general questions and questions about PGE's internal knowledge base.
For this project, our goals were to quantitatively compare the outputs of different LLMs, conduct a high-level cost analysis, and recommend tools and frameworks.
First, we collected FAQ data from PGE's website, which we used for fine-tuning and Retrieval-Augmented Generation (RAG). We then queried six model configurations and collected their answers: GPT with RAG, GPT fine-tuned, GPT fine-tuned with RAG, Llama with RAG, Llama fine-tuned, and Llama fine-tuned with RAG. Finally, we graded each output on a predefined scale of 1 to 5.
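To make the grading step concrete, here is a minimal sketch of how 1-to-5 grades could be averaged per configuration and ranked. It is not the code we used on the project: the function names (summarize_grades, rank_configs), the tuple layout, and the question IDs are all assumptions for illustration, and the scores in the usage example are made-up placeholders, not our results.

```python
from collections import defaultdict
from statistics import mean

# The six configurations named in the write-up.
CONFIGS = [
    "GPT with RAG",
    "GPT fine-tuned",
    "GPT fine-tuned with RAG",
    "Llama with RAG",
    "Llama fine-tuned",
    "Llama fine-tuned with RAG",
]


def summarize_grades(grades):
    """Return the average grade per configuration.

    `grades` is an iterable of (config_name, question_id, score) tuples,
    where score is an integer on the 1-5 grading scale.
    (Hypothetical layout, assumed for this sketch.)
    """
    by_config = defaultdict(list)
    for config, _question_id, score in grades:
        if config not in CONFIGS:
            raise ValueError(f"unknown configuration: {config!r}")
        if not 1 <= score <= 5:
            raise ValueError(f"score outside the 1-5 scale: {score}")
        by_config[config].append(score)
    return {config: mean(scores) for config, scores in by_config.items()}


def rank_configs(averages):
    """Sort (config, average) pairs from highest to lowest average."""
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Placeholder grades for illustration only, NOT the project's results.
    example = [
        ("GPT with RAG", "q1", 4),
        ("GPT with RAG", "q2", 2),
        ("Llama with RAG", "q1", 5),
    ]
    for config, avg in rank_configs(summarize_grades(example)):
        print(f"{config}: {avg}")
```

Averaging is only one way to summarize the grades. Looking at the spread of scores per question would also show where a configuration is inconsistent.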
Below are our final presentation slides and a Word document summarizing our findings from the project.
Deepened my understanding of LLMs and RAG
Project management and task delegation
Tools/Frameworks I used: Python (LlamaIndex, GPT API), Hugging Face Inference Endpoints, GPT, and Meta Llama 3