Data Science Symposium - RAG Chatbot

I had the opportunity to present at the Gilead Data Science Conference in San Francisco. Gilead's work in life-saving biopharmaceuticals—from HIV/AIDS to COVID-19—relies heavily on complex clinical trial data. To help their teams navigate this information, we partnered with them to demonstrate how Retrieval-Augmented Generation (RAG) can transform static documentation into interactive, conversational AI using the Dash framework.

Tags: Python · LangChain · OpenAI · FAISS · Dash · RAG

Why Dash for AI Applications?

Dash is an open-source Python framework for building interactive web applications. Built on top of Flask and React.js, it gives Python developers the interactivity of React and the robustness of Flask without requiring deep knowledge of JavaScript or CSS.

The core strength of Dash lies in its callbacks. These Python-based functions allow the UI to update automatically whenever a user interacts with a component—like a button or a slider. Whether you're building a rapid prototype in a day or a production-grade enterprise application, Dash provides a flexible foundation that integrates seamlessly with databases and Plotly's data visualization suite.

The RAG Workflow: From PDF to Conversation

During the conference, I walked through a demo app I built that handles the entire RAG process, breaking the architecture down into four primary technical steps:

1. PDF Parsing & Chunking

The process begins by extracting raw text from uploaded clinical PDFs. Raw text alone isn't enough, though: full documents are far too long to fit in an LLM's context window, so the text has to be split into chunks.

The Strategy: Instead of splitting by individual words or sentences, we aim for a chunk size large enough to capture sufficient context. Since vector search matches on meaning rather than keywords, chunk size is crucial: chunks that are too small lose the relationships between neighboring sentences, while chunks that are too large dilute the meaning captured by each embedding.
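A toy fixed-size chunker makes the trade-off visible. Real pipelines typically use a library splitter that respects sentence boundaries; the 500-character size and 50-character overlap below are illustrative defaults, not the demo's actual settings.

```python
# Toy fixed-size chunker with overlap (pure Python, illustrative settings).
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap  # each chunk re-reads `overlap` chars of the last
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece:
            chunks.append(piece)
    return chunks

sample = "x" * 1200
print([len(c) for c in chunk_text(sample)])  # → [500, 500, 300]
```

The overlap means adjacent chunks share a sliver of text, so a sentence that straddles a boundary still appears whole in at least one chunk.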

2. Embedding & Vector Storage

Once chunked, the text is converted into vectors—numerical representations of meaning. I used FAISS (Facebook AI Similarity Search) to create a searchable index. This allows the system to perform high-speed similarity searches to find the most relevant document sections based on a user's query.
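The search FAISS accelerates is, at its core, nearest-neighbor lookup over embedding vectors. The toy below illustrates the idea with plain cosine similarity and made-up 3-dimensional vectors standing in for real embeddings (which typically have hundreds or thousands of dimensions); FAISS does the same comparison at scale with optimized indexes.

```python
# Toy nearest-neighbor search over embeddings (pure Python, made-up vectors).
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical document sections and their (stand-in) embedding vectors:
index = {
    "dosage section":   [0.9, 0.1, 0.0],
    "adverse events":   [0.1, 0.9, 0.1],
    "trial enrollment": [0.0, 0.2, 0.9],
}

query = [0.8, 0.2, 0.1]  # stand-in embedding of the user's question
best = max(index, key=lambda name: cosine(query, index[name]))
print(best)  # → dosage section
```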

3. The Conversation Chain

To make the bot "smart," I initialize a ConversationalRetrievalChain. This chain doesn't just look at the current prompt; it also considers the conversation history and the context retrieved from the vector store.
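Schematically, each turn of such a chain folds the history into the question, retrieves relevant chunks, and prompts the LLM with both. The sketch below stubs out the retriever and the LLM; the function names, prompt format, and stub text are all illustrative, not the actual LangChain internals.

```python
# Schematic of one turn of a conversational retrieval chain (stubs throughout).
def answer_turn(question, history, retrieve, llm):
    # 1. Fold prior turns into a standalone question (real chains use the
    #    LLM itself for this rewrite; here we simply prepend the history).
    standalone = (" ".join(q for q, _ in history) + " " + question) if history else question
    # 2. Pull the most relevant chunks from the vector store.
    context = retrieve(standalone)
    # 3. Ask the LLM, grounded in the retrieved context.
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    answer = llm(prompt)
    history.append((question, answer))
    return answer

# Stubs standing in for FAISS retrieval and a chat model (hypothetical text):
retrieve = lambda q: "Phase 3 enrolled 1,000 participants."
llm = lambda prompt: f"(answer grounded in: {prompt.splitlines()[1]})"

history = []
print(answer_turn("How many participants?", history, retrieve, llm))
```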

4. The Retriever Interface

By converting the vector store into a retriever (using as_retriever()), I enable the Large Language Model (LLM) to "fact-check" its responses against the uploaded data. This significantly reduces hallucinations and ensures the chatbot provides information grounded in the specific clinical data provided.
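Conceptually, a retriever is just a function that ranks the index against a query and returns the top-k matches, which is what as_retriever() produces for a FAISS store. The sketch below mirrors that idea in plain Python; the word-overlap scoring function is a placeholder for real embedding similarity.

```python
# Sketch of the retriever idea: wrap an index in a top-k lookup function.
def make_retriever(index, score, k=2):
    def retrieve(query):
        ranked = sorted(index, key=lambda doc: score(query, doc), reverse=True)
        return ranked[:k]
    return retrieve

docs = ["dosage guidelines", "adverse events", "enrollment criteria"]
# Placeholder score: shared-word overlap instead of embedding similarity.
score = lambda q, doc: len(set(q.lower().split()) & set(doc.split()))

retriever = make_retriever(docs, score, k=2)
print(retriever("What were the adverse events reported?"))
```

Handing the LLM only these top-k chunks, rather than the whole corpus, is what keeps its answers grounded in the uploaded documents.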

Demo

Here's a quick walkthrough of the chatbot in action, querying clinical trial data and generating context-aware responses in real time.