Want to make sure your AI chatbot knows what it's talking about? Read this and learn how to set up RAG in Keboola
The biggest issue with chatbot implementations powered by generative AI is the accuracy and reliability of the output. Models can give erroneous or inaccurate answers due to hallucinations [1] or simply because they lack information specific to a given business case, as many of them don’t have access to new data outside of pretraining.
Retrieval-Augmented Generation (RAG) is a technique designed to address this limitation by integrating an external retrieval mechanism with a generative model. This allows the language model to query an external database or knowledge base to retrieve relevant information, which is then used to generate a more accurate and contextually appropriate response.
Other advantages of using a RAG chatbot include:
In this article, we will look at the advantages of this technique and explain how a RAG chatbot can be set up from scratch using Keboola.
At a high level, a RAG-powered chatbot operates in the following sequence:
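As a rough illustration (not Keboola's internal implementation), the retrieve-then-generate loop can be sketched in a few lines of Python. The embedding vectors and the `generate` callable here are deterministic stand-ins for a real embedding model and LLM:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, index, top_k=2):
    # Rank stored (vector, text) pairs by similarity to the query vector.
    scored = sorted(index, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in scored[:top_k]]

def answer(query_vec, index, generate):
    # Augment the prompt with retrieved context before calling the model.
    context = "\n".join(retrieve(query_vec, index))
    return generate(f"Context:\n{context}\n\nQuestion: ...")

# Toy index: in practice these vectors come from an embedding model.
index = [([1.0, 0.0], "Invoices are due in 30 days."),
         ([0.0, 1.0], "Support hours are 9-5 CET.")]
print(retrieve([0.9, 0.1], index, top_k=1))  # most similar snippet first
```

The key point is that the model never answers from its weights alone: the retrieved context is injected into the prompt on every call.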
Keboola’s core features are aligned with RAG's requirements; it’s a self-service data operations platform that supports efficient data indexing and retrieval and has a full toolkit of features that help build a great AI chatbot:
As a data stack as a service platform, Keboola focuses on simplification and automation, closely following the principles of data mesh.
And now, to practice…
Let’s say your goal is to build a RAG chatbot that helps users quickly retrieve relevant information from past conversations or logs. Keboola’s extensive range of data source components offers a flexible foundation for RAG retrieval across many data sources.
Here is the data architecture you need to build:
Go to Keboola > Flows > Create Flow
Keboola offers a broad selection of data extractors, enabling you to connect to a variety of sources including Slack, Jira, and other supported knowledge bases, support platforms, and more. For this specific example, we’ll demonstrate how to set up a Slack extractor, but you can follow similar steps for other data sources.
Slack Extractor
To extract data from Slack, follow the instructions in this documentation and complete the authorization steps. As part of the process, define your query to optimize extraction speed.
After defining your query, save your configuration and run the extractor.
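Conceptually, the extractor lands Slack messages as flat table rows. Here is a hedged sketch of that flattening step, assuming the standard fields (`ts`, `text`, `user`) that Slack's `conversations.history` API returns; the helper name is illustrative, not part of the extractor:

```python
def flatten_messages(payload):
    # Turn a raw conversations.history-style response into flat rows,
    # keeping the fields used later for embeddings and metadata.
    return [
        {"ts": m["ts"], "text": m["text"], "user": m.get("user", "")}
        for m in payload.get("messages", [])
        if m.get("text")  # skip empty/system messages
    ]

sample = {"messages": [
    {"ts": "1714000000.000100", "text": "Deploy finished", "user": "U123"},
    {"ts": "1714000001.000200", "text": ""},  # dropped: no text
]}
rows = flatten_messages(sample)
print(rows)
```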
For more information on other data sources, explore Keboola’s available extractors here.
Before you save the extracted data to your vector database, you first need to transform it from text to embeddings. In Keboola, you can do this with our embeddings component, which transforms and prepares your data for accurate semantic search.
The Embeddings Component offers several enhancements for improved efficiency, scalability, and support for various vector databases and embedding providers.
These improvements make it easier to integrate and scale retrieval-augmented generation (RAG) workflows in Keboola for faster and more reliable data indexing for chatbot applications.
While RAG is one of the most well-known applications of vector embeddings, they have a wide range of practical use cases:
Content Clustering & Topic Modeling: You can use embeddings to group similar articles, research papers, or customer reviews, making it easier to organize large datasets.
Add your configuration settings for OpenAI’s API. (While this example will use OpenAI, the same approach can be applied to our other available embedding providers.)
Choose OpenAI’s "text-embedding-3-small" model to generate vector representations of text. This model balances efficiency and performance, making it well-suited for applications requiring fast and scalable embedding generation.
We will generate embeddings of the “Text” column, use “ts” as our unique ID column, and include a couple of the other columns as metadata. Metadata can provide additional context for filtering and retrieval in downstream applications.
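To make the shape of the output concrete, here is a plain-Python sketch of that configuration. `fake_embed` is a deterministic stand-in for the `text-embedding-3-small` call (which returns 1536-dimensional vectors in production), and the lowercase `text`/`ts` keys mirror the Slack columns described above:

```python
def fake_embed(text):
    # Stand-in for OpenAI's text-embedding-3-small; returns a tiny toy vector.
    return [float(len(text)), float(text.count(" "))]

def rows_to_records(rows, embed_fn):
    # Embed the "text" column, use "ts" as the unique ID,
    # and carry the remaining columns along as metadata.
    records = []
    for row in rows:
        records.append({
            "id": row["ts"],
            "values": embed_fn(row["text"]),
            "metadata": {k: v for k, v in row.items() if k not in ("ts", "text")},
        })
    return records

rows = [{"ts": "1714000000.000100", "text": "Deploy finished", "user": "U123"}]
records = rows_to_records(rows, fake_embed)
print(records[0]["id"], records[0]["metadata"])
```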
Using Metadata for hybrid search and filtering
Metadata allows for more refined searches by combining vector similarity with structured filters. When doing retrieval, for example, you can filter results by category, timestamp, or other attributes before ranking by semantic similarity.
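A minimal sketch of that filter-then-rank pattern; the record shape and two-dimensional vectors are simplified stand-ins for what a real vector store holds:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query_vec, records, metadata_filter, top_k=3):
    # 1) Pre-filter on metadata, 2) rank the survivors by vector similarity.
    candidates = [r for r in records
                  if all(r["metadata"].get(k) == v for k, v in metadata_filter.items())]
    candidates.sort(key=lambda r: cosine(query_vec, r["values"]), reverse=True)
    return candidates[:top_k]

records = [
    {"id": "1", "values": [1.0, 0.0], "metadata": {"channel": "support"}},
    {"id": "2", "values": [0.9, 0.1], "metadata": {"channel": "random"}},
]
hits = hybrid_search([1.0, 0.0], records, {"channel": "support"}, top_k=1)
print([h["id"] for h in hits])
```

Production vector databases such as Pinecone apply the metadata filter inside the index rather than in application code, but the semantics are the same.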
Once the component completes, you should see a populated index in your Keboola storage tables.
Keboola already has a pre-built, customizable chatbot Data App (available here). You just need to connect it to your data.
Go to Components > Data App > Add Data App > Create New Data App. Next, add the GitHub chatbot link to the Project URL field.
Set up environment secrets to align with the ones used earlier.
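The secret names below are examples rather than a fixed contract; inside the data app, secrets surface as environment variables, so a small guard that fails fast on a missing one is worth having:

```python
import os

# Example secret name; match whatever you defined in the data app config.
# The placeholder value is set here only so the sketch runs standalone.
os.environ.setdefault("OPENAI_API_KEY", "sk-placeholder")

def get_secret(name):
    # Fail fast with a clear message if a required secret is missing.
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required secret: {name}")
    return value

print(get_secret("OPENAI_API_KEY")[:3])  # prints: sk-
```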
Save your configuration and click “Deploy Data App” in the top-right corner. Once this finishes processing click “Open Data App” and enjoy your new RAG Chatbot hosted on Keboola!
To store these embeddings in Pinecone instead, we only have to change a few settings.
Create a new Pinecone index
Use the corresponding embedding dimensions for the previously selected model. In this case, "text-embedding-3-small" has a dimension of 1536. Ensure the index configuration matches this dimension to store and retrieve embeddings correctly.
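A quick way to guard against a dimension mismatch before creating the index. The sizes below are the published output dimensions for OpenAI's current embedding models; the commented-out call is a sketch based on the `pinecone` client's `create_index` API, with an assumed index name and region:

```python
# Published output dimensions for OpenAI embedding models.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

def index_dimension(model):
    # Look up the dimension the Pinecone index must be created with.
    if model not in EMBEDDING_DIMS:
        raise ValueError(f"Unknown embedding model: {model}")
    return EMBEDDING_DIMS[model]

dim = index_dimension("text-embedding-3-small")
print(dim)  # -> 1536

# Sketch of the matching index creation (requires the pinecone package and an API key):
# from pinecone import Pinecone, ServerlessSpec
# pc = Pinecone(api_key="...")
# pc.create_index(name="slack-rag", dimension=dim, metric="cosine",
#                 spec=ServerlessSpec(cloud="aws", region="us-east-1"))
```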
Re-configure the output settings for the vector database
Set the Output Configuration to Vector Database, then select Pinecone as the storage provider. Update the API Key, Index Name, and Environment to match the variables found in the Pinecone Dashboard.
Update the environment secrets to match the new Pinecone configuration.
You can now store your vector embeddings on external platforms, making it easier to integrate with your existing data pipelines and infrastructure. Our RAG Demo app comes pre-configured for both Pinecone and Keboola Storage, allowing you to generate and test RAG queries across both platforms right away.
For longer texts, consider increasing the batch size to improve processing efficiency. Additionally, adjust the chunking strategy by splitting text into semantically meaningful segments like paragraphs or sentences, which can help preserve context and improve embedding quality. This may take some adjustment, so experiment with different chunk sizes and overlap lengths to balance information retention and model performance.
For our Slack application, set the Batch Size to 100, enable Chunking, and change the Chunking Strategy to words. These settings help optimize processing efficiency and ensure that text is embedded in manageable segments.
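A hedged sketch of what word-based chunking with overlap does (the component's exact parameter names may differ; the defaults here are illustrative):

```python
def chunk_words(text, chunk_size=50, overlap=10):
    # Split text into chunks of chunk_size words, overlapping by `overlap`
    # words so context isn't lost at chunk boundaries.
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

text = " ".join(f"w{i}" for i in range(120))
chunks = chunk_words(text, chunk_size=50, overlap=10)
print(len(chunks))  # -> 3
```

The overlap means the last few words of each chunk reappear at the start of the next, so a sentence that straddles a boundary is still embedded in one piece.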
Every business case is different. If you want to see how Keboola can help you solve a particular problem you’re facing, if you have questions after reading this guide, or if you simply need a second opinion regarding your project — reach out to us.
We’ll be happy to sit down and talk to you. There is nothing we love more than a chat about data.
References:
[1] Magesh, Surani, et al., “Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools,” Stanford University, preprint, 2024.