RAG & Knowledge Bases
Retrieval-Augmented Generation (RAG) allows your AI to answer questions using information from your own documents. Instead of relying only on what the model was trained on, RAG fetches relevant content and includes it in the AI’s context.
Why Use RAG?
| Without RAG | With RAG |
|---|---|
| AI only knows training data | AI accesses your documents |
| Can’t answer company-specific questions | Answers from your knowledge base |
| May hallucinate facts | Cites actual sources |
| General, generic responses | Specific, relevant answers |
RAG Architecture
A RAG system in Flow-Like has two phases:
1. Indexing Phase (One-Time Setup)
Documents ──▶ Chunk Text ──▶ Embed ──▶ Store in Database

2. Query Phase (Every Question)
User Question ──▶ Embed Query ──▶ Search Database ──▶ Add to Prompt ──▶ Generate Answer

Building a RAG System
Step 1: Prepare Your Documents
First, you need to get your documents into Flow-Like:
- Upload files to your app’s Storage
- Read file contents using Storage nodes
- Split into chunks for efficient retrieval
Step 2: Create Embeddings
Embeddings are numerical representations that capture the meaning of text. Similar texts have similar embeddings, enabling semantic search.
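Concretely, an embedding is just an array of numbers, and "similar" is usually measured with cosine similarity. A minimal sketch in TypeScript, independent of any particular model or node:

```ts
// Cosine similarity between two embedding vectors.
// Close to 1 means very similar meaning; close to 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// "vacation policy" vs. "PTO guidelines" would score high;
// "vacation policy" vs. "quarterly revenue" would score low.
```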
Load an Embedding Model
Use the Load Embedding Model node:
Load Embedding Model
 │
 ├── Model: (select an embedding model)
 │
 └── Result ──▶ (embedding model reference)

Recommended embedding models:
- text-embedding-3-small (OpenAI) – Fast, affordable
- text-embedding-3-large (OpenAI) – Higher quality
- nomic-embed-text (Ollama) – Local, free
- voyage-2 (VoyageAI) – High quality
Chunk Your Documents
Large documents need to be split into smaller pieces. Use Chunk Text:
Chunk Text
 │
 ├── Text: (your document)
 ├── Chunk Size: 500
 ├── Overlap: 50
 │
 └── Chunks ──▶ (array of text pieces)

| Parameter | Description | Recommendation |
|---|---|---|
| Chunk Size | Characters per chunk | 300-1000 |
| Overlap | Characters shared between chunks | 10-20% of chunk size |
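To see how the two parameters interact, here is a rough character-based chunker. It is an illustration only, not the Chunk Text node's exact algorithm:

```ts
// Fixed-size chunking with overlap (assumes overlap < chunkSize).
// Each chunk repeats the tail of the previous one so sentences that
// straddle a boundary appear in full in at least one chunk.
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```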
Embed Your Documents
For each chunk, create an embedding using Embed Document:
For Each Chunk
 │
 ▼
Embed Document
 │
 ├── Document: (chunk text)
 ├── Model: (embedding model)
 │
 └── Vector ──▶ (embedding array)
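The node handles this inside the flow; if you're curious what the step looks like in plain code, here is a sketch that calls OpenAI's embeddings REST API directly. The endpoint and response shape are OpenAI's; the helper itself is illustrative:

```ts
// Embed a batch of chunks with OpenAI's embeddings endpoint.
// Returns one vector per chunk, in the same order as the input.
async function embedChunks(chunks: string[]): Promise<number[][]> {
  const response = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "text-embedding-3-small", // must match the model used at query time
      input: chunks,                   // the API accepts a string or an array of strings
    }),
  });
  const json = await response.json();
  return json.data.map((item: { embedding: number[] }) => item.embedding);
}
```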
Step 3: Store in Database

Flow-Like provides a local vector database for storing and searching embeddings.
Open a Database
Use Open Database to create or connect to a database:
Open Database
 │
 ├── Name: "my_knowledge_base"
 │
 └── Database ──▶ (database connection)

Insert Documents
Use Insert or Upsert to store your chunks with their embeddings:
Insert
 │
 ├── Database: (connection)
 ├── Data: {
 │     "text": "chunk content...",
 │     "source": "document.pdf",
 │     "page": 5
 │   }
 ├── Vector: (embedding)
 │
 └── End

Step 4: Search at Query Time
When a user asks a question:
Embed the Query
Embed Query
 │
 ├── Query: "What is our return policy?"
 ├── Model: (same embedding model!)
 │
 └── Vector ──▶ (query embedding)

Search the Database
Use Vector Search to find similar documents:
Vector Search
 │
 ├── Database: (connection)
 ├── Vector: (query embedding)
 ├── Limit: 5
 │
 └── Results ──▶ (matching documents)
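Conceptually, vector search means "return the stored vectors closest to the query vector". Real vector databases use indexes rather than scanning everything, but a brute-force sketch (reusing cosineSimilarity from earlier) shows the idea:

```ts
interface StoredChunk {
  text: string;
  source: string;
  vector: number[];
}

// Brute-force nearest-neighbour search: score every stored chunk against
// the query embedding and keep the top `limit` results.
function vectorSearch(queryVector: number[], store: StoredChunk[], limit = 5): StoredChunk[] {
  return store
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.chunk);
}
```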
Step 5: Generate the Answer

Now combine the retrieved documents with the user’s question:
Set System Message
 │
 ├── System: "Answer using ONLY the provided context..."
 │
 ▼
Push Message (add context)
 │
 ├── Content: "Context:\n{retrieved documents}"
 ├── Role: "user"
 │
 ▼
Push Message (add question)
 │
 ├── Content: "Question: {user question}"
 ├── Role: "user"
 │
 ▼
Invoke LLM ──▶ Answer
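In plain code, this step amounts to building a messages array: system instructions first, then the retrieved context, then the question. A sketch, reusing the StoredChunk type from the vector search sketch (all names are illustrative):

```ts
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Assemble the RAG prompt: instructions, then retrieved context, then the question.
function buildRagMessages(retrieved: StoredChunk[], question: string): ChatMessage[] {
  const context = retrieved.map((c) => `[${c.source}]\n${c.text}`).join("\n\n");

  return [
    {
      role: "system",
      content:
        "Answer using ONLY the provided context. " +
        "If the context doesn't contain the answer, say you don't have that information.",
    },
    { role: "user", content: `Context:\n${context}` },
    { role: "user", content: `Question: ${question}` },
  ];
}
```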
Search Methods

Flow-Like supports multiple search strategies:
Vector Search
Finds documents by semantic similarity—great for conceptual questions.
"What's our vacation policy?" → finds "PTO guidelines" documentFull-Text Search
Finds documents by exact keywords—great for specific terms.
"policy number 12345" → finds documents containing "12345"Hybrid Search
Combines vector + full-text for the best of both worlds:
Hybrid Search
 │
 ├── Vector: (query embedding)
 ├── Search Term: "vacation policy"
 ├── Re-Rank: true
 │
 └── Results ──▶ (best matches)

The Re-Rank option reorders results for better relevance.
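A common way to combine a vector result list with a full-text result list is reciprocal rank fusion; Flow-Like's internal method may differ, so treat this as a general illustration:

```ts
// Reciprocal Rank Fusion: each document earns 1 / (k + rank) from every
// result list it appears in; documents ranked well in both lists win.
function reciprocalRankFusion(resultLists: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of resultLists) {
    list.forEach((docId, rank) => {
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}

// Usage: reciprocalRankFusion([vectorResultIds, fullTextResultIds])
```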
Complete RAG Flow Example
Here’s a full RAG chatbot flow:
Chat Event
 │
 ├──▶ history
 │
 ▼
Get Last Message (extract user question)
 │
 ▼
Embed Query
 │
 ▼
Hybrid Search (find relevant docs)
 │
 ▼
Format Context (combine retrieved docs)
 │
 ▼
Set System Message: "Answer based on context..."
 │
 ▼
Push Message: (context + question)
 │
 ▼
Invoke LLM
 │
 ▼
Push Response ──▶ (stream answer to user)
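The same pipeline expressed as code reads roughly as follows, reusing the earlier sketches; hybridSearch and invokeLlm are hypothetical stand-ins for the Hybrid Search and Invoke LLM nodes:

```ts
// Hypothetical stand-ins for the Hybrid Search and Invoke LLM nodes.
declare function hybridSearch(vector: number[], term: string, limit: number): Promise<StoredChunk[]>;
declare function invokeLlm(messages: ChatMessage[]): Promise<string>;

// End-to-end: user question in, grounded answer out.
async function answerWithRag(question: string): Promise<string> {
  // 1. Embed the question with the same model used during indexing.
  const [queryVector] = await embedChunks([question]);

  // 2. Retrieve the most relevant chunks.
  const docs = await hybridSearch(queryVector, question, 5);

  // 3. Handle "no results" instead of letting the model guess.
  if (docs.length === 0) {
    return "I couldn't find relevant information about that.";
  }

  // 4. Build the prompt and generate the answer.
  const messages = buildRagMessages(docs, question);
  return invokeLlm(messages);
}
```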
Best Practices

1. Chunk Strategically
- Use smaller chunks (300-500 chars) for precise answers
- Use larger chunks (800-1000 chars) for more context
- Consider semantic chunking (by paragraph/section)
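For the semantic option, a rough paragraph-based splitter might look like this (again an illustration, not the Chunk Text node's behavior):

```ts
// Paragraph-based chunking: split on blank lines, then pack paragraphs
// together until a chunk approaches the target size.
function chunkByParagraph(text: string, targetSize = 800): string[] {
  const paragraphs = text.split(/\n\s*\n/);
  const chunks: string[] = [];
  let current = "";
  for (const paragraph of paragraphs) {
    if (current && current.length + paragraph.length > targetSize) {
      chunks.push(current.trim());
      current = "";
    }
    current += paragraph + "\n\n";
  }
  if (current.trim().length > 0) chunks.push(current.trim());
  return chunks;
}
```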
2. Include Metadata
Store useful metadata with each chunk:
{ "text": "chunk content", "source": "employee_handbook.pdf", "page": 12, "section": "Benefits", "updated": "2025-01-15"}3. Craft Good System Prompts
Tell the AI to use only the provided context:
Answer the user's question using ONLY the information provided in the context.
If the context doesn't contain the answer, say "I don't have information about that."
Always cite your sources.

4. Handle “No Results” Gracefully
When the search returns no relevant documents, acknowledge it:
If (results.length == 0)
 └── Respond: "I couldn't find relevant information..."

5. Use SQL Filters for Precision
Narrow down results using metadata filters:
Vector Search
 │
 ├── SQL Filter: "source = 'hr_policies.pdf'"
 │
 └── Results (only from HR policies)

Updating Your Knowledge Base
Adding New Documents
Run your indexing flow whenever you have new documents.
Updating Existing Documents
Use Upsert instead of Insert—it updates existing records or creates new ones based on a unique ID.
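The unique ID is the part worth getting right: derive it deterministically from the source document and chunk position, so re-indexing overwrites old chunks instead of duplicating them. A sketch of the idea, with db.upsert as a hypothetical stand-in for the Upsert node:

```ts
// A stable ID per chunk: same document + same position => same ID on every run.
function chunkId(source: string, index: number): string {
  return `${source}#${index}`;
}

// Re-running the indexing flow overwrites existing rows instead of adding duplicates.
// `db.upsert` is a hypothetical stand-in for the Upsert node.
async function indexDocument(db: any, source: string, chunks: string[], vectors: number[][]) {
  for (let i = 0; i < chunks.length; i++) {
    await db.upsert({
      id: chunkId(source, i),
      text: chunks[i],
      source,
      vector: vectors[i],
    });
  }
}
```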
Removing Documents
Use Delete with filters to remove outdated content:
Delete
 │
 ├── Database: (connection)
 ├── SQL Filter: "source = 'old_document.pdf'"
 │
 └── End

Performance Tips
1. Batch Embeddings
Instead of embedding one document at a time, use Embed Documents (plural) for batch processing.
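The same principle applies if you embed outside the flow: send chunks in groups rather than one request each. A sketch built on the earlier embedChunks helper (the batch size is illustrative; check your provider's limits):

```ts
// Embed a large list of chunks in batches to cut per-request overhead
// and stay under the provider's input limits.
async function embedAll(chunks: string[], batchSize = 100): Promise<number[][]> {
  const vectors: number[][] = [];
  for (let i = 0; i < chunks.length; i += batchSize) {
    const batch = chunks.slice(i, i + batchSize);
    vectors.push(...(await embedChunks(batch)));
  }
  return vectors;
}
```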
2. Limit Results
Don’t retrieve too many documents—5-10 is usually enough. More can overwhelm the AI’s context window.
3. Use Hybrid Search
For production systems, hybrid search usually outperforms pure vector search.
4. Optimize Chunk Overlap
10-20% overlap ensures important information at chunk boundaries isn’t lost.
Common Issues
“AI ignores my documents”
- Check that your system prompt instructs the AI to use the context
- Verify documents are being retrieved (log the search results)
- Ensure the retrieved text is actually being added to the prompt
“Search returns irrelevant results”
- Try different chunk sizes
- Use hybrid search with re-ranking
- Check you’re using the same embedding model for indexing and queries
“Database is empty”
- Verify your indexing flow ran successfully
- Check the database name matches between indexing and querying
- Look for errors in the indexing flow logs
Next Steps
With RAG set up, explore:
- AI Agents – Let your AI search the knowledge base autonomously
- Extraction – Pull structured data from retrieved documents
- Chat & Conversations – Build a conversational RAG interface