Indexer Node
The Indexer Node is like a smart librarian that takes long documents and organizes them into searchable sections. It breaks your content into manageable chunks and prepares them for AI to search through and understand.
This is the essential first step for building any AI knowledge base or document search system.
How it works
The node takes your documents, intelligently splits them at natural break points (like paragraphs), and converts each chunk into a searchable format that AI can understand and find relevant information from.
```mermaid
graph LR
    Document[Long Document] --> Split[Smart Splitting]
    Split --> Index[Create Search Index]
    Index --> Ready[Searchable Chunks]
    style Index fill:#6d28d9,stroke:#fff,color:#fff
```
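The flow above can be sketched in plain Python. This is a hedged illustration, not the node's actual implementation: `split_text`, `build_index`, and `search` are toy stand-ins, and the bag-of-words `embed` function stands in for a real embedding model.

```python
import math
import re


def split_text(text, chunk_size=1000):
    """Smart Splitting: break at paragraph boundaries, packing into chunks."""
    chunks, current = [], ""
    for para in re.split(r"\n\s*\n", text):
        if current and len(current) + len(para) > chunk_size:
            chunks.append(current.strip())
            current = ""
        current += para + "\n\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text, vocab):
    """Toy bag-of-words vector; a real Indexer Node calls an embedding model."""
    words = tokenize(text)
    vec = [float(words.count(term)) for term in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_index(document, chunk_size=1000):
    """Create Search Index: one (chunk, vector) pair per chunk."""
    chunks = split_text(document, chunk_size)
    vocab = sorted({w for c in chunks for w in tokenize(c)})
    return vocab, [(c, embed(c, vocab)) for c in chunks]


def search(vocab, index, query):
    """Return the chunk whose vector is most similar to the query's."""
    q = embed(query, vocab)
    return max(index, key=lambda cv: sum(a * b for a, b in zip(q, cv[1])))[0]


doc = ("Vacation policy: employees receive 20 paid days per year.\n\n"
       "Expense policy: submit receipts by the end of each month.")
vocab, index = build_index(doc, chunk_size=60)
print(search(vocab, index, "How many vacation days do I get?"))
```

Even with this toy similarity measure, the vacation question retrieves the vacation chunk rather than the expense one, which is exactly the behavior the index enables at scale.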
Setup guide
1. Provide Your Content: Connect documents, web pages, or any text content you want to make searchable.
2. Choose Chunk Size: Decide how big each searchable piece should be (1000 characters works well for most content).
3. Set Overlap: Choose how much chunks should overlap to maintain context between sections.
4. Select AI Model: Choose an embedding model to convert text into a searchable format.
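The four steps above boil down to a handful of parameters. A minimal sketch in Python, assuming a plain character-based chunker; the file path is illustrative, and the model name simply echoes the embedding model recommended later in this page:

```python
def chunk_with_overlap(text, chunk_size=1000, overlap=200):
    """Steps 2 and 3: cut text into chunk_size pieces that share `overlap` chars."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


config = {
    "content": "docs/employee-handbook.txt",     # step 1: your content (illustrative path)
    "chunk_size": 1000,                           # step 2: chunk size
    "overlap": 200,                               # step 3: overlap
    "embedding_model": "text-embedding-ada-002",  # step 4: AI model
}

# 2500 characters of content -> chunks that each repeat the last 200
# characters of the previous chunk, so context survives the boundary.
chunks = chunk_with_overlap("x" * 2500, config["chunk_size"], config["overlap"])
print(len(chunks))
```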
Practical example: Company knowledge base
Let's create a searchable knowledge base. The best settings depend on what kind of documents you have.
For General Business Docs:
- Chunk Size: 1000 characters (about 2-3 paragraphs).
- Overlap: 200 characters (helps keep context between chunks).
- Best For: Policies, procedures, and FAQs.
For Technical Manuals:
- Chunk Size: 800 characters (smaller chunks for specific instructions).
- Overlap: 150 characters.
- Separators: Split by paragraphs or new lines.
For Research Papers:
- Chunk Size: 1200 characters (larger chunks to keep complex ideas together).
- Overlap: 300 characters.
- Goal: Detailed analysis and understanding.
Common configurations
| Content Type | Chunk Size (chars) | Overlap (chars) | Best For |
|---|---|---|---|
| General Business Docs | 1000 | 200 | Policies, procedures, FAQs |
| Technical Documentation | 800 | 150 | User manuals, API docs |
| Research Papers | 1200 | 300 | Academic content, detailed analysis |
| Customer Support | 600 | 100 | Quick answers, troubleshooting |
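The table above can be captured as a simple lookup so a workflow can pick settings by content type. A sketch, with illustrative preset names:

```python
# Presets taken from the configuration table above (sizes in characters).
CHUNK_PRESETS = {
    "general_business": {"chunk_size": 1000, "overlap": 200},
    "technical_docs":   {"chunk_size": 800,  "overlap": 150},
    "research_papers":  {"chunk_size": 1200, "overlap": 300},
    "customer_support": {"chunk_size": 600,  "overlap": 100},
}


def preset_for(content_type):
    """Fall back to the general-business defaults for unknown content types."""
    return CHUNK_PRESETS.get(content_type, CHUNK_PRESETS["general_business"])


print(preset_for("research_papers"))
```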
Configuration settings
| Setting | Purpose | Recommended Values |
|---|---|---|
| Chunk Size | How big each searchable piece is | 800-1200 characters |
| Chunk Overlap | How much pieces overlap | 150-300 characters |
| Separators | Where to split content | Paragraphs, sentences, sections |
| Embedding Model | AI model for search capability | OpenAI text-embedding-ada-002 |
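Separators control where splits happen. A common strategy is to try the coarsest separator first (paragraphs), then fall back to finer ones (lines, sentences) only for pieces that are still too large. The sketch below illustrates that idea in plain Python; it is not the node's exact algorithm:

```python
def split_on_separators(text, separators=("\n\n", "\n", ". "), chunk_size=800):
    """Split on the first separator that applies; recurse on oversized pieces."""
    if len(text) <= chunk_size or not separators:
        return [text]  # small enough, or no separators left to try
    sep, rest = separators[0], separators[1:]
    pieces = text.split(sep)
    if len(pieces) == 1:  # separator not present; try a finer one
        return split_on_separators(text, rest, chunk_size)
    out = []
    for piece in pieces:
        out.extend(split_on_separators(piece, rest, chunk_size))
    return out


# A short paragraph, an oversized unbreakable run, and a trailing line.
text = "Intro paragraph.\n\n" + "A" * 900 + "\nShort line."
chunks = split_on_separators(text, chunk_size=800)
print(len(chunks))
```

The intro paragraph stays whole, the oversized run is split off on the line separator, and the trailing line becomes its own chunk.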
Real-world examples
Employee handbook search
Make company policies instantly searchable:
- Input: Employee handbook PDF
- Chunk Size: 1000 (good for policy sections)
- Overlap: 200 (maintains context)
- Result: Searchable HR knowledge base

Technical documentation
Create searchable API documentation:
- Input: Technical documentation
- Chunk Size: 800 (shorter for specific instructions)
- Separators: ["##", "###", "\n\n"] (respects heading structure)
- Result: Instant technical support system

Research database
Build a searchable academic paper collection:
- Input: Research papers and articles
- Chunk Size: 1200 (longer for academic context)
- Overlap: 300 (important for research continuity)
- Result: AI-powered research assistant

Troubleshooting
- Poor chunk quality: Adjust chunk size and overlap settings, or customize separators for your document type.
- Slow processing: Reduce document size, process in smaller batches, or use local embedding models.
- Missing context: Increase chunk overlap to maintain better connections between sections.
- Memory issues: Process large documents in smaller segments or reduce the maximum number of chunks.
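For the "missing context" case, the effect of overlap is easy to see in miniature: with zero overlap, a word that straddles a chunk boundary is cut in two, while overlapping chunks each keep a full copy of the boundary region. A toy demonstration:

```python
def chunk(text, size, overlap):
    """Fixed-size character chunks; adjacent chunks share `overlap` characters."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


text = "alpha beta gamma delta epsilon"
no_overlap = chunk(text, size=12, overlap=0)
with_overlap = chunk(text, size=12, overlap=6)

# Without overlap, "gamma" is sliced across two chunks and no chunk
# contains the whole word; with overlap, one chunk keeps it intact.
print(any("gamma" in c for c in no_overlap))
print(any("gamma" in c for c in with_overlap))
```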