Ollama Embeddings

Ollama Embeddings converts text into numerical vectors (embeddings) that capture meaning and context. Think of it as creating a “fingerprint” for text that allows AI to understand similarity and relationships between different pieces of content.

Inputs:

| Name | Type | Description | Required | Default |
|------|------|-------------|----------|---------|
| text | Text | Content to convert to vectors | Yes | - |
| model | Text | Embedding model to use | Yes | - |
| ollama_url | Text | Ollama server location | No | http://localhost:11434 |

Outputs:

| Name | Type | Description |
|------|------|-------------|
| embedding | Array | Vector representation of the text |
| model_info | Object | Details about the embedding model |
| processing_time | Number | Time taken in milliseconds |

  • 🔒 Complete Privacy: Text processing happens locally on your machine
  • 💰 No API Costs: No per-request charges or usage limits
  • ⚡ Fast Processing: No network delays, just local computation
  • 🌐 Works Offline: Generate embeddings without internet connection
  • 🎛️ Full Control: Choose exactly which embedding models to use

flowchart LR
    A[📝 Your Text] --> B[🧠 Ollama Model]
    B --> C[🔢 Vector Numbers]
    C --> D[💾 Ready for Search]

    style A fill:#e3f2fd
    style B fill:#fff3e0
    style C fill:#f3e5f5
    style D fill:#e8f5e8

Simple Process:

  1. Input Text: Give it any text content
  2. AI Processing: Ollama converts text to numbers that capture meaning (see the request sketch after these steps)
  3. Vector Output: Get a list of numbers that represents your text
  4. Search Ready: These vectors can be used to find similar content
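
To make step 2 concrete, here is a minimal TypeScript sketch of a single embedding request. It assumes Ollama's /api/embeddings endpoint ({ model, prompt } in, { embedding } out) on a default local install; the node itself layers batching, caching, and retries on top of a call like this.

```ts
// Minimal sketch: ask a local Ollama server to embed one piece of text.
// Assumes the /api/embeddings endpoint and a default install on port 11434.
async function embed(text: string, model = "nomic-embed-text"): Promise<number[]> {
  const res = await fetch("http://localhost:11434/api/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt: text }),
  });
  if (!res.ok) throw new Error(`Ollama request failed: ${res.status}`);
  const data = await res.json();
  return data.embedding; // e.g. 768 numbers for nomic-embed-text
}
```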

  • 📚 Building Knowledge Bases: Convert documents to searchable format
  • 🔍 Finding Similar Content: Compare documents by meaning, not just keywords
  • 🤖 AI Search Systems: Essential component for smart document search
  • 📊 Content Organization: Group similar content automatically

  • Local Processing: Generate embeddings locally using Ollama without external API calls
  • Multiple Model Support: Access multiple embedding models including sentence-transformers and domain-specific models through Ollama
  • Batch Processing: Process multiple texts efficiently in single operations
  • Vector Operations: Calculate similarity scores and perform vector mathematics (see the similarity sketch after this list)
  • Privacy Protection: All embedding generation happens locally on user’s machine
  • Semantic Search: Create searchable embeddings for document collections
  • Content Similarity: Compare and cluster similar content or documents
  • Knowledge Base Creation: Generate embeddings for RAG and vector store systems
  • Content Recommendation: Find related content based on semantic similarity
  • Data Classification: Group and categorize content using embedding similarity
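
Because embeddings only become useful when compared, the similarity scoring referenced above usually means cosine similarity. A minimal sketch (this is the standard formula, not necessarily the node's internal implementation):

```ts
// Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|).
// Returns a value near 1 for semantically similar texts, near 0 for unrelated ones.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vector dimensions must match");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```
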
Required Parameters:

| Parameter | Type | Description | Example |
|-----------|------|-------------|---------|
| ollama_url | string | URL of the local Ollama server | "http://localhost:11434" |
| model | string | Ollama embedding model to use | "nomic-embed-text" |
| input_text | string | Text content to generate embeddings for | "This is sample text for embedding" |

Optional Parameters:

| Parameter | Type | Default | Description | Example |
|-----------|------|---------|-------------|---------|
| batch_size | number | 10 | Number of texts to process in each batch | 5 |
| normalize | boolean | true | Normalize embedding vectors to unit length (sketched below) | false |
| timeout | number | 30000 | Request timeout in milliseconds | 60000 |
| cache_embeddings | boolean | true | Cache generated embeddings for reuse | false |
| dimensions | number | auto | Expected embedding dimensions (auto-detected if not specified) | 768 |
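
The normalize option above most plausibly scales each vector to unit length, the standard meaning of the term; after that, cosine similarity reduces to a plain dot product. A sketch of that transformation:

```ts
// Scale a vector to unit length (what normalize: true presumably applies).
function toUnitLength(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? v : v.map((x) => x / norm);
}
```
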
Example Configuration:

{
  "ollama_url": "http://localhost:11434",
  "model": "nomic-embed-text",
  "input_text": "{content_to_embed}",
  "batch_size": 8,
  "normalize": true,
  "timeout": 45000,
  "cache_embeddings": true,
  "dimensions": 768,
  "model_options": {
    "temperature": 0.0,
    "top_p": 1.0
  },
  "retry_attempts": 3,
  "retry_delay": 1000
}
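
The retry_attempts and retry_delay options suggest simple retry-with-delay semantics. A hedged sketch of what such a wrapper could look like (the node's internal logic may differ, for example by using exponential backoff):

```ts
// Hypothetical retry helper mirroring the retry_attempts / retry_delay options.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3, delayMs = 1000): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn(); // success: return immediately
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) await new Promise((r) => setTimeout(r, delayMs));
    }
  }
  throw lastError; // all attempts failed
}
```

Usage would look like withRetry(() => embed("some text"), 3, 1000).
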
| Permission | Purpose | Security Impact |
|------------|---------|-----------------|
| storage | Cache embeddings and model configurations | Stores embedding data locally for performance |
| activeTab | Access content for embedding generation | Can read content from active browser tabs |
  • Fetch API: Communicates with local Ollama server for embedding generation
  • IndexedDB: Caches generated embeddings for improved performance (see the caching sketch after this list)
  • Web Workers: Processes large embedding operations without blocking UI
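
As an illustration of the IndexedDB caching mentioned above, here is a minimal sketch; the database name, store name, and keying scheme are assumptions, not the node's actual schema:

```ts
// Open (or create) a small IndexedDB database with one object store.
function openCache(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const req = indexedDB.open("embedding-cache", 1);
    req.onupgradeneeded = () => req.result.createObjectStore("embeddings");
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

// Store a vector under a key, e.g. the model name plus a hash of the text.
async function cachePut(key: string, vector: number[]): Promise<void> {
  const db = await openCache(); // a real implementation would reuse the connection
  return new Promise((resolve, reject) => {
    const tx = db.transaction("embeddings", "readwrite");
    tx.objectStore("embeddings").put(vector, key);
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}
```
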
| Feature | Chrome | Firefox | Safari | Edge |
|---------|--------|---------|--------|------|
| Ollama Integration | ✅ Full | ✅ Full | ⚠️ Limited | ✅ Full |
| Embedding Caching | ✅ Full | ✅ Full | ✅ Full | ✅ Full |
| Batch Processing | ✅ Full | ✅ Full | ✅ Full | ✅ Full |
  • Local Processing: All embedding generation occurs locally, ensuring data privacy
  • Network Security: Connections to Ollama server use secure local network protocols
  • Data Caching: Cached embeddings are stored securely in browser storage
  • Model Validation: Verifies Ollama model availability before processing (see the availability check after this list)
  • Resource Management: Monitors system resources to prevent overload
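
Model validation can be done against Ollama's model listing. A sketch assuming the GET /api/tags endpoint, which returns the locally installed models:

```ts
// Check whether a model (e.g. "nomic-embed-text") is installed locally.
// Assumes GET /api/tags returns { models: [{ name: "nomic-embed-text:latest", ... }] }.
async function isModelAvailable(
  model: string,
  baseUrl = "http://localhost:11434",
): Promise<boolean> {
  const res = await fetch(`${baseUrl}/api/tags`);
  if (!res.ok) return false;
  const data = await res.json();
  // Tag suffixes like ":latest" are tolerated by matching on the prefix.
  return (data.models ?? []).some((m: { name: string }) => m.name.startsWith(model));
}
```
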
Input Schema:

{
  "input_text": "string or array - Text(s) to generate embeddings for",
  "model_config": {
    "model": "string - Ollama model name",
    "options": "object - Model-specific options"
  },
  "processing_options": {
    "batch_size": "number - Batch processing size",
    "normalize": "boolean - Whether to normalize vectors",
    "cache_key": "string - Custom cache key for this embedding"
  },
  "metadata": {
    "source": "string - Source of the text content",
    "timestamp": "string - When content was extracted"
  }
}

Output Schema:

{
  "embeddings": [
    {
      "text": "string - Original text that was embedded",
      "vector": "array - Embedding vector (array of numbers)",
      "dimensions": "number - Vector dimensionality",
      "model": "string - Model used for embedding generation",
      "cache_hit": "boolean - Whether result came from cache"
    }
  ],
  "statistics": {
    "total_texts": "number - Number of texts processed",
    "processing_time": "number - Total processing time in milliseconds",
    "cache_hits": "number - Number of cached results used",
    "new_embeddings": "number - Number of newly generated embeddings"
  },
  "metadata": {
    "timestamp": "2024-01-15T10:30:00Z",
    "model_info": {
      "name": "nomic-embed-text",
      "dimensions": 768,
      "max_tokens": 8192
    },
    "source": "ollama_embeddings"
  }
}

Scenario: Generate embeddings for web page content to find similar documents

Configuration:

{
  "ollama_url": "http://localhost:11434",
  "model": "nomic-embed-text",
  "input_text": "{extracted_content}",
  "batch_size": 5,
  "normalize": true,
  "cache_embeddings": true
}

Input Data:

{
  "input_text": [
    "Artificial intelligence is transforming modern business operations through automation and data analysis.",
    "Machine learning algorithms help companies optimize their processes and improve decision-making.",
    "The latest developments in AI technology focus on natural language processing and computer vision."
  ],
  "model_config": {
    "model": "nomic-embed-text",
    "options": {
      "temperature": 0.0
    }
  },
  "processing_options": {
    "batch_size": 3,
    "normalize": true,
    "cache_key": "ai_content_batch_1"
  }
}

Expected Output:

{
  "embeddings": [
    {
      "text": "Artificial intelligence is transforming modern business operations through automation and data analysis.",
      "vector": [0.123, -0.456, 0.789, "... (765 more values)"],
      "dimensions": 768,
      "model": "nomic-embed-text",
      "cache_hit": false
    },
    {
      "text": "Machine learning algorithms help companies optimize their processes and improve decision-making.",
      "vector": [0.234, -0.567, 0.890, "... (765 more values)"],
      "dimensions": 768,
      "model": "nomic-embed-text",
      "cache_hit": false
    },
    {
      "text": "The latest developments in AI technology focus on natural language processing and computer vision.",
      "vector": [0.345, -0.678, 0.901, "... (765 more values)"],
      "dimensions": 768,
      "model": "nomic-embed-text",
      "cache_hit": false
    }
  ],
  "statistics": {
    "total_texts": 3,
    "processing_time": 2500,
    "cache_hits": 0,
    "new_embeddings": 3
  },
  "metadata": {
    "timestamp": "2024-01-15T10:30:00Z",
    "model_info": {
      "name": "nomic-embed-text",
      "dimensions": 768,
      "max_tokens": 8192
    },
    "source": "ollama_embeddings"
  }
}

Step-by-Step Process:

  1. Text content is prepared and validated for embedding generation
  2. Connection to local Ollama server is established and model availability verified
  3. Texts are processed in batches using the specified embedding model (see the batching sketch after these steps)
  4. Generated embeddings are normalized and cached for future use
  5. Results include both embeddings and processing statistics
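
Step 3 (batching) can be pictured as follows; this reuses the embed() sketch from earlier and is illustrative only, since the node's real implementation also handles caching and deduplication:

```ts
// Embed texts in fixed-size batches: each batch runs concurrently,
// but the next batch waits, which bounds the load on the Ollama server.
async function embedInBatches(texts: string[], batchSize = 10): Promise<number[][]> {
  const vectors: number[][] = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    const batch = texts.slice(i, i + batchSize);
    vectors.push(...(await Promise.all(batch.map((t) => embed(t)))));
  }
  return vectors;
}
```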

Scenario: Create embeddings for documents to build a searchable knowledge base

Configuration:

{
  "ollama_url": "http://localhost:11434",
  "model": "nomic-embed-text",
  "input_text": "{document_chunks}",
  "batch_size": 10,
  "normalize": true,
  "cache_embeddings": true,
  "dimensions": 768
}

Workflow Integration:

GetAllTextFromLink → RecursiveCharacterTextSplitter → Ollama Embeddings → LocalKnowledge
        ↓                         ↓                          ↓                   ↓
   raw_content               text_chunks                 embeddings        vector_storage

Complete Example: This pattern creates a complete pipeline for building searchable knowledge bases from web content, enabling semantic search and RAG capabilities.
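
Once vectors are stored, querying the knowledge base amounts to embedding the query and ranking entries by similarity. A small in-memory sketch reusing the embed() and cosineSimilarity() helpers from earlier (a real pipeline would query LocalKnowledge or another vector store instead):

```ts
interface Entry {
  text: string;
  vector: number[];
}

// Return the topK entries most similar to the query text.
async function search(query: string, entries: Entry[], topK = 3): Promise<Entry[]> {
  const q = await embed(query);
  return entries
    .map((e) => ({ e, score: cosineSimilarity(e.vector, q) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(({ e }) => e);
}
```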

This example demonstrates the fundamental usage of the OllamaEmbeddings node in a typical workflow scenario.

Configuration:

{
  "model": "nomic-embed-text",
  "ollama_url": "http://localhost:11434",
  "input_text": "{content_to_embed}"
}

Input Data:

{
  "input_text": "Ollama runs large language models locally."
}

Expected Output:

{
  "embeddings": [
    {
      "text": "Ollama runs large language models locally.",
      "vector": [0.112, -0.034, 0.271, "... (765 more values)"],
      "dimensions": 768,
      "model": "nomic-embed-text",
      "cache_hit": false
    }
  ]
}

This example shows more complex configuration options and integration patterns.

Configuration:

{
  "ollama_url": "http://localhost:11434",
  "model": "nomic-embed-text",
  "batch_size": 10,
  "normalize": true,
  "cache_embeddings": true,
  "timeout": 60000,
  "model_options": {
    "temperature": 0.0,
    "top_p": 1.0
  },
  "retry_attempts": 3,
  "retry_delay": 1000
}

Example showing how this node integrates with other workflow nodes:

  1. Previous Node → OllamaEmbeddings → Next Node
  2. Data flows through the workflow with appropriate transformations
  3. Error handling and validation at each step

Knowledge Base Pipeline:

  • Nodes: GetAllTextFromLink → RecursiveCharacterTextSplitter → Ollama Embeddings → LocalKnowledge
  • Use Case: Build searchable knowledge bases from web content
  • Configuration Tips: Use consistent chunk sizes and embedding models for optimal search performance

Similarity Analysis:

  • Nodes: Ollama Embeddings → Code → Filter → EditFields
  • Use Case: Calculate similarity scores and filter content based on semantic similarity
  • Data Flow: Embedding generation → Similarity calculation → Filtering → Result formatting
  • Performance: Use appropriate batch sizes to balance speed and resource usage
  • Error Handling: Implement retry logic for Ollama server connection issues
  • Data Validation: Validate text content and handle encoding issues before embedding
  • Resource Management: Monitor Ollama server resources and implement request throttling

Connection Failures:

  • Symptoms: Embedding requests fail with connection errors or timeouts
  • Causes: Ollama server not running, incorrect URL, or network connectivity issues
  • Solutions:
    1. Verify Ollama server is running on the specified URL
    2. Check network connectivity and firewall settings
    3. Increase timeout values for slower systems
    4. Verify the specified model is available in Ollama
  • Prevention: Implement health checks and server status monitoring (see the health-check sketch below)
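
A health check can be as simple as probing the server with a short timeout. A sketch using AbortController (the root endpoint of a running Ollama server responds to plain GET requests):

```ts
// Probe the Ollama server; resolve false on any error or after timeoutMs.
async function isOllamaUp(
  baseUrl = "http://localhost:11434",
  timeoutMs = 3000,
): Promise<boolean> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetch(baseUrl, { signal: controller.signal });
    return res.ok;
  } catch {
    return false; // unreachable, refused, or timed out
  } finally {
    clearTimeout(timer);
  }
}
```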

Slow Performance:

  • Symptoms: Embedding operations take significantly longer than expected
  • Causes: Large batch sizes, resource-intensive models, or system limitations
  • Solutions:
    1. Reduce batch_size parameter for better performance
    2. Use lighter embedding models if available
    3. Implement caching to avoid regenerating embeddings
    4. Monitor system resources and optimize accordingly
  • Prevention: Profile embedding performance and optimize batch sizes
  • CORS policies may affect local Ollama server connections; configure server appropriately
  • Use service workers for background embedding processing
  • WebExtension networking may have different timeout behaviors
  • Ensure proper error handling for network request failures
  • Memory Usage: Large embedding batches may consume significant memory
  • Processing Time: Complex models may require substantial processing time
  • Cache Management: Large embedding caches may impact browser storage
  • Ollama Dependency: Requires local Ollama installation and running server
  • Model Availability: Limited to embedding models supported by Ollama
  • Processing Speed: Local processing may be slower than cloud-based alternatives
  • Network Access: Requires network access to local Ollama server
  • Resource Constraints: Browser memory limits may restrict batch processing
  • CORS Restrictions: May require Ollama server CORS configuration
  • Text Length: Limited by Ollama model’s maximum token capacity
  • Batch Size: Large batches may cause memory or timeout issues
  • Model Constraints: Embedding quality depends on chosen Ollama model capabilities

LLM: Large Language Model - AI models trained on vast amounts of text data

RAG: Retrieval-Augmented Generation - AI technique combining information retrieval with text generation

Vector Store: Database optimized for storing and searching high-dimensional vectors

Embeddings: Numerical representations of text that capture semantic meaning

Prompt: Input text that guides AI model behavior and response generation

Temperature: Parameter controlling randomness in AI responses (0.0-1.0)

Tokens: Units of text processing used by AI models for input and output measurement

  • artificial intelligence
  • machine learning
  • natural language processing
  • LLM
  • AI agent
  • chatbot
  • text generation
  • language model
  • “ai”
  • “llm”
  • “gpt”
  • “chat”
  • “generate”
  • “analyze”
  • “understand”
  • “process text”
  • “smart”
  • “intelligent”
  • content analysis
  • text generation
  • question answering
  • document processing
  • intelligent automation
  • knowledge extraction