Multi-Step Workflow Integration

Learn to build workflows that combine browser content extraction with external API processing. This tutorial walks through a multi-step automation that extracts data from a web page, validates it, and enriches it with results from an external service.

In this tutorial, you’ll create a comprehensive workflow that:

  • Extracts product information from e-commerce pages
  • Processes and validates the extracted data
  • Integrates with external APIs for price comparison
  • Generates formatted reports with recommendations
  • Handles errors and edge cases gracefully

Before starting, you should have:

  • Completed all Beginner Tutorials
  • An understanding of REST APIs and JSON data
  • Basic knowledge of data validation concepts
  • Familiarity with HTTP requests and responses

By the end of this tutorial, you’ll understand:

  • How to design complex multi-step workflows
  • Integration patterns for external APIs
  • Data validation and error handling strategies
  • Performance optimization for complex workflows
  • Real-world automation patterns

Trigger → Extract Product Data → Validate Data → API Integration → Generate Report
↓ ↓ ↓ ↓ ↓
WhenStarted → GetAllText → EditFields → Filter → HTTP Request → EditFields → DownloadAsFile
↓ ↓ ↓ ↓ ↓
GetAllImages → EditFields → Merge → Error Handler → Format Data

Stage 1: Content Extraction

  • Extract product names, prices, descriptions
  • Collect product images and metadata
  • Gather page context and source information

Stage 2: Data Processing

  • Clean and normalize extracted data
  • Validate data completeness and accuracy
  • Merge multiple data sources

Stage 3: External Integration

  • Query price comparison APIs
  • Fetch additional product information
  • Validate external data responses

Stage 4: Report Generation

  • Combine internal and external data
  • Format results for human consumption
  • Generate downloadable reports

  1. Create New Workflow

    • Name: “Product Analysis Pipeline”
    • Category: “Intermediate Tutorials”
    • Description: “Multi-step product data extraction and analysis”
  2. Add Core Nodes

    WhenStarted (Trigger)
    GetAllText (Content Extraction)
    GetAllImages (Image Extraction)
    EditFields (Data Processing) x5
    Filter (Data Validation)
    HTTP Request (API Integration)
    Merge (Data Combination)
    DownloadAsFile (Output)
  3. Plan Data Structure

    // Target data structure throughout workflow:
    {
    "product": {
    "name": "Product Name",
    "price": "$99.99",
    "description": "Product description...",
    "images": ["url1", "url2"],
    "source": "https://example-store.com/product"
    },
    "analysis": {
    "extractedAt": "2024-01-15T10:30:45.123Z",
    "dataQuality": "high",
    "completeness": 95
    },
    "comparison": {
    "averagePrice": "$89.99",
    "priceRank": "above-average",
    "competitors": [...]
    }
    }

GetAllText Node Configuration:

{
"nodeName": "Extract Product Text",
"settings": {
"includeHidden": false,
"preserveFormatting": true,
"excludeElements": ["nav", "footer", "aside", ".advertisement"],
"focusSelectors": [".product-info", ".product-details", ".price"]
}
}

GetAllImages Node Configuration:

{
"nodeName": "Extract Product Images",
"settings": {
"includeDataUrls": false,
"minWidth": 100,
"minHeight": 100,
"excludeSelectors": [".thumbnail", ".icon", ".logo"],
"includeAltText": true
}
}

EditFields Node 1 - Text Processing:

{
"nodeName": "Process Product Text",
"operations": [
{
"field": "productName",
"action": "extract",
"pattern": "(?i)(product|item)\\s*:?\\s*([^\\n]+)",
"group": 2
},
{
"field": "price",
"action": "extract",
"pattern": "\\$[0-9,]+\\.?[0-9]*",
"multiple": false
},
{
"field": "description",
"action": "extract",
"pattern": "(?i)description\\s*:?\\s*([^\\n]{50,500})",
"group": 1
},
{
"field": "extractedAt",
"action": "set",
"value": "{{new Date().toISOString()}}"
}
]
}
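To make the extraction operations concrete, here is a minimal Python sketch that applies the same three patterns with the standard `re` module. The helper name `extract_product_fields` and the sample text are hypothetical, and the workflow tool's own regex engine may handle flags such as `(?i)` differently.

```python
import re
from datetime import datetime, timezone

# Patterns copied from the "Process Product Text" configuration above.
NAME_PATTERN = re.compile(r"(?i)(product|item)\s*:?\s*([^\n]+)")
PRICE_PATTERN = re.compile(r"\$[0-9,]+\.?[0-9]*")
DESCRIPTION_PATTERN = re.compile(r"(?i)description\s*:?\s*([^\n]{50,500})")


def extract_product_fields(page_text: str) -> dict:
    """Return the fields the node is configured to set (hypothetical helper)."""
    name = NAME_PATTERN.search(page_text)
    price = PRICE_PATTERN.search(page_text)
    description = DESCRIPTION_PATTERN.search(page_text)
    return {
        "productName": name.group(2).strip() if name else None,
        "price": price.group(0) if price else None,  # first match only ("multiple": false)
        "description": description.group(1).strip() if description else None,
        "extractedAt": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical page text, used only to exercise the patterns.
sample = (
    "Product: Acme Laptop 15\n"
    "Price: $1,299.99\n"
    "Description: A lightweight 15-inch laptop with a long-lasting battery and a bright display.\n"
)
fields = extract_product_fields(sample)
```

Real product pages rarely label fields this neatly, so the patterns usually need tuning per site.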

EditFields Node 2 - Image Processing:

{
"nodeName": "Process Product Images",
"operations": [
{
"field": "productImages",
"action": "filter",
"conditions": [
{"property": "src", "operator": "not_contains", "value": "logo"},
{"property": "width", "operator": "greater_than", "value": 200}
]
},
{
"field": "primaryImage",
"action": "select",
"criteria": "largest",
"fallback": "first"
},
{
"field": "imageCount",
"action": "count",
"source": "productImages"
}
]
}
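The image operations above amount to a filter, a selection, and a count. The Python sketch below shows one possible reading of them; `process_images` is a hypothetical helper, and treating "largest" as greatest pixel area is an assumption, since the configuration does not define it.

```python
def process_images(images: list[dict]) -> dict:
    """Drop logos and narrow images, then pick a primary image (hypothetical helper)."""
    kept = [
        img for img in images
        if "logo" not in img.get("src", "") and img.get("width", 0) > 200
    ]
    # Assumption: "largest" means greatest width * height. max() returns the first
    # maximal element, so ties (or missing sizes) fall back to the first image.
    primary = max(
        kept,
        key=lambda img: img.get("width", 0) * img.get("height", 0),
        default=None,
    )
    return {"productImages": kept, "primaryImage": primary, "imageCount": len(kept)}


# Hypothetical extraction output.
images = [
    {"src": "https://example-store.com/logo.png", "width": 400, "height": 400},
    {"src": "https://example-store.com/front.jpg", "width": 800, "height": 600},
    {"src": "https://example-store.com/side.jpg", "width": 300, "height": 300},
    {"src": "https://example-store.com/thumb.jpg", "width": 150, "height": 150},
]
result = process_images(images)
```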

Filter Node Configuration:

{
"nodeName": "Validate Product Data",
"conditions": [
{
"field": "productName",
"operator": "not_empty",
"required": true,
"errorMessage": "Product name is required"
},
{
"field": "price",
"operator": "matches",
"pattern": "\\$[0-9,]+\\.?[0-9]*",
"required": true,
"errorMessage": "Valid price is required"
},
{
"field": "description",
"operator": "min_length",
"value": 20,
"required": false,
"errorMessage": "Description too short"
}
],
"onFailure": "continue_with_warning",
"logFailures": true
}
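The validation rules can be read as two hard requirements and one soft check. The following Python sketch mirrors that split; `validate_product` is a hypothetical helper, and two details are assumptions: that the `matches` operator means a full match, and that a missing description counts as "too short".

```python
import re

PRICE_RE = re.compile(r"\$[0-9,]+\.?[0-9]*")  # same pattern as the Filter node


def validate_product(item: dict) -> dict:
    """Collect errors for required fields and warnings for optional ones."""
    errors, warnings = [], []
    if not item.get("productName"):
        errors.append("Product name is required")
    # Assumption: "matches" requires the whole value to match the pattern.
    if not PRICE_RE.fullmatch(item.get("price") or ""):
        errors.append("Valid price is required")
    # "required": false, so a short description only produces a warning.
    if len(item.get("description") or "") < 20:
        warnings.append("Description too short")
    return {"valid": not errors, "errors": errors, "warnings": warnings}


ok = validate_product(
    {"productName": "Acme Laptop", "price": "$1,299.99", "description": "short"}
)
bad = validate_product({"price": "12 dollars"})
```

Returning the messages instead of raising matches the `continue_with_warning` setting: the item keeps flowing through the workflow and the failures can be logged.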

EditFields Node 3 - Data Enhancement:

{
"nodeName": "Enrich Product Data",
"operations": [
{
"field": "priceNumeric",
"action": "convert",
"source": "price",
"type": "number",
"removeChars": ["$", ","]
},
{
"field": "category",
"action": "classify",
"rules": [
{"keywords": ["laptop", "computer"], "category": "electronics"},
{"keywords": ["shirt", "pants", "dress"], "category": "clothing"},
{"keywords": ["book", "novel"], "category": "books"}
],
"fallback": "general"
},
{
"field": "dataQuality",
"action": "calculate",
"expression": "{{($json.productName ? 30 : 0) + ($json.price ? 30 : 0) + ($json.description ? 25 : 0) + ($json.primaryImage ? 15 : 0)}}"
}
]
}
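As a worked example of the enrichment step, the Python sketch below converts the price, classifies the product, and computes the same 30/30/25/15 quality score. `enrich` is a hypothetical helper; matching keywords against the product name and description is an assumption, because the configuration does not say which text the classifier inspects.

```python
CATEGORY_RULES = [
    (("laptop", "computer"), "electronics"),
    (("shirt", "pants", "dress"), "clothing"),
    (("book", "novel"), "books"),
]


def enrich(item: dict) -> dict:
    """Add priceNumeric, category, and dataQuality to an extracted item."""
    cleaned = (item.get("price") or "").replace("$", "").replace(",", "")
    try:
        price_numeric = float(cleaned)
    except ValueError:
        price_numeric = None  # missing or unparsable price

    # Assumption: keywords are matched as substrings of name + description.
    text = " ".join(str(item.get(k) or "") for k in ("productName", "description")).lower()
    category = next(
        (name for keywords, name in CATEGORY_RULES if any(k in text for k in keywords)),
        "general",
    )

    # Same weights as the dataQuality expression: the maximum score is 100.
    quality = (
        (30 if item.get("productName") else 0)
        + (30 if item.get("price") else 0)
        + (25 if item.get("description") else 0)
        + (15 if item.get("primaryImage") else 0)
    )
    return {**item, "priceNumeric": price_numeric, "category": category, "dataQuality": quality}


enriched = enrich({"productName": "Acme Laptop 15", "price": "$1,299.99"})
```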

HTTP Request Node Configuration:

{
"nodeName": "Price Comparison API",
"method": "POST",
"url": "https://api.pricecomparison.com/v1/search",
"headers": {
"Content-Type": "application/json",
"Authorization": "Bearer {{$env.PRICE_API_KEY}}"
},
"body": {
"query": "{{$json.productName}}",
"category": "{{$json.category}}",
"priceRange": {
"min": "{{Math.max(0, $json.priceNumeric * 0.7)}}",
"max": "{{$json.priceNumeric * 1.3}}"
},
"limit": 10
},
"timeout": 10000,
"retries": 2
}
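For readers who want to see the request outside the node, here is a hedged Python sketch that builds the same POST with the standard library. `build_price_request` is a hypothetical helper, the URL is the one from the configuration above, and the request is only constructed, not sent.

```python
import json
import os
import urllib.request

API_URL = "https://api.pricecomparison.com/v1/search"  # from the node configuration


def build_price_request(item: dict) -> urllib.request.Request:
    """Build the price comparison request for one enriched item."""
    price = item["priceNumeric"]
    body = {
        "query": item["productName"],
        "category": item["category"],
        # Search within roughly +/-30% of the extracted price.
        "priceRange": {"min": max(0, price * 0.7), "max": price * 1.3},
        "limit": 10,
    }
    headers = {
        "Content-Type": "application/json",
        # The key is read from the environment rather than hard-coded.
        "Authorization": f"Bearer {os.environ.get('PRICE_API_KEY', '')}",
    }
    return urllib.request.Request(
        API_URL, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )


request = build_price_request(
    {"productName": "Acme Laptop", "category": "electronics", "priceNumeric": 100.0}
)
# Sending it would look like: urllib.request.urlopen(request, timeout=10)
# (10 seconds corresponds to the node's 10000 ms timeout).
```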

EditFields Node 4 - API Data Processing:

{
"nodeName": "Process API Response",
"operations": [
{
"field": "competitorPrices",
"action": "extract",
"source": "response.results",
"mapping": {
"price": "price",
"store": "store_name",
"url": "product_url"
}
},
{
"field": "averagePrice",
"action": "calculate",
"expression": "{{$json.competitorPrices.reduce((sum, item) => sum + item.price, 0) / $json.competitorPrices.length}}"
},
{
"field": "priceRank",
"action": "classify",
"rules": [
{"condition": "$json.priceNumeric < $json.averagePrice * 0.9", "value": "below-average"},
{"condition": "$json.priceNumeric > $json.averagePrice * 1.1", "value": "above-average"},
{"default": true, "value": "average"}
]
},
{
"field": "savings",
"action": "calculate",
"expression": "{{Math.max(0, $json.averagePrice - $json.priceNumeric)}}"
}
]
}
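The arithmetic behind these operations is easy to check by hand. The Python sketch below maps the response fields, averages the competitor prices, and applies the same 10% band for ranking. `analyse_prices` is a hypothetical helper; returning "unknown" for an empty result list is an assumption borrowed from the fallback data used elsewhere in this tutorial.

```python
def analyse_prices(price_numeric: float, results: list[dict]) -> dict:
    """Compare one price against competitor results from the API."""
    competitors = [
        {"price": r["price"], "store": r["store_name"], "url": r["product_url"]}
        for r in results
    ]
    if not competitors:
        # Assumption: with nothing to compare against, report "unknown".
        return {"competitorPrices": [], "averagePrice": None,
                "priceRank": "unknown", "savings": 0.0}

    average = sum(c["price"] for c in competitors) / len(competitors)
    if price_numeric < average * 0.9:
        rank = "below-average"
    elif price_numeric > average * 1.1:
        rank = "above-average"
    else:
        rank = "average"
    return {
        "competitorPrices": competitors,
        "averagePrice": average,
        "priceRank": rank,
        "savings": max(0.0, average - price_numeric),
    }


# Hypothetical API results.
analysis = analyse_prices(80.0, [
    {"price": 100.0, "store_name": "Store A", "product_url": "https://a.example/p"},
    {"price": 120.0, "store_name": "Store B", "product_url": "https://b.example/p"},
])
```

With competitor prices of 100 and 120 the average is 110, so a price of 80 falls below the 99 threshold and is ranked below-average with savings of 30.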

Error Handler Node (IF Node):

{
"nodeName": "Handle API Errors",
"conditions": [
{
"field": "response.status",
"operator": "not_equals",
"value": 200
}
],
"onTrue": {
"action": "set_fallback_data",
"data": {
"competitorPrices": [],
"averagePrice": null,
"priceRank": "unknown",
"apiError": true,
"errorMessage": "Price comparison service unavailable"
}
},
"onFalse": {
"action": "continue"
}
}

Custom Retry Pattern:

// Implemented in HTTP Request node settings:
{
"retryConfig": {
"maxRetries": 3,
"retryDelay": 1000,
"backoffMultiplier": 2,
"retryOn": [500, 502, 503, 504],
"timeoutRetry": true
}
}
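The retry settings and the fallback data work together: retry on transient failures, then give up and substitute defaults. A minimal Python sketch of that behaviour follows; `call_with_retry` and its `send` callback are hypothetical names, and the delays are in seconds rather than milliseconds.

```python
import time
from typing import Callable

RETRY_STATUSES = {500, 502, 503, 504}
FALLBACK = {
    "competitorPrices": [],
    "averagePrice": None,
    "priceRank": "unknown",
    "apiError": True,
    "errorMessage": "Price comparison service unavailable",
}


def call_with_retry(
    send: Callable[[], tuple[int, dict]],
    max_retries: int = 3,
    retry_delay: float = 1.0,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send() until it returns status 200, backing off between attempts."""
    delay = retry_delay
    for attempt in range(max_retries + 1):
        try:
            status, payload = send()
        except TimeoutError:
            status, payload = None, {}  # timeouts are retried ("timeoutRetry": true)
        if status == 200:
            return payload
        retryable = status is None or status in RETRY_STATUSES
        if not retryable or attempt == max_retries:
            break
        sleep(delay)
        delay *= backoff  # waits of 1 s, 2 s, 4 s with the defaults
    return dict(FALLBACK)


# Simulated responses: two transient failures, then success.
responses = iter([(503, {}), (503, {}), (200, {"results": []})])
delays: list[float] = []
result = call_with_retry(lambda: next(responses), sleep=delays.append)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. Non-retryable statuses such as 404 go straight to the fallback data.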

Merge Node Configuration:

{
"nodeName": "Combine All Data",
"mergeStrategy": "deep_merge",
"inputs": [
{
"source": "product_data",
"priority": 1
},
{
"source": "api_data",
"priority": 2
},
{
"source": "enrichment_data",
"priority": 3
}
],
"conflictResolution": "highest_priority"
}
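A deep merge with priority-based conflict resolution can be sketched in a few lines of Python. The helper names below are hypothetical, and the sketch assumes that priority 1 is the highest, which the configuration does not state explicitly.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into a copy of base; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_inputs(inputs: list[tuple[int, dict]]) -> dict:
    """Merge (priority, data) pairs. Assumption: 1 is the highest priority."""
    result: dict = {}
    # Apply the lowest priority first so higher-priority values overwrite it.
    for _, data in sorted(inputs, key=lambda pair: pair[0], reverse=True):
        result = deep_merge(result, data)
    return result


# Hypothetical inputs standing in for product_data, api_data, and enrichment_data.
combined = merge_inputs([
    (1, {"product": {"name": "A"}, "source": "page"}),
    (2, {"product": {"name": "B", "price": 5}}),
    (3, {"source": "enrichment", "category": "general"}),
])
```

Nested keys that appear in only one input survive the merge, which is what separates a deep merge from a plain dictionary update.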

EditFields Node 5 - Report Generation:

{
"nodeName": "Generate Report",
"operations": [
{
"field": "report",
"action": "template",
"template": {
"productAnalysis": {
"name": "{{$json.productName}}",
"currentPrice": "{{$json.price}}",
"dataQuality": "{{$json.dataQuality}}%",
"category": "{{$json.category}}"
},
"marketComparison": {
"averageMarketPrice": "{{$json.averagePrice ? '$' + $json.averagePrice.toFixed(2) : 'N/A'}}",
"pricePosition": "{{$json.priceRank}}",
"potentialSavings": "{{$json.savings ? '$' + $json.savings.toFixed(2) : '$0.00'}}",
"competitorCount": "{{$json.competitorPrices.length}}"
},
"recommendations": "{{$json.priceRank === 'below-average' ? 'Good deal - price is below market average' : $json.priceRank === 'above-average' ? 'Consider shopping around - price is above market average' : 'Price is in line with market average'}}"
}
},
{
"field": "metadata",
"action": "set",
"value": {
"generatedAt": "{{new Date().toISOString()}}",
"workflowVersion": "1.0",
"processingTime": "{{Date.now() - $json.startTime}}ms"
}
}
]
}
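The report template is mostly string formatting with a few fallbacks. The Python sketch below reproduces it for illustration; `build_report` and `money` are hypothetical helpers, and the recommendation strings are copied from the template above.

```python
RECOMMENDATIONS = {
    "below-average": "Good deal - price is below market average",
    "above-average": "Consider shopping around - price is above market average",
}
DEFAULT_RECOMMENDATION = "Price is in line with market average"


def money(value, missing: str) -> str:
    """Format a number as dollars, or return the placeholder when it is absent."""
    return f"${value:.2f}" if value else missing


def build_report(item: dict) -> dict:
    """Assemble the report structure from a merged workflow item."""
    return {
        "productAnalysis": {
            "name": item.get("productName"),
            "currentPrice": item.get("price"),
            "dataQuality": f"{item.get('dataQuality', 0)}%",
            "category": item.get("category"),
        },
        "marketComparison": {
            "averageMarketPrice": money(item.get("averagePrice"), "N/A"),
            "pricePosition": item.get("priceRank"),
            "potentialSavings": money(item.get("savings"), "$0.00"),
            "competitorCount": len(item.get("competitorPrices") or []),
        },
        "recommendations": RECOMMENDATIONS.get(item.get("priceRank"), DEFAULT_RECOMMENDATION),
    }


report = build_report({
    "productName": "Acme Laptop", "price": "$80.00", "dataQuality": 85,
    "category": "electronics", "averagePrice": 110.0, "priceRank": "below-average",
    "savings": 30.0, "competitorPrices": [{"price": 100.0}, {"price": 120.0}],
})
```

As in the template, any rank other than below-average or above-average (including "unknown") produces the neutral recommendation.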

Test Case 1: E-commerce Product Page

  1. Navigate to a product page on Amazon, eBay, or a similar store
  2. Execute workflow and verify data extraction
  3. Check API integration and response processing
  4. Validate final report generation

Test Case 2: Error Scenarios

  1. Test with invalid product pages
  2. Simulate API failures (disconnect internet)
  3. Test with pages missing key information
  4. Verify error handling and fallback data

Test Case 3: Performance Testing

  1. Measure execution time for complete workflow
  2. Test with different product categories
  3. Monitor memory usage during execution
  4. Verify timeout handling

Optimization Strategies:

  1. Parallel Processing:

    // Configure nodes to run in parallel where possible:
    GetAllText + GetAllImages → Process simultaneously
  2. Caching Strategy:

    {
    "cacheConfig": {
    "enableCache": true,
    "cacheDuration": 300000, // 5 minutes
    "cacheKey": "{{$json.productName}}_{{$json.source}}"
    }
    }
  3. Resource Management:

    {
    "resourceLimits": {
    "maxConcurrentRequests": 3,
    "requestTimeout": 10000,
    "maxRetries": 2
    }
    }
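The caching strategy in item 2 is a time-to-live cache keyed on product name and source. A minimal Python sketch, assuming an in-memory store and a hypothetical `TTLCache` class (the workflow tool's actual cache is not documented here):

```python
import time
from typing import Any, Callable


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds  # 300 s corresponds to cacheDuration 300000 ms
        self.clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self.clock() + self.ttl, value)

    def get(self, key: str) -> Any:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # drop the expired entry
            return None
        return value


# A fake clock makes the expiry visible without waiting five minutes.
now = [0.0]
cache = TTLCache(ttl_seconds=300.0, clock=lambda: now[0])
key = "Acme Laptop_https://example-store.com/product"  # productName + "_" + source
cache.set(key, {"averagePrice": 89.99})
fresh = cache.get(key)
now[0] = 301.0
stale = cache.get(key)
```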

Smart API Usage:

{
"nodeName": "Conditional Price Check",
"condition": "{{$json.priceNumeric > 50 && $json.dataQuality > 70}}",
"onTrue": "call_price_api",
"onFalse": "skip_api_call"
}

Multi-Stage Processing:

Raw Data → Clean → Validate → Enrich → Normalize → Output

Resilient Data Sources:

// Primary API → Secondary API → Local Processing → Manual Fallback
{
"fallbackChain": [
{"source": "primary_api", "timeout": 5000},
{"source": "secondary_api", "timeout": 10000},
{"source": "local_processing", "timeout": 2000},
{"source": "manual_fallback", "data": "default_values"}
]
}
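One way to picture the fallback chain is a loop that tries each source in order and returns the first result. The Python sketch below is an illustration only: `first_available`, `failing_source`, and `DEFAULT_VALUES` are hypothetical, and per-source timeouts are left out for brevity.

```python
from typing import Callable

DEFAULT_VALUES = {"competitorPrices": [], "averagePrice": None}  # hypothetical defaults


def first_available(sources: list[tuple[str, Callable[[], dict]]]) -> tuple[str, dict]:
    """Return (source name, data) from the first source that succeeds."""
    for name, fetch in sources:
        try:
            return name, fetch()
        except Exception as error:  # any failure moves on to the next source
            print(f"{name} failed: {error}")
    return "manual_fallback", dict(DEFAULT_VALUES)


def failing_source() -> dict:
    """Stand-in for an API that is currently unreachable."""
    raise ConnectionError("unreachable")


name, data = first_available([
    ("primary_api", failing_source),
    ("secondary_api", lambda: {"averagePrice": 89.99}),
])
```

Recording which source answered (`name`) is useful later, for example to flag reports that were built from fallback data.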

Use Case: Monitor product prices across multiple retailers

Workflow Adaptations:

  • Schedule regular execution
  • Store historical price data
  • Send alerts for price changes
  • Generate trend reports

Use Case: Research and analyze web content for market intelligence

Workflow Adaptations:

  • Extract competitor information
  • Analyze content quality and SEO
  • Generate competitive analysis reports
  • Track content changes over time

Use Case: Extract and qualify leads from business directories

Workflow Adaptations:

  • Extract contact information
  • Validate business data through APIs
  • Score lead quality
  • Export to CRM systems

Issue 1: API Rate Limiting

// Solution: Implement rate limiting
{
"rateLimiting": {
"requestsPerMinute": 60,
"burstLimit": 10,
"backoffStrategy": "exponential"
}
}
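A token bucket is one common way to enforce limits like these; whether the workflow tool uses it internally is not stated, so treat the Python sketch below as an illustration. `TokenBucket` is a hypothetical class, and the exponential backoff setting is not modelled here.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow short bursts while capping the sustained request rate."""

    def __init__(self, requests_per_minute: int = 60, burst_limit: int = 10,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = requests_per_minute / 60.0  # tokens added per second
        self.capacity = float(burst_limit)
        self.tokens = float(burst_limit)  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Fake clock: ten requests pass immediately, the eleventh must wait.
now = [0.0]
bucket = TokenBucket(clock=lambda: now[0])
burst = [bucket.try_acquire() for _ in range(11)]
now[0] = 1.0  # one second later, one token has been refilled
after_wait = bucket.try_acquire()
```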

Issue 2: Data Inconsistency

// Solution: Data validation at each stage
{
"validation": {
"required": ["productName", "price"],
"types": {"priceNumeric": "number"},
"ranges": {"dataQuality": [0, 100]}
}
}

Issue 3: Memory Usage

// Solution: Stream processing for large datasets
{
"processing": {
"mode": "stream",
"batchSize": 100,
"memoryLimit": "512MB"
}
}
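Batching is the core of the streaming idea: only one batch is held in memory at a time. A minimal Python sketch using a generator follows; `batches` is a hypothetical helper, and the memory limit setting is not enforced here.

```python
from itertools import islice
from typing import Iterable, Iterator


def batches(items: Iterable, batch_size: int = 100) -> Iterator[list]:
    """Yield successive lists of at most batch_size items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 250 hypothetical product records are processed as batches of 100, 100, and 50.
sizes = [len(batch) for batch in batches(range(250), batch_size=100)]
```

Because `batches` accepts any iterable, the input can itself be a generator that reads records lazily.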

Best Practices:

  • Modular Architecture: Break complex workflows into reusable components
  • Error Boundaries: Implement error handling at each critical stage
  • Data Validation: Validate data at input, processing, and output stages
  • Performance Monitoring: Track execution time and resource usage
  • Authentication Security: Store API keys securely in environment variables
  • Rate Limiting: Respect API rate limits and implement backoff strategies
  • Error Handling: Handle API failures gracefully with fallback options
  • Data Transformation: Normalize API responses to consistent formats
  • Comprehensive Testing: Test all workflow paths and error scenarios
  • Documentation: Document workflow logic and configuration decisions
  • Version Control: Track workflow changes and maintain version history
  • Monitoring: Implement logging and monitoring for production workflows

You’ve now mastered multi-step workflow integration! You’re ready to:

  1. Learn Workflow Debugging - Advanced debugging and troubleshooting techniques
  2. Explore Performance Optimization - Optimize complex workflows for speed and efficiency
  3. Build Advanced AI Workflows - Integrate AI processing into multi-step workflows

Estimated Time: 60-75 minutes
Difficulty: Intermediate
Prerequisites: Completed beginner tutorials, basic API knowledge