ManualTrigger Automate

ManualTrigger Automate extracts web content by converting HTML pages to markdown and collecting the links they contain. It processes URLs in batches of 10 or 40 to stay within the API rate limit of 10 requests per minute. URLs are read from your own data source, and the resulting formatted text can be passed on to your existing systems for further analysis.

7/8/2025
17 nodes
Complex
manual, complex, wait, sticky note, noop, splitinbatches, splitout, advanced, api, integration
Categories:
Complex Workflow, Manual Triggered
Integrations:
Wait, Sticky Note, NoOp, SplitInBatches, SplitOut

Target Audience

  • Web Developers looking to scrape and process web content efficiently.
  • Data Analysts wanting to extract structured data from web pages for analysis.
  • Content Creators needing to convert web content into markdown format for easier editing and use.
  • Marketers interested in gathering links and metadata from competitor websites.
  • API Users who require a solution for automated web scraping while respecting API limits.

Problem Solved

This workflow extracts content and links from web pages in a structured format. It scrapes pages through the Firecrawl API and paces its requests to stay within the API rate limit. Each page's HTML is converted to markdown, which makes it suitable for further processing and analysis.
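
As a rough illustration of the request described above, the sketch below builds a scrape call and pulls the markdown and links out of a response. The endpoint URL, the `formats` payload field, and the `data.markdown` / `data.links` response shape are assumptions about the Firecrawl API and should be checked against its current documentation; the function names are made up for this example.

```python
import json
import urllib.request

# Assumed endpoint; verify against the current Firecrawl documentation.
FIRECRAWL_SCRAPE_URL = "https://api.firecrawl.dev/v1/scrape"


def build_scrape_request(page_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for markdown and links."""
    # Assumed payload shape: one URL plus the output formats wanted.
    payload = {"url": page_url, "formats": ["markdown", "links"]}
    return urllib.request.Request(
        FIRECRAWL_SCRAPE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_markdown_and_links(response_body: dict) -> dict:
    """Keep only the two fields the workflow uses; missing fields become empty."""
    data = response_body.get("data") or {}
    return {
        "markdown": data.get("markdown", ""),
        "links": list(data.get("links") or []),
    }
```

To actually send the request, pass the object to `urllib.request.urlopen` and decode the JSON body before handing it to `extract_markdown_and_links`.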

Workflow Steps

  • Step 1: Trigger the workflow manually by clicking ‘Test workflow’.
  • Step 2: Wait 45 seconds so the system is ready for the next steps.
  • Step 3: Retrieve a list of URLs from your own data source. The column holding the URLs must be named Page.
  • Step 4: Split the URLs into batches of 10 to keep processing manageable.
  • Step 5: For each batch, send a request to the Firecrawl API to scrape the content and links from the specified URLs.
  • Step 6: Collect the markdown data and links extracted from the response.
  • Step 7: Write the processed data to your chosen data source.
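
The steps above can be sketched outside n8n as a small loop. This is a minimal sketch, not the workflow's actual node logic: the function names are hypothetical, the `scrape` callable stands in for the Firecrawl request, and placing the 45-second wait between batches (rather than once at the start) is an assumption made here to pace requests.

```python
import time
from typing import Callable, Iterable, Iterator


def read_urls(rows: Iterable[dict], column: str = "Page") -> list[str]:
    """Step 3: take the URL from the 'Page' column of each row, skipping blanks."""
    urls = []
    for row in rows:
        value = (row.get(column) or "").strip()
        if value:
            urls.append(value)
    return urls


def in_batches(items: list, size: int = 10) -> Iterator[list]:
    """Step 4: yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_workflow(
    rows: Iterable[dict],
    scrape: Callable[[str], dict],
    wait_seconds: float = 45.0,
    batch_size: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict]:
    """Steps 3 to 6: read URLs, batch them, scrape each one, collect results."""
    results = []
    for index, batch in enumerate(in_batches(read_urls(rows), batch_size)):
        if index > 0:
            sleep(wait_seconds)  # assumption: pause between batches, not before the first
        for url in batch:
            page = scrape(url)  # expected to return {"markdown": ..., "links": [...]}
            results.append(
                {"Page": url, "markdown": page["markdown"], "links": page["links"]}
            )
    return results
```

Passing `scrape` and `sleep` in as parameters keeps the loop testable without network access or real delays. Step 7 is left to the caller, which can write the returned list to whatever destination it uses.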

Customization Guide

  • To adapt the input source, modify the ‘Example fields from data source’ node to pull URLs from a different database, or edit the list directly.
  • Adjust the batch size in the ‘10 at a time’ node to process more or fewer URLs per batch.
  • Replace the API key in the ‘Retrieve Page Markdown and Links’ node with your own Firecrawl token for authentication.
  • Change the output destination to a different database or format by modifying the connection settings in the ‘Markdown data and Links’ node.
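
When changing the batch size, it helps to check the pause it implies under the rate limit. The sketch below gathers the customizable values in one place and computes a conservative pause; the class, the field names, and the `FIRECRAWL_API_KEY` environment variable are hypothetical and are not part of the workflow itself.

```python
import math
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ScrapeSettings:
    url_column: str = "Page"                # column holding the URLs
    batch_size: int = 10                    # URLs sent per batch
    requests_per_minute: int = 10           # rate limit stated in the description
    api_key_env: str = "FIRECRAWL_API_KEY"  # hypothetical variable name

    def api_key(self) -> str:
        """Read the token from the environment instead of hard-coding it."""
        key = os.environ.get(self.api_key_env, "")
        if not key:
            raise RuntimeError(f"set {self.api_key_env} to your Firecrawl token")
        return key

    def min_pause_seconds(self) -> float:
        """Pause per batch that keeps the average rate at or under the limit,
        ignoring the time the requests themselves take."""
        return 60.0 * self.batch_size / self.requests_per_minute


def batch_count(total_urls: int, settings: ScrapeSettings) -> int:
    """Number of batches needed for `total_urls` URLs."""
    return math.ceil(total_urls / settings.batch_size)
```

With the defaults, `min_pause_seconds()` returns 60.0, which is longer than the 45-second wait in Step 2. The shorter wait stays under the limit only if the requests themselves take up the difference, so it is worth re-checking after any change to the batch size.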