Indeed Company Data Scraper & Summarization with Airtable, Bright Data and Google Gemini

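Built for Indeed: this workflow automates scraping company data and generating summaries, integrating Airtable and Google Gemini to speed up data processing, surface key information quickly, and support decision-making.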

7/8/2025
19 nodes
Complex
Categories:
Complex Workflow, Manual Triggered, Data Processing & Analysis
Integrations:
LangChain, Sticky Note, Markdown, SplitInBatches, Airtable, Wait

Target Audience

This workflow is ideal for:
- Data Analysts looking to automate the process of extracting and summarizing company data from Indeed.
- Recruiters who want to gather insights from job postings efficiently.
- Business Intelligence Professionals needing to integrate data from various sources into Airtable for better analysis.
- Developers interested in using n8n for building automated workflows with AI capabilities.
- Marketing Teams that require summarized competitor insights for strategic planning.

Problem Solved

Manually gathering and summarizing company and job-related data from Indeed is time-consuming and error-prone. By automating the process, users can:
- Save time on data extraction and summarization.
- Collect data more consistently through structured API requests.
- Make decisions faster with summarized insights delivered directly to their systems.

Workflow Steps

1. Manual Trigger: The workflow starts when the user clicks ‘Test workflow’.
2. Set Bright Data Zone: Defines the Bright Data zone for web scraping.
3. Airtable Integration: Retrieves links to Indeed company pages from Airtable.
4. Conditional Check: If the link field is not empty, the workflow proceeds to the web request.
5. Perform Indeed Web Request: Makes a POST request to the Bright Data API to unlock the Indeed page.
6. Markdown to Textual Data Extraction: Converts the markdown response into textual data for easier processing.
7. Summarization: Uses Google Gemini to summarize the extracted data.
8. AI Agent Formatting: The Indeed Expert AI Agent formats the search results.
9. Convert Markdown to HTML: Converts the summarized markdown into HTML format.
10. Webhook Notification: Sends the HTML response to a designated webhook for further processing or storage.
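
For readers who want to see what steps 3 to 5 amount to outside of n8n, the following is a minimal Python sketch, not the workflow's actual node configuration. The endpoint URL, the payload keys (`zone`, `url`, `format`, `data_format`), and the Airtable field name `Link` are assumptions made for illustration; check them against your Bright Data account documentation and your own Airtable table. The function only builds the request and does not send it.

```python
import json
import urllib.request

# Assumed endpoint for the Bright Data request API; verify before use.
BRIGHT_DATA_ENDPOINT = "https://api.brightdata.com/request"


def has_link(record: dict) -> bool:
    """Step 4: keep only Airtable records whose link field is non-empty.

    "Link" is a hypothetical field name; use the column name from your table.
    """
    link = record.get("fields", {}).get("Link")
    return isinstance(link, str) and link.strip() != ""


def build_unlock_request(link: str, zone: str, api_token: str) -> urllib.request.Request:
    """Step 5: build (but do not send) the POST request for one Indeed page."""
    payload = {
        "zone": zone,               # value set in step 2 (Set Bright Data Zone)
        "url": link,                # Indeed company page taken from Airtable
        "format": "raw",            # assumed payload key and value
        "data_format": "markdown",  # assumed: ask for a markdown response
    }
    return urllib.request.Request(
        BRIGHT_DATA_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call. Keeping request construction separate from sending makes the payload easy to inspect before any credits are spent.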

Customization Guide

Users can customize this workflow by:
- Modifying Airtable Links: Change the Airtable base and table configurations to connect to different datasets.
- Adjusting Webhook URLs: Update the webhook URLs to send data to different endpoints as needed.
- Changing Summarization Model: Switch the Google Gemini model to a different version or another AI model based on requirements.
- Adding Additional Nodes: Enhance the workflow by integrating more nodes for additional data processing or storage options.
- Setting Different Wait Times: Adjust the wait times between requests to manage API rate limits effectively.
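
To illustrate how batch size and wait time interact when managing rate limits, here is a small Python sketch of the batching-plus-wait pattern. It is an analogy for what the SplitInBatches and Wait nodes do together, not their implementation; the function names and default values are hypothetical.

```python
import time
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def in_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive groups of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly smaller, batch
        yield batch


def process_with_wait(
    items: Iterable[T],
    handle: Callable[[T], R],
    batch_size: int = 1,
    wait_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Handle items batch by batch, pausing between batches (not after the last)."""
    results: List[R] = []
    first = True
    for batch in in_batches(items, batch_size):
        if not first:
            sleep(wait_seconds)  # pause before every batch except the first
        first = False
        results.extend(handle(item) for item in batch)
    return results
```

With five links, a batch size of 2, and a wait of 1.5 seconds, the handler runs on three batches with two pauses in between. Raising `wait_seconds` or lowering `batch_size` reduces the request rate at the cost of a longer total run.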