Store Notion's Pages as Vector Documents into Supabase with OpenAI

This workflow automates the storage of newly added Notion pages as vector documents in Supabase, enhancing data accessibility and analysis. It retrieves page content, filters out non-text elements, concatenates the text, generates embeddings with OpenAI, and stores the processed data in a Supabase vector column. This ensures that valuable information is easily retrievable and ready for further use.

7/8/2025
9 nodes
Medium
Categories:
Manual Triggered, Medium Workflow
Integrations:
Sticky Note, LangChain, Notion Trigger, Notion, Filter, Summarize

Target Audience

  • Developers looking to automate data storage from Notion to Supabase.
  • Data Scientists who require a seamless method for embedding and vectorizing text data from Notion.
  • Content Creators wanting to store and analyze their Notion pages efficiently.
  • Business Analysts who need quick access to summarized data from Notion for reporting purposes.
Problem Solved

This workflow automates the process of storing Notion pages as vector documents in Supabase, addressing the challenge of manual data entry and ensuring that important textual information is efficiently stored and easily retrievable for analysis or further processing.

Workflow Steps

  1. Trigger on Notion Page Addition: The workflow starts when a new page is added to a specified Notion database, ensuring real-time data capture.
  2. Retrieve Page Content: It fetches all block content from the newly added Notion page.
  3. Filter Non-Text Content: The workflow excludes non-text blocks (like images and videos) to focus solely on textual data, enhancing the quality of the stored content.
  4. Summarize Content: It concatenates the textual content from the blocks into a single text for embedding, streamlining the data for processing.
  5. Generate Embeddings: The workflow uses OpenAI's API to create embeddings for the concatenated text, enabling advanced analysis and search capabilities.
  6. Create Metadata: It generates associated metadata (like page ID and creation time) to contextualize the stored data, making it easier to reference later.
  7. Split Content into Chunks: The text is divided into smaller, manageable chunks for efficient processing and embedding generation.
  8. Store in Supabase: Finally, the processed documents and their embeddings are stored in a Supabase table, ready for retrieval and analysis.
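The filtering, concatenation, and metadata steps above can be sketched in plain Python. The block shape and field names (`type`, `content`, `id`, `created_time`) are illustrative assumptions modeled loosely on the Notion API, not the exact output of the n8n Notion nodes.

```python
# Hypothetical sketch of steps 3, 4, and 6: keep only text-bearing
# Notion blocks, join their content, and attach page metadata.

TEXT_BLOCK_TYPES = {
    "paragraph", "heading_1", "heading_2", "heading_3",
    "bulleted_list_item", "numbered_list_item", "quote",
}

def extract_text(blocks):
    """Drop non-text blocks (images, videos, ...) and join the rest."""
    parts = [b.get("content", "") for b in blocks
             if b.get("type") in TEXT_BLOCK_TYPES]
    return "\n".join(parts)

def build_metadata(page):
    """Associate page ID and creation time with the stored document."""
    return {"pageId": page["id"], "createdTime": page["created_time"]}

# Example input with one non-text block that gets filtered out.
blocks = [
    {"type": "paragraph", "content": "First paragraph."},
    {"type": "image", "url": "https://example.com/a.png"},
    {"type": "heading_1", "content": "A heading"},
]
print(extract_text(blocks))  # -> "First paragraph.\nA heading"
```

The joined text would then be chunked and sent to the embeddings step, with the metadata stored alongside each vector.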
Customization Guide

Users can customize the workflow by:
- Changing the Notion Database: Update the databaseId in the Notion Trigger to monitor a different database.
- Modifying Filters: Adjust the filter conditions to include or exclude different types of content (e.g., text types) based on specific needs.
- Altering Chunk Size: Change the chunkSize and chunkOverlap parameters in the Token Splitter to optimize text processing for the size of the data being handled.
- Customizing Metadata: Add or modify metadata fields in the Create Metadata and Load Content node to capture additional information relevant to your use case.
- Adjusting Embedding Options: Modify the options in the Embeddings node to fine-tune how embeddings are generated, potentially improving the quality of the stored vectors.
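The interaction between chunkSize and chunkOverlap can be illustrated with a minimal character-based splitter. This is an assumption-laden sketch: the actual Token Splitter node operates on tokens rather than characters, but the overlap arithmetic is analogous.

```python
# Illustrative splitter: consecutive chunks share `chunk_overlap`
# characters, so the window advances by chunk_size - chunk_overlap.

def split_into_chunks(text, chunk_size=200, chunk_overlap=40):
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]

# With chunk_size=4 and chunk_overlap=2, each chunk repeats the last
# two characters of the previous one.
print(split_into_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
# -> ['abcd', 'cdef', 'efgh', 'ghij']
```

A larger overlap preserves more context across chunk boundaries at the cost of storing and embedding more redundant text.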