RAG on Living Data

This workflow implements Retrieval-Augmented Generation (RAG) over continuously updated data. It automatically retrieves the latest content from a Notion knowledge base, splits the text into chunks, converts the chunks into vectors, and stores them in a Supabase vector database. Using OpenAI's GPT-4 model, it answers questions with context drawn from the current knowledge base content. It suits scenarios such as enterprise knowledge management, customer support, and education and training, where users need answers based on up-to-date information.

Workflow Name

RAG on Living Data

Key Features and Highlights

This workflow runs a Retrieval-Augmented Generation (RAG) process over a data source that changes frequently. It automatically fetches the latest content from a Notion knowledge base, performs text chunking and vectorization, and stores the vectors in a Supabase vector database. By integrating OpenAI’s GPT-4 model, it provides context-aware Q&A whose answers draw on the knowledge base as it currently stands rather than on a one-time import.

Core Problem Addressed

The workflow keeps vectorized data in sync with a knowledge base that changes over time, while supporting context-based Q&A. It automatically detects updated Notion pages, deletes the outdated vectors for those pages, and inserts new vectors, so that answers are based on the most current information and the vector store does not accumulate redundant or obsolete entries.
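
The delete-then-insert behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual implementation: `InMemoryVectorStore`, `VectorRow`, `fake_embed`, and `sync_page` are hypothetical names, the in-memory list stands in for the Supabase vector table, and the placeholder embedding stands in for the OpenAI Embeddings node.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VectorRow:
    page_id: str            # Notion page the chunk came from (kept as metadata)
    chunk: str              # chunk text
    embedding: List[float]  # vector for the chunk


@dataclass
class InMemoryVectorStore:
    """Hypothetical stand-in for the Supabase vector table."""
    rows: List[VectorRow] = field(default_factory=list)

    def delete_by_page(self, page_id: str) -> int:
        # Remove every row that belongs to the given page; return how many.
        before = len(self.rows)
        self.rows = [r for r in self.rows if r.page_id != page_id]
        return before - len(self.rows)

    def insert(self, new_rows: List[VectorRow]) -> None:
        self.rows.extend(new_rows)


def fake_embed(text: str) -> List[float]:
    # Placeholder only; a real embedding model would be called here.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


def sync_page(store: InMemoryVectorStore, page_id: str, chunks: List[str]) -> int:
    """Replace all vectors of one page: delete the old rows, then insert new ones."""
    removed = store.delete_by_page(page_id)
    store.insert([VectorRow(page_id, c, fake_embed(c)) for c in chunks])
    return removed
```

Calling `sync_page` again for the same page after it changes leaves only the new chunks for that page in the store, while rows belonging to other pages are untouched. Keying the delete on the page ID stored in the metadata is what keeps a re-synced page from leaving stale duplicates behind.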

Application Scenarios

  • Enterprise Knowledge Management: Automatically synchronize and enrich internal documents, manuals, and FAQs to enable intelligent Q&A.
  • Customer Support Systems: Provide accurate, real-time response support for customer service based on a dynamic knowledge base.
  • Education and Training: Integrate teaching material repositories to assist students and educators with intelligent Q&A and content retrieval.
  • Product Documentation Queries: Offer users an up-to-date Q&A service for product usage guides.

Main Workflow Steps

  1. Data Trigger
    • Use a schedule trigger to pull the most recently updated pages from the Notion knowledge base every minute.
  2. Data Retrieval and Processing
    • Retrieve all content blocks of the updated pages (Get page blocks).
    • Concatenate page content into a single string.
    • Split the text into chunks based on a set token count (Token Splitter) for easier processing and vectorization.
  3. Old Data Cleanup
    • Delete old vector embeddings corresponding to the page from Supabase vector storage to prevent data redundancy.
  4. Vectorization and Storage
    • Convert text chunks into vectors using the OpenAI Embeddings node.
    • Store vectors along with metadata in the Supabase vector database.
  5. Intelligent Q&A Trigger
    • Initiate the Q&A process via a chat message trigger.
    • Retrieve relevant content from the vector database using a vector store retriever.
    • Generate context-aware answers by combining the OpenAI GPT-4 chat model with a question-and-answer chain.
  6. Result Output
    • Return intelligent Q&A results based on the latest knowledge base content.
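
The chunking and retrieval steps above can be illustrated with a small, self-contained Python sketch. Everything in it is an assumption made for illustration: whitespace-separated words stand in for real tokens, a bag-of-words counter stands in for OpenAI embeddings, and the function names (`split_into_chunks`, `embed`, `retrieve`, `build_prompt`) and parameters (`chunk_size`, `overlap`, `k`) are hypothetical rather than taken from the workflow's nodes.

```python
import math
from collections import Counter
from typing import List


def split_into_chunks(text: str, chunk_size: int = 8, overlap: int = 2) -> List[str]:
    """Split text into overlapping chunks of whitespace-separated tokens."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    tokens = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real model returns a dense float vector.
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: List[str]) -> str:
    # Retrieved chunks and the question are combined into one prompt for the chat model.
    joined = "\n---\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"
```

In this sketch the overlap between consecutive chunks reduces the chance that a sentence is cut off from the context it needs, and the retrieved chunks are the only knowledge passed to the model, which is why re-embedding updated pages changes the answers.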

Systems and Services Involved

  • Notion: Serves as the knowledge base data source, providing access to pages and their content blocks.
  • OpenAI: Provides text embedding generation (Embeddings) and chat language model support (GPT-4).
  • Supabase: Acts as the vector storage database for storing and retrieving text vector data.
  • n8n: The workflow automation platform that orchestrates the nodes and runs the process end to end.

Target Users and Value

  • Knowledge managers and enterprise digital transformation teams aiming to build intelligent knowledge bases.
  • Customer service and support teams seeking to improve response efficiency and accuracy.
  • Educational institutions and trainers facilitating intelligent content retrieval and interactive Q&A.
  • Developers and automation enthusiasts looking to rapidly build intelligent Q&A systems based on real-time data.

By combining Notion as a dynamic data source, OpenAI’s language models, and Supabase’s vector storage, this workflow delivers an automated knowledge base Q&A solution whose answers track the latest content in the knowledge base.