RAG & GenAI App With WordPress Content
This workflow automatically retrieves publicly available content from a WordPress website and uses generative AI and a vector database to build a Q&A system over it. It converts post and page content into Markdown and generates vector embeddings to support fast semantic retrieval. Users can ask questions in real time, and the system answers by combining the retrieved content with a chat model. It suits business or personal websites that need automated customer support or knowledge management, and it keeps the knowledge base in sync with the site's content.

Workflow Name
RAG & GenAI App With WordPress Content
Key Features and Highlights
This workflow implements a Retrieval-Augmented Generation (RAG) application on top of a WordPress site's content. It fetches posts and pages from the site, keeps only publicly accessible, unprotected content, converts it to Markdown, and generates vector embeddings with OpenAI’s text-embedding-3-small model. The embeddings are stored in a Supabase vector database. Users ask questions through the integrated chat interface in real time. The workflow retrieves relevant content from the vector database and passes it to OpenAI’s GPT-4o-mini chat model, which generates answers annotated with source metadata.
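The filtering step described above can be sketched in Python. This is an illustration only, not the workflow's actual n8n node configuration. It assumes items shaped like WordPress REST API responses (the `status`, `content.protected`, `content.rendered`, `link`, `type`, `date`, and `modified` fields); the helper names `is_public` and `to_document` and the sample data are hypothetical.

```python
from typing import Any


def is_public(item: dict[str, Any]) -> bool:
    """Keep only published items whose content is not password-protected."""
    content = item.get("content", {})
    return item.get("status") == "publish" and not content.get("protected", False)


def to_document(item: dict[str, Any]) -> dict[str, Any]:
    """Flatten an item into its HTML body plus the metadata cited in answers."""
    return {
        "html": item["content"]["rendered"],
        "metadata": {
            "url": item["link"],
            "type": item["type"],
            "published": item["date"],
            "modified": item["modified"],
        },
    }


# Hypothetical sample data: one public post, one protected page, one draft.
sample_items = [
    {"status": "publish", "type": "post", "link": "https://example.com/hello",
     "date": "2024-01-10T08:00:00", "modified": "2024-02-01T09:30:00",
     "content": {"rendered": "<p>Hello</p>", "protected": False}},
    {"status": "publish", "type": "page", "link": "https://example.com/members",
     "date": "2024-01-11T08:00:00", "modified": "2024-01-11T08:00:00",
     "content": {"rendered": "", "protected": True}},
    {"status": "draft", "type": "post", "link": "https://example.com/?p=3",
     "date": "2024-01-12T08:00:00", "modified": "2024-01-12T08:00:00",
     "content": {"rendered": "<p>WIP</p>", "protected": False}},
]

documents = [to_document(item) for item in sample_items if is_public(item)]
print(documents)
```

Only the first sample item passes the filter. Carrying the URL, type, and dates alongside the text is what lets the chat model cite its sources later.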
Core Problems Addressed
- Automates the crawling and updating of WordPress site content to dynamically generate embedding vectors, eliminating the need for manual knowledge base maintenance.
- Employs vector search technology to enable efficient semantic search and precise matching across large volumes of website content.
- Integrates retrieval results with generative AI to improve the quality and reliability of user query responses.
- Detects content that has changed since the last run, so the knowledge base stays up to date.
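A minimal sketch of the update detection mentioned in the last point, assuming the last execution time is available as a UTC timestamp and that each item carries the WordPress `modified_gmt` field (an ISO 8601 string without an offset). The helper names and sample values are hypothetical. The workflow itself keeps the timestamp in Postgres, which this sketch does not reproduce.

```python
from datetime import datetime, timezone


def parse_gmt(stamp: str) -> datetime:
    """Parse a `modified_gmt` value and mark it explicitly as UTC."""
    return datetime.fromisoformat(stamp).replace(tzinfo=timezone.utc)


def changed_since(items: list[dict], last_run: datetime) -> list[dict]:
    """Return only the items modified strictly after the last execution."""
    return [item for item in items if parse_gmt(item["modified_gmt"]) > last_run]


# Hypothetical values: the previous run finished on 1 May 2024 at 00:00 UTC.
last_run = datetime(2024, 5, 1, tzinfo=timezone.utc)
items = [
    {"id": 1, "modified_gmt": "2024-04-30T09:00:00"},
    {"id": 2, "modified_gmt": "2024-05-02T10:30:00"},
]

stale_ids = [item["id"] for item in changed_since(items, last_run)]
print(stale_ids)  # [2]
```

Only the items returned here would need their embeddings regenerated; everything else can be left as stored.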
Use Cases
- Building intelligent Q&A chatbots for corporate or personal websites to enhance visitor engagement.
- Rapidly developing content-driven chat assistants based on website content.
- Leveraging website content for knowledge management, automated customer support, and intelligent recommendations.
- Scenarios requiring continuous synchronization of website content for semantic search and question answering.
Main Workflow Steps
- Trigger: Start the workflow manually or on a schedule.
- WordPress Content Crawling: Retrieve all posts and pages via the WordPress API.
- Content Filtering: Select only published and unprotected content.
- Content Format Conversion: Convert HTML content to Markdown.
- Text Chunking: Split long texts into smaller chunks to fit model input limits.
- Embedding Generation: Generate vector embeddings for content using OpenAI’s text-embedding-3-small model.
- Storage of Vectors and Metadata: Store content and embeddings in the Supabase vector database.
- Version Control: Use a Postgres database to record the last execution time and fetch only content updated since then.
- Chat Trigger: On user chat input, query Supabase to retrieve relevant documents.
- Answer Generation: Combine retrieved documents and chat context to generate responses with the GPT-4o-mini chat model, including metadata such as content URL, type, publish date, and modification date.
- Response Output: Return the generated answer to the frontend via a webhook.
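The chunking and retrieval steps above can be illustrated with a small in-memory sketch. It is not how the workflow does it: there, OpenAI produces the embeddings and Supabase performs the similarity search. The two-dimensional vectors below are hand-made stand-ins, and `chunk_text`, `cosine`, and `top_k` are hypothetical helper names. Character-based windows with overlap are one common chunking choice, assumed here for simplicity.

```python
import math


def chunk_text(text: str, size: int = 300, overlap: int = 30) -> list[str]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop early enough that no window repeats only already-covered text.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], rows: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the k chunks whose vectors are most similar to the query vector."""
    ranked = sorted(rows, key=lambda row: cosine(query_vec, row[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


chunks = chunk_text("abcdefghij", size=4, overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']

# Toy vectors standing in for real embeddings.
rows = [
    ("pricing page", [1.0, 0.0]),
    ("contact page", [0.0, 1.0]),
    ("plans and pricing", [0.9, 0.1]),
]
matches = top_k([1.0, 0.0], rows, k=2)
print(matches)  # ['pricing page', 'plans and pricing']
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.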
Involved Systems and Services
- WordPress API (for fetching posts and pages)
- OpenAI (text-embedding-3-small embedding model and GPT-4o-mini chat model)
- Supabase (vector database for storing and retrieving embeddings and related documents)
- Postgres database (for storing chat history and workflow execution records)
- n8n automation platform (workflow orchestration and node execution)
Target Users and Value Proposition
- Website operators and content managers aiming to intelligently and automatically serve visitors with website content.
- Developers and automation experts building intelligent chatbots or knowledge base systems based on existing website content.
- Enterprise customer service teams seeking to enhance user self-service efficiency through AI.
- Content creators and marketers looking to boost content interactivity and user engagement with AI assistance.
This workflow provides a Q&A engine for WordPress sites by combining automated content retrieval, embedding generation, and semantic search. It makes existing site content directly queryable by visitors, which makes it a practical starting point for content-driven AI applications.