Google Site Index - sitemap.xml Example
This workflow automates processing of a website's sitemap.xml file: it extracts every page URL together with its last modified time and sorts the results. On each run it calls the Google Indexing API to check the notification status of each URL and automatically sends update requests for pages that have changed. It suits website administrators and SEO specialists who update content frequently, saving manual effort and helping new content reach Google promptly.

Workflow Name
Google Site Index - sitemap.xml Example
Key Features and Highlights
This workflow automatically fetches a website's sitemap.xml file, parses all child sitemaps, and extracts and sorts all page URLs along with their last modification dates (lastmod). By leveraging the Google Indexing API, it checks the indexing status of each URL, identifies pages that require updates, and automatically triggers URL update requests to Google, enabling efficient indexing and content updates for the website.
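The parsing and sorting described above can be sketched in Python. This is an illustration only, not the workflow's own node configuration: the function names (`child_sitemaps`, `pages`, `sort_newest_first`) are hypothetical, and the sketch assumes the sitemaps use the standard sitemaps.org 0.9 namespace and that a `lastmod` without a timezone offset can be treated as UTC.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# XML namespace defined by the sitemaps.org 0.9 protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def child_sitemaps(index_xml: str) -> list:
    """Return the <loc> URL of every child sitemap in a sitemap index."""
    root = ET.fromstring(index_xml)
    return [loc.text.strip() for loc in root.findall("sm:sitemap/sm:loc", NS) if loc.text]


def pages(urlset_xml: str) -> list:
    """Return one {"loc", "lastmod"} dict per <url> entry in a child sitemap."""
    root = ET.fromstring(urlset_xml)
    found = []
    for url in root.findall("sm:url", NS):
        loc = (url.findtext("sm:loc", default="", namespaces=NS) or "").strip()
        lastmod = (url.findtext("sm:lastmod", default="", namespaces=NS) or "").strip()
        if loc:
            found.append({"loc": loc, "lastmod": lastmod})
    return found


def lastmod_key(page: dict) -> datetime:
    """Sort key: parsed lastmod, with missing values ordered last."""
    value = page["lastmod"]
    if not value:
        return datetime.min.replace(tzinfo=timezone.utc)
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    # Assumption: a lastmod without an offset (e.g. date-only) is UTC.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def sort_newest_first(all_pages: list) -> list:
    """Order pages by lastmod, most recently modified first."""
    return sorted(all_pages, key=lastmod_key, reverse=True)
```

In this sketch, the XML text passed to `child_sitemaps` and `pages` would come from whatever HTTP client fetches the sitemap files; fetching is left out so the parsing logic stands on its own.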
Core Problems Addressed
- Automates handling of multi-level sitemap structures, eliminating the need to manually check URLs one by one.
- Detects page content updates on every scheduled run, reducing search engine indexing delays.
- Uses the Google Indexing API to notify Google about specific new or changed URLs, supporting the website's visibility in search results.
- Reduces omissions and delays caused by manual operations, improving overall work efficiency.
Use Cases
- Website administrators and SEO specialists who need to regularly maintain and optimize website indexing status.
- Large websites with frequently updated content and complex sitemap structures.
- Scenarios requiring automated push of page updates to Google to ensure rapid inclusion of the latest content.
- Any business or individual aiming to improve their website’s search engine performance.
Main Workflow Steps
- Scheduled Trigger: Automatically initiates the workflow daily at midnight.
- Fetch sitemap.xml: Retrieves the main sitemap file of the website.
- Parse Sitemap: Converts the XML sitemap into JSON format and splits it into its individual child sitemap entries.
- Fetch Child Sitemap Content: Sequentially retrieves page data from each child sitemap.
- Data Consolidation: Formats page data uniformly, ensuring the URL list is represented as an array.
- Sort Pages: Orders all pages in descending order based on the lastmod field.
- Process Each Page in Loop:
  - Calls the Google Indexing API to query the indexing status and last notification time of the URL.
  - Determines whether the page is new or updated (lastmod is later than the last notification time).
  - For qualifying pages, triggers a URL update notification via the Google Indexing API.
  - Waits a random interval between 0.3 and 1.5 seconds after each request to avoid sending requests too rapidly.
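The per-page decision in the loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the workflow's actual node logic: the function names are hypothetical, the metadata is assumed to carry the last notification time at `latestUpdate.notifyTime`, a URL with no metadata record is treated as new, and a page without `lastmod` is treated as unchanged.

```python
import random
import re
from datetime import datetime, timezone
from typing import Optional


def parse_time(value: str) -> datetime:
    """Parse a date or RFC 3339 timestamp into a timezone-aware datetime."""
    text = value.strip().replace("Z", "+00:00")
    # Normalize fractional seconds to exactly six digits so that
    # nanosecond-precision timestamps parse on older Python versions too.
    text = re.sub(r"\.(\d+)", lambda m: "." + m.group(1)[:6].ljust(6, "0"), text)
    parsed = datetime.fromisoformat(text)
    # Assumption: values without an offset (e.g. date-only lastmod) are UTC.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def needs_update(lastmod: str, metadata: Optional[dict]) -> bool:
    """Return True when the page is new or was modified after the last notification."""
    notify_time = ((metadata or {}).get("latestUpdate") or {}).get("notifyTime")
    if not notify_time:
        return True  # Never notified before: treat as a new page.
    if not lastmod:
        return False  # Assumption: no lastmod means no detectable change.
    return parse_time(lastmod) > parse_time(notify_time)


def random_pause_seconds(low: float = 0.3, high: float = 1.5) -> float:
    """Pick the random delay applied after each request."""
    return random.uniform(low, high)
```

A loop built on this sketch would call `time.sleep(random_pause_seconds())` after each API request, mirroring the 0.3 to 1.5 second wait described in the steps.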
Involved Systems and Services
- Google Indexing API: Used to check each URL's notification status and to push URL update notifications.
- HTTP Request Nodes: Fetch sitemap.xml and related content.
- XML Parsing Nodes: Convert sitemap XML into JSON structure.
- Scheduled Trigger: Enables automated timed execution.
- Manual Trigger: Supports manual test execution.
- Data Processing Nodes: Handle splitting, sorting, conditional checks, and batch looping.
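For readers who want to see what the two Google Indexing API calls look like outside the workflow's HTTP request nodes, the sketch below builds them with Python's standard library without sending anything. The endpoint paths and the `URL_UPDATED` type follow the public Indexing API v3 reference; the function names and the `access_token` parameter are hypothetical, and obtaining an OAuth 2.0 token is out of scope here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Base path of the Indexing API v3 URL notification endpoints.
API_BASE = "https://indexing.googleapis.com/v3/urlNotifications"


def metadata_request(page_url: str, access_token: str) -> Request:
    """Build the GET request that looks up the last notification for a URL."""
    query = urlencode({"url": page_url})
    return Request(
        f"{API_BASE}/metadata?{query}",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


def publish_request(page_url: str, access_token: str) -> Request:
    """Build the POST request that notifies Google a URL was added or updated."""
    body = json.dumps({"url": page_url, "type": "URL_UPDATED"}).encode("utf-8")
    return Request(
        f"{API_BASE}:publish",
        data=body,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Either request object could then be passed to `urllib.request.urlopen`; keeping construction separate from sending makes the request shape easy to inspect and test.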
Target Users and Value
- SEO professionals and website administrators seeking to accelerate search engine recognition and indexing of website content.
- Website operations teams with frequently updated content, who can automate index push management and save significant manual effort.
- Technical personnel and developers who need to monitor and optimize Google indexing status.
- Multi-channel content websites such as corporate sites, news portals, and blogs that need timely crawling by search engines.
This workflow gives users a closed-loop process, from automated sitemap fetching through Google indexing status checks to update pushes, which reduces the manual effort of SEO maintenance and helps keep the site's index current.