Vision-Based AI Agent Scraper – Integrating Google Sheets, ScrapingBee, and Gemini

This workflow combines a vision-based AI agent, a web scraping service, and a multimodal large language model to extract structured data from web pages. It works primarily from webpage screenshots, falls back to HTML scraping when the screenshot is not enough, and extracts fields such as product titles and prices, formatting them as JSON for downstream processing and storage. It reads the target URLs from Google Sheets and writes the results back, which makes it suitable for e-commerce product data collection, market research, and other complex web data extraction tasks.

Workflow Name

Vision-Based AI Agent Scraper – Integrating Google Sheets, ScrapingBee, and Gemini

Key Features and Highlights

This workflow combines a vision-based AI agent with Google Sheets, the ScrapingBee web scraping service, and the Google Gemini-1.5-Pro multimodal large language model to extract structured data from web content. Core highlights include:

  • Primarily uses webpage screenshots as the data source, employing AI visual understanding techniques for information extraction.
  • Falls back to HTML scraping automatically when the screenshot data is incomplete, so that missing fields can still be filled in.
  • Parses the extracted data into structured JSON for easy downstream processing and storage.
  • Integrates with Google Sheets to automatically read target URL lists and write scraping results, supporting unified data management.
  • Converts HTML to Markdown to optimize token usage, enhancing AI processing efficiency and reducing costs.
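
To illustrate why converting HTML to Markdown saves tokens, here is a minimal sketch using only the Python standard library. It is not the n8n Markdown node's implementation; the class name `MarkdownishExtractor` and the function `html_to_markdownish` are hypothetical, and the output is only a rough Markdown-like approximation. The point is that scripts, styles, tags, and attributes are dropped before the text reaches the model.

```python
from html.parser import HTMLParser


class MarkdownishExtractor(HTMLParser):
    """Keeps visible text with light Markdown markers; drops scripts, styles, and tags."""

    SKIP = {"script", "style", "noscript"}            # content that is never shown to the model
    PREFIX = {"h1": "#", "h2": "##", "h3": "###", "li": "-"}
    BLOCK = {"p", "div", "h1", "h2", "h3", "li", "br", "tr"}

    def __init__(self):
        super().__init__()
        self.lines = [[]]      # each output line is a list of text fragments
        self.skip_depth = 0    # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            prefix = self.PREFIX.get(tag)
            self.lines.append([prefix] if prefix else [])

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            text = " ".join(data.split())   # collapse runs of whitespace
            if text:
                self.lines[-1].append(text)


def html_to_markdownish(html: str) -> str:
    """Return a compact, Markdown-like rendering of the visible text in `html`."""
    parser = MarkdownishExtractor()
    parser.feed(html)
    parser.close()
    rendered = (" ".join(fragments) for fragments in parser.lines)
    return "\n".join(line for line in rendered if line)
```

For a typical product page the result is a small fraction of the original markup, so the agent's fallback tool adds far fewer tokens to the prompt than raw HTML would.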

Core Problems Addressed

Traditional web scraping usually relies on parsing HTML, which can lose information or produce errors when pages have complex structures or load content dynamically. This workflow sidesteps those page-structure limitations by extracting information directly from webpage screenshots, with HTML scraping as a supplement when needed. The combination can improve data accuracy and completeness, and it is especially suitable for visually intensive scenarios such as e-commerce product information extraction.
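
The screenshot-first, HTML-as-fallback idea can be sketched as follows. In the workflow itself the AI agent decides when to call the HTML tool; this sketch replaces that decision with a simple completeness check, which is an assumption made for illustration. The field names in `REQUIRED_FIELDS` and the two extractor callables are hypothetical stand-ins for the vision model call and the HTML scraping tool.

```python
from typing import Callable, List, Tuple

Extractor = Callable[[str], List[dict]]

# Assumed field names; adjust to whatever the workflow is configured to extract.
REQUIRED_FIELDS = ("product_title", "product_price")


def is_complete(record: dict) -> bool:
    """A record is complete when every required field is present and non-blank."""
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            return False
    return True


def extract_with_fallback(
    url: str, from_screenshot: Extractor, from_html: Extractor
) -> Tuple[List[dict], str]:
    """Try the screenshot path first; use the HTML path only if data is missing."""
    records = from_screenshot(url)
    if records and all(is_complete(r) for r in records):
        return records, "screenshot"
    # Screenshot result was empty or had blank required fields: fall back to HTML.
    return from_html(url), "html"
```

The second element of the returned tuple records which path produced the data, which is useful when auditing how often the fallback is needed.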

Application Scenarios

  • Collecting and monitoring product information on e-commerce platforms, including prices, brands, and promotions.
  • Market research and competitor analysis by bulk scraping target websites to generate reports.
  • Content aggregation platforms that automatically organize structured data about products or services.
  • Complex web data extraction tasks requiring cross-page and multi-format data integration.

Main Workflow Steps

  1. Trigger the workflow manually, or replace the manual trigger with a custom one.
  2. Read the list of URLs to scrape from Google Sheets.
  3. Configure the fields to be scraped (e.g., URL).
  4. Use the ScrapingBee API to capture full-page screenshots of the webpages.
  5. The vision-based AI agent (powered by the Google Gemini-1.5-Pro model) analyzes the screenshots to extract product titles, prices, brands, and promotional information.
  6. If screenshot data is insufficient or unclear, invoke the HTML scraping tool to fetch webpage HTML and convert it to Markdown format to assist data extraction.
  7. Use the structured output parsing node to format the AI-extracted data into standard JSON.
  8. Split JSON arrays into individual records.
  9. Append the structured data to the results sheet in Google Sheets for easy viewing and further processing.
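
Steps 7 to 9 can be sketched outside n8n as below: parse the model's JSON output, split the array into individual records, and shape each record as a row for the results sheet. The column names in `COLUMNS` are assumptions for illustration (the source only says titles, prices, brands, and promotional information are extracted), and `split_records` and `to_rows` are hypothetical helpers, not n8n node internals.

```python
import json

# Assumed column names for the results sheet.
COLUMNS = ["product_title", "product_price", "product_brand", "promo"]


def split_records(model_output: str) -> list:
    """Parse the model's JSON output and return one dict per extracted item."""
    data = json.loads(model_output)
    if isinstance(data, dict):
        data = [data]                      # tolerate a single object
    if not isinstance(data, list):
        raise ValueError("expected a JSON object or array")
    return [item for item in data if isinstance(item, dict)]


def to_rows(records: list, source_url: str) -> list:
    """Turn records into sheet rows; fields the model omitted become empty cells."""
    return [[source_url] + [rec.get(col, "") for col in COLUMNS] for rec in records]
```

Each row starts with the source URL so that results in the sheet can be traced back to the page they came from.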

Involved Systems and Services

  • Google Sheets: Manages the list of URLs to scrape and stores the scraping results.
  • ScrapingBee: Provides webpage screenshot and HTML scraping services.
  • Google Gemini Chat Model (Gemini-1.5-Pro): Multimodal large language model performing visual content understanding and data extraction.
  • Built-in n8n nodes: HTTP Request, Markdown conversion, Structured Output Parser, array splitting, and others.
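
As a rough illustration of the two ScrapingBee calls the workflow makes through HTTP Request nodes, the sketch below builds the request URLs. The endpoint and the `api_key`, `url`, and `screenshot_full_page` parameters reflect ScrapingBee's public HTML API as understood here and should be checked against the current documentation; the function names are hypothetical, and the API key is a placeholder.

```python
from urllib.parse import urlencode

# Assumed endpoint of the ScrapingBee HTML API; verify against the official docs.
SCRAPINGBEE_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def screenshot_request_url(api_key: str, target_url: str) -> str:
    """URL for a GET request that should return a full-page screenshot image."""
    params = {
        "api_key": api_key,
        "url": target_url,
        "screenshot_full_page": "true",
    }
    return SCRAPINGBEE_ENDPOINT + "?" + urlencode(params)


def html_request_url(api_key: str, target_url: str) -> str:
    """URL for a GET request that should return the page's HTML (the fallback path)."""
    params = {"api_key": api_key, "url": target_url}
    return SCRAPINGBEE_ENDPOINT + "?" + urlencode(params)
```

`urlencode` percent-encodes the target URL, which matters when the scraped page's own address contains a query string.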

Target Users and Value

  • E-commerce operators and data analysts seeking rapid access to competitor and market product information.
  • Market research organizations automating large-scale web data collection and structured processing.
  • Developers and automation experts building comprehensive data scraping solutions based on vision AI.
  • Any user who needs to overcome the limitations of traditional HTML parsing and obtain more accurate web data.

This workflow template is flexible: the extracted fields and the parsing logic can be adjusted to specific requirements. It suits a wide range of web scraping scenarios, reducing manual effort while improving the efficiency and quality of data acquisition.