HTTP Request Tool (Web Content Scraping and Simplified Processing Tool)
This workflow is a web content scraping and processing tool that automatically retrieves web page content from specified URLs and converts it into Markdown format. It supports two scraping modes: full and simplified. The simplified mode removes links and images to prevent excessively long content from wasting computational resources. A built-in error handling mechanism intelligently responds to request exceptions, ensuring the stability and accuracy of the scraping process. It is suitable for scenarios such as AI chatbots, data scraping, and content summarization.

Workflow Name
HTTP_Request_Tool (Web Content Scraping and Simplified Processing Tool)
Key Features and Highlights
This workflow is specifically designed to scrape web content from specified URLs, supporting two scraping modes: "full" and "simplified." The full mode returns the webpage content in Markdown format, including links and image URLs. The simplified mode removes all URLs and image links, generating a more concise Markdown text that effectively reduces page length and conserves processing resources. The workflow incorporates built-in error handling mechanisms that intelligently provide feedback on parameter errors or request failures. It also supports dynamic adjustment of query parameters to enhance scraping accuracy and stability.
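To illustrate what the simplified mode does to Markdown output, the sketch below strips links and images with regular expressions. This is a minimal, assumed illustration (the function name and placeholder text are not from the workflow itself, which performs this step inside n8n nodes):

```python
import re

def simplify_markdown(md: str) -> str:
    """Reduce Markdown for a 'simplified' mode: replace images with a
    placeholder and keep only the visible text of links."""
    # ![alt](url) -> (image: alt) — must run before the link rule,
    # since an image tag contains a link-like [alt](url) part
    md = re.sub(r'!\[([^\]]*)\]\([^)]*\)', r'(image: \1)', md)
    # [text](url) -> text
    md = re.sub(r'\[([^\]]+)\]\([^)]*\)', r'\1', md)
    # collapse any remaining bare URLs
    md = re.sub(r'https?://\S+', '(link)', md)
    return md

print(simplify_markdown("See [docs](https://example.com) and ![logo](https://example.com/a.png)."))
```

Running the example prints `See docs and (image: logo).`, showing how both URL forms disappear while the readable text survives.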
Core Problems Addressed
- Automates web content scraping and converts it into an easily processable Markdown format.
- Reduces unnecessary links and image data via the simplified mode to avoid processing bottlenecks caused by overly long content.
- Intelligently detects and reports query parameter errors or request anomalies, supporting AI agent-driven automatic query adjustments.
- Limits the length of returned content to prevent excessive resource consumption on very long pages.
Application Scenarios
- AI chatbots or intelligent agents requiring rapid acquisition and comprehension of web content.
- Content summarization, web information extraction, and structured data processing.
- Data scraping and preprocessing, especially optimized handling of lengthy web pages.
- Automated workflows that invoke web data as input.
Main Workflow Steps
- Receive HTTP Query Parameters: Input arrives as a query string (e.g., `?url=VALIDURL&method=SELECTEDMETHOD`).
- Parse Parameters and Configure Settings: Convert the query string into a JSON object and set the maximum allowed content length.
- Initiate HTTP Request: Fetch the webpage HTML content from the specified URL, with support for ignoring certificate errors.
- Error Detection: Check whether the request failed; if so, return an error message, otherwise continue.
- HTML Content Processing:
  - Extract content within the `<body>` tag.
  - Remove all scripts, styles, embedded media, comments, and other non-content tags to ensure clean output.
- Simplification Decision: Based on request parameters, determine whether to replace all links and image tags with placeholders.
- Convert to Markdown Format: Transform the processed HTML into Markdown, preserving page structure while significantly compressing content length.
- Length Limit Check: If content exceeds the maximum limit, return an error message.
- Output Final Page Content: Return the processed Markdown content as a string.
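The steps above can be sketched end to end. Everything here is an assumed illustration rather than the workflow's actual node code: the function names and the `MAX_LENGTH` value are hypothetical, the HTTP fetch itself is omitted, and the final HTML-to-Markdown conversion is handled in the real workflow by n8n's Markdown node.

```python
import re
from urllib.parse import parse_qs

MAX_LENGTH = 70_000  # assumed limit; the workflow configures its own maximum

def parse_query(query: str) -> dict:
    """Step 2: turn '?url=...&method=...' into a flat parameter dict."""
    params = {k: v[0] for k, v in parse_qs(query.lstrip('?')).items()}
    if 'url' not in params:
        # Step 4: surface parameter errors so an AI agent can adjust the query
        raise ValueError("query error: 'url' parameter is required")
    return params

def extract_body(html: str) -> str:
    """Step 5: keep only <body> content, stripping scripts, styles, comments."""
    m = re.search(r'<body[^>]*>(.*?)</body>', html, re.S | re.I)
    content = m.group(1) if m else html
    content = re.sub(r'<(script|style)[^>]*>.*?</\1>', '', content, flags=re.S | re.I)
    content = re.sub(r'<!--.*?-->', '', content, flags=re.S)
    return content

def check_length(markdown: str) -> str:
    """Step 8: return the content, or an error message if it is too long."""
    if len(markdown) > MAX_LENGTH:
        return f"ERROR: content exceeds {MAX_LENGTH} characters"
    return markdown
```

A caller would chain these: parse the query, fetch the URL's HTML, clean it with `extract_body`, convert it to Markdown (optionally simplified), and gate the result through `check_length` before returning it.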
Involved Systems or Services
- n8n Node System: Including HTTP request, conditional logic, text processing, Markdown conversion, and other fundamental nodes.
- LangChain AI Agent and Models (OpenAI GPT-4o-mini): Used for intelligent query adjustments and error feedback.
- Webhook Trigger: Supports workflow activation via chat messages.
- Internal Workflow Invocation Mechanism: Enables calls from other workflows for seamless integration.
Target Users and Value Proposition
- AI Developers and Data Scientists: Facilitate easy integration of web data scraping and preprocessing to improve AI model input quality.
- Product Managers and Automation Engineers: Quickly build intelligent content scraping and conversion tools to support diverse automation needs.
- Content Operations and Information Extraction Teams: Efficiently obtain structured web content to assist in content analysis and summarization.
- Developer Communities and n8n Users: Provide a powerful and flexible web scraping template that lowers technical barriers and enables automated web information processing.
By combining AI-driven agents with multi-step content cleansing, this workflow helps users efficiently and accurately scrape and convert web content, significantly enhancing the quality and efficiency of automated information processing.