Hacker News Historical Headlines Insight Automation Workflow
This workflow automatically scrapes Hacker News front-page headlines for the same calendar date across past years, groups the titles by date, and uses a large language model to classify and analyze them. It then generates a structured insight report in Markdown and pushes it to a Telegram channel as soon as the report is ready. This replaces the repetitive manual work of collecting and organizing old headlines, and suits scenarios such as technology research, news retrospectives, and data analysis.

Workflow Name
Hacker News Historical Headlines Insight Automation Workflow
Key Features and Highlights
This workflow runs on a schedule and fetches the top Hacker News headlines for the current calendar date across multiple years. It extracts and aggregates the headlines for each of those dates, uses the Google Gemini language model to classify and analyze them, and generates a structured insight report in Markdown. The report is then delivered automatically to a Telegram channel.
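The "same calendar date across multiple years" idea can be sketched in a few lines of Python. This is a minimal illustration outside n8n, not the workflow's own node code: the function names are hypothetical, and the URL pattern is an assumption about how Hacker News exposes past front pages. The start year 2007 comes from the workflow steps.

```python
from datetime import date

FIRST_YEAR = 2007  # start year named in the workflow description


def same_day_across_years(today: date, first_year: int = FIRST_YEAR) -> list[str]:
    """Return ISO dates for today's month and day in every year up to today.year."""
    dates = []
    for year in range(first_year, today.year + 1):
        try:
            dates.append(date(year, today.month, today.day).isoformat())
        except ValueError:
            # 29 February does not exist in non-leap years; skip those years.
            continue
    return dates


def front_page_url(day: str) -> str:
    # Assumed URL pattern for the Hacker News past-front-page view.
    return f"https://news.ycombinator.com/front?day={day}"


if __name__ == "__main__":
    for day in same_day_across_years(date.today()):
        print(front_page_url(day))
```

Skipping 29 February in non-leap years is one possible choice; falling back to 28 February would also be reasonable.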
Core Problems Addressed
- Automates cross-year retrieval and organization of news data, eliminating repetitive manual work
- Uses a large language model to identify and categorize headlines, surfacing core themes and trends
- Connects Hacker News, Google Gemini, and Telegram in one automated flow, so reports arrive promptly without manual steps
Application Scenarios
- Technology industry researchers tracking development trends
- News editors quickly reviewing historical hot topics and their evolution
- Data analysts conducting time-series analysis of news themes
- Social media operators enhancing user engagement through regular high-quality content delivery
Main Workflow Steps
- Schedule Trigger: Starts the workflow daily at 9 PM
- CreateYearsList: Generates the list of dates that match today's month and day for every year from 2007 to the present
- CleanUpYearList & SplitOutYearList: Cleans the date list and splits it into individual items so that each date can be fetched separately
- GetFrontPage: Requests the Hacker News front page for each date and retrieves the HTML containing that day's top headlines
- ExtractDetails: Parses HTML to extract news headlines and their corresponding dates
- GetHeadlines & GetDate: Organizes headline and date information
- MergeHeadlinesDate & SingleJson: Merges multi-date data into a unified JSON structure
- Basic LLM Chain (integrating Google Gemini Chat Model): Invokes the large language model to identify key headlines from the input JSON, classify them by theme, and generate a Markdown-formatted analytical report
- Telegram: Automatically pushes the generated report to subscribers via a Telegram channel
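The extraction and merge steps above (ExtractDetails through SingleJson) can be approximated as follows. This is a rough sketch under stated assumptions rather than the workflow's actual node configuration: it assumes each headline is the first link inside a `<span class="titleline">` element, which may not hold for every version of the Hacker News markup, and the names `TitleParser`, `extract_titles`, and `merge_pages` are hypothetical.

```python
import json
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text of the first link inside each <span class="titleline">."""

    def __init__(self) -> None:
        super().__init__()
        self.titles: list[str] = []
        self._span_depth = 0     # > 0 while inside a titleline span
        self._in_link = False    # True while reading the headline link text
        self._captured = False   # first link of the current titleline already taken
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            if self._span_depth > 0:
                self._span_depth += 1  # nested span, e.g. the site name
            elif "titleline" in (dict(attrs).get("class") or "").split():
                self._span_depth = 1
                self._captured = False
        elif tag == "a" and self._span_depth > 0 and not self._captured:
            self._in_link = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self.titles.append("".join(self._buffer).strip())
            self._in_link = False
            self._captured = True
        elif tag == "span" and self._span_depth > 0:
            self._span_depth -= 1

    def handle_data(self, data):
        if self._in_link:
            self._buffer.append(data)


def extract_titles(html: str) -> list[str]:
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.titles


def merge_pages(pages: dict[str, str]) -> str:
    """Turn {date: html} into one JSON document listing the headlines per date."""
    merged = [
        {"date": day, "headlines": extract_titles(html)}
        for day, html in sorted(pages.items())
    ]
    return json.dumps(merged, ensure_ascii=False, indent=2)
```

The resulting JSON string is the kind of single input that the LLM step could receive along with instructions to pick key headlines, group them by theme, and answer in Markdown.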
Involved Systems or Services
- Hacker News (data source)
- Google Gemini (PaLM) language model (intelligent analysis and text generation)
- Telegram (content distribution and push notifications)
- n8n platform (workflow automation orchestration)
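When the report is pushed to Telegram, one practical detail is that a single Telegram message holds at most 4096 characters, so a long report may need to be split. The sketch below shows one way to do that outside n8n; the function names are hypothetical, and the payload carries only the `chat_id` and `text` fields of the Bot API `sendMessage` method. Whether the workflow's own Telegram node needs such splitting depends on the report length.

```python
TELEGRAM_LIMIT = 4096  # maximum length of a single Telegram message text


def split_report(report: str, limit: int = TELEGRAM_LIMIT) -> list[str]:
    """Split a report into chunks of at most `limit` characters, preferring line breaks."""
    chunks: list[str] = []
    current = ""
    for line in report.splitlines(keepends=True):
        # A single over-long line is cut hard so that no chunk exceeds the limit.
        while len(line) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        # Start a new chunk when adding this line would overflow the current one.
        if len(current) + len(line) > limit:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


def send_message_payloads(chat_id: str, report: str) -> list[dict]:
    """Build one sendMessage payload per chunk; sending them is left to the caller."""
    return [{"chat_id": chat_id, "text": chunk} for chunk in split_report(report)]
```

The payloads omit `parse_mode` on purpose: cutting a Markdown report at arbitrary lines can leave unbalanced formatting markers, which Telegram may reject when asked to parse them.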
Target Users and Value
- Technology trend analysts and industry observers, helping them quickly grasp the evolution of tech news
- Media and content operators, efficiently obtaining high-quality historical news content to enrich creative materials
- Data scientists and researchers, supporting time-series news data analysis and insight extraction
- Individuals or teams interested in technology news and its history, giving them a quick view of how topics have changed over the years
This workflow provides systematic cross-year insight into news themes by combining automated data retrieval, LLM-based content analysis, and distribution via Telegram, making it a practical tool for managing and sharing technology news.