🔐🦙🤖 Private & Local Ollama Self-Hosted LLM Router

This workflow implements a private and locally deployed dynamic router that intelligently selects the most suitable local large language model for responses based on user input. It supports various specialized models, ensuring that the entire process runs locally to safeguard data privacy and security. With built-in decision trees and classification rules, it automatically schedules models and manages contextual memory, enhancing the interaction experience and task processing efficiency, making it suitable for user groups that require efficient and diverse task handling.

On-PremiseSmart Routing

Workflow Name

Key Features and Highlights

This workflow implements a private, locally deployed Ollama large language model (LLM) dynamic router that intelligently selects the most suitable local Ollama model based on user input prompts. It supports multiple specialized models (text reasoning, multilingual conversation, programming assistance, visual analysis, etc.) and runs entirely on local infrastructure, ensuring data privacy and security. The workflow incorporates complex decision trees and classification rules to automate model scheduling and context memory management, enhancing interaction experience and task processing efficiency.

Core Problems Addressed

In environments with multiple local Ollama LLM models, users often find it challenging to manually identify and select the best model for a given task. This workflow intelligently analyzes user requests and automatically routes them to the most appropriate model, eliminating technical barriers, ensuring efficient and accurate task completion, and preventing data from being uploaded to the cloud to protect privacy.

Use Cases

AI enthusiasts and developers requiring a localized and secure multi-model language intelligence platform
Handling diverse tasks such as multilingual conversations, complex reasoning, code generation and repair, image and document visual analysis
Enterprises or individuals prioritizing data privacy and avoiding external access to sensitive information
Automated office or R&D environments needing unified management of local multi-model resources with intelligent scheduling

Main Workflow Steps

Receive user chat messages via Webhook (triggered by the “When chat message received” node)
Analyze user input and determine the most suitable Ollama model name using the “LLM Router” node based on predefined decision trees and classification rules
Pass the routing result to the “AI Agent with Dynamic LLM” node to invoke the corresponding Ollama model for response generation
Maintain conversation context using the “Router Chat Memory” and “Agent Chat Memory” nodes to enable continuous dialogue memory
Return the response to the user, delivering a localized, intelligent, and dynamic multi-model collaborative service

Systems and Services Involved

Multiple locally deployed Ollama large language models, including qwq, llama3.2, phi4, qwen2.5-coder, granite3.2-vision, llama3.2-vision
n8n automation workflow platform nodes, such as LangChain-integrated chat triggers, agents, and memory buffers
Webhook for receiving external chat message inputs

Target Users and Value Proposition

AI technology enthusiasts and researchers seeking intelligent Q&A and task processing solutions with zero data leakage
Software developers aiming to improve multi-scenario code assistance and text processing efficiency through dynamic model selection
Enterprise users and privacy-sensitive industries requiring data security while leveraging advanced AI multi-model collaboration capabilities
Teams or individuals looking to build intelligent, flexible, and privacy-friendly localized large language model services

This workflow demonstrates how to leverage n8n’s powerful and flexible automation capabilities combined with local Ollama LLMs to achieve intelligent multi-model routing and collaboration, ensuring complete local data processing and meeting diverse and complex task requirements.

Recommend Templates

Intelligent Chat Assistant Workflow

This workflow implements an intelligent chat assistant with context memory and computational capabilities. By continuously tracking user conversations, it ensures dialogue coherence and prevents information loss. It can handle complex calculation requests, enhancing user experience, and is suitable for scenarios such as online customer service, virtual assistance, and educational tutoring. This assistant integrates powerful language understanding and generation capabilities, making it ideal for developers and businesses to build efficient intelligent dialogue systems, significantly improving interaction quality and response efficiency.

Smart ChatContext Memory

Discord MCP Chat Agent

This workflow enables intelligent chat interactions and task processing through the reception of Discord chat messages, utilizing advanced language models and intelligent agents. It can automatically understand user instructions, streamline the management processes of Discord servers, and enhance user interaction efficiency, making it suitable for various scenarios such as community management, customer support, and smart assistants. Its flexible structure allows users to customize settings according to their needs, enhancing both automation and the interactive experience.

Discord BotSmart Support

AI Agent: Conversational Airtable Data Assistant

This workflow is an intelligent data assistant that allows users to interact with the Airtable database using natural language, simplifying the process of data querying and analysis. Users only need to input their questions, and the system intelligently parses the requests, automatically generating query conditions and executing operations. It supports mathematical operations and data visualization, and features contextual memory, enabling multi-turn conversations to enhance interaction efficiency. It is suitable for business personnel, data analysts, and project managers, helping them to access and analyze data more quickly and conveniently.

Airtable AssistantSmart Chat

Multi-Scenario Intelligent Automation Showcase

This workflow integrates various intelligent automation features, enabling smart email categorization, semantic question answering for PDF documents, and intelligent appointment management. Through AI models and vector databases, users can efficiently process email and document information, quickly retrieving key content. Additionally, the built-in calendar interface can automatically schedule meetings, avoiding appointment conflicts and enhancing work efficiency. It is suitable for business users who need to manage information and schedules effectively, optimizing customer experience and team collaboration.

Intelligent AutomationEmail Classification

AI Voice Chat using Webhook, Memory Manager, OpenAI, Google Gemini & ElevenLabs

This workflow builds a complete AI voice chat system that can transcribe user speech into text in real time and achieve understanding and generation of multi-turn conversations through context memory management. By combining advanced language models with high-quality text-to-speech technology, the system can provide natural and smooth voice responses, making it suitable for scenarios such as intelligent customer service and voice assistants, thereby enhancing user interaction experience and efficiency.

Intelligent VoiceMulti-turn Dialogue

🐋🤖 DeepSeek AI Agent + Telegram + LONG TERM Memory 🧠

This workflow combines intelligent agents and chatbot technology to automatically receive and process messages from Telegram users. Through personalized intelligent analysis and long-term memory capabilities, it enables contextually relevant interactions and stores important information in Google Docs to provide personalized services and efficient communication. Additionally, it features a strict user authentication mechanism to ensure interaction security, making it suitable for various scenarios such as smart customer service and personal assistants, thereby enhancing user experience and information management efficiency.

Telegram BotLong-term Memory

WhatsApp Multimedia Intelligent Interaction Assistant

This workflow aims to achieve automatic recognition and intelligent processing of multimedia messages sent by users via WhatsApp. Utilizing advanced AI technology, it can transcribe audio in real-time, analyze video, recognize image content, and generate intelligent replies, effectively streamlining customer service, consultation, and appointment processes, while enhancing user experience and processing efficiency. It is suitable for various scenarios including enterprise customer service, marketing, and education, facilitating the automation and intelligence of multimedia interactions.

WhatsApp AssistantMultimodal AI

Insert and Retrieve Documents

This workflow is designed to automatically scrape the latest articles from the Paul Graham website, extract and clean their main content, generate vectors, and store them in the Milvus database. Users can query through a chat interface, and the system will retrieve relevant text based on vector searches, utilizing the GPT-4 model for intelligent Q&A, ensuring that the answers are accurate and traceable. It is suitable for knowledge base construction, intelligent customer service, content aggregation, and research assistance, enhancing the management and utilization efficiency of text data.

text scrapingsemantic search