🔐🦙🤖 Private & Local Ollama Self-Hosted LLM Router

This workflow implements a private, locally deployed dynamic router that selects the most suitable local large language model for each incoming request. It supports multiple specialized models and runs entirely on local infrastructure, safeguarding data privacy and security. Built-in decision trees and classification rules automate model scheduling and contextual memory management, improving both the interaction experience and task throughput for users with diverse workloads.

Workflow Diagram
🔐🦙🤖 Private & Local Ollama Self-Hosted LLM Router Workflow diagram

Workflow Name

🔐🦙🤖 Private & Local Ollama Self-Hosted LLM Router

Key Features and Highlights

This workflow implements a private, locally deployed Ollama large language model (LLM) dynamic router that intelligently selects the most suitable local Ollama model based on user input prompts. It supports multiple specialized models (text reasoning, multilingual conversation, programming assistance, visual analysis, etc.) and runs entirely on local infrastructure, ensuring data privacy and security. The workflow incorporates complex decision trees and classification rules to automate model scheduling and context memory management, enhancing interaction experience and task processing efficiency.
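To make the routing idea concrete, here is a minimal Python sketch against Ollama's local HTTP API (assuming a default install listening on http://localhost:11434). The classification rules and the choice of llama3.2 as the router model are illustrative assumptions; the actual “LLM Router” node encodes its decision tree inside the n8n workflow itself.

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/chat"  # default local Ollama endpoint

# Illustrative decision rules; the real "LLM Router" node defines its own.
ROUTER_PROMPT = """Reply with exactly one model name for the request below:
- qwq: complex reasoning and math
- llama3.2: general multilingual conversation
- qwen2.5-coder: code generation and repair
- llama3.2-vision: image or document analysis

Request: {prompt}"""

def route(prompt: str, router_model: str = "llama3.2") -> str:
    """Ask a small local model which specialist model should handle the prompt."""
    resp = requests.post(OLLAMA_URL, json={
        "model": router_model,
        "messages": [{"role": "user", "content": ROUTER_PROMPT.format(prompt=prompt)}],
        "stream": False,  # return one JSON object instead of a token stream
    })
    resp.raise_for_status()
    return resp.json()["message"]["content"].strip()
```

Using a lightweight model for the classification step keeps routing latency low while reserving the heavier specialist models for generating the answer itself.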

Core Problems Addressed

In environments with multiple local Ollama models, users often struggle to manually identify and select the best model for a given task. This workflow analyzes each request and automatically routes it to the most appropriate model, removing that manual selection burden, ensuring tasks are completed efficiently and accurately, and keeping all data off the cloud to protect privacy.

Use Cases

  • AI enthusiasts and developers requiring a localized and secure multi-model language intelligence platform
  • Handling diverse tasks such as multilingual conversation, complex reasoning, code generation and repair, and visual analysis of images and documents
  • Enterprises or individuals prioritizing data privacy and avoiding external access to sensitive information
  • Automated office or R&D environments needing unified management of local multi-model resources with intelligent scheduling

Main Workflow Steps

  1. Receive user chat messages via Webhook (triggered by the “When chat message received” node)
  2. Analyze user input and determine the most suitable Ollama model name using the “LLM Router” node based on predefined decision trees and classification rules (steps 2–5 are sketched in code after this list)
  3. Pass the routing result to the “AI Agent with Dynamic LLM” node to invoke the corresponding Ollama model for response generation
  4. Maintain conversation context using the “Router Chat Memory” and “Agent Chat Memory” nodes to enable continuous dialogue memory
  5. Return the response to the user, delivering a localized, intelligent, and dynamic multi-model collaborative service
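The following sketch strings steps 2 through 5 together outside n8n, reusing the route() helper from the earlier routing sketch. The plain Python history list is a stand-in for the “Router Chat Memory” and “Agent Chat Memory” buffer nodes, which n8n manages per session inside the workflow.

```python
import requests  # route() comes from the earlier routing sketch

OLLAMA_URL = "http://localhost:11434/api/chat"

history: list[dict] = []  # stand-in for the workflow's chat-memory buffers

def handle_message(user_text: str) -> str:
    model = route(user_text)                 # step 2: pick the best model
    history.append({"role": "user", "content": user_text})
    resp = requests.post(OLLAMA_URL, json={  # step 3: invoke the chosen model
        "model": model,
        "messages": history,                 # step 4: pass conversation context
        "stream": False,
    })
    resp.raise_for_status()
    answer = resp.json()["message"]["content"]
    history.append({"role": "assistant", "content": answer})
    return answer                            # step 5: deliver the reply

# Example turn: a coding request would be routed to a code-focused model.
print(handle_message("Fix the off-by-one bug in my Python loop."))
```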

Systems and Services Involved

  • Multiple locally deployed Ollama large language models, including qwq, llama3.2, phi4, qwen2.5-coder, granite3.2-vision, and llama3.2-vision (a quick availability check is sketched after this list)
  • n8n automation workflow platform nodes, such as LangChain-integrated chat triggers, agents, and memory buffers
  • Webhook for receiving external chat message inputs
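Since the router can only dispatch to models that are actually pulled, a quick sanity check against Ollama's /api/tags endpoint (again assuming the default local install) verifies that the expected models are present before the workflow is started:

```python
import requests

EXPECTED = {"qwq", "llama3.2", "phi4", "qwen2.5-coder",
            "granite3.2-vision", "llama3.2-vision"}

tags = requests.get("http://localhost:11434/api/tags").json()
# /api/tags lists locally pulled models; names carry a tag suffix like ":latest"
installed = {m["name"].split(":")[0] for m in tags.get("models", [])}

missing = EXPECTED - installed
if missing:
    print("Missing models, pull them first:", ", ".join(sorted(missing)))
else:
    print("All router targets are available locally.")
```

Any missing model can be fetched with `ollama pull <name>` before the workflow goes live.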

Target Users and Value Proposition

  • AI technology enthusiasts and researchers seeking intelligent Q&A and task processing solutions with zero data leakage
  • Software developers aiming to improve multi-scenario code assistance and text processing efficiency through dynamic model selection
  • Enterprise users and privacy-sensitive industries requiring data security while leveraging advanced AI multi-model collaboration capabilities
  • Teams or individuals looking to build intelligent, flexible, and privacy-friendly localized large language model services

This workflow demonstrates how to leverage n8n’s powerful and flexible automation capabilities combined with local Ollama LLMs to achieve intelligent multi-model routing and collaboration, ensuring complete local data processing and meeting diverse and complex task requirements.