Testing Multiple Local LLMs with LM Studio

This workflow automates the testing and performance evaluation of multiple local large language models. It integrates with the LM Studio server and can dynamically invoke each available model to generate text. Through custom prompts, users can guide the models to produce text that meets specific readability standards. The workflow also includes multiple text analysis metrics that quantify output quality in real time, and it automatically saves the results to Google Sheets for later comparison and data tracking, significantly improving the efficiency and accuracy of language model testing.

Tags

Local LLM Test, Text Readability

Workflow Name

Testing Multiple Local LLMs with LM Studio

Key Features and Highlights

This workflow enables automated testing and performance evaluation of multiple local large language models (LLMs) by integrating with the LM Studio server. It supports dynamic invocation of each model to generate text responses. Through customizable system prompts, the workflow guides models to produce outputs that meet specific readability standards (e.g., a 5th-grade reading level). Built-in text analysis metrics (word count, sentence count, average sentence length, and readability scores) are calculated automatically. Final results are saved to Google Sheets for convenient batch comparison and data tracking.
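As a minimal sketch of how such metrics can be computed (the workflow's own code node may differ; the vowel-group syllable counter below is a rough approximation, and real readability tools use better heuristics or dictionaries):

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def text_metrics(text: str) -> dict:
    """Compute word/sentence statistics and an approximate
    Flesch-Kincaid grade level for a model's response."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    n_words = len(words)
    n_sents = max(1, len(sentences))
    syllables = sum(count_syllables(w) for w in words)
    avg_sentence_len = n_words / n_sents
    avg_word_len = sum(len(w) for w in words) / max(1, n_words)
    # Flesch-Kincaid Grade Level:
    # 0.39 * (words/sentence) + 11.8 * (syllables/word) - 15.59
    fk_grade = 0.39 * avg_sentence_len + 11.8 * (syllables / max(1, n_words)) - 15.59
    return {
        "word_count": n_words,
        "sentence_count": n_sents,
        "avg_sentence_length": round(avg_sentence_len, 2),
        "avg_word_length": round(avg_word_len, 2),
        "fk_grade": round(fk_grade, 2),
    }
```

For example, `text_metrics("The cat sat. The dog ran fast.")` reports 7 words across 2 sentences, an average sentence length of 3.5, and a very low (easy) grade level.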

Core Problems Addressed

  • Automated management and unified invocation of multiple models
  • Quantitative analysis and comparison of language model output quality
  • Real-time statistics on key indicators such as readability and response time
  • Effective storage and visualization of test result data

Application Scenarios

  • Performance comparison testing of local models by language model R&D teams
  • Evaluation of text readability and clarity in education or content creation domains
  • Rapid assessment and comparison of the performance of multiple LLMs by product managers and data analysts
  • Any automated workflow requiring batch generation and analysis of text outputs

Main Workflow Steps

  1. LM Studio Environment Setup: Download, install, and configure the LM Studio server, loading the LLM models to be tested.
  2. Retrieve Model List: Dynamically fetch all available model IDs on the server via HTTP requests.
  3. Trigger on Incoming Chat Messages: Listen for external chat message inputs to serve as test prompts.
  4. Add System Prompts: Automatically inject guiding instructions to ensure concise and readable model outputs.
  5. Invoke Models for Response Generation: Run text generation individually for each model.
  6. Record Timestamps: Capture request start and end times to calculate response latency.
  7. Text Metrics Analysis: Execute custom code nodes to compute word count, sentence count, average sentence length, average word length, and Flesch-Kincaid readability scores.
  8. Data Preparation and Storage: Organize data and automatically append test results to Google Sheets online spreadsheets.
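The retrieval and invocation steps above can be sketched against LM Studio's OpenAI-compatible REST API (served at `http://localhost:1234/v1` by default; adjust if your server differs). The system prompt text and helper names below are illustrative assumptions, not part of the workflow itself:

```python
import json
import urllib.request

# LM Studio exposes an OpenAI-compatible API; localhost:1234 is its
# default port -- adjust BASE_URL if your server is configured differently.
BASE_URL = "http://localhost:1234/v1"

# Illustrative system prompt; the actual wording is up to the workflow author.
SYSTEM_PROMPT = "Answer concisely at a 5th-grade reading level."

def build_chat_payload(model_id: str, user_message: str) -> dict:
    """Assemble an OpenAI-style chat-completion request body for one model."""
    return {
        "model": model_id,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
    }

def list_models(base_url: str = BASE_URL) -> list:
    """Fetch the IDs of all models currently available on the server."""
    with urllib.request.urlopen(f"{base_url}/models") as resp:
        return [m["id"] for m in json.load(resp)["data"]]

def generate(model_id: str, user_message: str, base_url: str = BASE_URL) -> str:
    """Run one chat completion against a single local model."""
    body = json.dumps(build_chat_payload(model_id, user_message)).encode()
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Looping `generate` over `list_models()` while recording timestamps before and after each call reproduces steps 2, 5, and 6 of the workflow.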

Involved Systems or Services

  • LM Studio: Local LLM server for loading and managing multiple language models
  • n8n: Automation platform for scheduling triggers, invoking models, and processing data
  • Google Sheets: Online spreadsheet service for storing and displaying test result data

Target Users and Value

  • AI Researchers and Developers: Conveniently compare performance and output quality of various locally deployed LLMs
  • Content Creators and Editors: Assess text readability to optimize content expression
  • Data Analysts and Product Managers: Obtain detailed model response metrics to support decision-making
  • Educators: Verify AI-generated text compliance with specific reading level standards
  • Automation Engineers: Enhance model testing efficiency and reduce manual operations through automated workflows

By structuring and automating the testing process, this workflow significantly simplifies the complexity of multi-model local testing, providing a scientific basis for performance and text quality comparison, thereby empowering teams to rapidly iterate and optimize language model applications.

Recommended Templates

Twilio SMS Intelligent Buffering Reply Workflow

This workflow receives users' text messages and temporarily caches multiple rapidly sent messages in Redis within a short period. After a 5-second delay for evaluation, these messages are consolidated into a single message, which is sent to an AI model to generate a unified response. Finally, the response is returned to the user via text message. This process effectively addresses the issue of intermittent replies when users input messages frequently, enhancing the coherence of the conversation and improving user experience. It is suitable for scenarios such as customer service auto-replies and intelligent chatbots.
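The buffering pattern can be illustrated with a pure-Python stand-in (a real deployment would keep this state in Redis so it survives restarts and scales across workers; the class and method names here are hypothetical):

```python
import time
from collections import defaultdict

BUFFER_WINDOW = 5.0  # seconds to wait for follow-up messages

class MessageBuffer:
    """In-memory sketch of the buffering pattern: collect rapid messages
    per user, then flush them as one consolidated prompt once the user
    has been quiet for BUFFER_WINDOW seconds."""

    def __init__(self):
        self._messages = defaultdict(list)
        self._last_seen = {}

    def add(self, user_id, text, now=None):
        """Record an incoming message and reset the user's quiet timer."""
        now = time.monotonic() if now is None else now
        self._messages[user_id].append(text)
        self._last_seen[user_id] = now

    def flush_ready(self, now=None):
        """Return consolidated messages for users quiet long enough."""
        now = time.monotonic() if now is None else now
        ready = {}
        for user_id, last in list(self._last_seen.items()):
            if now - last >= BUFFER_WINDOW:
                ready[user_id] = " ".join(self._messages.pop(user_id))
                del self._last_seen[user_id]
        return ready
```

Each flushed value is then sent to the AI model as a single prompt, and the one response goes back to the user via Twilio.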

Twilio SMS, Smart Buffer

Chatbot Model

This workflow builds an intelligent chatbot designed to quickly recommend suitable health insurance products based on users' personal information and needs. By combining OpenAI's language model with persistent chat memory, the chatbot can dynamically interpret user input to provide personalized services. Additionally, by integrating external APIs and knowledge bases, it further enriches the content of responses, enhances user interaction experience, and addresses the issues of slow response times and inaccurate matching commonly found in traditional customer service.

Smart Chat, Health Insurance

Fine-tuning with OpenAI Models

This workflow implements a fully automated fine-tuning process for OpenAI models, simplifying the cumbersome steps of traditional model fine-tuning. Users only need to initiate the workflow manually; it then downloads training data from Google Drive, uploads it to OpenAI for fine-tuning, and produces a customized model accessible via API calls. The process supports intelligent Q&A functionality and is suitable for fields such as enterprise, education, and customer service, helping users quickly build professional AI assistants and enhance business intelligence.

Model Fine-tuning, Automation Process

Dynamically Switch Between LLMs Template

This workflow is capable of dynamically selecting different large language models for dialogue generation based on user input, flexibly utilizing various OpenAI models to meet complex customer needs. It includes automatic verification of the quality of generated responses, ensuring that the content is polite and to the point. Additionally, the workflow features error detection and an automatic switching mechanism, enhancing the robustness and success rate of the conversations. It is suitable for scenarios such as customer service automation, AI experimentation platforms, and intelligent assistants. Through intelligent management, it significantly improves customer service efficiency.
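The switch-on-failure pattern described above reduces to a simple fallback loop. The function below is a hedged sketch of the idea, with all names illustrative; the real workflow's quality check is itself AI-driven:

```python
def generate_with_fallback(prompt, models, call_model, is_acceptable):
    """Try each model in order; return the first response that passes the
    quality check, falling through on errors or rejected outputs.

    call_model(model_id, prompt) -> str  and  is_acceptable(reply) -> bool
    are supplied by the caller.
    """
    for model_id in models:
        try:
            reply = call_model(model_id, prompt)
        except Exception:
            continue  # model errored: switch to the next one
        if is_acceptable(reply):
            return model_id, reply
    raise RuntimeError("All models failed or produced unacceptable output")
```

A check like "is the reply polite and on-topic" can be implemented as a second, cheaper model call inside `is_acceptable`.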

Dynamic LLM Switch, Response Quality Check

Intelligent Todoist Task Auto-Classification and Priority Update

This workflow automatically retrieves to-do tasks from Todoist at scheduled intervals and utilizes AI to intelligently analyze the task content, achieving automatic classification and priority updates. Users do not need to perform manual operations, allowing for efficient task management and preventing the oversight of important items. It is suitable for both individuals and teams, especially when dealing with a large volume of tasks and complex classifications, significantly enhancing the intelligence and efficiency of task management, and helping users allocate their time and energy more effectively.

Smart Sorting, Priority Management

Spy Tool

This workflow has the capability to automatically monitor the content of specified websites. It intelligently analyzes changes in web pages and uses AI models for differential analysis to accurately determine whether an email notification needs to be sent. It integrates functions such as web scraping, content comparison, and automatic email sending, avoiding interference from irrelevant changes and ensuring that users receive important updates in a timely manner. This greatly enhances the efficiency and accuracy of information monitoring and is suitable for various scenarios, including market analysis, product management, and media public relations.
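The content-comparison step can be approximated with Python's standard `difflib`; the substring-based noise filter below is an illustrative stand-in for the AI differential analysis, which would judge the remaining diff lines for significance:

```python
import difflib

def changed_lines(old: str, new: str) -> list:
    """Return only the added/removed lines between two page snapshots."""
    diff = difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]

def needs_notification(old, new, ignore_substrings=("visitor count",)):
    """Crude pre-filter: drop changes that only touch known-noisy lines.
    A real deployment would pass the surviving diff to an LLM to decide
    whether an email is warranted."""
    relevant = [
        line for line in changed_lines(old, new)
        if not any(s in line.lower() for s in ignore_substrings)
    ]
    return bool(relevant)
```

This way a counter ticking up does not trigger an email, while a genuine content change (say, a price edit) does.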

Web Monitoring, AI Analysis

AI Agent with Charts Capabilities Using OpenAI Structured Output

This workflow integrates an intelligent chat agent based on the GPT-4 model, combining natural-language requests with dynamic chart generation. Users only need to describe their requirements, and the system automatically generates chart definitions that comply with Quickchart.io standards, embedding them as images within the conversation. This significantly enhances the efficiency of data analysis and decision support, making it suitable for scenarios such as business reports and educational training.
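Quickchart.io renders a Chart.js-style JSON config passed in the `c` query parameter of `https://quickchart.io/chart`, so embedding a chart as an image reduces to URL construction; the helper name below is illustrative:

```python
import json
import urllib.parse

def quickchart_url(chart_config: dict) -> str:
    """Encode a Chart.js-style config into a QuickChart image URL,
    suitable for embedding directly in a chat message."""
    encoded = urllib.parse.quote(json.dumps(chart_config))
    return f"https://quickchart.io/chart?c={encoded}"
```

With structured output, the model is prompted to emit the `chart_config` dict itself, and the workflow only has to wrap it in this URL.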

Smart Charts, Structured Output

YogiAI

YogiAI is an automated tool designed for yoga enthusiasts that randomly selects exercises from a yoga pose database at scheduled times each day. It utilizes artificial intelligence to generate friendly practice texts and pushes them to users via Line, simplifying the preparation process for daily practice content. It enhances the user experience in a smart and personalized way, helping users develop a regular yoga practice habit while increasing the interactivity and diversity of the content.

Yoga Automation, AI Push