Capabilities Overview

OpsPilot's capability system consists of two core modules:

Multi-Type LLM Integration: Supports mainstream domestic and international LLM, Embed, Rerank, and OCR models, meeting AI scenario requirements for Q&A, retrieval, document understanding, and more. Supports both OpenAI Compatible and Anthropic Compatible vendor protocol types, enabling mixed deployment of different protocol models under the same API gateway.
Built-in Platform Tools: Out-of-the-box operations and general-purpose tools covering cluster management, CI/CD, inspection, office automation, and other scenarios, ready to use without additional configuration.

Models

OpsPilot supports integration with various mainstream large language models and industry models. This chapter introduces the different model types, capability characteristics, and applicable scenarios to help users flexibly select the right model based on business requirements.

LLM Models

HuggingFace Series

QwQ (Alibaba Cloud Tongyi Qianwen Reasoning Model)

Core Function: Enhances mathematical, programming, and logical reasoning capabilities through reinforcement learning, able to analyze problems step by step using "chain of thought."
Advantages:
- Small yet powerful: With only 32 billion parameters (approximately 320 million "knowledge nodes"), its performance approaches the 670 billion parameter DeepSeek-R1 model.
- Multi-domain proficiency: Excels at math problem solving, code generation, and complex logical reasoning (such as contract clause analysis).
Applications:
- Mathematical reasoning: Solves complex mathematical problems, such as advanced algebraic equations, geometric proofs, etc.
- Programming assistance: Helps write code, debug code, and optimize code logic.
- General reasoning: Handles various problems requiring logical thinking, such as contract clause interpretation, logic puzzles, etc.

DeepSeek Series (High-Efficiency Domestic Models)

DeepSeek-R1:1.5b

Core Function: A lightweight large language model focused on the code domain, supporting Python, Java, and other programming languages.
Advantages:
- Low-configuration operation: Can run on mobile phones or ordinary computers (8GB RAM), minimal resource consumption.
- Fast code generation: Code completion, function generation, and error debugging 3-5x faster than manual work.
Applications:
- Code generation: Quickly generates code in various programming languages such as Python, Java, C++, etc.
- Code optimization: Optimizes existing code to improve runtime efficiency and readability.
- Code debugging: Checks for syntax errors and logic errors in code and provides correction suggestions.

OpenAI Series (World-Leading General Models)

1. GPT-3.5-Turbo-16K

Core Function: A general-purpose large language model with extended context processing capability, handling long texts (12,000 characters) with high cost-effectiveness.
Advantages:
- Multi-task adaptability: Capable of writing articles, translation, data analysis, customer service Q&A, and more.
- Cost-effective: Achieves a good balance between performance and cost, with lower usage costs compared to GPT-4.
Applications:
- Content creation platforms: Creators can use it to quickly generate article outlines and first drafts.
- Enterprise document processing: Used for content summarization and key information extraction from contracts, reports, etc.

2. GPT-4-32K

Core Function: A top-tier large language model with ultra-long context processing capability and powerful multimodal abilities. Handles ultra-long texts (24,000 characters) and supports multimodal input (text + images).
Advantages:
- Powerful comprehensive capabilities: Excels in language understanding, reasoning, creative generation.
- Multimodal fusion: Enables text-image fusion interaction and expands application scenarios.
Applications:
- Ultra-long text analysis: Processes text up to 32K tokens for deep analysis and understanding.
- Multimodal interaction: Supports text and image input/output.
- Complex problem solving: Solves advanced mathematical and specialized technical problems.

3. GPT-4o (Multimodal All-in-One Model)

Core Function: A flagship multimodal model supporting mixed input/output of text, audio, images, and video, focused on "all-in-one interaction" capabilities.
Advantages:
- Speed & cost innovation: 200% faster processing speed, 50% lower price, 5x higher rate limits.
- Multimodal capability breakthrough: Precisely identifies image and video details, supports direct voice interaction with emotion recognition.
- Enhanced reasoning: Sets new high scores in MMLU evaluations.
Applications:
- Real-time multimodal reasoning: Simultaneously performs real-time reasoning on audio, visual, and text data.
- Cross-language translation: Handles 50 different languages.
- Complex task handling: Strong text, reasoning, and coding capabilities.

LLM Model Summary

In addition to the models described above, OpsPilot currently supports the following full list of LLMs:

Category	Model Icon	Model ID	Model Name	Model Type
Baichuan	Baichuan	baichuan-2	Baichuan 2	Text
Baichuan	Baichuan	baichuan-2-chat	Baichuan 2 Chat	Text
CodeLlama	MetaAI	code-llama	CodeLlama	Code
CodeLlama	MetaAI	code-llama-instruct	CodeLlama Instruct	Code
CodeLlama	MetaAI	code-llama-python	CodeLlama Python	Code
CodeGeeX	CodeGeeX	codegeex4	CodeGeeX4	Code
CodeQwen	Alibaba	codeqwen1.5	CodeQwen1.5	Code
CodeQwen	Alibaba	codeqwen1.5-chat	CodeQwen1.5 Chat	Code
CodeShell		codeshell	CodeShell	Code
CodeShell		codeshell-chat	CodeShell Chat	Code
Codestral	Mistral	codestral-v0.1	Codestral v0.1	Code
CogAgent	Zhipu	cogagent	CogAgent	Reasoning Enhanced
DeepSeek	DeepSeek	deepseek	DeepSeek	Reasoning Enhanced
DeepSeek	DeepSeek	deepseek-chat	DeepSeek Chat	Reasoning Enhanced
DeepSeek	DeepSeek	deepseek-coder	DeepSeek Coder	Code
DeepSeek	DeepSeek	deepseek-coder-instruct	DeepSeek Coder Instruct	Code
DeepSeek	DeepSeek	deepseek-prover-v2	DeepSeek Prover v2	Reasoning Enhanced
DeepSeek	DeepSeek	deepseek-r1	DeepSeek R1	Reasoning Enhanced
DeepSeek	DeepSeek	deepseek-r1-0528	DeepSeek R1 0528	Reasoning Enhanced
DeepSeek	DeepSeek	deepseek-r1-0528-qwen3	DeepSeek R1 0528 Qwen3	Reasoning Enhanced
DeepSeek	DeepSeek	deepseek-r1-distill-llama	DeepSeek R1 Distill Llama	Reasoning Enhanced
DeepSeek	DeepSeek	deepseek-r1-distill-qwen	DeepSeek R1 Distill Qwen	Reasoning Enhanced
DeepSeek	DeepSeek	deepseek-v2-chat	DeepSeek V2 Chat	Reasoning Enhanced
DeepSeek	DeepSeek	deepseek-v2-chat-0628	DeepSeek V2 Chat 0628	Reasoning Enhanced
DeepSeek	DeepSeek	deepseek-v2.5	DeepSeek V2.5	Reasoning Enhanced
DeepSeek	DeepSeek	deepseek-v3	DeepSeek V3	Reasoning Enhanced
DeepSeek	DeepSeek	deepseek-v3-0324	DeepSeek V3 0324	Reasoning Enhanced
DeepSeek	DeepSeek	deepseek-vl2	DeepSeek VL2	Multimodal
DianJin		DianJin-R1	DianJin R1	Reasoning Enhanced
ERNIE	Baidu	Ernie4.5	ERNIE 4.5	Text
FinLLM		fin-r1	Fin R1	Reasoning Enhanced
Gemma	Gemma	gemma-3-1b-it	Gemma 3 1B IT	Text
Gemma	Gemma	gemma-3-it	Gemma 3 IT	Text
GLM (Zhipu)	ChatGLM	glm-4.1v-thinking	GLM 4.1V Thinking	Reasoning Enhanced
GLM (Zhipu)	ChatGLM	glm-4.5	GLM 4.5	Text
GLM (Zhipu)	ChatGLM	glm-4v	GLM 4V	Text
GLM (Zhipu)	ChatGLM	glm-edge-chat	GLM Edge Chat	Text
GLM (Zhipu)	ChatGLM	glm4-0414	GLM4 0414	Text
GLM (Zhipu)	ChatGLM	glm4-chat	GLM4 Chat	Text
GLM (Zhipu)	ChatGLM	glm4-chat-1m	GLM4 Chat 1M	Text
Gorilla		gorilla-openfunctions-v2	Gorilla OpenFunctions v2	Reasoning Enhanced
OpenAI	GPT	gpt-2	GPT-2	Text
LLaMA	Meta	llama-2	LLaMA 2	Text
LLaMA	Meta	llama-3	LLaMA 3	Text
LLaMA	Meta	llama-3.1	LLaMA 3.1	Text
LLaMA	Meta	llama-3.2-vision	LLaMA 3.2 Vision	Multimodal
LLaMA	Meta	llama-3.3-instruct	LLaMA 3.3 Instruct	Text
Mistral	Mistral	mistral-instruct-v0.3	Mistral Instruct v0.3	Text
Mistral	Mistral	mistral-large-instruct	Mistral Large Instruct	Text
Mixtral	Mistral	mixtral-8x22B-instruct-v0.1	Mixtral 8x22B Instruct v0.1	Text
Qwen	Qwen	qwen2.5-instruct	Qwen2.5 Instruct	Text
Qwen	Qwen	qwen2.5-coder	Qwen2.5 Coder	Code
Qwen	Qwen	qwen3	Qwen3	Text
Qwen	Qwen	Qwen3-Coder	Qwen3 Coder	Code
Qwen	Qwen	Qwen3-Thinking	Qwen3 Thinking	Reasoning Enhanced
Yi	Yi	Yi-1.5-chat	Yi 1.5 Chat	Text

For the complete model list, please refer to the Chinese documentation.

Embed Models

FastEmbed (bge-small-zh-v1.5)

A lightweight Chinese semantic embedding model focused on converting Chinese text into computer-understandable "digital encodings," efficiently capturing semantic similarity.

Lightweight & Fast: Only 95MB in size, runs on ordinary computers with millisecond-level response.
Chinese Optimized: Deeply optimized for Chinese semantics, accurately recognizing idioms, internet slang, and professional terminology.
Low-Cost & Easy to Use: Open-source and free, requires no complex configuration.

BCEmbedding (bce-embedding-base_v1)

A Chinese-English bilingual semantic bridge model that supports bidirectional semantic conversion between Chinese and English text.

Bilingual Precise Alignment: Based on Youdao translation technology, cross-language retrieval accuracy improved by 40%.
Long Text Processing: Supports overall semantic encoding of documents with tens of thousands of characters.
Two-Stage Retrieval: First quickly filters semantically related texts, then precisely ranks results.

Rerank Models

BCEReranker (bce-reranker-base_v1)

A cross-encoder-based Chinese-English-Japanese-Korean multilingual reranking model focused on fine-grained reranking of initial retrieval results.

Multilingual Cross-Domain Adaptation: Supports Chinese, English, Japanese, and Korean.
Long Text Precise Ranking: Breaks through the traditional 512-token input limit.
Absolute Score Filtering: Outputs quantifiable "absolute relevance scores" (recommended threshold 0.35-0.4).
RAG Deep Optimization: Achieves SOTA performance in LlamaIndex evaluations when combined with BCEmbedding.

OCR Models

OlmOCR

A document parsing tool based on large language models, focused on converting PDFs and images into editable text with support for tables, formulas, and handwritten content.

AzureOCR

Microsoft Azure cloud OCR tool supporting multilingual text recognition with Form Recognizer integration for extracting structured data from documents.

PaddleOCR

An open-source OCR toolkit by Baidu providing end-to-end text detection, recognition, and direction classification supporting 80+ languages.

Model	Advantages	Applications
OlmOCR	Multimodal fusion, complex layout parsing, structured output	Academic documents, handwritten archives, PDF digitization
AzureOCR	Cloud service integration, 50+ languages, security compliance	Enterprise forms, contract parsing, real-time video text recognition
PaddleOCR	Lightweight open-source, 80+ languages (Chinese optimized), edge-cloud deployment	Mobile recognition, batch document processing, cross-language scenarios

Tools

OpsPilot ships with a set of out-of-the-box tools covering databases, container orchestration, CI/CD, retrieval, script execution, and file generation. Agents can enable these tools directly so the LLM can "reach out" and interact with your existing enterprise systems. The table below is the full inventory of the current built-in tools (some require connection and authentication parameters when created).

Built-in Tool Overview

Category	Tool ID	Description	Connection/Auth Params Required
File	`attachment_file`	Generates downloadable attachments from conversation context	No
Browser	`agent_browser`	Agent-driven browser operations	No
Browser	`browser_use`	Browser automation for navigation and scraping	No
Time	`current_time`	Gets the current system time	No
Search	`duckduckgo`	DuckDuckGo web search	No
Retrieval	`elasticsearch`	Elasticsearch index queries	Yes
Network	`fetch`	Fetches the content of a given URL	No
Dev	`github`	GitHub repository and issue queries	No
CI/CD	`jenkins`	Jenkins build job queries and triggering	Yes
Container	`kubernetes`	Kubernetes cluster resource operations	Yes
Container	`kubernetes_data_collection`	Kubernetes data collection	Yes
Database	`mssql`	SQL Server queries	Yes
Database	`mysql`	MySQL queries	Yes
Database	`oracle`	Oracle queries	Yes
Database	`postgres`	PostgreSQL queries	Yes
Script	`python`	Executes Python code	No
Database	`redis`	Redis read/write	Yes
Script	`shell`	Executes Shell commands	No
Remote	`ssh`	Executes commands on a remote host over SSH	No

Notes:

These tools are registered by the platform's tool loader. In addition, the platform separately wires up a monitor built-in tool to integrate with the Monitoring Center.

The attachment generation tool (attachment_file) currently supports only Markdown, PDF, and Word (docx) at the persistence layer; other types are rejected.

Tools marked as requiring connection/auth parameters need host, port, account, password, and similar constructor parameters at creation time; password-type parameters are stored encrypted.

The following sections use Kubernetes and Jenkins as examples of typical operations tools.

Operations Tools

Kubernetes Tools

A suite of operations and management tools for Kubernetes clusters, covering cluster resource queries, status monitoring, fault diagnosis, and configuration management.

CI/CD Pipeline Full-Chain Management: Task visual management and automated triggering with parameterized builds.
Team Collaboration & Process Auditing: Build task tracing and multi-tool integration support.
Fault Recovery & Emergency Handling: Batch task operations and canary deployment support.

Jenkins

A collection of functions for interacting with the Jenkins CI/CD platform, providing the ability to query and operate build tasks on Jenkins servers.

Fine-Grained Cluster Resource Management: Multi-dimensional resource queries and resource optimization & cleanup.
Full-Chain Fault Diagnosis & Monitoring: Quick abnormal resource identification and log & configuration linked analysis.
Automated Operations & Standardization: Batch operation support and configuration standardization management.
Performance & Stability Assurance: Node health monitoring and container runtime analysis.

General Tools

General tools

get_current_time

Function: Retrieves the current system time in real-time (accurate to seconds/milliseconds).
Use Cases: Logging, task scheduling, and report generation.

Search Tools

duckduckgo_search

Function: Initiates web queries through the DuckDuckGo search engine, returning structured search results.
Use Cases: Information retrieval and data collection.

Models​

LLM Models​

HuggingFace Series​

QwQ (Alibaba Cloud Tongyi Qianwen Reasoning Model)​

DeepSeek Series (High-Efficiency Domestic Models)​

DeepSeek-R1:1.5b​

OpenAI Series (World-Leading General Models)​

1. GPT-3.5-Turbo-16K​

2. GPT-4-32K​

3. GPT-4o (Multimodal All-in-One Model)​

LLM Model Summary​

Embed Models​

FastEmbed (bge-small-zh-v1.5)​

BCEmbedding (bce-embedding-base_v1)​

Rerank Models​

BCEReranker (bce-reranker-base_v1)​

OCR Models​

OlmOCR​

AzureOCR​

PaddleOCR​

Tools​

Built-in Tool Overview​

Operations Tools​

Kubernetes Tools​

Jenkins​

General Tools​

General tools​

get_current_time​

Search Tools​

duckduckgo_search​