Algorithm Configuration Management

Algorithm Configuration is the core extension mechanism of MLOps. Administrators can maintain each scenario's list of available algorithms, the training and inference images, and the hyperparameter forms at runtime, without modifying code or restarting the service, so algorithms can go live, be adjusted, or be taken offline on demand.

Role of Algorithm Configuration

Each algorithm configuration record contains the following key fields:

| Field | Description |
| --- | --- |
| algorithm_type | The scenario to which the algorithm belongs, such as anomaly_detection, timeseries_predict, etc. |
| name | Algorithm unique identifier, such as ECOD, Prophet; cannot be duplicated within the same scenario |
| display_name | Algorithm name displayed on the frontend |
| scenario_description | Description of the scenario the algorithm applies to, shown in the algorithm selection guide interface |
| image | Docker image address shared by training and inference |
| form_config | JSON configuration for the frontend dynamic form, defining the hyperparameter input interface |
| is_active | Whether enabled; when disabled, it no longer appears in the algorithm selection list |
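
To make the field semantics concrete, here is a minimal in-memory sketch of a configuration record. It is not the platform's actual model: the class name `AlgorithmConfigRecord`, the helper functions, and the `example.invalid` image addresses are hypothetical. It only illustrates two rules from the table: `name` is unique within a scenario, and only records with `is_active` set appear in the selection list.

```python
from dataclasses import dataclass, field


@dataclass
class AlgorithmConfigRecord:
    """Hypothetical stand-in for one algorithm configuration record."""
    algorithm_type: str            # scenario, e.g. "anomaly_detection"
    name: str                      # unique within the scenario
    display_name: str              # shown on the frontend
    image: str                     # shared by training and inference
    scenario_description: str = ""
    form_config: dict = field(default_factory=dict)
    is_active: bool = True


def add_record(records, new):
    """Append `new`, rejecting a duplicate name within the same scenario."""
    for existing in records:
        if (existing.algorithm_type, existing.name) == (new.algorithm_type, new.name):
            raise ValueError(f"{new.name!r} already exists in {new.algorithm_type!r}")
    records.append(new)
    return records


def selectable(records, algorithm_type):
    """Names offered in the selection list: active records of one scenario."""
    return [
        r.name for r in records
        if r.algorithm_type == algorithm_type and r.is_active
    ]
```

The same `name` may be reused in a different scenario, because uniqueness is only checked per `algorithm_type`.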

The training image for training tasks and the inference image for inference services are both dynamically read from the algorithm configuration. If no corresponding configuration is found in the database, the system falls back to the default built-in image for each scenario.
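
The fallback described above can be sketched as a simple lookup. This is an illustration under stated assumptions, not the platform's code: `resolve_image`, the dictionary-based records, and the `DEFAULT_IMAGES` values are hypothetical, and the real built-in default images are not listed in this document.

```python
# Hypothetical per-scenario defaults; the real built-in images are not
# documented here, so these addresses are placeholders.
DEFAULT_IMAGES = {
    "anomaly_detection": "example.invalid/default-anomaly:latest",
    "timeseries_predict": "example.invalid/default-timeseries:latest",
}


def resolve_image(configs, algorithm_type, name):
    """Return the configured image, or the scenario default if no record matches.

    Whether a disabled record still supplies its image is not specified in
    the documentation; this sketch matches on scenario and name only.
    """
    for cfg in configs:
        if cfg["algorithm_type"] == algorithm_type and cfg["name"] == name:
            return cfg["image"]
    return DEFAULT_IMAGES[algorithm_type]
```

Because training tasks and inference services share one `image` field, both would call the same lookup.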

Built-in Algorithm Presets

The system provides 10 built-in algorithm presets across 6 scenarios, listed in the tables below, that can be initialized in one go:

cd server
uv run python manage.py init_algorithm_config

Anomaly Detection Scenario (anomaly_detection)

| Algorithm | Description |
| --- | --- |
| ECOD | Suitable for operational metrics monitoring, detecting anomalous points in multi-dimensional data, such as sudden spikes in system resource usage or abnormal business traffic fluctuations |
| EWMA | Suitable for progressive univariate time series anomaly detection, such as CPU/memory continuously climbing or interface latency gradually degrading from baseline |
| PELT | Suitable for time series changepoint/state transition detection, identifying sudden performance baseline changes caused by releases, configuration changes, or traffic switches |

Time Series Prediction Scenario (timeseries_predict)

| Algorithm | Description |
| --- | --- |
| Prophet | Facebook open-source time series prediction model, suitable for medium to long-term business metric forecasting with trend and seasonality |

Log Clustering Scenario (log_clustering)

| Algorithm | Description |
| --- | --- |
| Spell | Online log template mining algorithm, suitable for real-time clustering and template extraction from high-frequency streaming logs |

Text Classification Scenario (classification)

| Algorithm | Description |
| --- | --- |
| XGBoost | Ensemble learning (Gradient Boosting Tree), suitable for small to medium-sized text classification tasks with complete feature engineering |
| GradientBoosting | Scikit-learn gradient boosting classifier with stable training, suitable for small-scale precisely labeled corpora |
| RandomForest | Random forest with fast training and overfitting resistance, suitable for exploratory classification tasks in early-stage data exploration |

Image Classification Scenario (image_classification)

| Algorithm | Description |
| --- | --- |
| YOLOClassification | Image classification model based on YOLO architecture, using ImageFolder format for training data (one subdirectory per class) |

Object Detection Scenario (object_detection)

| Algorithm | Description |
| --- | --- |
| YOLODetection | Object detection model based on YOLO architecture, using YOLO annotation format for training data (.txt annotation files) |

Enabling and Disabling Algorithms

Disable Algorithm: Open the "Algorithm Configuration" list for the corresponding scenario and turn off "Enable" for the algorithm. Before disabling, the system checks whether any training task is currently using the algorithm. If one is, the operation is rejected, and you must wait for that task to complete or delete the task before disabling the algorithm.

Enable Algorithm: Simply turn "Enable" back on, and the algorithm reappears in the algorithm selection list for training tasks.

Delete Algorithm: Before deletion, the system likewise validates whether training tasks are using it. If so, deletion is rejected.
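
The in-use check shared by disabling and deletion can be sketched as follows. This is a hedged illustration only: the function names, the dictionary-based task records, and the rule that a task with `status == "completed"` no longer blocks the operation are assumptions inferred from the wording above, not the platform's actual implementation.

```python
class AlgorithmInUseError(Exception):
    """Raised when a training task still uses the algorithm."""


def blocking_tasks(tasks, config):
    """Training tasks that use this algorithm and have not completed.

    Assumption: each task carries `algorithm_type`, `algorithm`, and
    `status`, and a completed task no longer blocks the operation.
    """
    return [
        t for t in tasks
        if t["algorithm_type"] == config["algorithm_type"]
        and t["algorithm"] == config["name"]
        and t["status"] != "completed"
    ]


def _ensure_unused(tasks, config, action):
    """Reject `action` if any blocking task exists."""
    blocking = blocking_tasks(tasks, config)
    if blocking:
        ids = ", ".join(str(t["id"]) for t in blocking)
        raise AlgorithmInUseError(
            f"cannot {action} {config['name']}: used by task(s) {ids}"
        )


def disable_algorithm(config, tasks):
    """Hide the algorithm from the selection list if nothing is using it."""
    _ensure_unused(tasks, config, "disable")
    config["is_active"] = False


def delete_algorithm(configs, config, tasks):
    """Remove the record entirely, subject to the same check."""
    _ensure_unused(tasks, config, "delete")
    configs.remove(config)
```

Routing both operations through one check keeps the rejection rule identical for disabling and deletion.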

Adding Custom Algorithms

To integrate a custom algorithm, the following conditions must be met:

  1. Build Inference Service Image: The inference service in the image must listen on port 3000, accept POST /predict requests with request body format {"data": [...]}, and return response format {"success": true, "data": [...], "error": null}.
  2. Create Algorithm Configuration in Admin Interface: Fill in the algorithm identifier, image address, and dynamic form configuration.
  3. Fill in form_config: Write JSON configuration for the frontend dynamic form according to the AlgorithmConfig type definition. The system renders the corresponding hyperparameter form when creating new training tasks.
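
As an illustration of the inference protocol in step 1, here is a minimal service sketch using only the Python standard library. The `predict` function is a placeholder that echoes its input, and the error response shape (HTTP 400 with `success` set to false) is an assumption, since the documentation specifies only the success format.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(data):
    """Placeholder model: echoes each input item. Replace with real inference."""
    return list(data)


def handle_predict(raw_body):
    """Turn a raw request body into (status_code, response_dict)."""
    try:
        payload = json.loads(raw_body)
        data = payload["data"]
        if not isinstance(data, list):
            raise ValueError('"data" must be a list')
        return 200, {"success": True, "data": predict(data), "error": None}
    except (ValueError, KeyError, TypeError) as exc:
        # Assumed error shape; only the success shape is documented.
        return 400, {"success": False, "data": [], "error": str(exc)}


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, body = handle_predict(self.rfile.read(length))
        encoded = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(encoded)))
        self.end_headers()
        self.wfile.write(encoded)


def serve(port=3000):
    """Listen on the port the platform expects (3000)."""
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

Calling `serve()` from the container's entry point would start the listener. Keeping `handle_predict` separate from the HTTP handler makes the request/response contract testable without opening a socket.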

Note: The platform does not bind to any fixed training framework. The training entry point and inference protocol inside the image must be guaranteed by the image provider to be compatible with platform specifications.
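
Since compatibility is the image provider's responsibility, a small shape check can help before registering an image. The sketch below is a hypothetical helper, not a platform tool: `check_predict_response` takes an already-parsed response body from POST /predict and reports where it departs from the documented success format.

```python
def check_predict_response(body):
    """Return a list of problems; an empty list means the shape matches."""
    if not isinstance(body, dict):
        return ["response body is not a JSON object"]

    problems = []
    for key in ("success", "data", "error"):
        if key not in body:
            problems.append(f"missing key: {key}")

    if "success" in body and not isinstance(body["success"], bool):
        problems.append('"success" must be a boolean')

    # The documented success response is
    # {"success": true, "data": [...], "error": null}.
    if body.get("success") is True:
        if not isinstance(body.get("data"), list):
            problems.append('"data" must be a list on success')
        if body.get("error") is not None:
            problems.append('"error" must be null on success')
    return problems
```

A provider could run this against a response captured from a locally started container and treat any non-empty result as a failed smoke test.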