ml-commons provides a set of common machine learning algorithms, such as k-means and linear regression, to help developers build ML-related features within OpenSearch.


OpenSearch Machine Learning Commons

Machine Learning Commons for OpenSearch is a new solution that makes it easy to develop new machine learning features. It allows engineers to leverage existing open-source machine learning algorithms and reduces the effort needed to build a new machine learning feature. It also removes the need for engineers to manage machine learning tasks themselves, which helps speed up feature development.

Problem Statement

Until now, building a new machine learning feature inside OpenSearch has been a significant challenge. The reasons include:

  • Disruption to OpenSearch Core features. Machine learning is computationally intensive, but there is currently no way to add dedicated computation resources in OpenSearch for machine learning jobs. These jobs therefore have to share resources with Core features such as indexing and searching, which can increase search request latency and trigger circuit breaker exceptions on memory usage. To address this, we have to carefully distribute models and limit the data size when running the anomaly detection (AD) job. As more ML features are added to OpenSearch, this will become much harder to manage.
  • Lack of support for machine learning algorithms. Customers need more algorithms within OpenSearch; otherwise the data must first be exported outside of OpenSearch, for example to S3, to do the job, which adds cost and latency.
  • Lack of a resource management mechanism across multiple machine learning jobs. It is hard to coordinate resources between multiple features.

Meanwhile, we see more and more machine learning features that need to be supported in OpenSearch to power end users' business needs. For instance:

  • Forecasting: Forecasting is very popular in time series data analysis. Although past data isn't always an indicator of the future, it's still a very powerful tool in some use cases, such as capacity planning to scale service hosts up or down in IT operations.
  • Root Cause Analysis in DevOps: Today some customers use OpenSearch for IT operations. Identifying the root cause of an outage or incident is becoming more and more complicated, since it requires gathering all the information in the ecosystem, such as logs, traces, and metrics. Machine learning is a great fit for this problem: it can build topology models of the system automatically and help understand the similarity and causal relations between events.
  • Machine Learning in SIEM: SIEM (Security Information and Event Management) is another domain in OpenSearch. Machine learning is also very useful in SIEM: it can facilitate security analytics, reduce the effort spent on sophisticated tasks, enable real-time threat analysis, and uncover anomalies.

Solution

The solution is to introduce a new Machine Learning library inside the OpenSearch cluster. The major functionalities in this solution include:

  • Unified Client Interfaces: clients can use common interfaces for training and inference tasks, and then follow the algorithm interface to supply the right input parameters, such as input data and hyperparameters. A client library will be built for ease of use.
  • ML Plugin: the ML plugin will initiate the ML nodes, choose the right nodes and allocate resources for each request, manage machine learning tasks with monitoring and failure handling support, and store the model results. It will be the bridge for communication between the OpenSearch process and the ML engine.
  • ML Engine: this engine will host the ML algorithms. Java-based machine learning algorithms will be supported in the first release.
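
The idea of a unified client interface, where every algorithm is trained and queried through the same two calls and differs only in its input data and hyperparameters, can be illustrated with a minimal Java sketch. All names here (`UnifiedClientSketch`, `Algorithm`, `MeanDistance`) are hypothetical and chosen for illustration; they are not ml-commons classes, and the toy algorithm is not one the project ships.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types come from ml-commons.
public class UnifiedClientSketch {

    /** Common contract: every algorithm exposes the same train and predict calls. */
    public interface Algorithm<M> {
        M train(List<double[]> data, Map<String, Object> hyperparameters);

        double predict(M model, double[] point);
    }

    /**
     * Toy algorithm used only to exercise the interface: the model is the
     * per-dimension mean of the training data, and a prediction is the
     * Euclidean distance from a point to that mean.
     */
    public static final class MeanDistance implements Algorithm<double[]> {
        @Override
        public double[] train(List<double[]> data, Map<String, Object> hyperparameters) {
            if (data.isEmpty()) {
                throw new IllegalArgumentException("training data must not be empty");
            }
            // This toy algorithm has no tunable settings, so hyperparameters are ignored.
            int dims = data.get(0).length;
            double[] mean = new double[dims];
            for (double[] row : data) {
                for (int i = 0; i < dims; i++) {
                    mean[i] += row[i] / data.size();
                }
            }
            return mean;
        }

        @Override
        public double predict(double[] model, double[] point) {
            double sumOfSquares = 0.0;
            for (int i = 0; i < model.length; i++) {
                double diff = point[i] - model[i];
                sumOfSquares += diff * diff;
            }
            return Math.sqrt(sumOfSquares);
        }
    }

    public static void main(String[] args) {
        // The caller depends only on the Algorithm interface, not on the concrete class.
        Algorithm<double[]> algorithm = new MeanDistance();
        List<double[]> data = List.of(new double[] {0, 0}, new double[] {2, 2});
        double[] model = algorithm.train(data, Map.of());
        System.out.println("distance = " + algorithm.predict(model, new double[] {4, 5}));
    }
}
```

Because the caller holds only an `Algorithm` reference, swapping in a different algorithm would change the constructor call and the hyperparameter map but not the training or inference code around it.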

This solution makes it easy to develop new machine learning features. It allows engineers to leverage existing open-source machine learning algorithms and reduces the effort needed to build a new machine learning feature. It also removes the need for engineers to manage machine learning tasks themselves, which helps speed up feature development.

How to use it for new feature development

See How to add new function.

Contributing

See developer guide and how to contribute to this project.

Code of Conduct

This project has adopted the Amazon Open Source Code of Conduct. For more information see the Code of Conduct FAQ, or contact opensource-codeofconduct@amazon.com with any additional questions or comments.

Security

If you discover a potential security issue in this project we ask that you notify OpenSearch Security directly via email to security@opensearch.org. Please do not create a public GitHub issue.

License

This project is licensed under the Apache v2.0 License.

Copyright 2020-2021 Amazon.com, Inc. or its affiliates. All Rights Reserved.