Real-Time Machine Learning



To its full potential, with an architecture built from the ground up for it.


Online Inference

For more accurate results at blazingly fast speeds.


Continual Learning

To adapt to continuous changes in underlying data distribution.


What does TurboML do? How does it help my business?

We provide a platform for the complete lifecycle of real-time machine learning, from development to deployment, maintenance, and monitoring. We keep your models and features fresh with the latest production data, and we greatly reduce both iteration time and time to value.

Our platform aims to reduce the barrier to entry for streaming, and as such, the interface is designed to be convenient, and minimize the learning curve for data scientists and ML engineers. With our platform, you can quickly experiment and test new features/models directly on live data through shadow deployments.

How do I get my data in?

We support both Pull and Push-based ingestion.

Pull-based ingestion: No matter where you have your data, be it in data lakes like Snowflake or Databricks, or cloud storage solutions like S3, or databases like RDS or MongoDB, or even streaming sources like Kafka or Kinesis, we have readily available connectors that can connect with your data source, and continuously ingest data from there.

Push-based ingestion: We have REST API endpoints, as well as performant client SDKs in multiple languages if you want to push the data to us on a per-event basis.

Support for handling sensitive data (e.g. masking) is present in both methods. Moreover, this can be hosted in your own cloud as well.
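As a concrete illustration of push-based ingestion with client-side masking, here is a minimal Python sketch. The endpoint URL and event schema are assumptions for illustration, not TurboML's documented API:

```python
import json

# Hypothetical ingestion endpoint -- adjust to your actual deployment.
INGEST_URL = "https://api.example.com/v1/datasets/transactions/ingest"

def build_event(transaction_id, amount, card_number):
    """Build one event, masking the sensitive card number before it leaves the client."""
    return {
        "transaction_id": transaction_id,
        "amount": amount,
        # Mask all but the last four digits of the card number.
        "card_number": "*" * (len(card_number) - 4) + card_number[-4:],
    }

def push_event(event):
    """POST a single event to the ingestion endpoint (needs network, so not run here)."""
    import urllib.request
    req = urllib.request.Request(
        INGEST_URL,
        data=json.dumps(event).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return urllib.request.urlopen(req)

event = build_event("txn-001", 42.50, "4111111111111111")
print(event["card_number"])  # ************1111
```

In practice the client SDKs would handle serialization and retries for you; the point here is that masking can happen per event, before the data is pushed.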

Can I use multiple data sources in the same ML model?

We've got you covered! Similar to feature platforms like Tecton, you can define features for different data sources that the model accesses by a key (such as an entity id). We also provide an interface for streaming join operations to express more complex cross-source relationships.
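To make the idea concrete, here is an illustrative sketch of a key-based join between two event streams in plain Python. This is a simplified stand-in for the concept, not TurboML's actual join interface:

```python
from collections import defaultdict

def stream_join(left_events, right_events, key="entity_id"):
    """Join two event streams on `key`, emitting one merged record per matching pair."""
    left_by_key = defaultdict(list)
    for left in left_events:
        left_by_key[left[key]].append(left)
    joined = []
    for right in right_events:
        # Merge each right-side event with every buffered left-side event sharing its key.
        for left in left_by_key.get(right[key], []):
            joined.append({**left, **right})
    return joined

clicks = [{"entity_id": "u1", "page": "home"}]
purchases = [{"entity_id": "u1", "amount": 30.0}]
print(stream_join(clicks, purchases))
# [{'entity_id': 'u1', 'page': 'home', 'amount': 30.0}]
```

A real streaming join additionally bounds the buffered state with windows and watermarks; the platform handles those details for you.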

What is the interface available for my data scientists for feature engineering?

We provide the flexibility of SQL and Python to define features. We also provide a Jupyter environment to supplement the Python interface and make it easy for you to experiment with features. Additionally, we provide optimized implementations of some of the most common feature transformations, including complex time-based aggregations, through a simple no-code UI.
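As a flavor of what SQL and Python feature definitions can look like, here is a hedged sketch. The registration mechanics are omitted and the table and column names are hypothetical:

```python
# A SQL feature: 1-hour rolling spend per account over a hypothetical
# `transactions` stream.
ROLLING_SPEND_SQL = """
SELECT account_id,
       SUM(amount) OVER (
           PARTITION BY account_id
           ORDER BY event_time
           RANGE BETWEEN INTERVAL '1' HOUR PRECEDING AND CURRENT ROW
       ) AS spend_1h
FROM transactions
"""

# A Python feature: how large is this amount relative to the account's running mean?
def amount_vs_mean(amount, running_mean):
    """Ratio of the current amount to the historical mean (1.0 when there is no history)."""
    if running_mean == 0:
        return 1.0
    return amount / running_mean

print(amount_vs_mean(150.0, 50.0))  # 3.0
```

Features like these would then be computed continuously over the live stream rather than re-run as batch jobs.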

I don’t want to use your platform for feature engineering. I have my own feature store. Can you work with that?

Absolutely! If you already have a feature store that can serve online features, say through Redis or DynamoDB, you can use that instead to get features on our platform.

How is the ML modeling workflow defined?

We provide our own ML algorithms that are effective in online and continual learning scenarios, with implementations optimized for streaming, real-time use cases, providing latencies as low as tens of microseconds. These algorithms come from our own research as well as from the community around real-time ML.

What are the benefits of using your models?

Our models are optimized so that, along with blazingly fast inference, they dynamically adapt to drifting data distributions in production. This can be automatically managed for you, or, if you prefer, you can define triggers for re-training the models based on performance, volume, time, or drift.
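To illustrate what a performance-based trigger amounts to, here is a simplified sketch of a rolling-accuracy trigger in plain Python. It is an illustration of the idea, not the platform's trigger configuration:

```python
from collections import deque

class AccuracyTrigger:
    """Fire a retraining signal when rolling accuracy drops below a threshold."""

    def __init__(self, window=100, threshold=0.9):
        self.window = deque(maxlen=window)  # recent correctness flags
        self.threshold = threshold

    def observe(self, prediction, label):
        """Record one labeled prediction; return True when retraining should fire."""
        self.window.append(prediction == label)
        accuracy = sum(self.window) / len(self.window)
        # Only fire once the window is full, to avoid noisy early readings.
        return len(self.window) == self.window.maxlen and accuracy < self.threshold

trigger = AccuracyTrigger(window=4, threshold=0.75)
fired = [trigger.observe(p, l) for p, l in [(1, 1), (1, 0), (0, 1), (0, 0)]]
print(fired)  # [False, False, False, True]
```

Volume-, time-, and drift-based triggers follow the same shape: a streaming statistic compared against a configured condition.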

Additionally, our model implementations are explainable, i.e., we can provide an explanation for each prediction/decision made by the model.

I don’t want to use your ML models. I want to bring my own. What’s the support for that?

We use the ONNX format to support bringing your own models, including popular ML frameworks like PyTorch, TensorFlow, Keras, and Scikit-Learn. A complete list of supported frameworks can be found here.

How do I consume the output from the models?

There are several ways in which you can consume model outputs.

If your downstream application is event-driven, we push all model outputs to a Kafka topic, which can be subscribed to by your downstream application or directly ingested into another data source. You can register webhooks with your own custom logic for the outputs. We also provide APIs to include the outputs in your existing monitoring/ops dashboards, e.g. Grafana. If your use-case is not covered, please reach out to us.

If your downstream application is request-response based, we expose production-ready model-serving REST API endpoints that can be used for high-throughput, low-latency inference.

We also support an OLAP interface to run real-time analytical queries on the outputs.
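For the event-driven path, a downstream consumer's job is typically to parse each output message and act on it. The sketch below shows that handler logic; the message schema (`prediction`, `score` fields) is an assumption for illustration:

```python
import json

def handle_output(raw_message, alert_threshold=0.9):
    """Parse one model-output message and decide whether it warrants an alert."""
    output = json.loads(raw_message)
    return output["score"] >= alert_threshold

# With a real Kafka client (e.g. confluent-kafka), this would sit inside the poll loop:
#   for msg in consumer:
#       if handle_output(msg.value()):
#           send_alert(msg.value())

print(handle_output('{"prediction": "fraud", "score": 0.97}'))  # True
```

For the request-response path, the same outputs arrive as the JSON body of the REST endpoint's response instead of as a stream.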

I also have access to labeled data. How can that be used?

Access to labels has two-fold benefits for your real-time ML workflows. First, you can evaluate your models in real time. This enables interesting deployment models, including champion-challenger, where you can monitor multiple models in production and dynamically select the best-performing one. You can also use bandits to intelligently route incoming inference requests based on finer-grained model performance (for instance, performance for a specific demographic). Second, labels provide feedback that guides your models to be more accurate on the latest data!
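As a sketch of what bandit-based routing means, here is a simplified epsilon-greedy router over two deployed models. It is a conceptual stand-in, not the platform's routing implementation:

```python
import random

class EpsilonGreedyRouter:
    """Route requests to the best-observed model, exploring occasionally."""

    def __init__(self, models, epsilon=0.1, seed=0):
        self.rewards = {m: 0.0 for m in models}  # cumulative reward per model
        self.counts = {m: 0 for m in models}     # labeled observations per model
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def route(self):
        """Pick a model: explore at random with probability epsilon, else exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.rewards))
        return max(self.rewards, key=lambda m: self.rewards[m] / max(self.counts[m], 1))

    def feedback(self, model, reward):
        """Update a model's observed reward once its label arrives."""
        self.rewards[model] += reward
        self.counts[model] += 1

router = EpsilonGreedyRouter(["champion", "challenger"], epsilon=0.0)
router.feedback("champion", 1.0)
router.feedback("challenger", 0.0)
print(router.route())  # 'champion'
```

Finer-grained routing, such as per-demographic performance, amounts to keeping one such reward table per segment.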

© TurboML, Inc.