Skip to main content

Implementing Data Streaming in PyTorch from Remote DB

· 9 min read
Anthony Cavin
Data Scientist - ML/AI, Python, TypeScript

PyTorch Training Diagram

PyTorch training loop with data streaming from remote device

When training a model, we aim to process data in batches, shuffle data at each epoch to avoid over fitting, and leverage Python's multiprocessing for data fetching through multiple workers.

The reason that we want to use multiple workers is that GPUs are capable of handling large amounts of data concurrently; however, the bottleneck often lies in the time-consuming task of loading this data into the system.

Moreover, the challenge is even trickier when there is simply too much data to store the whole dataset on disk and we need to stream data from a remote database such as ReductStore.

In this blog post, we will go through a full example and setup a data stream to PyTorch from a playground dataset on a remote database.

Let's dig in!

Open-Source Alternatives to Landing AI

· 7 min read
Anthony Cavin
Data Scientist - ML/AI, Python, TypeScript

Photo by Luke Southern

Photo by Luke Southern

In the thriving world of IoT, integrating MLOps for Edge AI is important for creating intelligent, autonomous devices that are not only efficient but also trustworthy and manageable.

MLOps—or Machine Learning Operations—is a multidisciplinary field that mixes machine learning, data engineering, and DevOps to streamline the lifecycle of AI models.

In this field, important factors to consider are:

  • explainability, ensuring that decisions made by AI are interpretable by humans;
  • orchestration, which involves managing the various components of machine learning in production–at scale; and
  • reproducibility, guaranteeing consistent results across different environments or experiments.

Implementing AI for Real-Time Anomaly Detection in Images

· 8 min read
Anthony Cavin
Data Scientist - ML/AI, Python, TypeScript

Photo by Randy FathPhoto by Randy Fath on Unsplash

The journey of taking an open-source artificial intelligence (AI) model from a laboratory setting to real-world implementation can seem daunting. However, with the right understanding and approach, this transition becomes a manageable task.

This blog post aims to serve as a compass on this technical adventure. We'll demystify key concepts, and delve into practical steps for implementing anomaly detection models effectively in real-time scenarios.

Let's dive in and see how open-source models can be implemented in production, bridging the gap between research and practical applications.