Skip to main content

Key Component of a Manufacturing Data Lakehouse

· 12 min read
Anthony Cavin
Data Scientist - ML/AI, Python, TypeScript

Manufacturing Data Lakehouse Building Block

In today's data landscape, flexibility in terms of performance, cost, and storage availability is at a premium. Data warehouses provide structured, analytical capabilities to process large amounts of data. Data lakes, on the other hand, are known for scalability, and the flexibility to handle vast amounts of unstructured data.

In recent years, however, demand has grown for a robust marriage of these two concepts, leveraging cloud-based technologies and advanced data processing frameworks to store massive volumes of data in raw form (data lake), while also supporting structured querying and analytics (data warehouse). This combined data solution is referred to as a data lakehouse.

The data lakehouse concept is particularly useful for manufacturing, as manufacturing requires fast processing of large amounts of data from numerous sources, including sensors (vibration, temperature, power), employee productivity data (files, spreadsheets, documents), logs, cameras, GPS, and more, creating an unstructured jumble of formats that is difficult to process in a traditional data warehouse.

A data lakehouse is an IT infrastructure that provides a unified solution to handle multiple data formats while still providing the capacity to make sense of this data. It supports intelligent query-based analytics and applies structure to the chaos.

At the same time, ReductStore Cloud Solution is a special type of data storage solution that combines the flexibility of a time series database with the capacity of an object storage. In this article, we will explain these strengths in detail and why we think ReductStore has many advantages as a building block to create a data lakehouse for manufacturing.

In order to present these strengths, we will tie our case to the core components of a strong data lakehouse solution.

How to Store and Manage Robotic Data

· 17 min read
Gracija Nikolovska
Software Developer - C#, Python, ROS

Introduction Diagram

Robots generate massive amounts of data that must be managed effectively. Challenges like limited on-device storage, the need for real-time processing, and high cloud storage costs make it essential to find efficient solutions. Balancing edge and cloud storage while keeping data synchronized is a key part of effective management.

This article begins by outlining these challenges and offering practical strategies, such as using time-series databases and implementing retention policies. We will then introduce ReductStore, a specialized database designed to meet the unique needs of robotic systems. With features like real-time ingestion, efficient querying with batching, smart retention policies, and edge-to-cloud replication, ReductStore offers a cost-effective and high-performance solution for storing and managing robotic data.

We’ll also explore a hand-on example where we’ll show how you can set up ReductStore and use it for storing and managing data. Finally, we will compare ReductStore with MongoDB, explaining why ReductStore is the better choice for robotics. This comprehensive guide is designed to help engineers and developers overcome the challenges of robotic data management and optimize their systems.

The MinIO alternative for Time-Series Based Data

· 10 min read
Anthony Cavin
Data Scientist - ML/AI, Python, TypeScript

MinIO vs ReductStore

The amount of data generated world-wide is expanding exponentially, and will only increase further in coming years. In fact, over 90% of the data worldwide has been generated in the last two years, and 40% of data in 2020 was generated by machines. Not to mention that 80 to 90 percent of data is unstructured. Not only is timely processing of said data ever more important, the data itself is often time-stamped and must be handled in a time-based structure. Due to the rise of AI/ML, Robotics, IoT, and edge-computing, solutions that can efficiently leverage much cheaper and plentiful unstructured object/blob storage while maintaining the ability to organize, read, and transmit time-series based data from multiple sources and in multiple formats are in great demand. ReductStore and MinIO are two solutions designed to meet this demand.