A custom data lake solution for a manufacturing company

Find out how Yalantis helped a manufacturing company solve issues with performance downtime and improve business visibility by efficiently bringing together a large amount of scattered supply chain data.

Industry: Manufacturing
Country: Europe
Team size: 5+
Implementation: 3 months

About the client

Our client is a manufacturer of electronic circuits with several plants in Europe and Asia. They needed to securely store data from multiple sources and easily access it for business intelligence and data analysis.

Business context

The client came to Yalantis to solve the following issues:

- Frequent and unexpected idle periods in production and difficulty defining their root causes due to the limited amount of data available for analysis
- Recurrent human errors due to manual collection and entry of business and production data
- Inability to keep track of business performance across facilities due to lack of integration between separate data management systems at different departments
- High risk of data corruption and unauthorized access caused by the lack of a unified security policy across employees’ workstations

Solution overview

We built a data lake architecture to provide centralized cloud-based storage and enable supply chain data analytics. The data lake can store and structure data in all formats (structured, semi-structured, and unstructured) and from all internal and external sources. Authorized company employees can easily and securely access this data for analysis.

Automating data collection

Using AWS services, our team set up a fully automated data flow from all sources in the company to a single source of truth to eliminate the need for manual data management. We ensured the collection of:
- IoT data on humidity, temperature, and heat in production facilities
- Production data, including the number of products produced per day, the number of errors and malfunctions, and downtime frequency
- Business data about vendors, suppliers, and clients, items in stock, and invoices
- Equipment logs on who used certain equipment in the facility and for how long, as well as all equipment maintenance activities
- External data such as employees’ timesheets, work schedules, payrolls from third-party Hubstaff software, and real-time data on material tracking and production planning from Katana software
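
As an illustration of what one automated flow from many sources can look like, here is a minimal Python sketch; it is not the project's actual code. It wraps a record from any source in a common envelope and sends envelopes in batches through the Firehose `put_record_batch` call. The envelope fields, the stream name `supply-chain-ingest`, and the helper names are all hypothetical, and the client is passed in so the code runs without AWS credentials.

```python
import json
from dataclasses import dataclass, asdict
from typing import Any, Iterable, Iterator

# Firehose accepts at most 500 records per PutRecordBatch call.
MAX_BATCH = 500


@dataclass
class Envelope:
    """Hypothetical common wrapper for a record from any source."""
    source: str        # e.g. "iot", "production", "hubstaff", "katana"
    plant: str         # facility identifier (assumed field)
    recorded_at: str   # ISO 8601 timestamp
    payload: dict[str, Any]

    def to_bytes(self) -> bytes:
        # Newline-delimited JSON keeps records separable after Firehose
        # concatenates them into a single S3 object.
        return (json.dumps(asdict(self), sort_keys=True) + "\n").encode("utf-8")


def chunked(items: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send(client, stream_name: str, envelopes: Iterable[Envelope]) -> int:
    """Send envelopes in batches; return how many records were rejected."""
    failed = 0
    records = [{"Data": e.to_bytes()} for e in envelopes]
    for batch in chunked(records, MAX_BATCH):
        response = client.put_record_batch(
            DeliveryStreamName=stream_name, Records=batch
        )
        failed += response.get("FailedPutCount", 0)
    return failed
```

With boto3 installed and credentials configured, the client would be `boto3.client("firehose")`. A production version would also retry the records that the response marks as failed.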

Enabling data transmission and storage

The Yalantis team built a data lake solution on Amazon S3, a scalable cloud storage service. Doing so involved:
- configuring Amazon Kinesis Data Firehose to transfer data in real time to Amazon S3 and then to the data lake, and configuring AWS DataSync to transfer all records from our client’s on-premises databases to the data lake
- setting up file exchange between our client’s on-premises legacy systems and the data lake with AWS Storage Gateway
- implementing the AWS Lake Formation tool to automatically extract, transform, and load raw supply chain data
- enabling AWS Lake Formation and AWS Glue to deduplicate records as well as match and partition data attributes from various sources
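
The deduplication and partitioning steps can be illustrated in plain Python. This is a conceptual sketch only: in the project these steps ran in AWS Lake Formation and AWS Glue, and exact-key comparison stands in here for whatever matching logic was actually configured. The record fields (`source`, `record_id`, `recorded_at`), the key layout, and the function names are assumptions made for the example.

```python
from datetime import datetime


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first occurrence of each (source, record_id) pair."""
    seen: set[tuple[str, str]] = set()
    unique = []
    for record in records:
        key = (record["source"], record["record_id"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def partition_prefix(record: dict, root: str = "raw") -> str:
    """Build a Hive-style S3 prefix so source and date act as partitions."""
    ts = datetime.fromisoformat(record["recorded_at"])
    return (
        f"{root}/source={record['source']}/"
        f"year={ts.year:04d}/month={ts.month:02d}/day={ts.day:02d}/"
    )


# Hypothetical sample: the second record repeats the first.
records = [
    {"source": "katana", "record_id": "a1", "recorded_at": "2024-03-05T08:15:00"},
    {"source": "katana", "record_id": "a1", "recorded_at": "2024-03-05T08:15:00"},
    {"source": "iot", "record_id": "a1", "recorded_at": "2024-03-05T08:16:00"},
]
unique = deduplicate(records)
prefixes = [partition_prefix(r) for r in unique]
```

Prefixes in `key=value` form are a common choice because query engines can skip whole partitions when a query filters on source or date.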

Cataloging and access

To help analysts quickly find and directly access the necessary data to analyze the root causes of downtime, we ensured that the solution properly grouped all gathered data. To achieve this, we:
- used AWS Lake Formation to create catalogs with specific datasets in the data lake
- implemented an AWS Glue crawler to examine all data received in the data lake and build queryable tables in the data catalogs, which also record the users who can access each dataset
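
The cataloging steps above might be wired up roughly as follows. This is a hedged sketch, not the delivered configuration: it builds the arguments for the Glue `create_crawler` and Lake Formation `grant_permissions` calls, and every name, ARN, and path in it is a placeholder invented for illustration. The clients are passed in so the code can be exercised without an AWS account.

```python
def crawler_request(name: str, role_arn: str, database: str, s3_path: str) -> dict:
    """Keyword arguments for the Glue `create_crawler` call."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def select_grant(principal_arn: str, database: str, table: str) -> dict:
    """Keyword arguments for the Lake Formation `grant_permissions` call."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        "Permissions": ["SELECT"],
    }


def start_cataloging(glue, crawler: dict) -> None:
    """Create the crawler and start its first run (the run is asynchronous)."""
    glue.create_crawler(**crawler)
    glue.start_crawler(Name=crawler["Name"])


def grant_read(lakeformation, grant: dict) -> None:
    """Grant read access; only valid once the crawler has created the table."""
    lakeformation.grant_permissions(**grant)


# Placeholder values, assumed for the example.
crawler = crawler_request(
    name="supply-chain-raw",
    role_arn="arn:aws:iam::123456789012:role/HypotheticalCrawlerRole",
    database="supply_chain",
    s3_path="s3://example-data-lake/raw/",
)
grant = select_grant(
    principal_arn="arn:aws:iam::123456789012:role/HypotheticalAnalystRole",
    database="supply_chain",
    table="downtime_events",
)
```

With boto3, `glue` and `lakeformation` would be `boto3.client("glue")` and `boto3.client("lakeformation")`.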

Data security

To ensure secure data access and retrieval, we combined server-side encryption and client-side encryption. The AWS Key Management Service helped us orchestrate the exchange of encryption keys. With the help of AWS Identity and Access Management (IAM), we defined user roles and policies that grant different permissions for processing and accessing data.
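
To show how role-based permissions and server-side encryption fit together, here is a minimal sketch under stated assumptions; it is not the client's actual policy set. It builds an IAM policy document for a hypothetical read-only analyst role and the arguments for an S3 `put_object` call that requests encryption with a KMS key. The bucket, prefix, and key ARN are placeholders, and client-side encryption is not shown.

```python
import json


def read_only_policy(bucket: str, prefix: str, kms_key_arn: str) -> dict:
    """IAM policy document: read objects under one prefix and decrypt them."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
            {
                "Effect": "Allow",
                "Action": ["kms:Decrypt"],
                "Resource": kms_key_arn,
            },
        ],
    }


def encrypted_put_args(bucket: str, key: str, body: bytes, kms_key_arn: str) -> dict:
    """Keyword arguments for S3 `put_object` using SSE-KMS."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_arn,
    }


# Placeholder identifiers, assumed for the example.
KEY_ARN = "arn:aws:kms:eu-west-1:123456789012:key/hypothetical-key-id"
policy = read_only_policy("example-data-lake", "curated/production/", KEY_ARN)
put_args = encrypted_put_args(
    "example-data-lake", "curated/production/day.json", b"{}", KEY_ARN
)
policy_json = json.dumps(policy, indent=2)
```

The policy grants no write actions, so a role holding only this policy can read and decrypt data under the prefix but cannot change it.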

Value delivered

As a result of our tight cooperation with the client, we delivered a full-fledged data lake solution that transformed our client’s daily business operations.

Before:

- Difficulty with promptly resolving unexpected performance downtime
- Decreased competitiveness due to production downtime
- Complex data analysis due to lack of a single data storage location
- Insecure storage of business and production data
- Manual handling of data assets
- Inefficient assessment of business performance due to scarce analytical data

After:

- Single source of truth for all data gathered in the client’s facilities
- Security and access management policies for all types of data
- Potential for multi-perspective data analysis with a single source of truth
- Simplified data search and access with accurate data cataloging
- Better root cause analysis of performance downtime
- Improved competitiveness with data-driven resolution of business challenges
