A custom data lake solution for a manufacturing company
Find out how Yalantis helped a manufacturing company solve issues with performance downtime and improve business visibility by efficiently bringing together a large amount of scattered supply chain data.
- Industry: Manufacturing
- Country: Europe
- Team size: 5+
- Implementation: 3 months
About the client
Our client is a manufacturer of electronic circuits with several plants in Europe and Asia. They needed to securely store data from multiple sources and easily access it for business intelligence and data analysis.

Business context
The client came to Yalantis to solve the following issues:
- Frequent and unexpected idle periods in production and difficulty identifying their root causes due to the limited amount of data available for analysis
- Recurrent human errors due to manual collection and entry of business and production data
- Inability to keep track of business performance across facilities due to a lack of integration between separate data management systems in different departments
- High risk of data corruption and unauthorized access caused by the lack of a unified security policy across employees’ workstations
Solution overview
We built a data lake architecture to provide centralized cloud-based storage and enable supply chain data analytics. The data lake can store and structure data in all formats (structured, semi-structured, and unstructured) and from all internal and external sources. Authorized company employees can easily and securely access this data for analysis.
Automating data collection
Using AWS services, our team set up a fully automated data flow from all sources in the company to a single source of truth to eliminate the need for manual data management. We ensured the collection of:
- IoT data on humidity, temperature, and heat in production facilities
- Production data, including the number of products produced per day, the number of errors and malfunctions, and downtime frequency
- Business data about vendors, suppliers, and clients, items in stock, and invoices
- Equipment logs on who used certain equipment in the facility and for how long, as well as all equipment maintenance activities
- External data, such as employee timesheets, work schedules, and payroll records from third-party Hubstaff software, as well as real-time data on material tracking and production planning from Katana software
Enabling data transmission and storage
The Yalantis team built a data lake solution on Amazon S3, a scalable cloud storage service. Doing so involved:
- configuring Amazon Kinesis Data Firehose to deliver streaming data in real time to Amazon S3, where it lands in the data lake
- configuring AWS DataSync to transfer all records from our client’s on-premises databases to the data lake
- setting up file exchange between our client’s on-premises legacy systems and the data lake with AWS Storage Gateway
- implementing the AWS Lake Formation tool to automatically extract, transform, and load raw supply chain data
- enabling AWS Lake Formation and AWS Glue to deduplicate records as well as match and partition data attributes from various sources
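
As an illustration of the real-time part of this flow, the sketch below shows how IoT readings might be serialized and sent to a Firehose delivery stream in batches. It is a minimal example under stated assumptions, not the code used in the project: the field names, the stream name, and the helper functions are hypothetical. The only AWS call it relies on is `put_record_batch`, which accepts at most 500 records per request.

```python
import json
from typing import Any, Iterator

# PutRecordBatch accepts at most 500 records per call.
MAX_BATCH_SIZE = 500


def to_firehose_record(reading: dict[str, Any]) -> dict[str, bytes]:
    """Serialize one reading as a newline-terminated JSON record."""
    payload = json.dumps(reading, sort_keys=True) + "\n"
    return {"Data": payload.encode("utf-8")}


def batched(records: list, size: int = MAX_BATCH_SIZE) -> Iterator[list]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def send_readings(client, stream_name: str, readings: list[dict[str, Any]]) -> int:
    """Send readings in batches; return how many records Firehose rejected.

    `client` is assumed to behave like boto3.client("firehose").
    """
    records = [to_firehose_record(r) for r in readings]
    failed = 0
    for batch in batched(records):
        response = client.put_record_batch(
            DeliveryStreamName=stream_name,  # hypothetical stream name
            Records=batch,
        )
        failed += response.get("FailedPutCount", 0)
    return failed
```

In real use, the client would be created with `boto3.client("firehose")`, and a non-zero return value would signal that some records need to be retried.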
Cataloging and access
To help analysts quickly find and directly access the necessary data to analyze the root causes of downtime, we ensured that the solution properly grouped all gathered data. To achieve this, we:
- used AWS Lake Formation to create catalogs with specific datasets in the data lake
- implemented an AWS Glue crawler to examine all data arriving in the data lake and build queryable tables in the data catalogs, which also contain information about the users who can access these datasets
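
To make the cataloging step more concrete, here is a hedged sketch of how objects could be laid out in S3 so that a Glue crawler can infer partitions, together with the request a crawler could be created from. The bucket, database, role, and prefix names are assumptions for illustration and do not come from the project. The `key=value` path layout is the Hive-style convention that Glue crawlers recognize as partition columns.

```python
from datetime import date


def partitioned_key(source: str, day: date, filename: str) -> str:
    """Build a Hive-style S3 key so crawlers can infer partition columns."""
    return (
        f"raw/source={source}/year={day.year:04d}/"
        f"month={day.month:02d}/day={day.day:02d}/{filename}"
    )


def crawler_request(name: str, role_arn: str, database: str,
                    bucket: str, prefix: str = "raw/") -> dict:
    """Build keyword arguments for the Glue `create_crawler` call.

    All values passed in are hypothetical placeholders.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": f"s3://{bucket}/{prefix}"}]},
    }
```

With boto3 installed, the request could be passed on as `boto3.client("glue").create_crawler(**crawler_request(...))`, followed by `start_crawler(Name=...)` to populate the catalog tables.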
Data security
To ensure secure data access and retrieval, we combined server-side encryption and client-side encryption. AWS Key Management Service (AWS KMS) helped us orchestrate the exchange of encryption keys. With AWS Identity and Access Management (IAM), we defined user roles and policies that grant different permissions for processing and accessing data.
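
The sketch below illustrates what role-based access and server-side encryption could look like in practice. It is an assumption-laden example rather than the policy set used in the project: the bucket, prefix, and key ARN are hypothetical, and only the server-side half of the encryption setup is shown. The first function builds a read-only IAM policy limited to one data lake prefix; the second builds `put_object` arguments that request SSE-KMS encryption.

```python
import json


def read_only_policy(bucket: str, prefix: str, kms_key_arn: str) -> dict:
    """Build an IAM policy allowing read access to one prefix of the lake."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
            {
                # Needed to read objects encrypted with the KMS key.
                "Sid": "DecryptWithKey",
                "Effect": "Allow",
                "Action": "kms:Decrypt",
                "Resource": kms_key_arn,
            },
        ],
    }


def encrypted_upload_args(bucket: str, key: str, body: bytes,
                          kms_key_arn: str) -> dict:
    """Build S3 `put_object` arguments that request SSE-KMS encryption."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_arn,
    }


def policy_json(policy: dict) -> str:
    """Serialize the policy in the form IAM expects as a policy document."""
    return json.dumps(policy, indent=2)
```

An analyst role restricted to production data, for example, could be given `read_only_policy("example-lake", "production/", key_arn)`, while a broader engineering role would receive a separate policy with write actions.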
Value delivered
As a result of our tight cooperation with the client, we delivered a full-fledged data lake solution that transformed our client’s daily business operations.
Before:
- Difficulty with promptly resolving unexpected performance downtime
- Decreased competitiveness due to production downtime
- Complex data analysis due to lack of a single data storage location
- Insecure storage of business and production data
- Manual handling of data assets
- Inefficient assessment of business performance due to scarce analytical data
After:
- Single source of truth for all data gathered in the client’s facilities
- Security and access management policies for all types of data
- Potential for multi-perspective data analysis with a single source of truth
- Simplified data search and access with accurate data cataloging
- Better root cause analysis of performance downtime
- Improved competitiveness with data-driven resolution of business challenges