A custom data lake solution for a manufacturing company

Find out how Yalantis helped a manufacturing company solve issues with performance downtime and improve business visibility by efficiently bringing together a large amount of scattered supply chain data.

  • Industry
  • Country
  • Team size
  • Implementation: 3 months

About the client

Our client is a manufacturer of electronic circuits with several plants in Europe and Asia. They needed to securely store data from multiple sources and easily access it for business intelligence and data analysis.

Business context

The client came to Yalantis to solve the following issues:

  • Frequent, unexpected idle periods in production and difficulty identifying their root causes because too little data was available for analysis
  • Recurrent human errors due to manual collection and entry of business and production data
  • Inability to track business performance across facilities because the separate data management systems in different departments were not integrated
  • High risk of data corruption and unauthorized access caused by the lack of a unified security policy across employees’ workstations

Solution overview

  • We built a data lake architecture to provide centralized cloud-based storage and enable supply chain data analytics. The data lake can store and structure data in all formats (structured, semi-structured, and unstructured) and from all internal and external sources. Authorized company employees can easily and securely access this data for analysis.


    Automating data collection

    Using AWS services, our team set up a fully automated data flow from all sources in the company to a single source of truth to eliminate the need for manual data management. We ensured the collection of:

    • IoT data on humidity, temperature, and heat in production facilities
    • Production data, including the number of products produced per day, the number of errors and malfunctions, and downtime frequency
    • Business data about vendors, suppliers, and clients, items in stock, and invoices
    • Equipment logs on who used certain equipment in the facility and for how long, as well as all equipment maintenance activities
    • External data such as employees’ timesheets, work schedules, and payroll records from the third-party Hubstaff software, and real-time data on material tracking and production planning from the Katana software
  • Enabling data transmission and storage

    The Yalantis team built a data lake solution on Amazon S3, a scalable cloud storage service. Doing so involved:

    • configuring Amazon Kinesis Data Firehose to stream data in real time to Amazon S3 and on to the data lake, and AWS DataSync to transfer all records from our client’s on-premises databases to the data lake
    • setting up file exchange between our client’s on-premises legacy systems and the data lake with AWS Storage Gateway
    • implementing the AWS Lake Formation tool to automatically extract, transform, and load raw supply chain data
    • enabling AWS Lake Formation and AWS Glue to deduplicate records as well as match and partition data attributes from various sources
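
To make the deduplication and partitioning step more concrete, here is a minimal Python sketch. It is not the AWS Glue or AWS Lake Formation implementation used in the project: it swaps in a plain key-based deduplication and a Hive-style partition prefix to illustrate the idea. The field names (`source`, `record_id`, `timestamp`) and the prefix layout are assumptions, since the case study does not describe the actual schema.

```python
from __future__ import annotations

import hashlib
import json
from datetime import datetime, timezone

# Hypothetical identifying fields; the real schema is not given in the case study.
KEY_FIELDS = ("source", "record_id")


def record_fingerprint(record: dict) -> str:
    """Return a stable hash over the fields that identify a record."""
    key = {field: record.get(field) for field in KEY_FIELDS}
    payload = json.dumps(key, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first occurrence of each fingerprint, preserving input order."""
    seen: set[str] = set()
    unique: list[dict] = []
    for record in records:
        fingerprint = record_fingerprint(record)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique


def partition_prefix(record: dict) -> str:
    """Build a Hive-style key prefix from the source and the UTC event date.

    Assumes `timestamp` holds seconds since the Unix epoch.
    """
    ts = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc)
    return f"source={record['source']}/year={ts:%Y}/month={ts:%m}/day={ts:%d}/"
```

Partitioning by source and date is one common layout because query engines can then skip whole prefixes when an analyst filters on a facility system or a time window. The right partition keys depend on how the data is actually queried.
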
  • Cataloging and access

    To help analysts quickly find and directly access the necessary data to analyze the root causes of downtime, we ensured that the solution properly grouped all gathered data. To achieve this, we:

    • used AWS Lake Formation to create catalogs with specific datasets in the data lake
    • set up an AWS Glue crawler to scan all data arriving in the data lake and build queryable tables in the data catalogs, which also record which users can access each dataset
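
As an illustration of the cataloging step, the sketch below assembles the request parameters for an AWS Glue crawler that scans S3 prefixes and writes tables into a catalog database. The helper `build_crawler_request` and every name, ARN, and path in the usage note are hypothetical; the project's real configuration is not described in this case study.

```python
def build_crawler_request(name, role_arn, database, s3_paths, table_prefix=""):
    """Assemble keyword arguments for the Glue CreateCrawler API call.

    name         -- crawler name (hypothetical)
    role_arn     -- IAM role the crawler assumes to read the S3 data
    database     -- Glue Data Catalog database that receives the tables
    s3_paths     -- S3 locations the crawler should scan
    table_prefix -- optional prefix added to every table the crawler creates
    """
    if not s3_paths:
        raise ValueError("at least one S3 path is required")
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": path} for path in s3_paths]},
        "TablePrefix": table_prefix,
    }
```

With boto3 installed and AWS credentials configured, the resulting dictionary could be passed as `boto3.client("glue").create_crawler(**request)`. For example, `build_crawler_request("production-crawler", "arn:aws:iam::123456789012:role/ExampleGlueRole", "supply_chain", ["s3://example-data-lake/production/"])` uses placeholder values only.
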
  • Data security

    To ensure secure data access and retrieval, we combined server-side encryption and client-side encryption. The AWS Key Management Service (KMS) helped us orchestrate the exchange of encryption keys. With AWS Identity and Access Management (IAM), we defined user roles whose policies grant different permissions for processing and accessing data.
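
The following Python sketch shows one way role-based access and server-side encryption can be expressed. It covers only two parts of the setup described above: a read-only IAM policy document scoped to one dataset prefix, and the parameters for an S3 upload encrypted with a KMS key. Client-side encryption is not shown. The function names, bucket, prefix, and key identifier are assumptions made for illustration.

```python
def read_only_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy document that allows listing and reading one prefix."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListDatasetPrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
                # Restrict listing to keys under the dataset prefix.
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                "Sid": "ReadDatasetObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


def sse_kms_put_params(bucket: str, key: str, body: bytes, kms_key_id: str) -> dict:
    """Build S3 PutObject parameters that request SSE-KMS encryption."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_id,
    }
```

The policy dictionary can be serialized with `json.dumps` and attached to a role. The upload parameters could be passed as `boto3.client("s3").put_object(**params)`. Giving analysts a read-only policy per dataset prefix is one way to keep permissions narrow, and the actual role design would follow the client's own access rules.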

Value delivered

As a result of our tight cooperation with the client, we delivered a full-fledged data lake solution that transformed our client’s daily business operations.


  Before the data lake:

  • Difficulty with promptly resolving unexpected performance downtime

  • Decreased competitiveness due to production downtime

  • Complex data analysis due to lack of a single data storage location

  • Insecure storage of business and production data

  • Manual handling of data assets

  • Inefficient assessment of business performance due to scarce analytical data


  After the data lake:

  • Single source of truth for all data gathered in the client’s facilities

  • Security and access management policies for all types of data

  • Potential for multi-perspective data analysis with a single source of truth

  • Simplified data search and access with accurate data cataloging

  • Better root cause analysis of performance downtime

  • Improved competitiveness with data-driven resolution of business challenges

