How next-gen DI works

Overview

Data integration in the ‘Age of Digital’ requires ETL development to happen at the ‘Speed of Business’ rather than at ‘IT Speed’.

The data integration layer is the essential ‘glue’ between the user engagement apps at the EDGE and the systems of record at the CORE of the IT landscape. Application development for the Experience Layer happens at the ‘Speed of Business’, while changes in the Integration Layer move at ‘IT Speed’. Data integration projects in the Age of Digital need to be delivered much faster and more intuitively to meet the demands of enterprise digital transformation.

Need for Automation

  • Automation interfaces for rapid ETL development.
  • Self-service ETL development based on pre-defined data integration patterns.
  • Emphasis on ETL patterns that integrate cloud-based apps, unstructured and semi-structured data formats, and NoSQL data stores with enterprise systems.

 

Solution Overview

Wipro’s NextGen DI automation with Talend and its pre-built pattern libraries help kick-start Big Data, digital and traditional data integration projects from Day 1, including accelerating cloud deployment and the building of cloud applications for business analytics. Beyond ETL development, this IP tool also provides other critical modules: pattern discovery for identifying the design of existing ETL, batch analysis of data flow dependencies for batch optimization, and generation of source-to-target data lineage documents. By combining all these modules in one platform, NextGen DI is a complete solution for today’s data integration needs.

Deployment

After NextGen DI is deployed and running in Tomcat, the home page is displayed.

After logging in successfully, the user can access the different modules of the application.

Workflow Generation

Wipro’s Next Generation Data Integration Platform uses a design-pattern-based approach with a rich GUI to automate the Talend development process, reducing development effort and significantly improving code quality.

It provides the following functionalities:

  • Automated Talend job mapping, session and workflow generation
  • Extensible hierarchical pattern library
  • Rich intuitive user interface
  • Creation of patterns from pre-existing mappings
  • Bulk generation of Talend jobs
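The idea of generating many jobs from one reusable pattern can be illustrated with a minimal sketch. It assumes a pattern is a text template with named placeholders; the template fields, the `generate_jobs` helper and the output layout are all hypothetical and are not the format that NextGen DI or Talend Studio actually uses.

```python
from string import Template
from typing import Dict, List

# Hypothetical pattern: a reusable template with named placeholders.
# The layout is illustrative only and is not a real Talend job format.
COPY_TABLE_PATTERN = Template(
    "job: load_${target_table}\n"
    "source: ${source_db}.${source_table}\n"
    "target: ${target_db}.${target_table}\n"
)


def generate_jobs(pattern: Template, parameter_sets: List[Dict[str, str]]) -> List[str]:
    """Render one job definition per parameter set (bulk generation)."""
    # substitute() raises KeyError if a required parameter is missing,
    # so incomplete input fails early instead of producing a broken job.
    return [pattern.substitute(params) for params in parameter_sets]


parameter_sets = [
    {"source_db": "crm", "source_table": "cust", "target_db": "dwh", "target_table": "customer"},
    {"source_db": "erp", "source_table": "ord", "target_db": "dwh", "target_table": "orders"},
]
jobs = generate_jobs(COPY_TABLE_PATTERN, parameter_sets)
for job in jobs:
    print(job)
```

The same pattern is rendered once per parameter set, which is the essence of reusing a pattern "any number of times" and of bulk generation.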

 

Expandable Library

NextGen DI follows a pattern-based approach for generating mappings. Commonly used Talend logic is created and stored as a pattern in the web application, and this pattern can be reused any number of times to create mappings.

  • Similar patterns are grouped under a category, and similar categories are in turn grouped under a library.
  • The library structure is thus a three-level hierarchy: Library -> Category -> Pattern.
  • The application ships with 50 patterns grouped under 20 categories and 4 libraries.
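The three-level hierarchy can be pictured as a small data structure. The sketch below is an assumption made for illustration: the class names, the `PatternCatalog` registry and the sample library, category and pattern names are all hypothetical and do not describe how NextGen DI stores its libraries.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Pattern:
    name: str
    description: str = ""


@dataclass
class Category:
    name: str
    patterns: Dict[str, Pattern] = field(default_factory=dict)


@dataclass
class Library:
    name: str
    categories: Dict[str, Category] = field(default_factory=dict)


class PatternCatalog:
    """Hypothetical in-memory registry: Library -> Category -> Pattern."""

    def __init__(self) -> None:
        self.libraries: Dict[str, Library] = {}

    def add_pattern(self, library: str, category: str, pattern: Pattern) -> None:
        # Libraries and categories are created on demand, so growing the
        # catalog is a data change rather than a code change.
        lib = self.libraries.setdefault(library, Library(library))
        cat = lib.categories.setdefault(category, Category(category))
        cat.patterns[pattern.name] = pattern

    def categories(self, library: str) -> List[str]:
        return sorted(self.libraries[library].categories)

    def patterns(self, library: str, category: str) -> List[str]:
        return sorted(self.libraries[library].categories[category].patterns)


catalog = PatternCatalog()
catalog.add_pattern("Data Ingestion", "File to Database", Pattern("csv_to_table"))
catalog.add_pattern("Data Ingestion", "Database to HDFS", Pattern("table_to_hdfs"))
print(catalog.categories("Data Ingestion"))
print(catalog.patterns("Data Ingestion", "File to Database"))
```

Browsing then mirrors the hierarchy: pick a library to list its categories, then pick a category to list its patterns.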

  • The library structure is expandable: the user can create a new library containing categories and patterns without any code change.
  • Once a library is chosen, the user sees the categories available inside that library.
  • Once a category is chosen, the user sees the patterns available inside that category.

 

Mapping Parameter Screen

In this screen, the user supplies the required source and target metadata and the parameters for the chosen pattern to create a mapping.

  • The user can upload all the files needed to generate a mapping, such as source and target metadata, mapplets, and dictionary files, if any. Multiple files can be uploaded at once.
  • The user can upload either the XML files directly or a SQL file containing the table structure of the source and target metadata.
  • If a SQL file is uploaded, the user can choose the database type of the source and target metadata to be created.
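To make the SQL upload option concrete, the sketch below shows one way column metadata could be derived from a file of CREATE TABLE statements. It is a simplified illustration under stated assumptions, not the parser NextGen DI uses: the `parse_ddl` and `split_columns` helpers and the sample DDL are hypothetical, and the regular expressions handle only plain column definitions.

```python
import re
from typing import Dict, List

# Matches "CREATE TABLE name ( ... );" and captures the table name and body.
CREATE_TABLE = re.compile(
    r"CREATE\s+TABLE\s+(?P<name>\w+)\s*\((?P<body>.*?)\)\s*;",
    re.IGNORECASE | re.DOTALL,
)
# Matches a column name followed by a type with optional "(...)" arguments.
COLUMN = re.compile(r"^(?P<name>\w+)\s+(?P<type>\w+(?:\s*\([^)]*\))?)")
CONSTRAINT_KEYWORDS = {"PRIMARY", "FOREIGN", "CONSTRAINT", "UNIQUE", "CHECK"}


def split_columns(body: str) -> List[str]:
    """Split on commas outside parentheses, so DECIMAL(10, 2) stays whole."""
    parts, current, depth = [], [], 0
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_ddl(sql: str) -> Dict[str, List[Dict[str, str]]]:
    """Return {table: [{"name": ..., "type": ...}, ...]} for each CREATE TABLE."""
    tables: Dict[str, List[Dict[str, str]]] = {}
    for match in CREATE_TABLE.finditer(sql):
        columns = []
        for definition in split_columns(match.group("body")):
            col = COLUMN.match(definition)
            # Skip table-level constraints and anything unparseable.
            if not col or col.group("name").upper() in CONSTRAINT_KEYWORDS:
                continue
            columns.append({"name": col.group("name"), "type": col.group("type")})
        tables[match.group("name")] = columns
    return tables


SAMPLE_DDL = """
CREATE TABLE customer (
    id INTEGER NOT NULL,
    name VARCHAR(50),
    balance DECIMAL(10, 2),
    PRIMARY KEY (id)
);
"""
metadata = parse_ddl(SAMPLE_DDL)
print(metadata)
```

Type names differ between databases, which is one plausible reason a database type has to be chosen when a SQL file is uploaded.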

The generated job can be downloaded individually or as a project for import into Talend Studio.

 

Features

  • Automated ETL job creation for Talend.
  • Portability to Big Data Edition and Cloud DI.
  • Extensible hierarchical pattern library.
  • Pre-built library of ETL patterns for various use cases, such as data ingestion for different databases and Big Data ecosystems such as HDFS and Hive.
  • End-to-end patterns for digital integration.
  • Pre-built patterns for Snowflake on NextGen DI with Talend help jump-start Cloud Analytics and deploy a modernized Cloud Data Warehouse faster.
  • Rich intuitive user interface.
  • Graph-based analysis of batch data flows and dependencies.
  • Bulk generation of Talend code.
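Graph-based analysis of batch dependencies, listed among the features above, can be illustrated with a topological sort that groups jobs into waves able to run in parallel. This is a generic sketch using Python's standard `graphlib` module; the job names and the `execution_waves` helper are hypothetical, and the analysis NextGen DI performs may differ.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def execution_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group jobs into waves; every job in a wave has all predecessors done.

    `dependencies` maps each job to the set of jobs it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs with no pending predecessors
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished to unlock successors
    return waves


# Hypothetical batch: staging jobs feed load jobs.
deps = {
    "load_orders": {"stage_orders", "load_customers"},
    "load_customers": {"stage_customers"},
    "stage_orders": set(),
    "stage_customers": set(),
}
waves = execution_waves(deps)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

Jobs in the same wave have no dependency on each other, so a scheduler could run them concurrently, which is one way dependency analysis can shorten a batch window.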

 

Benefits

  • Leverage existing code to detect and extract patterns that fast-track future development.
  • Create an organization-level Data Integration Pattern Library to bring in a higher level of compliance and standardization.
  • Accelerate integration and adoption of new-age technologies using the pre-built Digital pattern library.
  • Improve code quality and reduce defects through automation, with defect reduction of over 50%.
  • Bring developers’ focus back to analysis and design.
  • Cut development effort by roughly 40 to 50%, improving time to market and reducing costs.

 

About the Authors

Ganesh Arunasalam, Senior Architect - Data, Analytics & AI, Wipro. Ganesh has over 19 years of data warehouse experience. He is currently focusing on open source integration technologies and has successfully executed large engagements for global companies. He is a TOGAF certified Enterprise Architect. He is also certified in different database technologies and supports the practice in managing cloud native ETL tools.

Purushottam Joshi, Senior Architect - Data, Analytics & AI, Wipro. Purushottam has over 21 years of data warehouse and ETL experience. He is currently focused on open source integration technologies and has successfully executed and delivered large engagements for Fortune 500 companies across domains such as healthcare, manufacturing and telecom. He is TOGAF certified.

 
