Talend Enterprise Data Integration for the IGEPA Group

Down from 2½ hours to 7 minutes: Talend makes light work of IGEPA’s ETL processes
“Open source and good support do not necessarily go hand in hand. But even with the free version of Talend you have the community to fall back on for advice. You can also find truly comprehensive and useful documentation on SourceForge. We were extremely impressed with the support that comes with the licensed version, in particular the professional assistance we received via WebEx meetings at all times of the day.”
Jan Grzenkowsky, Senior ETL Application Developer at HRI ITS

Founded in 1960 as “Interessengemeinschaft Papier”, the IGEPA Group has gone on to become one of Europe’s leading paper wholesalers with more than 50,000 customers and around 3,500 employees. In 2012, the Group and its strategic partners across Europe generated revenue of 2 billion euros and sold 1.8 million tons of paper. The IGEPA product catalog lists more than 7,000 different items.

Some of the members of the IGEPA Group have been working together for over 50 years. While maintaining their independence, Group companies are united by a common marketing strategy. IGEPA is present in over 20 European countries as well as New Zealand, and it is constantly expanding its reach with new holdings and subsidiaries. This broad footprint allows IGEPA – in collaboration with its partners – to realize exceptional concepts, leverage international synergies and respond with agility to changing market dynamics.

Up until late 2012, the Group’s IT service provider was igepa papertec GmbH. A merger with H&R Infotech, which provided IT services for chemicals manufacturer H&R, led to the creation of a new company called HRI ITS at the start of 2013. Since then, HRI ITS has been providing a full range of IT services for both IGEPA and H&R. As part of the merger, the IT infrastructures of both companies had to be aligned and processes automated as far as possible. Another IT challenge involved populating the business intelligence (BI) database with key operational data to support executive decision-making. HRI ITS decided to use data integration technology from Talend for this task.

The challenge

HRI ITS operates a single data center in Berlin that provides IT services for all members of the IGEPA Group. The main software program used is the current “E1” version of an ERP system from JD Edwards (now part of the Oracle product portfolio). The primary database is DB2, which runs on an AS400 server. Data from the JD Edwards system is regularly loaded to a BI database, where OLAP cubes are generated, read with the BI software solution from MicroStrategy and exported as reports. These reports are vital controlling tools for the management team. Previously, Ascential software was used for the ETL process of extracting the data, transforming it into a suitable format and loading it from the operational system to the BI database.

“The Ascential solution served us well for many years, but it was clearly starting to reach the limits of its capabilities,” explains Jan Grzenkowsky, Senior ETL Application Developer at HRI ITS. “The loading processes were often taking so long that we were not able to fully load some of the data in the allotted time. On occasion, there simply were not enough hours in the night, and we often had to contend with database errors and aborted jobs.” Another downside was that Ascential could not be virtualized. As a result, it often took several hours to load a customer database. “I have actually had to abort jobs because the load rate was not rising above 25 data records per second, which was unacceptably slow.”

In 2005, Ascential was taken over by IBM. Grzenkowsky decided to try IBM first, requesting a quote for an updated version of the software. “The quote was astronomical for software that was to be used by a relatively small in-house team. Anyway, we could not and would not pay that amount,” recalls Grzenkowsky. When searching for an alternative solution online, he came across Talend’s integration platform.

The Talend solution

“Even just looking at the technical specifications, Talend had an instant appeal,” continues Grzenkowsky. “I was also impressed with the list of reference customers. So I decided on the spot to download the free Talend Open Studio suite and played around with it a bit. My initial test was so successful that I contacted the company and arranged a date for a presentation.” The outcome on the day was an impressive demo with live data, which more or less sealed the deal. HRI ITS has now licensed the Talend Platform for Enterprise Integration, which has more functions for enterprise-class environments than the free version, and also comes with a professional support package. As for licensing, the customer found Talend’s terms to be fairer and more transparent than those of competitors. The basis for calculation is developer seats, so the volume of data loaded and the number of platforms involved play no role.

The slow performance of the previous solution resulted in serious structural problems in certain areas. For example, the internal product database contains around eight million entries, but the products are located at multiple warehouses. In E1, it was not easy to see what data had been updated and when. To gain an accurate overview of product availability, the entire database had to be uploaded every time. This was simply not possible with the old system because of time constraints. The workaround was to isolate a certain period, for instance by incrementally saving data from the previous six months. Talend’s answer to this structural problem was to load data at what Grzenkowsky describes as “a crazy speed”. Uploading the whole database now takes as little as seven minutes.
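To put these figures in perspective, the following minimal sketch works through the arithmetic. It rests on two assumptions that the text does not state outright: that the 2½ hours and 7 minutes in the headline refer to the same load, and that the seven-minute load covers the roughly eight million product entries mentioned above. The function and constant names are illustrative only.

```python
def speedup(old_minutes: float, new_minutes: float) -> float:
    """Return how many times faster the new load is than the old one."""
    return old_minutes / new_minutes


def records_per_second(records: int, minutes: float) -> float:
    """Return the average throughput implied by loading `records` in `minutes`."""
    return records / (minutes * 60)


# Figures taken from the case study headline and text.
OLD_LOAD_MINUTES = 150        # 2.5 hours
NEW_LOAD_MINUTES = 7
# Assumption: the seven-minute load covers all product entries.
ASSUMED_RECORDS = 8_000_000

factor = speedup(OLD_LOAD_MINUTES, NEW_LOAD_MINUTES)
throughput = records_per_second(ASSUMED_RECORDS, NEW_LOAD_MINUTES)

print(f"Speed-up: about {factor:.1f}x")
print(f"Implied throughput: about {throughput:,.0f} records per second")
```

Under these assumptions the load is roughly 21 times faster and averages around 19,000 records per second, compared with the 25 records per second at which jobs on the old system sometimes had to be aborted.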

“One huge advantage of Talend is that the solution can be fully virtualized. In addition, it can run on all platforms,” enthuses Grzenkowsky. An ETL job is a Java program in Talend, so it is also possible to distribute different jobs across multiple servers. This allows the data center team to make optimum use of the available hardware resources at all times. Meanwhile, the Talend Administration Center (TAC) helps the team plan and allocate the jobs efficiently.

The benefits

The migration to Talend was seamless. Originally, it was estimated that two employees would be able to migrate the 440 ETL jobs for IGEPA in one year. In the end, it took one employee just six months to finish the task. “Talend resembled our old Ascential solution in structure, which meant that we were able to find our way around the new environment very quickly. Since Talend is based on Eclipse, we were even able to fall back on in-house Java programmers in some instances,” comments Grzenkowsky.

But IGEPA didn’t stop on completion of the planned 440 jobs. Further analysis revealed potential for another 538 Talend jobs – most of which were developed as mini-jobs, which load just a few data records from key tables. With the old solution, these queries exerted a considerable load on the databases due to frequent open/close commands for the ODBC drivers. Talend not only significantly accelerates these processes, it also virtually eliminates the burden on the databases with intelligent open/close control functionality.
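The cost of frequent open/close commands can be illustrated with a small sketch. It is not Talend’s implementation, which the case study does not describe, and it uses Python’s built-in sqlite3 module as a stand-in for the ODBC connection to DB2. The table name `key_table` and the helper `ConnectionCounter` are hypothetical. The sketch compares opening a connection for every mini-job with sharing one connection across all of them.

```python
import os
import sqlite3
import tempfile

QUERY = "SELECT value FROM key_table WHERE id = ?"


class ConnectionCounter:
    """Hypothetical helper that counts how often a connection is opened."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.opened = 0

    def connect(self) -> sqlite3.Connection:
        self.opened += 1
        return sqlite3.connect(self.path)


def run_per_job(counter: ConnectionCounter, keys: list) -> list:
    """Open and close a connection for every mini-job (the costly pattern)."""
    rows = []
    for key in keys:
        conn = counter.connect()
        try:
            rows.append(conn.execute(QUERY, (key,)).fetchone())
        finally:
            conn.close()
    return rows


def run_shared(counter: ConnectionCounter, keys: list) -> list:
    """Open one connection and reuse it for all mini-jobs."""
    conn = counter.connect()
    try:
        return [conn.execute(QUERY, (key,)).fetchone() for key in keys]
    finally:
        conn.close()


def demo():
    """Build a throwaway database and run both variants against it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        setup = sqlite3.connect(path)
        setup.execute("CREATE TABLE key_table (id INTEGER PRIMARY KEY, value TEXT)")
        setup.executemany(
            "INSERT INTO key_table VALUES (?, ?)",
            [(i, f"item-{i}") for i in range(5)],
        )
        setup.commit()
        setup.close()

        keys = list(range(5))
        per_job, shared = ConnectionCounter(path), ConnectionCounter(path)
        per_job_rows = run_per_job(per_job, keys)
        shared_rows = run_shared(shared, keys)
        return per_job_rows, shared_rows, per_job.opened, shared.opened


per_job_rows, shared_rows, per_job_opens, shared_opens = demo()
print(f"Connections opened per job: {per_job_opens}, shared: {shared_opens}")
```

Both variants return the same rows, but the shared variant opens a single connection where the per-job variant opens one for each of the five lookups. Across several hundred mini-jobs, that difference is the kind of database load the paragraph above refers to.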

The high speeds achieved with the ETL processes result in much smoother operation. If a process does ever go wrong, however, it can easily be repeated since it generally takes mere minutes rather than hours to load a database. Grzenkowsky is particularly enthusiastic about the support from Talend: “Open source and good support do not necessarily go hand in hand. But even with the free version of Talend you have the community to fall back on for advice. You can also find truly comprehensive and useful documentation on SourceForge. We were extremely impressed with the support that comes with the licensed version, in particular the professional assistance we received via WebEx meetings at all times of the day. I even had support sessions in the middle of the night with a Chinese employee who was able to solve my problem in no time.”

For IGEPA, the introduction of the Talend platform has already paid off. The next stage entails migrating H&R, which is expected to take the whole year. The situation at H&R is a lot more complex. H&R also uses E1 and Ascential, but these are accompanied by numerous parallel systems, which also need to be integrated. Furthermore, the company’s BI system data is mission-critical. Granted, there are “only” 200 jobs to migrate, “but they are challenging, so Talend will really have a chance to show what it can do,” says Grzenkowsky. “The entire team is already looking forward to seeing the end of the old system, which caused us considerable problems. Following the IGEPA project, we know what Talend is capable of, and we cannot wait until it is in place for H&R. For me, Talend is a home run.”