More data is being produced today than ever before. Two and a half quintillion bytes of data are created every day, and the volume of data is doubling each year. In addition, we are experiencing a time of accelerating change as everything in the data world gets continually reinvented from the bottom up. This has been going on for the last five years and will continue for at least the next decade if not longer. Today, it’s critical for enterprises to have data integration tools that provide trusted data and insights faster than the competition.
The importance of rapid data integration tools
Companies must become data-driven to remain competitive. A report from the McKinsey Global Institute indicates that companies that are truly data-driven (meaning those that can gather, process, and analyze data in real time as it flows through the enterprise) make better decisions. According to the report, data-driven organizations that employ fast and trusted data integration tools see a:
- 23x greater likelihood of customer acquisition
- 6x greater likelihood of customer retention
- 19x greater likelihood of profitability
Deriving accurate insights quickly — and acting on them immediately — becomes a competitive advantage in every market. Having inaccurate analytics or even delayed insights can put you in a vulnerable position—one in which your competitors can overtake your market share by acting on industry and customer needs that you haven’t yet identified.
The realities of data integration
Organizations today are exposed to tremendous opportunities with the growth of technologies such as cloud, machine learning, Internet of Things (IoT), and big data. But lack of the right data integration tools has left businesses dealing with these realities:
- 55% of a company’s data is not accessible for making decisions
- Only 45% of an organization’s structured data is actively used for business intelligence, and less than 1% of unstructured data is analyzed or used at all
- More than 70% of employees have access to data they should not
- 80% of analysts’ time is spent simply discovering and preparing data
- 47% of newly-created data records have at least 1 critical error
- The estimated financial impact of poor data quality averages $15 million per year
- Knowledge workers waste 50% of their time hunting for data, finding and correcting errors, and searching for confirmatory sources for data they don’t trust
- Data scientists spend 60% of their time cleaning and organizing data
Enterprises need modern data management solutions that move at the speed of business – and that requires employing fast and agile data integration tools.
Data integration as a business strategy
Although it is commonly thought of as a mere technical process, data integration is a cornerstone of business strategy. Data integration tools get data where it needs to go and make it accessible to those who need it. But, as many data professionals have discovered, integration can easily become the main bottleneck on the way to insights.
Gartner estimates that through 2020, integration will consume 50% of the time and cost of building a digital platform. But this bottleneck doesn’t have to exist if we start thinking about data integration strategically, through a business lens, rather than as a collection of technical processes. Businesses that win have gotten good at using data integration tools to make their infrastructure available and agile.
Data integration use cases
Lenovo and AstraZeneca are two examples of businesses that are getting fast results from data integration tools:
Lenovo is a $46 billion personal technology company, the #1 PC maker and #4 smartphone company in the world, serving customers in more than 160 countries.
“Customer expectations have been changing over the years,” says Marc Gallman, director, Lenovo Analytics and Data Platform. “We needed to answer those typical questions: Which options influence customer decisions for our computers the most? Which type of hard-disk would be more preferable to our customers?” To answer those questions, Lenovo is working with a variety of large data sets. But the influx of on-premises, cloud, and SaaS-based technologies was creating a data connectivity problem.
“We decided to do a hybrid big data architecture of Amazon Web Services (AWS) and our own Lenovo servers. The idea was to maintain the privacy and security of our data and also benefit from the cloud. Talend quickly became a core component of this architecture.”
With Talend, Lenovo has built an elastic hybrid-cloud platform to analyze more than 22 billion pieces of customer information annually. More than 250 terabytes of data and more than 60 different types of data sources are captured annually across Lenovo’s business units. Each year, 8,300 reports are delivered to over 615 users across Lenovo, providing real-time dashboards, API data feeds, and data analysis.
“Using Talend, we have nearly 300 data integration processes running at the same time against a multiplicity of data types and sources, and we expect these numbers to keep rising as we develop the approach,” says Gallman. “We have been able to drive up a revenue per unit by 11%. The attach rate for ThinkPad laptop series has also increased by 18%.”
With Talend, Lenovo has saved about $140,000 in initial migration costs alone. Lenovo’s operational (employee) costs were reduced by over $1 million, or 34%, within one year, while productivity increased two to three times. In addition, Talend has helped improve reporting performance and cut certain process times from hours to minutes. The ease of use of the Talend platform also allows Lenovo to deliver on requests to continually increase the velocity of data acquisition. The time to market on more than 95% of requests is 14 days.
AstraZeneca plc is a global, science-led biopharmaceutical company headquartered in Cambridge, United Kingdom. It is the world’s seventh-largest pharmaceutical company and has operations in over 100 countries. AstraZeneca had data dispersed throughout the organization in a wide range of sources and repositories. Having to draw data from CRM, HR, finance systems and several different versions of SAP ERP systems slowed down vital reporting and analysis projects.
“To be able to easily analyze that data, we knew we needed to put in place an integration architecture that could help with a mass consolidation and bring data together in a single source of the truth,” says Simon Bradford, senior data and analytics engineer at AstraZeneca. “We wanted to consolidate everything and get a single set of global metrics so we could monitor activity across divisions and markets and do comparisons that were not previously possible.”
AstraZeneca resolved to build a data lake on AWS to hold the data from its wide range of source systems. To capture that data, they selected Talend. Andy McPhee, science and enabling units data and analytics engineering lead, explains: “Talend is responsible for lifting, shifting, transforming, and delivering our data into the cloud, extracting from multiple sources and then pushing that data into Amazon S3. The Talend jobs are built and then executed in AWS Elastic Beanstalk. After some transformation work, Talend then bulk loads that into Amazon Redshift for the analytics. Talend is also being used to connect to AWS Aurora.”
The data lake that AstraZeneca has built shows the value of a data integration strategy built on a reusable infrastructure. “The data lake enables us to pull large volumes of valuable data from disparate systems and make our data discoverable across divisions,” says McPhee.
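The flow McPhee describes (extract from several sources, land the raw data in object storage, transform it, then bulk load it into a warehouse for analytics) can be sketched in a few lines. This is a minimal illustration of the pattern only, not Talend's or AstraZeneca's implementation: a Python dict stands in for Amazon S3, an in-memory SQLite database stands in for Amazon Redshift, and the source names, sample records, and function names are all hypothetical.

```python
import csv
import io
import json
import sqlite3

# Hypothetical source extracts. In the case study these would come from
# CRM, HR, finance, and SAP ERP systems rather than inline strings.
CRM_CSV = "customer_id,region,revenue\n1,EMEA,1200.50\n2,APAC,830.00\n"
FINANCE_JSON = '[{"customer_id": 3, "region": "emea", "revenue": "410.25"}]'


def extract():
    """Read each source into a list of plain dict records."""
    crm_rows = list(csv.DictReader(io.StringIO(CRM_CSV)))
    finance_rows = json.loads(FINANCE_JSON)
    return {"crm": crm_rows, "finance": finance_rows}


def stage(object_store, extracts):
    """Land the raw extracts unchanged in staging (a dict standing in for object storage)."""
    for source, rows in extracts.items():
        object_store[f"raw/{source}.json"] = json.dumps(rows)


def transform(object_store):
    """Normalize types and casing so every source shares one schema."""
    unified = []
    for key, blob in sorted(object_store.items()):
        source = key.rsplit("/", 1)[-1].split(".")[0]
        for row in json.loads(blob):
            unified.append(
                (
                    int(row["customer_id"]),
                    row["region"].upper(),
                    float(row["revenue"]),
                    source,
                )
            )
    return unified


def bulk_load(conn, rows):
    """Insert all transformed rows in one batch (SQLite standing in for the warehouse)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS revenue "
        "(customer_id INTEGER, region TEXT, revenue REAL, source TEXT)"
    )
    conn.executemany("INSERT INTO revenue VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_pipeline():
    """Run extract, stage, transform, and load; return the warehouse connection."""
    object_store = {}
    conn = sqlite3.connect(":memory:")
    stage(object_store, extract())
    bulk_load(conn, transform(object_store))
    return conn


if __name__ == "__main__":
    warehouse = run_pipeline()
    query = "SELECT region, SUM(revenue) FROM revenue GROUP BY region ORDER BY region"
    for region, total in warehouse.execute(query):
        print(region, total)
```

Keeping the raw extracts in staging before transforming them means a transformation can be corrected and rerun without going back to the source systems, which is one reason a landing zone of this kind tends to be reusable across projects.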
Choosing data integration tools
In today’s data-driven world, every company, small or large, needs data integration tools. Whether the data lives in spreadsheets, packaged applications, databases, sensor networks, or social media feeds, there is a significant benefit to integrating, sharing, and reusing information instead of maintaining duplicate processes and silos of information. It is also important to select a solution that can address all your data integration needs at the speed of your business.
Talend Data Fabric is a single suite of apps for data integration and integrity, with self-service apps, pervasive data quality and governance, and native performance that spans all data flows and data sources end to end.