Data Integration Platform Adds New Self-Service Data Preparation and Governance Features to Help Transform Data Lakes into Qualified, Clean Data Anyone Can Use
REDWOOD CITY, Calif. - January 12, 2017

Talend (NASDAQ: TLND), a global leader in cloud and big data integration software, today announced the Winter ’17 version of Talend Data Fabric, a powerful platform that eases collaboration between IT and the business to enable more widespread use of data for decision making. Talend’s integrated platform now includes new data preparation features for big data that enable all employees to access, cleanse and collaborate on the analysis of massive data sets, as well as an intuitive, self-service Data Stewardship app that helps companies avoid the costly fines and penalties that can result from data integrity issues. The latest version of Talend Data Fabric also includes Spark 2.0 innovations for Talend Big Data and Talend Integration Cloud that allow customers to accelerate business processes and easily upgrade their environments to keep pace with the rapidly changing technology landscape.

Gartner research indicates that “Through 2018, 90 percent of deployed data lakes will become useless as they are overwhelmed with information assets captured for uncertain use cases.”[1] While data lakes have numerous benefits and can often serve as the first step in a company’s digital transformation, they also present new challenges in terms of governance, data quality, lineage, and ubiquitous access.

“Companies need to fundamentally change how they use and share data across their organization to advance their digitization efforts. The beauty of a data lake is that regardless of whether it’s housed in Hadoop, on premises, or in the cloud, you have a centralized repository that allows you to store significantly more information at a lower cost, and extract more insight,” said Ashley Stirrup, chief marketing officer for Talend. “The new version of Talend Data Fabric propels customers to the next phase of their digital evolution by fostering collaboration between IT and the business to scale and transform their data lakes into qualified, trusted data that employees can use to make more informed decisions, faster.”

Data Preparation for Big Data

The latest version of Talend Data Fabric empowers IT to give business users access to data and to expedite its preparation and cleansing, so they get more value out of corporate data lakes. The new data preparation capabilities for Talend Big Data allow customers to:

  • Access any data source, whether it’s housed in Hadoop, the cloud or traditional databases, and share it across users and groups to encourage collaboration
  • Run preparations at scale using the power of Spark 2.0 and Hadoop
  • Utilize a pre-configured data dictionary to auto-recognize the meaning of the raw data from the data lake, as well as augment the dictionary with their own vocabulary, such as product codes or names
  • Crowdsource new data definitions from open data and/or the Talend Community

Data Stewardship: Getting to Good, Clean Data

In today’s increasingly competitive marketplace, the difference between digital leaders and laggards lies in how companies put their data to use. Talend’s new Data Stewardship app is one of the first self-service tools that allow IT and business users to curate and manage data efficiently throughout its lifecycle. With the app, users can quickly resolve many data integrity issues to ensure data in the lake is clean, governed and compliant. The new app can help companies improve data compliance and avoid the costly fines that can be incurred from a breach of regulatory mandates such as the General Data Protection Regulation or Sarbanes-Oxley. By extending data governance tasks to line-of-business stewards who are most familiar with the data, the new app creates a collaborative environment in which data in the lake is ‘trusted’, spurring broader use.

Using the Data Stewardship app, employees can embed governance into any data integration flow, and isolate subsets of data that require manual curation, arbitration or certification. The app then organizes those tasks as workflows, assigns each one to the business worker best equipped to perform the quality check, and sets rules for which data should be cleansed and validated. The new version of Talend Data Fabric also utilizes machine learning to discover best practices for data curation from the line-of-business experts and to automate the matching of massive data sets, so that matching can be completed faster and with greater intelligence. Additionally, new support for Apache Atlas gives customers a better understanding of data lineage across Hadoop, helping them better manage risk and compliance.

“Many organizations start data governance initiatives either due to an embarrassing or regulatory incident, or because the line of business workers feel they can’t trust the data. Some organizations also see data governance as an IT problem and not a business problem,” said Stewart Bond, research director of IDC's Data Integration Software service. “The best way to manage data governance is to engage line of business workers in the data stewardship process. Having intimate knowledge of the data empowers users to improve data trust and value through enrichment, cleansing, standardization and certification, increasing confidence in data-driven business decisions.”

Adaptable Investments Provide Peace of Mind

Big data and cloud technologies are rapidly evolving, which leads some customers to worry that the platform purchases they make today may be outdated in a matter of months. Built on open source and adhering to industry standards, Talend Data Fabric can adapt to change more easily than proprietary software solutions. The continuous innovation provided by the open source developer community, as well as multiple big data and cloud partners, ensures that Talend Data Fabric keeps pace with emerging technology advances. Additionally, Talend Data Fabric is a model-driven code generator, which makes it very easy to adapt to emerging technologies. For example, generating the code to transition a job or application from Spark 1.6 to Spark 2.0 can be done in just a few clicks. All of these features give customers peace of mind that their technology investments are secure for the long term and won’t need to be replaced every two years.

Pricing and Availability

Talend Data Fabric will be released on January 19, 2017. Customers that license the newest version of Talend Data Fabric will receive two complimentary seats of the Talend Data Stewardship App and Talend Data Preparation. For pricing and packaging information, contact a Talend sales representative at

To learn more about the full capabilities and benefits of the Winter ’17 version of Talend Data Fabric, customers can register for one of the live webinars, “Talend Winter ’17: Transform Your Data Lake to Accelerate Insight”, on Thursday, January 19th from either 10:00 – 11:00 am GMT or 10:00 – 11:00 am PT. Those who attend the Talend Winter ’17 webinar will be automatically entered for a chance to win an all-expenses-paid ski trip for two to Park City, Utah.

Like this story? Tweet this: New Talend Big Data Fabric Platform speeds business insights with collaborative governance and more

About Talend

Talend (NASDAQ: TLND) is a next generation leader in cloud and big data integration software that helps companies become data driven by making data more accessible, improving its quality and quickly moving data where it’s needed for real-time decision making. By simplifying big data through these steps, Talend enables companies to act with insight based on accurate, real-time information about their business, customers, and industry. Talend’s innovative open-source solutions quickly and efficiently collect, prepare and combine data from a wide variety of sources allowing companies to optimize it for virtually any aspect of their business. Talend is headquartered in Redwood City, CA. For more information, please visit and follow us on Twitter: @Talend.


Media Contacts:

Chris Taylor, VP, Corp. Communications, Talend

Siobhan Lyons, Sr. Manager, Corp. Communications, Talend

[1] Gartner, Inc., “Defining the Data Lake,” Nick Heudecker, Mark A. Beyer, November 2016.