Data Prep 101: Diving into Enterprise Features

  • Mark Balkenende
    Mark Balkenende is a Sales Solution Architects Manager at Talend. Prior to joining Talend, Mark had a long career of mastering and integrating data at a number of companies, including Motorola, Abbott Labs and Walgreens. Mark holds an Information Systems Management degree and is also an extreme cycling enthusiast.


A few months ago I wrote a blog post about the exciting new open source Data Preparation tool and all the great quick actions you can take on your data. But it gets so much better! Where the single-user desktop Free Data Preparation tool stops (as do many of the not-so-free competitive tools), Talend’s Enterprise Data Preparation version picks up and completes the Talend Data Fabric. But what does that mean exactly? Let me walk you through some very exciting pieces of that integrated and unified platform with Talend Data Preparation.

Data Preparation Integration Out of the Box

First and foremost is the access and entry to the Data Preparation tool itself. We have completely integrated the access and controls for datasets and sharing into our Talend Administration Center (TAC). For any user to access the Web UI of Talend Data Preparation and see or use the Datasets or Preparations (formerly known as Recipes), the user must be added to Talend Administration Center both as a user and as a Data Preparation user. AND GUESS WHAT?! Every existing customer that upgrades to Talend 6.2 will automatically get two free Talend Data Preparation users to start using right away!

Breaking Down Data Silos with Data Preparation

In the Enterprise version of Talend Data Preparation, you can stop fixing your data in silos on your own desktop and start sharing your datasets and preparations (AKA the processes or steps used to clean and shape your data). This is all done through a very easy-to-use, straightforward web user interface.

Not only can you share the clean, ready-to-use Datasets that you certified with your co-workers and fellow Data Experts, but you can also give these preparations back to your friendly IT ETL developer. The developer can then quickly and easily pull in the steps you created and add them directly to a production-ready Talend Data Integration job, delivered with the power of Talend’s enterprise platform. That means that anything a Data Analyst or Business Analyst designs in the Web tool to clean up the data can quickly and easily be reused by ETL developers and further integrated into a standard Data Integration process.
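
To make the reuse idea concrete, here is a rough conceptual sketch in Python. It is not Talend's API or file format; every name in it (`trim_name`, `upper_state`, `PREPARATION`, `apply_preparation`) is made up for illustration. The only point is that a preparation can be thought of as an ordered list of steps, designed once on a small sample and then replayed unchanged on a full dataset.

```python
from typing import Callable, Dict, List

# A row is a simple column-name -> value mapping (illustrative assumption).
Row = Dict[str, str]
# A step takes a row and returns a cleaned copy of it.
Step = Callable[[Row], Row]


def trim_name(row: Row) -> Row:
    """Hypothetical step: strip stray whitespace from the 'name' column."""
    return {**row, "name": row["name"].strip()}


def upper_state(row: Row) -> Row:
    """Hypothetical step: normalize the 'state' column to upper case."""
    return {**row, "state": row["state"].upper()}


# The "preparation": an ordered list of steps, designed once by an analyst.
PREPARATION: List[Step] = [trim_name, upper_state]


def apply_preparation(rows: List[Row], steps: List[Step]) -> List[Row]:
    """Replay every step, in order, on every row."""
    prepared = []
    for row in rows:
        for step in steps:
            row = step(row)
        prepared.append(row)
    return prepared


# The analyst designs the steps against a small sample...
sample = [{"name": "  Ada ", "state": "il"}]
# ...and the same steps are later replayed on the full dataset.
full_dataset = sample + [{"name": "Grace  ", "state": "ny"}]

print(apply_preparation(sample, PREPARATION))
print(apply_preparation(full_dataset, PREPARATION))
```

In the real product the steps live in the shared preparation and are run by a Talend Studio component inside a job; the sketch only mirrors the "design once, reuse at scale" shape of that workflow.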

Below is an example of a Preparation created in the Web UI (in the background). In the foreground is Talend Studio with the component used to access that exact preparation and all of its steps, so they can run on full datasets at the scale of Talend Data Integration.

Data Loading No Longer a Load of Work

Oh yeah, one last point today (yes, more to come). What about getting your data into the Talend Data Preparation tool? Well, have you heard we have over 900 components and connectors to help your IT gurus access and transform data?

Well, all those connectors can be used to serve up any data from anywhere to a Data Specialist, who can then access, shape, and cleanse it in this sweet Web UI tool (that, ahem, looks and feels just like a popular Spreadsheet tool every business person in the world uses today). That is right: just ask your friendly IT Data Integration specialist to put a bunch of components that access your data in front of the new and powerful tDatasetOutput component, and the final data output from that Talend Job will end up as an easy-to-access, easy-to-use Dataset on the nice and shiny web interface of the Talend Data Preparation tool. Then you share and certify the data so that others can perform their own steps to shape the data the way they want it. (Of course, only the people with the right permissions can see this data, and only if you want them to. That is complete and full governance.)

I hope you are as excited to start working and playing with the new Enterprise Talend Data Preparation features and platform as I have been over the last couple of weeks. Stay tuned for a detailed blog post from me on the enterprise features you can find in the Talend Data Fabric 6.2 release, as well as more detail on the cool things you can do with the Enterprise version of Talend Data Preparation. I can’t wait to tell you all about the “Live Dataset” and how it can provide up-to-date access to any data you need, and it is live! Here’s to clean, accurate data!

