Back in September I talked about how excited I was about the new Star Wars movie coming out. Well, that day is upon us. Yes, Thursday, Dec 17, is the pre-ordained day in my part of the world (Ireland). Have you ever been hit by a spoiler? You know, an older sister who breaks the news about Santa Claus? A colleague who lets the ending of The Sixth Sense slip out, or perhaps a friend who drops a way-too-obvious hint about Luke’s “relationship” to Vader before you have the chance to catch The Empire Strikes Back? Spoilers change everything, don’t they? True story: I never shed a tear at E.T. because my buddy told me the ending before I saw the movie. Did I hear you say “Robbed”?
So, never one to spill the beans without warning, here’s your heads-up: there are major spoilers below. If you don’t want to hear about the fantastic new features in Talend 6.1, please turn away now!
Oh, and when it comes to Star Wars, I have a ticket for Friday night, so mum’s the word.
Introducing Talend 6.1
Hot on the heels of the Talend 6.0 release, team Talend has been busy creating new capabilities for the holidays. Introducing Talend 6.1, which further enables the data-driven enterprise, boldly going where no other integration tools have gone before (sorry, couldn’t resist), and delivering shiny new presents: machine learning capabilities, enhanced continuous delivery, and data masking on Spark.
What does this mean for you? It’s much easier and faster to make your data applications more intelligent.
As discussed in previous blogs, the amount of data is growing everywhere, including from the Internet of Things. Businesses need to become data-driven or risk being marginalized. We are seeing the rise of data science and machine learning as core competencies in every data-driven organization.
But how do you make deploying and updating models a scalable and repeatable development process? This is where Talend 6.1 comes in.
New Tricks in Time for the Holidays
Talend 6.1 provides an easier way to operationalize analytics, benefiting IT and data scientists alike. Developers use Talend’s pre-built components and drag-and-drop tools to build Spark analytics models (e.g. Random Forest, Logistic Regression, Clustering via K-Means) for customer segmentation, forecasting, classification, regression analysis and more. Behind the scenes, Talend provides the smart tools for data connectivity, transformation, and cleansing, so you spend less time wrangling your data (which can take up 50 to 80% of an analytics project) and more time gaining insight.
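If you have never peeked inside one of these models, here is a rough idea of what K-Means customer segmentation does. This is a minimal plain-Python sketch for illustration only: it is not Talend's component or Spark's implementation, and the `kmeans` function, the starting centroids, and the customer numbers are all made up for the example.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute the means."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; leave it alone if the cluster is empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical customers: (orders per year, average basket value)
customers = [(2, 20.0), (3, 25.0), (1, 18.0), (40, 150.0), (38, 160.0), (42, 155.0)]

# Seed with one occasional shopper and one frequent shopper (an assumption for the demo).
centroids, clusters = kmeans(customers, centroids=[customers[0], customers[3]])
```

With this toy data the occasional shoppers and the frequent big spenders end up in separate segments. A real job would run the same idea on Spark across millions of rows, which is exactly the plumbing you would rather drag and drop than hand-code.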
Combined with the Continuous Delivery support introduced in Talend 6 (which, by the way, we enhanced in Talend 6.1 with Git version control and Talend ESB process support), this lets developers rapidly deploy machine-learning algorithms. These algorithms can provide real-time operational insight so the systems and people that need it can act in the moment (e.g. a machine is about to fail, so shift operational load; online credit card fraud is about to occur, so disable the account; or a shopping cart is about to be abandoned, so make another recommendation or offer). The recent Talend Data Master Awards highlight some of these use cases.
Data scientists can use these machine-learning algorithms to understand data and teach the model to make predictions, while IT can quickly deploy the model into production for “testing with live users”. The benefits of Continuous Delivery are fast, iterative development and maintenance cycles, results that flow back to the data scientist for further refinement, and an overall more collaborative approach between data scientists and IT – something that is on every CIO’s holiday list.
Talend 6.1 also delivers a couple of other cool new presents: data masking on Spark and advanced support for Cloudera Navigator, which will bring joy to the security advocates and data quality developers on your shopping list.
As companies build vast data lakes, there is an increasing need to make data private to protect against data breaches and meet compliance mandates. Data masking obfuscates your data (numbers, strings, dates, personally identifiable information and more) without breaking the rules that surround that data and without letting other users see the original values. By running on Spark, this can be done in-memory and at scale – sort of like delivering presents to 526 million kids!
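To make the masking idea concrete, here is a minimal plain-Python sketch of two common techniques: format-preserving digit masking and keyed pseudonymization. It is an illustration under stated assumptions, not Talend's masking component or its API. The function names, the record fields, and the `SECRET_KEY` value are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical key for this illustration only; a real deployment would manage secrets properly.
SECRET_KEY = b"demo-key"


def mask_digits(value: str, keep_last: int = 4, mask_char: str = "*") -> str:
    """Mask every digit except the last `keep_last`, keeping separators so the format survives."""
    total_digits = sum(ch.isdigit() for ch in value)
    to_mask = max(total_digits - keep_last, 0)
    out = []
    for ch in value:
        if ch.isdigit() and to_mask > 0:
            out.append(mask_char)
            to_mask -= 1
        else:
            out.append(ch)
    return "".join(out)


def pseudonymize(value: str) -> str:
    """Map a value to a stable token, so the same input always yields the same masked output."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:12]


def mask_record(record: dict) -> dict:
    """Mask the sensitive fields of one record and pass the rest through."""
    return {
        "name": pseudonymize(record["name"]),
        "card": mask_digits(record["card"]),
        "city": record["city"],  # treated as non-sensitive in this example
    }


masked = mask_record(
    {"name": "Ada Lovelace", "card": "4111-1111-1111-1234", "city": "Dublin"}
)
```

The card number keeps its shape (dashes and last four digits), so format checks downstream still pass, and the name becomes a consistent token, so masked records can still be matched to each other. On Spark, a per-record function like `mask_record` is the kind of logic that gets applied in parallel across the whole data set.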
A unique, first-to-market capability for Talend 6.1 and Cloudera Navigator users is data lineage support for Spark. This lets users trace data lineage for MapReduce and Spark down to the level of the schema defined by the developer in a data job, which is crucial for impact analysis.
These are just some of the Talend 6.1 highlights. Check out the Talend 6.1 webinar or Technical Note to see a demo and learn more. Happy Holidays and may 2016 be a year of fresh, new actionable insight!