With all the hype and interest in Big Data lately, open source ETL tools seem to have taken a back seat. MapReduce, Yarn, Spark, and Storm are gaining significant attention, but it should also be noted that Talend’s ETL business and our thousands of ETL customers are thriving. In fact, the data integration market has a healthy growth rate, with Gartner recently reporting that this market is forecast to grow 10.3% in 2014 to $3.6 billion!
I recently attended a Gartner presentation on the convergence of Application and Data Integration at their Application Architecture, Development and Integration conference. During the talk they stressed that “chasms exist between application- and data-oriented people and tools” and that digital businesses have to break down these barriers in order to succeed. Gartner research shows that more and more companies are recognizing this problem – in fact, 47% of respondents to a recent survey indicated they plan to create integrated teams in the next 2-3 years.
Big data has monopolized media coverage in the past few years. While many articles have covered the benefits of big data to organizations, in terms of customer knowledge, process optimization or improvements in predictive capabilities, few have detailed methods for how these benefits can be realized.
Gartner has just released its annual “Magic Quadrant for Data Quality Tools.”
While everyone’s first priority might be to check out the various recognitions, I would also recommend taking the time to review the market overview section. I found the views shared by analysts Saul Judah and Ted Friedman on the overall data quality market and major trends both interesting and inspiring.
Hence this blog post to share my takeaways.
This is the first in a series of posts on container-centric integration architecture. This first post covers common approaches to applying containers for application integration in an enterprise context. It begins with a basic definition and discussion of container design patterns. Subsequent posts will explore the role of containers in the context of enterprise integration concerns, continuing with how SOA and cloud solutions drive the need for enterprise management delivered via service containerization and the need for OSGi modularity. Finally, we will apply these principles.
The term ‘big data’ is at risk of premature over-exposure. I’m sure there are already many who turn off when they hear it – thinking there’s too much talk and very little action. In fact, observing that ‘many companies don’t know where to start with big data projects’ has become the default opinion within the IT industry.
Data Quality follows the same principles as other well-defined quality processes: it is all about engaging an improvement cycle to Define & Detect, Measure, Analyze, Improve and Control quality.
This doesn’t happen at one time, or in one place. It should be an ongoing effort, and that is often neglected when dealing with data quality. Think about the big ERP, CRM or IT consolidation projects where data quality gets high attention during the rollout, and then fades away once the project is delivered.
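To make the cycle concrete, here is a minimal sketch of one pass through it in Python. The records, rule names and the simple standardization fix are all hypothetical illustrations, not part of any specific product; a real pipeline would run this continuously, not once.

```python
# Hypothetical customer records with deliberate quality issues.
records = [
    {"id": 1, "email": "ana@example.com", "country": "FR"},
    {"id": 2, "email": "", "country": "fr"},
    {"id": 3, "email": "bob@example", "country": None},
]

# Define & Detect: declare the rules that flag quality issues.
rules = {
    "email_present": lambda r: bool(r["email"]),
    "email_valid": lambda r: "@" in r["email"] and "." in r["email"].split("@")[-1],
    "country_upper": lambda r: isinstance(r["country"], str) and r["country"].isupper(),
}

def measure(recs):
    """Measure: the share of records passing each rule."""
    return {name: sum(rule(r) for r in recs) / len(recs)
            for name, rule in rules.items()}

def improve(recs):
    """Improve: apply a simple standardization fix (uppercase country codes)."""
    for r in recs:
        if isinstance(r["country"], str):
            r["country"] = r["country"].upper()
    return recs

before = measure(records)   # Analyze: baseline scores per rule
records = improve(records)
after = measure(records)    # Control: re-measure to confirm the fix held
```

The point of the Control step is precisely the one made above: the scores must keep being recomputed after the project is delivered, or quality quietly degrades again.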
As the move to the next generation of integration platforms gains momentum, the need to implement proven and scalable technology is critical. Databricks and Spark, delivered on the major Hadoop distributions, are one such area, where delivering massively scalable technology with low implementation risk is key.
At Talend we see a wide array of batch processes moving to an operational, real-time model, driven by the consumers of the data. In this vein, the uptake in adoption and the growing community of Apache Spark, the powerful open-source processing engine, has been hard to miss. In a relatively short time, it has become part of every major Hadoop vendor’s offering, is the most active open source project in the Big Data space, and has been deployed in production across a number of verticals.
Today marks a major new milestone in Talend’s journey: we’re thrilled to announce that Laurent Bride is joining us as CTO. Laurent is both a terrific manager and leader and a strong technologist who brings a wealth of experience to lead our engineering team in the coming years. Most recently, Laurent was CTO at Axway, responsible for R&D, Innovation and Product Management. His role was to take Axway’s products to the next level while ensuring the quality and security of the solutions. Laurent was also heavily involved in M&A and post-merger integration activities. Laurent spent more than nine years in Silicon Valley, with Business Objects and then SAP, where he developed deep expertise in enterprise software development, working with multi-national teams across the globe. His last role at SAP was SVP of Advanced Development, leading a 350-person team of developers building next-generation mobile, cloud, real-time analytics, M2M and big data solutions. Laurent holds an engineering degree in mathematics and computer sciences from EISTI.
In this “summer series” of posts dedicated to Master Data Management for Product Data, we’ve gone through what we identified as the five most frequent use cases of MDM for product data. Then, we looked at the key capabilities needed in an MDM platform to address each of these use cases. In this last post of the series, we address the key capabilities needed for MDM for Anything, which is about dealing with the things you are producing and/or the things you are using to produce them: the things that don’t fit the four other facets of product master data described in this series.
MDM for Anything refers to all the master data about products and things that are very specific to an industry, a use case, an enterprise… Because this is specific, you have to define on a case-by-case basis what is needed from your MDM solution, in terms of modeling, data quality, data accessibility, data stewardship, master data services... In any case, the flexibility of the solution will be key. By flexibility, I mean that the MDM solution should allow you to design very specific data models, to connect easily to any source of data, and eventually to connect to applications in real time.
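As a rough illustration of that flexibility, the sketch below shows a schema-flexible master record in Python, where each domain defines its own attributes case by case rather than fitting a fixed product schema. The `MasterRecord` class, the entity types and the last-write-wins merge rule are all hypothetical assumptions for illustration; real stewardship rules would be far richer.

```python
from dataclasses import dataclass, field

@dataclass
class MasterRecord:
    """A master record whose attribute model is defined per domain."""
    entity_type: str                     # e.g. "machine_tool", "spare_part"
    key: str                             # unique master key
    attributes: dict = field(default_factory=dict)  # domain-specific model

    def merge(self, source: dict) -> None:
        """Fold in attributes from a new source system.
        Last write wins here; real survivorship rules would be richer."""
        self.attributes.update(source)

# A domain-specific "thing" that fits none of the four product facets.
tool = MasterRecord("machine_tool", "MT-001", {"vendor": "Acme"})
tool.merge({"location": "Plant 3", "vendor": "Acme Corp"})
# tool.attributes now holds the consolidated view from both sources
```

The design choice to keep `attributes` as an open dictionary is what lets the same platform model very different things per industry or use case, at the cost of pushing validation into configurable rules rather than a fixed schema.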