Big data is a journey that often starts with a sandbox or proof-of-concept project, evolves into batch analytics, and then expands toward real-time and operational uses of big data. Ever-changing requirements from the business translate into ever-increasing pressure on development teams to deliver these increasingly complex projects, on time and on spec.
Throughout this journey from the sandbox to full productive use, challenges abound that prevent integration teams from delivering projects that meet expectations.
Big data platforms such as Hadoop and NoSQL are an entirely new game, and practitioners who know them are scarce on the job market. Few developers have been trained on MapReduce programming, and even fewer organizations have the resources to invest in this training. As a result, it is difficult to accurately predict project scope, leading to missed project deadlines. Integration teams need to be able to use native Hadoop tools with their existing skillsets.
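To make the skills gap concrete, here is a simplified, framework-free sketch of the MapReduce word-count pattern in plain Java. It is illustrative only: a real Hadoop job additionally requires Mapper/Reducer subclasses, Writable types, and cluster job configuration, which is exactly the specialist boilerplate most integration developers have not been trained on.

```java
import java.util.*;
import java.util.stream.*;

// Illustrative sketch of the MapReduce word-count pattern, without the
// Hadoop framework. Real jobs add Mapper/Reducer classes, Writable types,
// and job configuration -- the boilerplate that demands specialist skills.
public class WordCountSketch {
    // "Map" phase: emit a (word, 1) pair for every word in an input line.
    static Stream<Map.Entry<String, Integer>> map(String line) {
        return Arrays.stream(line.toLowerCase().split("\\W+"))
                .filter(w -> !w.isEmpty())
                .map(w -> Map.entry(w, 1));
    }

    // "Reduce" phase: group the emitted pairs by key and sum the counts.
    static Map<String, Integer> reduce(Stream<Map.Entry<String, Integer>> pairs) {
        return pairs.collect(Collectors.toMap(
                Map.Entry::getKey, Map.Entry::getValue, Integer::sum));
    }

    public static void main(String[] args) {
        List<String> input = List.of("big data big challenges", "big skills gap");
        Map<String, Integer> counts =
                reduce(input.stream().flatMap(WordCountSketch::map));
        System.out.println(counts.get("big")); // 3
    }
}
```

Tools that generate this kind of code from a graphical design remove the need for every developer to write it by hand.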
New Data Sources
The multiplicity and disparity of data sources create new technical challenges for big data projects, especially when legacy integration tools are used, since they were not designed for the complexities of big data integration. Sources range from traditional systems such as ERPs, databases, SaaS applications, and flat files all the way to new sources including social media, web data, sensors, and logs.
The new world of big data is more heterogeneous than ever. Relational databases are now supplemented by Hadoop clusters and augmented or replaced by a multiplicity of NoSQL database technologies, an ever-changing landscape that developers must navigate. Native support for these platforms becomes a must-have for optimum performance, and legacy integration tools are simply unable to keep pace.
Big data projects introduce new challenges for data security and data quality. Whereas Hadoop natively uses Kerberos for security, legacy integration engines require their own proprietary security methods. And legacy data quality approaches simply don’t work with Hadoop – instead, they require that data be extracted from Hadoop for processing, adding unnecessary complexity and risk to the project.
While demands from the business keep accelerating, the challenges that integration teams face severely impact their ability to deliver projects on time and on spec. Soon enough, these projects run out of funding. Without proven results, and with the rising costs of legacy technologies, executive sponsorship drops and projects die.
Successfully building and deploying compelling applications that take advantage of the massive scale possibilities of big data requires first dealing with these project challenges.
Because the big data environment is so different from legacy data infrastructure, Talend provides integration teams with a complete solution that enables them to be successful from the get-go, while building up their skills for future projects.
Native Big Data Support
Unlike legacy integration tools, Talend natively supports Hadoop, generating Pig Latin and MapReduce/YARN code. Talend requires zero footprint on the Hadoop cluster since there is no runtime component. Natively optimized for major Hadoop distributions such as Cloudera, Hortonworks, MapR, Pivotal HD and more, Talend also uses native Hadoop security, Kerberos, and is the only data quality solution to run inside Hadoop – again via native MapReduce code generation.
Open and Easy-to-use
Like Hadoop, Talend is committed to open source and open standards and the benefits that they bring: the largest developer community, collaboration tools that include a vibrant forum and a component/code sharing platform, and of course portability of skills. Rich, Eclipse-based graphical tools and wizards make big data integration a breeze. With Talend, any integration or data developer can become a big data developer in no time!
Predictability and Low Risk
Talend’s unique user-based subscription pricing model allows organizations to scale data and projects predictably, today and tomorrow, and ensures that projects will not fail because of exploding license costs. Adherence to Java, Eclipse, and big data standards reduces project development and maintenance time, so operational costs are predictable as well.
As the big data journey continues through analytics to real-time and operational use cases, the investment in Talend continues to bear fruit, thanks both to its commitment to the big data platform and to its unification of what legacy vendors treat as separate integration challenges: data integration and application integration. With Talend, real-time processing, REST web services, and ESB become part of every developer’s toolkit.
Demands from data-driven businesses for more and more access to data, at a faster and faster pace, place new constraints on development teams and require them to adopt new technologies, new practices and even new paradigms to deliver on these expectations. Simply put, legacy integration technologies used in most projects are not designed to meet the demands of big data.