Big data is a journey that often starts with a sandbox or proof-of-concept project, evolves into batch analytics, and then expands toward real-time and operational uses of big data. Throughout this journey, challenges abound that derail projects and prevent them from delivering an effective return-on-data.
New Data Sources
The multiplicity and disparity of sources create challenges for the collection, processing and storage of this data. Big data projects have to deal with traditional systems such as ERPs and databases, cloud-based and SaaS applications, and enterprise data warehouses, as well as new sources including social media and sensors.
Whether deployed on-premises or in the cloud, existing software must adapt to a native Hadoop engine. Legacy integration technologies require their own engine on each node, along with their own management and security infrastructure, and as projects grow, this infrastructure simply doesn't scale.
Big data projects introduce new challenges for the security and quality of the data. Whereas Hadoop natively uses Kerberos for security, legacy integration engines require their own proprietary security methods. And big data without big data quality can be a big mess, as legacy data quality approaches simply don’t work with Hadoop – instead, they require that data be extracted from Hadoop for processing.
New technologies call for new skills, and there is a shortage of big data skills in the job market. Few developers have mastered MapReduce, those who have command premium compensation, and once developers have been trained, retaining them becomes a challenge in itself.
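Part of the reason MapReduce expertise is scarce is that even trivial logic must be decomposed into separate map, shuffle, and reduce phases. As an illustration only, here is a minimal Python simulation of the paradigm using the canonical word-count example (this sketches the programming model, not Hadoop's actual Java API):

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # does between the map and reduce phases
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: aggregate the list of values for each key
    return {word: sum(counts) for word, counts in grouped.items()}

docs = ["big data big mess", "big data quality"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts)  # {'big': 3, 'data': 2, 'mess': 1, 'quality': 1}
```

In a real Hadoop cluster, each phase runs distributed across many nodes and the developer must also handle serialization, partitioning, and job configuration, which is what makes the skill set hard to acquire.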
Combining the high scaling costs of multiple integration engines or runtimes and their management tools with the high cost of big data developers is challenging firms to find a rational, predictable cost approach to their business cases.
While demands from the business keep increasing for a higher return-on-data, the challenges on the journey are severely impacting the delivery potential of big data projects. And soon enough, these projects are facing an obstacle: the funding wall. Without proven return, and with the rising costs of legacy technologies, the expected returns from big data will soon flatten, and fail to deliver.
Developing compelling business cases that take advantage of the infinite scale possibilities of big data requires first dealing with these infrastructure challenges, and deploying a solution that enables the value of big data to be realized.
Because the new big data environment is unlike existing environments, Talend equips IT to realize the vision of providing instant value from all data.
Native Big Data Support
Unlike legacy integration solutions, Talend natively resides in the Hadoop environment, generating MapReduce code and requiring zero footprint on the Hadoop cluster. Natively integrated with the management and monitoring consoles of the major Hadoop distributions, Talend also uses Hadoop's native security mechanism, Kerberos, and is the only data quality solution to run inside Hadoop.
Like Hadoop, Talend is committed to open source and open standards and the benefits that they bring: an innovation ecosystem, no vendor lock-in, faster and more agile development, and support from a broad community.
Talend’s unique user-based subscription pricing model allows organizations to scale data and projects predictably, today and tomorrow, without having to scale integration costs.
As the big data journey continues through analytics to real-time and operational use cases, Talend uniquely provides an extensible solution that grows with changing needs. This adaptability comes from Talend’s support for the latest standards, code generation, cloud platforms, and a unified platform for data, application and process integration.
Big data, like any bleeding-edge IT project, is a new adventure for most organizations. It promises great rewards, but also carries high risks. Demands from data-driven businesses for more and more access to data, at a faster and faster pace, place new constraints on IT and strain integration infrastructure.
Simply put, legacy integration technologies are not designed to meet the demands of big data.