Talend Products

Big Data Integration

The first data integration platform for Hadoop and Spark

Talend simplifies big data integration with graphical tools and wizards that generate native code so you can start working with Apache Hadoop, Apache Spark, Spark Streaming and NoSQL databases today.

The Talend Big Data Integration platform delivers high-scale, in-memory data processing as part of the Talend Data Fabric solution, so your enterprise can turn more of its data into real-time decisions.

  • Blazing fast speed and scale with Spark and Hadoop
  • Optimize big data performance in the cloud
  • Protect your investments with a future-proof architecture

Talend Customers Get to Market Faster

“We have to continually increase our velocity in acquiring data, and the ease of use of the Talend platform allows us to deliver on those requests.”
Marc Gallman, Manager of Data Architecture, Lenovo

Fast and First for 1/5th the Price

Blazing fast speed and scale with Spark and Hadoop

Only Talend takes advantage of the massively parallel environment of Hadoop by generating native MapReduce and Spark code. Load, transform, enrich, and cleanse data inside Hadoop to leverage Hadoop's power and scale. Run 5 times faster than MapReduce using Spark and Spark Streaming in-memory data processing. Talend Studio provides access to over 900 connectors and components, plus graphical drag-and-drop tools and wizards to speed development.
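
For readers new to the MapReduce model mentioned above, the following is a minimal single-process sketch in Python of its map, shuffle and reduce phases. It is an illustration only and is not the code Talend generates; the function names and the sample records are hypothetical.

```python
from collections import defaultdict


def map_phase(records):
    """Map: emit one (word, 1) pair for every word in every record."""
    for record in records:
        for word in record.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group the emitted values by key, as a framework would
    do between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(records):
    """Run the three phases in order on an in-memory list of records."""
    return reduce_phase(shuffle_phase(map_phase(records)))


if __name__ == "__main__":
    # Hypothetical sample records, used only for illustration.
    sample = ["load transform enrich", "load cleanse", "transform load"]
    print(word_count(sample))
```

On a real cluster the map and reduce functions run in parallel across many nodes; the sketch keeps everything in one process so the data flow is easy to follow.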

White Paper: Hadoop in the Enterprise

Optimize big data performance in the cloud

Run big data processing when and where you need it—on-premises, hybrid or in the cloud—with the best response time, lowest latency, and most cost-effective use of resources. Build end-to-end big data integration workflows that easily integrate with Amazon Redshift, Elastic MapReduce (EMR), Amazon Kinesis or Azure HDInsight systems, so all your infrastructure runs in the cloud. Or use our powerful self-service tools with Talend Integration Cloud, part of the Talend Data Fabric.

Protect big data investments with a future-proof architecture

Talend released the first integration platform to run MapReduce, Spark and Spark Streaming on YARN. With each new Hadoop framework, Talend makes it possible to convert data integration jobs to the latest frameworks with the push of a button, so you can stay ahead of the innovation curve. Subscription pricing, based on users not CPUs or connectors, sets a predictable cost basis even as data volumes and systems grow exponentially.

Speed up Your Big Data Integration Projects

Design Faster
Use Talend Studio to design batch, real-time and streaming integration jobs with a drag-and-drop user interface.
Collaborate Better
Improve collaboration with a shared repository, continuous delivery methods, and metadata bridge sharing.
Cleanse Earlier
Use native Hadoop data profiling, data matching, and machine learning to better understand your data.
Manage More
Leverage big data consoles to centrally manage and monitor your projects.
Scale Easier
Achieve infinite scale with built-in Lambda architecture and in-memory processing.
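
As a rough illustration of the Lambda architecture idea named above, the sketch below keeps a batch view computed from the full event log, a speed view computed from recent events only, and merges the two at query time. It assumes a simple per-key count as the workload; the function names and event shape are hypothetical and do not describe Talend's implementation.

```python
from collections import Counter


def batch_view(master_events):
    """Batch layer: recompute counts from the full, immutable event log."""
    return Counter(event["key"] for event in master_events)


def speed_view(recent_events):
    """Speed layer: count only events that arrived after the last batch run."""
    return Counter(event["key"] for event in recent_events)


def query(key, batch, speed):
    """Serving layer: merge the batch and speed views at query time.
    A Counter returns 0 for keys it has not seen."""
    return batch[key] + speed[key]


if __name__ == "__main__":
    # Hypothetical events, used only for illustration.
    history = [{"key": "a"}, {"key": "a"}, {"key": "b"}]
    recent = [{"key": "a"}]
    batch, speed = batch_view(history), speed_view(recent)
    print(query("a", batch, speed))
```

The design choice to show is that the batch view can be slow but complete, while the speed view covers only the gap since the last batch run.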

Right Size Your Big Data Integration Solution

Choose a Talend Big Data Integration solution with the feature set and licensing options to best fit your project and budget.

 
| Feature | Open Studio for Big Data | Big Data | Big Data Platform | Real-Time Big Data Platform |
| --- | --- | --- | --- | --- |
| License | Apache | Subscription | Subscription | Subscription |
| Big Data | Hadoop and NoSQL components | + Batch Processing (MapReduce, Spark), Native Hadoop Connectors | + Batch Processing (MapReduce, Spark), Native Hadoop Connectors | + Real-Time Processing (Spark Streaming), Machine Learning, High-Speed Messaging, and IoT Connectivity |
| Big data components: HDFS, HBase, HCatalog, Hive, Pig, Sqoop | Included | Included | Included | Included |
| Hadoop job scheduler | Included | Included | Included | Included |
| Hadoop security for Kerberos | Included | Included | Included | Included |
| NoSQL connectivity | Included | Included | Included | Included |
| YARN support | Included | Included | Included | Included |
| Certified on Hadoop distributions (Amazon EMR, Azure HDInsight, Cloudera, Hortonworks, MapR, Pivotal) | Unavailable | Included | Included | Included |
| Spark and MapReduce job designer | Unavailable | Included | Included | Included |
| MapReduce visual code optimization | Unavailable | Included | Included | Included |
| Hadoop cleansing, profiling, parsing and matching | Unavailable | Unavailable | Included | Included |
| HDFS file profiling | Unavailable | Unavailable | Included | Included |
| Spark batch | Unavailable | Included | Included | Included |
| Spark Streaming | Unavailable | Unavailable | Unavailable | Included |
| Spark machine learning | Unavailable | Unavailable | Included | Included |
| High-speed messaging components (Kafka, Kinesis, Flume) | Unavailable | Unavailable | Unavailable | Included |
| Enterprise messaging (JMS, ActiveMQ, AMQP) | Unavailable | Unavailable | Unavailable | Included |
| Internet of Things connectivity (AMQP, MQTT) | Unavailable | Unavailable | Unavailable | Included |
| Design Faster & Scale Easily | 900+ Components & Connectors | + Continuous Delivery, testing, sharing, and debugging | + Repository Manager | + Repository Manager |
| On-demand documentation | Included | Included | Included | Included |
| Business Modeler | Included | Included | Included | Included |
| Eclipse-based developer tooling | Included | Included | Included | Included |
| ETL & ELT support | Included | Included | Included | Included |
| Job designer | Included | Included | Included | Included |
| Versioning | Included | Included | Included | Included |
| Audit | Unavailable | Included | Included | Included |
| Automatic documentation | Unavailable | Included | Included | Included |
| Change data capture (CDC) | Unavailable | Included | Included | Included |
| Continuous Delivery Data Integration | Unavailable | Included | Included | Included |
| Drools business rule management system (BRMS) | Unavailable | Included | Included | Included |
| Distant run | Unavailable | Included | Included | Included |
| Dynamic schema | Unavailable | Included | Included | Included |
| Impact analysis | Unavailable | Included | Included | Included |
| Interactive data viewer | Unavailable | Included | Included | Included |
| Jobs compare | Unavailable | Included | Included | Included |
| Metadata Bridge | Unavailable | Included | Included | Included |
| Parallelization | Unavailable | Included | Included | Included |
| Reference projects | Unavailable | Included | Included | Included |
| Re-usable joblets | Unavailable | Included | Included | Included |
| Team collaboration with shared repository | Unavailable | Included | Included | Included |
| Testing, debugging and tuning | Unavailable | Included | Included | Included |
| Centralized metadata management | Unavailable | Included | Included | Included |
| Wizards | Unavailable | Included | Included | Included |
| Repository manager | Unavailable | Unavailable | Included | Included |
| Visual mapping for complex XML and EDI | Unavailable | Unavailable | Included | Included |
| Collaborate Better & Manage More | Unavailable | Manage Administration, Deployment, & Automate Tasks | + High Availability, Load Balancing, & Failover | + High Availability, Load Balancing, & Failover |
| Amazon EC2 lifecycle control | Unavailable | Included | Included | Included |
| Check points, error recovery | Unavailable | Included | Included | Included |
| Context management (dev, QA, prod) | Unavailable | Included | Included | Included |
| Deployment manager and team collaboration | Unavailable | Included | Included | Included |
| Execution plan, time and event-based scheduler | Unavailable | Included | Included | Included |
| Log server with dashboard | Unavailable | Included | Included | Included |
| Activity Monitoring Console | Unavailable | Included | Included | Included |
| Talend Administration Center | Unavailable | Included | Included | Included |
| High availability, load balancing, failover for Jobs | Unavailable | Unavailable | Included | Included |
| Increase Trust with Data Quality | Unavailable | Unavailable | Data Profiling, Cleansing, Matching, Masking & Stewardship | Data Profiling, Cleansing, Matching, Masking & Stewardship |
| Batch execution of analyses | Unavailable | Unavailable | Included | Included |
| Big data quality capabilities (parsing & matching) | Unavailable | Unavailable | Included | Included |
| Comprehensive survivorship | Unavailable | Unavailable | Included | Included |
| Data cleansing | Unavailable | Unavailable | Included | Included |
| Data masking | Unavailable | Unavailable | Included | Included |
| Data profiling | Unavailable | Unavailable | Included | Included |
| Data quality analytics with graphical charts and drilldown data | Unavailable | Unavailable | Included | Included |
| Data quality monitoring, reporting & dashboards | Unavailable | Unavailable | Included | Included |
| Data standardization | Unavailable | Unavailable | Included | Included |
| Data stewardship | Unavailable | Unavailable | Included | Included |
| Enrichment, fuzzy matching & de-duplication | Unavailable | Unavailable | Included | Included |
| Sampling | Unavailable | Unavailable | Included | Included |
| Semantic discovery | Unavailable | Unavailable | Included | Included |
| Cloud or on-premises third-party address validation services | Unavailable | Unavailable | Optional | Optional |
| Support | TalendForge Community, Help Center access | + Guaranteed Response Times, Web & Email Support, Optional 24/7 | + Phone Support, Faster Response, Optional 24/7 | + Phone Support, Faster Response, Optional 24/7 |
| Indemnification/Warranty | Unavailable | Included | Included | Included |

Why Talend?

The more connected the world becomes, the more quickly a business must adapt. By design, Talend integration software simplifies the development process, reduces the learning curve, and decreases total cost of ownership with a single platform for batch and real-time data integration, in the cloud and on-premises.

Simplify Big Data Integration

Talend provides a powerful and versatile open source big data product that makes the job of working with big data technologies easy and helps drive and improve business performance, without the need for specialist knowledge or resources.

Integration at Cluster Scale

Talend redefines the development skills needed for big data and facilitates the organization and orchestration required by these projects so that you can focus on the key question: “What use should we make of data, big and small, and how am I going to be the leader in using data to help my business?”

Talend’s big data product combines big data components for MapReduce 2.0 (YARN), Hadoop, HBase, Hive, HCatalog, Oozie, Sqoop and Pig into a unified open source environment so you can quickly load, extract, transform and process large and diverse data sets from disparate systems.

Big Data Without the Need to Write or Maintain Code

Ready to Use Big Data Connectors

Talend provides an easy-to-use graphical environment that allows developers to visually map big data sources and targets without the need to learn and write complicated code. Running 100% natively on Hadoop, Talend Big Data provides massive scalability. Once a big data connection is configured, the underlying code is generated automatically and can be deployed remotely as a job that runs natively on your big data cluster using HDFS, Pig, HCatalog, HBase, Sqoop or Hive.

Big Data Distribution and Big Data Appliance Support

Talend's big data components have been tested and certified to work with leading big data Hadoop distributions, including Amazon EMR, Cloudera, IBM PureData, Hortonworks, MapR, Pivotal Greenplum, Pivotal HD, and SAP HANA. Talend provides out-of-the-box support for big data platforms from the leading appliance vendors including Greenplum/Pivotal, Netezza, Teradata, and Vertica.

Open Source

Using the Apache software license means developers can use the Studio without restrictions. As Talend’s big data products rely on standard Hadoop APIs, users can easily migrate their data integration jobs between different Hadoop distributions without any concerns about underlying platform dependencies. Support for Apache Oozie is provided out-of-the-box, allowing operators to schedule their data jobs through open source software.

Pull Source Data from Anywhere Including NoSQL

With 800+ connectors, Talend integrates almost any data source so you can transform and integrate data in real time or in batch. Pre-built connectors for HBase, MongoDB, Cassandra, CouchDB, Couchbase, Neo4j and Riak speed development without requiring specific NoSQL knowledge. Talend big data components can be configured to bulk upload data to Hadoop or another big data appliance, either manually or on an automatic schedule for incremental data updates.
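
One common way to implement the incremental updates described above is a high-water mark: each run loads only rows changed since the previous run and then advances the mark. The Python sketch below shows that pattern under stated assumptions; it is not Talend's mechanism, and the `updated_at` field, the function name and the sample rows are hypothetical.

```python
from datetime import datetime


def select_incremental(rows, last_loaded_at):
    """Return the rows changed after the watermark, plus the new watermark.

    Assumes every row carries an `updated_at` datetime (hypothetical field).
    If nothing changed, the watermark is returned unchanged.
    """
    new_rows = [row for row in rows if row["updated_at"] > last_loaded_at]
    new_watermark = max(
        (row["updated_at"] for row in new_rows), default=last_loaded_at
    )
    return new_rows, new_watermark


if __name__ == "__main__":
    # Hypothetical source rows, used only for illustration.
    source = [
        {"id": 1, "updated_at": datetime(2016, 1, 1)},
        {"id": 2, "updated_at": datetime(2016, 1, 3)},
        {"id": 3, "updated_at": datetime(2016, 1, 5)},
    ]
    changed, watermark = select_incremental(source, datetime(2016, 1, 2))
    print([row["id"] for row in changed], watermark)
```

A scheduler would call this on each run and persist the returned watermark, so a repeated run with no new changes loads nothing.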

Support for Google BigQuery

“The strategy for data quality with Big Data will depend on whether the application is mission-critical, whether regulatory compliance ramifications are involved, and the degree to which bad quality data will materially impact the business.”
Tony Baer, Ovum
| Features | Talend Open Studio for Big Data | Talend Enterprise Big Data | Talend Platform for Big Data |
| --- | --- | --- | --- |
| Job Designer | x | x | x |
| Components for HDFS, HBase, HCatalog, Hive, Pig, Sqoop | x | x | x |
| Hadoop Job Scheduler | x | x | x |
| NoSQL Support | x | x | x |
| Versioning and Centralized Metadata Management | | x | x |
| Shared Repository | | x | x |
| Reporting and Dashboards | | | x |
| Big Data Profiling, Parsing and Matching | | | x |
| Indemnification/Warranty and Talend Support | | x | x |
| License | Apache | Subscription | Subscription |

Talend Open Studio for Big Data combines big data technologies into a unified open source environment simplifying the loading, extraction, transformation and processing of large and diverse data sets.

Talend Open Studio for Big Data is an Apache-licensed, open source development tool. Talend Enterprise Big Data adds teamwork and management features. Talend Platform for Big Data adds data quality and clustering features, with extended support services.

The product:

  • Provides graphical development productivity tools for interaction with big data sources and targets.
  • Provides 800+ connectors and components to almost any data source with support for big data Hadoop distributions including Cloudera, Google BigQuery, Greenplum, Hortonworks and MapR.
  • Supports HDFS, Pig, HCatalog, HBase, Sqoop, Oozie and Hive.

Talend Open Studio for Big Data is provided under the Apache License v2 agreement terms.

Talend Open Studio for Big Data also provides a full palette of components for NoSQL connectivity, all under an open source Apache license.

Why Upgrade

Talend Platform for Big Data is a powerful and versatile big data integration and data quality solution that simplifies the loading, extraction and processing of large and diverse data sets so you can make more informed and timely decisions.

Talend Open Studio for Big Data
Free Open Source, Version 5.6.0 (Basic Big Data)

Talend makes the task of working with big data technologies easy.

Open Studio capabilities include:

  • Eclipse-Based Tooling
  • Hadoop 2.0 and YARN Support
  • Big Data ETL and ELT
  • HDFS, HBase, HCatalog, Hive, Pig, Sqoop Components
  • Job Designer
  • Apache License 2.0
  • Broadest NoSQL Support
  • Fully Open Source

Talend Enterprise Big Data
Free 30-Day Full Product Trial, Version 5.6.1 (Advanced Big Data Integration)

Open Studio capabilities, plus:

  • Design and Generate 100% MapReduce Code
  • Visual MapReduce Job Optimizer
  • Data Viewer for Hadoop
  • Collaborative Team Development
  • Compare Changes and Impact Analysis
  • Versioning
  • Data Lineage
  • Testing, Debugging and Tuning Tools
  • Advanced Management and Monitoring
  • Integrated Business Rules
  • Graphical Wizards

 

© 2016 Talend All rights reserved.
