Talend Integration Suite

Talend Integration Suite is the first open source enterprise data integration solution, designed to support multi-user development, and to scale to the highest levels of data volumes and process complexity.

Talend Integration Suite is a subscription service that extends award-winning Talend Open Studio with professional-grade technical support and additional features to facilitate the work of large teams and industrialize enterprise-scale deployments.

The Talend Studio is the core of Talend Integration Suite. Its three main applications, Business Modeler, Job Designer, and Metadata Manager, constitute the primary work environment of business users and integration process developers for data integration, data migration or data synchronization jobs.

Not sure if you need Talend Open Studio or Talend Integration Suite for data integration, data migration or data synchronization? Check out the features comparison matrix.

Want to learn more about Talend Integration Suite for data integration, data migration or data synchronization? Then watch an online demo or check out our users' testimonials.

Teamwork and consolidation of development

Talend Integration Suite's Shared Metadata Manager is designed to consolidate all project information and enterprise metadata in a centralized repository shared by all stakeholders in the integration processes: business users, job developers, and IT operations staff, all of whom can access the same, single version of the truth. This shared repository facilitates collaboration between team members by allowing them to store and share their business models, data integration jobs, and metadata in an industry-standard source control system (SVN).

This promotes the reuse of objects and code, and it facilitates the design of development best practices that all developers can then leverage when building data integration, data migration and data synchronization jobs.

The Shared Metadata Manager features advanced collaboration capabilities that include object-level check-in and check-out. Users, roles, permissions & privileges are managed centrally through the web-based Administration Center, which supports LDAP-compliant systems, such as Active Directory.

Changes to the Jobs made by different stakeholders can be identified at a glance using the Job comparison feature which provides a detailed analysis of differences between two versions of the same Job or between two different Jobs.

Industrialization

As the architecture of enterprise processes is often complex and takes a long time to implement, Talend Integration Suite includes automation features to facilitate the development and implementation of these processes.

  • Numerous wizards help automate connections to heterogeneous sources, including enterprise-scale platforms such as SAP, or the most versatile sources such as Copybook (EBCDIC) formats or Web Services.
  • Core models of processes can be developed in Talend Joblets that can be reused to facilitate the industrialization of data integration, data migration and data synchronization open source processes.
  • The Reference Projects help avoid duplication (copy-paste) of existing projects. "Slave" projects are linked to a master project by reference and thus help leverage developments and reuse proven processes.
  • The parallelization feature helps make the most of enterprise server capabilities and the number of processors available, greatly improving processing time of data integration, data migration and data synchronization open source jobs.
  • Advanced capabilities of impact analysis and data lineage help users know the path followed by data through the information system and the impact of a change on data structure or transformation processes.
  • The Change Data Capture (Publish & Subscribe) feature quickly identifies and captures data that has been added to, updated in, or removed from database tables and makes these changes (and only the changes) available to subscribers.
  • Additional Business Rule features let users incorporate JBoss Rules Governor (BRMS) for centralized definition and administration of JBoss-compliant business rules.
  • The Project Audit tool provides qualitative and quantitative metrics calculated against best practices that help optimize open source data integration, data migration and data synchronization projects.

Managing complex deployments

Talend Integration Suite incorporates powerful capabilities for managing all data integration deployments through a central console, the Talend Administration Center: from the simplest Jobs to the most complex ones, from single open source Jobs to thousands of Jobs, and with data volumes ranging from a few records to terabytes of data. Built on Web 2.0 technologies, including Ajax, the Administration Center web application offers a user-friendly interface and fast-refresh capability.

  • Job Conductor coordinates and schedules the execution of open source data integration, data migration or data synchronization jobs. It provides a centralized execution interface from which all jobs can be started upon request or according to time-based schedules. Job Conductor automatically maps available execution servers and constantly monitors their resources to intelligently load balance the execution of jobs.
  • In Professional Edition and higher, Job Conductor Advanced adds event-based scheduling capabilities for real-time data integration including real-time execution reporting and statistics.
  • Grid Conductor optimizes the scalability and availability of the integration processes by ensuring an optimal use of the execution grid, automatically distributing jobs across the grid.
    With dynamic load balancing that constantly monitors the resources available on the execution servers, and an intelligent distribution of jobs, Grid Conductor guarantees that all jobs execute smoothly at all times and fully leverage available resources, removing bottlenecks created by the traditional single-server approach. This alleviates any concerns related to resource preemption when a large number of jobs run concurrently, or when non-dedicated servers are used. Grid Conductor also provides automatic fail-over in the event an execution resource becomes unavailable.
  • CPU Balancer provides the highest degree of parallelization for the integration processes. It distributes jobs between all processing resources available, and parallelizes their execution between CPUs and cores. It also provides synchronization and wait points for open source data integration, data migration and data synchronization jobs.
  • Distant Run enables the remote execution of open source jobs on specified systems. This can be extremely useful for testing jobs in the same configuration as the production environment, or on different operating systems, or for simply running jobs upon request on specific systems, without going through complex deployment procedures.
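
The load balancing and fail-over behavior described for Job Conductor and Grid Conductor can be pictured with a small sketch. This is not Talend's implementation: the `ExecutionServer` class, the `dispatch` and `fail_over` functions, and the "running jobs divided by capacity" load metric are all hypothetical, chosen only to illustrate the idea of sending each job to the least-loaded available server and moving jobs away from a server that becomes unavailable.

```python
from dataclasses import dataclass, field


@dataclass
class ExecutionServer:
    """Hypothetical execution server; load is modeled as running jobs / capacity."""
    name: str
    capacity: int                      # assumed maximum number of concurrent jobs
    available: bool = True
    running: list = field(default_factory=list)

    @property
    def load(self) -> float:
        return len(self.running) / self.capacity


def dispatch(job: str, servers: list) -> ExecutionServer:
    """Assign the job to the least-loaded available server that has spare capacity."""
    candidates = [s for s in servers if s.available and len(s.running) < s.capacity]
    if not candidates:
        raise RuntimeError("no execution server available")
    target = min(candidates, key=lambda s: s.load)
    target.running.append(job)
    return target


def fail_over(failed: ExecutionServer, servers: list) -> dict:
    """Mark a server unavailable and re-dispatch its jobs to the remaining servers."""
    failed.available = False
    orphaned, failed.running = failed.running, []
    return {job: dispatch(job, servers).name for job in orphaned}


if __name__ == "__main__":
    grid = [ExecutionServer("srv-a", 2), ExecutionServer("srv-b", 4)]
    for job in ["load_customers", "sync_orders", "migrate_invoices"]:
        print(job, "->", dispatch(job, grid).name)
    print("moved after fail-over:", fail_over(grid[0], grid))
```

A production scheduler would monitor real resource figures (CPU, memory, I/O) rather than a job count; the count is used here only to keep the sketch self-contained.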

Execution monitoring

Talend Integration Suite provides advanced monitoring capabilities that enable centralized supervision of all integration processes.

  • The Activity Monitoring Console monitors job events (successes, failures, warnings, etc.), execution times and data volumes through a single console, which is fully integrated into Talend Integration Suite Studio for open source data integration, data migration and data synchronization.
    With customizable indicators and thresholds, the Activity Monitoring Console provides a high-level view that can drill down into individual jobs.
  • The Activity Monitoring Dashboard (included in the Talend Administration Center) is a more advanced, Web-based version of the Activity Monitoring Console that can be accessed easily through a Web browser. The Dashboard provides gauges and status indicators, as well as a business-model orientation, enabling business stakeholders to view both the current status and historical data of any data integration, data migration or data synchronization job. Data management processes can report their execution status and performance in real time, so that real-time statistics and status information are presented in the Activity Monitoring Dashboard. This feature is essential for IT Operations teams who need real-time visibility into the progress of jobs, without having to wait for the end of the full execution. It has been designed to use minimum bandwidth and resources, and it can be turned off if needed.

Available in different editions

Talend Integration Suite is available in various Editions that cover all organizations' needs:

  • Talend Integration Suite Team Edition provides all basic collaborative features and scheduling functions.
  • Talend Integration Suite Professional Edition extends the Team Edition with advanced scheduling and extra execution features, error recovery management and real-time capabilities.
  • Talend Integration Suite RTx extends Talend Integration Suite Professional Edition with Service Oriented Architecture management features, as well as additional real-time capabilities to focus on intensive real-time operational integration needs.
  • Talend Integration Suite Enterprise Edition includes powerful additional capabilities such as high availability and grid management among other enterprise-grade functionalities.
  • Talend Integration Suite MPx extends Talend Integration Suite Enterprise Edition with massively parallel execution and large volume handling capabilities.

To get a clear picture of which feature belongs to which edition, please check the Feature Comparison matrix.

Data Quality option

Offered as an option, the data quality features include an in-depth data mining and profiling tool to detect non-compliant and poor-quality data as well as a data cleansing tool that helps improve data for full data governance.

All data quality processing can be deeply embedded into open source data integration processes, making data quality an integral part of the processing of data. Since all Talend products are part of the same unified platform, the data quality option is seamlessly integrated with data integration, providing users with consistent ergonomics, a fast learning curve and a high level of reusability. This offers unrivaled benefits in terms of resource optimization & utilization, and project consistency. Talend Data Quality is the first open source data quality solution with enterprise-grade features, addressing the challenge of data quality.

Technical support

Talend's technical support centers provide prompt, effective and high-quality support services to Talend Integration Suite subscribers. Subscribers benefit from the knowledge of Talend's technical experts, who are directly connected with Talend's Research & Development organization. View more information about Talend's Technical Support.

Massively Parallel

Based on Talend's award-winning enterprise data integration technology, Talend Integration Suite MPx is a highly scalable, massively parallel data integration platform that scales to the highest volumes of data.

Geared toward enterprises that need to process extreme data volumes in ever-tightening time windows, Talend Integration Suite MPx exceeds the most demanding requirements and surpasses all existing performance benchmarks.

Request more information on Talend Integration Suite MPx.

Want to learn more about Talend Integration Suite MPx? Then watch an online demo or check out our users' testimonials.

FileScale Technology

Talend Integration Suite MPx features the unique FileScale technology which leverages the execution server hardware architecture and maximizes the performance of low-level sort algorithms.

The FileScale technology works in bulk mode on (very) large files. It takes full advantage of the execution architecture as it is not restricted by the JVM or execution engine limitations typical of traditional data integration architectures.

FileScale technology sorts and transforms data using innovative high-performance mathematical algorithms for data processing. It leverages the MapReduce architecture to automatically break down any data processing operation into a number of granular processes.
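
The idea of breaking a sort into granular processes can be sketched as a chunked sort followed by a merge. This is a generic illustration of the map-then-reduce pattern, not the FileScale algorithm itself, whose internals are not described here. The function names `split_into_chunks` and `parallel_sort` and the default chunk size are assumptions. Threads are used only to keep the example self-contained; in CPython they will not speed up a CPU-bound sort.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, chunk_size):
    """Break the input into fixed-size chunks (the granular sub-tasks)."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def parallel_sort(records, chunk_size=4, workers=4):
    """Sort each chunk independently (map), then merge the sorted runs (reduce)."""
    chunks = split_into_chunks(records, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_runs = list(pool.map(sorted, chunks))
    # heapq.merge combines already-sorted runs into one sorted sequence.
    return list(heapq.merge(*sorted_runs))


if __name__ == "__main__":
    print(parallel_sort([5, 3, 9, 1, 7, 2, 8], chunk_size=3))
```

For very large files, each sorted run would typically be written to disk and merged as a stream instead of being held in memory.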

Massively Parallel Processing

The challenges involved in processing large amounts of data are similar to those of any large-scale project. Typically, the best approach is to divide the task into as many subtasks as possible and distribute them over all available resources in order to process them in parallel.

Similarly, Talend Integration Suite MPx benefits from multi-server, multi-CPU, and multi-core architectures where code and separate sub-processes can be executed in parallel to make the most of the architecture. This massively parallel feature maximizes enterprise server capabilities and the number of processors available, greatly improving processing time.

Talend Integration Suite MPx also automates the breakdown of data sets into many parallel streams, for further acceleration of the processing, leveraging the massively parallel loaders of leading RDBMS engines.
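
Splitting a data set into parallel streams can be illustrated as follows. The sketch assumes a simple round-robin partitioning and a placeholder `load_stream` step standing in for a bulk loader; the names `partition`, `load_stream` and `parallel_load` are hypothetical and do not correspond to Talend components.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, stream_count):
    """Deal rows round-robin into stream_count independent streams."""
    return [rows[i::stream_count] for i in range(stream_count)]


def load_stream(stream):
    """Placeholder per-stream transform-and-load step; returns the rows handled."""
    transformed = [str(row).upper() for row in stream]
    return len(transformed)


def parallel_load(rows, stream_count=3):
    """Process every stream concurrently and return the row count per stream."""
    streams = partition(rows, stream_count)
    with ThreadPoolExecutor(max_workers=stream_count) as pool:
        return list(pool.map(load_stream, streams))


if __name__ == "__main__":
    print(parallel_load(list("abcdefg"), stream_count=3))
```

Round-robin keeps the streams evenly sized; partitioning by key range or hash is a common alternative when rows with the same key must stay in the same stream.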

In addition, Talend Integration Suite MPx includes support for Hadoop's distributed file system (HDFS) that provides high throughput access to application data, as well as Hadoop's data warehouse infrastructure (Hive) that provides data summarization and ad hoc querying.

Based on Talend Integration Suite

Based on Talend Integration Suite, Talend Integration Suite MPx includes its core modules - Business Modeler, Job Designer, and Metadata Manager - as well as the teamwork, development consolidation, industrialization and monitoring-oriented features of the leading enterprise data integration platform.

Real Time Integration

Based on Talend's award-winning enterprise data integration technology, Talend Integration Suite RTx is the real-time data integration platform of choice for enterprise application integration needs.

Today's companies live in an on-demand world, where data that is a few hours old is already obsolete. Using a low-latency solution to process data in real time, stakeholders can be better informed and thus make better business-critical decisions.

Request more information on Talend Integration Suite RTx.

Want to learn more about Talend Integration Suite RTx? Then watch an online demo or check out our users' testimonials.

Service-oriented architecture

Talend Integration Suite RTx provides support for:

  • Data Integration Services: triggering or integrating data integration processes in real-time as the need arises, using Web Services.
  • Data Services: providing an easy and immediate access service to critical data that is usually difficult to access using standard protocols.

The administration console of Talend Integration Suite RTx offers a web-based and fully graphical environment to expose one or more data integration jobs as services (Web Services), enabling their automatic deployment in and across heterogeneous applications and systems using SOAP binding (RPC or document-based). A dedicated WSDL wizard helps generate WSDL descriptors to expose Jobs as Web Services and find matching UDDI entries when consuming Web Services.

In addition, Talend Integration Suite RTx provides a native export to JBoss ESB for full interoperability between applications.

The SOA Manager also features advanced management of incoming requests, based on an optimized pooling and queueing system. The user-defined pool of active services handles a number of requests in real time, while a queue manager buffers the additional requests for asynchronous processing.
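
The pooling and queueing principle can be sketched with a fixed pool of workers fed from a queue. This is a generic pattern written with the Python standard library, not the SOA Manager's implementation; the `serve` function, the `double` handler and the pool size are hypothetical, and requests are assumed to be unique and hashable so that they can key the result dictionary.

```python
import queue
import threading


def double(request):
    """Stand-in for the service that handles one request."""
    return request * 2


def serve(requests, pool_size=2, handler=double):
    """Process requests with a fixed pool of workers fed from a FIFO queue."""
    pending = queue.Queue()            # buffers requests the pool cannot take yet
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            request = pending.get()
            if request is None:        # sentinel: no more work for this worker
                pending.task_done()
                return
            outcome = handler(request)
            with lock:
                results[request] = outcome
            pending.task_done()

    workers = [threading.Thread(target=worker) for _ in range(pool_size)]
    for w in workers:
        w.start()
    for request in requests:
        pending.put(request)
    for _ in workers:                  # one sentinel per worker
        pending.put(None)
    pending.join()
    for w in workers:
        w.join()
    return results


if __name__ == "__main__":
    print(serve([1, 2, 3, 4, 5], pool_size=2))
```

The pool size caps how many requests are handled at once, while the queue absorbs bursts so that callers are not rejected when every worker is busy.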

Event-Based Execution

Talend Integration Suite RTx offers real-time, event-centric task execution triggering based on Web Services invocation or on direct execution.

The event listener allows the process executions to trigger on an on-demand basis, as a message arrives through JMS-compliant Message-Oriented Middleware or through an Enterprise Service Bus (ESB), via RPC, HTTP, socket listeners and using "wait for" types of condition.

Latencies and volumes are configurable: processing can run on a trickle-feed basis ("as it comes"), in small batches (customizable batch size) for near-real-time processing, on a timed basis, or in a combination of all modes.
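
The difference between a trickle feed and small batches comes down to how many events are grouped before processing. The generator below is a minimal sketch of that grouping; `micro_batches` is a hypothetical helper, not a Talend API, and a batch size of 1 is used to stand in for the "as it comes" mode.

```python
from itertools import islice


def micro_batches(events, batch_size):
    """Yield lists of up to batch_size events; batch_size=1 acts as a trickle feed."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    for batch in micro_batches(range(7), batch_size=3):
        print(batch)
```

Larger batches reduce per-event overhead at the cost of latency, which is the trade-off a configurable batch size exposes.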

Talend Integration Suite RTx provides multi-instance support. At runtime, the various Job executions can be distributed over multiple processors and servers, making the most of load balancing and grid architecture and delivering the highest possible execution performance.

Real-Time Connectors

Talend Integration Suite RTx offers multiple connectors dedicated to real-time data processing. The Web Service component helps data integration processes consume any Web Service using SOAP or REST protocols.

Talend Integration Suite RTx provides native support for asynchronous communications via Message-Oriented Middleware (MOM). It also integrates with JMS-based messaging systems to enable event-driven architecture (EDA) and to support service-oriented architecture (SOA).

Connectors to the real-time APIs of business applications include Salesforce.com, SAP, Microsoft Dynamics, etc. Other connectors enable data integration with:

  • RDBMS
  • Mainframe or legacy systems
  • Files
  • LDAP, email, HTTP, FTP, etc.
  • Message queues (MOM) & ESB

Change Data Capture

Capturing changes (CDC) reduces the flow of data between systems and thus helps reduce processing time. The Change Data Capture feature identifies and captures real-time data that has been added to, updated in, or removed from database tables. This transactional functionality is natively available for major RDBMS and via the Attunity integration for mainframes and legacy systems.

The Publish & Subscribe mode makes these changes (and only the changes) available to subscribers on a continuous feed basis or in a digest mode, depending on the needs of the subscribing application. This mode supports multiple latencies and multiple types of consumers.
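
The principle of publishing only the changes can be sketched by comparing two snapshots of a table keyed by primary key. Snapshot comparison is used here purely for illustration and is an assumption: the text does not say how changes are detected, and production CDC commonly relies on triggers or transaction logs instead. The names `capture_changes` and `ChangePublisher` are hypothetical.

```python
def capture_changes(before, after):
    """Compare two snapshots keyed by primary key and return only the changes."""
    changes = []
    for key, row in after.items():
        if key not in before:
            changes.append(("insert", key, row))
        elif before[key] != row:
            changes.append(("update", key, row))
    for key, row in before.items():
        if key not in after:
            changes.append(("delete", key, row))
    return changes


class ChangePublisher:
    """Delivers each captured change to every registered subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, changes):
        for change in changes:
            for callback in self._subscribers:
                callback(change)


if __name__ == "__main__":
    before = {1: {"name": "Ann"}, 2: {"name": "Bob"}}
    after = {2: {"name": "Rob"}, 3: {"name": "Cy"}}
    publisher = ChangePublisher()
    publisher.subscribe(print)
    publisher.publish(capture_changes(before, after))
```

A digest mode could be approximated by collecting the changes and calling `publish` once per interval rather than after every capture.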

Based on Talend Integration Suite

Based on Talend Integration Suite, Talend Integration Suite RTx includes its core modules - Business Modeler, Job Designer, and Metadata Manager - as well as the teamwork, development consolidation, industrialization and monitoring-oriented features of the leading enterprise data integration platform.