Big Data Platform

Turn big data into trusted insights.


Get up and running fast with the leading open source big data tool.

Talend Big Data Platform simplifies complex integrations to take advantage of Apache Spark, Databricks, Qubole, AWS, Microsoft Azure, Snowflake, Google Cloud Platform, and NoSQL, and provides integrated data quality so your enterprise can turn big data into trusted insights. Leverage the full power and scale of your big data framework with the leading data integration and data quality platform built on Spark for cloud, hybrid and multi-cloud architectures.


Big Data Platform Features

License and Support

  • Subscription license with warranty and indemnification
  • 2 free Data Preparation and 2 free Data Stewardship licenses with any Talend subscription
  • Available as cloud service and downloadable software

Design and Productivity Tools

  • Generates native MapReduce and Spark batch code
  • Visual mapping for complex JSON, XML, and EDI on Spark
  • Spark and MapReduce job designer
  • Serverless Spark processing through Databricks and Qubole
  • Dynamic distribution support
  • Hadoop job scheduler with YARN
  • Hadoop security for Kerberos
  • Ingestion, loading, and unloading data into a data lake
  • Graphical design environment
  • Team collaboration with shared repository
  • Continuous integration / Continuous delivery
  • Visual mapping for complex JSON, XML, and EDI
  • Audit, job compare, impact analysis, testing, debugging, and tuning
  • Metadata bridge for metadata import/export and centralized metadata management
  • Distant run and parallelization
  • Dynamic schema, re-usable joblets, and reference projects
  • Repository manager
  • ETL and ELT support
  • Wizards and interactive data viewer
  • Versioning
  • Change data capture (CDC)
  • Automatic documentation
  • Customizable assessment
  • Pattern library
  • Cloud Pipeline Designer

Data Quality and Governance

  • Data profiling and analytics with graphical charts and drill-down data
  • Automated data standardization, cleansing, and rules enforcement
  • Data privacy with masking and encryption
  • Data quality portal with monitoring, reporting, and dashboards
  • Semantic discovery with automatic detection of patterns
  • Comprehensive survivorship
  • Data sampling
  • Enrichment, harmonization, fuzzy matching, and de-duplication

Connectors

  • Cloud: Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, and more
  • Supported big data distributions: Amazon EMR, Azure HDInsight, Cloudera, Google Dataproc, Hortonworks, MapR
  • Serverless: Cloudera Altus, Databricks, Qubole
  • Spark MLlib (classification, clustering, recommendation, regression)
  • NoSQL: Cassandra, Couchbase, DynamoDB, MongoDB, Neo4j, and more
  • RDBMS: Oracle, Teradata, Microsoft SQL Server, and more
  • SaaS: Marketo, Salesforce, NetSuite, and more
  • Packaged Apps: SAP, Microsoft Dynamics, SugarCRM, and more
  • Technologies: Dropbox, Box, SMTP, FTP/SFTP, LDAP, and more
  • Optional 3rd-party address validation services

Components

  • Hadoop components: HDFS, HBase, Hive, Pig, Sqoop
  • File management: open, move, compress, decompress without scripting
  • Control and orchestrate data flows and data integrations with master jobs
  • Map, aggregate, sort, enrich, and merge data

Data Preparation and Stewardship

  • 2 free licenses with subscription
  • Import, export, and combine data from any database, Excel or CSV file
  • Import, export, and combine CSV, Parquet, and Avro files**
  • Export to Tableau
  • Self-service on-demand access to sanctioned datasets
  • Share data preparations and datasets
  • Operationalize preparations into any data or big data integration flow
  • Operationalize preparations into any cloud integration flow
  • Run preparations on Apache Beam*
  • Auto-discovery, standardization, auto-profiling, smart suggestions, and data visualization
  • Customization of semantic type for auto-profiling and standardization
  • Smart and selective sampling and full-runs
  • Data tracking and masking with role-based security
  • Cleansing and enrichment functions
  • Data Stewardship App for data curation and certification
  • Define data models and data semantics, profile data accordingly, and define and apply rules
  • Merge and match data, resolve data errors, and arbitrate on data (classification and certification)
  • Orchestrate and collaborate on activities in campaigns
  • Define user roles, workflows and priorities, assign and delegate tasks, tag and comment
  • Embed governance and stewardship in data integration flows and manage rejects
  • Embed human certification and error resolution into MDM processes
  • Make matching decisions that cannot be processed automatically
  • De-duplicate data at scale with machine learning
  • Audit and track data error resolution actions. Monitor progress of campaigns. Undo/redo based on business needs

Management and Monitoring

  • High availability, load balancing, failover for jobs
  • Deployment manager and team collaboration
  • Manage users, groups, roles, projects, and licenses
  • Manage execution engines
  • Single Sign-On (SSO) integration with several SSO providers
  • Execution plan, time, and event-based scheduler for jobs
  • Checkpoints and error recovery
  • Context management (dev, QA, prod)
  • Log collection and display
  • Optional Admin user add-on*
  • Engine clusters for jobs*
  • Static IP addresses*
  • Job execution log history (2 months for Entry products, 3 months for Platforms)*
  • Environments (2 for Entry products, unlimited for Platforms)*
  • Cloud Security Information and Event Management (SIEM), Intrusion Detection System (IDS), Intrusion Prevention System (IPS) and Web Application Firewall (WAF)

Big Data Quality

  • Data cleansing, profiling, masking, parsing, and matching on Spark and Hadoop
  • Machine learning for data matching and deduplication
  • Support for Cloudera Navigator and Apache Atlas
  • HDFS file profiling

Advanced Data Profiling

  • Fraud pattern detection using Benford's Law
  • Advanced statistics with indicator thresholds
  • Column set analysis
  • Advanced matching analysis
  • Time column correlation analysis
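
To make the Benford's Law item above concrete: the law predicts that in many naturally occurring numeric datasets, the first significant digit d appears with frequency log10(1 + 1/d), so a large gap between observed and expected first-digit frequencies can flag data worth a closer look. The following is a minimal, generic Python sketch of that check. It is not Talend's implementation, and the function names (benford_expected, leading_digit, first_digit_frequencies, max_deviation) are hypothetical names chosen for illustration.

```python
import math
from collections import Counter


def benford_expected(digit: int) -> float:
    """Expected share of values whose first significant digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


def leading_digit(value) -> int:
    """Return the first non-zero digit in the decimal text of a number."""
    for ch in str(abs(value)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("no leading digit found")


def first_digit_frequencies(values) -> dict:
    """Observed share of each leading digit 1-9, ignoring zeros."""
    digits = [leading_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one non-zero value")
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}


def max_deviation(values) -> float:
    """Largest absolute gap between observed and Benford-expected shares."""
    observed = first_digit_frequencies(values)
    return max(abs(observed[d] - benford_expected(d)) for d in range(1, 10))


if __name__ == "__main__":
    # Powers of two are a classic sequence that tracks Benford's Law closely.
    powers = [2 ** n for n in range(1000)]
    # Consecutive integers have evenly spread leading digits, so they do not.
    sequential = list(range(1, 1000))
    print("powers of two:", round(max_deviation(powers), 4))
    print("sequential ids:", round(max_deviation(sequential), 4))
```

A high deviation is only a signal for review, not proof of fraud: identifiers, prices clustered near a threshold, and other bounded ranges legitimately depart from the distribution.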

Keep your data integration projects under budget.

Flexible

Keep costs predictable and resources flexible with annual or monthly subscriptions.

Predictable

Talend charges per user, not by data volume or number of connectors.

Simple

50% lower total cost of ownership with a single solution running in the cloud.


