Real-time Big Data Platform

Leverage real-time and streaming analytics to get insights faster than ever.

Take advantage of real-time data streams

Unleash the potential of real-time and streaming analytics by leveraging the power of serverless Spark Streaming and machine learning. Talend Real-time Big Data integration generates native code that can run in your cloud, hybrid, or multi-cloud environment, so you can start working with Spark Streaming today and turn your batch data pipelines into real-time pipelines that deliver trusted, actionable insights.

Integrate data sources and run on the leading data platforms

Real-time Big Data Platform Features

API Development

  • Visual API Designer
  • Visual API Tester
  • Support for OAS / Swagger™ and RAML
  • Automatic API mocking
  • API testing automation
  • Hosted API documentation
  • API contract import into Talend Studio

License and Support

  • Subscription license with warranty and indemnification
  • 2 free Data Preparation and 2 free Data Stewardship licenses with any Talend subscription
  • Available as cloud service and downloadable software

Design and Productivity Tools

  • Generates native MapReduce and Spark batch code
  • Generates native Spark Streaming code
  • Visual mapping for complex JSON, XML, and EDI on Spark
  • Spark and MapReduce job designer
  • Serverless Spark processing through Databricks and Qubole
  • Dynamic distribution support
  • Hadoop job scheduler with YARN
  • Hadoop security for Kerberos
  • Ingestion, loading, and unloading data into a data lake
  • Graphical design environment
  • Team collaboration with shared repository
  • Continuous integration / Continuous delivery
  • Visual mapping for complex JSON, XML, and EDI
  • Audit, job compare, impact analysis, testing, debugging, and tuning
  • Metadata bridge for metadata import/export and centralized metadata management
  • Distant run and parallelization
  • Dynamic schema, re-usable joblets, and reference projects
  • Repository manager
  • ETL and ELT support
  • Wizards and interactive data viewer
  • Versioning
  • Change data capture (CDC)
  • Automatic documentation
  • Cloud Pipeline Designer

Data Quality, Self-Service, and Governance

  • Data profiling and analytics with graphical charts and drill-down data
  • Automated data standardization, cleansing, and rules enforcement
  • Data privacy with masking and encryption
  • Data quality portal with monitoring, reporting, and dashboards
  • Semantic discovery with automatic detection of patterns
  • Comprehensive survivorship
  • Data sampling
  • Enrichment, harmonization, fuzzy matching, and de-duplication
  • Data sampling, semantic discovery, and auto-profiling
  • Social curation with data sharing, ratings and endorsement
  • Cross reference between datasets and data pipelines for data lineage and impact analysis
  • Cross reference between datasets and data preparations for data lineage and impact analysis

Connectors

  • Cloud: Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, and more
  • Cloud Data Warehouses and Data Lakes: Snowflake, Amazon Redshift, Azure Data Lake Storage Gen2, Azure SQL Data Warehouse, Databricks Delta Lake, Google BigQuery
  • Supported big data distributions: Amazon EMR, Azure HDInsight, Cloudera, Google Dataproc, Hortonworks, MapR
  • Serverless: Cloudera Altus, Databricks, Qubole
  • Spark MLlib (classification, clustering, recommendation, regression)
  • NoSQL: Cassandra, Couchbase, DynamoDB, MongoDB, Neo4j, and more
  • RDBMS: Oracle, Teradata, Microsoft SQL Server, and more
  • SaaS: Marketo, Salesforce, NetSuite, and more
  • Packaged Apps: SAP, Microsoft Dynamics, SugarCRM, and more
  • Technologies: Dropbox, Box, SMTP, FTP/SFTP, LDAP, and more
  • Optional 3rd-party address validation services

Components

  • Hadoop components: HDFS, HBase, Hive, Pig, Sqoop
  • File management: open, move, compress, decompress without scripting
  • Control and orchestrate data flows and data integrations with master jobs
  • Map, aggregate, sort, enrich, and merge data
  • Standards support: REST, SOAP, OpenID Connect, OAuth, SAML, WSDL, Swagger™, and more
  • Transports/protocols support: HTTP, JMS, MQTT, AMQP, UDP, Apache Kafka, WebSphere MQ, and more
  • Enterprise Integration Patterns for service mediation, routing, and messaging

Data Preparation and Stewardship

  • 2 free licenses with subscription
  • Import, export, and combine data from databases and from Excel, CSV, Parquet, and Avro files
  • Export to Tableau
  • Self-service on-demand access to sanctioned datasets
  • Share data preparations and datasets
  • Operationalize preparations into any data, big data or cloud integration flow
  • Run preparations on Apache Beam*
  • Auto-discovery, standardization, auto-profiling, smart suggestions, and data visualization
  • Customization of semantic type for auto-profiling and standardization
  • Smart and selective sampling and full-runs
  • Data tracking and masking with role-based security
  • Cleansing and enrichment functions
  • Data Stewardship App for data curation and certification
  • Define data models and data semantics, profile data accordingly, and define and apply rules
  • Merge and match data, resolve data errors, and arbitrate on data (classification and certification)
  • Orchestrate and collaborate on activities in campaigns
  • Define user roles, workflows and priorities, assign and delegate tasks, tag and comment
  • Embed governance and stewardship in data integration flows and manage rejects
  • Embed human certification and error resolution into MDM processes
  • Make matching decisions that cannot be processed automatically
  • De-duplicate data at scale with machine learning
  • Audit and track data error resolution actions. Monitor progress of campaigns. Undo/redo based on business needs

Management and Monitoring

  • High availability, load balancing, and failover for jobs
  • Deployment manager and team collaboration
  • Manage users, groups, roles, projects, and licenses
  • Manage execution engines
  • Single Sign-On (SSO) integration with several SSO providers
  • Execution plan, time, and event-based scheduler for jobs
  • Checkpoints and error recovery
  • Context management (dev, QA, prod)
  • Execution logs collection and display
  • Optional Admin user add-on*
  • Engine clusters for jobs*
  • Static IP addresses*
  • Job execution logs history
  • Environments (2 environments for Entry products, unlimited for Platforms)
  • Cloud Security Information and Event Management (SIEM), Intrusion Detection System (IDS), Intrusion Prevention System (IPS) and Web Application Firewall (WAF)

Big Data Quality

  • Data cleansing, profiling, masking, parsing, and matching on Spark and Hadoop
  • Machine learning for data matching and deduplication
  • Support for Cloudera Navigator and Apache Atlas
  • HDFS file profiling

Services Management

  • System monitoring: JMX / Jolokia
  • Runtime engine (Talend Runtime on-premises, Remote Engine cloud)
  • Containerized service generation
  • Access to live statistics of message flow activity
  • Integrated artifact repository
  • Interface to deploy data services and routes
  • Identity management and authorization**

Agile Application Integration

  • Drag-and-drop route, data, and web/REST services creation and simulation
  • WS policy-based web services security
  • Deliver and route messages and events based on Enterprise Integration Patterns (EIPs)
  • Reliable messaging backbone based on ActiveMQ
  • Service locator and registry**
  • Command line and scripting tools
  • XML key management specification (XKMS)**
  • Build and deploy as an OSGi feature
  • Deploy and manage a microservice
  • Build a microservice

Advanced Data Profiling

  • Fraud pattern detection using Benford's Law
  • Advanced statistics with indicator thresholds
  • Column set analysis
  • Advanced matching analysis
  • Time column correlation analysis
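
To give a feel for what Benford's Law-based fraud pattern detection involves, here is a minimal Python sketch. It is not Talend's implementation and uses no Talend API; the function names (`benford_expected`, `leading_digit`, `benford_chi_square`) are hypothetical and chosen for illustration only. Benford's Law predicts that the leading digit d of many naturally occurring figures appears with frequency log10(1 + 1/d), and a chi-square statistic is one common way to measure how far a column of values departs from that prediction.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(value):
    """Return the first significant digit of a number, or None for zero."""
    value = abs(value)
    if value == 0:
        return None
    # Scientific notation always places the first significant digit first.
    return int(f"{value:.16e}"[0])


def benford_chi_square(values):
    """Chi-square distance between observed leading digits and Benford's Law.

    Larger values mean a worse fit. With 8 degrees of freedom, a statistic
    above roughly 15.51 is significant at the 5% level.
    """
    digits = [d for d in map(leading_digit, values) if d is not None]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_expected().items()
    )


if __name__ == "__main__":
    # Powers of two are a classic Benford-conforming sequence.
    conforming = [2 ** k for k in range(1000)]
    # Evenly spread three-digit amounts are not.
    suspicious = list(range(100, 1000))
    print("powers of two:", round(benford_chi_square(conforming), 3))
    print("uniform amounts:", round(benford_chi_square(suspicious), 3))
```

A high statistic only flags a column for review. Many legitimate datasets, such as amounts constrained to a narrow range, do not follow Benford's Law, so the threshold here is an assumption to tune rather than a verdict.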

Keep your data integration projects under budget

Flexible

Keep costs predictable and resources flexible with annual or monthly subscriptions.

Predictable

Talend charges per user, not by data volume or number of connectors.

Simple

50% lower total cost of ownership with a single solution running in the cloud.

The digital context creates an increasing demand for product customization. To meet these challenges and innovate, we have deployed new tailor-made digital platforms for L’Oréal researchers, with Talend at the heart, to facilitate the management of more than 50 million pieces of data per day.

Philippe Benivay, IS Experimental Data Intelligence - L'Oréal R&I IT Team

With Talend capturing data at high speed from hundreds of data sources, we had fifty business projects in production for our financial, logistics, SCM, and CRM business units in less than six months.

Axel Frank, Solution Architect BI Platform

Ready to get started with Talend?