Talend Big Data Certified Developer Exam

Talend certification exams are designed to be challenging to ensure that you have the skills to successfully implement quality projects. Preparation is critical to passing.

This certification exam covers the Talend Big Data Basics, Talend Big Data Advanced – Spark Batch, and Talend Big Data Advanced – Spark Streaming learning plans. The emphasis is on the Talend Big Data architecture, Hadoop ecosystems, Spark on YARN, Kafka, and Kerberos.

Certification exam details

Exam content is updated periodically. The number and difficulty of questions may change. The passing score is adjusted to maintain a consistent standard.

Duration: 65 minutes
Number of questions: 55
Passing score: 70%
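
As a rough planning aid, the figures above can be turned into a target number of correct answers and a per-question time budget. This is a minimal sketch that assumes every question carries equal weight with no partial credit, which this page does not state. The function names are illustrative only, and the figures themselves may change when the exam is updated.

```python
# Figures copied from the exam details above; they may change over time.
DURATION_MINUTES = 65
QUESTION_COUNT = 55
PASSING_PERCENT = 70


def min_correct_answers(questions: int, passing_percent: int) -> int:
    """Smallest number of correct answers that reaches the passing percentage.

    Assumption: all questions are weighted equally (not confirmed by the source).
    Integer arithmetic is used for the ceiling to avoid floating-point rounding.
    """
    return (questions * passing_percent + 99) // 100


def seconds_per_question(duration_minutes: int, questions: int) -> int:
    """Whole seconds available per question if time is split evenly."""
    return duration_minutes * 60 // questions


if __name__ == "__main__":
    print("Correct answers needed:", min_correct_answers(QUESTION_COUNT, PASSING_PERCENT))
    print("Seconds per question:", seconds_per_question(DURATION_MINUTES, QUESTION_COUNT))
```

Under these assumptions, 70% of 55 questions is 38.5, so at least 39 correct answers are needed, with a little over a minute available per question.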

Recommended experience

  • At least six months of experience using Talend products
  • General knowledge of Hadoop (HDFS, Hive, HBase, YARN), Spark, Kafka, Talend Big Data and cloud storage architectures, and Spark Universal
  • Experience with Talend Big Data solutions and Talend Studio, including metadata creation, configuration, and troubleshooting


Preparation

To prepare for this certification exam, Talend recommends:

  • Taking the Big Data Basics, Big Data Advanced – Spark Batch, and Big Data Advanced – Spark Streaming learning plans
  • Studying the training material in the Talend Big Data Certified Developer preparation training module
  • Reading the product documentation and Community Knowledge Base articles

For more information about the recommended learning plans, go to the Talend Academy Catalog.


Badge

After passing this certification exam, you are awarded the Talend Big Data Certified Developer badge. To learn more about the criteria to earn this badge, refer to the Talend Academy Badging program page.


Certification exam topics

Defining Big Data

  • Define Big Data
  • Describe the Hadoop ecosystem
  • Differentiate between Talend architecture and Big Data architecture
  • Describe cloud storage architecture in a Big Data context

Managing metadata in a Big Data environment

  • Manage Talend metadata stored in the repository
  • Describe the main elements of Hadoop cluster metadata
  • Create Hadoop cluster metadata
  • Create metadata connections to HBase, HDFS, YARN, and Hive

Managing data using Hive

  • Import data to a Hive table
  • Process data stored in a Hive table
  • Analyze Hive tables in the Profiling perspective
  • Manage Hive tables with Hive Warehouse Connector on CDP Public Cloud

Managing Spark in a Big Data environment

  • Describe the principal usage of Spark
  • Manage Spark Universal, including modes, environments, and distributions
  • Configure Spark Batch and Streaming Jobs
  • Troubleshoot Spark Jobs
  • Optimize Spark Jobs at runtime

Streaming with Talend Big Data

  • Describe the principal usage of Kafka
  • Use Kafka components in Streaming Jobs
  • Manage Big Data Streaming Jobs in Studio
  • Tune Streaming Jobs, including windowing, caching, and checkpointing

Configuring a Big Data environment

  • Manage Kerberos and security
  • Manage Apache Knox security with Cloudera Data Platform (CDP)

Managing data on Hadoop and cloud

  • Describe the principal usage of Hadoop (HDFS, HBase, and Hive) and cloud technologies
  • Export and import big data files to HDFS
  • Export and import big data files to the cloud
  • Export data to an HBase table

Managing Big Data Jobs

  • Differentiate between Big Data Batch and Big Data Streaming Jobs
  • Migrate and convert Jobs in a Big Data environment

Managing a Spark cluster

  • Define Spark on YARN
  • Describe the principal usage of YARN
  • Manage YARN, including client and cluster modes
  • Monitor Big Data Job executions
  • Use Studio to configure resource requests to YARN
