Talend Big Data v7 Certified Developer Exam

Talend certification exams are designed to be challenging to ensure that you have the skills to successfully implement quality Talend Big Data projects. Preparation is critical to passing.

Certification Exam Details

Number of questions: 65
Exam duration: 65 minutes

Types of questions:

  • Multiple choice
  • Multiple response

Validity: Our certification program aligns with our major product releases. As newer releases supersede the version you were certified on, the value of the certification decreases over time.

Ready to register for your exam? Talend Exam

Recommended Experience

  • General knowledge of Hadoop (HDFS, MapReduce v2, Hive, Pig, HBase, Sqoop, YARN), Spark, Kafka, the Talend Big Data architecture, and Kerberos
  • Experience with Talend Big Data 7.x solutions and Talend Studio, including metadata creation, configuration, and troubleshooting

Preparation

To prepare for this certification exam, Talend recommends that you:

  • Take the Big Data Basics and Big Data Advanced courses
  • Study the training material in detail
  • Acquire experience by using the product for at least six months
  • Read the product documentation

Certification Exam Topics

  • Big Data in context
    • Define Big Data
    • Understand the Hadoop ecosystem 
    • Understand cloud storage architecture in a Big Data context
  • Basic concepts
    • Define Talend metadata stored in the repository
    • Understand the main elements of Hadoop cluster metadata
    • Create Hadoop cluster metadata
    • Create additional metadata (Hadoop Distributed File System (HDFS), YARN, Hive)
  • Read and write data (HDFS, cloud)
    • Understand HDFS
    • Use Studio components to import Big Data files to and export them from HDFS
    • Use Studio components to import Big Data files to and export them from the cloud
  • HBase
    • Understand HBase principles and usage
    • Use Studio components to connect to HBase
    • Use Studio components to export data to an HBase table
  • Sqoop
    • Understand Sqoop principles and usage
    • Create database metadata for Sqoop
    • Use Studio components to import tables to HDFS with Sqoop
  • Hive
    • Understand Hive principles and usage
    • Create database metadata for Hive
    • Use Studio components to import data to a Hive table
  • Standard, batch, and Streaming Jobs
    • Understand the differences between standard, batch, and Streaming Jobs
    • Know when to use a standard, batch, or Streaming Job
    • Migrate Jobs
  • Hadoop
    • Use Studio components to process data stored in a Hive table
    • Analyze Hive tables in the Profiling perspective
    • Understand Pig principles and usage
    • Use Studio components to process data with Pig
    • Understand MapReduce Jobs in Studio
    • Create a Big Data batch MapReduce Job to process data in HDFS
  • Spark
    • Understand Spark principles and usage
    • Set up Spark batch Jobs
    • Set up Spark Streaming Jobs
    • Troubleshoot Spark Jobs
    • Optimize Spark Jobs at runtime
  • YARN
    • Understand YARN principles and usage
    • Tune YARN
    • Monitor Job execution with web UIs
    • Use Studio to configure resource requests to YARN
  • Kafka
    • Understand Kafka principles and usage
    • Use Studio components to produce data in a Kafka topic
    • Use Studio components to consume data from a Kafka topic
  • Big Data Streaming Jobs
    • Understand Big Data Streaming Jobs in Studio
    • Tune Streaming Jobs
  • Setting up a Big Data environment
    • The Talend architecture and Big Data
    • Kerberos and security

Sample Questions

  1. How many containers does YARN allocate to a MapReduce application made up of two map tasks and one reduce task?
     Choose one.
     a. Three
     b. Four
     c. One
     d. Five

  2. What is HDFS?
     Choose one.
     a. A data warehouse infrastructure tool for processing structured data in Hadoop
     b. A tool for importing tables to and exporting them from the Hadoop File System
     c. A column-oriented key/value data store built to run on top of the Hadoop File System
     d. The primary storage system used by Hadoop applications

  3. In which perspective of Studio can you run an analysis on Hive table content?
     Choose one.
     a. Profiling
     b. Integration
     c. Big Data
     d. Mediation

  4. What system resources does YARN allocate to Jobs running on the cluster?
     Choose two.
     a. CPU (number of cores)
     b. RAM (amount of memory)
     c. IP address ranges
     d. Network bandwidth
     e. Hard drive space

Answers:

  1. b
  2. d
  3. a
  4. a and b
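The container arithmetic behind the first sample question follows from YARN's MapReduce model: the ResourceManager allocates one container for the ApplicationMaster, plus one container per map task and per reduce task. A minimal sketch of that rule (the function name is illustrative, not a real API):

```python
def yarn_containers(map_tasks: int, reduce_tasks: int) -> int:
    """Containers YARN allocates to a MapReduce application:
    one for the ApplicationMaster plus one per map and reduce task."""
    return 1 + map_tasks + reduce_tasks

# Two map tasks and one reduce task, as in sample question 1:
print(yarn_containers(2, 1))  # -> 4
```

This is why the correct answer is four, not three: the ApplicationMaster itself runs in a container alongside the task containers.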