Filtering Data Using the tMap Component
In this tutorial, discover the tMap component and its interface, and learn how to use it to filter columns from a schema.
This tutorial uses Talend Open Studio for Data Integration version 6.
1. Create a new Job, add the movies metadata as an input source, and add a tMap component
- Create a new Standard Job named tMapFilter.
- Add the movies metadata file as an input delimited component.
- Add a tMap component that can modify the schema and filter columns.
- Create a flow of data from the movies component to the tMap_1 component by linking the two components.
2. Configure the tMap_1 component to filter columns
- Double-click the tMap_1 component.
The tMap_1 wizard window has four main sections:
- Left Section displays the incoming data flows. Note that there can be multiple inputs into the tMap component.
- Middle Section displays the mapping links between the input and output data flows. Here you can also create variables that take input values and are then used to produce the output.
- Right Section displays the output data flows.
- Bottom Section is the Schema editor, which can be used to modify the schema of an input or output flow. To edit a schema, select the input or output flow whose schema you want to change (the selected flow is highlighted in yellow), then edit it in the Schema editor.
- To create a new output, in the output section of the tMap_1 wizard, click the [+] button, enter filteredOutput as the output name, and click OK. An empty output is created.
- To add columns to the output, in the Schema editor of the output, click the [+] icon.
- Define a column for movie ID (Column: movieID, Type: Integer, and Length: 4).
Note: The output column name need not be the same as the input column name. To change the column name, edit the entry in the Schema editor.
- To send the data from the movieID column of the input file to the output column, click movieID in the input and drag it to the Expression column of filteredOutput. A yellow arrow appears, indicating the flow of data.
- To add the title and releaseYear columns to the output and link them, select the columns in the input and drag them to filteredOutput.
- To change the order of the columns in the output, click the [↑] or [↓] icons. The column order and the corresponding links are updated.
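The mapping configured in step 2 can be pictured as a plain column projection. The Java sketch below is only an illustration and is not the code that the Studio generates. The class names, the String and Integer types chosen for title and releaseYear, and the extra url column are assumptions made for the example; only movieID, title, and releaseYear come from this tutorial.

```java
public class TMapFilterSketch {

    // Input row: stands in for one record of the movies input flow.
    // "url" is a hypothetical extra column, used only to show a column being dropped.
    static class MovieRow {
        Integer movieID;
        String title;        // type assumed
        Integer releaseYear; // type assumed
        String url;          // hypothetical column, not mapped to the output

        MovieRow(Integer movieID, String title, Integer releaseYear, String url) {
            this.movieID = movieID;
            this.title = title;
            this.releaseYear = releaseYear;
            this.url = url;
        }
    }

    // Output row: the three columns defined in the filteredOutput schema,
    // declared in the order they appear after reordering.
    static class FilteredOutputRow {
        Integer movieID;
        Integer releaseYear;
        String title;
    }

    // Each assignment corresponds to one mapping link drawn in the tMap editor.
    static FilteredOutputRow map(MovieRow in) {
        FilteredOutputRow out = new FilteredOutputRow();
        out.movieID = in.movieID;
        out.releaseYear = in.releaseYear;
        out.title = in.title;
        return out; // columns without a link (here: url) are simply not copied
    }

    public static void main(String[] args) {
        // Placeholder values, not taken from the tutorial's sample file.
        MovieRow in = new MovieRow(1, "Example Title", 2000, "http://example.com/1");
        FilteredOutputRow out = map(in);
        System.out.println(out.movieID + "|" + out.releaseYear + "|" + out.title);
    }
}
```

Renaming an output column, as described in the note above, would amount to changing a field name in FilteredOutputRow while leaving the right-hand side of the assignment unchanged.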
3. Use the configured tMap_1 component
- To display the output processed by the tMap_1 component, add a tLogRow component in the Job Designer and link the filteredOutput output of the tMap_1 component to the tLogRow_1 component.
- To run the Job, in the Run view, click Run.
Only the filtered movie data (movieID, releaseYear, and title) is displayed.
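As a rough picture of the whole Job (delimited input, tMap, tLogRow), the following Java sketch reads delimited lines, keeps three fields, and prints them. It is an illustration under stated assumptions, not the code that the Studio generates: the semicolon separator, the position of each field in the line, the pipe-separated printing, and the sample lines are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TMapFilterJobSketch {

    // Assumed layout of one input line: movieID;title;releaseYear;url
    // Returns the kept columns in the order movieID, releaseYear, title.
    static String filterLine(String line) {
        String[] fields = line.split(";", -1); // -1 keeps trailing empty fields
        return fields[0] + "|" + fields[2] + "|" + fields[1];
    }

    // Applies the projection to every row of the flow.
    static List<String> run(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(filterLine(line));
        }
        return out;
    }

    public static void main(String[] args) {
        // Placeholder rows, not taken from the tutorial's sample file.
        List<String> sample = Arrays.asList(
                "1;Example Title A;2000;http://example.com/1",
                "2;Example Title B;2001;http://example.com/2");
        for (String row : run(sample)) {
            System.out.println(row);
        }
        // prints:
        // 1|2000|Example Title A
        // 2|2001|Example Title B
    }
}
```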