[TOS tutorial 03] Sorting a File

In this tutorial, you use a processing component, tSortRow, to sort data read from a file.

This tutorial uses Talend Open Studio Data Integration version 6.

1. Create a new Job

  1. Ensure that the Integration perspective is selected.
  2. Create a new Job and name it SortCSVFile.

The Job Designer opens an empty Job.

2. Add and configure a tFileInputDelimited component

  1. Add a tFileInputDelimited component to the Job.
  2. To configure the tFileInputDelimited_1 component, in the Component view of the component, click [...] next to the FileName field, select the file from the local disk, and click Open.
  3. To describe the structure of the file, click [...] next to the Edit schema field of tFileInputDelimited_1 to open the Schema wizard.
  4. Click the [+] icon to add the first column and enter the details for the column.
  5. Repeat step 4 for each column in the CSV file and close the Schema wizard.
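
For readers who want to see what the schema does, here is a minimal Python sketch of reading a delimited file against a declared list of columns. It is an analogy only, not the code Talend generates. The sample rows, the semicolon separator, the single header row, and the helper name read_delimited are all assumptions made for illustration; only the column names title and releaseYear come from this tutorial.

```python
import csv
import io

# Hypothetical sample data: the tutorial does not show the file's contents.
SAMPLE = "title;releaseYear\nAlien;1979\nBlade Runner;1982\nMad Max;1979\n"

# Schema as (column name, converter) pairs, comparable to the columns
# entered one by one in the Schema wizard.
SCHEMA = [("title", str), ("releaseYear", int)]


def read_delimited(text, schema, delimiter=";", header_rows=1):
    """Parse delimited text into a list of dicts typed according to schema."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    rows = list(reader)[header_rows:]  # skip the header row(s)
    return [
        {name: convert(value) for (name, convert), value in zip(schema, row)}
        for row in rows
    ]


movies = read_delimited(SAMPLE, SCHEMA)
print(movies[0])
```

Declaring releaseYear as an integer matters later: a numeric column sorts by value, whereas a text column would sort character by character.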

3. Sort the data in your Job

  1. Add a tSortRow component to the Job and link the two components. Note: The schema of the tFileInputDelimited_1 component is inherited by the linked tSortRow component, so you do not need to configure it.
  2. To view the schema that has been inherited, in the Component view of the tSortRow component, click [...] next to Edit schema.
  3. To create a new sorting rule based on the movie release year, click [+] and in the Schema column, click releaseYear and specify the sort order by clicking desc.
  4. To view the result of the sort rule, in the Job Designer, add a tLogRow component and link the tSortRow_1 and the tLogRow_1 components.
  5. To run the Job, in the Run view for the Job SortCSVFile, click Run.

The Run view now displays the movies sorted by year of release, in descending order. The source file itself is not modified.
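
The effect of this single sort rule can be sketched in Python as follows. This is an illustration under assumptions, not Talend's implementation: the movie rows are hypothetical sample data, and only the releaseYear column name and the descending order come from the steps above.

```python
# Hypothetical rows standing in for the records read from the file.
movies = [
    {"title": "Blade Runner", "releaseYear": 1982},
    {"title": "Mad Max", "releaseYear": 1979},
    {"title": "Alien", "releaseYear": 1979},
    {"title": "Brazil", "releaseYear": 1985},
]

# One sort rule: releaseYear, descending.
by_year_desc = sorted(movies, key=lambda m: m["releaseYear"], reverse=True)

for m in by_year_desc:
    print(m["releaseYear"], m["title"])
```

With only one rule, the relative order of movies that share a release year is not determined by the rule itself, which is the motivation for a second rule.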

4. Add a second sort rule

  1. To add a second sorting rule, in the Component view of the tSortRow_1 component, click [+] and, in the Schema column, choose title. Then, in the sort column, choose alpha.
  2. To run the Job, in the Run view, click Run.

The movies are now sorted by year of release and, within each year, alphabetically by title.
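
A two-rule sort like this one corresponds to sorting on a composite key. The Python sketch below shows the idea with hypothetical sample rows. It assumes the title rule uses ascending order and plain string comparison; how tSortRow compares strings in detail (for example, case handling) is not covered by this tutorial.

```python
# Hypothetical rows standing in for the records read from the file.
movies = [
    {"title": "Blade Runner", "releaseYear": 1982},
    {"title": "Mad Max", "releaseYear": 1979},
    {"title": "Alien", "releaseYear": 1979},
    {"title": "Brazil", "releaseYear": 1985},
]

# Rule 1: releaseYear descending (negated so that larger years come first).
# Rule 2: title ascending, applied only to break ties on releaseYear.
sorted_movies = sorted(movies, key=lambda m: (-m["releaseYear"], m["title"]))

for m in sorted_movies:
    print(m["releaseYear"], m["title"])
```

The order of the rules matters: the first rule decides the overall order, and each later rule only orders rows that the earlier rules leave tied.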

5. Store the result of the Job in a file

  1. Add a tFileOutputExcel component to the Job Designer and link the tLogRow_1 component to it.
  2. To configure the output component, in the Component view of the component, specify the path and name for the output file.
  3. To include the header row in the output file, select the Include header check box.
  4. To run the Job, in the Run view, click Run.
  5. To check the moviesSorted.xls file, navigate to the folder in which the file was created and open the file. The file with the sorted data will be displayed.
  6. To prevent the sorted data from being displayed in the Run view, right-click tLogRow_1 and click Deactivate tLogRow_1.
  7. To run the Job, in the Run view, click Run.

The Job runs again, but this time no data is displayed in the Run view.
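
The role of the header option can be sketched in Python as well. Writing a real .xls file needs a third-party library, so this sketch writes semicolon-delimited text instead, as a stand-in for the Excel output. The rows, the helper name write_rows, and the include_header parameter are hypothetical and chosen only to mirror the steps above.

```python
import csv
import io

# Hypothetical rows, already sorted by year (descending) then title.
sorted_movies = [
    {"title": "Brazil", "releaseYear": 1985},
    {"title": "Blade Runner", "releaseYear": 1982},
    {"title": "Alien", "releaseYear": 1979},
    {"title": "Mad Max", "releaseYear": 1979},
]


def write_rows(rows, columns, include_header=True):
    """Return the rows as delimited text, optionally preceded by a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
    if include_header:
        writer.writerow(columns)  # column names become the first row
    for row in rows:
        writer.writerow([row[c] for c in columns])
    return buffer.getvalue()


output = write_rows(sorted_movies, ["title", "releaseYear"])
print(output)
```

Without the header row, the output contains only data rows, and a reader of the file has to know the column order in advance.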
