In this installment of A Day in the Life of a Data Integration Developer, we’re going to cover running processes or Jobs, testing, and debugging, all within Talend Studio. The video, and the write-up below, show different ways to run Jobs, how to test with smaller datasets, and how to debug using both logging features and the debug feature that’s built into Studio.
- Part 1: Introduction to Talend Studio
- Part 2: How to Build Your First Job in Talend Studio
- Part 3: Running, Testing, and Debugging
- Part 4: AMC Studio Basic Features
- Part 5: Basic Job Design Features
- Part 6: How to Self-Document Any Data Integration Job
- Part 7: How to Import a License File
Debugging method one
Here’s a simple process that’s reading data from a large file, aggregating it, and writing it out to a table.
If you go to the Run tab, select Debug Run, and start the debug, you can then hit the Next button, which builds the Job and starts running the process. You can see the data as it goes through each transformation, with all the attributes.
So, the file has several columns, and then the tMap reduces it down. Now, I want to reduce the dataset so that I can see a smaller volume of data go through here.
So, if I click on the component for the file, I have a Limit field where I can enter 10 so that only ten records are read. Now if I go back to Run, Debug Run, I can see the Job complete all the way through with just 10 records by clicking the Next button ten times.
And I’ll see the debug values for the attributes all the way through all the components. I can drag them and move them over so it’s easier to see what was actually written to the database here. And I can see the sequence that is being built in the tMap.
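To make the effect of the limit concrete, here is a minimal Java sketch of the same idea outside Studio: stop reading after a fixed number of records. The class and method names (`LimitedReader`, `readLimited`) are hypothetical and are not part of Talend; this only illustrates the behavior, not how the file component is implemented.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class LimitedReader {
    // Hypothetical helper: reads at most `limit` lines, then stops,
    // even if the underlying source has many more records.
    static List<String> readLimited(BufferedReader reader, int limit) throws IOException {
        List<String> rows = new ArrayList<>();
        String line;
        while (rows.size() < limit && (line = reader.readLine()) != null) {
            rows.add(line);
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Build an in-memory "large file" of 1000 records for the demo.
        StringBuilder sample = new StringBuilder();
        for (int i = 1; i <= 1000; i++) {
            sample.append("row").append(i).append('\n');
        }
        BufferedReader reader = new BufferedReader(new StringReader(sample.toString()));
        List<String> rows = readLimited(reader, 10);
        System.out.println(rows.size() + " rows read, last = " + rows.get(rows.size() - 1));
    }
}
```

Testing against ten records first keeps each debug step readable; once the flow looks right, the limit can be raised or removed.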
Debugging method two
Another way of debugging the process is by adding screen output to the studio using the tLogRow component, and linking the output from the last tMap into the tLogRow.
So, I’m just adding a new output row in the tMap. If I go into the tMap here, I can connect the attributes that I want to send to the tLogRow, so I just drag and drop them onto the new output.
If you need to add the sequence so you can make sure the sequence is being generated correctly, add a new column.
Call it seq for sequence. And I need to make it an integer, so just change it to int. And then add a function under the expression builder, and I want to add the numeric sequence. So, now I have a sequence ready to go.
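In the expression builder, the numeric sequence usually appears as a call along the lines of `Numeric.sequence("s1", 1, 1)` (a sequence name, a start value, and a step). The Java sketch below is a simplified stand-in that shows what such a named counter does; the class name `SequenceSketch` is hypothetical, and this is not Talend's own implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class SequenceSketch {
    // One counter per sequence name, so several sequences can coexist in a Job.
    private static final Map<String, Integer> counters = new HashMap<>();

    // Returns startValue on the first call for a given name,
    // then the previous value plus step on each later call.
    static synchronized int sequence(String name, int startValue, int step) {
        Integer current = counters.get(name);
        int next = (current == null) ? startValue : current + step;
        counters.put(name, next);
        return next;
    }

    public static void main(String[] args) {
        // Simulate three rows passing through the mapping step.
        for (int i = 0; i < 3; i++) {
            System.out.println("seq = " + sequence("s1", 1, 1));
        }
    }
}
```

Because the counter is keyed by name, each output row gets the next value, which is why logging the seq column is a quick way to confirm that every row was numbered exactly once.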
If I want to see the data in a nicer format, I can change the tLogRow display mode to a table. Maybe I want more rows so I can see more output, so I change the limit on the file component to 100 rows. And if I run it in the regular run mode, I can then see all the aggregated data output to the screen, along with the sequence numbers.
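As a rough illustration of what a table-style log gives you, the following Java sketch pads each column to a common width before printing. The class `TableLog`, its methods, and the sample column names are all hypothetical; the exact layout tLogRow produces in Studio will differ.

```java
import java.util.Arrays;
import java.util.List;

public class TableLog {
    // Formats a header and rows as a fixed-width, pipe-delimited table.
    static String format(String[] header, List<String[]> rows) {
        int[] widths = new int[header.length];
        for (int c = 0; c < header.length; c++) {
            widths[c] = Math.max(1, header[c].length());
        }
        // Widen each column to fit its longest cell.
        for (String[] row : rows) {
            for (int c = 0; c < header.length; c++) {
                widths[c] = Math.max(widths[c], row[c].length());
            }
        }
        StringBuilder sb = new StringBuilder();
        appendRow(sb, header, widths);
        for (String[] row : rows) {
            appendRow(sb, row, widths);
        }
        return sb.toString();
    }

    private static void appendRow(StringBuilder sb, String[] cells, int[] widths) {
        sb.append('|');
        for (int c = 0; c < widths.length; c++) {
            // Left-justify each cell within its column width.
            sb.append(String.format("%-" + widths[c] + "s", cells[c])).append('|');
        }
        sb.append('\n');
    }

    public static void main(String[] args) {
        // Sample data is made up for the demo.
        String[] header = {"state", "total", "seq"};
        List<String[]> rows = Arrays.asList(
            new String[] {"CA", "1200", "1"},
            new String[] {"NY", "950", "2"});
        System.out.print(format(header, rows));
    }
}
```

Aligned columns make it much easier to scan the aggregated values and the sequence numbers side by side than a delimited single-line dump.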
For more details on these two debugging methods, watch the video above. In the next tutorial, you will learn about basic design functions.