Executing a Task on a Remote Engine

Task outline

In this tutorial, you first connect to the Talend Cloud Portal. To do this, you must have an administrator account, or you can create a trial account at https://iam.us.cloud.talend.com/idp/trial-registration (for which you need an email address).

Then, you use a predefined Job and publish it to the cloud. You configure a remote engine and learn how to execute the task on a remote engine.

The predefined Job, which is available on your training machine, reads a local file containing sales per individual, aggregates the sales figures by city, sorts the list, and uploads the results as a file to an FTP server.

For this course, a local FTP server is available on your virtual machine in C:\ftp_root. However, if you were using a distant FTP server, the process would be exactly the same.
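
The processing that the predefined Job performs can be pictured with a short Python sketch. This is not how the Job itself is implemented (the Job is built from Talend components). The column names `city` and `amount`, the semicolon delimiter, the sample rows, and the descending sort order are all assumptions made for illustration. The upload to the FTP server is left out.

```python
import csv
import io
from collections import defaultdict


def aggregate_sales_by_city(rows):
    """Sum the 'amount' column per 'city' and return (city, total) pairs,
    largest total first. Column names are assumed, not taken from the Job."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["city"]] += float(row["amount"])
    # Sort by total, descending; break ties alphabetically for stable output.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample standing in for the local sales file.
SAMPLE = """name;city;amount
Alice;Paris;120.50
Bob;Boston;80.00
Carol;Paris;30.00
Dave;Boston;200.00
"""

reader = csv.DictReader(io.StringIO(SAMPLE), delimiter=";")
result = aggregate_sales_by_city(reader)
for city, total in result:
    print(f"{city};{total:.2f}")
```

With the sample rows, each city appears once in the output, with the city that has the larger total listed first.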

Configuring your virtual machine

  1. To start the virtual machine, open this page in a new tab and click the following link: START VM!

    The VM is launched in your web browser. Wait for Windows to start.

    A script is automatically launched. You can close it by clicking the X button.

  2. Allow the PC to be discoverable on the network.

    On the Networks panel to the right, click the Yes button.

    This is critical in order to correctly execute all the exercises in this tutorial.

Connecting to the Talend Cloud Portal

  1. Open the Talend Cloud Portal login page.

    1. Go to https://portal.us.cloud.talend.com and click CONTINUE.

    2. Log in to the Talend Cloud Portal interface. Note that Asia-Pacific, European, and US instances are available. If, for regulatory or other reasons, your data must be hosted in Europe, you can change the instance from US to European.

  2. Connect to the Talend Cloud portal using your administrator account.

    If you cannot locate your log-in credentials, sign up for a trial at https://iam.us.cloud.talend.com/idp/trial-registration.

    Enter your email address and click START YOUR TRIAL. You immediately receive a Talend Cloud – Trial Registration email with a link to complete registration. Follow the link, enter your information and a password, and click COMPLETE REGISTRATION.

    Finally, you receive a Talend Cloud Account Confirmation email with your login credentials. In particular, look for a user name ending in @domain.talend.com or @trialxxxxxx.talend.com.

    To log in to the graphical interface, you can use either the email address or the complete user name. Even if you log in with the email address, note the complete user name, as you need it later in this course.

    In the screenshots for this course, Adam Brown is the administrator.

    His user name is abrown@training.talend.com.

    Click LOGIN.

  3. The portal welcome page opens.

    Notice that several applications are available on the screen: Management Console, Studio, Data Preparation, and Data Stewardship.

    You can also open applications by clicking SELECT AN APP in the upper-left section of the page and selecting them from the drop-down menu.

    In this tutorial, you use Management Console to configure and execute your Talend Studio Jobs in the cloud.

Retrieving the complete username from Talend Management Console

  1. Click the Management Console section on the welcome page.

    The Talend Management Console welcome page opens. Click the LAUNCH button to access the tab in your browser.

  2. To display your user details, click USERS in the left menu pane, then click your username.

  3. Copy the complete Login name, which you need in order to configure Studio.

  4. Click the CANCEL button.

Connecting Talend Studio to Talend Cloud

You are ready to configure Talend Studio so it can publish Jobs in the cloud.

  1. Double-click the Talend Studio logo to open Talend Studio.

  2. Select Local_Project and click Finish.

  3. When the project opens, go to Window > Preferences > Talend > Integration Cloud.

  4. Enter your Talend Cloud Portal account credentials.

    Replace the default value in the Account Username text box with the login name you copied from Talend Management Console. Look for a username ending in @training.talend.com or @trialxxxxxx.talend.com.

    Enter your password (the same one you used to access the Talend Cloud portal).

  5. Ensure that access is granted by clicking the Test Connection button. “Service available” appears in green.

  6. Click Apply and OK.

    Talend Studio is configured.

  7. Optional (if your connection is slow): go to Talend > Performance and set the Connection Timeout with Administrator Center field to 300 seconds.

When your tasks run in Talend Management Console, you use a cloud engine if all the data or applications are in the cloud.

You can also leverage remote engines and clusters when on-premises applications and data are involved.

Creating a remote engine

With Talend Management Console, you can execute Data Integration Jobs created in Studio. If all the data or applications are available in the cloud, you can use a cloud engine. If on-premises applications and data are involved, you can also leverage remote engines and clusters.

Outbound communication between Talend Cloud and remote engines is fully secured, as data is not staged.

You are ready to learn how to create a remote engine. In this case, the remote engine must be installed on the VM, giving you access to needed resources (such as your local file system and databases).

  1. If you have not done so already, connect to Talend Management Console using your administrator credentials.

  2. On the pane on the left, click ENGINES.

  3. Click the ADD button, and select Remote Engine.

  4. Fill in the form as shown in the following screenshot. Enter a Name and Description, and select the Environment and Workspace.

  5. Validate your entries by clicking Save.

  6. The remote engine is created in your Cloud account but still needs to be locally installed and paired.

Installing and pairing a remote engine

When the remote engine is created, it must be installed locally and paired.

  1. Click Training VM RE to access the Engine Details. Copy the Remote Engine Key by clicking the icon, or select it manually and press CTRL+C.

  2. Click the DOWNLOAD button and download the installer for Windows server (exe).

  3. Install the remote engine on the training VM.

    1. When the installer is downloaded, run it.

    2. Click Yes to allow the installer to be executed.

    3. Click Next and accept the license agreement, and again click Next.

    4. Select the installation directory and click Next.

    5. In the System Service section, select the Yes radio button, and on the Region drop-down menu, select USA.

    6. Using CTRL+V, paste the remote engine key in the Pre-authorized key text box (or you can declare it after installation).

    7. Click Next twice.

    8. Click Finish to complete the installation.

  4. Verify that the services are running.

    1. On the Windows taskbar, click the Services (gear) icon.

    2. Ensure that the Talend Remote Engine service is running. If not, start it.

  5. Refresh the ENGINES page to verify that the remote engine is paired.

  6. If you did not enter the remote engine key during installation, follow these steps to manually pair the remote engine.

    1. Go to http://localhost:8043/configuration.

    2. Paste the key in the REMOTE ENGINE KEY text box and click the PAIR REMOTE ENGINE button.

    3. Refresh the REMOTE ENGINES/CLUSTERS page to verify that the remote engine is paired.
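
If the pairing page does not load, one quick check is whether anything is listening on port 8043 of the local machine at all. The following is a minimal sketch using only the Python standard library. The function name `is_port_open` is hypothetical, and the check only confirms that a TCP connection can be opened; it does not confirm that the listener is the remote engine or that pairing has succeeded.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 8043 is the port of the pairing page used in the manual pairing steps.
    if is_port_open("localhost", 8043):
        print("Something is listening on localhost:8043.")
    else:
        print("Nothing is listening on localhost:8043; "
              "check that the Talend Remote Engine service is running.")
```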

Publishing a Job to Talend Cloud

When Talend Studio is configured with your Talend Cloud account, you can publish a Job to the cloud.

  1. Open the Sales_Remote_Engine Job.

    1. Open Studio in the Integration perspective.

    2. In the Repository, double-click Job Designs > Standard > Sales_Remote_Engine Job.

    3. The Job opens in the Designer. Take a couple of minutes to analyze it.

    4. Right-click the tFileInputDelimited component and select Data Viewer, look at the input file, and click the Close button.

  2. Publish the Job to Talend Cloud.

    1. In the Repository, right-click the Job name and select Publish to Cloud.

    2. Select your workspace and click Finish.

    3. When the Job upload is complete, click the Open Job Flow hyperlink (or click OK, open your browser, and go to the Talend Management Console application).

    4. Log in to the Talend Cloud portal using your credentials.

Configuring and running the task on a Remote Engine

The remote engine is ready and the Job has been published to Talend Cloud, so you are ready to configure and run it.

  1. Open the list of tasks.

    In Talend Management Console, select Management from the left menu, then click the Tasks icon.

  2. Configure the task to execute on the remote engine and to start only as part of an execution plan.

    1. Click your Task name.

    2. Hover over the Configuration section until the pencil symbol appears, then click it to edit configuration.

    3. Leave the Artifact parameters as default and click CONTINUE.

    4. Set the Runtime to Training VM RE (Engine) and set the Run type to To be used in Plans only.

    5. Click GO LIVE.

    6. Your Job is now configured to execute on the remote engine and to start only as part of an execution plan.

      Click the RUN NOW button.

  3. You can see in the RUN HISTORY section that the Task is running.

  4. In the RUN HISTORY section, check the execution status by clicking the REFRESH button.

    When the task execution is complete, the duration is displayed.

  5. To access detailed information on the task execution, expand the panel by clicking the arrow on the right, and click VIEW LOGS. From here, you can filter by user logs or developer logs, or display all logs, including information messages.
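
Clicking REFRESH until the run finishes is a manual form of polling. The pattern can be sketched in Python as follows. The `get_status` callable and the status strings `running` and `success` are hypothetical placeholders, not part of any Talend API; in practice you would replace `get_status` with whatever status lookup you have available.

```python
import time


def wait_for_completion(get_status, interval_seconds=5.0, max_checks=60,
                        sleep=time.sleep):
    """Call get_status() repeatedly until it returns something other than
    'running', or until max_checks calls have been made.
    Returns the last status seen."""
    status = get_status()
    checks = 1
    while status == "running" and checks < max_checks:
        sleep(interval_seconds)
        status = get_status()
        checks += 1
    return status


# Hypothetical stand-in for a real status lookup: reports 'running' twice,
# then 'success'. The sleep is replaced so the example finishes immediately.
_fake_statuses = iter(["running", "running", "success"])
final = wait_for_completion(lambda: next(_fake_statuses), sleep=lambda s: None)
print(final)
```

The `max_checks` limit keeps the loop from waiting forever if a run never leaves the running state.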

Verifying execution results

  1. Go to C:\ftp_root\stats and verify that the sorted_sales.csv file is there.

  2. Open the file and confirm that the sales are aggregated by city and sorted by sale.
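
The two checks above can also be scripted. The sketch below assumes a layout that the tutorial does not specify: a semicolon-delimited file with a header row, the city in the first column, and the aggregated amount in the second. Adjust these assumptions to match the actual file. The function names are hypothetical.

```python
import csv
from pathlib import Path


def load_rows(path, delimiter=";", has_header=True):
    """Read the result file and return its data rows as lists of strings."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        rows = [row for row in csv.reader(handle, delimiter=delimiter) if row]
    return rows[1:] if has_header else rows


def find_problems(rows):
    """Return a list of problems found in the rows (empty if none).
    Assumes column 0 is the city and column 1 is the aggregated amount."""
    problems = []
    cities = [row[0] for row in rows]
    if len(cities) != len(set(cities)):
        problems.append("a city appears more than once, so rows are not aggregated")
    amounts = [float(row[1]) for row in rows]
    # Accept either sort direction, since the tutorial does not state one.
    if amounts not in (sorted(amounts), sorted(amounts, reverse=True)):
        problems.append("amounts are not sorted")
    return problems


if __name__ == "__main__":
    result_file = Path(r"C:\ftp_root\stats\sorted_sales.csv")
    if not result_file.is_file():
        print(f"{result_file} does not exist")
    else:
        print(find_problems(load_rows(result_file)) or "no problems found")
```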

Scheduling task execution

If you don’t want to execute the task immediately, you can schedule the task execution.

  1. From the task and plan list in Talend Management Console, go back to the Sales_Remote_Engine task details and edit the Configuration section.

  2. Configure the Go Live section.

    1. Leave Runtime as Training VM RE (Engine).

    2. For Run type, select Daily.

    3. Define the reference time zone.

    4. Set the Starts and Repeat from fields to the current date and time, and set the Repeat to field to 10 minutes later.

    5. For Repeat every, select 1 day(s), then select At specific intervals and set Repeat every to 5 minutes.

  3. Click the GO LIVE button.

    A pop-up window confirms that the task execution has been scheduled.

    On the task screen, you can see in the RUN HISTORY section that the task is running.

  4. Go to C:\ftp_root\stats and verify that the file is there.
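
To reason about how many runs the schedule in step 2 produces, the trigger times can be listed with a few lines of Python. This is a sketch under assumptions: it treats both ends of the repeat window as inclusive, which may not match how Talend Management Console evaluates the window, and the start time shown is a made-up example.

```python
from datetime import datetime, timedelta


def run_times(repeat_from, repeat_to, every_minutes):
    """List the trigger times from repeat_from to repeat_to, one every
    every_minutes minutes. Both ends are treated as inclusive (an assumption)."""
    times = []
    current = repeat_from
    while current <= repeat_to:
        times.append(current)
        current += timedelta(minutes=every_minutes)
    return times


# Hypothetical start time; the tutorial uses the current time and a window
# that ends 10 minutes later, repeating every 5 minutes.
start = datetime(2024, 1, 15, 9, 0)
times = run_times(start, start + timedelta(minutes=10), every_minutes=5)
for moment in times:
    print(moment.strftime("%H:%M"))
```

Under the inclusive assumption, a 10-minute window with a 5-minute interval gives three trigger times per day.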

Talend Cloud also allows you to create an execution plan. With an execution plan, you can execute several tasks in sequence and schedule the plan itself.

You have finished the tutorial.