Performing a Business Rule Analysis

Task outline

The business rule analysis enables you to check for anomalies in your data that are tied to business rules you define.

After starting Talend Studio, you will establish a connection to the local database server. Then you will create a SQL business rule in the repository. Finally, you will create a business rule analysis and associate the business rule with it.

Your training environment uses a single virtual machine (VM) that contains all the software you need to complete this tutorial, including Talend Studio and MySQL.

Starting Talend Studio

    1. To start the virtual machine, open a new tab with the same page and click the following link :

      The VM is launched in your web browser. Wait for Windows to start.

      A script is automatically launched. You can close it by clicking the X button.

    2. Select Yes on the Networks panel on the right side of your screen.

    3. To start Studio, on your VM desktop, double-click the Talend Studio shortcut.

    4. The Talend Data Fabric window opens.

      A project named DQ Essentials is available.

      Click Finish.

    5. Verify that the splash screen appears.

    6. Click the Start now! button (you may need to scroll down).

      The window opens.

Depending on your preferences and available tools, your screen may vary.

You can see the commonly used areas:

    • The DQ Repository, where data analyses and metadata are stored, is in the upper left corner.
    • The workspace, where you create and modify Jobs, run analyses, and examine results, is in the center.

Creating database connection metadata

  1. Make sure you are in the Profiling perspective. In the upper right corner, click the Profiling icon:

  2. Create metadata. In DQ Repository, expand Metadata.

    Right-click DB connections and click Create Connection.

    You are creating a connection to a collection of databases already configured in your training environment.

    1. In the Name text box, enter StagingDB.

    2. Click Next.
    3. On the DB Type drop-down list, click MySQL. Do not modify the default value in the Db Version box.

    4. Enter these settings:

      • In the Login and Password text boxes, enter root
      • In the Server text box, enter localhost
      • Leave the DataBase text box empty (you want to be able to access all databases configured on the local server)
    5. To verify your connection information, click the Check button.

    6. In the Check Connection window, click OK or make any necessary corrections and again click Check.
    7. Click Finish.

      Your new database connection, StagingDB, appears in the repository.

      The workspace displays the configuration information for the connection.

      If needed, make changes to the connection settings.

Creating a business rule analysis

  1. Create a business rule.

    1. In the Repository, expand Libraries, then Rules.

    2. Right-click the SQL folder and click New Business Rule.

      Fill in the Name, Purpose, and Description text boxes and click Next.

    3. In the Where clause text box, enter Begin_dt<End_dt.

      This ensures that the value for the end date is greater than the value for the start date.

    4. Click Finish. The new rule appears in the Where Clause box for Data Quality Rule.

    5. Notice that the new business rule appears in the repository.

  2. Create a business rule analysis.

    Right-click the CRM_Analyses folder and choose New Analysis.

    1. Expand Table Analysis and click Business Rule Analysis.

    2. Name the analysis Contract_Dates_Analysis and click Next.

    3. To identify the table to be analyzed, expand DB Connections, then the crmcatalog, and click the contract table.

    4. Click Next.

    5. Expand Rules, then SQL, and select the business rule you created earlier, Contract_Dates.

    6. Click Finish.

      The Analyzed Tables section displays the information you provided earlier to create the analysis.

    7. Run the analysis.

  3. Examine the results of the analysis.

    Approximately 99% of the data is considered valid and 1% is invalid.

    Right-click the Contract_Dates indicator and click View invalid rows.

In this tutorial, you started Talend Studio and learned about perspectives. You created and stored database connection information as metadata in the DQ repository. You finished by learning how to create a business rule in the DQ repository, and how to reuse it in a business rule analysis.

You have finished the tutorial.