How to join data

Combine two datasets based on a common key

Written by Thea
Updated over 4 years ago

Joins allow you to combine two datasets based on a common key. For example, you could be working with a customer details dataset and a purchases dataset, both of which contain a column named "customer_id". You can connect these two datasets using the "customer_id" column as a key.
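As a rough illustration of what joining on a common key means, here is a minimal Python sketch. It is not how Trifacta implements joins; the sample rows, the "name" and "item" columns, and the inner_join helper are all hypothetical and exist only for this example. Only the "customer_id" key comes from the scenario above.

```python
# Hypothetical sample data; every column except "customer_id" is an assumption.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
    {"customer_id": 3, "name": "Linus"},
]
purchases = [
    {"customer_id": 1, "item": "keyboard"},
    {"customer_id": 1, "item": "mouse"},
    {"customer_id": 2, "item": "monitor"},
]


def inner_join(left, right, key):
    """Pair every left row with every right row that has the same key value."""
    # Index the right side by key so each left row can look up its matches.
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)

    joined = []
    for left_row in left:
        # Left rows with no match on the right produce no output rows.
        for right_row in right_by_key.get(left_row[key], []):
            joined.append({**left_row, **right_row})
    return joined


joined = inner_join(customers, purchases, "customer_id")
for row in joined:
    print(row)
```

In this sketch, a customer with two purchases appears twice in the output, and a customer with no purchases does not appear at all, which is the behavior of an inner join.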

Create a join from the transformer grid

1. Open the recipe that contains the data you want to use as the left side of your join. Joins are individual recipe steps.

2. In the transformer grid, click the Join option in the toolbar.

3. Select the dataset or recipe that you want to use as the right side of your join. Click Add. Best Practice: Create joins between two recipes. Try to avoid creating a join between a recipe and a source dataset.

4. From the Join - Keys & Conditions screen, you can configure the following aspects of your join:

a) The join type. Trifacta supports inner, right, left, outer, and cross joins.

b) The join keys. Trifacta attempts to suggest the correct join key based on the contents of your datasets. However, if the suggested keys are incorrect, you can click the edit pencil icon to manually select the correct keys.

5. Click Next to go to the Join - Output Columns screen. Place a checkmark next to each column, from either side of the join, that you want to include in the join output.

6. Click Review to confirm the final join settings.

7. When you are satisfied with your join, click Add to Recipe. You will see the join appear as a step in your recipe.
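To make the join types and output-column choices from steps 4 and 5 more concrete, the following Python sketch shows how each join type changes which rows are returned. It is an illustration under stated assumptions, not Trifacta's implementation: the join function, its how and columns parameters, and the sample data are hypothetical, and "outer" is treated here as a full outer join.

```python
from itertools import product

# Hypothetical sample data; every column except "customer_id" is an assumption.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
    {"customer_id": 3, "name": "Linus"},      # has no purchases
]
purchases = [
    {"customer_id": 1, "item": "keyboard"},
    {"customer_id": 1, "item": "mouse"},
    {"customer_id": 2, "item": "monitor"},
    {"customer_id": 4, "item": "webcam"},     # has no matching customer
]


def join(left, right, key, how="inner", columns=None):
    """Join two lists of dict rows; how is inner, left, right, outer, or cross."""
    if how == "cross":
        # Every left row paired with every right row; the key is ignored.
        # In this sketch, clashing column names take the right-side value.
        rows = [{**l, **r} for l, r in product(left, right)]
    else:
        rows = []
        matched_right = set()
        for l in left:
            hits = [(i, r) for i, r in enumerate(right) if r[key] == l[key]]
            for i, r in hits:
                matched_right.add(i)
                rows.append({**l, **r})
            # Left and outer joins keep left rows that found no match.
            if not hits and how in ("left", "outer"):
                rows.append(dict(l))
        # Right and outer joins keep right rows that were never matched.
        if how in ("right", "outer"):
            rows += [dict(r) for i, r in enumerate(right) if i not in matched_right]
    if columns is not None:
        # Keep only the selected output columns; missing values become None.
        rows = [{c: row.get(c) for c in columns} for row in rows]
    return rows


counts = {
    how: len(join(customers, purchases, "customer_id", how))
    for how in ("inner", "left", "right", "outer", "cross")
}
print(counts)

selected = join(customers, purchases, "customer_id", "left", columns=["name", "item"])
print(selected)
```

The columns argument plays the role of the checkmarks on the Join - Output Columns screen: only the listed columns survive, and rows that had no match on the other side carry empty values for that side's columns.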

Additional information
