When automating a flow, you often need to selectively process certain input files. Source dataset parameterization in Trifacta allows you to dynamically choose which inputs to pass into your flow.
Customers often use source parameterization in the following scenarios:
Scale out: Perform the same transformations on data that is stored in parallel in a source.
Filter: Dynamically find datasets that match certain criteria.
Automate: Create a workflow based on data that is refreshed on a scheduled basis.
All parameters are evaluated when you run a job. Consequently, parameters are often combined with scheduling to create complex workflows.
How to create a parameterized dataset
1. From the Flow View, right-click the dataset that you want to use as the basis for your parameterization. From the pop-up menu that appears, click Replace with dataset with parameters.
This will open the Define Parameterized Path window.
2. From the window, you can select the parts of the file path that you want to parameterize. After highlighting the path, you will see a pop-up appear over your selection. This pop-up allows you to choose which type of parameter you want to create.
Trifacta supports three types of parameters:
Variables: Pass in a specific string
Date/time parameters: Pass in a dynamic date or range of dates
Pattern parameters: Pass in a regular expression or use a wildcard
Click on the icon that represents the type of parameter you want to create, and complete fields in the parameter configuration window.
Create a variable parameter
Select the Add Variable icon.
From the drop-down, you can name your parameter and configure a default value. We recommend choosing a unique name for each of your parameters. For variable type parameters, the default value is a string. You can override this string at run-time.
Once you have configured your variable settings, click Save.
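As a rough illustration of how a variable parameter behaves, the following Python sketch substitutes a default string into a path and lets a run-time override replace it. This is not Trifacta's implementation or API: the function name resolve_path, the {region} placeholder syntax, and the example path are all hypothetical.

```python
from typing import Dict, Optional

def resolve_path(template: str,
                 defaults: Dict[str, str],
                 overrides: Optional[Dict[str, str]] = None) -> str:
    """Fill each named placeholder, preferring a run-time override over the default."""
    values = {**defaults, **(overrides or {})}
    return template.format(**values)

# Hypothetical path with one variable parameter named "region".
template = "/data/sales/{region}/orders.csv"
defaults = {"region": "us-east"}

# No override: the default value is used.
print(resolve_path(template, defaults))                      # /data/sales/us-east/orders.csv

# Override supplied at run time: it replaces the default.
print(resolve_path(template, defaults, {"region": "emea"}))  # /data/sales/emea/orders.csv
```

Keeping parameter names unique, as recommended above, avoids ambiguity about which placeholder an override applies to.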
Create a date/time parameter
Select the Add Datetime Parameter icon.
From the pop-up box, enter the format for your date/time parameter. This should match the format in your source file path. Next, configure the date range that will be evaluated at job run time.
IMPORTANT: Pay attention to the time zone. If you configure an incorrect time zone for this parameter, Trifacta will not be able to identify the correct source file.
Once you have configured your parameter settings, click Save.
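To see why the time zone matters, consider the following Python sketch, which builds dated paths from a single run-time instant. It only illustrates the idea and does not reflect how Trifacta resolves paths internally: the path template, the dated_paths function, and the fixed UTC-8 offset are hypothetical, and the %Y-%m-%d format codes are Python's strftime codes rather than Trifacta's date format tokens.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical path template; {date} marks the parameterized part.
TEMPLATE = "/logs/{date}/events.csv"

def dated_paths(run_time, tz, days=1, fmt="%Y-%m-%d"):
    """Return one path per day, newest first, for the `days` days ending on the run date in `tz`."""
    local_run = run_time.astimezone(tz)
    return [
        TEMPLATE.format(date=(local_run - timedelta(days=offset)).strftime(fmt))
        for offset in range(days)
    ]

# The same instant falls on different calendar dates in different time zones.
run_time = datetime(2024, 3, 1, 2, 30, tzinfo=timezone.utc)
pacific = timezone(timedelta(hours=-8))  # fixed UTC-8 offset, for illustration only

print(dated_paths(run_time, timezone.utc))     # ['/logs/2024-03-01/events.csv']
print(dated_paths(run_time, pacific))          # ['/logs/2024-02-29/events.csv']

# A three-day range ending on the run date in UTC-8.
print(dated_paths(run_time, pacific, days=3))
```

In this sketch, a job that runs at 02:30 UTC looks for the March 1 folder under UTC but the February 29 folder under UTC-8, so a mismatched time zone points the job at a different file.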
Create a pattern parameter
Select the Add Pattern Parameter icon.
From the pop-up box, choose whether your pattern is defined using a wildcard or a custom pattern. You can use regular expressions or Trifacta patterns to define a custom pattern.
Once you have configured your parameter settings, click Save.
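The difference between a wildcard and a regular expression can be sketched in Python using the standard fnmatch and re modules. The file names below are made up, and Python's matching rules are only an approximation of the idea; Trifacta's own wildcard and pattern syntax may behave differently.

```python
import fnmatch
import re

# Hypothetical files in a source directory.
paths = [
    "/data/orders_2023.csv",
    "/data/orders_2024.csv",
    "/data/orders_backup.csv",
    "/data/customers_2024.csv",
]

# Wildcard: "*" stands for any run of characters.
wildcard_hits = [p for p in paths if fnmatch.fnmatchcase(p, "/data/orders_*.csv")]

# Regular expression: require exactly four digits after "orders_".
year_file = re.compile(r"/data/orders_\d{4}\.csv")
regex_hits = [p for p in paths if year_file.fullmatch(p)]

print(wildcard_hits)  # all three orders_* files, including orders_backup.csv
print(regex_hits)     # only orders_2023.csv and orders_2024.csv
```

A wildcard is quicker to write, while a regular expression gives tighter control when similarly named files, such as a backup copy, should be excluded.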
More Information
For product help and specific questions, check out the Trifacta Community.
For more information about source parameterization, refer to our documentation here.