This flow assumes the use case of a software solutions vendor who provides custom applications for customers. In this flow we show you how to prepare, calculate and ultimately create three different sales, marketing and finance analyses.
The main types of transformations used in this flow are cleansing and calculation transformations.
How to use example flows:
To get the most out of example flows, we recommend reading through the descriptions and comments in every node and recipe to understand how to solve your use case with our tool. Once you're done, you can either use the flow as a starting point for your own development or share it with other people in your workspace.
Getting to know the datasets
SALES OPPORTUNITIES.CSV
TIP
This dataset contains several empty columns. To see the column profiles more easily use CMD + G. This shortcut will toggle between the list and the grid views.
The most important fields we'll be paying attention to are OWNERID, ID, AMOUNT, PROBABILITY, TYPE, LEADSOURCE, CREATEDDATE and CLOSEDATE.
OPPORTUNITY OWNERS.CSV
The opportunity owners dataset contains the records for each one of the sales reps in the company. This is a much cleaner, smaller dataset containing only the first and last names of the reps, their employee ID and their email.
Cleansing the data
The first step in this process is to ensure your data is clean and ready to use. You'll find the data cleansing steps in the CLEANUP RECIPE.
TIP
If you want to understand what each step did, click right before that step to see the state of the data before the step kicked in. See the image below for an example of how to do it:
The CLEANUP RECIPE starts by removing any unnecessary columns. As can be seen in the image above, RECORDTYPEID is removed first, since it is not useful for the purpose of this analysis, followed by another 19 empty columns.
The column ISDELETED originally displayed 0 and 1 values. In this case, 0 is the same as false and 1 equates to true. When it comes to empty values, we can safely assume that those are the same as false.
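Outside of Trifacta, the same replacement can be sketched in pandas. This is a rough equivalent, not the recipe itself, and the sample values below are hypothetical:

```python
import pandas as pd

# Hypothetical sample mirroring the ISDELETED column: 0, 1, and empty values.
df = pd.DataFrame({"ISDELETED": ["0", "1", None, "0", None]})

# Replace 0 with false and 1 with true; treat empty values as false,
# matching the assumption described above.
df["ISDELETED"] = df["ISDELETED"].map({"0": "false", "1": "true"}).fillna("false")

print(df["ISDELETED"].tolist())  # → ['false', 'true', 'false', 'false', 'false']
```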
TIP
Any transformation in Trifacta can be found by using the search transformation icon and typing any related keywords. If you are acquainted with SQL or Excel, try typing some of the functions you would use in either one and see what comes up!
In Trifacta, replacing values can easily be done by searching for transformations with the keyword replace and choosing the option Replace text or pattern.
After replacing 0 and empty values with false and 1 with true, as well as empty PROBABILITY records with 0.15, you should end up with the following recipe steps:
TIP
Certain transformations like replacing, splitting and filtering are made a lot easier if you select values you want changed on the UI. This process is called brushing, and the more you brush, the more accurate the suggestions will become.
By brushing the quote symbols and selecting the first suggestion, we quickly replaced them with an empty string (the equivalent of deleting them).
TIP
A lot of transformations can also be found by clicking the arrow next to the column names.
Certain columns in your dataset may have names that are hard to understand or that make it unclear what the data in the column is about. In this case we renamed the column NAME to CUSTOMERNAME.
Base calculations
The first base calculation was created by using a Custom Formula, multiplying the AMOUNT by 150. This new column was then named REVENUE.
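The same custom formula can be sketched in pandas, using hypothetical amounts:

```python
import pandas as pd

# Hypothetical opportunity amounts; the column name follows the dataset above.
df = pd.DataFrame({"AMOUNT": [10, 25, 40]})

# Custom Formula equivalent: multiply AMOUNT by 150 and name the result REVENUE.
df["REVENUE"] = df["AMOUNT"] * 150

print(df["REVENUE"].tolist())  # → [1500, 3750, 6000]
```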
The next set of calculations fills any empty CLOSEDATE values in the data. To calculate projected close dates, we used the following transformations:
- DATEDIF - to calculate the difference between the opportunity creation dates and the close dates for the records that have one, using the day interval.
- ROLLINGAVERAGE - to calculate a rolling average time to close, taking into consideration the 3 records before and the 3 records after. This transformation is wrapped with the transform CEILING, which rounds the average values up to the closest integer.
- IFMISSING - is then finally used to fill in any missing values by adding the CREATEDDATE and the AVERAGETIMETOCLOSE using the transformation DATEADD.
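The steps above can be sketched in pandas. This is a rough analogue of the DATEDIF, ROLLINGAVERAGE, CEILING, IFMISSING and DATEADD chain, with hypothetical dates and a small sample:

```python
import math
import pandas as pd

# Hypothetical records; two rows are missing CLOSEDATE.
df = pd.DataFrame({
    "CREATEDDATE": pd.to_datetime(["2021-01-01", "2021-01-05", "2021-01-10", "2021-01-12"]),
    "CLOSEDATE":   pd.to_datetime(["2021-01-31", None, "2021-02-19", None]),
})

# DATEDIF equivalent: days between creation and close (NaN where close is missing).
df["TIMETOCLOSE"] = (df["CLOSEDATE"] - df["CREATEDDATE"]).dt.days

# ROLLINGAVERAGE equivalent (3 before, self, 3 after = window of 7, centered),
# wrapped in a CEILING equivalent; min_periods=1 so edge rows still get a value.
df["AVERAGETIMETOCLOSE"] = (
    df["TIMETOCLOSE"].rolling(window=7, center=True, min_periods=1).mean().apply(math.ceil)
)

# IFMISSING + DATEADD equivalent: fill empty CLOSEDATEs with
# CREATEDDATE plus the average number of days to close.
df["CLOSEDATE"] = df["CLOSEDATE"].fillna(
    df["CREATEDDATE"] + pd.to_timedelta(df["AVERAGETIMETOCLOSE"], unit="D")
)
```

After this, every row has a (real or projected) close date.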
Forecasting revenue per month
In the recipe FORECASTED REVENUE PER MONTH
only three simple transformations are used to create a report-like structure that gives us the information we need for our forecast:
- MONTH - returns the month of a given date or datetime
- YEAR - returns the year of a given date or datetime
- PIVOT - just like in Excel or Google Sheets, this transformation lets you create a pivot table by aggregating rows and columns.
The final result of these transformations is a monthly analysis of the revenue as well as the number of opportunities by closing likelihood (Best Case, Commit, Pipeline and Forecast).
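A rough pandas sketch of those three transformations, with hypothetical data and a hypothetical FORECASTCATEGORY column standing in for the closing-likelihood field:

```python
import pandas as pd

# Hypothetical forecast rows: close date, likelihood bucket, and revenue.
df = pd.DataFrame({
    "CLOSEDATE": pd.to_datetime(["2021-01-15", "2021-01-20", "2021-02-03"]),
    "FORECASTCATEGORY": ["Commit", "Pipeline", "Commit"],
    "REVENUE": [1500, 3000, 4500],
})

# MONTH and YEAR equivalents.
df["MONTH"] = df["CLOSEDATE"].dt.month
df["YEAR"] = df["CLOSEDATE"].dt.year

# PIVOT equivalent: total revenue per year/month, one column per likelihood bucket.
report = df.pivot_table(
    index=["YEAR", "MONTH"],
    columns="FORECASTCATEGORY",
    values="REVENUE",
    aggfunc="sum",
)
```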
Forecasting new revenue per source
The recipe FORECASTED NEW REVENUE PER SOURCE uses a single recipe step built on the PIVOT transform. However, in this pivot we use a special function called SUMIF, which allows a condition to be defined; values are only summed when the condition evaluates to true.
In this case, because we only want to know new revenue, we only sum rows where TYPE == 'New'.
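A minimal pandas sketch of that conditional sum, with hypothetical rows and a hypothetical LEADSOURCE grouping:

```python
import pandas as pd

# Hypothetical rows: only revenue where TYPE == 'New' should count.
df = pd.DataFrame({
    "LEADSOURCE": ["Web", "Web", "Referral"],
    "TYPE": ["New", "Renewal", "New"],
    "REVENUE": [1000, 2000, 3000],
})

# SUMIF equivalent: zero out rows that fail the condition, then sum per source.
new_revenue = (
    df["REVENUE"].where(df["TYPE"] == "New", 0).groupby(df["LEADSOURCE"]).sum()
)
```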
Forecasting new revenue per salesperson
This third analysis recipe leverages two transforms, JOIN and PIVOT. Since we've mentioned PIVOT several times during this guide, we'll focus on the JOIN transform.
Joining is a typical SQL operation that can be compared to a VLOOKUP in Excel. The result is an enriched dataset that contains the original data as well as some other data you want to pull in from a different dataset. When you join in Trifacta, you have to decide the following:
- Joined-in data - what dataset you want to get data from
- Join type - we support all common join types. If you want to read more about the types of joins available, this article goes in depth into that topic. In this case we want to enrich the OPPORTUNITY dataset with data from the OWNER dataset, so we chose a left join.
- Join keys - what data is common across these datasets and can be used to find records in the joined-in dataset
- Output columns - what columns from either dataset you want to keep as a result of the transformation
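The left join described above can be sketched in pandas. The column names and sample values here are hypothetical, minimal stand-ins for the two datasets:

```python
import pandas as pd

# Hypothetical minimal versions of the opportunity and owner datasets.
opportunities = pd.DataFrame({
    "OWNERID": ["a1", "a2", "a3"],
    "REVENUE": [1500, 3000, 4500],
})
owners = pd.DataFrame({
    "ID": ["a1", "a2"],
    "EMAIL": ["ann@example.com", "bo@example.com"],
})

# Left join on the join keys OWNERID / ID: keep every opportunity,
# pulling in owner data where the keys match.
enriched = opportunities.merge(owners, how="left", left_on="OWNERID", right_on="ID")
```

Because the join is a left join, the opportunity with no matching owner ("a3") is kept, with its owner columns left empty.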