This flow assumes the use case of a software solutions vendor who provides custom applications for customers. In this flow we show you how to prepare, calculate and ultimately create three different sales, marketing and finance analyses.
The main types of transformations used in this flow are cleansing and calculation transformations.
How to use example flows:
To get the most out of example flows, we recommend reading through the descriptions and comments in every node and recipe to understand how to solve your use case with our tool. Once you're done, you can either use the flow as a starting point for your own development or share it with other people in your workspace.
Getting to know the datasets
SALES OPPORTUNITIES.CSV
TIP
This dataset contains several empty columns. To see the column profiles more easily use CMD + G. This shortcut will toggle between the list and the grid views.
The most important fields we'll be paying attention to are OWNERID, ID, AMOUNT, PROBABILITY, TYPE, LEADSOURCE, CREATEDDATE and CLOSEDATE.
OPPORTUNITY OWNERS.CSV
The opportunity owners dataset contains the records for each one of the sales reps in the company. This is a much cleaner, smaller dataset containing only the first and last names of the reps, their employee ID and their email.
Cleansing the data
The first step in this process is to ensure your data is clean and ready to use. You'll find the data cleansing steps in the CLEANUP RECIPE.
TIP
If you want to understand what each step did, click right before that step to see the state of the data before the step kicked in. See the image below for an example of how to do it:
The CLEANUP RECIPE starts by removing any unnecessary columns. As can be seen in the image above, RECORDTYPEID is removed first, since it is not useful for the purpose of this analysis, followed by another 19 empty columns.
The column ISDELETED originally displayed 0 and 1 values. In this case, 0 is the same as false and 1 equates to true. When it comes to empty values, we can safely assume that those are the same as false.
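Outside of Trifacta, the same replacement can be sketched in pandas. This is a rough equivalent, not the recipe itself, and the sample values below are hypothetical:

```python
import pandas as pd

# Hypothetical sample mirroring the ISDELETED column: 0, 1, and empty values.
df = pd.DataFrame({"ISDELETED": ["0", "1", None, "0", None]})

# Replace 0 with false and 1 with true; treat empty values as false,
# matching the assumption described above.
df["ISDELETED"] = df["ISDELETED"].map({"0": "false", "1": "true"}).fillna("false")

print(df["ISDELETED"].tolist())  # → ['false', 'true', 'false', 'false', 'false']
```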
TIP
Any transformation in Trifacta can be found by using the search transformation icon and typing any related keywords. If you are acquainted with SQL or Excel, try typing some of the functions you would use in either one and see what comes up!
In Trifacta, replacing values can easily be done by searching for transformations with the keyword replace and choosing the option Replace text or pattern.
After replacing 0 and empty values with false and 1 with true, as well as empty PROBABILITY records with 0.15, you should end up with the following recipe steps:
TIP
Certain transformations like replacing, splitting and filtering are made a lot easier if you select values you want changed on the UI. This process is called brushing, and the more you brush, the more accurate the suggestions will become.
By brushing the quote symbols and selecting the first suggestion, we quickly replaced them with an empty string (the equivalent of deleting them).
TIP
A lot of transformations can also be found by clicking the arrow next to the column names.
Certain columns in your dataset may have names that are hard to understand or that make it unclear what the data in the column is about. In this case we renamed the column NAME to CUSTOMERNAME.
Base calculations
The first base calculation was created by using a Custom Formula, multiplying the AMOUNT by 150. This new column was then named REVENUE.
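The same custom formula can be sketched in pandas, using hypothetical amounts:

```python
import pandas as pd

# Hypothetical opportunity amounts; the column name follows the dataset above.
df = pd.DataFrame({"AMOUNT": [10, 25, 40]})

# Custom Formula equivalent: multiply AMOUNT by 150 and name the result REVENUE.
df["REVENUE"] = df["AMOUNT"] * 150

print(df["REVENUE"].tolist())  # → [1500, 3750, 6000]
```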
The next set of calculations fills any empty CLOSEDATE values in the data. To calculate projected close dates, we used the following transformations:
- DATEDIF - to calculate the difference between the opportunity creation dates and the close dates for the records that have one, using the day interval.
- ROLLINGAVERAGE - to calculate a rolling average time to close, taking into consideration the 3 records before and the 3 records after. This transformation is wrapped with the transform CEILING, which rounds the average values up to the closest integer.
- IFMISSING - is then finally used to fill in any missing values by adding the CREATEDDATE and the AVERAGETIMETOCLOSE using the transformation DATEADD.
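The steps above can be sketched in pandas. This is a rough analogue of the DATEDIF, ROLLINGAVERAGE, CEILING, IFMISSING and DATEADD chain, with hypothetical dates and a small sample:

```python
import math
import pandas as pd

# Hypothetical records; two rows are missing CLOSEDATE.
df = pd.DataFrame({
    "CREATEDDATE": pd.to_datetime(["2021-01-01", "2021-01-05", "2021-01-10", "2021-01-12"]),
    "CLOSEDATE":   pd.to_datetime(["2021-01-31", None, "2021-02-19", None]),
})

# DATEDIF equivalent: days between creation and close (NaN where close is missing).
df["TIMETOCLOSE"] = (df["CLOSEDATE"] - df["CREATEDDATE"]).dt.days

# ROLLINGAVERAGE equivalent (3 before, self, 3 after = window of 7, centered),
# wrapped in a CEILING equivalent; min_periods=1 so edge rows still get a value.
df["AVERAGETIMETOCLOSE"] = (
    df["TIMETOCLOSE"].rolling(window=7, center=True, min_periods=1).mean().apply(math.ceil)
)

# IFMISSING + DATEADD equivalent: fill empty CLOSEDATEs with
# CREATEDDATE plus the average number of days to close.
df["CLOSEDATE"] = df["CLOSEDATE"].fillna(
    df["CREATEDDATE"] + pd.to_timedelta(df["AVERAGETIMETOCLOSE"], unit="D")
)
```

After this, every row has a (real or projected) close date.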
Forecasting revenue per month
In the recipe FORECASTED REVENUE PER MONTH
only three simple transformations are used to create a report-like structure that gives us the information we need for our forecast:
- MONTH - returns the month of a given date or datetime
- YEAR - returns the year of a given date or datetime
- PIVOT - just like in Excel or Google Sheets, this transformation lets you create a pivot table by aggregating rows and columns.
The final result of these transformations is a monthly analysis of the revenue as well as the number of opportunities by closing likelihood (Best Case, Commit, Pipeline and Forecast).
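A rough pandas sketch of those three transformations, with hypothetical data and a hypothetical FORECASTCATEGORY column standing in for the closing-likelihood field:

```python
import pandas as pd

# Hypothetical forecast rows: close date, likelihood bucket, and revenue.
df = pd.DataFrame({
    "CLOSEDATE": pd.to_datetime(["2021-01-15", "2021-01-20", "2021-02-03"]),
    "FORECASTCATEGORY": ["Commit", "Pipeline", "Commit"],
    "REVENUE": [1500, 3000, 4500],
})

# MONTH and YEAR equivalents.
df["MONTH"] = df["CLOSEDATE"].dt.month
df["YEAR"] = df["CLOSEDATE"].dt.year

# PIVOT equivalent: total revenue per year/month, one column per likelihood bucket.
report = df.pivot_table(
    index=["YEAR", "MONTH"],
    columns="FORECASTCATEGORY",
    values="REVENUE",
    aggfunc="sum",
)
```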
Forecasting new revenue per source
The recipe FORECASTED NEW REVENUE PER SOURCE uses a single recipe step built on the PIVOT transform. However, in this pivot we use a special function called SUMIF, which allows a condition to be defined; values are only summed when the condition evaluates to true.
In this case, because we only want to know new revenue, we only sum rows where TYPE == 'New'.
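A minimal pandas sketch of that conditional sum, with hypothetical rows and a hypothetical LEADSOURCE grouping:

```python
import pandas as pd

# Hypothetical rows: only revenue where TYPE == 'New' should count.
df = pd.DataFrame({
    "LEADSOURCE": ["Web", "Web", "Referral"],
    "TYPE": ["New", "Renewal", "New"],
    "REVENUE": [1000, 2000, 3000],
})

# SUMIF equivalent: zero out rows that fail the condition, then sum per source.
new_revenue = (
    df["REVENUE"].where(df["TYPE"] == "New", 0).groupby(df["LEADSOURCE"]).sum()
)
```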
Forecasting new revenue per salesperson
This third analysis recipe leverages two transforms, JOIN and PIVOT. Since we've mentioned PIVOT several times during this guide, we'll focus on the JOIN transform.
Joining is a typical SQL operation that can be compared to a VLOOKUP in Excel. The result is an enriched dataset that contains the original data as well as some other data you want to pull in from a different dataset. When you join in Trifacta, you have to decide the following:
- Joined-in data - what dataset you want to get data from
- Join type - we support all common join types. If you want to read more about the types of joins available, this article goes in depth into that topic. In this case we want to enrich the OPPORTUNITY dataset with data from the OWNER dataset, so we chose a left join.
- Join keys - what data is common across these datasets and can be used to find records in the joined-in dataset
- Output columns - what columns from either dataset you want to keep as a result of the transformation
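The left join described above can be sketched in pandas. The column names and sample values here are hypothetical, minimal stand-ins for the two datasets:

```python
import pandas as pd

# Hypothetical minimal versions of the opportunity and owner datasets.
opportunities = pd.DataFrame({
    "OWNERID": ["a1", "a2", "a3"],
    "REVENUE": [1500, 3000, 4500],
})
owners = pd.DataFrame({
    "ID": ["a1", "a2"],
    "EMAIL": ["ann@example.com", "bo@example.com"],
})

# Left join on the join keys OWNERID / ID: keep every opportunity,
# pulling in owner data where the keys match.
enriched = opportunities.merge(owners, how="left", left_on="OWNERID", right_on="ID")
```

Because the join is a left join, the opportunity with no matching owner ("a3") is kept, with its owner columns left empty.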