A JSON file stores simple data structures and objects in Javascript Object Notation (JSON), which is a standard data interchange format. JSON files are text-based, lightweight, and easily readable. Trifacta supports JSON files. You can import JSON files into Trifacta, convert them to a tabular format, wrangle them, and then export them back in the same JSON format.
Requirements
JSON input
Filename extensions must be
.json
or.JSON
.Recommended limit of 1 GB in source file size.
Conversion of compressed JSON files is not supported.
For best results, all keys and values should be quoted and imported as strings.
Escape characters that make JSON invalid can cause your JSON file to fail to import.
JSON output
JSON-formatted files that are generated by Dataprep by Trifacta are rendered in JSON Lines format, which is a single line per-record variant of JSON. For more information, see http://jsonlines.org.
Characteristics of generated JSON files:
Newline-delimited: The end of each record is the
\n
character.Non-nested: Each record in the generated file is flat
Configuration
This method of working with JSON is enabled by default.
For more information on disabling, see Dataprep Project Settings Page.
JSON Workflow
Import the file
You can import your JSON file using the Import data page.
Navigate and select your file from the source folder.
Click the Preview icon on the right side of the screen. In the Preview, you can review the first few rows of the imported file.
The Preview is seen as follows:
Add the JSON-based imported dataset to a flow and create a recipe for it.
Considerations for your JSON file
Each row in your file is a complete JSON record containing keys and values.
Nested JSON (like the 'metrics' column in the above preview) can be inserted as part of a record. It can then be unnested within the application.
Each key's value must have a comma after it, except for the final key-value in any row.
You can see your dataset in the Transformer as follows:
Improved Parsing of JSON
Your JSON may be formatted as a single top-level object containing an array of JSON records. Starting v8.6, a formatted JSON with a top-level object containing a single array e.g. {"a": …} will be imported, with each object in the array as its own row.
All other values in the object will be copied into other rows.
For more information, check out this detailed documentation guide.
Commonly performed next steps
Unnest JSON records
Your JSON records are in tabular format. If you have nested JSON objects within your JSON records, the next step is to unnest your JSON records.
The easiest way to unnest is-
Select the column header for the column containing your nested data.
Unnest appears as one of the suggested options in the right panel, and the suggestion should include the specification for the paths to the key values.
3. Next, change the data type of the new column created '-1' to Object data type.
4. Click on the header of this new column now and again select the Unnest suggestion that appears
Each Path to elements entry specifies a key in the JSON record. The key's associated value becomes the value in the new column, which is given the same name as the key. So, this step breaks out the key-value pairs for the specified keys into separate columns in the dataset.
Note:
For JSON records that have multiple levels in the hierarchy, you should unnest the top level of the hierarchy first, followed by each successive level.
Repeat the above process for the next level in the hierarchy.
You can choose to remove the original from the source or not. In deeper or wider JSON files, removing can help to identify what remains to be unnested.
Wrangle your dataset
Your JSON data is ready for wrangling. Continue adding steps until you have transformed your data as needed and are ready to run a job on it.
Generate JSON output
When you are ready, you can run the job. Create or modify a publishing action to generate a JSON file for output. See Run Job Page
More Info
You can choose to continue using the legacy method of working with JSON. The legacy version of working with JSON is likely to be deprecated in a future release. For more information on migrating to the new version, see Working with JSON v1
For more details of Working with JSON, check out this documentation guide