Removing or Extracting JSON elements in Snowflake using Built-In Functions.

#OBJECT_DELETE #OBJECT_PICK

4 min readFeb 22, 2023

In the real-world data scenarios, JSON documents will be more than 1000 lines with N number of direct elements, Objects {} and Arrays []. When the JSON document(record) is more than 1000 lines then it would take little more time than usual to scroll through the list of elements or arrays that we want to extract or remove particularly while ETL. In such cases Snowflake provides built-in JSON functions called OBJECT_DELETE and OBJECT_PICK. The names themselves are self-explanatory, we will go through hands on experience with these two functions.

{
"name": "John",
"dob": 1980-07-15,
"LastName":
{
"name": "Smith"
},
"Cities Lived": [
{
"City Name": "New York",
"YearsOfLiving": 1,
"FavouriteRestaunrants": ["Paradise", "Italian Pizzeria"],
"Total Spent Money": 2400.45
},
{
"City Name": "Miami",
"YearsOfLiving": 2,
"FavouriteRestaunrants": ["Bahamas Breeze", "Pei Wei"],
"Total Spent Money": 1800.12
},
{
"City Name": "Dallas",
"YearsOfLiving": 1,
"FavouriteRestaunrants": ["Hyderabad Biryani", "Pista house"],
"Total Spent Money": 1200.98
}
],
"Current Location": {
"City": "Atlanta",
"State": "Georgia",
"Zip Code": 123456
}
},
{
"name": "Arnold",
"dob": 1992-11-15,
"LastName":
{
"name": "Bolt"
},
"Cities Lived": [
{
"City Name": "New Orleans",
"YearsOfLiving": 3,
"FavouriteRestaunrants": ["Bolay", "MOD Pizza"],
"Total Spent Money": 1895.76
},
{
"City Name": "Detroit",
"YearsOfLiving": 1,
"FavouriteRestaunrants": ["Buffalo Wild Wings", "5th Element"],
"Total Spent Money": 1420.58
},
{
"City Name": "San Jose",
"YearsOfLiving": 2,
"FavouriteRestaunrants": ["Dosa Place", "Peacock Cuisine"],
"Total Spent Money": 950.42
}
],
"Current Location": {
"City": "California",
"State": "Sacramento",
"Zip Code": 456789
}
}

The above are the two records that explain a person's record. I have loaded the above two records to PERSONS_DATA Snowflake table.

DATA_VARIANT is a column of type VARIANT which holds the JSON data. AS I explained in one my posts about Snowflake Semi-Structured data types. For understanding purpose, I have created small JSON records but in real data world scenarios one JSON record can be more than 1000 lines.

OBJECT_DELETE

Syntax: OBJECT_DELETE(<column>, <key1>, <key2>, <key3> …. <key n>)

You can mention n number of keys to remove from the JSON record but those keys should be at the top level in any order.

Even though the data type of column DATA_VARIANT is “Variant” but the output data type of OBJECT_DELETE is type “Object”. you will not see any difference between an Object and Variant.

Nested keys will not be removed.

I’m trying to remove the element “City” inside “Current Location” object.

If you see “City” is still there inside “Current Location”, that means OBJECT_DELETE will not remove nested elements. To remove nested elements, we have to specify the parent element in OBJECT_DELETE syntax like below.

Limitations of OBJECT_DELETE

  1. It is case sensitive.
  2. Does not remove nested elements.
  3. One has to know the schema beforehand while using this function to know which elements need to be removed.

Where to use OBJECT_DELETE?

  1. To remove sensitive information from JSON record before loading it into target table or data warehouse.
  2. To remove huge arrays and objects which are unnecessary in target table which results performance boost and better pruning.
  3. For cleaner look after loading into the target table removing unnecessary clutter from the JSON.

OBJECT_PICK

OBJECT_PICK is completely opposite to OBJECT_DELETE. This is to extract or pull only the specific element or elements. The syntax is same except the keys can be mentioned inside of an array.

If you observe the keys are mentioned in the form an array which is completely optional, you can also mention the keys without the array just like OBJECT_DELTE.

The return data type of this function is Object.

Limitations of OBJECT_PICK

  1. It is case sensitive.
  2. Does not remove nested elements just like OBJECT_DELETE.

Where to use OBJECT_PICK?

  1. To extract only specific elements which are only useful for business users or for reporting.
  2. Clutter free JSON record in target table for easy support and debugging.

I will be coming up with more useful Snowflake functions which play a key role in everyday SQL queries. Thank you

--

--

Santosh Reddy Kesava Reddy
Santosh Reddy Kesava Reddy

Written by Santosh Reddy Kesava Reddy

Passionate Data Engineer and SQL expert. I work as a Data Developer and Data Engineer on Azure infrastructure and Snowflake. With a working experience reporting

No responses yet