Welcome to the Frosty Friday Challenge! I’m Dorian Banutoiu from canonicalized.com, and I’ve got an
exciting challenge for you.
Get ready to showcase your data wrangling skills and JSON manipulation prowess!
Challenge Details
Imagine you’re working at a Telecom company and your colleague Rick has sent you a spreadsheet
containing a list of products.
Your task is to clean up and transform this data into a Snowflake table and create a JSON file in a
specific format for Rick to upload to a third-party tool.
But there’s some cleaning up to do before the data can be used effectively. In his attempt to make the
spreadsheet pretty Rick merged some cells in the Brand column which of course will make your life
harder.
[Data File](s3://frostyfridaychallenges/challenge_61/Telecom Products – Sheet1.csv)
Tasks
Data Cleanup:
– Fill the null values in the “Brand” column with the value from the first non-empty row from
above (forward fill)
– Fill null values in the “Friendly URL” column with the corresponding URL from the second
column
– Remove rows with null values in the Category column.
JSON Object Creation:
– Transform the cleaned data into a nested JSON object using the following format
{
"Category": {
"Brand": [
{"Product Name": "Friendly URL"}
],
"Brand": [
{"Product Name": "Friendly URL"},
{"Product Name": "Friendly URL"}
]
},
"Category": {
"Brand": [
{"Product Name": "Friendly URL"}
]
},
...
}
Unload Data:
– Once the JSON object is created, unload the data into a JSON file to a Snowflake stage.
Send your colleague [a pre-signed link]() to the file.
Final Outputs:
Snowflake Table: A cleaned and transformed dataset that’s ready to be used by you and your
colleagues looking like the below screenshot
JSON File: A JSON file in the required format, suitable for Rick to upload into a third-party tool. Pay attention to the case of the Category column.
Tips
- Window functions should make it easy to forward fill empty values in Snowflake
- You can create a UDF to make it easy to covert the case of the Category values
- Server-side encryption is required on the unload stage to be able to generate a pre-signed URL
Have fun!
Remember, if you want to participate:
- Sign up as a member of Frosty Friday. You can do this by clicking on the sidebar, and then going to ‘REGISTER‘
- Post your code to GitHub and make it publicly available (Check out our guide if you don’t know how to here)
- Post the URL in the comments of the challenge
4 responses to “Week 61 – Intermediate”
-
It was fun having a challenge constructed from multiple parts, but definitely took a bit longer than usual to go through each step! Thanks for keeping the challenges interesting.
- Solution URL – https://github.com/ChrisHastieIW/Frosty-Friday
-
It’s always interesting to go through your solutions. Thanks for sharing!
-
Thanks for the interesting challenge. Relatively simple but addressing many aspects (and Challenge 37 to the rescue here with the presigned URLs 🙂
- Solution URL – https://github.com/dsmdavid/frostyfridays-sf/blob/main/models/challenge_61.sql
-
Tried to get the JSON object created in a single statement first, but couldn’t get it working so resorted to using CTEs
- Solution URL – https://github.com/ChrisBo94/FrostyFriday/blob/main/Week_61.py
Leave a Reply
You must be logged in to post a comment.