Frosty Friday Consultants has been hired by the University of Frost’s history department; they want data on monarchs in their data warehouse for analysis. Your job is to take the JSON file located here, ingest it into the data warehouse, and parse it into a table that looks like this:
End Result:
If you can’t read the above right-click and view image in another tab.
- Separate columns for nicknames and consorts 1 – 3, many will be null.
- An ID in chronological order (birth).
- An Inter-House ID in order as they appear in the file.
- There should be 26 rows at the end.
Hints:
- Make sure you don’t lose any rows along the way.
- Be sure to investigate all the outputs and parameters available when transforming JSON.
Remember if you want to participate:
- Sign up as a member of Frosty Friday. You can do this by clicking on the sidebar, and then going to ‘REGISTER‘ (note joining our mailing list does not give you a Frosty Friday account)
- Post your code to GitHub and make it publicly available (Check out our guide if you don’t know how to here)
- Post the URL in the comments of the challenge.
If you have any technical questions you’d like to pose to the community, you can ask here on our dedicated thread.
26 responses to “Week 4 – Hard”
-
Not quite sure how the ordering is meant to take place for the ID since my result seems to swap the records for Felipe V’s two reigns (same date of birth), but I’m happy with this overall.
-
Reorganised my submission repo, new master solution URL here.
- Solution URL – https://github.com/ChrisHastieIW/Frosty-Friday
-
-
Hi Chris, Hi Dan,
Could you please help me to understand how we are populating INTER_HOUSE_ID. You have used index to generate the value. I am not clear on using “index”. Please help me to understand this.
Regards
Sachin Mittal- Solution URL – Solution URL :https://github.com/sachingithubaccnt/Frosty_Friday/blob/main/Week4_Hard.sql
-
@sachin: When you do a lateral flatten on the JSON, this creates output columns that contain the Index column along, defining the position in the JSON, along with the value.
https://docs.snowflake.com/en/sql-reference/functions/flatten.html#output
You can see it by doing the following:SELECT .M*
FROM
WEEK_4_JSON WK,
LATERAL FLATTEN(SRC:”Houses”) as H,
LATERAL FLATTEN(H.value:”Monarchs”) as M-
Alex beat me to it!
In short Sachin, the INDEX field is automatically generated when any FLATTEN is used. The values of INDEX are simply the row number of the flattened result from the perspective of the flatten, starting from zero. For a list, this will be returning the index of the record within the list, which equates to a row number for all records retrieved from that particular list.
In SQL, you could think of this similarly to a ROW_NUMBER() window function which is partitioned by H.value:”Monarchs” and ordered in the same order as the values within H.value:”Monarchs”. The flatten is a little bit more complicated and robust than this so that it can avoid duplicates etc, but this explains the basic functionality
-
-
Great challenge to keep the semi-structured skills sharp!
Solution Week 4: https://github.com/AlexClaydon42/frostyfriday/blob/main/chalange_4.sql -
Funny that the Catholic Monarchs didn’t have a spouse*…
My approach:
– Use the macro from week 1 to create a new stage.
– Retrieve the file as json, then flatten on house and flatten on monarch.I tried to be creative to extract the distinct keys inside the ‘Monarchs’ json for the different rows.
It was a learning experience using object_keys, array_agg, and array_to_string but, eventually, a rabbit hole (I obtained the unique keys and eventually used it to generate the SQL I ended up typing, still…)so I ended up with something similar as others have posted, although I did not honor the column order…
* When there’s only one spouse/nickname, the object is a string, not an array, and thus doing something like `src:key[0]` will return null (https://github.com/dsmdavid/frostyfridays-sf/blob/a2425bb65d7e609e3951ebe9bef76afc6545c92e/models/challenge_04_03.sql#L20 , the `coalesce` can help).
Or the dbt-docs: http://dvd-il-bucket-public.s3-website-eu-west-1.amazonaws.com/#!/model/model.frosty.challenge_04_03#code- Solution URL – https://github.com/dsmdavid/frostyfridays-sf/blob/main/models/challenge_04_03.sql
-
I broke my parsing into steps, easier to track of things.
- Solution URL – https://github.com/zlzlzl2-data/FrostyFriday/blob/main/FF4_LZ.sql
-
Also broke into steps, and coalesced the nicknames and spouses that got kicked out with [0]
- Solution URL – https://github.com/scallybrian/bs-frosty-sf/blob/main/frostyfridays/models/challenge_004/challenge_004.sql
-
This was a tricky one to get right. The sample solution of the image is incorrect as the multiple nicknames and consorts are wrong and it only picks up values where the value in the JSON is an array. I got around this by using a case statement on the first value with the IS_ARRAY function.
- Solution URL – https://github.com/ChrisBBiztory/FrostyFriday/blob/main/Week_4.sql
-
Awesome!
Please check out this solution blog: https://theinformationlab.nl/en/2022/09/06/snowflake-skills-4-parsing-a-json/- Solution URL – https://github.com/Atzmonky/snowflake/blob/main/ch4_JsonParse.sql
-
Nice practice for wrangling json data!
- Solution URL – https://github.com/pekka-kanerva/frosty-friday/blob/main/week4/ff-week4-solution.sql
-
I think having value vs array of values were key to understand. But overall really exercise
- Solution URL – https://github.com/ankitsr92/Ankit_Work/blob/main/FrostyFriday/Week1/Week4.sql
-
Here’s my solution 🙂
https://github.com/CSimonini/Frosty_Friday/blob/main/W4_Solution.sql- Solution URL – https://github.com/CSimonini/Frosty_Friday/blob/main/W4_Solution.sql
-
Here’s my solution
- Solution URL – https://github.com/apd-jlaird/frosty-friday/blob/main/week_4/ff_week_4.sql
-
My solution ^. Flattening json takes a while to get the hang of.
- Solution URL – https://github.com/BenAConnor/Frosty_Friday/blob/main/Week%204%20Solution.sql
-
Great task to improve json skills
- Solution URL – https://github.com/jameskalfox/frosty-friday-snowflake-challenges/blob/main/Week_04.sql
-
Loving the challenges so far
- Solution URL – https://github.com/canonicalized/FrostyFriday/blob/main/WEEK4.sql
-
Pfff, lot of work Great challenge though! Especially liked the window function action for `inter_house_id`.
https://github.com/marioveld/frosty_friday/tree/main/ffw4
- Solution URL – https://github.com/marioveld/frosty_friday/tree/main/ffw4
-
Great task to basic json skills
- Solution URL – https://github.com/tomoWakamatsu/FrostyFriday/blob/main/FrostyFriday-Week4.sql
-
Snowflake should really improve the infer_schema capabilities lol, nice to get hands dirty with JSON though.
- Solution URL – https://github.com/marco-scatassi-nimbus/Frosty-Friday/blob/main/week4/load_and_format_json.sql
Leave a Reply
You must be logged in to post a comment.