In Week 1 we looked at ingesting S3 data; now it’s time to take that a step further. So this week we’ve got a short list of tasks for you all to do.
The basics aren’t earth-shattering but might cause you to scratch your head a bit once you start building the solution.
Frosty Friday Inc., your benevolent employer, has an S3 bucket filled with .csv data dumps. These dumps aren’t very complicated and all share the same style and contents. All of these files should be loaded into a single table.
However, some important data occasionally gets uploaded as well; these files follow a different naming scheme and need to be tracked, with their metadata stored in a separate table for reference. You can recognize these files thanks to a file inside the S3 bucket: keywords.csv contains all of the keywords that mark a file as important.
Objective:
Create a table that lists all the files in our stage that contain any of the keywords in the keywords.csv file.
The S3 bucket’s URI is: s3://frostyfridaychallenges/challenge_3/
Result:
Your result should look like:
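One possible shape of a solution, sketched in Snowflake SQL (the stage, table, and column names below are my own assumptions, not part of the challenge):

```sql
-- Point an external stage at the challenge bucket.
create or replace stage week3_stage
  url = 's3://frostyfridaychallenges/challenge_3/';

-- Load keywords.csv into its own table
-- (assuming a single-column file with a header row).
create or replace table keywords (keyword varchar);
copy into keywords
  from @week3_stage
  files = ('keywords.csv')
  file_format = (type = csv skip_header = 1);

-- LIST writes the stage's file listing into the result cache;
-- RESULT_SCAN then lets us query that listing like a table.
list @week3_stage;

create or replace table important_files as
select distinct f."name" as file_name, f."size", f."last_modified"
from table(result_scan(last_query_id())) f
join keywords k
  on f."name" ilike '%' || k.keyword || '%';
```

DISTINCT guards against duplicate rows when a file name happens to match more than one keyword; an EXISTS filter would work just as well.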
Remember if you want to participate:
- Sign up as a member of Frosty Friday. You can do this by clicking on the sidebar and then going to ‘REGISTER’ (note that joining our mailing list does not give you a Frosty Friday account)
- Post your code to GitHub and make it publicly available (check out our guide here if you don’t know how)
- Post the URL in the comments of the challenge.
If you have any technical questions you’d like to pose to the community, you can ask here on our dedicated thread.
31 responses to “Week 3 – Basic”
-
I had time to fit in one more today. Fun little challenge
-
Reorganised my submission repo, new master solution URL here.
- Solution URL – https://github.com/ChrisHastieIW/Frosty-Friday
-
Good use case. Used PATTERN with COPY and CONCAT while loading data to the table.
- Solution URL – https://github.com/sachingithubaccnt/Frosty_Friday/blob/main/Week3_Basic.sql
-
My approach:
1. Use the macro from week 1 to create a new stage.
2. List the files in the stage.
3. Retrieve the last_query_id to scan the results.
4. Read the keywords’ file.
5. Keep the files in the stage that match any of the keywords (`ilike '%' || kwd || '%'`).
Still sticking to dbt: https://github.com/dsmdavid/frostyfridays-sf/blob/main/models/challenge_03.sql
or the dbt docs: http://dvd-il-bucket-public.s3-website-eu-west-1.amazonaws.com/#!/model/model.frosty.challenge_03
-
Notes & Documentation: https://www.craft.do/s/K03sY2JoJWQqAJ
- Solution URL – https://github.com/meerens/frosty-friday/blob/main/week_3_basic_metadata_queries.sql
-
Nice, gradual upgrade in difficulty from Week 1!
Here is my video solution: https://www.youtube.com/watch?v=vZGC1ZQzwoI
And here the solution on GitHub: [https://github.com/DownToEarthDataTips/FrostyFridays/blob/main/Week 3 – Load Metadata into a Table](https://github.com/DownToEarthDataTips/FrostyFridays/blob/main/Week%203%20%20-%20Load%20Metadata%20into%20a%20Table)
- Solution URL – https://www.youtube.com/watch?v=vZGC1ZQzwoI
-
Continuing to practice with metadata, this time with string matching.
- Solution URL – https://github.com/zlzlzl2-data/FrostyFriday/blob/693b7b7aa622bb31018f247f30b0ea282840d844/FF3_LZ.sql
-
LIKE ANY was a new one for me...
- Solution URL – https://github.com/scallybrian/bs-frosty-sf/blob/main/frostyfridays/models/challenge_003/challenge_003.sql
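For anyone else who hadn’t met it, LIKE ANY compares a string against several patterns at once. A tiny standalone illustration (the file names and patterns below are made up, not taken from the challenge data):

```sql
select file_name
from (values
        ('1668_important_dump.csv'),
        ('plain_dump_07.csv')) as t(file_name)
where file_name like any ('%important%', '%vital%', '%priority%');
-- Keeps only the rows matching at least one of the patterns.
```

Note that LIKE ANY is case-sensitive; ILIKE ANY is the case-insensitive variant.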
-
Created the final table using two methods: WHERE EXISTS and LIKE ANY.
- Solution URL – https://github.com/ChrisBBiztory/FrostyFriday/blob/main/Week_3.sql
-
Check out this solution blog: https://theinformationlab.nl/2022/08/26/snowflake-skills-3-metadata/
- Solution URL – https://github.com/Atzmonky/snowflake/blob/main/ch3_metadata.sql
-
Catching up, 3/15…
- Solution URL – https://github.com/pekka-kanerva/frosty-friday/blob/main/week3/ff-week3-solution.sql
-
My solution
- Solution URL – https://github.com/isarenz/FrostyFridayChallenges/blob/main/FrostyFriday_Challenge_Week3.sql
-
A bit different: I didn’t load into proper columns, and used load_history to get information on table loads.
- Solution URL – https://github.com/ankitsr92/Ankit_Work/blob/main/FrostyFriday/Week1/Week3.sql
-
Keeping up my Snowflake practice, even if my code seems very basic for this task.
- Solution URL – https://github.com/meerens/frosty-friday/blob/main/week_3_basic_metadata_queries.sql
-
Really liked this one as an intro to metadata!
-
Solution URL updated
- Solution URL – https://github.com/jameskalfox/frosty-friday-snowflake-challenges/blob/main/Week_03.sql
-
Here’s my solution.
I found a neat workaround to COPY INTO not allowing complex queries by running my query first and then using `table(result_scan(last_query_id()))` within the copy statement. The only quirk was that in DataGrip you need to offset the query id by -5 for some reason.
- Solution URL – https://github.com/apd-jlaird/frosty-friday/blob/main/week_3/ff_week_3.sql
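The RESULT_SCAN pattern described above, in rough outline (stage and table names are placeholders; how you persist the result will depend on your setup):

```sql
-- Run the complex query first; Snowflake caches its result set for the session.
list @week3_stage;
select f."name" as file_name
from table(result_scan(last_query_id())) f
join keywords k on f."name" ilike '%' || k.keyword || '%';

-- Then persist the cached result without re-stating the complex query.
-- LAST_QUERY_ID also accepts a negative offset, e.g. last_query_id(-2),
-- to reach further back in the session's query history.
create or replace table keyword_hits as
select * from table(result_scan(last_query_id()));
```

The offset argument is presumably what the DataGrip quirk above is about: IDE housekeeping queries can land between your query and the RESULT_SCAN, shifting which entry in the history is "last".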
-
LIKE ANY is a good syntax.
- Solution URL – https://github.com/indigo13love/FrostyFriday/blob/main/week3.sql
-
Not sure it was the most efficient way to do it but got to the answer…
- Solution URL – https://github.com/BenAConnor/Frosty_Friday/blob/main/Week%203%20Solution.sql
-
Loving the challenges so far!
- Solution URL – https://github.com/canonicalized/FrostyFriday/blob/main/WEEK3.sql
-
Nice one, scratched my head more than once!
- Solution URL – https://github.com/arjansnowflake/Frosty_Friday/blob/main/Week_3/week_3.sql
Leave a Reply
You must be logged in to post a comment.