Have you heard about Artic? What incredible news! But for this challenge we will leave AI and LLMs aside and focus on a very cute function for finding matches or making a comparison between two input strings: JAROWINKLER_SIMILARITY.
Your task is to write a SQL query in Snowflake that sorts a table of strings based on their Jaro-Winkler similarity to a given reference string.
This is the code to create the needed table (fruit today!). While the reference string is ‘strawberry’ 🍓
-- Create a table named fruit_salad
CREATE OR REPLACE TABLE fruit_salad (
fruits VARCHAR(255)
);
-- Insert sample frutis into the fruit_salad
INSERT INTO fruit_salad (fruits) VALUES
('apple'),
('apricot'),
('banana'),
('pineapple'),
('oranges'),
('kiwi'),
('strawberry'),
('grape'),
('watermelon'),
('pear'),
('peach'),
('strawberry'),
('blueberry'),
('mango'),
('lemon'),
('lime'),
('papaya'),
('cherry'),
('plum'),
('fig'),
('passion fruit'),
('raspberry'),
('blackberry'),
('nectarine'),
('cantaloupe'),
('apricot'),
('tangerine'),
('guava'),
('dragon fruit');
The query you need to create should return strings sorted in descending order based on their similarity to the reference string.
Remember if you want to participate:
- Sign up as a member of Frosty Friday. You can do this by clicking on the sidebar, and then going to ‘REGISTER‘ (note joining our mailing list does not give you a Frosty Friday account)
- Post your code to GitHub and make it publicly available (Check out our guide if you don’t know how to here)
- Post the URL in the comments of the challenge.
darko says
Jaro & Winkler for the win!
marcoscatassi says
As always there’s something to learn! It would have been nice to compare it with other metrics 😉
ChrisHastie says
This is a really useful function if used correctly. Definitely helpful for eliminating typos of finding similar statements in freetext fields, just be wary of matching your strawberries to your raspberries!