Fuzzy Matching in Snowflake like EDIT_DISTANCE_SIMILARITY

Does Snowflake have a function for fuzzy name matching, like UTL_MATCH.EDIT_DISTANCE_SIMILARITY in Oracle? I need to find the differences at the row level.



Solution 1:[1]

Snowflake has EDITDISTANCE and SOUNDEX functions:

select editdistance('Duningham', 'Cunningham');
-- Result 2

select soundex('McArthur') = soundex('MacArthur');
-- Result TRUE

Unlike EDIT_DISTANCE_SIMILARITY, which returns a normalized score where higher means closer, EDITDISTANCE returns the raw number of edits, so lower scores are closer matches. There are also many open source JavaScript implementations of fuzzy matching that you could plug into a Snowflake JavaScript UDF.
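If you want a score that reads like Oracle's 0 to 100 similarity, one option is to normalize EDITDISTANCE by the length of the longer string. The query below is a minimal sketch under that assumption: the formula 100 * (1 - distance / longer length) is my own normalization and may not match Oracle's rounding exactly, and the sample rows and the column names name_a and name_b are hypothetical. For 'Duningham' and 'Cunningham', a distance of 2 over a length of 10 works out to 80.

```sql
-- Hypothetical sample rows; replace the VALUES clause with your own table.
with names as (
    select column1 as name_a, column2 as name_b
    from values
        ('Duningham', 'Cunningham'),
        ('McArthur', 'MacArthur')
)
select
    name_a,
    name_b,
    -- Raw edit count: lower means closer.
    editdistance(name_a, name_b) as distance,
    -- Assumed normalization to 0..100: higher means closer.
    -- NULLIF avoids division by zero when both strings are empty
    -- (the result is then NULL).
    round(100 * (1 - editdistance(name_a, name_b)
                     / nullif(greatest(length(name_a), length(name_b)), 0))) as similarity
from names;
```

To compare at the row level across two tables, the same expression can go in the select list or join condition of a join between them, with a threshold such as similarity >= 80 chosen to suit your data.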

Solution 2:[2]

Interzoid (disclaimer: I work there) offers matching capabilities with native Snowflake connectivity. It combines knowledge bases (for different data types: name, company, address, etc.), heuristics, Soundex, spelling analysis, derivatives, contextual ML, and so on, built around a similarity key technology that can be used with one or more tables. It calls an underlying API for each record in a table to generate the similarity keys (which can be appended to the table if desired), and the fuzzy matching is based on those keys: https://connect.interzoid.com/matching-data-database. It would work for the scenario above.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Silver Bullet