Implementing fuzzy search in SQL server - part 2: Levenshtein distance
Levenshtein Distance Comparison[/caption]
Levenshtein Distance CLR
To implement the Levenshtein Distance CLR, run this SQL ScriptLevenshtein Distance T-SQL
To implement the Levenshtein Distance in T-SQL, run the below code. Please note that this function has a cut-off value (@d) where it simply gives up and returns -1. [code]CREATE FUNCTION edit_distance_within(@s nvarchar(4000), @t nvarchar(4000), @d int) RETURNS int AS BEGIN DECLARE @sl int, @tl int, @i int, @j int, @sc nchar, @c int, @c1 int, @cv0 nvarchar(4000), @cv1 nvarchar(4000), @cmin int SELECT @sl = LEN(@s), @tl = LEN(@t), @cv1 = '', @j = 1, @i = 1, @c = 0 WHILE @j <= @tl SELECT @cv1 = @cv1 + NCHAR(@j), @j = @j + 1 WHILE @i <= @sl BEGIN SELECT @sc = SUBSTRING(@s, @i, 1), @c1 = @i, @c = @i, @cv0 = '', @j = 1, @cmin = 4000 WHILE @j <= @tl BEGIN SET @c = @c + 1 SET @c1 = @c1 - CASE WHEN @sc = SUBSTRING(@t, @j, 1) THEN 1 ELSE 0 END IF @c > @c1 SET @c = @c1 SET @c1 = UNICODE(SUBSTRING(@cv1, @j, 1)) + 1 IF @c > @c1 SET @c = @c1 IF @c < @cmin SET @cmin = @c SELECT @cv0 = @cv0 + NCHAR(@c), @j = @j + 1 END IF @cmin > @d BREAK SELECT @cv1 = @cv0, @i = @i + 1 END RETURN CASE WHEN @cmin <= @d AND @c <= @d THEN @c ELSE -1 END END [/code] Discover more about our expertise in SQL Server.On this page
Share this
Share this
More resources
Learn more about Pythian by reading the following blogs and articles.
Magic Knobs in Replication Agents
Magic Knobs in Replication Agents
Oct 31, 2013 12:00:00 AM
3
min read
Why your Cassandra cluster scales poorly

Why your Cassandra cluster scales poorly
Mar 20, 2023 12:00:00 AM
13
min read
GoldenGate 12.2 big data adapters: part 4 - HBASE
GoldenGate 12.2 big data adapters: part 4 - HBASE
May 11, 2016 12:00:00 AM
13
min read
Ready to unlock value from your data?
With Pythian, you can accomplish your data transformation goals and more.