Implementing fuzzy search in SQL Server - part 2: Levenshtein distance
The Levenshtein Distance, as discussed in my last post, is a way to measure how far two strings are...
Last week I published a blog post titled "Are You Ready For the Leap Second?", and by looking at...
If a lot of rows or pages are locked, SQL Server escalates to a table-level lock to save...
When I’m not working on Big Data infrastructure for clients, I develop a few internal web...
One of the well-known best practices for HDFS is to store data in a few large files, rather than a...
The code. For the impatient ones, the Couchbase collector can be found on GitHub: Couchbase...
Today we’re having a quick one. Earlier during the day, I had to peruse an Oozie log for the first...
Have you ever wondered about tuning your replication configuration to achieve better performance or...
I’ve been using Sqoop to load data into HDFS from Oracle. I’m using version 1.4.3 of Sqoop, running...
Some decisions sound easy, but it's also easy to get them wrong. Today I had a choice of hanging...