Eli Leiba created a function to try to generate a blacklist against SQL injection:
The suggested solution presented here involves creating a user defined T-SQL scalar function that checks the input string for any suspicious key words that might indicate the SQL injection intents.
The function checks the input string against a set of pre-defined keywords that are known to be used in SQL injection cases.
I get the intent here, but blacklists don’t work.
The first line of defense that many developers come up with is a blacklist: we know that keywords like “select,” “insert,” and “drop” are necessary to perform a SQL injection attack, so if we just ban those keywords, everything should be fine, right? Alas, life is not so simple; this leads to a number of problems with blacklists in general, as well as in this particular case.
The second-biggest problem with blacklists is that they could block people from performing legitimate requests. For example, a user at a paint company’s website may wish to search for “drop cloths,” so a naïve blacklist, outlawing use of the word “drop” in a search would lead to false positives.
The biggest problem is that, unless extreme care is taken, the blacklist will still let through malicious code. One of the big failures with SQL injection blacklists is that there are a number of different white-space characters: hex 0x20 (space), 0x09 (tab), 0x0A, 0x0B, 0x0C, 0x0D, and 0xA0 are all legitimate white-space as far as a SQL Server query is concerned. If the blacklist is looking for “drop table,” it is looking for the word “drop,” followed by a 0x20 character, followed by the word “table.” If we replace the 0x20 with a 0x09, it sails right by the blacklist.
With this particular blacklist, you have a pretty high probability of false positives: the list includes dashes, “tran,” “update,” “while,” “grant,” and even “go.” These are tokens used in SQL injection attempts, but they’re also very common words or word segments in English. This means that if you’re trying to blacklist a publicly-accessible search box which reads common English phrases, the incidence of false positive is going to be high enough that the blacklist changes. But even if it doesn’t, a dedicated attacker can still get around your blacklist; as the old saying goes, the attacker only needs to be right once.