Full-Text PDF Search

Jon Morisi shows how to use Full-Text Search to read PDF files:

Faced with this very issue, I decided to set up a local SQL Server Full-Text Search.
Some of the cool things Full-Text Search will give you, over and above a standard search, include the following:

  • One or more specific words or phrases (simple term)
  • A word or a phrase where the words begin with specified text (prefix term)
  • Inflectional forms of a specific word (generation term)
  • A word or phrase close to another word or phrase (proximity term)
  • Synonymous forms of a specific word (thesaurus)
  • Words or phrases using weighted values (weighted term)
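
Each of the term types above maps onto a search condition in the CONTAINS or CONTAINSTABLE predicates. The following is a minimal sketch rather than code from the original post: it assumes a hypothetical table dbo.Documents with columns DocId, DocName, and a full-text indexed column Content, and the search words are placeholders.

```sql
-- Assumption: dbo.Documents(DocId, DocName, Content) is a hypothetical table
-- with a full-text index on Content. The search words are illustrative only.

-- Simple term: an exact word or phrase
SELECT DocId, DocName
FROM dbo.Documents
WHERE CONTAINS(Content, '"database backup"');

-- Prefix term: words beginning with the given text
SELECT DocId, DocName
FROM dbo.Documents
WHERE CONTAINS(Content, '"restor*"');

-- Generation term: inflectional forms (restore, restored, restoring, ...)
SELECT DocId, DocName
FROM dbo.Documents
WHERE CONTAINS(Content, 'FORMSOF(INFLECTIONAL, restore)');

-- Proximity term: two words within 10 terms of each other
SELECT DocId, DocName
FROM dbo.Documents
WHERE CONTAINS(Content, 'NEAR((backup, restore), 10)');

-- Thesaurus: synonyms defined in the language's thesaurus file
SELECT DocId, DocName
FROM dbo.Documents
WHERE CONTAINS(Content, 'FORMSOF(THESAURUS, database)');

-- Weighted term: CONTAINSTABLE returns a RANK influenced by the weights
SELECT d.DocId, d.DocName, k.[RANK]
FROM CONTAINSTABLE(
         dbo.Documents,
         Content,
         'ISABOUT(backup WEIGHT(0.8), restore WEIGHT(0.2))'
     ) AS k
JOIN dbo.Documents AS d
    ON d.DocId = k.[KEY]
ORDER BY k.[RANK] DESC;
```

The thesaurus query only finds synonyms if the thesaurus file for the index language has been populated, and the NEAR syntax with a maximum distance requires SQL Server 2012 or later.
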
In order to get started with the setup, it’s important to know that the Full-Text Search architecture relies on filters for searching various file types. This is important for this example because the PDF filter is not installed by default. So, for starters, we need to download and install the PDF iFilter (PDFFilter64Setup.msi).
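
Once the iFilter is installed, the remaining setup might look roughly like the sketch below. It is an illustration under stated assumptions, not the exact script from the linked post: the table dbo.Documents, the catalog DocumentCatalog, and the file path are all hypothetical, and the SQL Server service may need a restart before the new filter shows up.

```sql
-- Let Full-Text Search load filters registered with the operating system.
EXEC sp_fulltext_service @action = 'load_os_resources', @value = 1;
-- Allow unsigned filter binaries; this relaxes a security check, so only
-- use it if you trust the installed iFilter.
EXEC sp_fulltext_service @action = 'verify_signature', @value = 0;

-- After a service restart, confirm the PDF filter is registered.
SELECT document_type, path
FROM sys.fulltext_document_types
WHERE document_type = '.pdf';

-- Hypothetical table: the binary file goes in Content, and FileExtension
-- tells Full-Text Search which filter to apply to each row.
CREATE TABLE dbo.Documents
(
    DocId         INT IDENTITY(1, 1) NOT NULL
        CONSTRAINT PK_Documents PRIMARY KEY,
    DocName       NVARCHAR(260)  NOT NULL,
    FileExtension NVARCHAR(10)   NOT NULL,  -- e.g. '.pdf'
    Content       VARBINARY(MAX) NOT NULL
);

-- Hypothetical catalog name.
CREATE FULLTEXT CATALOG DocumentCatalog AS DEFAULT;

CREATE FULLTEXT INDEX ON dbo.Documents
(
    Content TYPE COLUMN FileExtension LANGUAGE 1033  -- 1033 = US English
)
KEY INDEX PK_Documents
ON DocumentCatalog
WITH CHANGE_TRACKING AUTO;

-- Load one PDF from disk (the path is a placeholder).
INSERT INTO dbo.Documents (DocName, FileExtension, Content)
SELECT 'example.pdf', '.pdf', f.BulkColumn
FROM OPENROWSET(BULK 'C:\Docs\example.pdf', SINGLE_BLOB) AS f;
```

With change tracking set to AUTO, newly inserted rows are picked up by the index in the background, so a query may not find a document immediately after the insert.
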

Up until I read this blog post, I had no idea that full-text search could index PDFs, so that’s very interesting.
