Full-Text PDF Search

Jon Morisi shows how to use Full-Text Search to read PDF files:

Faced with this very issue, I decided to set up a local SQL Server Full-Text Search.
Some of the cool things Full-Text Search will let you search for, over and above a standard search, include the following:

  • One or more specific words or phrases (simple term)
  • A word or a phrase where the words begin with specified text (prefix term)
  • Inflectional forms of a specific word (generation term)
  • A word or phrase close to another word or phrase (proximity term)
  • Synonymous forms of a specific word (thesaurus)
  • Words or phrases using weighted values (weighted term)
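As a rough illustration, each of the term types above maps onto a CONTAINS or CONTAINSTABLE predicate in T-SQL. The table dbo.Documents and its columns (DocId, FileName, DocContent) are hypothetical names chosen for this sketch, not taken from the original post, and the queries assume a full-text index already exists on DocContent.

```sql
-- Hypothetical table: dbo.Documents(DocId, FileName, DocContent),
-- with a full-text index on DocContent.

-- Simple term: an exact word or phrase
SELECT DocId, FileName FROM dbo.Documents
WHERE CONTAINS(DocContent, '"full-text search"');

-- Prefix term: words beginning with "index" (index, indexes, indexing)
SELECT DocId, FileName FROM dbo.Documents
WHERE CONTAINS(DocContent, '"index*"');

-- Generation term: inflectional forms (drive, drives, drove, driving)
SELECT DocId, FileName FROM dbo.Documents
WHERE CONTAINS(DocContent, 'FORMSOF(INFLECTIONAL, drive)');

-- Proximity term: "backup" within 10 terms of "restore"
-- (the custom NEAR syntax needs SQL Server 2012 or later)
SELECT DocId, FileName FROM dbo.Documents
WHERE CONTAINS(DocContent, 'NEAR((backup, restore), 10)');

-- Thesaurus: synonyms defined in the language's thesaurus file
SELECT DocId, FileName FROM dbo.Documents
WHERE CONTAINS(DocContent, 'FORMSOF(THESAURUS, car)');

-- Weighted term: weights influence RANK, so use CONTAINSTABLE
SELECT d.DocId, d.FileName, ft.[RANK]
FROM CONTAINSTABLE(dbo.Documents, DocContent,
         'ISABOUT(performance WEIGHT(0.8), tuning WEIGHT(0.2))') AS ft
JOIN dbo.Documents AS d ON d.DocId = ft.[KEY]
ORDER BY ft.[RANK] DESC;
```

The thesaurus query only returns synonyms if the thesaurus file for the index language has been populated; by default it ships empty.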
In order to get started with the setup, it’s important to know that the Full-Text Search architecture relies on filters for searching various file types.  This is important for this example because the PDF filter is not installed by default.  So, for starters, we need to go download and install the PDF iFilter (PDFFilter64Setup.msi).
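A minimal sketch of what the setup might look like on the SQL Server side once the iFilter is installed. The table, catalog, and constraint names (dbo.Documents, DocumentCatalog, PK_Documents) are assumptions made for illustration and do not come from the original post; the instance may also need a restart before a newly installed filter shows up.

```sql
-- Allow the instance to load filters registered with the operating system.
EXEC sp_fulltext_service @action = 'load_os_resources', @value = 1;

-- Check whether a filter for .pdf is now visible to Full-Text Search.
SELECT document_type, path, manufacturer
FROM sys.fulltext_document_types
WHERE document_type = '.pdf';

-- Hypothetical table holding the PDF bytes plus a type column
-- that tells Full-Text Search which filter to apply to each row.
CREATE TABLE dbo.Documents (
    DocId         INT IDENTITY(1,1) NOT NULL
                  CONSTRAINT PK_Documents PRIMARY KEY,
    FileName      NVARCHAR(260)  NOT NULL,
    FileExtension NVARCHAR(8)    NOT NULL,  -- e.g. '.pdf'
    DocContent    VARBINARY(MAX) NOT NULL
);

-- Hypothetical catalog name.
CREATE FULLTEXT CATALOG DocumentCatalog;

-- Index the binary column; TYPE COLUMN points at the extension,
-- LANGUAGE 1033 is US English.
CREATE FULLTEXT INDEX ON dbo.Documents
    (DocContent TYPE COLUMN FileExtension LANGUAGE 1033)
    KEY INDEX PK_Documents
    ON DocumentCatalog
    WITH CHANGE_TRACKING AUTO;
```

If the SELECT against sys.fulltext_document_types returns no row for .pdf, the index will still be created, but PDF rows will not have their text extracted.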

Up until I read this blog post, I had no idea that full-text search could index PDFs, so that’s very interesting.
