Press "Enter" to skip to content

Category: Filestream

File Not Found in SQL Server 2022 with Distributed AG and Filestream

Sean Gallardy goes sleuthing:

I don’t often find many people using FileStream in their databases (which isn’t a bad or good thing, in my opinion, just a statement of fact). Some technologies in SQL Server use it behind the scene, such as FileTable or Hekaton, and there isn’t really any getting around it in those cases. However, I was brought an interesting issue by a friend on Database Administrators Stack Exchange, Hannah Vernon (w), when it came to a database that was in a Distributed Availability Group in 2019 and had no issues, but after upgrading to SQL Server 2022, started having a major problem.

Read on for Sean’s analysis of the problem and solution.

Comments closed

Database Restores and Filestream Woes

Joey D’Antoni troubleshoots a problem:

We had a case over the weekend where our automated restore process at client got hung up on this wait type, for a single database. What was the unique characteristic about this relatively medium (2-300 GB) database? It had a lot of filestream data–it seemed like the file count wasn’t that high, but my guess is the filestream data was the majority of the data in that database. When the job hung up, the restore had been waiting on PREEMPTIVE_OS_FILEOPS for over a day and still had a null value for percentage complete.

Click through to see what the problem was.

Comments closed

Finding Duplicate Images with Filetable

Louis Davidson continues a discussion of SQL Server’s filetable feature:

In my article “Filetables, Great For Business, Great For Hobbyist”, I implemented the basis of a hobby database I use for my Twitter accounts (@disneypicaday and @dollywoodp), where I post daily pictures of theme parks.

As my inventory of prepped pictures has grown, one thing has become a major pain. Duplicates. I have tens of thousands of pictures, and I scan them occasionally to add to my directories. Perhaps not surprisingly, the same pictures often get chosen a second time. I use a naming standard that integrates the files into my database, as well as for the copies to go out for a given day. So the second (and third and fourth) time I choose the same picture, it looks different, though it has the exact same bits.

There are some nice benefits to maintaining file metadata in a queryable format, though do read to the end to see the performance impact.

Comments closed

The Value of FileTable

Louis Davidson shows us a good scenario for SQL Server’s FileTable:

Now, my pictures are very manageable and searchable using SQL Server. The fact that the names are very standardized also helps me with searching the names in a file folder (I copy all of the files from the share into a Dropbox folder so I have access to the files everywhere I am as well.)

And additional cool thing is that even on my Express Edition local server, the filetable data is backed up with the database, so when I backup the database, it backs up my pictures in addition to the textual data. That extra backup, along with the copy of the files in Dropbox give me extra relief that I have a decent backup if something gets corrupted.

With just a little bit of configuration, and a little bit of code, I have a simple database of my images that I can search using T-SQL in a manner that is a lot more powerful that what I can reasonably do using File Explorer, along with the ability to keep history of how the pictures are used. This would be very useful for any photographer with a stock of pictures that they want to search and keep history of, even if it is just baby pictures…of the puppy.

Click through for the example. I think FileTable is one of the most underutilized features from SQL Server 2012.

Comments closed

Semantic Search in SQL Server

Haroon Ashraf wraps up a series:

Being the final part of the article, it is going to take you to the next level of analyzing word documents stored in Windows folders, managed by File Table, and consumed by Semantic Search.

Additionally, the readers are going to gain more understanding of Semantic Search and how to make it work with MS Word documents for analysis.

This article provides a name-based analysis of the documents with equal attention to both theory and practice.

Click through for the culmination of all of this filestream work.

Comments closed

Semantic Search with FileTable

Haroon Ashraf continues a series on semantic search with Windows and SQL Server:

The focus of the article is on comparing documents that can be stored on Windows File System in one respect and in the other respect their comparative analysis that can be performed with Semantic Search in SQL Server.

Additionally, the readers will learn how to store unstructured data by exploring File Table and creating MS Word documents on the fly (instantly) to be consumed by Semantic Search.

This part of the article is related to the use of Semantic Search on unstructured data for the extraction of basic level business-crucial information provided standard naming is in place.

Click through for the article.

Comments closed

Analyzing Windows Documents with SQL Server Semantic Search

Haroon Ashraf begins a series on semantic search over files:

The article examines storing unstructured data using native windows directory structure and managing it through the SQL Server database. Subsequently, we demonstrate how to perform comparative analysis with the help of Semantic Search in order to get valuable information out of the data. Additionally, you are going to get some hands-on experience of storing and linking documents on Windows File System and using Semantic Search.

It is of great importance for both database beginners and professionals to know how to apply file storage strategies such as File Table to searching technologies like Semantic Search for unstructured data analysis.

The first part of the series is all about setting up FileTable.

Comments closed

Problem With Merge Replication And FILESTREAM

Gianluca Sartori walks us through an error when combining merge replication with FILESTREAM:

I published tables with FILESTREAM data before, but it seems like there is a particular planetary alignment that triggers an error during the execution of the snapshot agent.

This unlikely combination consists in a merge article with a FILESTREAM column and two UNIQUE indexes on the ROWGUIDCOL column. Yes, I know that generally it does not make sense to have two indexes on the same column, but this happened to be one of the cases where it did, so we had a CLUSTERED PRIMARY KEY on the uniqueidentifier column decorated with the ROWGUIDCOL attribute and, on top, one more NONCLUSTERED UNIQUE index on the same column, backed by a UNIQUE constraint.

Setting up the publication does not throw any error, but generating the initial snapshot for the publication does:

Cannot create, drop, enable, or disable more than one constraint,
column, index, or trigger named 'ncMSmerge_conflict_TestMergeRep_DataStream'
in this context. Duplicate names are not allowed.

This is a rather specific confluence of events, so it probably won’t affect many people.  Still, it is a bug.

Comments closed

Filestream Garbage Collection Bug

Andy Mallon found a bug with Filestream:

I tracked the problem back to this specific scenario: When you have a database that is part of an Availability Group, and you drop a table that contains filestream data, the filestream garbage collection does not clean up the data container subdirectory that corresponds to that table. Garbage collection will continue to clean up other items (eg, deleted rows), but the dropped table never gets cleaned up.

I’ve logged this as a connect item for your up-voting pleasure.

Go forth and upvote.

Comments closed