Building A Recommendation System With Graph Data

Arvind Shyamsundar and Shreya Verma show how to implement a recommender system using SQL Server 2017’s new graph database functionality:

As you can see from the animation, the algorithm is quite simple:

  • First, we identify the user and ‘current’ song to start with (red line)
  • Next, we identify the other users who have also listened to this song (green line)
  • Then we find the other songs which those other users have also listened to (blue, dotted line)
  • Finally, we direct the current user to the top songs from those other songs, prioritized by the number of times they were listened to (this is represented by the thick violet line.)

The algorithm above is quite simple, but as you will see it is quite effective in meeting our requirement. Now, let’s see how to actually implement this in SQL Server 2017.

Click through for animated images as well as an actual execution plan and recommendations for graph query optimization (spoilers:  columnstore all the things).  They also link to the GitHub project where you can try it out yourself.

Related Posts

APPROX_COUNT_DISTINCT

Niko Neugebauer is happy with a new function in SQL Server 2019: A rather interesting result takes place if we scale our database to 100GB TPCH and run the very same queries – the total elapsed time jumps to 50% difference (from 30%), the CPU execution time difference is kept at 50%, but the memory […]

Read More

Simulating LAG And LEAD Prior To SQL Server 2012

Izik Ben-Gan highlights a reader submission from his last post: Last month I covered a Special Islands challenge. The task was to identify periods of activity for each service ID, tolerating a gap of up to an input number of seconds (@allowedgap). The caveat was that the solution had to be pre-2012 compatible, so you couldn’t […]

Read More

Categories

April 2017
MTWTFSS
« Mar May »
 12
3456789
10111213141516
17181920212223
24252627282930