As you can see from the animation, the algorithm is quite simple:
- First, we identify the user and ‘current’ song to start with (red line)
- Next, we identify the other users who have also listened to this song (green line)
- Then we find the other songs which those other users have also listened to (blue, dotted line)
- Finally, we direct the current user to the top songs from those other songs, prioritized by the number of times they were listened to (this is represented by the thick violet line.)
The algorithm above is quite simple, but as you will see it is quite effective in meeting our requirement. Now, let’s see how to actually implement this in SQL Server 2017.
Click through for animated images as well as an actual execution plan and recommendations for graph query optimization (spoilers: columnstore all the things). They also link to the GitHub project where you can try it out yourself.