Building Graph Tables

Tomaz Kastrun uses a set of e-mails as his SQL Server 2017 graph table data source:

To put the graph database to the test, I took a bunch of emails from a particular MVP SQL Server distribution list (content will not be shown and all the names will be anonymized). From my Gmail account, I downloaded some 90 MiB of emails in mbox file format. With some Python scripting, only FROM and SUBJECT were extracted:

    for index, message in enumerate(mailbox.mbox(infile)):
        content = get_content(message)
        row = [
            message['from'].strip('>').split('<')[-1],
            decode_header(message['subject'])[0][0],
            "|"
        ]
        writer.writerow(row)
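The excerpt above leaves out its imports and the definitions of `infile`, `writer`, and `get_content`. A minimal self-contained sketch of the same idea might look like the following. The helper names (`extract_address`, `decode_subject`, `mbox_to_rows`, `write_rows`) are hypothetical and not from the original post, and the body-reading step is dropped because only the sender and subject end up in the output.

```python
import csv
import mailbox
from email.header import decode_header


def extract_address(raw):
    """Pull the bare address out of a From header such as 'Name <a@b.c>'."""
    if raw is None:
        return ""
    return str(raw).strip().strip(">").split("<")[-1]


def decode_subject(raw):
    """Return the first decoded fragment of a Subject header as text."""
    if raw is None:
        return ""
    value, charset = decode_header(str(raw))[0]
    if isinstance(value, bytes):
        try:
            return value.decode(charset or "utf-8", errors="replace")
        except LookupError:
            # Unknown charset label: fall back to UTF-8 (an assumption).
            return value.decode("utf-8", errors="replace")
    return value


def mbox_to_rows(infile):
    """Yield [sender, subject, '|'] for every message in an mbox file."""
    box = mailbox.mbox(infile, create=False)
    try:
        for message in box:
            yield [
                extract_address(message["from"]),
                decode_subject(message["subject"]),
                "|",
            ]
    finally:
        box.close()


def write_rows(infile, outfile):
    """Write the extracted rows to a CSV file (hypothetical output layout)."""
    with open(outfile, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for row in mbox_to_rows(infile):
            writer.writerow(row)
```

Calling `write_rows("mail.mbox", "mail.csv")` (both file names are placeholders) would produce one line per message. Decoding the subject to text rather than writing raw bytes is a choice made here for Python 3; the original excerpt writes whatever `decode_header` returns.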

This post mostly walks you through loading data, but at the end you can see how easy it is to find who replied to whose e-mails.
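The linked post answers the "who replied to whom" question with SQL Server graph tables. The sketch below is not that method. It is only a Python illustration of how sender and subject pairs could be turned into directed edges. It assumes that a reply is any message whose subject starts with "Re:", and that the first sender seen for a subject is the original poster. Both are simplifications, and the function names are hypothetical.

```python
import re
from collections import Counter

# Leading "Re:", "Fw:" or "Fwd:" markers, possibly repeated.
_PREFIX = re.compile(r"^\s*((re|fw|fwd)\s*:\s*)+", re.IGNORECASE)
_REPLY = re.compile(r"^\s*re\s*:", re.IGNORECASE)


def thread_key(subject):
    """Normalize a subject so a post and its replies share one key."""
    return _PREFIX.sub("", subject).strip().lower()


def reply_edges(rows):
    """Count (replier, original_poster) edges from (sender, subject) pairs.

    Rows are expected in mailbox order, so the first sender seen for a
    thread is treated as the original poster (an assumption).
    """
    first_sender = {}
    edges = Counter()
    for sender, subject in rows:
        key = thread_key(subject)
        if key not in first_sender:
            first_sender[key] = sender
            continue
        starter = first_sender[key]
        # Skip people replying to their own thread.
        if _REPLY.match(subject) and sender != starter:
            edges[(sender, starter)] += 1
    return edges
```

Each key in the returned counter corresponds to one directed edge from the person who replied to the person who started the thread, and the count could serve as an edge weight.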

Related Posts

Installing Python Support In SQL Server

Ginger Grant has a teaser for her upcoming 24 Hours of PASS talk: The process for using Python in SQL Server is very similar to the previous process of installing R.  Microsoft renamed R Services to Machine Learning Services, and now allows both R and Python to be installed, as shown in the screen.  Microsoft’s […]


Naming Graph Edges

Greg Low is trying to find a common nomenclature for edges in graphs:

Positive (Forward) Direction

I’d also like to see the tables use a forward direction naming rather than reverse (like “Written By”). So perhaps:

($from_id) the member Wrote the post ($to_id)
($from_id) who Likes who/what ($to_id)
($from_id) the reply to the main post RepliesTo the main post ($to_id)

Avoid […]


