Building Graph Tables

Tomaz Kastrun uses a set of e-mails as his SQL Server 2017 graph table data source:

To put the graph database to the test, I took bunch of emails from a particular MVP SQL Server distribution list (content will not be shown and all the names will be anonymized). On my gmail account, I have downloaded some 90MiB of emails in mbox file format. With some python scripting,  only FROM and SUBJECTS were extracted:

writer.writerow(['from','subject'])
for index, message in enumerate(mailbox.mbox(infile)): content = get_content(message) row = [ message['from'].strip('>').split('<')[-1], decode_header(message['subject'])[0][0],"|" ] writer.writerow(row)

This post walks you through loading data, mostly.  But at the end, you can see how easy it is to find who replied to whose e-mails.

Related Posts

R Versus Python

Vincent Granville believes that Python is overtaking R in the realm of data science: We use the app in question to compare search interest for R data Science versus Python Data Science, see above chart.  It looks like until December 2016, R dominated, but fell below Python by early 2017. The above chart displays an interest index, 100 […]

Read More

Classifying Time Series Data With TensorFlow

Burak Himmetoglu applies a time series data set to two different types of neural networks using TensorFlow: In this blog post, I will discuss the use of deep leaning methods to classify time-series data, without the need to manually engineer features. The example I will consider is the classic Human Activity Recognition (HAR) dataset from […]

Read More

Categories

June 2017
MTWTFSS
« May Jul »
 1234
567891011
12131415161718
19202122232425
2627282930