Ingesting Google Analytics Data Into Kafka With Python

Kevin Feasel

2018-07-06

Hadoop

Bill Ward shows us an example using Python of reading Google Analytics data into a Kafka topic:

Google Analytics is a very powerful platform for monitoring your web site’s metrics including top pages, visitors, bounce rate, etc. As more and more businesses start using Big Data processes, the need to compile as much data as possible becomes more advantageous. The more data you have available, the more options you have to analyze that data and produce some very interesting results that can help you shape your business.

This article assumes that you already have a running Kafka cluster. If you don’t, then please follow my article Kafka Tutorial for Fast Data Architecture to get a cluster up and running. You will also need to have a topic already created for publishing the Google Analytics metrics to. The aforementioned article covers this procedure as well. I created a topic called admintome-ga-pages since we will be collection Google Analytics (ga) metrics about my blog’s pages.

In this article, I will walk you through how to pull metrics data from Google Analytics for your site then take that data and push it to your Kafka Cluster. In a later article, we will cover how to take that data that is consumed in Kafka and analyze it to give us some meaningful business data.

This is a good tutorial and I’m looking forward to part two.

Related Posts

Testing an Event-Driven System

Andy Chambers takes us through how to test an event-driven system: Each distinct service has a nice, pure data model with extensive unit tests, but now with new clients (and consequently new requirements) coming thick and fast, the number of these services is rapidly increasing. The testing guardian angel who sometimes visits your thoughts during […]

Read More

Processing Fixed-Width Files with Spark

Subhasish Guha shows how you can read a fixed-with file with Apache Spark: A fixed width file is a very common flat file format when working with SAP, Mainframe, and Web Logs. Converting the data into a dataframe using metadata is always a challenge for Spark Developers. This particular article talks about all kinds of […]

Read More

Categories

July 2018
MTWTFSS
« Jun Aug »
 1
2345678
9101112131415
16171819202122
23242526272829
3031