Analyzing Clickstream Data With Markov Chains

Eleni Markou shows one method of analyzing clickstream data:

We chose to use a third-order Markov Chain on the above-produced data, as:

  • The number of parameters needed for the chain’s representation remains manageable. As the order increases, the parameters necessary for the representation increase exponentially and thus managing them requires significant computational power.
  • As a rule of thumb, we would like at least half of the clickstreams to consist of as many clicks as the order of the Markov Chain that should be fitted. There is no point in selecting a third-order chain if the majority of the clickstreams consist of only two states, since there is then no state three steps behind to take into consideration.
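Both criteria can be checked before fitting anything. The sketch below is illustrative only: the function names and the sample clickstreams are made up for this example, and it reads the rule of thumb as "at least half of the clickstreams have at least as many clicks as the order." The parameter count is for an unrestricted chain over m states, which has m^k * (m - 1) free transition probabilities at order k.

```python
def full_chain_parameter_count(num_states, order):
    """Free parameters of an unrestricted Markov chain of the given order."""
    return num_states ** order * (num_states - 1)


def share_long_enough(clickstreams, order):
    """Fraction of clickstreams that contain at least `order` clicks."""
    if not clickstreams:
        return 0.0
    return sum(len(c) >= order for c in clickstreams) / len(clickstreams)


def max_supported_order(clickstreams, min_share=0.5):
    """Largest order for which at least `min_share` of clickstreams are long enough."""
    if min_share <= 0:
        raise ValueError("min_share must be positive")
    order = 0
    while share_long_enough(clickstreams, order + 1) >= min_share:
        order += 1
    return order


# Hypothetical sessions, invented for illustration.
sample_clickstreams = [
    ["home", "product", "cart", "checkout"],
    ["home", "product"],
    ["home", "search", "product"],
    ["home"],
]

supported_order = max_supported_order(sample_clickstreams)
params_by_order = {k: full_chain_parameter_count(5, k) for k in (1, 2, 3)}
print(supported_order, params_by_order)
```

With five page types, each extra order multiplies the parameter count by five, which is the exponential growth the first bullet describes.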

Fitting the Markov Chain model gives us transition probability matrices and the lambda parameters of the chain for each of the three lags, along with the start and end probabilities.
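To make the role of those outputs concrete, here is a minimal sketch of how per-lag transition matrices and lambda weights could be combined into a next-page prediction. It assumes the lags are mixed as a lambda-weighted sum of per-lag transition probabilities, which is one common formulation of higher-order chains; the excerpt does not state the exact formula, so treat this as an assumption. The function name, the renormalization for short histories, and every number below are invented for illustration and are not taken from the original analysis.

```python
def next_state_distribution(history, states, lag_matrices, lambdas):
    """Lambda-weighted mixture of per-lag transition probabilities.

    history      -- visited pages, most recent last
    lag_matrices -- lag_matrices[k - 1][prev][nxt] is the probability of
                    `nxt` given that `prev` was visited k steps earlier
    lambdas      -- one weight per lag
    """
    dist = {s: 0.0 for s in states}
    for k, (matrix, lam) in enumerate(zip(lag_matrices, lambdas), start=1):
        if k > len(history):
            break  # history is shorter than this lag
        prev = history[-k]
        for s in states:
            dist[s] += lam * matrix[prev].get(s, 0.0)
    total = sum(dist.values())
    # Assumption: renormalize when only some of the lags were usable.
    return {s: p / total for s, p in dist.items()} if total > 0 else dist


# Hypothetical fitted values, made up for this example.
states = ["home", "product", "cart"]
lag1 = {"home": {"home": 0.1, "product": 0.7, "cart": 0.2},
        "product": {"home": 0.2, "product": 0.3, "cart": 0.5},
        "cart": {"home": 0.5, "product": 0.4, "cart": 0.1}}
lag2 = {"home": {"home": 0.2, "product": 0.5, "cart": 0.3},
        "product": {"home": 0.3, "product": 0.3, "cart": 0.4},
        "cart": {"home": 0.6, "product": 0.3, "cart": 0.1}}
lag3 = {"home": {"home": 0.3, "product": 0.4, "cart": 0.3},
        "product": {"home": 0.4, "product": 0.3, "cart": 0.3},
        "cart": {"home": 0.5, "product": 0.3, "cart": 0.2}}
lambdas = [0.6, 0.3, 0.1]

prediction = next_state_distribution(
    ["home", "product", "cart"], states, [lag1, lag2, lag3], lambdas
)
most_likely = max(prediction, key=prediction.get)
print(most_likely, prediction)
```

The most recent page carries the largest weight here only because the made-up lambdas say so; in a real fit the weights come from the data.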

This analysis tries to predict which page (if any) a user will visit next from a given page. Eleni also uses techniques like k-means clustering to segment users into groups. It is a very interesting analysis.

