David Smith has a post on a new R package to display graphs:

A graph, a collection of nodes connected by edges, is just data. Whether it’s a social network (where nodes are people, and edges are friend relationships), or a decision tree (where nodes are branch criteria or values, and edges decisions), the nature of the graph is easily represented in a data object. It might be represented as a matrix (where rows and columns are nodes, and elements mark whether an edge between them is present) or as a data frame (where each row is an edge, with columns representing the pair of connected nodes).
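
To make the two representations concrete, here is a minimal sketch in plain Python rather than R. It is not taken from the linked post or from any package it describes, and the node names and helper functions are hypothetical. It assumes an undirected graph, stores it once as an edge list (one row per edge) and once as an adjacency matrix, and converts between the two.

```python
# Hypothetical three-person friendship graph (illustrative data only).
nodes = ["ann", "bob", "cat"]
edges = [("ann", "bob"), ("bob", "cat")]  # edge list: one row per edge


def edges_to_matrix(nodes, edges):
    """Build an adjacency matrix: rows and columns are nodes, 1 marks an edge."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        i, j = index[a], index[b]
        matrix[i][j] = 1
        matrix[j][i] = 1  # assumption: edges are undirected, so mirror them
    return matrix


def matrix_to_edges(nodes, matrix):
    """Recover the edge list by scanning the upper triangle of the matrix."""
    found = []
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if matrix[i][j]:
                found.append((nodes[i], nodes[j]))
    return found


adjacency = edges_to_matrix(nodes, edges)
for row in adjacency:
    print(row)
```

The matrix form makes "is there an edge between these two nodes?" a single lookup, while the edge list stays compact when most pairs of nodes are not connected.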

The trick comes in how you represent a graph visually; there are many different options, each with strengths and weaknesses when it comes to interpretation. A graph with many nodes and edges may become an unintelligible hairball without careful arrangement, and including directionality or other attributes of edges or nodes can reveal insights about the data that wouldn’t be apparent otherwise. There are many R packages for creating and displaying graphs (igraph is a popular one, and this CRAN task view lists many others), but that’s a problem in its own right: an important part of the data exploration process is trying and comparing different visualization options, and the myriad packages and interfaces make that process difficult for graph data.

Click through for more information as well as a mesmerizing animated image.

February 2017