Data Visualization For Social Science

I’ve started reading Kieran Healy’s book, Data Visualization For Social Science.  He has a free draft available online, and it automatically builds nightly so you’re seeing the latest version.  From the preface:

This book is a hands-on introduction to the principles and practice of looking at and presenting data using R and ggplot. R is a powerful, widely used, and freely available programming language for data analysis. You may be interested in exploring ggplot after having used R before, or be entirely new to both R and ggplot and just want to graph your data. I do not assume you have any prior knowledge of R.

After installing the software we need, we begin with an overview of some basic principles of visualization. We focus not just on the aesthetic aspects of good plots, but on how their effectiveness is rooted in the way we perceive properties like length, absolute and relative size, orientation, shape, and color. We then learn how to produce and refine plots using ggplot2, a powerful, versatile, and widely-used visualization library for R (Wickham 2016a). The ggplot2 library implements a “grammar of graphics” (Wilkinson 2005). This approach gives us a coherent way to produce visualizations by expressing relationships between the attributes of data and their graphical representation.

Through a series of worked examples, you will learn how to build plots piece by piece, beginning with scatterplots and summaries of single variables, then moving on to more complex graphics. Topics covered include plotting continuous and categorical variables, layering information on graphics; faceting grouped data to produce effective “small multiple” plots; transforming data to easily produce visual summaries on the graph such as trend lines, linear fits, error ranges, and boxplots; creating maps, and also some alternatives to maps worth considering when presenting country- or state-level data. We will also cover cases where we are not working directly with a dataset, but rather with estimates from a statistical model. From there, we will explore the process of refining plots to accomplish common tasks such as highlighting key features of the data, labeling particular items of interest, annotating plots, and changing their overall appearance. Finally we will examine some strategies for presenting graphical results in different formats, and to different sorts of audiences.

I’m less than halfway through the book so far, but it is quite an approachable look at the ggplot2 library with a bit of discussion on what makes for quality graphics.

Related Posts

Microsoft R Open 3.5.1

David Smith announces Microsoft R Open 3.5.1: Microsoft R Open 3.5.1 has been released, combining the latest R language engine with multi-processor performance and tools for managing R packages reproducibly. You can download Microsoft R Open 3.5.1 for Windows, Mac and Linux from MRAN now. Microsoft R Open is 100% compatible with all R scripts and packages, and works with […]

Read More

Performing Linear Regression With Power BI

Jason Cantrell shows how to create a simple linear regression in Power BI: Linear Regression is a very useful statistical tool that helps us understand the relationship between variables and the effects they have on each other. It can be used across many industries in a variety of ways – from spurring value to gaining […]

Read More

Categories

December 2017
MTWTFSS
« Nov Jan »
 123
45678910
11121314151617
18192021222324
25262728293031