Category: R

Adding Mean to Box Plots in R

Steven Sanderson tracks the sixth number of a five-number summary:

Data visualization is a powerful tool for understanding and interpreting data. In this blog post, we will explore how to create box plots with mean values using both base R and ggplot2. We will use the famous iris dataset as an example. So, grab your coding tools and let’s dive into the world of box plots!

Note that these visuals show the mean in addition to the median, not in place of it.
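
As a rough idea of what the base R half of this looks like, here is a minimal sketch using the built-in iris dataset. It is not taken from the linked post; the choice of Sepal.Length, the marker style, and the name group_means are my own assumptions.

```r
# Box plot of sepal length by species, from the built-in iris dataset
boxplot(Sepal.Length ~ Species, data = iris,
        main = "Sepal length by species", ylab = "Sepal length (cm)")

# Per-group means, in the same order as the boxes (factor level order)
group_means <- tapply(iris$Sepal.Length, iris$Species, mean)

# Overlay the means as red diamonds; the median line inside each box is untouched
points(seq_along(group_means), group_means, pch = 18, col = "red", cex = 1.5)
```

Because points() draws on top of the existing plot, each box keeps its median line and gains a separate marker for the mean.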

Lists and DataFrames in R

Adrian Tam continues a series on core data types in R:

Vectors in R are supposed to be of homogeneous data type. You can use a list as the container if there are mixed data types, such as numbers and strings. The list and data frame are closely related in R. The data frame is probably more useful because it reflects how we usually collect statistics. In this post, you will learn about them. Specifically, you will know:

  • What are lists and data frames in R
  • How to manipulate lists and data frames

Read on to learn more about these two sorts of collections.
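
To make the relationship concrete, here is a small sketch of my own (the names and values are made up for illustration, not drawn from the linked post). It shows a list holding mixed types and a data frame behaving as a list of equal-length columns.

```r
# A list can hold elements of different types and lengths
person <- list(name = "Ada", age = 36, langs = c("R", "SQL"))
person$age             # access an element by name
person[["langs"]][2]   # second entry of the "langs" element

# A data frame is a list whose elements are equal-length columns
df <- data.frame(name = c("Ada", "Bob"), age = c(36, 41))
is.list(df)                       # TRUE
df$age_next_year <- df$age + 1    # add a column with vectorized arithmetic
older <- df[df$age > 40, ]        # keep only the rows matching a condition
```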

Creating a Box Plot in R

Steven Sanderson builds up a box plot:

Are you ready to dive into the world of data visualization in R? One powerful tool at your disposal is the box plot, also known as a box-and-whisker plot. This versatile chart can help you understand the distribution of your data and identify potential outliers. In this blog post, we’ll walk you through the process of creating box plots using R’s ggplot2 package, using the airquality dataset as an example. Whether you’re a beginner or an experienced R programmer, you’ll find something valuable here.

Click through to learn what kind of information a box plot can provide, as well as how to create one using a variety of R libraries.
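
For a sense of the numbers behind the picture, base R's boxplot.stats() returns the values a box plot draws. This is a sketch of my own using the Ozone column of airquality (an assumption; the linked post may use a different column and uses ggplot2 for the plotting itself).

```r
# Ozone readings from the built-in airquality dataset (contains NAs)
ozone <- airquality$Ozone

# boxplot.stats() drops NAs and returns the numbers a box plot is drawn from
stats <- boxplot.stats(ozone)

stats$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
stats$n      # number of non-missing observations
stats$out    # points beyond the whiskers, drawn as potential outliers
```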

Data Pre-Processing in R

Amieroh Abrahams cleans up some data:

As data scientists, we often find ourselves immersed in a vast sea of data, trying to extract valuable insights and hidden patterns. However, before we embark on the journey of data analysis and modeling, we must first navigate the crucial steps of data cleaning and preprocessing. In this blog post, we will explore the significance of data cleaning and preprocessing in data science workflows and provide practical tips and techniques to handle missing data, outliers, and data inconsistencies effectively.

Read on for several tactics which can help you clean up your data.
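
Here is a minimal base R sketch of the three problems named above, on a toy data frame I made up for illustration. The 1.5 × IQR rule and median imputation are common defaults and my own assumptions here, not necessarily the tactics the linked post recommends.

```r
# Hypothetical toy data: one missing value, one implausible reading,
# and inconsistent spelling of the same city
df <- data.frame(
  city = c("Paris", "paris ", "Lyon", "Lyon", "Paris"),
  temp = c(21.5, NA, 19.0, 250, 22.1)
)

# Inconsistencies: trim stray whitespace and normalize case
df$city <- tolower(trimws(df$city))

# Outliers: flag values more than 1.5 * IQR beyond the quartiles
q <- quantile(df$temp, c(0.25, 0.75), na.rm = TRUE)
fence <- 1.5 * diff(q)
is_outlier <- !is.na(df$temp) &
  (df$temp < q[1] - fence | df$temp > q[2] + fence)
df$temp[is_outlier] <- NA   # treat flagged readings as missing

# Missing data: impute with the median of the remaining values
df$temp[is.na(df$temp)] <- median(df$temp, na.rm = TRUE)
```

Whether to drop, cap, or impute a flagged value depends on the data; blanking and imputing is just one option.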

A Primer on Vectors in R

Adrian Tam shows off one of the building blocks for R:

R is a language for programming with data. Unlike many other languages, the primitive data types in R are not scalars but vectors. Therefore, understanding how to deal with vectors is crucial to writing or reading R code. In this post, you will learn about various vector operations in R. Specifically, you will know:

  • What are the fundamental data objects in R
  • How to work with vectors in R

This is often a little tricky for newcomers to the language to pick up, though if you’re already familiar with set-based operations in SQL, vector-based operations are fairly straightforward.
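
A few lines of my own (not from the linked post) show the set-based flavor: operations apply to every element at once, and filtering is a predicate rather than a loop.

```r
x <- c(1, 2, 3, 4)

doubled  <- x * 2          # element-wise arithmetic: 2 4 6 8
recycled <- x + c(10, 20)  # the shorter vector is recycled: 11 22 13 24
big      <- x[x > 2]       # logical indexing, much like a WHERE clause: 3 4

length(5)                  # 1: even a "scalar" is a length-one vector
```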

Creating Curves in R

Steven Sanderson draws a curve:

In the vast world of R programming, there are numerous functions that provide powerful capabilities for data visualization and analysis. One such function that often goes underappreciated is the curve() function. This neat little function allows us to plot mathematical functions and explore their behavior. In this blog post, we will dive into the syntax of the curve() function, provide a couple of examples to demonstrate its usage, and encourage readers to try it on their own.

Click through for several examples.
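
As a quick taste, here is a sketch of my own; the functions plotted are arbitrary choices, not the examples from the linked post. curve() accepts either a function name or an expression in x.

```r
# Plot a named function over an interval
curve(sin, from = 0, to = 2 * pi, ylab = "y")

# Overlay an expression written in terms of x on the same axes
curve(x^2 / 10, add = TRUE, col = "blue", lty = 2)

# curve() also returns the evaluated points invisibly
pts <- curve(x^2, from = 0, to = 2, n = 5)
pts$y
```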

Solving Systems of Equations in R

Steven Sanderson needs a solution:

In mathematical modeling and data analysis, it is often necessary to solve systems of equations to find the values of unknown variables. R provides the solve() function, which is a powerful tool for solving systems of linear equations. In this blog post, we will explore the purpose of solving systems of equations, explain the syntax of the solve() function, and provide three examples of increasing complexity to demonstrate its usage.

This post got me thinking about linear programming, which is a different topic but still pretty easy to do in R.
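
For reference, the basic shape of a solve() call looks something like the sketch below. The two-equation system is one I made up for illustration rather than one of the post's three examples.

```r
# Solve the system
#   2x + y = 5
#    x - y = 1
A <- matrix(c(2,  1,
              1, -1), nrow = 2, byrow = TRUE)  # coefficient matrix
b <- c(5, 1)                                    # right-hand side

sol <- solve(A, b)   # solves A %*% sol = b
sol
```

Passing b directly to solve(A, b) avoids computing the inverse of A explicitly.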

Counting Groups in R

Steven Sanderson counts items in a group:

As data-driven decision-making becomes more critical in various fields, the ability to extract valuable insights from datasets has never been more important. One common task is to calculate counts by group, which can shed light on trends and patterns within your data. In this guide, we’ll explore three different approaches to achieve this using the powerful R programming language. So, let’s dive into the world of grouped counting with the help of the classic mtcars dataset!

Read on for the base R solution, the dplyr solution (which looks a lot like how we’d solve it in SQL), and the data.table solution.
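
A base R version can be as short as the sketch below. Grouping by the cyl column is my own assumption, and the dplyr and data.table variants are left to the linked post since they need extra packages.

```r
# Count cars per number of cylinders with table()
counts <- table(mtcars$cyl)
counts

# The same counts as a data frame, closer to a SQL GROUP BY result
counts_df <- as.data.frame(counts)
names(counts_df) <- c("cyl", "n")
counts_df
```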

Goldbach’s Conjecture and the Sieve of Sundaram in R

Tomaz Kastrun promised us there would be no math on the quiz and yet here we are:

This is fun. It is also O(MAX) complexity. But first, some background. Since the problem is very old, we do not intend to solve it, merely to play with it. In number theory, Goldbach’s conjecture states that every even integer greater than 2 can be expressed as the sum of two prime numbers. There are also results that fall far short of the conjecture. For example, Schnirelman (1939) proved that every even number can be written as the sum of not more than 300,000 primes.

Read on for the functions and trials of Goldbach’s conjecture.
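
To show how the two pieces fit together, here is a minimal sketch of my own; it is not the code from the linked post, and the function names sundaram and goldbach_pairs are hypothetical. The sieve marks every index of the form i + j + 2ij, and each unmarked index m corresponds to the odd prime 2m + 1.

```r
# Sieve of Sundaram: all primes <= n
sundaram <- function(n) {
  if (n < 2) return(integer(0))
  k <- (n - 1) %/% 2            # indices m = 1..k stand for odd numbers 2m + 1
  if (k < 1) return(2L)
  marked <- logical(k)
  i <- 1
  while (i + i + 2 * i * i <= k) {
    j <- i
    while (i + j + 2 * i * j <= k) {
      marked[i + j + 2 * i * j] <- TRUE   # 2m + 1 is composite for this m
      j <- j + 1
    }
    i <- i + 1
  }
  c(2L, 2L * which(!marked) + 1L)         # 2 plus the surviving odd numbers
}

# All ways to write an even n > 2 as p + q with primes p <= q
goldbach_pairs <- function(n) {
  stopifnot(n > 2, n %% 2 == 0)
  primes <- sundaram(n)
  p <- primes[primes <= n / 2]
  q <- n - p
  keep <- q %in% primes
  cbind(p = p[keep], q = q[keep])
}

sundaram(30)
goldbach_pairs(28)
```

Finding pairs for any particular n only checks the conjecture for that number, of course; it proves nothing in general.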
