Press "Enter" to skip to content

Category: R

Data Pre-Processing in R

Amieroh Abrahams cleans up some data:

As data scientists, we often find ourselves immersed in a vast sea of data, trying to extract valuable insights and hidden patterns. However, before we embark on the journey of data analysis and modeling, we must first navigate the crucial steps of data cleaning and preprocessing. In this blog post, we will explore the significance of data cleaning and preprocessing in data science workflows and provide practical tips and techniques to handle missing data, outliers, and data inconsistencies effectively.

Read on for several tactics which can help you clean up your data.

Comments closed

A Primer on Vectors in R

Adrian Tam shows off one of the building blocks for R:

R is a language for programming with data. Unlike many other languages, the primitive data types in R are not scalars but vectors. Therefore, understanding how to deal with vectors is crucial to programming or reading the R code. In this post, you will learn about various vector operations in R. Specifically, you will know:

  • What are the fundamental data objects in R
  • How to work with vectors in R

This is often a little tricky for newcomers to the language to pick up, though if you’re already familiar with set-based operations in SQL, vector-based operations are fairly straightforward.

Comments closed

Creating Curves in R

Steven Sanderson draws a curve:

In the vast world of R programming, there are numerous functions that provide powerful capabilities for data visualization and analysis. One such function that often goes under appreciated is the curve() function. This neat little function allows us to plot mathematical functions and explore their behavior. In this blog post, we will dive into the syntax of the curve() function, provide a couple of examples to demonstrate its usage, and encourage readers to try it on their own.

Click through for several examples.

Comments closed

Solving Systems of Equations in R

Steven Sanderson needs a solution:

In mathematical modeling and data analysis, it is often necessary to solve systems of equations to find the values of unknown variables. R provides the solve() function, which is a powerful tool for solving systems of linear equations. In this blog post, we will explore the purpose of solving systems of equations, explain the syntax of the solve() function, and provide three examples of increasing complexity to demonstrate its usage.

This post got me thinking about linear programming, which is a different topic but still pretty easy to do in R.

Comments closed

Goldbach’s Conjecture and the Sieve of Sundaram in R

Tomaz Kastrun promised us there would be no math on the quiz and yet here we are:

This is fun It is also O(MAX) complexity. But first some background. Since the problem is super old, we are not intending to solve it, merely to play with it. In the number theory of mathematics, the Goldbach’s conjecture states that for every even integer (greater than 2) can be expressed with the sum of two prime numbers. There are also far cries from this theory. For example, prove that every even number can be written as the sum of not more than 300.000 primes (by Schnirelman (1939)).

Read on for the functions and trials of Goldbach’s conjecture.

Comments closed

Counting Groups in R

Steven Sanderson counts items in a group:

As data-driven decision-making becomes more critical in various fields, the ability to extract valuable insights from datasets has never been more important. One common task is to calculate counts by group, which can shed light on trends and patterns within your data. In this guide, we’ll explore three different approaches to achieve this using the powerful R programming language. So, let’s dive into the world of grouped counting with the help of the classic mtcars dataset!

Read on for the base R solution, the dplyr solution (which looks a lot like how we’d solve it in SQL), and the data.table solution.

Comments closed

Managing Plot Parameters in R

Steven Sanderson switches up a visual:

When it comes to data visualization in R, the par() function is an indispensable tool that often goes overlooked. This function allows you to control various graphical parameters, unleashing a world of customization possibilities for your plots. In this blog post, we’ll demystify the par() function, break down its syntax, and provide you with hands-on examples to help you create stunning visualizations.

Click through to check it out. My loyalties definitely lie with ggplot2 for static visual development in R but it’s definitely not the only way to get images to look the way you want them.

Comments closed

Adding Text to a Plot in R

Steven Sanderson texts up a plot:

As a programmer, you’re well aware of the importance of data visualization. A well-crafted plot can convey complex information with clarity and impact. In R, creating stunning plots is a breeze, especially when you’re armed with the versatile text() function. This little gem allows you to add custom text to your plots, enabling you to annotate and highlight essential details. Let’s dive into the world of text() and uncover its syntax and potential through some hands-on examples.

I’m also a big fan of geom_text_repel() in ggplot2’s ggrepel library. It is by no means perfect but it does do a good job of not overlapping important visual features like plotted lines.

Comments closed

Learning about Data in R with str()

Steven Sanderson explains the value of the str() function:

In a nutshell, str() stands for “structure” and offers a concise summary of the structure of an R object. It presents essential details about the object, including its data type, dimensions, and the first few values. By providing an overview of your data, str() allows you to grasp the fundamentals at a glance and proceed with a clearer understanding of what you’re working with.

str() is a really useful function and people who develop objects in R thoughtfully can pack a lot of useful data into the one call.

Comments closed