This post introduces how to use the R language for data analysis. I see someone else has already made a post about it, so I will try my best to make my post as different from his as possible. To start with, R is “the” programming language for statistical work: it can analyze data and create elegant diagrams. For statistical problems, you typically write an “R Markdown” file, or Rmd file for short, and knit it into a final PDF file with the help of LaTeX. This brief tutorial is meant to teach the basics of R programming and data analysis.

Step 1

Open the maize workbench provided by Carleton College at https://maize.mathcs.carleton.edu/. Since it is unnecessarily difficult to set up R Markdown and LaTeX by yourself, the best way is to use the school’s workbench. Keep in mind that it is only accessible at school, so you would either need to be on campus or use a VPN to access it.

Step 2

Log in with your school account. Create your own folder and create a new file. These are a few files I created: “rmd” indicates R Markdown files, and “pdf” indicates, well, just PDFs.

Step 3

Read the CSV dataset and view it using the code shown below. Click the little green triangle to run the code in the current section. The button to the left of it runs all the code prior to this section.
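Here is a minimal sketch of what reading and viewing a CSV can look like. The file, the column names, and the `exam_data` variable are all made up for illustration (the sketch writes its own tiny CSV to a temporary file so it runs on its own); with a real dataset you would pass your own file path to `read.csv()`.

```r
# Hypothetical data written to a temporary CSV so this sketch runs on its own.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("student,hours_studied,exam_score",
    "A,2,65",
    "B,4,78",
    "C,6,88",
    "D,8,93"),
  csv_path
)

# read.csv() loads the file into a data frame.
exam_data <- read.csv(csv_path)

head(exam_data)    # show the first rows
str(exam_data)     # show column names and types
# View(exam_data)  # spreadsheet-style viewer; only works in an interactive session
```

In an R Markdown file this code would sit inside a code chunk, which is what the green triangle runs.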

Step 4

You can play with the data in the following ways. It is easy to find the mean or standard deviation, or even create a table.
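As a rough sketch of these three operations, the example below uses a small made-up data frame; `scores`, `group`, and `score` are hypothetical names, not from any real dataset. `mean()`, `sd()`, and `table()` are all part of base R.

```r
# Hypothetical data frame, for illustration only.
scores <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  score = c(10, 12, 9, 15, 14)
)

# The $ operator picks one column out of the data frame.
mean_score <- mean(scores$score)   # average of the score column
sd_score   <- sd(scores$score)     # sample standard deviation

# table() counts how many rows fall into each category.
group_counts <- table(scores$group)

mean_score
sd_score
group_counts
```

If a column contains missing values, `mean()` and `sd()` return `NA` unless you add `na.rm = TRUE`.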

Step 5

Create a diagram using the provided ggplot2 package. There are multiple ways you can do it.
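The sketch below shows two of those ways on a made-up data frame (the `scores` data and the plot variable names are hypothetical). It assumes ggplot2 is already installed, as it is on the workbench. The general pattern is `ggplot(data, aes(...))` to say which columns go on which axes, plus a `geom_*()` layer to say what kind of diagram to draw.

```r
library(ggplot2)

# Hypothetical data frame, for illustration only.
scores <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  score = c(10, 12, 9, 15, 14)
)

# A histogram of one numeric column.
p_hist <- ggplot(scores, aes(x = score)) +
  geom_histogram(binwidth = 2)

# A boxplot comparing the numeric column across groups.
p_box <- ggplot(scores, aes(x = group, y = score)) +
  geom_boxplot()

# Printing a plot object draws it.
print(p_hist)
print(p_box)
```

The `+` has to come at the end of a line, not the start of the next one, which is one of the formatting details that tends to trip people up.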

Step 6

Create a PDF file of what you have now. Click the “Knit” button at the top.
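To give a rough idea of what gets knitted, this sketch builds a very small R Markdown file from R itself. The file name, title, and contents are made up for illustration. The part between the `---` lines is the header that asks for PDF output, and the fenced `{r}` section is a code chunk. The final `rmarkdown::render()` call is left commented out because it needs the rmarkdown package and a LaTeX installation; on the workbench, the “Knit” button does this step for you.

```r
# Three backticks mark the start and end of a code chunk in an Rmd file.
fence <- strrep("`", 3)

# Hypothetical minimal R Markdown document.
rmd_lines <- c(
  "---",
  "title: \"My first analysis\"",
  "output: pdf_document",
  "---",
  "",
  "Some text describing the analysis.",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",   # cars is a dataset that ships with R
  fence
)

rmd_path <- file.path(tempdir(), "analysis.Rmd")
writeLines(rmd_lines, rmd_path)

# Knitting from code instead of the button (requires rmarkdown and LaTeX):
# rmarkdown::render(rmd_path)
```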

There is way more you can do with R, for example linear regression and various statistical tests. Here are some links for further interest: the Carleton College official tutorial and a video from MIT. But seriously, the best advice I can give you is simple: take the stats intro course!

Great post! I do plan on taking your advice this coming spring and taking intro! I have some experience with programming and statistics, so I hope I will enjoy it. Anyways, your tutorial did a great job introducing such a complex tool in a simple way, especially how to easily set up R Markdown. I was wondering what the relationship is between R Markdown and regular Markdown files?

This is a great tutorial, especially for Digital Arts & Humanities, since data analysis has a great role in this field and an easily understood introduction can bring great benefits. I would like to know if you have any recommendations as to where I can find some more in-depth instructions for R in general? It seems that R can provide a lot more utility for data analysis than I thought.

I’ve taken the intro statistics course here and I think this tutorial does a pretty good job of introducing what R is and what it can be used for. It’s definitely a useful tool in terms of the data analysis portion of digital humanities projects. The only thing I’ll say about this tutorial is that the step regarding the making of a diagram may be confusing as there are specific codes that have to be formatted a certain way (it’s been a little bit since I’ve used R, so I could be wrong). Other than that though, great tutorial! I also agree that people should take the intro course!

As someone who has used R in the past for a stats class that I took, I think that you did a very good job of explaining the basics of this pretty confusing application. It is not the first thing that I would think of for Digital Humanities, but I like how you brought it into the DH realm. Data visualization and analysis are definitely things that R can accomplish, and those are important things for DH.

Thanks for your post! I appreciated your level of detail; I was able to follow it fairly well despite having no experience in this area. I’m curious to hear more about LaTeX — there’s been some discussion about its use in the formatting of Carleton’s humanities journal (an immediate DH application!), and I’ve been wondering what exactly it even is.

This is very helpful! Since data analysis plays a pivotal role in DH projects, knowing how to use different tools to analyze data more efficiently is very important. I appreciate your detailed explanations and step-by-step instructions. I didn’t know that R and LaTeX have this much utility even though I have been using R for my statistics class.