Data cleaning for mapping, Google Sheets style

Google Sheets is a tool that many of us are familiar with, but I’d like to shed some light on some useful functions for data cleaning, specifically when preparing to make a map. Some of these functions, like translation and geocoding plugins, are specific to Google Sheets, and make it a valuable platform for cleaning data. There are benefits to using Google Sheets rather than OpenRefine or other software; the document can be collaborative and is stored in the cloud, it supports plugins such as Geocode by AwesomeTable and Google Translate and it’s a slightly more user friendly platform- you don’t have to know how to use regular expressions to use it. I found it to be useful during my midterm project, and In this tutorial, I’ll be sharing several applications of Google Sheets that I discovered while trying to clean my data for mapping people, which is applicable to ArcGIS, Flourish, or other software that allows you to plot points on a map. For the sake of the tutorial, I’m using a dataset similar to the Tate artists dataset that I used for my midterm, which has the artist name, year of birth, and place of birth.

I created a (somewhat lengthy) tutorial screencast encompassing several aspects of data cleaning for mapping, using the dataset of Tate Collection artists born 1930-1949.

The general outline of my screencast is as follows, for those who are less inclined to watch the whole video. I’ve included links to other resources for using each specific feature mentioned.

  1. Paring down the data to exclude artists without a recorded place of birth (you wouldn’t be able to put them on a map), using filters.
    • Select column, data>create a filter>filter by condition>is empty
    • Delete rows that appear with this filter, data>remove filter to return to original table
  2. I forgot to mention this in the screencast, but there are a few built-in data cleaning options on Google Sheets. Data>data cleanup lets you choose to remove duplicate entries and trim whitespace.
  3. Searching for special characters in the name section so that I could reformat them, using filters.
    • data>create a filter>filter by condition>text contains>[insert special character to search for] and reformat accordingly
  4. Reformatting names to correct capitalization, using the PROPER function.
  5. Transforming the name column from Lastname, Firstname to Firstname Lastname
    • =SPLIT(A1, “,”)
      • it splits at the comma between the names, since I entered “,” as the delimiter
    • =CONCATENATE(A2,” “,A1)
      • puts the cells together as Firstname Lastname, separated by a space.
  6. Translating places of birth to English
    • =GOOGLETRANSLATE(A1,”auto”,”en”)
      • “auto” signifies that I want it to detect the original language, “en” stands for English, the target language
  7. Geocoding the places of birth to get latitude and longitude using Geocode by AwesomeTable, a Sheets extension.
    • Extensions>geocode by awesometable(link to add extension)>start geocoding>address column: place of birth>geocode!
      • Produces a list of latitude and longitude values for the places given

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

css.php