Source Documentation: Carletonian Corpus

Since our project centers on corpus analysis of the Carletonian, we are currently treating the college’s digital Carletonian archive as our only source. We’re unsure of the best citation practice for an entire collection, but the citation format for an individual object in an archive is something like this:

Author (if applicable) / Title of the item (Rule) / date of the item (Rule) / item number (if applicable) / series title (if applicable) / series number (if applicable) / name of the collection (if applicable) / collection number (if applicable) / name of the depository or archive / location of depository (if applicable) / URL or DOI or name of database (if applicable)
From the Gould Guide for citing archival material in Chicago format
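As a rough sketch of how we might assemble citations consistently for many issues, the element order above can be turned into a small helper. The field names, the comma separator, and the choice to treat every element not marked "if applicable" as required are our own assumptions for illustration, not something the Gould Guide specifies, so the output would still need to be checked against the guide's punctuation rules.

```python
# Element order follows the list above; the key names are hypothetical labels of our own.
FIELD_ORDER = [
    "author", "title", "date", "item_number", "series_title", "series_number",
    "collection_name", "collection_number", "archive", "location", "url",
]

# Assumption: elements not marked "if applicable" in the list are required.
REQUIRED = ("title", "date", "archive")


def build_citation(fields, separator=", "):
    """Join the supplied citation elements in order, skipping any that are absent."""
    for name in REQUIRED:
        if not fields.get(name):
            raise ValueError(f"missing required element: {name}")
    parts = [str(fields[name]) for name in FIELD_ORDER if fields.get(name)]
    return separator.join(parts) + "."


# Placeholder values only; this is not a real archive record.
example = build_citation({
    "title": "Example item",
    "date": "1 January 1900",
    "archive": "Carleton College Archives",
})
print(example)
```

Optional elements such as the author or URL are simply left out of the dictionary when they don't apply.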

Format

We’ve been accessing the Carletonian issues so far from the digital archives, where they’re stored as searchable PDFs. We’re currently downloading the front page of each issue, since we can definitely use that format for the timelapse we want to create. We’re not yet sure what process we’ll need to go through to get XML files from the images (Angie is in touch with Sarah Calhoun about it), but we think we can either get existing files from the archives or use a tool like Docparser to convert them ourselves.

Rights

The Carletonian archives are part of the Carleton College Archives and are owned by the college. Since we’re student researchers, we should be allowed to use the material housed there: according to the website, “Students, faculty, and other researchers are welcome to use the resources housed in the Carleton College Archives.” If we end up using the print archives, we’ll need to complete a registration.

Privacy/Ethics

Students, alumni, faculty, staff, community members, and even non-community members have appeared in the Carletonian in one way or another over the years. Students and faculty are featured more than anybody else (quoted in articles, as the subjects of articles, or as contributors to the paper), but most of those people are no longer here. Regardless, it’s primarily the newspaper’s responsibility to deal with privacy concerns: contributors know that what they put in the paper will become publicly accessible, and editors navigate the ethics of publishing features, opinion pieces, etc. (sometimes controversially). That said, we can also exercise sensitivity in our handling of the data. We’re looking to perform a general analysis of words, not of individuals, and individual people probably will not emerge in our discussion of results. We want to look at campus trends in discourse.

We may be analyzing trends in words that deal with delicate issues such as race, gender, and politics. We’ve already noticed that some of the earlier issues of the paper use slurs and offensive language. Since our project is directly concerned with language usage over time, we don’t want to ignore those words, but we don’t want to perpetuate their usage under the guise of ‘research’ either. If they enter into our project, we’ll censor them, and we won’t make them a central focus or spotlight them just for the sake of demonstrating their presence. Thoughtful discussion of whatever results we get will also be important in navigating this concern.
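One way we could censor flagged words while still counting them is sketched below. This is only an illustration under our own assumptions: the function names, the simple tokenizer, and the masking style (first letter plus asterisks) are hypothetical choices, and the flagged list would have to be curated by hand.

```python
import re
from collections import Counter


def mask(word):
    """Keep the first letter and replace the rest with asterisks."""
    return word[0] + "*" * (len(word) - 1)


def masked_word_counts(text, flagged):
    """Count words in text, masking any word that appears in the flagged set."""
    flagged = {w.lower() for w in flagged}
    # Simple tokenizer (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for w in words:
        counts[mask(w) if w in flagged else w] += 1
    return counts


# "badword" is a harmless stand-in for a term on a hand-curated flagged list.
counts = masked_word_counts("The cat saw the badword", {"badword"})
print(counts)
```

Masking at the counting stage would keep the words in the frequency data without printing them in full in any chart or table built from it.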

Tags: archives, blog, carletonian corpus, corpus analysis, Digital Humanities, source documentation
