Source Documentation: Carletonian Corpus

Since our project centers on corpus analysis of the Carletonian, we are currently treating the college’s digital Carletonian archive as our only source. We’re unsure of the best citation practice for an entire collection, but the citation format for an object in an archive is something like this:

Author (if applicable) / Title of the item (Rule) / date of the item (Rule) / item number (if applicable) / series title (if applicable) / series number (if applicable) / name of the collection (if applicable) / collection number (if applicable) / name of the depository or archive / location of depository (if applicable) / URL or DOI or name of database (if applicable)

From the Gould Guide for citing archival material in Chicago format
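
If we end up generating citations for many issues, the field order above could be captured in a small helper. This is only a sketch under our own assumptions: the class name, the field names, and the comma separator are hypothetical (the guide's exact punctuation isn't reproduced here), and the example values are placeholders rather than a real item.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArchiveCitation:
    """Fields of an archival citation, in the order listed above.

    Title, date, and depository are treated as always present;
    every "(if applicable)" field is optional.
    """
    title: str
    date: str
    depository: str
    author: Optional[str] = None
    item_number: Optional[str] = None
    series_title: Optional[str] = None
    series_number: Optional[str] = None
    collection_name: Optional[str] = None
    collection_number: Optional[str] = None
    location: Optional[str] = None
    url: Optional[str] = None

    def format(self, separator: str = ", ") -> str:
        # The separator is an assumption; swap in whatever the guide requires.
        ordered = [
            self.author, self.title, self.date, self.item_number,
            self.series_title, self.series_number, self.collection_name,
            self.collection_number, self.depository, self.location, self.url,
        ]
        # Skip any optional field that was not supplied.
        return separator.join(part for part in ordered if part)


# Placeholder values, not a real item from the archive.
example = ArchiveCitation(
    title="Example front page",
    date="Example date",
    depository="Carleton College Archives",
)
print(example.format())
```

Keeping the optional fields as `None` by default means a citation with only the always-present fields still formats cleanly.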

Format

So far we’ve been accessing the Carletonian issues through the digital archives, where they’re stored as searchable PDFs. We’re currently downloading the front page of each issue, since we can definitely use the PDF format for the timelapse we want to create. We’re not yet sure how we’ll get XML files from the page images (Angie is in touch with Sarah Calhoun about it), but we think we can either get existing files from the archives or use a tool like Docparser to convert them ourselves.

Rights

The Carletonian archives are part of the Carleton College Archives and are owned by the college. Since we’re student researchers, we should be allowed to use the material housed there; according to the website, “Students, faculty, and other researchers are welcome to use the resources housed in the Carleton College Archives.” If we end up using the print archives, we’ll need to complete a registration.

Privacy/Ethics

Students, alumni, faculty, staff, community members, and even non-community members have appeared in the Carletonian in one way or another over the years. Students and faculty are featured more than anybody else (quoted in articles, as the subjects of articles, or as contributors to the paper), but most of those people are no longer here. Regardless, it’s primarily the newspaper’s responsibility to deal with privacy concerns: contributors know that what they put in the paper will become publicly accessible, and editors navigate the ethics of publishing features, opinion pieces, etc. (sometimes controversially). That said, we can also exercise sensitivity in our handling of the data. We’re looking to perform a general analysis of words, not of individuals, and individual people probably will not emerge in our discussion of results. We want to look at campus trends in discourse.

We may be analyzing trends in words that deal with delicate issues: race, gender, politics, etc. We’ve already noticed that some of the earlier issues of the paper use slurs and offensive language. Since our project is directly concerned with language usage over time, we don’t want to ignore those words, but we don’t want to perpetuate their usage under the guise of ‘research’ either. If they enter into our project, we’ll censor them, and we won’t make them a central focus or spotlight them just for the sake of demonstrating their presence. Thoughtful discussion of whatever results we get will also be important in navigating this concern.
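
One way the censoring could work in practice is to keep counting offensive terms but mask them before they appear in any output. The sketch below is an assumption about how we might do it, not a settled method: the function names, the tokenizing pattern, and the first-letter-plus-asterisks masking are all hypothetical, and the masked-term list would be something we curate ourselves.

```python
import re
from collections import Counter


def mask(word: str) -> str:
    """Keep the first letter and replace the rest with asterisks."""
    if len(word) <= 1:
        return "*"
    return word[0] + "*" * (len(word) - 1)


def count_words(text: str, masked_terms: set[str]) -> Counter:
    """Count lowercase word frequencies, masking any term in masked_terms."""
    # Simple tokenizer (an assumption): runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts: Counter = Counter()
    for word in words:
        key = mask(word) if word in masked_terms else word
        counts[key] += 1
    return counts


# "badword" is a neutral stand-in for a term on our masked list.
counts = count_words("The badword and the Badword", {"badword"})
print(counts.most_common())
```

Note that two different masked terms with the same first letter and length would be merged under this scheme, so we would need a different masking rule if we wanted to keep them distinct.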
