Diving into Misc archives with Optical Character Recognition

Image courtesy of The Miscellany News.

After a stint of working as a phone-a-thon caller in the Office of Advancement last semester, I discovered a new work study opportunity that better suits my schedule and skillset—or, more specifically, doesn’t include me awkwardly trying to convince alumni to give money to Vassar. (Big shoutout to all of the phone-a-thon workers who are more skilled at that than I was!)

While some of my friends justifiably boast about being paid to play badminton or simply watch basketball and volleyball games for their work study, no job can compare to mine: Digital Initiatives Assistant. My sole responsibility can be described in two words: text correcting. Essentially, I look at old, archived newspapers and magazines and compare images of the original text to a computer-generated copy of that text. The process by which software reads an image, for example one page of The Miscellany News, and converts it into a text file is referred to as Optical Character Recognition (OCR). The software is designed to recognize the shapes of letters and symbols and copy them into a digital text file, which makes old, pre-internet resources easily searchable online. Most of the time, the program is fairly accurate, and my edits are limited to correcting miscapitalized letters or adding missed punctuation. However, when, say, the font is slightly unusual or a symbol goes unrecognized, the transcription can be completely incoherent. For example, a simple dotted line was once copied as “I!AI.QEXJJLiiiiMiTED^B.” At first glance, this may seem like the most tedious job possible, but stick with me here.

The first major pro of this position is that I am able to set my own hours. The second is that it’s a remote job, and while the novelty of remote work may have worn off for most, I am happy to be able to work anywhere from the Old Bookstore to the sunny Nircle to my own cozy bed. And, finally, the third major pro is that the work is surprisingly interesting.

So far, I have only edited Vassar Miscellany News issues. The issues available for editing in the library’s digital archive range from 1914 to 2013. I started out with the oldest articles to take a look at Vassar’s earlier history. And while it was interesting to read about rules for visiting male guests, to learn that Founder’s Day originally consisted of lectures and speeches, and to spot the names of potential nepo-babies of the past, I was particularly excited to check out more recent issues. I skipped forward nine decades and started looking at issues from 2004. The articles were nearly six times as long as the articles from 1914 and fell into many of the same sections as the current Miscellany: News, Features, Opinions, Arts and Sports.

The most recent issue I have edited, Volume CXXXIX, Number 1, published on Sept. 10, 2004, has been one of my favorites. Some notable mentions from the issue include an article on changing fashion trends in which the author, William Chang, wrote: “If you…wear leggings, velvet, tweed or corduroy blazers, vintage t-shirts, low-rise medium flared jeans, or large, gaudy, but not necessarily tacky, sunglasses, you might be a hipster.” I wholeheartedly second this statement and believe that at least half of Vassar’s population can still be categorized as hipsters. Another noteworthy mention is the discovery that condoms and other safe sex products have been available to students on the doors of Student Fellows for over two decades, though I wonder whether they were actually kept stocked back then. The Sports section, surprisingly, took up four whole pages, one of which carried an article I took particular offense to as a Brooklyn Nets fan. Gabe Mosca, guest writer for the “Out of Bounds” column (a column focused on important happenings in the greater sports world outside of Vassar), disparaged former New Jersey Nets player and current Brooklyn Nets announcer and analyst Richard Jefferson, saying: “Richard Jefferson is perhaps the worst outside shooter I have ever seen in my life. In high school, we had a cross-eyed kid who was 70 lbs overweight with a self-done tattoo on his arm (a girl’s name no less), and he was a better shooter than Richard Jefferson. Nice shooting, Dick.” And, finally, I saved the best mention for last: a short list of news briefs that could be pulled directly from any of today’s issues of The Miscellany News, including headlines such as “Beer keg found in Main,” “Bike seat stolen outside Main,” “Security responds to crowd at Town Houses” and “Students found on Chicago Hall roof.”

All of the old issues contain fun little looks into Vassar’s past that are at times shocking and at times right on target for a Vassar publication. If this sounds at all interesting to you (and why wouldn’t it?), know that anyone can look at and even edit archived Vassar publications at newspaperarchives.vassar.edu.
