Digital Humanities Grad Student Roundtable

This week at Rice, as part of our ongoing Digital History Masterclass, we did a Google Hangout with Cameron Blevins, Jeri Wieringa, and Annie Swofford, to discuss some of the limits and possibilities of doing digital humanities work as a graduate student, and where we see digital humanities going in the future. Unfortunately, I wasn’t able to make it to class or check in to the hangout because I was attending my grandmother’s surprise 90th birthday party/family reunion (I swear to god that’s a real excuse), but Caleb thankfully has made the video of the hangout available here.

The conversation as a whole was really interesting, and it’s always nice to get perspectives from different people, and especially fellow graduate students. I thought the discussion of the future of publishing was particularly interesting. This has come up in previous classes, but I continue to be intrigued by the idea of a “digital dissertation,” and when/if/how such a project would gain the same type of acceptance (if that’s the right word) as a traditional dissertation. The project also made me think about the ways traditional presses could move into promoting these types of projects. It seems to me that if a project was hosted on the website of a university press, it would have a bit more cred than just hosting a digital project on my personal website.

I have also been thinking about the ways that adding a digital component to a traditional piece of scholarship can broaden the exposure it gets, particularly in the undergraduate classroom. For example, I would say that I would be far more likely to assign Emma Rothschild’s The Inner Life of Empire now that the Harvard Digital History Lab has created this fantastic digital project that can accompany it. Finding a way to pare down my dissertation or first book into a relatively accessible web project like this seems like a fantastic idea. Making something like that available for free I think increases the likelihood people will read and become interested in your work.

It was also interesting, and pretty clear, how much of a difference greater institutional support for digital humanities makes. All three of these students have resources available to them that are not available, or at least not in any centralized way, at Rice, and they all seem to have benefitted from them immensely. Outside of this class and Twitter, it has been tough for me to remain as engaged with digital methods as I’d like to be, which of course is a problem with self-motivation. One of the great things about digital humanities, though, is that it is collaborative in a way that more traditional humanities scholarship rarely is. In disciplines outside the humanities, co-authorship and collaboration are the norm, and hopefully digital humanities is a way of getting scholars to work together more often, both for traditional and non-traditional projects.

Hearing about the exciting projects that people are engaged in and the opportunities it opens for them has once again re-energized me, so hopefully that will bring me back to my blog more often, as I continue thinking about how DH can/will impact my own scholarship. But no promises.

Paper Machines

Last week, I attended a workshop offered by Jo Guldi on Paper Machines as part of the Digital History Master Class here at Rice. Caleb McDaniel has a debriefing post on what we covered in the workshop, and some thoughts on how we can use Paper Machines, but I wanted to offer some longer thoughts here about how I think I can make Paper Machines useful for me.

First, I think Paper Machines could be really helpful for me in the project I’m working on right now for the American Historical Association annual meeting coming up in January. I’m presenting a paper on the panel Manipulating Freedom: Liberty, Enslavement, and the Quest for Power in the Southwestern Borderlands discussing Texas’s voluntary enslavement law of 1858.[1] I will be analyzing how Texas newspapers (and southern newspapers more generally) discussed instances of free blacks voluntarily enslaving themselves as a way of analyzing Texans’ views of black freedom and the growing sectional crisis of the 1850s. One of my initial observations has been that when these stories are discussed in the newspaper, they are very formulaic, and often feature what seem like stock characters. Using a database like America’s Historical Newspapers, I could download OCR-ed articles discussing voluntary enslavement in Texas newspapers, and assess this general observation more systematically using Paper Machines. I’d be interested to see what kind of word clouds and phrase-nets these articles produced, even if it only functioned as a way to visualize what I thought I was reading in these newspapers.

Secondly, I think Paper Machines would be helpful as well in developing a comparative project like my dissertation. Even if I only used it to analyze secondary sources and journal articles, I think Paper Machines could offer some direction on fruitful avenues of research when going into what seems like a pretty ambitious project. If I downloaded to my Zotero library all the articles I will be using for my dissertation on free people of color in Cartagena, Colombia and in Charleston, South Carolina, I could use Paper Machines to see if my focus is in the right place, or if there are potential areas of research that I hadn’t yet thought of exploring. For instance, I would expect terms like “Haiti” and “respectability” to be featured fairly prominently in any word clouds, but perhaps there are terms I wouldn’t expect as well. Further, since the concept of respectability will play such a central role in my argument, it would be really interesting to see what kind of terms and ideas are connected to respectability (using phrase nets and topic modeling), both when the articles on the two regions are analyzed together and when Cartagena and Charleston are analyzed separately. I would likely have to separate out articles in Spanish from the articles in English, although keeping them together could perhaps still work if I was careful about analyzing cognates/false cognates.

Jo Guldi emphasized to us that Paper Machines is in a “pre-Alpha” stage, so I look forward to exploring what Paper Machines can do as she and other programmers begin to cater it more closely to their research needs.

[1] I’ve written previously about my work on this law here.

Is this thing on? Digital History, Programming, Python

*MICROPHONE FEEDBACK* Whoa, whoa, hot mic here! Sorry folks.

So, I haven’t written a blog post in 5 months. Here’s what happened in that time: I studied for, took, and passed my comprehensive exams; I submitted an article to a journal, got it back, made revisions, and resubmitted it; I went to Bogotá, Colombia for a two-week preliminary dissertation research trip; I applied for research fellowships from Fulbright-Hays, Fulbright, Social Science Research Council, and the Council on Library and Information Resources; I started revisions on a paper I’m presenting at the American Historical Association annual meeting in New Orleans in January. I’ve been a little busy.

I am also taking Caleb McDaniel’s Digital History Master Class, and doing some of the lessons/tutorials over at The Programming Historian, and that’s what I want to write about today.

Last week, Chad Black (@parezcoydigo) came to Rice and delivered a lecture about criminality and institutional profiling in colonial Quito, but also talked/worked with us about Python. Much of what we discussed in our workshop had to do with using digital tools like Python to solve problems. We seemed to come to a consensus (or at least I thought we came to a consensus, perhaps because this is what I was thinking) that you need to have specific problems that need solving in order for digital humanities to “work” for you, but at the same time, you need to have some kind of familiarity with digital tools in order to think of digital solutions when these problems come up.

With this in mind, I have started the Programming Historian tutorials. It seems like Python could be really helpful with a lot of the text-based problems that might come up as I research, organize that research, and write my dissertation over the next few years (and over the course of my career). I’ve only gotten through the first two lessons, but so far the process reminds me of when I first learned HTML. I’m hoping that these tutorials will get me to a point where I have enough of a base to go rogue, and start looking up my own Python-based solutions as problems arise. I plan on periodically posting back here to give updates on how learning a new language is going, ask questions, etc.

I swear (to myself) it won’t be another five-month hiatus. I’ll be here every week.

Try the meatloaf. Tip your bartenders.

Digitization and 19th Century Newspapers, continued thoughts.

So in thinking more about how I might go about using some kind of spatial mapping to demonstrate the relative importance of the self-enslavement laws in comparison with the re-opening of the African slave trade and the admission of Kansas as a free state, I’ve come across some really great stuff from other people who have thought (and acted) much more deeply on these issues than I have.

A joint venture between Stanford University and the University of North Texas, Mapping Texts assesses patterns across hundreds of thousands of pages of Texas newspapers, from 1829 to 2008. I haven’t found a way to do very much with it just yet, but it’s something I’m going to look into, to see if there is a way I can get “under the hood” so to speak. One interesting thing I noticed in a preliminary assessment is that between 1856 and 1861, ‘Kansas’ is one of the top 30 most frequently named entities, ranking higher than every other state except Texas (of course) and New York.

Also interesting, though with less of a bearing on the issues I’m working through, are two posts from April 2011 about how changing database construction in America’s Historical Newspapers, as it pertains to the way in which advertisements are identified and counted as articles, can skew results.

I’m glad I’m really starting to think critically about how these databases are constructed, not just about the images they contain. And I’m glad so many other people have gotten there first.

Methodology, Technology, and Historical Newspaper Databases

The use of newspaper databases (America’s Historical Newspapers, for example) has in many ways drastically changed the way historians conduct research.  No longer doomed to pore over microfilm day by day, column by column hoping we find something useful, historians/we can now utilize strategic keyword searches—for this project, phrases like “going into slavery” and “chose + master”—to find the articles we’re looking for, or figure out they don’t exist.  Historians like Matthew Rainbow Hale and Carol Lasser have used these databases in truly innovative ways, using the frequency of appearance of certain keywords to support their arguments about the shape of discourses regarding political time and antislavery rhetoric, respectively.[1]  These scholars should be applauded for their unique use of new technologies to substantiate their claims; however, in researching the ways that Texas’s self-enslavement law of 1858 relates to the wider defense of slavery in the state, and the South, I have realized a potential pitfall of using these databases.
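As a rough illustration of the keyword-frequency approach described above, here is a minimal Python sketch that counts how many articles per year contain a given phrase. Everything in it is an assumption made for illustration: the function name `phrase_counts_by_year`, the (year, text) input format, and the sample articles, which are invented placeholders and not quotations from any newspaper. It is not a reconstruction of Hale’s or Lasser’s actual methods, and it says nothing about how a database like America’s Historical Newspapers exports its results.

```python
from collections import Counter


def phrase_counts_by_year(articles, phrases):
    """Count, for each phrase, how many articles per year contain it.

    `articles` is an iterable of (year, text) pairs; matching is a simple
    case-insensitive substring test, so OCR errors will cause misses.
    """
    counts = {phrase: Counter() for phrase in phrases}
    for year, text in articles:
        lowered = text.lower()
        for phrase in phrases:
            if phrase.lower() in lowered:
                counts[phrase][year] += 1
    return counts


# Invented placeholder articles (hypothetical, for illustration only).
articles = [
    (1857, "A report of a free man going into slavery of his own choice."),
    (1858, "Under the new law he chose a master."),
    (1858, "Another instance of a free man going into slavery."),
]

counts = phrase_counts_by_year(articles, ["going into slavery", "chose a master"])
for phrase, by_year in counts.items():
    print(phrase, dict(sorted(by_year.items())))
```

A count like this only says how often a phrase turns up in the articles a search returned; it says nothing about the issues in which the phrase never appears, which is the pitfall discussed below.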

Part of the way I am organizing my argument is in interpreting self-enslavement laws not through the lens of free blacks’ social position, but by placing them in the context of proslavery rhetoric.  The passage of voluntary enslavement laws across the South in the late 1850s makes infinitely more sense when considered in conjunction with the historiography of the southern defense of slavery, and proslavery thought more generally.  Herein lies one of the issues with digital newspaper databases: a researcher could easily find stories of self-enslavement from across the South through a keyword search, and take these to reflect an increasing desperation among free blacks in the 1850s, or a sudden decision by southern states to enforce existing free black expulsion statutes.  To a certain extent, that risk should be mitigated simply by taking the same caution with newspapers that we do with other sources; that’s fairly clear.  But it is only by viewing these newspapers in full, and looking at dozens of issues in which self-enslavement stories don’t appear, that what I would argue is their proper context can be understood.

Stories of voluntary enslavement appear sporadically in southern newspapers in the late 1850s, but with nowhere near the frequency, nor the importance, ascribed to them by Ira Berlin.[2]  In Texas at least, self-enslavement stories typically seemed to be extremely short, and almost never appeared on the front page.  In contrast, stories about the need to re-open the African slave trade, the admission of Kansas to the Union, and the scarcity and high price of slave labor (among others) all take up drastically more attention, and space, in the columns of Texas newspapers.  The ability to get straight to self-enslavement stories through keyword searches sometimes risks allowing historians to skip over the forest, and get straight to the trees.

I am trying to determine if there is a methodology that will allow me to more scientifically highlight the discrepancy in importance between self-enslavement stories and the reopening of the slave trade, for example, in Texas’s proslavery periodicals.  Hale, for instance, puts together a table in which he tracks references to certain key phrases, and how those references changed over time.  Hale’s methodology has its own inherent drawbacks, and since self-enslavement articles and others generally defending slavery as a “positive good” tend to use similar language, I’m not sure this approach would work well for my purposes.  I have considered either using multiple papers within a small date-range surrounding the passage of the law, or a single paper over a greater period of time, to compare the surface area of the paper taken up by various issues.  If we assume editors gave more, and more prominent, space in the paper to the issues of greatest importance, I could perhaps come up with a formula in which each line and each column was assigned a particular value, depending on the page on which it appeared: 50 lines on page 1 would be weighted as more important than 50 lines on page 4, etc.  This is something I am attempting to work through, but I think it could ultimately provide a nice graphic representation of the ways in which self-enslavement was a part, but only a very small part, of the wider defense of the institution of slavery, if coupled with a more traditional evidence base.
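To make the weighting idea a little more concrete, here is one way it might be sketched in Python.  This is only a sketch under stated assumptions: the page weights in `PAGE_WEIGHTS` are arbitrary numbers I picked for illustration, the function names are hypothetical, and the sample line counts are invented, not measured from any actual newspaper.  The score for a topic is simply the number of lines it occupies multiplied by the weight of the page it appears on, summed over every story on that topic.

```python
# Hypothetical weights: page 1 counts most, page 4 least (arbitrary values).
PAGE_WEIGHTS = {1: 4.0, 2: 2.0, 3: 1.5, 4: 1.0}


def prominence_scores(stories, page_weights, default_weight=1.0):
    """Sum lines * page weight for each topic.

    `stories` is an iterable of (topic, page, lines) tuples; pages missing
    from `page_weights` fall back to `default_weight`.
    """
    scores = {}
    for topic, page, lines in stories:
        weight = page_weights.get(page, default_weight)
        scores[topic] = scores.get(topic, 0.0) + lines * weight
    return scores


def share_of_coverage(scores):
    """Convert raw scores into each topic's fraction of the total."""
    total = sum(scores.values())
    if total == 0:
        return {topic: 0.0 for topic in scores}
    return {topic: score / total for topic, score in scores.items()}


# Invented sample data: (topic, page, number of lines).
stories = [
    ("slave trade", 1, 50),
    ("self-enslavement", 4, 50),
    ("Kansas", 2, 30),
]

scores = prominence_scores(stories, PAGE_WEIGHTS)
shares = share_of_coverage(scores)
for topic in sorted(shares, key=shares.get, reverse=True):
    print(f"{topic}: score {scores[topic]:.0f}, share {shares[topic]:.1%}")
```

With these made-up weights, 50 lines on page 1 score four times as high as 50 lines on page 4.  The hard part, of course, is justifying the weights themselves, which the code cannot do for me.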

[1] Matthew Rainbow Hale, “On Their Tiptoes: Political Time and Newspapers during the Advent of the Radicalized French Revolution, circa 1792–1793,” Journal of the Early Republic, 29, no. 2 (2009), 191–218; Carol Lasser, “Voyeuristic Abolitionism: Sex, Gender, and the Transformation of Antislavery Rhetoric,” Journal of the Early Republic, 28, no. 1 (2008), 83–114.

[2] Ira Berlin, Slaves Without Masters: The Free Negro in the Antebellum South, (New York: Pantheon Books, 1974), 366–67.