Category Archives: Digital History

Text Mining “Interchange: The Promise of Digital History”

One of the readings assigned to participants in the Doing Digital History Institute was the 2008 JAH article, “Interchange: The Promise of Digital History.” The “Interchange” sought to explore the burgeoning field of digital history, tackling questions of definition, pedagogy, forms of institutional support, possible effects on the meaning and process of historical research, and the resonance digital history might have with the various publics who might encounter it.

Today, Fred Gibbs introduced us to the concepts of data and text mining, and so I decided to see if I could apply what I learned to the JAH article. Would interesting patterns emerge from the various interviews that appeared? My initial work focused on converting the article into a plain text (.txt) file. I then divided it into a variety of smaller files: the questions posed by the JAH editor, all of the responses offered by each individual participant, and each question accompanied by its related set of answers (and here I do wish I knew how to automate this process instead of cutting and pasting for an hour). In the end, I had one large question file, eight participant files, and nine individual question files, as well as the original .txt file. I ran the computations that follow through Voyant Tools.
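For anyone who would rather not cut and paste for an hour, the splitting step could probably be automated. What follows is only a minimal sketch in Python, not what I actually did. It assumes that every contribution in the plain text file begins at the start of a line with a label such as "JAH:" or "Cohen:". The function names and the label format are my own assumptions, so the pattern would need adjusting to match the real file.

```python
import re
from collections import defaultdict

def split_by_speaker(text, speakers):
    """Collect every contribution under the label that opens it.

    Assumes each contribution starts a line with "Label:" (an assumption
    about the file layout, not something guaranteed by the article).
    """
    labels = "|".join(re.escape(name) for name in speakers)
    pattern = re.compile(r"^(%s):\s*" % labels, re.MULTILINE)
    matches = list(pattern.finditer(text))
    contributions = defaultdict(list)
    for i, match in enumerate(matches):
        # A contribution runs until the next label, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        contributions[match.group(1)].append(text[match.end():end].strip())
    return dict(contributions)

def split_by_question(text, question_label="JAH"):
    """Return one chunk per question, each holding the question and its answers."""
    pattern = re.compile(r"^%s:" % re.escape(question_label), re.MULTILINE)
    starts = [m.start() for m in pattern.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[s:e].strip() for s, e in zip(starts, ends)]
```

Each entry returned by either function could then be written to its own .txt file and uploaded to Voyant Tools.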

Caveat: it is important to note that I have no idea how this interview was edited. I am assuming that the final printed comments reflected the overall contributions of each of the interviewees, but I most certainly cannot be sure.

Overall, one can see the general emphasis of the article through a simple word cloud:

The cloud excludes common English words, as well as other common, but probably less helpful, words: digital, history, historians, historical, each of the authors’ names (which were used to signal the start of each of their contributions in the article), and the word “JAH”, which was used before each question. We are left with the following twenty most frequently used words:

Certainly a couple of themes begin to emerge from this basic analysis. First, the emphasis digital historians placed on the field being “new” is clearly apparent, especially as “new” is frequently followed by “media” or “digital technology” in the article. Given the article’s goal of identifying and describing digital history as a new enterprise historians were embarking upon, this may not be surprising; however, the strong use of the word does supply evidence for Fred Gibbs’ point today about the somewhat overstated dichotomy between “traditional” (textually-based) history and “new” (digitally-based) history.
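The filtered count behind the word cloud and the top-twenty list can be approximated outside of Voyant Tools. The sketch below is an illustration only: the short stop list stands in for Voyant's much longer English list, the function name is my own, and the speaker names would need to be passed in as extra exclusions.

```python
import re
from collections import Counter

# Illustrative stop list only; a real English stop list is far longer.
COMMON_WORDS = {"the", "of", "and", "a", "to", "in", "is", "that", "it", "as"}
# The extra exclusions named in the post (speaker names are passed in separately).
EXTRA_EXCLUSIONS = {"digital", "history", "historians", "historical", "jah"}

def top_words(text, n=20, extra_stopwords=()):
    """Return the n most frequent words after dropping excluded words."""
    words = re.findall(r"[a-z']+", text.lower())
    excluded = COMMON_WORDS | EXTRA_EXCLUSIONS | {w.lower() for w in extra_stopwords}
    counts = Counter(word for word in words if word not in excluded)
    return counts.most_common(n)
```

Calling `top_words(article_text, 20, extra_stopwords=["cohen", "thomas"])` on the full text (with all eight names supplied) should give a list comparable to the one above, though the exact counts will depend on how the stop list and tokenization differ from Voyant's.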

The interviewees also signaled a strong interest in thinking about “research” and “scholarship,” both of which appear more frequently than the word “student.” What might be even more interesting is the way that “research” and “scholarship” appear throughout the article, whereas “student” is mainly concentrated in the early questions on pedagogy:

Yet, despite the importance of words like “new”, “research”, and “scholarship” in the printed discussion, it is also worth noting how similarly the remaining words appear in frequency. In fact, 86% of the fifty most frequently used words in the article fall within one standard deviation of the mean usage of those fifty words (85% if the top three results are excluded). Thus participants appear to have been roughly equally interested in most of the topics covered in the article.
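The "within one standard deviation" figure is simple to reproduce from a list of word counts. This sketch assumes the population standard deviation; the sample version (`statistics.stdev`) would be just as defensible and could shift the percentage slightly. The function name is hypothetical.

```python
from statistics import mean, pstdev

def share_within_one_sd(counts):
    """Fraction of counts lying within one standard deviation of their mean."""
    mu = mean(counts)
    sd = pstdev(counts)  # population standard deviation (an assumption)
    inside = [c for c in counts if abs(c - mu) <= sd]
    return len(inside) / len(counts)
```

Given the counts for the top fifty words, the function returns a fraction between 0 and 1. Dropping the first three counts before calling it gives the comparison with the top three results excluded.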

If we examine the responses by interviewee, though, we do see some interesting differences begin to emerge. First, it is worth noting that each interviewee is not represented equally in the interview:

Two individuals supply 43% of all the text. Consequently, it might be useful to see whether Cohen and Thomas had a particular effect on the overall pattern of words in the article.
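Each interviewee's share of the text can be estimated by counting words per speaker. The sketch below assumes the contributions have already been grouped into a dictionary that maps each speaker's name to a list of their responses. Both that structure and the function name are my own assumptions, and whether the JAH questions belong in the total is a judgment call.

```python
def word_share(contributions):
    """Map each speaker to their fraction of the total word count."""
    totals = {
        speaker: sum(len(part.split()) for part in parts)
        for speaker, parts in contributions.items()
    }
    grand_total = sum(totals.values())
    return {speaker: total / grand_total for speaker, total in totals.items()}
```

Adding together the two largest values in the result gives the combined share of the two most prominent voices.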

A graph of the seven most common words broken down by interviewee reveals some important trends:

First, Cohen’s responses overwhelmingly focused on “research”, “new”, “web”, “scholarship,” and “work.” Given his position as Director of the Roy Rosenzweig Center for History and New Media at the time of the interview, this is probably not very surprising. “Medium” and “scholarship” also appear quite frequently in Thomas’ responses, which, given his role in the Valley of the Shadow project, should also not be too surprising.

Fewer clear trends appear when these seven highly ranked words are analyzed by question:

In the end, this post is mainly an experiment to see if I could use the tools we were taught, but the data does allow for some broad conclusions to be drawn. Overall, it seems that the interview and interviewees were mainly concerned with thinking about the “newness” of digital history in 2008, figuring out what it might mean particularly for scholarship, though with a reasonably strong emphasis on pedagogy. It is worth noting that certain topics that have dominated the 2014 Institute discussion, such as the place of public history and museums within and around digital history, are present, but are much lower in the list of frequently used 2008 words. Moreover, “questions”, “methods”, and “process” are also quite low in the list, possibly indicating some uncertainty about these topics six years ago.

For a simple comparison, one can look at a Wordle compiled by Spencer Roberts from participants’ blog posts, which shows a much stronger emphasis on “students”, “project,” “comments”, and “sources.” Whether this change signals a shift in DH conversation, results from who the participants of the Institute are (mainly from Master’s-degree-granting programs, instead of larger research universities), or arises from the structure of the Institute is beyond the goals of this overly long post.


Project Ideas for Doing Digital History

As part of my application for the Doing Digital History seminar, I suggested two possible projects:

  • An introductory, undergraduate course in digital history
  • A digital experience exploring the rich and contrasting perspectives related to the annexation of Hawai’i in 1898.

In thinking about how to teach digital things to undergraduates, I have been guided by Jeff McClurken’s distinction between “digitally inflected” and “digitally centered” courses. In the past, I have experimented with various inflections, allowing students to complete a digital assignment, but I had never tried creating a course that placed digital historical learning at the center of what we were doing. Last year, almost on a lark, I suggested to members of my department that we should really take digital history more seriously. Somehow, this conversation ended with me agreeing to propose a true digital history course.

Yet, teaching digital history to freshmen and sophomores was not my first choice. I had originally thought to offer the course as an upper-level major elective. I assumed that students would need exposure to thinking historically through more traditional non-digital assignments to understand how to apply this thinking to the digital world. I have come to wonder, though, if this thinking is flawed. First, I have been influenced by T. Mills Kelly’s contention that students, living in what he terms a “remix culture”, might be perfectly primed to engage simultaneously with historical thinking and the digital world. Second, I have also begun to wonder whether, if students encounter the digital world more directly and explicitly early in their careers, there might be less need to convince them of the myriad ways that they could “use” their historical education in future career pursuits, simply because they would already have taken part in an exercise that demonstrates this to them.

The second project I have been thinking about is an exploration of the extraordinarily contested annexation of Hawai’i in 1898. For three years, I have been teaching a History of Hawai’i course, the creation of which was supported by the Tocqueville Summer Institute hosted by the University of Richmond. Undoubtedly, the most popular week of the course has been the debate on Hawaiian annexation, during which students read, engaged with, and role-played arguments put forth by opposing historical parties. The complicated history of the islands that immediately preceded the annexation, including a forced constitution, an armed coup, and a tense public debate, is a history that illuminates questions of imperial power, colonization, indigenous resistance, and the gendered construction of American identity, to name just a few. Moreover, given the current, but often unheard, demands of the Hawaiian Sovereignty Movement, there remain contemporary ramifications of this debate for Americans who all too often think of Hawai’i simply as our own national Eden.

My initial thought for the site was to use James Blount, the commissioner sent to Hawai’i following the 1893 overthrow of the Hawaiian monarchy to determine the legality and viability of annexation. Blount spent a considerable amount of time interviewing numerous participants in the events of the day, and in the end returned to the U.S. and filed a massive report suggesting numerous improprieties with the overthrow. Blount, I have wondered, might serve as a useful narrative device that could give the site some structure, focusing the various primary materials around the types of questions Blount might have pursued. However, while I think Blount could work well as a lens for an American audience engaged with the question of annexation, I am worried about the way the choice could potentially silence, or at least deaden, many of the voices of, and assumptions made by, those who opposed annexation. My choice of Blount too easily privileges the right of the U.S. to choose annexation or non-annexation, as opposed to questioning its right to make that choice to begin with. Consequently, I am particularly interested in exploring ways that multiple perspectives can be fairly explored in a digital environment, without necessarily losing the narrative and interpretive focus we all felt were crucial threshold concepts to historical thinking.