For the last reflection before our final research assignment, my Digital Humanities in REL course was asked to re-evaluate what we’ve learned throughout the semester. We focused especially on what it means to ‘do data’ and what that might mean for scholars, students, research participants, and everyone in between. Because the concept was a bit broad, we tried to narrow our focus and together came up with this working definition of Digital Humanities:
Digital humanities combine technology with theory. Working in digital humanities requires the recognition of human error and contribution to what seems “given” when using technological interfaces present everywhere. We must critically examine the digital world, just as we analyze literature, by leaving room for humanistic contribution and not completely trusting what appears at face value. We complicate the “givens” of computational methods because knowledge production is a political act.
Dr. Wieringa’s DH in REL Class, Fall 2020.
While this definition is far from perfect, you can get the gist of what we think it means to do data in the digital humanities and in religious studies more specifically. For me and my classmates, it was important to point out that knowledge does not exist on its own, but in a context that is situated and dependent on the knowledge producer.
Text analysis has been a popular form of computational analysis since its inception. Whether you support close reading, distant reading, or a healthy mixture of both, there is always something to be learned when evaluating, comparing, and considering the words used by scholars, authors, poets, and anyone in between.
Voyant Tools is a popular online source for analyzing digital texts. Any user can upload their word source and then play with the various visualizations offered by the site. All of the visualizations show the various relationships between the digitized words and can be connected and presented in unique ways. The image above shows the first page Voyant shows after analyzing the American Medical Association (AMA) Journal of Ethics, July 2018 edition. The screenshot shows the page exactly as it first appeared. I did not make any edits or refine any key-terms. This is why abbreviations like ‘dr’ are visible.
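To make the point about unrefined key terms concrete, here is a minimal sketch of a word count with and without a stopword list. It is not how Voyant works internally; the function name `top_terms`, the sample sentence, and the stopword choices are all my own assumptions for illustration.

```python
import re
from collections import Counter


def top_terms(text, stopwords=frozenset(), n=5):
    """Return the n most frequent lowercase word tokens not in stopwords."""
    # Lowercase the text and keep runs of letters (and apostrophes) as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in stopwords)
    return counts.most_common(n)


# Hypothetical sample text, not taken from the AMA Journal of Ethics.
sample = "Dr. Smith and Dr. Jones discuss ethics. Ethics matters, Dr. Lee says."

# Without any refinement, the abbreviation 'dr' tops the list.
raw = top_terms(sample, n=2)

# Choosing to exclude 'dr' and 'and' changes what the "data" appears to say.
cleaned = top_terms(sample, stopwords={"dr", "and"}, n=1)

print(raw)
print(cleaned)
```

The choice of which words to exclude is exactly the kind of decision that shapes what a visualization appears to show.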
In my first semester of graduate school, I took Debates in Method and Theory with Dr. Russell McCutcheon. In the second half of the course, we read Constructing “Data” in Religious Studies, which was (at the time) the most recent addition to the NAASR Working Papers series. If you have time to deep dive into what it means to ‘do data’ in Religious Studies, then this collection of papers is a must-read. Data is broken into the subcategories: Subjects, Objects, Scholars, and Institutions. Each scholar takes a step back to reconsider the ways that data is constructed and not discovered.
In Digital Humanities in REL, which I am currently taking, we were asked to reflect on what counts as data for the study of religion. It kind of feels like cheating to bring in a powerhouse source like Constructing “Data” in Religious Studies, but then again, it would be just plain wrong to neglect it. Data — as I have repeated endlessly in other blog posts and in almost every class discussion — does not speak for itself, and beyond that, data does not exist by itself. This is why these subcategories of Data can exist. Social actors employ tools (like subjects, objects, scholars, and institutions) to construct data.
We’ve somehow made it to mid-semester already. And while the workload certainly supports that observation, the time itself has flown by. Getting halfway through an upper-level course often means the focus shifts towards a final project, which is exactly where my Digital Humanities course is headed.
For our lab this week, the class was asked to evaluate a data set that might be used as a source for our final projects. The goal of the project is to formulate an argument based on the comparison of two different datasets. One of the datasets must be the Longitudinal Religious Congregations and Membership File discussed in a previous blog post. The other source can be one of our choosing. Which is great unless you have a brain like mine that basically runs like an internet browser with too many tabs open. I’ve found one rabbit hole after another and (as often happens) have been slightly derailed from my long-term goal. This is where small goals become especially handy, as this blog post will hopefully move me a step closer towards finalizing my ideas for my final project.
For Lab 9 of my Digital Humanities course, I evaluated the various ways to organize and then visualize data. These graphics were done using Tableau Prep and Tableau Desktop and are far from comprehensive. The dataset manipulated for these graphics came from a group called Gallup in 2019 and is titled the Self-described religious identification of Americans. This dataset is similar to the Longitudinal Religious Congregations and Membership File discussed in previous posts as it also looks at self-identified religious groups over time. Although both evaluate similar categories, they each draw the categorical lines differently, and beyond that, count category members differently (but this is an idea that I’ll explore later).
For now, it is important to understand the process of visualizing data. Once you’re the one in charge, the choices of inclusion and exclusion become quite obvious. Consider my first attempt at cleaning and visualizing the Gallup data:
As I have repeated many times to my classmates in Digital Humanities: the data doesn’t speak for itself. Part of understanding that comes from an insight provided by the philosopher Karl Popper, who reminded a group of physics students that the first step in observation is choosing what to observe in the first place.
This is exactly what we were asked to do for our lab this week: choose what to observe and thus create data. Every student evaluated the same data source, the Seventh-day Adventist Yearbook, but we each chose different information to make into our own datasets.
From my understanding, an interface is a medium of meaning-making. The UCLA Center for Digital Humanities defines any interface as “an in-between space, a space of communication and exchange, a place where two worlds, entities, systems meet.” They go on to explain how terminology like ‘windows’ and ‘desktop’ implies real-world, tangible places to be looked through or worked on, parallel to their uses in technology. But as this article points out, an interface might not be as straightforward as looking through a window:
“As with all conventions, these [interfaces] hide assumptions within their format and structure and make it hard to defamiliarize the ways our thinking is constrained by the interfaces we use”
The final project for my Digital Humanities course asks students to create a data review exploring a research question of interest. Part of the source data must come from the Longitudinal Religious Congregations and Membership File, but other data sources can be drawn on for support as well. The problem for me, as it often is, is narrowing my research interests. At first, the plan was to evaluate what scholars and digital humanists even mean by ‘data’ and what counts as ‘data’, but this seemed too close to my comfort zone (more humanities than digital), and I wanted to challenge myself a bit. In the long run, I have decided that I will tie in some commentary on data, but more to provide some ethos for myself than to be the main example of my data review.
This semester I am taking a Digital Humanities course designed and taught by Dr. Jeri Wieringa. Part of this class includes writing blog posts about various topics discussed in class. I have already crafted a few posts (one on accessibility in DH and another assessing and critiquing a DH project) and there will be several more to follow.
Last class, we read and discussed Hadley Wickham’s “Tidy Data” as a way to re-evaluate the options for organizing and presenting data. For homework, we were tasked with tidying a table from the Pew Research Center on the frequency of prayer. Below is the original table:
According to Wickham’s argument, a table should be made of columns and rows. Each column should hold a single variable, while each row should hold a single observation of what is described. The rest of the table is filled with values that represent the recorded data. Based on Hadley Wickham’s criteria, this Pew research presentation is a bit untidy. What is being described is the percentage of various religious traditions that pray. The frequency of prayer is divided into categories (‘At least daily’, ‘weekly’, ‘monthly’, ‘seldom/never’, ‘don’t know’). These categories are values of a single variable (the frequency of prayer) and, as such, should exist in rows, not as column headers. The column headers should name the variables being measured.
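The tidying step described above can be sketched in a few lines of Python. This is only an illustration of the wide-to-long idea under my own assumptions: the helper name `tidy`, the column names, the group labels, and the numbers are all invented placeholders, not Pew’s actual figures.

```python
def tidy(wide_rows, id_column, variable_name, value_name):
    """Turn one wide row per group into one long row per (group, category)."""
    long_rows = []
    for row in wide_rows:
        for header, value in row.items():
            if header == id_column:
                continue  # the identifier stays as its own column
            long_rows.append({
                id_column: row[id_column],
                variable_name: header,  # the old column header becomes a value
                value_name: value,
            })
    return long_rows


# Hypothetical untidy table: prayer-frequency categories used as column headers.
wide = [
    {"tradition": "Group A", "At least daily": 60, "Weekly": 20, "Seldom/never": 20},
    {"tradition": "Group B", "At least daily": 40, "Weekly": 30, "Seldom/never": 30},
]

long_table = tidy(wide, "tradition", "frequency", "percent")
for record in long_table:
    print(record)
```

In the result, every row is one observation (a tradition at one prayer frequency) and every column is one variable: tradition, frequency, and percent.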
The Viral Texts Project is a digital humanities project that aims to help scholars understand the themes and decisions that helped newspaper content ‘go viral’ before going viral was the hip thing to do. The project created an algorithm that ‘reads’ newspapers and traces the reprinting of their content in other papers. By following the reprints, the project visualizes how certain newspaper pieces went ‘viral’.
Most newspapers at the time did not have intellectual property rights, so editors and publishers of papers in smaller cities would literally cut and paste sections from larger newspapers into their local papers. This created a sort of hodgepodge of ‘viral’ material that publishers thought their readers might be interested in.
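To give a feel for how a computer might notice that one paper has pasted in a passage from another, here is a minimal sketch that compares the three-word sequences two texts share. This is a generic overlap measure that I am using as a stand-in, not the Viral Texts Project’s actual algorithm; the function names and sample sentences are my own assumptions.

```python
def shingles(text, n=3):
    """Return the set of n-word sequences (shingles) found in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap(a, b, n=3):
    """Share of distinct shingles that two texts have in common (Jaccard index)."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0  # a text shorter than n words has nothing to compare
    return len(sa & sb) / len(sa | sb)


# Hypothetical snippets standing in for newspaper columns.
original = "the quick brown fox jumps over the lazy dog"
reprint = "a local note the quick brown fox jumps over the lazy dog"
unrelated = "markets closed higher on friday after a quiet week"

print(overlap(original, reprint))    # high score: most shingles are shared
print(overlap(original, unrelated))  # no shared shingles
```

A high score suggests a likely reprint worth a closer look, while a score near zero suggests the two passages were written independently. Deciding where to draw that line is, once again, a choice made by the researcher.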
Below is a presentation I gave for a Digital Humanities course which asked students to constructively critique and assess a digital humanities website. The Viral Texts Project was the focus of my presentation.