Topic Modeling with Networks

This page documents our use of biterm topic modeling and interactive text networks.
Tags: R, topic modeling, interactive text networks, race relations

Topic Model Networks

A topic model, put simply, models the topics in a body of text and the words associated with each topic. Words may fall into multiple topics, and the model accounts for this by giving each topic a probability distribution over the words. A topic model network is a useful way to visualize the topics and the words associated with each of them. Here we explore two different topic models.
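To make the idea of a topic model network concrete, here is a minimal sketch of turning per-topic word distributions into a topic-to-word edge list that a network plot could consume. It is written in Python purely for illustration and is not the code behind the networks on this page. The function name `top_word_edges`, the `n_top` cutoff, and the probabilities are all assumptions made up for this example; only the words echo the discussion further down.

```python
# Toy topic-word distributions. The numbers are invented for illustration
# and are NOT output from the models discussed on this page.
topic_word_probs = {
    "Topic 1": {"war": 0.30, "democracy": 0.25, "freedom": 0.20, "race": 0.15, "home": 0.10},
    "Topic 2": {"race": 0.35, "north": 0.25, "south": 0.20, "war": 0.12, "home": 0.08},
}


def top_word_edges(topic_word_probs, n_top=3):
    """Return (topic, word, probability) edges for each topic's n_top most probable words."""
    edges = []
    for topic, probs in topic_word_probs.items():
        # Rank the words of this topic from most to least probable.
        ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
        for word, p in ranked[:n_top]:
            edges.append((topic, word, p))
    return edges


edges = top_word_edges(topic_word_probs, n_top=4)
for topic, word, p in edges:
    print(f"{topic} -- {word} (p={p:.2f})")
```

In the resulting edge list, a word that appears under more than one topic (here "war" and "race") becomes a node linking those topics, which is what makes some networks look connected and others disconnected.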

Latent Dirichlet Allocation

Latent Dirichlet Allocation, or LDA, is the typical go-to method for topic modeling. We chose to model the texts with 6 topics. In the three networks, this mostly produces very disconnected topics, which intuitively seems to be a poor fit, as the corpus is rather small and the soldiers are responding to direct and specific questions. LDA does produce a better-connected network for the white soldiers’ outfits comment, but it does not do a great job of delineating the topics.
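One way to put a number on how "disconnected" or "well connected" a set of topics is would be to count the top words that pairs of topics share. The sketch below is in Python and is only an illustration under stated assumptions: each topic is summarized by a set of top words, and the helper names `topic_overlap` and `connected_fraction` and the word sets themselves are hypothetical, not taken from our fitted models.

```python
from itertools import combinations


def topic_overlap(top_words):
    """Map each pair of topics to the set of top words they share."""
    overlaps = {}
    for a, b in combinations(sorted(top_words), 2):
        overlaps[(a, b)] = set(top_words[a]) & set(top_words[b])
    return overlaps


def connected_fraction(top_words):
    """Fraction of topic pairs that share at least one top word."""
    overlaps = topic_overlap(top_words)
    if not overlaps:
        return 0.0
    linked = sum(1 for shared in overlaps.values() if shared)
    return linked / len(overlaps)


# Illustrative word sets only (assumed for this example).
top_words = {
    "Topic 1": {"war", "democracy", "freedom", "race"},
    "Topic 2": {"question", "chance", "answer"},
    "Topic 3": {"race", "north", "south"},
}

for pair, shared in topic_overlap(top_words).items():
    print(pair, sorted(shared))
print(connected_fraction(top_words))
```

A fraction near 0 corresponds to a network of isolated topic clusters, while a fraction near 1 corresponds to a network in which nearly every topic is linked to every other through shared words.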

Black Soldiers Long Comment

It appears that Topic 1 deals with the war itself and what people are fighting for. We see words like democracy, freedom, right, and live. At the same time, we see words like race, equal, and unit, suggesting the soldiers are thinking about their role in the war and at home. Topic 2 is about the survey itself and makes explicit references to having the chance to answer questions. Topic 3 is about organization in the military, while Topic 4 is about segregation in the military and transportation. Topic 5 delves further into race relations as well as the geographical distinction between north and south. Topic 6 seems to be generally about life and home.

White Soldiers Outfits Comment

The topics here are very well connected. As a result, none of the topics is distinctly about one thing. Topics 1 and 2 touch upon race relations within the military. Topic 3 is loosely about units in the military and answering question 60 of the survey. Similarly, Topic 4 is about question 62 and camp. Topic 5 touches on what seems to be a varied mix of the other topics. Topic 6 is perhaps about friction in the units and the war.

White Soldiers Long Comment

The topics here are well distinguished. Topic 1 seems to be about what civilian life will be like. Topic 2 is about the service experience itself and has elements indicating that the soldiers are thinking of a career in the military. Topic 3 is about the questionnaire itself and having the chance to answer questions. Topic 4 is about durations of time, probably referring to the military and/or school. Notably, we see the words waste and time. Topic 5 again touches upon race and the treatment of black people. It refers to treatment but, interestingly, the word equal is missing. Topic 6 is about food in the military and how morale is poor. Contrasting these topics with those of the black soldiers’ long comments, we see that the black soldiers were more concerned with race relations and with their status in the military and in civilian life. White soldiers focused more on mundane things such as food. Nevertheless, both groups spoke of having the chance to answer questions.

Biterm Topic Modeling (BTM)

There are some drawbacks to using LDA for our dataset, namely that it does not handle short texts well. That is why we also implemented a Biterm Topic Model, which does better on short texts. Overall, the topic model networks produced this way seem to strike a better balance between effectively delineating the topics and showing interconnectivity.
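As background on why this helps with short texts: BTM is generally described as modeling biterms, that is, unordered pairs of words that co-occur in the same short text, pooled across the whole corpus rather than handled document by document. The Python sketch below only illustrates the biterm extraction step, not the model fitting, and it is not the implementation used for this page. The helper names and the toy token lists are assumptions made up for the example, and each comment is treated as a single context window.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) in one short document."""
    # Sorting each pair makes ("fight", "equal") and ("equal", "fight") identical.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def corpus_biterm_counts(docs):
    """Pool biterm counts over the whole corpus."""
    counts = Counter()
    for tokens in docs:
        counts.update(extract_biterms(tokens))
    return counts


# Toy, already-tokenized comments (assumed for illustration only).
docs = [
    ["equal", "chance", "fight"],
    ["chance", "answer", "question"],
    ["fight", "equal", "chance"],
]

counts = corpus_biterm_counts(docs)
for biterm, n in counts.most_common(3):
    print(biterm, n)
```

Because the counts are pooled over the corpus, even three-word comments contribute usable co-occurrence evidence, whereas a document-level model such as LDA has very little to work with in each individual comment.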

Black Soldiers Long Comment

Topic 1 differs in the BTM and seems interconnected with nearly all the other topics. Broadly, it touches on the war and the chance to fight. It also suggests that the war involves some sort of give and take, with the expectation of getting something better out of it in the end. Topic 2 is concretely about segregation in different spaces within the military. Topic 3 is filled with optimistic words such as better, opportunity, equality, chance, free, and fair. At the same time, however, it speaks of Jim Crow. Topic 4 is about positions within the military and potentially also about having black officers. Topic 5 touches upon various parts of day-to-day life and seems a bit more mundane. Topic 6 focuses on race and geography, specifically mentioning negro, white, north, and south. One advantage here is that this model groups race-related terms into one topic, as opposed to the 3 different topics seen with LDA.

White Soldiers Outfits Comment

Topic 1 here is nebulously connected with the other topics, yet it is concretely about the separation of races and how well it would work. Topic 2 seems to be about day-to-day activities such as eating, sleeping, bathing, and interacting with the sergeant. Topic 3 focuses on how the merger of races/classes may interfere with morale. Topic 4 is very vaguely about career development and, somehow, Delaware. Topic 5 suggests that racial mixing would cause disunity and start race riots. Topic 6 includes the word resent and several other racial terms. However, it also includes the word like. This BTM fared about as well as the LDA, as there were some vague topics. This might be an indication that we fit too many topics to these comments. However, as we mentioned earlier, we went with 6 topics throughout to standardize the comparisons.

White Soldiers Long Comment

Again, Topic 1 intersects with the other topics. It is in general about the war/army and includes words such as give and take. Topic 2 is about getting time to eat and getting furloughed. Topic 3 is about positions in the military and also includes terms like valuable, waste, time, and money, suggesting the soldiers are thinking of career prospects here. Topic 4 identifies the enlistment/service experience and how many of the soldiers are young. Topic 5 focuses on how the questionnaire is a chance to talk about race and whether this opportunity is needed or not. Lastly, Topic 6 is about different branches in the military and the experience needed to transfer. Again, we see that, in comparison to the black soldiers’ long comment topics, the white soldiers’ topics are more focused on careers.