Allen B. Riddell

How to read 16,700 journal articles: studying German Studies with topic models

Sat 31 March 2012

The slides for my presentation at the 21st St. Louis Symposium on German Literature at Washington University, “Distant Readings / Descriptive Turns: Topologies of German Culture in the Long Nineteenth Century,” are available here.

For those interested in learning more, I want to point to the following resources. These vary in their assumptions of background knowledge; I’ve tried to put the more introductory material first within each section.

Topic models

(NB: This is a partial selection. I’ve tried to focus on material of interest to those working in the human and social sciences.)

Software implementing LDA

Text Analysis



Bayesian Statistics

Machine Learning


Block, Sharon, and David Newman. 2011. “What, Where, When, and Sometimes Why: Data Mining Two Decades of Women’s History Abstracts.” Journal of Women’s History 23: 81–109.

Chang, Jonathan, Jordan Boyd-Graber, Sean Gerrish, Chong Wang, and David M. Blei. 2009. “Reading Tea Leaves: How Humans Interpret Topic Models.” In Advances in Neural Information Processing Systems 22.

Griffiths, Thomas L., and Mark Steyvers. 2004. “Finding Scientific Topics.” Proceedings of the National Academy of Sciences 101 (Suppl. 1): 5228–5235. doi:10.1073/pnas.0307752101.

Hall, David. 2008. “Studying the History of Ideas Using Topic Models.” Stanford University.

Hall, David, Daniel Jurafsky, and Christopher D. Manning. 2008. “Studying the History of Ideas Using Topic Models.” In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 363–371. Honolulu, Hawaii: Association for Computational Linguistics.

Heinrich, Gregor. 2004. “Parameter Estimation for Text Analysis.”

Hoff, Peter D. 2009. A First Course in Bayesian Statistical Methods. Springer.