# TopiCompare - topic modelling for historical corpus comparison
This project showcases how topic modelling, in particular the Embedded Topic Model (ETM), can be used to compare historical newspaper corpora, using the Wiener Zeitung and the Salzburger Intelligenzblatt from 1789 to 1799 as an example. The notebooks folder contains notebooks covering tokenization, preprocessing, exploratory data analysis, model fitting and output analysis for both newspapers. The code can be adapted to other corpora. The repository also includes a requirements file listing the packages needed to run all of the code. For further reading on the specific steps, see: [link].
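To give a rough idea of what the tokenization and preprocessing steps involve, here is a minimal sketch using only the Python standard library. It is not the code from the notebooks: the function names, the tiny stop word list and the sample sentences are all hypothetical and chosen for illustration only.

```python
import re
from collections import Counter

# Hypothetical stop word list; a real run would use a much fuller German list.
STOPWORDS = {"der", "die", "das", "und", "in", "zu", "von"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def preprocess(text, min_length=3):
    """Drop stop words and tokens shorter than min_length."""
    return [t for t in tokenize(text) if t not in STOPWORDS and len(t) >= min_length]


def bag_of_words(documents):
    """Count preprocessed tokens over all documents of one newspaper."""
    counts = Counter()
    for doc in documents:
        counts.update(preprocess(doc))
    return counts


# Invented sample sentences, not taken from the actual corpora.
wiener = ["Der Kaiser reiste nach Wien.", "In Wien wurde das Theater eröffnet."]
salzburger = ["Der Markt in Salzburg wurde eröffnet."]

wz_counts = bag_of_words(wiener)
sib_counts = bag_of_words(salzburger)

# Vocabulary shared by both newspapers, a simple starting point for comparison.
shared = sorted(set(wz_counts) & set(sib_counts))
print(shared)
```

Word counts of this kind are the usual input to a topic model; the actual model fitting with the ETM is covered in the notebooks.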
- model training
- model evaluation
- topic interpretation and analysis
Additional contribution by Christoph Steindl:
- data retrieval
- data preprocessing
Additional contribution by Martin Krickl:
- formulation of research question
- domain-related consulting
Terms of use: This work is licensed under a CC-BY-NC-SA license.