Count of places mentioned in The Times (UK) articles, per year

What is this?

Gale Primary Sources provides a digitized archive of all editions of The Times (UK) from 1785 until 2020. To isolate what we consider to be "crisis-related coverage", we began with a keyword search for "cris!s" over the headline and first 100 words of each article in the corpus. Based on an assessment of the matching articles, we expanded the list of crisis-related keywords and repeated the process.
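As a rough illustration of this filtering step, here is a minimal Python sketch. It is not the code used for this project: the function names are hypothetical, and it assumes that the "!" in "cris!s" is a wildcard standing for one character or none, so that the keyword covers both "crisis" and "crises".

```python
import re

def wildcard_to_regex(keyword):
    """Compile a keyword, treating "!" as "one character or none" (an assumption)."""
    pattern = r"\w?".join(re.escape(part) for part in keyword.split("!"))
    return re.compile(rf"\b{pattern}\b", re.IGNORECASE)

def is_crisis_related(headline, body, keywords, n_words=100):
    """Check the headline plus the first n_words words of the body for any keyword."""
    opening = " ".join(body.split()[:n_words])
    text = f"{headline} {opening}"
    return any(wildcard_to_regex(k).search(text) for k in keywords)
```

Repeating the process with an expanded list then amounts to calling is_crisis_related again with more entries in keywords.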

Next, Named Entity Recognition (NER) was used to identify mentions of places in this corpus of crisis-related coverage. Separately, GeonamesCache (GNC) and GADM were used to generate a list of "known places". Each NER place was then matched against the list of known places using a combination of spellchecking and fuzzy matching; since GNC and GADM entries already carry geocodes, every matched location was geocoded immediately.
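The matching step can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's pipeline: the small KNOWN_PLACES dictionary is a hypothetical stand-in for the GNC/GADM list, the fuzzy matcher is Python's standard-library difflib (the matcher actually used is not named here), and the 0.8 cutoff is arbitrary.

```python
from difflib import get_close_matches

# Hypothetical stand-in for the GNC/GADM "known places" list:
# name -> (latitude, longitude, ISO country code).
KNOWN_PLACES = {
    "London": (51.5074, -0.1278, "GB"),
    "Paris": (48.8566, 2.3522, "FR"),
    "Vienna": (48.2082, 16.3738, "AT"),
    "Constantinople": (41.0082, 28.9784, "TR"),
}

def geocode(ner_place, known=KNOWN_PLACES, cutoff=0.8):
    """Return (matched_name, lat, lon, country), or None if nothing is close enough."""
    lookup = {name.lower(): name for name in known}
    query = ner_place.strip().lower()
    if query in lookup:
        # Exact match (ignoring case and surrounding whitespace).
        name = lookup[query]
        return (name, *known[name])
    # Otherwise fall back to the single closest known name above the cutoff.
    candidates = get_close_matches(query, list(lookup), n=1, cutoff=cutoff)
    if not candidates:
        return None
    name = lookup[candidates[0]]
    return (name, *known[name])
```

Because each known place already has coordinates attached, a successful match returns the geocode directly; unmatched NER strings return None and can be dropped or reviewed by hand.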

Finally, the points were aggregated to the national level and filtered by year, yielding the dataset that this map visualizes.
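The aggregation can be pictured with a short sketch like the one below. The mentions list and the function name are hypothetical, and the data is made up purely to show the shape of the computation.

```python
from collections import Counter

# Hypothetical geocoded mentions: (year, ISO country code), one per place mention.
mentions = [(1785, "GB"), (1785, "FR"), (1785, "GB"), (1786, "GB")]

def counts_for_year(mentions, year):
    """Count place mentions per country for a single year."""
    return Counter(country for y, country in mentions if y == year)
```

Each call yields the per-country counts that the map would shade for the selected year.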

Caveats

The dataset is by no means perfect. Geocoding ran into several obstacles; here are a few, along with the solutions employed to deal with them:

This is a work in progress and there are still issues. The most significant consequence is that US locations are overrepresented, so counts for the US should be taken with a grain of salt.