Cloudy Place Names

I've been playing some more with the database of place names from the US Board on Geographic Names. I decided to make some word clouds to get a visual idea of what pops up a lot. The first cloud is generated from all US place names. Click on it for a bigger view:

US Place Names

Yipes.

Then I decided to do some filtering. I generated a list of fairly boring 'stopwords', which I then removed from the clouds: directional words like 'North' and 'South', along with a bunch of nouns like 'Church' or 'Cemetery', and geographical words like 'Hill', 'Pond', etc. I left some behind if they weren't too common, but I might go back and clean out all nouns later. Then I generated a cloud of just US towns/population centers. Apparently we have a lot of 'Estates':

US Town Names, Common Words
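The stopword filtering described above can be sketched roughly like this. It's a minimal sketch, not the script I actually ran: the stopword set is a small made-up sample, the `word_counts` helper is hypothetical, and the sample names are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical sample of stopwords; the real list was longer.
STOPWORDS = {"north", "south", "east", "west",
             "church", "cemetery", "hill", "pond"}

def word_counts(names, stopwords=STOPWORDS):
    """Count words across place names, skipping stopwords (case-insensitive)."""
    counts = Counter()
    for name in names:
        # Split each name into alphabetic words (apostrophes allowed).
        for word in re.findall(r"[A-Za-z']+", name):
            key = word.lower()
            if key not in stopwords:
                counts[key] += 1
    return counts

# Invented names, just to exercise the function.
sample = ["North Oak Hill", "Oak Pond", "Sunset Estates", "Oak Estates Cemetery"]
counts = word_counts(sample)
print(counts.most_common(2))
```

The resulting word frequencies are what a word cloud tool would size the words by.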

After that, I decided to do a cloud of all US places. This includes population centers, buildings, mountains, lakes, etc. Every location in the US with a name is in this database. I filtered out all those same stopwords and got this:

US Place Names, Filtered, with Churches

This image includes churches, which clearly are a huge chunk of the things we name. We have a hell of a lot of churches. Anyway, when you filter those out, you get this image:

US Place Names, Filtered, without Churches

Finally, here's an image with just church names:

US Place Names, Churches
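The towns, with-churches, without-churches, and churches-only clouds differ only in which kinds of features are kept. Here's a rough sketch of that step, assuming a pipe-delimited file with `FEATURE_NAME` and `FEATURE_CLASS` columns and class values like 'Church' and 'Populated Place'. Those column names and values are assumptions about the data layout, the rows are made up, and `names_by_class` is a hypothetical helper.

```python
import csv
import io

# Made-up rows; the column names and class values are assumptions.
SAMPLE = """FEATURE_NAME|FEATURE_CLASS
Mount Zion Church|Church
Sunset Estates|Populated Place
Bear Lake|Lake
Saint Paul Church|Church
"""

def names_by_class(lines, include=None, exclude=None):
    """Return feature names, optionally keeping or dropping feature classes."""
    names = []
    for row in csv.DictReader(lines, delimiter="|"):
        feature_class = row["FEATURE_CLASS"]
        if include is not None and feature_class not in include:
            continue
        if exclude is not None and feature_class in exclude:
            continue
        names.append(row["FEATURE_NAME"])
    return names

churches = names_by_class(io.StringIO(SAMPLE), include={"Church"})
without_churches = names_by_class(io.StringIO(SAMPLE), exclude={"Church"})
towns = names_by_class(io.StringIO(SAMPLE), include={"Populated Place"})
print(churches, without_churches, towns)
```

With a real file, you'd pass an open file handle instead of the `io.StringIO` sample, then feed the names into the word counting step.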

For my next trick, I'll probably put some of these places on a map to see what they look like.

Filed under: data, place names