All that work in identifying names in Photo-Era is paying off. I have started looking at how people, and especially women, are mentioned in Photo-Era.
Frequency by name
There are lots of ways we could look at the names. The simplest is just seeing how often people are mentioned by name (including their variants). [I’ve updated that previous post with a chart showing the overall frequencies of names in Photo-Era by gender.] One interesting thing is that in the top 10 most frequent names, only one is a woman: Katherine Bingham. In addition to being a photographer herself, she was an assistant editor at Photo-Era for several years. (Here is a post by the St. Johnsbury Historical Society about a picture of hers that is in the Library of Congress collection, and Lee talked about her in her PHSNE presentation.) Another interesting thing about the frequencies is that of the top 10 women and the top 10 men, 5 of the top women have Wikipedia entries, but only 3 of the top men do. (I’m omitting the names for which I have not determined a gender, but they don’t have any Wikipedia entries in any case.)
|Woman|Rank|Rank within women|Wikipedia|
|---|---|---|---|
|Elizabeth Flint Wade|17|2|Elizabeth Flint Wade|
|Gertrude Kasebier|41|3|Gertrude Kasebier|
|Nancy Ford Cones|70|7|Nancy Ford Cones|
|Gerhard Sisters|87|9|Gerhard Sisters|
|Emily H. Hayden|99|10|Emily H. Hayden|
|Man|Rank|Rank within men|Wikipedia|
|---|---|---|---|
|William S. Davis|2|2|William S. Davis|
|Paul Lewis Anderson|7|6|Paul Lewis Anderson|
|Clarence H. White|9|8|Clarence H. White|
Frequency by nearby mentions
Another way to look at the names is by considering what other names are mentioned nearby. The motivation is that people who are mentioned together probably have some connection, which we can try to discover after finding them together. “Nearby” is rather vague, but we can make it more precise. First, we can measure the distance between names in terms of words and punctuation (“tokens” in linguistics jargon). Then we can limit ourselves to the closest 1% of distances, which turns out to be any pairs that are within 26 tokens of each other. This leaves us with 30,976 unique pairs that make up the closest 1% of the distances, out of the 55,851 total pairs (which also means that over half of the pairs are “nearby” pairs). That gives us plenty to work with.
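To make the idea of token distance and the 1% cutoff concrete, here is a minimal Python sketch. It is not the code behind this post: it assumes the name mentions in a text have already been identified and normalized into `(token_index, name)` tuples, and the function names `pairwise_distances` and `closest_fraction_cutoff` are made up for illustration. Distance is taken as the difference between the starting token positions of two mentions, which is one possible choice among several.

```python
import math
from itertools import combinations


def pairwise_distances(mentions):
    """Return (name_a, name_b, distance) for every pair of mentions of different names.

    `mentions` is a list of (token_index, name) tuples for one text (an assumed format).
    """
    pairs = []
    for (i, name_a), (j, name_b) in combinations(mentions, 2):
        if name_a != name_b:
            pairs.append((name_a, name_b, abs(j - i)))
    return pairs


def closest_fraction_cutoff(distances, fraction=0.01):
    """Return the largest distance that still falls in the closest `fraction` of distances."""
    ordered = sorted(distances)
    if not ordered:
        raise ValueError("no distances to rank")
    # Number of distances that make up the closest fraction (at least one).
    k = max(1, math.ceil(len(ordered) * fraction))
    return ordered[k - 1]


# Toy data: three mentions at token positions 0, 5 and 40.
mentions = [
    (0, "Wilfred A. French"),
    (5, "Alonzo H. Beardsley"),
    (40, "Phil M. Riley"),
]
pairs = pairwise_distances(mentions)
cutoff = closest_fraction_cutoff([d for _, _, d in pairs])
```

With real data, the list of distances would be collected across every issue before the cutoff is computed, so that the 1% threshold reflects the whole corpus rather than a single text.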
As an example, here are the top 4 co-occurring pairs of names, with the average distance between them.
|Person 1|Person 2|Occurrences|Average Distance|
|---|---|---|---|
|Alonzo H. Beardsley|Wilfred A. French|104|6.7|
|Phil M. Riley|Wilfred A. French|98|8.1|
|Katherine Bingham|Wilfred A. French|62|14.0|
|Elizabeth Flint Wade|Wilfred A. French|57|12.7|
There is nothing important about the designations “Person 1” and “Person 2” — “Person 1” is the name in the pair which is alphabetically first (by first name). What is striking is that the top 4 pairs all involve Wilfred A. French. However, once we realize that he was the main editor (and sometimes owner) of Photo-Era, and the other people were all also editors at one time or another, then we understand why their names would occur near each other so often: they would be together on the masthead. (Interestingly, that Wikipedia entry for Photo-Era does not mention either of the women editors that we see here, though it does mention other male editors.)
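A table like the one above could be produced along the following lines. This is only a sketch under assumptions: the input format (a list of `(name_a, name_b, distance)` tuples) and the function name `summarize_pairs` are hypothetical, and the 26-token cutoff is passed in as a parameter. Sorting each pair alphabetically is what makes "Person 1" the alphabetically first name, so the same two people always land in the same row.

```python
from collections import defaultdict


def summarize_pairs(pair_distances, cutoff=26):
    """Count nearby co-occurrences per pair of names and average their distances.

    `pair_distances` is an iterable of (name_a, name_b, distance) tuples (an assumed format).
    Only pairs within `cutoff` tokens of each other are counted.
    """
    stats = defaultdict(lambda: [0, 0])  # pair -> [occurrences, total distance]
    for name_a, name_b, distance in pair_distances:
        if distance > cutoff:
            continue
        # Alphabetical order (by first name) gives a stable key for the pair.
        key = tuple(sorted((name_a, name_b)))
        stats[key][0] += 1
        stats[key][1] += distance
    rows = [
        (person_1, person_2, count, total / count)
        for (person_1, person_2), (count, total) in stats.items()
    ]
    # Most frequent pairs first; ties broken alphabetically.
    rows.sort(key=lambda row: (-row[2], row[0], row[1]))
    return rows


# Toy data: two nearby mentions of the same pair, plus one pair that is too far apart.
rows = summarize_pairs([
    ("Wilfred A. French", "Alonzo H. Beardsley", 6),
    ("Alonzo H. Beardsley", "Wilfred A. French", 8),
    ("Phil M. Riley", "Wilfred A. French", 30),
])
```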
This is a nice example of the what and the why of the names. Mostly in these first few posts, I’ll have whats rather than whys. Hopefully, more whys will come later…
This gives you a taste of the kind of things I’ll be looking at. In the next post, I’ll do a deeper dive into the “nearby” names. Stay tuned …
If you look for these names in my Photo-Era Search tool, you might not find what you expect. For example, if you look for Alonzo H. Beardsley AND Wilfred A. French, there are only 10 results in 2 issues, which is not nearly enough. One big difference is the variants: Beardsley is most often referred to as A. H. Beardsley, but my normalization used here is his full name.
On the other hand, if we look for A. H. Beardsley AND Wilfred A. French, we get 1041 results in 148 issues, which is way too many. Here we need to keep in mind that the search tool looks for the names anywhere in the same issue, whereas the information reported above is only for those pairs within 26 tokens of each other. That means that A. H. Beardsley and Wilfred A. French get mentioned a lot more often further apart, or even separately, than they do close together.
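The difference between the two kinds of counting can be sketched in a few lines of Python. This is an illustration only, not how the search tool is actually implemented: the `issues` dictionary, its `(token_index, name)` format, and the function name `compare_counts` are all assumptions.

```python
def compare_counts(issues, name_a, name_b, window=26):
    """Contrast issue-level co-occurrence with nearby co-occurrence.

    `issues` maps an issue id to a list of (token_index, name) mentions (an assumed format).
    Returns (number of issues containing both names, number of mention pairs within `window`).
    """
    issues_with_both = 0
    nearby_pairs = 0
    for mentions in issues.values():
        positions_a = [i for i, name in mentions if name == name_a]
        positions_b = [i for i, name in mentions if name == name_b]
        # An AND-style search only asks whether both names appear somewhere in the issue.
        if positions_a and positions_b:
            issues_with_both += 1
        # The nearby measure counts only mention pairs that are close together.
        nearby_pairs += sum(
            1 for i in positions_a for j in positions_b if abs(i - j) <= window
        )
    return issues_with_both, nearby_pairs


# Toy data: both names appear in two issues, but only once close together.
issues = {
    "issue-1": [(0, "A"), (10, "B"), (500, "B")],
    "issue-2": [(0, "A"), (900, "B")],
    "issue-3": [(5, "A")],
}
both, nearby = compare_counts(issues, "A", "B")
```

In the toy data, both names show up in two issues, yet only one pair of mentions is within the 26-token window, which mirrors the gap between the search results and the table above.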
Finally, the 26-token distance that marks the 1% cutoff is very close to the 25-token limit often used as the upper bound for long “sentences” when parsing corpora (sequences of words longer than that are often not really sentences). In other words, our nearby names can be thought of as being roughly within one long sentence’s distance of each other (though not necessarily in the same sentence).