By Daniel Hubbard | May 18, 2014
Records evolve. Some records are created to preserve specific information. These certainly change over time. Go far enough back in time and death records might not give the cause of death, something you expect to see today.
Other records are created to gather statistics. These, it seems, evolve much faster. If one simply wants to know the population of a place, one might just count. With enough people, that can go seriously wrong. You might instead count the number of people in each household and then add. That’s better, and you’d be less likely to lose count, but you have no way of knowing if two households with 4 people are really the same one counted twice. You need a way to label each household. You might use one person’s name. Next, you might want to know how old the people in the household are. Instead of writing a list of ages, you could count how many 0-10-year-olds there are and how many 11-20-year-olds there are in each household. You’d probably end up dividing the age groups by gender as well. After all, if you want to figure out how big an army you could have now or how big it might be in the near future, it is the number of males that you’d want to know. If you want to figure out how much the population might grow, then it is most useful to know the number of females in different age groups.
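For readers who like to see the bookkeeping spelled out, here is a minimal sketch of that kind of tally: one row per household, labeled by one person’s name, with counts by gender and age group. The function names, the sample household, and the 10-year bands are assumptions made for illustration and do not reproduce any real census form.

```python
from collections import Counter


def age_band(age, width=10):
    """Return an age-group label such as '0-10' or '11-20'.

    The bands follow the example in the text: 0-10, 11-20, 21-30, ...
    """
    if age <= width:
        return f"0-{width}"
    low = ((age - 1) // width) * width + 1
    return f"{low}-{low + width - 1}"


def tally_household(head, members):
    """Count household members by (sex, age band).

    `members` is a list of (sex, age) pairs. Only the head's name is
    kept, as a label for the household; everyone else becomes a count.
    """
    counts = Counter((sex, age_band(age)) for sex, age in members)
    return {"head": head, "counts": counts}


# A hypothetical household of five people.
household = tally_household(
    "John Doe",
    [("M", 42), ("F", 39), ("M", 8), ("F", 15), ("M", 6)],
)

for (sex, band), n in sorted(household["counts"].items()):
    print(household["head"], sex, band, n)
```

Notice that the two young sons end up as a single count of 2 in the male 0-10 group. The names and exact ages are gone, which is why such a tally cannot tell whether a child was also counted in another household.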
As time goes on, you might realize that you could really use more accurate age information, so you make smaller and smaller age categories until it takes a lot of space just to write out the categories, and most of them will be empty for most households. At that point, you might decide that it would be simpler to write down everyone’s name and age. That would make it less likely that someone is counted twice. After all, how do you know that Johnny Doe wasn’t a 6-10-year-old son in one household and a 6-10-year-old nephew in another? You would also need just one space for the age. Do you really need exact ages though? Maybe it would be good enough to round the ages down to the nearest multiple of 5? On the other hand, why do all that rounding? Rounding a number is easy, but rounding the 600th number when you’re tired can go wrong. Maybe the exact age would be best anyway.
You might also want to know how people are moving around. It could be good to ask them where they were born, but not too precisely. We don’t need a street name. Maybe we could ask what part of the country someone was from. Maybe we could ask if they were from this part of the country, another part, or another country altogether. Perhaps it would be interesting to know where someone’s parents were born? That would help to understand migrations. How about asking where someone lived before they came to the place where they are now?
All That Has Been Tried
Every one of those evolutionary steps occurred in census taking somewhere. At first it can seem mysterious. Why would information be recorded that way? Then you stop and think about it. Information costs. The more information you gather, the more you have to process. The more fine-grained the information, the more work it is to gather it into statistically useful chunks. From that perspective, it seems obvious to try to gather as little as possible and to gather it in a way that is already in those statistically useful chunks.
Soon you start to realize that there are problems with that strategy. If you gather very little information, you have very little way to decide if you have recorded someone once and only once. So you gather a bit more. You also realize that if you gather it in chunks that are too large (say, every male over the age of 44 in one age group, as in early U.S. censuses), then you might have questions that you can’t answer. So you evolve your census by adding age groups until you realize that you’ve taken it to such an extreme that it would be more efficient to get the exact information (it took 4 ledger pages to contain all the categories of the 1840 U.S. census). If you decided to kill two birds with one stone, minimizing double counting by taking down everyone’s name and getting rid of the age categories by taking down exact ages, then you end up with something like the 1850 U.S. Census. That wasn’t the only way to go. You could take down everyone’s name but round the ages. You wouldn’t need all those category boxes, but you would still get the ages in 5-year-wide categories if you rounded down to the nearest multiple of 5, as was done in the 1841 census of Great Britain.
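The arithmetic behind that rounding is simple enough to sketch. The snippet below only illustrates the rule as described above, rounding down to the nearest multiple of 5; the function names and sample ages are made up for the example, and it does not attempt to capture every instruction an enumerator actually received.

```python
def round_down_age(age, step=5):
    """Round an age down to the nearest multiple of `step`."""
    return (age // step) * step


def implied_age_range(recorded_age, step=5):
    """Return the span of exact ages that a rounded age could stand for."""
    return (recorded_age, recorded_age + step - 1)


# Hypothetical exact ages and what would be written down instead.
exact_ages = [23, 27, 44, 45, 49]
recorded = [round_down_age(age) for age in exact_ages]

for exact, written in zip(exact_ages, recorded):
    low, high = implied_age_range(written)
    print(f"exact {exact} -> recorded {written} (really {low} to {high})")
```

A recorded age of 45 therefore works like a 45-49 category box. A researcher who finds such an age has to treat it as a 5-year range, not an exact figure.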
What about place of birth? First, until you take down every name, you can’t really take down everyone’s place of birth. There might have been an intermediate stage where “unnamed 17-year-old male” was recorded as being born in New York, or some such thing, but I don’t know that any census ever worked that way as a rule. The problem with places in the census has always been the level of detail to record. In the U.S. Federal census, the solution was always state or country of birth. New York chose to be more precise and recorded the county of birth (if born in New York) in some of its state censuses. That greater precision when close to home is a philosophy that comes up both in census taking and in our everyday speech. When we are far from home, naming the closest big city to where we live is close enough.
In Britain there was an intermediate step before simply writing down the name of a place. In 1841 people were asked if they were born within the county where they were currently living. The meaning of a “yes” is clear, and you learn the county of birth. The meaning of a “no” is not so clear. It meant that they were currently living in the country of the U.K. (e.g. Scotland) where they were born, but not in the same county. If a person was living in a different country of the U.K. from where they were born, there would be no answer at all. Instead, the next column over would name the country or, if the person was born outside the U.K., it would simply indicate that the person was foreign. That probably came to seem both confusing and prone to error. Ten years later, exact town and county of birth started to be recorded, but the census followed the same thinking as the State of New York. That precision was only achieved close to home. If you were born in a different country of the U.K. from where you were living, exact information would not be recorded for many years.
What about other locations? The U.S. Federal census started to record parents’ places of birth in 1880 and recorded that until 1930. In 1940 only a subset of people had that information recorded. Who recorded where a person came from immediately before they came to where they were living? Kansas asked that question for a while when many people in their state census were likely to have come to Kansas from outside.