By Daniel Hubbard | May 18, 2014
Records evolve. Some records are created to preserve specific information. These certainly change over time. Go far enough back in time and death records might not give the cause of death, something you expect to see today.
Other records are created to gather statistics. These, it seems, evolve much faster. If one simply wants to know the population of a place one might just count. With enough people that can go seriously wrong. Maybe you might count the number of people in each household then add. That’s better and you’d be less likely to lose count but you have no way of knowing if two households with 4 people are really the same one counted twice. You need a way to label each household. You might use one person’s name. Beyond that, you might want to know how old the people in the household are. Instead of writing a list of ages, you could count how many 0-10-year-olds there are and how many 11-20-year-olds there are in each household. You’d probably end up dividing the age groups by gender as well. After all, if you want to figure out how big an army you could have now or how big it might be in the near future, it is the number of males that you’d want to know. If you want to figure out how much the population might grow, then it is most useful to know the number of females in different age groups.
As time goes on, you might realize that you could really use more accurate age information so you make smaller and smaller age categories until it takes a lot of space just to write out the categories and then most of them will be empty for most households. At that point, you might decide that it would be simpler to write down everyone’s name and age. That would make it less likely to count someone twice. After all, how do you know that Johnny Doe wasn’t a 6-10-year-old son in one household and a 6-10-year-old nephew in another? You would also need just one space for the age. Do you really need exact ages though? Maybe it would be good enough to round the ages down to the nearest multiple of 5? On the other hand, why do all that rounding? Rounding a number is easy but rounding the 600th number when you’re tired can go wrong. Maybe the exact age would be best anyway.
You might also want to know how people are moving around. It could be good to ask them where they were born but not too accurately. We don’t need a street name. Maybe we could ask what part of the country someone was from. Maybe we could ask if they were from this part of the country or any other part or another country altogether. Perhaps it would be interesting to know where someone’s parents were born? That would help to understand migrations. How about asking where someone lived before they came to the place where they are now?
All That Has Been Tried
Every one of those evolutionary steps occurred in census taking somewhere. At first it can seem mysterious. Why would information be recorded that way? Then you stop and think about it. Information costs. The more information you gather, the more you have to process. The more fine-grained the information, the more work it is to gather it into statistically useful chunks. From that perspective, it seems obvious to try to gather as little as possible and gather it in a way that is already in those statistically useful chunks.
Soon you start to realize that there are problems with that strategy. If you gather very little information, you have very little way to decide if you have recorded someone once and only once. So you gather a bit more. You also realize that if you gather it in chunks that are too large (Say, every male over the age of 44 in one age group as in early U.S. censuses), then you might have questions that you can’t answer. So you evolve your census by adding age groups until you realize that you’ve taken it to such an extreme that it would be more efficient to get the exact information (It took 4 ledger pages to contain all the categories of the 1840 U.S. census). If you decided to kill two birds with one stone and try to minimize double counting by taking down everyone’s name and get rid of the age categories by taking down exact age, then you end up with something like the 1850 U.S. Census. That wasn’t the only way to go. You could take down everyone’s name but round the ages. You wouldn’t need all those category boxes but you would get the ages in 5-year-wide categories if you rounded down to the nearest multiple of 5 as was done in the 1841 census of Great Britain.
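The rounding approach is easy to try out. Here is a minimal sketch in Python, assuming a made-up household (the names and ages are invented for illustration) and ignoring any exceptions the real enumerators' instructions may have made. It shows how rounding down to the nearest multiple of 5 hands you the 5-year-wide categories for free.

```python
from collections import Counter


def round_down_age(age: int, width: int = 5) -> int:
    """Round an age down to the nearest multiple of `width`."""
    return (age // width) * width


# Hypothetical household, invented purely for illustration.
household = [("John", 47), ("Mary", 44), ("Ann", 21), ("Tom", 19)]

# Keep every name, but record only the rounded age.
rounded = [(name, round_down_age(age)) for name, age in household]

# Each rounded age doubles as the label of a 5-year-wide category,
# so the statistics fall out of a simple count.
category_counts = Counter(age for _, age in rounded)

for name, age in rounded:
    print(name, age)
```

Nobody has to draw category boxes on the form. The clerk who tallies the results can group by the rounded number, which is the trade-off described above: fewer columns, at the price of one rounding step per person.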
What about place of birth? First, until you take down every name you can’t really take down everyone’s place of birth. There might have been an intermediate stage where “unnamed 17-year-old male” was recorded as being born in New York, or some such thing, but I don’t know that any census ever worked that way as a rule. The problem with places in the census has always been the level of detail to record. In the U.S. Federal census, the solution was always state or country of birth. New York chose to be more accurate and recorded the county of birth (if born in New York) in some of their state censuses. That greater accuracy when close to home is a philosophy that comes up in both census taking and in our everyday speech. When we are far from home, naming the closest big city to where we live is close enough.
In Britain there was an intermediate step before simply writing down the name of a place. In 1841 people were asked if they were born within the county where they were currently living. The meaning of a “yes” is clear and you learn the county of birth. The meaning of a “no” is not so clear. That meant that they were currently living within the U.K. country (e.g. Scotland) where they were born but not the same county. If a person was living in a different country of the U.K. from where they were born, there would be no answer at all. Instead the next column over would name the country or, if the person was born outside the U.K. it would simply indicate that the person was foreign. That probably came to seem both confusing and prone to error. Ten years later, exact town and county of birth started to be recorded but they followed the same thinking as the State of New York. That accuracy was only achieved close to home. If you were born in a different country of the U.K. from where you were living, exact information would not be recorded for many years.
What about other locations? The U.S. Federal census started to record parents’ places of birth in 1880 and recorded that until 1930. In 1940 only a subset of people had that information recorded. Who recorded where a person came from immediately before they came to where they were living? Kansas asked that question for a while when many people in their state census were likely to have come to Kansas from outside.
By Daniel Hubbard | May 11, 2014
I found a reference to a man, Anders, that I was researching. It was a reference to his drowning. I checked the death registers for the date that was given and sure enough, he was listed. So were two other men. There was a note squeezed into the margin. It was hard to read but the meaning was clear. Three farmers had gone out onto the lake in a boat on an early winter day and none of them survived. They were all buried the same day, longer than normal after they died, a hint that perhaps it had taken time to recover the bodies. On the date that he was buried, I found that Anders’s wife gave birth to his final child, a boy she named Anders. A sad and simple story of the kind that just waits for a genealogist to rediscover it.
The day after I discovered that story, it nearly slipped away again. My computer became hot and, unless I squeezed it in just the right way, the screen shimmered pink. Stories, facts, data and little details can all disappear. With a new computer and plenty of backups of the old one, nothing was lost. Data recovery is not always so easy.
Sometimes not everything is backed up but even bringing a little back can bring a thrill. If you ever do Irish research, you know just how much was lost there: so many records burned, destroyed as unnecessary, or reduced to pulp to make new paper. The 1821, 1831, 1841 and 1851 censuses are commonly regarded as having been totally lost during the Irish Civil War. Yet in some few cases there was a backup here and a document that survived there, and now the National Archives of Ireland has gathered and collated what remains of censuses enumerated almost two centuries ago.
Other information from those censuses survived in a different way. In 1908 a new pension system was instituted in the United Kingdom. To prove their ages, many wrote for extracts from the 1841 and 1851 censuses. Their names, ages, places of birth and parents still survive in those extracts even if the census returns themselves are long gone. Now what remains of those four censuses and the extracts made from them is online. They are both dazzling to see and saddening to see because they make clear how much information was lost. At least they are not quite gone.
Do you have backups of your data?
By Daniel Hubbard | May 4, 2014
Identity must be one of the most fascinating facets of genealogy.
Isak and Ovra were recorded twice as parents. Those were the names that were written into the birth registers. When one of their sons died the information about his parents matched Isak and Ovra with one exception: their names. Ignatz and Charlotte were the dead man’s parents. So much for Isak and Ovra as the identities of the parents.
Isak and Ovra lived in Austria. Ignacz and Lotti were found in Hungary. So much for Isak and Ovra as the parents.
Living with Ignacz and Lotti were their two sons—sons with the same names and years of birth as the sons of Isak and Ovra. Interesting. Nevertheless Ignacz and Lotti’s boys were born at Bécs. Isak and Ovra’s boys were born in Vienna. So they can’t be the same.
The only question left is the identity of Bécs. Bécs is Hungarian for Vienna. Isak is Ignatz. Ovra is Lotti. Lotti’s American descendants knew her as Charlotte and her son and his descendants gave his place of birth as Vienna. Identity can be a subtle and mysterious thing.
By Daniel Hubbard | April 27, 2014
There is something special about those times when records don’t simply stop but fade. Times long enough ago that identity was seen differently. A man is not known by two names but by one name and the place where he lived. When his child was born, all that was recorded was the date, his given name and the place. No mother. No name for the child. Not even the child’s gender, just the date, the place and the father.
In more recent times the wonder of genealogy often comes from reconstructing lives and resurrecting stories. Out at the edge of what is possible, the wonder comes from being able to see through the fog of time just well enough to know a single mysterious name. So many people who will never have their lives reconstructed stand out there at the limit. A vast multitude like the masked chorus of an ancient Greek play. We hear them as one because we cannot hear them separately.
Connections are there but they hang by a thread, tenuously.
By Daniel Hubbard | April 21, 2014
The full Moon occurs when the Sun and the Moon are on opposite sides of our world. We see the full face of the Moon illuminated by the Sun because as we turn up our eyes to the Moon, the Sun is shining from below our feet, lighting up the far side of the Earth. This month we were treated to the most spectacular sort of full Moon. The Sun and the Moon were not just roughly in opposite directions, they were in exactly opposite directions. The Moon passed through the Earth’s shadow and grew darker and darker until it turned sunset red. That moment when the entire Moon seems to turn to blood is the time when it is most obvious that the moment of the full Moon has been reached.
The full Moon this month is also special because it is the one that defines when Easter is celebrated and it is problems with the calculation of Easter that gave us the modern (Gregorian) calendar. Every genealogist and historian who has researched in a time and place before the arrival of the modern calendar or in a culture where it is only one of the calendars in use, needs to learn to navigate the transitions from one way of looking at time to another.
So, how does the Moon help determine Easter’s date? Easter is defined as the first Sunday after the first full moon on or after the vernal equinox. Well, that is what is often said and it is approximately true but we need to make a few corrections.
The first correction, and it is perhaps a bit of nitpicking, is that “vernal” means spring and anyone living south of the equator might be justified in complaining that Easter takes place in the fall, not the spring. Either March equinox or Northern vernal equinox might be a better term.
Then what about the word “equinox”? The term means when day and night are equal. “Equinox” comes from Latin words meaning “equal night.” More accurately though, the equinox occurs at the particular moment when the sun is directly above a point on the Earth’s equator. The date that this moment occurs in any given year depends on where you are on the globe and can be one day different depending on your position. If that one day shift mattered, then people in some places might be celebrating Easter more than a month earlier than people in other locations. In the end though, that turns out not to happen. For purposes of computation the equinox isn’t actually used. Instead of the true equinox, which can occur on different dates in different years and on different dates depending on where you are on the Earth, the date March 21 is used. No need to observe the Sun, just use a calendar.
That calendar-based solution works well as long as March 21 always occurs about when the equinox occurs. If your calendar drifts, you eventually have problems. The old Julian calendar drifted about three days every 400 years. That might not seem like much but eventually Easter began to occur later and later in the year. At different times in different places, Pope Gregory’s calendar replaced the calendar of Julius Caesar. Days were dropped from years to bring the calendar back into synchronization with the Sun.
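The three-day figure can be checked by counting leap days. Here is a small sketch, assuming the usual leap-year rules (every fourth year in the Julian calendar, with century years skipped unless divisible by 400 in the Gregorian). Strictly speaking it measures how far the two calendars drift from each other, which is very close to how far the Julian calendar drifts from the Sun.

```python
def is_julian_leap(year: int) -> bool:
    """Julian rule: every fourth year is a leap year."""
    return year % 4 == 0


def is_gregorian_leap(year: int) -> bool:
    """Gregorian rule: century years are leap years only if divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Any 400-year window gives the same answer; this one is an arbitrary choice.
years = range(1601, 2001)

julian_leap_days = sum(is_julian_leap(y) for y in years)
gregorian_leap_days = sum(is_gregorian_leap(y) for y in years)

# Extra days the Julian calendar inserts over those 400 years.
extra_days = julian_leap_days - gregorian_leap_days
print(julian_leap_days, gregorian_leap_days, extra_days)  # 100 97 3
```

The three extra days come from 1700, 1800 and 1900, which were leap years under the old rule but not under the new one.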
The second group of corrections has to do with the Moon. We think of the full Moon occurring on a specific day but of course, as we realized when thinking about lunar eclipses, the Moon is actually full at one precise moment. The problems with the full Moon are the same as the problems with the equinox. For the purpose of setting the date of Easter, the date of the full Moon is set by mathematical tables. The moment when the Moon is truly full may happen on the day given in the tables or a few days before or after.
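For the curious, the tabular rule can be worked through in code. What follows is a sketch, not the churches' own tables: it uses two widely published arithmetic shortcuts (often called the anonymous Gregorian algorithm and Meeus's Julian algorithm) that reproduce the results of the tables. The function names are mine, and the Orthodox date is converted to the modern calendar with the standard century-based offset, which I am assuming applies because Easter always falls in March or April.

```python
from datetime import date, timedelta


def western_easter(year: int) -> date:
    """Easter by the Gregorian tables (anonymous Gregorian algorithm)."""
    a = year % 19                      # position in the 19-year lunar cycle
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30  # days from March 21 to the tabular full Moon
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7  # days on to the following Sunday
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


def orthodox_easter(year: int) -> date:
    """Easter by the original Julian tables, returned as a Gregorian date."""
    a, b, c = year % 4, year % 7, year % 19
    d = (19 * c + 15) % 30
    e = (2 * a + 4 * b - d + 34) % 7
    month, day = divmod(d + e + 114, 31)
    # The month and day so far are Julian-calendar numerals. Shift them by
    # the number of days the two calendars differ in this century.
    offset = year // 100 - year // 400 - 2
    return date(year, month, day + 1) + timedelta(days=offset)


for year in (1584, 2011, 2014, 2017):
    print(year, western_easter(year), orthodox_easter(year))
```

Notice that neither function ever looks at the sky. March 21 and the full Moon are both baked into the arithmetic, which is exactly the point made above. For 2014 both functions return April 20.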
Does it Make a Difference?
It certainly matters to the day you celebrate Easter. This year (2014) is special because Western churches (using updated calculations) and Orthodox churches (using original calculations) celebrate Easter on the same day. That last happened in 2011 and won’t happen again until 2017. As the Julian and Gregorian calendars drift apart it will happen less and less. April 24, 2698 will be the last time that the same date is used by both groups. Already now, the dates of Easter can be as much as 5 weeks apart.
To the genealogist and the historian, it makes a difference. The calendar is our map of time. Use the wrong map and we are lost, though we will think we know where, or rather when, we are. In different places, different cultures and different times, the calendar has been and still is different from what we expect. If you have ancestors from different Christian denominations, then records with dates based on Easter, or on any holy day related to Easter, can mean very different dates. In 1584, the second year after the new calendar began to be adopted, Catholics celebrated Easter on a date four weeks different than other groups. In some areas of Europe the changes made to the lunar tables were accepted decades after the change to the calendar, leading to still more dates for Easter.
There was no one map of time. There was, in fact, a giant atlas.
By Daniel Hubbard | April 14, 2014
David J. Hand, the author of the book The Improbability Principle, has a list of items that go into his principle:
- The Law of Inevitability
- The Law of Truly Large Numbers
- The Law of Selection
- The Law of the Probability Lever
- The Law of Near Enough
In the book he describes events that seem like they should never happen and shows how those laws make those events not particularly surprising.
The Odds of Finding Something
This week I’ve been tracing immigrants back to Sweden. That means trying to determine information in American records that will be useful in matching to Swedish records. That has put the Law of Near Enough on my mind. Put simply, that law means that the odds of finding something increase as you loosen your criteria for what you consider to be a match. It is easy to see how that could contribute to our sense that a coincidence has occurred. If we think that two things happened at the same moment, we might find it interesting. We might still find it interesting if they occurred during the same hour. What if they happened during the same day? The same month? Somewhere, as we alter the amount of time, those occurrences go from interesting to boring if we stop and consider the span of time. If we don’t think about it, we can be fooled into thinking that something is wildly unlikely when it is actually rather probable.
Often we don’t even think about the fact that criteria exist but, like the time span above, they do and they affect our research. In genealogy we are always trying to determine if two records correspond to the same person. What criteria do we use? We can look at the opposite of the Law of Near Enough. We can restrict the criteria we use until we will never match records with each other. How about an obituary that claims that a person died at 4am and a death certificate that matches the obituary in every detail but one: it records the time of death as 3:56am? Clearly different people, right? Wrong. Our allowable difference between the times is ridiculously small.
Where do we draw the line? If a document records a man’s age as 50 and the person we are looking for would have been 51 at the time, is that “near enough”? Now we need to start answering questions. How common was his name? How close was the record made to the expected place (another “near enough” question)? How likely is it that the man’s age was rounded down? How likely is a misunderstanding? Did anyone have a reason to lie? Might the person who made the statement not have known any better? Might the informant have given an accurate version of their uncertainty only to have the clerk write down something exact? Could “about 50” have been said to a clerk who “simplified” the answer to “50”?
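The Law of Near Enough is easy to watch in action. Here is a small sketch, with every name and year invented for illustration, that matches a target person against a handful of candidate records and lets us loosen the allowed gap in birth year. The `Record` class and the `near_enough` test are my own simplifications. Real matching would weigh place, spelling and much more.

```python
from dataclasses import dataclass


@dataclass
class Record:
    name: str
    birth_year: int


def near_enough(a: Record, b: Record, max_year_gap: int) -> bool:
    """Match on exact name and on birth years within `max_year_gap`."""
    return a.name == b.name and abs(a.birth_year - b.birth_year) <= max_year_gap


# Hypothetical target and candidates, invented for this example.
target = Record("Anders Nilsson", 1851)
candidates = [
    Record("Anders Nilsson", 1850),
    Record("Anders Nilsson", 1853),
    Record("Anders Nilsson", 1846),
    Record("Anders Nilsen", 1851),  # right year, different spelling
]


def matches(max_year_gap: int) -> list[Record]:
    return [c for c in candidates if near_enough(target, c, max_year_gap)]


for gap in (0, 1, 2, 5):
    print(gap, len(matches(gap)))
```

With no tolerance at all nothing matches, which is the 3:56am problem. Open the window to five years and three different men qualify, which is the opposite problem. The man with the differently spelled surname never matches because the name test is exact, and deciding whether that spelling is near enough is one more criterion we would have to choose.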
An Old Story
An apparently old story, which I have to admit I never heard before, involves a traveler going down the road past a barn. The side of the barn was peppered with arrows, each sitting dead center in a target. The traveler concluded that a master archer lived there until turning the corner to discover a man carefully painting targets around each one of a set of arrows. As genealogists, we are stuck (so to speak) with the arrows that were shot into the barn generations ago but we do need to think about the targets that we paint around them. Draw targets that are too small and no two arrows will sit in the same target. Draw targets that are too large and arrows that have nothing to do with each other will be in the same bullseye.
So how big do we draw the targets around our documents? How far out can a document be and still be a match for another document? Make the target large enough and we will find something but that something will very likely be a false positive—something we consider “near enough” when it really isn’t. When we thought a moment ago about how common a person’s name was, that was a false positive type of thing to worry about. A woman with a very strange and unusual name who was within a few years of the correct age and found in almost the right place is much less likely to be a false positive than is a man named James Smith in otherwise identical circumstances.
Then there is the question of confirmation. The obituary and the death certificate that differ by 4 minutes in the reported time of death don’t need confirmation in order to conclude that, with everything else matching, they record the same person. On the other hand, if we don’t find a match and expand our target, what kind of confirmation will we need to really show that the bigger target was justified? Expanding our target often means needing to find more arrows that hit it.
By Daniel Hubbard | April 6, 2014
In light of David Letterman’s impending retirement, here are the-
Top ten reasons to give up genealogy
10 Tired of the search result “Buy Ichabod Whittleby products on Amazon.com!”
9 Thought people were kidding about the 1890 census.
8 Upset by DNA match to Justin Bieber.
7 Discovered that someone has copied your idea to fill a hollowed out mountain with genealogy records.
6 Ancestral castle in Scotland turned out to be a White Castle® in Scottsburg.
5 Discovered great-grandma died in infancy.
4 Tired of accusations of plagiarism over your book, Roots.
3 That lying clerk’s insistence that the courthouse had burned when you were clearly standing inside it.
2 Your starring role in the local theater production of Who Do You Think You Are failed to land you a part on the TV show.
1 Realized that all the interesting ancestors are already taken.
If any of these reasons actually seem correct and reasonable to you, don’t give up, take a deep breath and forge on.
By Daniel Hubbard | March 30, 2014
When I studied ancient history in this university many years ago, I had as a special subject “Greece in the period of the Persian Wars.” I collected fifteen or twenty volumes on my shelves and took it for granted that there, recorded in these volumes, I had all the facts relating to my subject. Let us assume – it was very nearly true – that those volumes contained all the facts about it that were then known, or could be known. It never occurred to me to inquire by what accident or process of attrition that minute selection of facts, out of all the myriad facts that must once have been known to somebody, had survived to become the facts of history.
—historian E. H. Carr as quoted in The Improbability Principle
We don’t generally research events of quite the historical importance of the Persian Wars. It was a period of fifty years that involved hundreds of thousands of soldiers and civilians and left a deep imprint on the future of the world but even so, “fifteen or twenty volumes” could contain the sum total of the information about that half century of intermittent war. It makes one wonder about the “process of attrition” that has limited what we can learn about our own, more immediate, ancestors.
I’ve been researching a woman who applied for a Civil War widow’s pension. She was required to prove various things in order to be entered into the pension rolls. Obviously, her husband must have died. That she could prove. Equally obviously, the dead man must have been her husband. She had been married previously so she also needed to prove that her first husband had died. Those last two important events had occurred in Chicago in the years before the Great Fire of 1871 and she could not prove either of them. There was already attrition of facts in her past.
Records burn and memories fade. How often do we run into someone who consistently says that her father was from New York and her mother was from Ohio, only to find that her parents consistently said that they were from Germany and New Jersey? How often do we find the phrase “don’t know” written into the census or a death certificate?
As researchers we are often forced to rely on implications and reason to try to replace what had once been known, what had once been clearly documented. We painstakingly rebuild what attrition of facts has torn down. That, I think, is when searching for the personal past goes from interesting to sublime.
People who are interested in genealogy but not genealogists themselves are often amazed not just by what was found but by what can be found and pieced together. They are fascinated by the fact that it can be known. That amazement is something we ought to remember. We ought to take the time on occasion to sit back and ponder or perhaps “to inquire by what accident or process of attrition that minute selection of facts, out of all the myriad facts that must once have been known to somebody, had survived to become the facts” of our family history.
By Daniel Hubbard | March 23, 2014
The other day I went to the library to pick up a book I had on hold. I found the book and a magazine on the shelf waiting for me. I checked them out and was casually flipping through the magazine while I waited for my daughter’s choir practice to wrap up. In the back of the magazine I saw a picture that would not have been familiar to me just an hour earlier. It was a picture of the book that I had just checked out along with the magazine. What a coincidence!
Well, it isn’t really a coincidence. I had the book on hold because I had heard an author interview and it sounded interesting. Authors usually do interviews about their books because the books have just been published. Recently published books are more likely to be found in full-page advertisements at the back of a magazine than are other books. It is also more likely for a person who is interested in a book to also be interested in a magazine whose subject matter means that it is a good place to run an ad for the book. My choice of that book and that magazine was not random. I do have to admit that it is amusing that the book in question is The Improbability Principle, a book about the nature of coincidences and why we should actually expect them.
What is a Coincidence?
What genealogist hasn’t run into what seems like a remarkable coincidence when researching their family? Some of the toughest things for genealogists to handle are those “coincidences.” Some coincidences are meaningless and we need to weed them out and disregard them or they will waste our time and even lead us to wrong conclusions. Other coincidences are meaningful and are major clues. How do you tell them apart?
The first thing to do is to define what we really mean by “coincidence.” The parts of the word simply indicate that something happened at roughly the same place and time as something else. That doesn’t really do it though. Coincidences need to have some sort of surprise factor. There can’t be a cause and effect relation. If you throw a rock at a window and then hear the sound of shattering glass, you would not call it a coincidence. If you threw the same rock at a tree and heard the sound of shattering glass just as you saw the rock hit the tree, you would probably think of it as a coincidence, and a pretty strange one at that. A more genealogical example might be finding a man with a very unusual name in a small town and then finding a different man in the same town with the same name fifty years later. Coincidence or a child named for his grandfather?
Another set of occurrences that we shouldn’t really think of as coincidences are things that have the same cause. It isn’t a coincidence when the snow melts and the first plants start to come up in the spring. Both are brought on by the arrival of warmer weather.
Events that don’t have any apparent connection don’t qualify as coincidences either. You would not think “Wow, what a coincidence!” if the dog two doors down started to bark at the same moment you took a sip of tea.
Coincidence to Clue in Genealogy
If we find something that looks like a coincidence, what should we do? Can we immediately conclude that we have something meaningful and go on from there? No, given the number of people who have existed in the last few hundred years and the number of things they did that were recorded, there is plenty of room for chance to make things seem to be connected when they are not. In fact, it would be strange if chance did not make a few things seem to be connected when they are not. Lots of things happen and some of them will randomly seem to be related. Can we immediately conclude that our coincidence is meaningless? No, it might be that what looks like two unconnected random facts are actually connected by cause and effect. One may have caused the other, or they might be the results of a common cause or share some common factor.
One question that needs to be asked is, what is the chance that there is something behind this coincidence? Figuring that out means digging into the situation. If you just discovered that one of your ancestors was in the Civil War and you look at the names of the other men in his unit and see a name that looks familiar, is it random or meaningful? If the name is familiar because someone of that name lived in the same town that you suspect your ancestor lived in before the war, then it might be meaningful. Nevertheless, you could only conclude that after you learn that Civil War units were usually made up of men from the same area. You can also ask what the chance is that two people with the same unusual last name come from the same place purely at random. Is the population of that place 50 or 50,000? Is the name unusual in general but common among an ethnic group that is numerous in that area? All of these things help to decide if the coincidence is worth pursuing.
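To see why the size of the place matters so much, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the surname belongs to 1 person in 10,000 and that everyone in town carries it or not independently of their neighbors. Real families cluster, so this is at best a starting estimate.

```python
def chance_of_namesake(population: int, name_frequency: float) -> float:
    """Probability that at least one of `population` unrelated people
    bears the name, assuming each does so independently."""
    return 1 - (1 - name_frequency) ** population


# Hypothetical frequency: 1 person in 10,000 has this surname.
name_frequency = 1 / 10_000

small_town = chance_of_namesake(50, name_frequency)
large_town = chance_of_namesake(50_000, name_frequency)

print(f"population 50:     {small_town:.1%}")
print(f"population 50,000: {large_town:.1%}")
```

Under those assumptions a chance namesake in a town of 50 is very unlikely, so the shared name is probably a clue. In a town of 50,000 an unrelated namesake is close to certain, and the same coincidence tells you very little. If the name is common in a local ethnic group, the frequency goes up and the small town starts to look like the large one.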
The other question that you should ask is, “If this was more than a coincidence, how could these things be connected?” There might be no way for them to be connected or there might be several. If there is no way, then it is time to go on. If there is a way or even several different ways, then you need to think about the next question: “What would a possible relationship between these things mean?” What kind of evidence might exist if any given possible connection was correct? Once you are able to ask that question, you have turned a coincidence into a clue and that is cause for a little celebration.
By Daniel Hubbard | March 16, 2014
Sure, we already have Family History Month in October but as my kids have pointed out to me lately, to be cool now, your activity needs to have a specific day with a funny connection to what it is supposed to celebrate. For example, earlier this month, grammarians celebrated grammar day on March 4. They are clearly required to celebrate as well. It is an imperative, literally. The day is March Fourth or as a clear order “March forth!” It is apparently a day for grammar gurus to take to the streets.
As I write this, my little math whizzes are coming down from a Pi Day induced sugar high. Clearly then, I am writing this paragraph on March 14, or rather 3/14, the first three digits of the famous mathematical constant π (pi, 3.14159…). In the rest of the world, where people more sensibly put the month in the middle of the date rather than at the beginning, it needs to be explained why the 14th of March is Pi Day. Unfortunately, the more sensible ordering would put Pi Day on 31/4, the 31st of April. That would be a good candidate for “National Confused by the Calendar Day” but not for Pi Day. Pi Day has the plus side that “pi” sounds like “pie,” and pies are appropriately round and also good at grabbing children’s interest, hence the Pi Day pie sugar highs.
In a month and a half my son will draw his light saber in celebration of Star Wars Day. It falls on May the 4th, as in “May the Fourth be with you.” If you don’t get the joke, you clearly have not spent much time with an elementary or middle schooler, or anyone sufficiently geeky.
Clearly to be cooler, genealogy needs a day like this. One problem is that where pi has a clear tie to pie, making Pi Day even more popular, genealogy has trees and the food that looks most like trees is broccoli. I like broccoli but my kids just spent the afternoon at the library doing pi based activities and eating pie. While the library would be a great place to celebrate Genealogy Day, eating broccoli at the library won’t get the kids out in quite the same numbers.
Perhaps we need to forget the food tie-in but what about a good day?
January 1 “First one” to remind people to start with themselves when researching but it’s already national hangover day. I don’t want to be on the road on New Year’s Day and I don’t want to see the trees that might be put online that day either.
February 2 is nice and binary like Ahnentafel numbers but “Two Two” sounds like it ought to be National Ballet Day.
The fifth of any month might be in remembrance of all our ancestors who seem to have “taken the fifth” instead of leaving any useful information behind. On the other hand, that is not likely to increase the popularity of genealogy.
We need a day to celebrate our past, our ancestors, our origins, our “august beginnings,” which of course, is the answer. August 1, the beginning of August it should be. Pi Day eat your heart out. Now if I can just get my kids to eat their Genealogy Day broccoli…