By Daniel Hubbard | February 28, 2016
It has been a while since I mixed mathematics and genealogy. It feels like time to do that again.
Genealogy is all about information. We gather information and use it to figure out information about ancestors and relatives. There is a branch of engineering and mathematics that is actually called information theory. It describes many aspects of how we transmit and store information, even what information is. The basic unit of information is the bit. Bits are the 1s and 0s that computers use, the ons and offs of the transistors inside computers and the trues and falses in a game of twenty questions. One might think that the amount of information in a document or message is just the number of bits it takes to express that message. There are several reasons that isn’t quite right. My favorite, partly because of its name, is the concept of “surprise.”
Surprise sounds like it ought to be a fun concept, and it is actually quite intuitive. Imagine you’re in a bank. As you wait in line for a teller, you shift your gaze around the room. You notice that there is an alarm bell on the wall. That bell can do two things. It can ring and it can not ring. That sounds nice and binary. The bell can express a single bit of information. As you stand in line, you are not surprised that the alarm is not ringing. You expect it to be silent. From that point of view you might say that it is not conveying any information. It actually is, but only very little. If, on the other hand, the bell starts ringing, you (and everyone else in the bank) will be very surprised. It feels like the bell is now conveying a lot of information, but isn’t it still just a one bit message?
The key is that the less likely a message is, the more surprising the message is, and the more information it is conveying. By the formal definition of surprise, a message that has to occur has no surprise and carries no information. A message that is impossible would convey infinite surprise, and infinite information as well. (The next time you are surprised by something, you probably won’t consider that it has a formal definition, but you can always try.) It is an interesting concept. We are often surprised by discoveries in genealogy and probably wouldn’t even call them discoveries unless they were at least somewhat surprising. The closer a document is to impossible, the more surprised we are, and it feels right to consider it as carrying more information. The next time you find the second death record for an ancestor recorded years after the first, you are not only entitled to be surprised but to be in awe of the amount of information being conveyed: it is either telling you that your previous information is wrong and giving you the new information, or it is conveying the information that your next step is to get hold of Stephen King’s phone number.
Another thing that leads to surprise in genealogy is going back another generation. It gets harder as we go and feels more surprising every time we manage to do it. I was surprised to learn that there is a basis for this in information theory as well.
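For the curious, the formal definition of surprise fits in a few lines. In information theory the surprise of a message with probability p is log2(1/p) bits. The sketch below is only an illustration: the function name `surprisal_bits` is invented here, and the probabilities for the bank alarm are made-up numbers, not measurements.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Return the surprise, in bits, of a message that occurs with the given probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability == 0.0:
        # An impossible message would be infinitely surprising.
        return math.inf
    # A certain message (probability 1) gives log2(1) = 0 bits: no surprise at all.
    return math.log2(1.0 / probability)


# Hypothetical numbers for the bank alarm: silent 99.99% of the time.
p_silent = 0.9999
p_ringing = 0.0001

print("silent bell: ", surprisal_bits(p_silent), "bits")
print("ringing bell:", surprisal_bits(p_ringing), "bits")
```

With those assumed probabilities, the silent bell carries a tiny fraction of a bit while the ringing bell carries a little over thirteen bits, even though the bell itself has only two states.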
Pedigree charts are very binary things. When we number them with ahnentafel numbers, the numbers get larger as we go back to more distant ancestors, and they do so in a special way. Every generation back adds another bit to the length of an ahnentafel number if you express it in binary. You are 1, your father 10, your mother 11, her mother 111 and so on.
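The binary pattern is easy to check with a short sketch. The function names below are invented for this illustration. The only rule used is the one just described: start with 1 for yourself, then add a 0 for each step to a father and a 1 for each step to a mother.

```python
def ahnentafel_binary(number: int) -> str:
    """Return an ahnentafel number written in binary."""
    if number < 1:
        raise ValueError("ahnentafel numbers start at 1")
    return format(number, "b")


def generations_back(number: int) -> int:
    """Each generation adds one bit, so count the bits after the leading 1."""
    if number < 1:
        raise ValueError("ahnentafel numbers start at 1")
    return number.bit_length() - 1


def describe_path(number: int) -> list:
    """Read the bits after the leading 1: 0 is a step to a father, 1 to a mother."""
    bits = ahnentafel_binary(number)[1:]
    return ["father" if bit == "0" else "mother" for bit in bits]


for n in (1, 2, 3, 7):
    print(n, ahnentafel_binary(n), generations_back(n), describe_path(n))
```

Number 7 comes out as 111, two generations back, reached by going to your mother and then to her mother. An ancestor eight generations back on the all-maternal line is number 511, which takes nine bits.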
The codes computers use to represent typed characters are also very binary. For example, in one coding (ASCII) an A is represented by 0100 0001, a B by 0100 0010, and so on. Often when trying to be efficient, the most probable messages will be given the shortest codes. Morse code represents the most commonly used letters with the shortest codes. E is a single dot in Morse. Q is dash-dash-dot-dash. If you construct a diagram to decide on binary codes in the most efficient way, the most probable codes using the fewest bits, you end up constructing a thing that looks exactly like a pedigree chart. The less likely a coded item is to be used, the more bits in the code for that item. The more bits, the lower the probability, and from above we know that means the greater the surprise. Now, the binary ahnentafel numbers get one bit longer per generation, which means we now have a mathematical proof that discovering great-great-great-great-great-great-grandma is very surprising indeed.
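One standard way to build such a diagram is Huffman coding. The post does not name a method, so treat that choice as an assumption made for this sketch. The letter probabilities are invented round numbers, not real letter frequencies, and the function name is hypothetical. The code repeatedly joins the two least probable items under a common parent, much as two parents are joined to a child on a pedigree chart, and counts how many bits each letter ends up with.

```python
import heapq
from itertools import count


def huffman_code_lengths(probabilities: dict) -> dict:
    """Return how many bits a Huffman code would assign to each symbol."""
    tiebreak = count()  # keeps heap entries comparable when probabilities are equal
    heap = [(p, next(tiebreak), [symbol]) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    lengths = {symbol: 0 for symbol in probabilities}
    while len(heap) > 1:
        # Join the two least probable groups under a new parent node.
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        merged = group1 + group2
        for symbol in merged:
            lengths[symbol] += 1  # every symbol under the new parent gains one bit
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return lengths


# Invented probabilities, chosen only so the arithmetic is easy to follow.
example = {"E": 0.5, "T": 0.25, "A": 0.125, "Q": 0.125}
print(huffman_code_lengths(example))
```

With these made-up numbers the common letter E gets a one-bit code and the rare Q gets three bits, the same number of bits as its surprise.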
By Daniel Hubbard | February 14, 2016
Last week I was at Rootstech helping people with their Swedish ancestry. I may have called this kind of thing genealogical speed dating before. If I haven’t, I will now. It goes fast and it can be a lot of fun. Sometimes there just isn’t enough time to find records or to think things through. Other times, you find things. Sometimes things even go spectacularly well.
The Story, Part 1
I was asked to track down a soldier. I found a record that placed him in the Malmö garrison in the 1860s. I checked there and didn’t so much find a record as a story. I found him. He was listed with his wife. Above her name it was written that she was a widow and the identity of her deceased husband was entered. I could tell, looking at it, that he wasn’t just anyone, but the man whose death opened the position in the garrison that had been filled by the soldier I was tracking. He hadn’t simply gotten a job, he had gotten a wife too. If you think about it, that is a story all on its own. After her name, her deceased husband was mentioned again in order to state that he was the father of the children whose names followed.
The Story, Part 2
After those children came the name of the first child of our soldier and the widow. All his information was entered, date of birth and date of baptism, just as one would expect. No story there. The next child was born four years later. The birth date was stated, but there was no name and no date of baptism. There was a note about the mother not allowing the child to be baptized. It reminded me of a note I had once read in a different Swedish record. The Lutheran minister had written “Child not baptized” where the date should have been but then added a note at the end that the child had been baptized by a Methodist minister. Thinking about that religious explanation of a baptismal oddity made me realize that the note I was looking at was not what I thought. It did not use the Swedish word “Maman,” a rather informal term for mother. It said “Mormon.” It wasn’t that the mother refused baptism, it was that Mormons refused baptism. When I explained this, the woman I was helping got excited and said that it made sense: “We don’t baptize until age 8.” I looked back at the record. The previous, baptized child had been born four years earlier. In those four years there had been a conversion. I pointed out what the timing had to mean. Her eyes grew big. “Thank you! Thank you! I have to go check church records!” And she was off, with one record and quite a story.
You never want to leave a stone unturned. It might be hiding a record. And just one record can tell a whole story.
By Daniel Hubbard | January 31, 2016
I’ve been researching Robert. He was a soldier in the Revolutionary War. In his pension application he made many statements. He listed his units and the surnames of those units’ commanding officers. It was very good information. I learned where his units gathered, the paths they took on the march and the name of the battle in which he fought. All very interesting, but he said nothing that might give me a hint about his family except for the name of the county where he was living in the 1770s.
When Robert was a rather old man, he testified concerning another man’s pension application. He testified that he knew that the other man had been in the war because he had been in the same unit as Robert’s father. They were very different in age, but they had been comrades-in-arms. Frustratingly, he never named his father. He mentioned him a few times in his testimony, but he never named him. After the main body of Robert’s testimony, the man who was taking it all down must have asked a few questions. The questions were not recorded, but suddenly the statements that were recorded became staccato. They were disjointed, the sentences not following at all from each other. One of those non sequiturs was the statement, “My father’s name was Robert.” Ah, the kindness of strangers…or at least of strangers’ pension records.
So his father was Robert. Is that sure? It makes perfect sense for a son to share his name with his father. It is also possible for an old man to give the wrong answer, for him to simply state the first name that comes to mind, which just might be his own name. So was his father Robert or not? There are a lot of documents which might corroborate Robert as the son of Robert, but the obvious searches turned up nothing. A lowly tax list turned out to be magical. It was the kind of record that makes you want to jump for genealogical joy. It listed Robert and his son Robert. They were in the right place at the right time. The list was prepared in sections by different people. The man who prepared the section with the Roberts had an unusual surname that happened to be the same as the name of the captain of one of Robert Jr’s units. One of the other men on the list was actually called captain and his whole name matched another one of Robert Jr’s officers. The only thing that might make it better was if Robert Sr’s comrade-in-arms from the pension records was also in that section, and sure enough, he was. It is the right path. Now to follow it!
By Daniel Hubbard | January 17, 2016
There is an interesting phrase one runs into in genealogy. It is “the records are confused.” When I first started out in genealogy, back when I was a kid, I thought that was a very strange phrase. It seemed to imply that those records sat around after the courthouse closed scratching their headings and not being sure of anything. Of course, that isn’t what the phrase means, but the question isn’t really what the phrase means, but rather what it might indicate.
A Record is Confusing
Some records really are internally inconsistent. A document can contain actual errors, so that if all the information in it is taken at face value, obvious nonsense is the result. That is probably the closest thing we have to my imagined records that sit around wondering what they might mean. Other records just seem to be internally inconsistent. I once read a will that did not make much sense. Then I realized that the will referred to two different people who had exactly the same name. Until further research showed that the author of the will had an aunt and a cousin of the same name, that will was certainly confusing.
Boundary changes can also be sources for “confused documents.” It is odd, but sometimes a record will not line up with history. I’ve run into a few cases where a deed says that a man purchased land in a county before the county existed, or sold land in a county after that county was discontinued. The full explanations of such things might never be known but I suspect the confusion I saw to have arisen from a difference in time between the writing of the deed and the copying of that deed into the county records. If names and boundaries changed in the time between those events, what might the clerk write in the register of deeds? Quite possibly he wrote something confusing.
A Group of Records is Confusing
Sometimes records seem fine on their own but don’t line up with each other. I have six records relevant to a man’s death:
- a slip of paper with the most important dates in his life. It was written by his mother between 50 and 80 years ago,
- the page from his mother’s family Bible that records his death,
- his burial record,
- his death certificate,
- a photograph of his grave marker.
Before I get to the sixth, I need to discuss the dates on those five records. The first four agree on the day and month. The grave marker has only years. The Bible record, the grave marker, the death certificate and the burial record all have the year of death as 1926. The slip of paper records the year as 1925. There can be little doubt that the agreement between the death certificate and the burial record would outweigh just about anything else. With the Bible record and grave marker also in agreement, there is little reason to trust that slip of paper. I don’t know when his mother wrote it, or how good her memory was at the time. All I can say in its defense is that all the other dates on it are corroborated by records made at the time of the events in question.
Now the sixth document. It is a telegram sent to the man’s mother. There is no question that it was sent to her. It has both her name and her address on it, and I can verify the address using many other records. The telegram has been passed down in the family, so there is no question about its provenance. The telegram calls her “mother,” so it is clearly from one of her children. She only had two children and there can be no confusion between the two. Given the name that appears at the bottom of the telegram, it is from her older son, who is the man in question. It was sent from a small city a few hundred miles from the place where he and his mother both lived, but that is as expected for a man who was a traveling salesman and wanted to send his mother birthday greetings when he was out of town. The telegram was sent on his mother’s birthday too. The problem was that it was sent on his mother’s birthday in 1927, almost a year after he died. Not being a mystery writer, I can only conclude that Western Union simply put the wrong year on the telegram, yet that would normally be a pretty trustworthy place to find a year. This record would seem to be confused, but only those other records make the problem clear.
The Researcher is Confused
Often what is meant by “the records are confused” is that the researcher is confused. My first experience with the phrase was in a typed document of unknown origin sent to me by a friendly archivist. It purported to trace several generations of the family that I was researching. When my 11-year-old head stopped spinning, I had worked out that the person whose records were confused could only be explained by being the daughter of her father and his own mother, who would have to have been…wait for it…the daughter in question. That is, she was both her own mother and her own grandmother. This was long before the days when anyone could accidentally tie their database into knots with an unfortunate click or an ill-timed computer crash. How do we avoid confusion like that? We can start by not trusting all information in secondary sources and family recollections without question. Then don’t struggle to make it all fit together if it defies logic. Go back to contemporary records and try to find as many records relevant to the question as possible. Finally, try to find sensible resolutions to the inconsistencies that appear, even if it means arguing that some of the hard-won data is wrong.
By Daniel Hubbard | January 10, 2016
When I lived in Sweden I learned a tradition about the end of the Christmas season. Translated, it goes something like this:
Twentieth day Knut, Christmas is out.
There are different variations on the saying, but they are all short and rhyme. Since I love calendars and the strange effects they can have on genealogy, I also love calendric sayings like this, but what does it mean?
In Sweden every day has one or more names associated with it, and children typically celebrate their name days similarly to how they celebrate their birthdays. The day of the name Knut is January 13, not the twentieth. The reason for the difference is that, though that day is thirteen days into the year, counting from Christmas as day one makes Knut day twenty. It is the twentieth day of Christmas, and is sometimes called just that. This never made particular sense to me, having grown up singing the Twelve Days of Christmas, not, and perhaps thankfully not, the Twenty Days of Christmas. At the time I figured that the rhyme involving the name Knut was the origin of the tradition. I’ve learned that it is more complicated and involves the kind of calendar complexities that I love.
The Danish prince Knut Lavard was murdered by his cousin on January 7, 1131. When he was later canonized, his saint’s day was placed on his death date. Throughout the Christian world, the Twelve Days of Christmas were held to extend either from Christmas to the day before Epiphany (January 6) or from the day after Christmas to Epiphany. In much of the English-speaking world it is held to be unlucky to leave Christmas decorations out past Twelfth Night, the night before Epiphany. In Sweden the tradition was that Christmastide ended on Knut’s day, January 7.
So why is Knut’s day now January 13? No one really knows. Sometime in the late 1600s or early 1700s, someone in Sweden seems to have decided that Christmastide needed another week. An extra week of festivities in a place where the sun barely clears the horizon at noon in January probably seemed like a good idea. A bit more seriously, it may have been that, because so many holy days were lost with the Protestant Reformation, Christmastide was extended from the traditional 12 or 13 days to 20 days to compensate, but only in Sweden. Nowhere else. Because people already associated Knut with the end of Christmas, the thought seems to have been that Knut’s Day needed to be moved so that people could continue the tradition of ending their Christmas celebrations on his day. This also happened only in Sweden. Strangely, Swedish Lutheran opinion on when a Danish Catholic saint should be celebrated did not exactly send shock waves through the Vatican.
So something that happened on Knut’s Day in Sweden in 1650 happened on January 7. Something that happened on Knut’s Day in Sweden in 1750 happened on January 13. No one really knows exactly when the transition took place, but it was roughly 1695. Just to add to the fun, something that happened in Denmark or Norway on Knut’s Day probably happened on January 19. Why? Because Knut’s uncle, also named Knut, was also canonized, and his day was also placed in January.
So, when you find a record about an ancestor in Scandinavia that tells you something happened on Knut’s Day, with Knut’s Days making up 10% of January, you’ll need to think a bit to pick the right date.
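The day count is easy to verify. This is a minimal sketch using Python's standard `datetime` module. The function name is made up for the illustration, and the year 2016 is arbitrary, since the count works the same in any year.

```python
from datetime import date


def day_of_christmas(day: date) -> int:
    """Count days with December 25 as day one of Christmas."""
    # Dates from December 25 onward belong to that year's Christmas;
    # earlier dates are counted from the previous December 25.
    on_or_after_christmas = (day.month, day.day) >= (12, 25)
    christmas_year = day.year if on_or_after_christmas else day.year - 1
    return (day - date(christmas_year, 12, 25)).days + 1


print(day_of_christmas(date(2015, 12, 25)))  # Christmas Day itself
print(day_of_christmas(date(2016, 1, 5)))    # the day before Epiphany
print(day_of_christmas(date(2016, 1, 13)))   # Knut
```

Christmas Day is day one, January 5 is day twelve, and January 13 is day twenty.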
Postscript from the Department of Useless Mathematics
Given the way the verses of the song Twelve Days of Christmas get longer and longer with each day, I was curious how long Twenty Days of Christmas might be. With each verse being as long as the verse before it plus the time needed to sing about another group of things, a quick calculation shows it would take about three times longer to sing the Twenty Days of Christmas than it would to sing the original. That’s a lot of Lords a Leaping. If a standard singing of the Twelve Days of Christmas lasts for about 5 minutes, the prolonged Swedish version would go on for a rather grueling quarter of an hour. I’m also somewhat concerned about what the extra verses would contain. Perhaps they might include “…seventeen gingersnaps baking, sixteen meatballs rolling, fifteen lutfisk soaking…”
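The quick calculation can be reproduced with a short sketch. It rests on a simplifying assumption that is mine, not part of the song: every gift line takes the same time to sing, and the opening line of each verse is ignored. Verse n repeats n gift lines, so the whole song has 1 + 2 + … + n of them.

```python
def gift_lines_sung(days: int) -> int:
    """Total gift lines sung when verse n repeats all n gifts so far."""
    return sum(range(1, days + 1))  # equal to days * (days + 1) // 2


twelve = gift_lines_sung(12)
twenty = gift_lines_sung(20)
ratio = twenty / twelve

print(twelve, twenty, round(ratio, 2))

# Assuming the twelve-day version lasts about 5 minutes:
print(round(5 * ratio, 1), "minutes")
```

That is 78 gift lines against 210, a ratio of roughly 2.7, which stretches a 5-minute song to about thirteen and a half minutes.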
By Daniel Hubbard | December 20, 2015
This is part 2 of a multi-part post. Part one was If at First You Succeed, Try, Try Again.
Another reason to keep searching, even after finding “the answer,” is that if we only look for the answers, we are limited by our ability to imagine the questions.
In the sciences, there are the concepts of pure research and applied research. In pure research, one usually has no precise idea of what the results might be used for, only that a hole in our knowledge exists. When the British Chancellor of the Exchequer visited the laboratory of Michael Faraday, perhaps the first scientist to have his work publicly funded, he said, “This is all very interesting, but what good is it?” It was pure research, so Faraday responded, “Sir, I do not know.” Then he added, “but someday you will tax it.” That question was limited by imagination. Faraday didn’t know the answer but he cast a wide net and could feel that something would come of it. By the way, the thing Faraday was in the process of discovering was electromagnetism. From spark plugs to iPhones, it is what separates our world from the world of the steam engine. Not bad for research that had no known use at the time.
For some ancestors we may feel lucky to find birth, marriage and death dates. Then we move on. It is true that for some ancestors, those things might be all we ever know. For others, those might be the answers we want, so they become the basis for the questions we ask. If they are the only questions, their answers will be all we ever learn. Sometimes pure research, without predefined questions, is the way to get the right answers.
By Daniel Hubbard | December 14, 2015
Often when we do a deep and thorough search for records it is because less deep and less thorough searching has not given a result. We search until we find what we were looking for and then search no further.
There is another, and in the end much better, reason to make that thorough search. What if that first, easily (or not so easily) found record is incomplete and you settle for it? What if it is misleading or wrong? Part of the point of a thorough search is to avoid assuming that the first thing is both correct and all there is. Not so long ago, I found a man in a death register. The register gave me his place of birth and it was consistent with a couple of census entries I had found. Wonderful! It would need to be corroborated, of course. His birth and the making of that entry were separated in time by eighty years, but at least it gave me a starting point. It would have been the wrong starting point. I searched for a death certificate as well and when I found it, it gave a totally different place for his birth. The same name, the same death, the same county’s records, but the places of birth were 500 miles apart in different states. Hmm… What about cemetery records? They agreed with the death certificate. Every other indication, hint, and clue I’ve found since, as well as history, all lead me to believe that his entry in the death register is wrong and his death certificate is correct. It is good to keep looking, even after you have “the answer.”
I found a man I was looking for in the census. He was living with his parents, which I didn’t expect, but the rest of the information was a very good match. So my search, at least for that census year, was over. Except, I found a man with the same name living with his wife and son. The information did not match particularly well with the man I was looking for. Who was the right person? Both of them. They were the same man. He was enumerated twice. One enumeration found him where he should not have been recorded but gave correct biographical information. The other recorded him at his actual home but was full of bent truths. Both were useful, but only one should have existed. It is good to keep looking, even after you have “the answer.”
By Daniel Hubbard | December 8, 2015
Getting into the minds of the people I research is important to me. It can help lead the way to discoveries. It also makes those people so much more real. It also helps to remind me that our ancestors were a fascinating combination of just-like-us and totally alien. If we assume that they were all the one or all the other, we will certainly be all wrong.
Today we look at the clock on the wall, the clock on the microwave, the oven, the car dashboard, the coffeemaker, the stove fan (at least in my kitchen), the computer screen… If none of those is present, a wristwatch or phone probably is. None of them even requires the old daily ritual of inserting the key and winding the clock.
This time of year reminds me of a different way, a way of marking time. The little Grinch in my head points out that a steady flow of Christmas-related ads and spam marks the time by trickling into my inbox like the sand flowing in an hourglass: 117 unread items, must be 10 am… If I tell that little Grinch to go away and let me enjoy the season, my mind can get to other, older ways of telling the time. The kids always look forward to the evening lighting of the Advent candle that marks off the time by burning away a little wax every day. In the morning they set about finding and opening the doors on the Advent calendar that makes a game out of the simple task of marking the time and indicating the date.
The lighting of the candles in an Advent wreath reminds me that time can be experienced in intervals of weeks rather than minutes and seconds. The pace is clearly something out of the past. It represents a different way of thinking about time. Advent starts on no one date and lasts different numbers of days, depending on the year. It is a tradition, an observance, and a reminder of the way our ancestors experienced time—uneven, drifting, slower than the blinking LED numbers on my microwave.
Tonight I passed a window with a menorah. Its candles mark days that begin at sunset and that start from a date set by the cycles of both the sun and moon. None of my ancestors marked time in just that way, but I have researched people who did. It was yet another reminder of the different ways that time has been, and still is, seen.
All these things, at this one time of year, cause us to stop for a moment and consciously mark the time. That was once so normal, a part of our ancestors’ lives, yet now we rarely do it. Take the time to notice.
By Daniel Hubbard | November 29, 2015
Sometimes the past doesn’t need to be so distant to seem far away. Cleaning out things that the kids have outgrown turned up one of those typical alphabet books that are for children that can’t yet read. The kind of book whose genealogist version might start—
A is for antecedent, those things that came before.
B is for the family’s Bible; births, marriages, and deaths tucked safely in a drawer.
H is for Headstone
“H” could be for “headstone,” those objects that others associate with burials and Halloween, but we know they are one of the few times a bit of our family history is literally carved in stone.
“H” could be for “Handybook,” the genealogical reference that sits on so many desks. Mine is a 7th edition given to me by an aunt, who perhaps wanted me to stay interested, or who just maybe got tired of me borrowing hers.
“H” could be for “hope,” that feeling that keeps us going through all sorts of adversity, genealogical and otherwise. It helps us persist when we are convinced an ancestor is hiding.
“H” could be for “hiding,” that activity that certain ancestors seem to revel in. They play hide-and-seek with us across the centuries.
“H” could be for “hunt,” when our seeking gets more serious and that hiding ancestor becomes Moby Dick to our Captain Ahab.
Those are all fine words, but the most important word beginning with “H” must be “History.” We deal most often with historical minutiae, but we dare not forget about history at the other scales. That bigger history is the bedrock upon which our ancestors stood. If we ignore that history, we leave them floating and misunderstood or even unfound and unknown.
By Daniel Hubbard | November 23, 2015
Genealogists spend much of their existence in “about-time,” that time that is neither known nor unknown, that twilight between mystery and understanding. Yet about-time doesn’t need to be as mysterious as it often seems. There is usually some information hiding behind the word “about.”
Perfect-World Type of About-Time
Where do we find “about 1811” in the about-time calendar? It depends on the reason for that “about.” Probably the most common reason we enter into about-time is an age. If all we know about an ancestor’s birth is that they were recorded as being age 39 in the 1850 census, we might write that they were born “about 1811.” That is not really all we know though, is it? If the age is accurate, then that ancestor was born either in 1811 or, if his or her birthday had not yet passed, 1810. A simple “about 1811” implies that 1811 is likely but that 1810 and 1812 are also fairly likely. That isn’t really correct. 1810 is quite possible and 1812 is impossible (again, if the age is accurate). We can even go one step farther. If we check the census day for 1850, we find that it was June 1. If everything was done right, our ancestor had to have been born on or after June 2, 1810, and on or before June 1, 1811. A birth in 1810 is actually more likely than one in 1811.
Another source of about-time is probate records. Fraud aside, an accurate copy of a will implies that the ancestor was alive on the date the will was written. Outside of rare cases when a missing person was declared dead, the ancestor was dead by the date the will was proved. If the will was proved January 31, 1860, one might write “about 1859” for the death date. Yet it is trickier than that. Under some circumstances, it can be years after the death that the will was proved, making “about 1859” wildly off. Check the date that the will was written and discover that it was written January 7, 1860, and if everything is accurate, “about 1859” is clearly not right.
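The perfect-world range can be worked out mechanically. The sketch below uses Python's standard `datetime` module, and its function names are invented for the illustration. It assumes the recorded age is accurate, that the age is in completed years as of the census day, and, for the day counts, that a birth is equally likely on every day in the range. It does not handle a reference date of February 29.

```python
from datetime import date, timedelta


def birth_window(age: int, as_of: date):
    """Earliest and latest birth dates consistent with an age in whole years on a date."""
    # Latest: the person turned `age` exactly on the reference date.
    latest = date(as_of.year - age, as_of.month, as_of.day)
    # Earliest: the person turns `age + 1` the day after the reference date.
    earliest = date(as_of.year - age - 1, as_of.month, as_of.day) + timedelta(days=1)
    return earliest, latest


def days_by_year(earliest: date, latest: date) -> dict:
    """Count how many days of the window fall in each calendar year."""
    counts = {}
    day = earliest
    while day <= latest:
        counts[day.year] = counts.get(day.year, 0) + 1
        day += timedelta(days=1)
    return counts


earliest, latest = birth_window(39, date(1850, 6, 1))
print(earliest, latest)
print(days_by_year(earliest, latest))
```

For age 39 on June 1, 1850, the window runs from June 2, 1810, to June 1, 1811, with 213 of its days in 1810 and 152 in 1811.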
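The same kind of check works for the probate case. This is a small sketch with hypothetical function names. It assumes both dates are accurate and that the testator was alive when the will was written and dead when it was proved.

```python
from datetime import date


def death_window(will_written: date, will_proved: date):
    """Return the earliest and latest possible death dates implied by a will."""
    if will_proved < will_written:
        raise ValueError("a will cannot be proved before it is written")
    return will_written, will_proved


def year_is_possible(year: int, window) -> bool:
    """Check whether any part of the given year falls inside the window."""
    earliest, latest = window
    return earliest.year <= year <= latest.year


window = death_window(date(1860, 1, 7), date(1860, 1, 31))
print(window)
print(year_is_possible(1859, window))
print(year_is_possible(1860, window))
```

With the will written January 7, 1860, and proved January 31, 1860, the year 1859 falls outside the window entirely.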
Sometimes we genealogists might estimate a marriage date based on the birth date for the only known child. Once again we have entered about-time, and now we are dealing with a marriage that might have occurred only a few months before the birth, as sometimes happened, or more than a decade before the birth. Here there is no true range. All we know is that the marriage occurred before the birth, and that though likely to have occurred within a few years of the birth, it could have happened anywhere in a range only limited by legal marrying age and biological impossibility.
The Evidence, Warts and All Type of About-Time
Because the records we deal with are created by real, fallible people, we also need to remember that the dating evidence might be wrong. This goes especially for ages. Can we be sure that the ancestor in question was really born between June 2, 1810, and June 1, 1811? No. That is what that one record tells us, but we cannot be sure of the accuracy. Misreporting of ages is common. In any case when we are in about-time we also need to ponder how accurate we believe our information to be. Suspiciously round numbers for ages, notes that imply the information was questionable even to the person who recorded it, reasons to doubt other information recorded by the same person, and conflicting evidence should all lead us to be much less certain of our information. Our about-time needs to become wider to accommodate the reasonable possibilities.
When I was a physicist, I learned to calculate the level of error on measurements. The measurement might give a single value, but that value might be off. Errors on data points expressed how far away from reality those points might be, given the circumstances. Just as we just saw in genealogy, there were two types of errors. One reason for inaccuracy in physics is the number of measurements. More measurements give more accuracy by an amount that depends on how many measurements were made. Those statistical errors shrink with more measurements. If everything went right, the true value should be within the range given by those errors. They are the perfect world type of errors. They answer the question: if everything was fine, how close should we be to reality? In the world of genealogical about-time, this is the perfect-world range one gets if one has an age, assumes it is correct, and uses it to calculate the possible birthdays. Back in the world of physics, we also needed to think about systematic errors, those errors that might occur because of external problems, like a lack of accuracy in settings used. In a physics experiment, you might set a meter to 3.50 wangdoodles (not an actual unit), but the actual number of wangdoodles might have been 3.49 or 3.51. That inaccuracy can shift or blur results. Back in the world of genealogical about-time, this is when we take into account that we are not dealing with perfect informants, talking to perfect clerks, enumerators, priests, ministers, and sextons. Those records were not then copied by perfect copyists when the originals needed preservation. That is the source of our systematic error.
All those things are rolled up into our about-time. If we write down nothing more than “about” to tell the reader that the date is not accurate, we leave them to try to figure it out from any clues we leave behind. In the worst case, the reader is forced to guess.
We have different sorts of errors. We either know what they are, or have ways of estimating. We can tell our readers the rationale behind those errors. Instead we write “about.” Sometimes our tools make it hard to do anything else. I guess that we could do better.