By Daniel Hubbard | November 27, 2011
This week, I need to start where I left off last week, with reproducibility. This week’s topic is a facet of research that overlaps greatly with reproducibility: openness. If reproducibility is central to true research, there must be some way to see inside what someone else has done. It must be possible to understand not just the result but what was done to reach the result. It must be possible for one researcher to walk the proverbial mile in another researcher’s shoes. Openness is the key to reproducibility.
I just read about an attempt to reproduce a study that seemed to show that chronic fatigue syndrome might be caused by a specific virus, one that is known to cause cancer in mice. Several labs tried to reproduce the result by analyzing blood samples from the same two sets of individuals. One group had chronic fatigue, the other didn’t. The labs did not know which samples came from which people, so they had no idea where they were “supposed” to find the virus and where they were not. When the results were compared with the identities of the people who gave blood samples, there was no correlation. The research was open. All the labs knew what to do but they could not reproduce the research.
In genealogical research, reproducibility looks a bit different from experimental science but the foundation, openness, is the same. If no one can confirm, disprove or question your work, you may be doing some of the activities involved in research but for the outside world there is no way to tell. To paraphrase—If a will is found in a courthouse and no one ever hears your shout, have you really made a sound?
If that will only leads you to add the names of a few children to a list without indicating the source of the information, you can perhaps be said to have done research (you performed the search, found a presumably relevant document and drew conclusions) but you haven’t produced research. If all you produce that can be accessed by others is that list of children, a descendant who finds your list won’t know where the names originated and can’t tell anything about how legitimate the list is. Only by making your research available from start to finish in some way at some time, that is, by making it open, can your work be reproduced.
In particle physics one often publishes a paper that simply describes the apparatus. This allows others to understand its strengths and weaknesses. It allows them to see the tools that you had at your disposal, construct something similar or improved, or just reuse a technique.
Openness in genealogy can give similar benefits. Even if someone is researching something totally different, they might learn something from a method you used. Perhaps someone else might learn of the existence of a set of records that was unknown to them.
Perhaps a record seems to show that your conclusion cannot be correct. If it is possible to see that you used that record in your research, then perhaps someone will think twice about concluding that you were wrong. If you present a coherent argument for why that record should not be taken at face value and an interpretation of why the confusion exists, then it is much more likely that your research will continue to be a contribution to genealogy. On the other hand, if your research is open and you clearly did not know of this record’s existence, then suspicions that you had some secret reason for believing what you concluded won’t get in the way of progress.
Openness allows researchers to build upon your ideas and potentially make better choices. Sometimes the most important things about a piece of research can be the steps along the way to the conclusion. If your conclusion does not seem rock solid and someone else can see something that you did not check, that hole might be filled for you. Perhaps you saw hints of a temporary move made by an ancestor but you couldn’t find anything more. You could not really conclude anything about that possible move. If your research is open, someone else may look at your work, agree that your ancestor temporarily left his hometown and be able to show where he was because your thoughts on the subject were available.
Maybe you will lead someone around a pitfall that would have caused them to disagree with you had they fallen in. Maybe you will allow them to point out a flaw as they work through what you have written. Maybe it will just save them time in confirming your brilliance. The path from evidence to ancestor can be a rough one. It is up to every researcher who has tried to navigate that path to be the native guide for those who come later.
Being open with negative results is usually quite difficult. Who likes to advertise things that didn’t work? Yet spreading the word about what not to try is just as important as being open about positive results, if not more so. If there are several obvious steps to take, it does very little good for researcher after researcher to keep trying them and failing. It would be better to know that they have been tried many times before and always failed. They might still be worth trying again but with the knowledge of what has failed one can decide to try something else.
Even without the psychological barrier to being open about failed attempts, there is often another barrier. Often no one wants to publish negative research. Positive results are interesting and get published. Negative results only become interesting once a positive result appears. It often happens that a positive scientific result gets attention; it might even catch the public eye. Later people are disappointed that nothing seems to have come of it. The reason is quite possibly what is known as publication bias. Once that first positive result sees the light of day, once a big effect is announced, only then do negative results or experiments that see a smaller effect become interesting. There was a threshold blocking the information from getting out. Once the subject is publicized, the threshold becomes lower.
This happens in genealogy as well. Imagine that you are trying to determine who the father of James Smith was. You have several possibilities—William Smith, John Smith and James Smith Sr. The first thing you manage to show is that James Smith Sr could not possibly be the father. What do you do next? You work on William or John. Time drags on and eventually you read that the problem has been solved. The father was James Smith Sr. Your immediate thought is bound to be “Wait a minute!” You check through what you have and it still seems correct to you. The “positive result” that you read about could not possibly be right and you have the proof. Suddenly that negative result of yours becomes very interesting on its own, without you having actually solved the underlying problem.
It Isn’t Easy
What might openness in genealogy really mean? Of course some research is already open. Some people get articles into respected publications. Other people produce books that not only state their conclusions but at least state their sources if not always their reasoning.
For cost and availability reasons, the internet is the main way to disseminate genealogical information, and one rarely finds full-blown openness there. Most openness in genealogy probably occurs informally and incompletely, as sharing rather than formal openness. People share this and that with distant cousins who are interested in the same problem. Online family trees seem to only rarely contain sources, let alone a significant number of quality sources. Including reasoning in those trees is not really possible. A researcher might put some information on their website but it is rarely if ever the equivalent of a research paper. I suspect that not many people would be interested in expending the time and effort to do that, or they would not feel that their research was worthy of that kind of treatment.
One also needs to decide when to be open. Some family historians are gregarious. Others prefer family history to be a very private pursuit. Some never feel like they have gotten far enough to make what they have done so public. There are clearly times when research isn’t mature. Research that hasn’t been processed and interpreted can easily be more noise than nuance. At some time though, research that is truly original should be made open.
Openness requires effort on the part of the reader as well. Research may seem to be open without it actually being so. If the appearance of openness gives confidence without leading to a critical examination of what is being described, then apparent openness can become a tool for misleading the reader.
Another responsibility placed on the reader is the scholarly response. Open scholarly output deserves an open, well thought through scholarly response. Though genealogists actually seem better than most denizens of the internet when it comes to keeping their cool, the opportunity to blow one’s stack, and let the world know instantly, is there.
The results of openness and even simple sharing are often not what one wants. No one likes to make years of research available only to find it appear elsewhere, somewhat mangled, with no indication as to who did the research or the sources that went into it. One person’s labor of love has a way of being replicated over and over, losing all the blood and sweat that went into it, all of its heart gone. Here is another implicit responsibility of the reader: the responsibility to give credit where credit is due and to remember that you are not the one who decides how and how widely another person’s research should be disseminated.
Once the research is released into the wider world, there is something else that happens that can be a little scary. The researcher loses control over the research. A mistake that exists only on a hard drive can be remedied almost as soon as it is discovered. In open research, if the original author finds a mistake, suddenly there is the responsibility to correct it and the feeling of responsibility whenever someone makes use of the old erroneous research. Here is the fear, not so much of being wrong, but of misleading.
Even with its difficulties and potential pitfalls, openness is the ideal in research. It is not easy to handle but it is a vital ingredient in any field. It is not a simple matter to get right but every field that manages it benefits from it. We hold doors open for each other. Perhaps we’d benefit from doing the same with our research.