
Research, Rinse, Repeat

By Daniel Hubbard | November 20, 2011

What is “research”? It seems like a simple question but it isn’t. What does research look like? What properties should it have?

Research takes many forms, has many entries in the dictionary and can mean slightly different things in different fields. It can involve experiment, observation, testing, fact finding and theory development. In my old field of particle physics, research almost always involves either experiment or work to develop and improve mathematical models. Astronomical research is much more dependent on observation. Archaeological research can involve performing tests on unique artifacts.

What do genealogists mean by the word research? For better and for worse, the answers to that question vary greatly. For some it means putting down a few names in a pedigree chart, names that are vaguely remembered from long-ago conversations with grandma. Then they graft on some online family trees and presto: research. I’m not going to rant about that. Everyone needs to begin somehow, though even as a first set of steps, there are some big problems there.

At least historically in genealogy, a person’s implicit answer to “What is research?” has often been “Whatever I want it to be.” That, of course, is not a particularly useful answer, nor is it an answer that is likely to lead to personal improvement in research technique. So, what are some things that go into good research? It will take me many posts to scratch the surface, so I expect this to be the first in a series. Without further ado, here is my first property of research.

Reproducibility and Repeatability

At least in experimental sciences, one of the strongest demands on research is that it be reproducible. If no one else can reproduce what you did and come up with a result that is at least consistent with yours, then there is a problem. In particle physics, the big experiments that require unique conditions, like special types of particle beams or collisions, almost always come in pairs. The reason is simple. If an experiment sees something interesting, if it announces a discovery, if it makes any claim at all, there needs to be a way to confirm it with a different experiment. The apparatus must be different, with different strengths and weaknesses. The analysis must be different. The people involved must be different. The apparatus, the analysis and the experimenters can all subtly influence the outcome. It needs to be possible to confirm or refute those results, or they mean very little.

In scientific research there is an important distinction between reproducibility and repeatability. If the same scientists can perform the same experiment with the same apparatus and get a consistent result, then they have successfully repeated the experiment. If another group of scientists using a different apparatus gets a result consistent with the first team’s result, then they have successfully reproduced the result.

Successfully repeating an experiment might be easy to do, or it might be very difficult and prone to failure, but few scientists would be comfortable with an experimental result that they could not repeat at all. In genealogy, repeatability looks quite different. You should be able to go back over a part of your research years later, follow what you did, and check whether you still think it is correct as far as it goes.

Everyone’s Ancestors—Pete and Repeat

The first aspect of repeatability in genealogy is the record of the search. Is there a record of what archives you used? Can you look at your research and say that you looked in this courthouse, that online collection of digitized images, these cemeteries, and those libraries?

Can you tell what you searched in each archive? Can you say that you looked through will books B, C and D because they seemed to be the only ones that corresponded to the time period of interest? Did you record what time period that was?

Another aspect of repeatability is the results of the search, both positive and negative. Remember that the positive search results might not be the same as the final conclusion. Several small successes can add up to a much deeper conclusion, so recording the overall result does not automatically record the individual positive search results behind it. They need to be recorded in their own right.

Negative results are important as well. Sometimes they can be evidence in and of themselves. A person’s disappearance and failure to reappear in records can be an indication of death. Other times a negative search is more subtly important. Perhaps a hypothesis that seems clearly the best at the start actually looks like a poor bet as the negative search results pile up. Part of repeatability is to know what possibilities were rejected and to be able to confirm that the roads not taken really were dead ends.

Now that the input is well described, it is time to record the logic that gets from the basic undigested evidence to the final conclusion. The logic can be the trickiest part of getting to the conclusion. Finding the evidence might have been trivial but using it might require quite a lot of serious thought. Repeatability includes more than just recording the information that you would need to find the evidence again. Documenting how you mentally got from point A to point B allows you to revisit the same steps and gives you a chance to agree or disagree with each step.

The last facet of repeatability is the conclusion. Usually that is the part least likely to be missed, but making sure that the conclusion is as clear as possible is important. Sometimes a clear-cut conclusion isn’t possible. Things can be messy, and the conclusion might be a range of possibilities and a judgement of how probable they are. If that is as clear as it can get, so be it. Don’t shy away from or oversimplify messy reality.
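For readers who keep their notes electronically, the facets of repeatability described above (where you searched, what you searched, the time period, positive and negative results, the reasoning and the conclusion) can be sketched as a simple record. This is only an illustration, not a recommended standard: the class names, field names and the sample entries are all hypothetical, invented for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Search:
    """One search in one repository. All names here are illustrative."""
    repository: str   # where you looked: courthouse, library, online collection
    source: str       # what you searched there, e.g. "Will books B, C and D"
    period: str       # the time period the search was meant to cover
    findings: List[str] = field(default_factory=list)  # empty means a negative result

    @property
    def negative(self) -> bool:
        # A search that found nothing is still worth recording.
        return not self.findings


@dataclass
class ResearchLog:
    """The record of one research question, from searches to conclusion."""
    question: str
    searches: List[Search] = field(default_factory=list)
    reasoning: List[str] = field(default_factory=list)  # steps from evidence to conclusion
    conclusion: str = ""

    def negative_searches(self) -> List[Search]:
        # The "roads not taken" that a later review may want to re-check.
        return [s for s in self.searches if s.negative]


# Hypothetical example entries, made up purely for illustration.
log = ResearchLog(question="When did the ancestor die?")
log.searches.append(
    Search("County courthouse", "Will books B, C and D", "1850-1870")
)
log.searches.append(
    Search("Town cemetery", "Gravestone survey", "1850-1870", ["Stone dated 1862"])
)
log.reasoning.append(
    "No will found; the gravestone gives a death year inside the period searched."
)
log.conclusion = "Probably died in 1862; no probate record located."

for s in log.negative_searches():
    print(f"Nothing found: {s.source} at {s.repository} ({s.period})")
```

The point of a record like this is less the format than the habit: a negative search gets an entry just as a positive one does, and the reasoning is kept separate from the conclusion so that each step can be revisited later.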

Reproduction: Not Just Something For Great-Grandpa’s Family of Twenty

Now that we have some idea of what repeatability looks like, what might reproducibility entail in genealogical research? In genealogy the closest analogue of experimental apparatus might be the resources that were checked: the books, court records, cemeteries, etc. Clearly one can’t say that a genealogical activity is not research unless it is possible to draw the same conclusion with different sources. Those different sources might not exist, and often we are lucky that one crucial document was preserved that casts light on all the rest. No one could be expected to reproduce your results without that document. If someone can, it is wonderful, but it cannot be required. In this respect, for genealogists there is no difference between reproducibility and repeatability.

The other requirement of scientific reproducibility, that other qualified researchers can reproduce the result, is much more relevant to genealogy. People have biases, different knowledge and different motivations, and all of those things can interfere with results. If different people produce compatible results, then there is some reason to believe that the results are correct. The more researchers who succeed and the fewer who fail, the more likely the results are to be valid.

There are, I think, two distinct types of result reproduction. Sometimes people working totally independently produce compatible, high quality results. That is always interesting because it means that one person’s research did not bias the other’s research. The other type is closer to what was discussed about repeatability. Another researcher knows what you did. They know what archives you visited, and they can look at the same sources and follow your logic. They know how you searched, what you found and what you did not find. They know everything you did and can say at the end whether or not they agree. Perhaps they suspect that you misread something, or that the will that you could not find is housed elsewhere, or that you missed a possibility in your thinking that means there is more work to do. Achieving that kind of reproducibility requires another property of research: openness.

Topics: Genealogy, Research Mindset