User:Huffkw/sandbox

Descendent genealogy, also called descendent-sequence genealogy or "reverse genealogy" (see the section "Differentiate from reverse genealogy" below for the differences), is the genealogy research process in which an ancestor is chosen as the starting point and the descendents of that person are traced to the present. The research might extend to every descendent, regardless of surname, or it might focus mainly on those descendents who retain the surname of that ancestor. At least in the United States, male descendents typically retain the surname of their ancestor, while female descendents typically adopt their husbands' surnames, which their children then inherit, so those lines normally carry a surname different from the daughter's birth surname.

As an example, the book entitled Descendants of Governor William Bradford ... compiled under auspices of Bradford Family Compact, begins with William Bradford of Mayflower fame and documents about 6500 of his descendents, of whom only about 300 retained the Bradford surname. Numerous books have been researched in descendent sequence and published with titles in the form of "Descendents of..." About 300,000 books of family genealogy have been published, although it is not known what portion of them follow the descendent-sequence presentation method.

Contrasting with "descendent genealogy" is "ascendant genealogy", or pedigree-sequence genealogy, which has been the traditional method for tracing one's ancestors (see the Wikipedia article entitled Genealogy). In ascendant genealogy research, researchers start with themselves and go back in time, generation by generation, locating their progenitors.

In the past, if one was interested in one's personal pedigree, in most cases, the most practical way to discover that pedigree was to do research in the pedigree sequence. Because there were serious technical barriers to any extensive cooperation, most of this kind of research would be done by a single person, mostly working in isolation. First, people only share the same exact pedigree with their full-blood siblings, which may be only a small pool of potential helpers, and as they go back in time, and the number of surnames they need to research multiplies, it becomes increasingly difficult to find other researchers who can offer significant assistance. For example, at 10 generations back there are 1024 surnames to research, and it is quite unlikely that our isolated researcher is going to find 1024 collaborators to assist in that process.

Second, without a cooperation tool as powerful as our current Internet, finding and cooperating with hundreds or thousands of other people was extremely difficult. Locating the names and addresses of those people who might conceivably offer assistance was very difficult, and actual communication with them was limited by the slow and uncertain means of surface mail. A two or three week turnaround time on inquiries, if they were answered at all, tended to stretch out the process almost interminably.

In the case of descendent sequence research, there may be a large pool of same-surname cousins who might be willing to assist in assembling the data on the descendents of a common ancestor. Close cousins will share many of the same ancestors, and more distant cousins will share fewer, but, by definition, all of these cousins will share at least one common ancestor.

Efficiency considerations
In today's information age, perhaps the most important feature of descendent research methodologies is that they allow enormous increases in researcher productivity.

(Detailed technical descriptions of 22 major efficiency improvements can be found in issued and pending patents. The largest of the 22 improvements can itself provide an overall productivity improvement of up to 1000 times. Also available is an 82-page e-book explaining the concepts and mechanisms of efficient and cooperative descendent-sequence research: Doing Genealogy the Henry Ford Way: Assembling High Quality Genealogy Data 2000 Times Faster Using Specialization and Cooperation.)

Industrialization and mass production techniques seen at the beginning of the 20th century, as with Henry Ford's automobile assembly lines, used specialization and cooperation to increase industrial productivity by thousands of times over the previous cottage industry methods of production. Adam Smith, in his famous work Wealth of Nations, observed that a pin manufacturing operation could be made from 240 to 4800 times more efficient through specialization and cooperation. Applying those same techniques to genealogical data assembly can give similar increases in individual researcher productivity.

As is often the case with re-engineering efforts, in order to achieve maximum efficiency and productivity, it is necessary to rearrange the process, which, in the case of genealogy, is the research and recording of historical names. If one is trying to complete a large block of genealogy research, as for a state or a nation, then efficiency considerations become very important. Names are researched and assembled in descendent sequence because that process is hundreds of times more efficient than the traditional pedigree-sequence method of locating relevant names. Once the names are all assembled in descendent sequence, and connections are made to show the marriages between different surname groups, then all possible pedigrees can be read out at the end. This is something like the process where perhaps 10,000 manufacturers prepare all the vehicle parts in advance, and then the final automobile is assembled in perhaps two hours' time by selecting from those standardized parts.

The two main elements of genealogy research efficiency are 1) specialization, accomplished by each participant (and any same-surname cousins who wish to assist) working on a single surname, beginning with an ancient ancestor and coming forward in time until the present, and 2) cooperation, accomplished by using a database system which is optimized for storing the resulting single-surname descendent structures logically side-by-side so that all marriage links between surname groups can be easily made using "same-person" links between women where they appear as daughters and where they appear as wives. When this process is finished, all pedigrees can be read out. This entire process can be hundreds of times more efficient in creating the desired pedigrees, even though it may seem more circuitous on the surface.
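The storage arrangement described above can be sketched in a few lines of Python. This is only an illustrative model, not the schema of any actual system: the `Record` fields, the `resolve` and `pedigree` helpers, and all given names are hypothetical. Each record lives in one surname structure, a wife's record carries a "same-person" link to her record as a daughter in her birth-surname structure, and a pedigree is read out by following those links.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Record:
    """One entry in a single-surname descendent structure (hypothetical schema)."""
    rid: str
    name: str
    structure: str                     # surname structure the record is stored in
    father: Optional[str] = None       # rid of the father, same structure
    mother: Optional[str] = None       # rid of the mother's record as a wife
    same_person: Optional[str] = None  # wife record -> her record as a daughter


def resolve(db: Dict[str, Record], rid: str) -> Record:
    """Follow a same-person link, if the linked daughter record exists."""
    rec = db[rid]
    if rec.same_person is not None:
        return db.get(rec.same_person, rec)
    return rec


def pedigree(db: Dict[str, Record], rid: str, generations: int) -> List[List[str]]:
    """Read out ancestor names, one list per generation going back."""
    levels: List[List[str]] = []
    current = [resolve(db, rid)]
    for _ in range(generations):
        parents = [
            resolve(db, parent_id)
            for rec in current
            for parent_id in (rec.father, rec.mother)
            if parent_id is not None
        ]
        if not parents:
            break
        levels.append([p.name for p in parents])
        current = parents
    return levels


# Two tiny surname structures with invented people.
db = {
    "B1": Record("B1", "John Bradford", "Bradford"),
    "B1w": Record("B1w", "Ruth Adams", "Bradford", same_person="A2"),
    "B2": Record("B2", "Thomas Bradford", "Bradford", father="B1", mother="B1w"),
    "A1": Record("A1", "Eli Adams", "Adams"),
    "A2": Record("A2", "Ruth Adams", "Adams", father="A1"),
}

print(pedigree(db, "B2", 2))
```

In this sketch the Bradford researcher never enters Eli Adams; he becomes part of Thomas Bradford's pedigree only through the same-person link into the Adams structure.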

The mathematics of genealogy
Traditional genealogy research methods are plagued by massive amounts of duplication of research efforts and of duplication in storing and displaying the results for others to see. The basic problem, of course, is that the isolated researchers have had no convenient way to find out what work has been done in the past that relates to their own research interests. Rather than spend enormous amounts of time trying to locate people who had already done work of interest, it was normally more efficient to simply do the research again from scratch. If a famous person has 10,000 descendents, and all those descendents decide to research their unique pedigrees by themselves, then somewhere in the world there might be 10,000 entries for that person appearing in genealogy results. If an attempt is made to assemble and merge those 10,000 research products in one central location and process, there could be massive inconsistencies and confusion as those collections of hundreds of thousands of names are merged together, where each of those collections includes that one particular ancestor's name.

All of this duplication of research, and all of the confusion in attempting to merge the many results, can be avoided if researchers agree to limit their research efforts to the same-surname descendents of a particular ancestor. For example, if researchers with the Bradford surname concentrate their research efforts on the Bradford surname descendents of a particular ancestor, and those researchers with the Adams surname similarly focus on only Adams surname individuals, etc., then there need be no duplication at all, or only minimal duplication.

The massive problems of traditional methods of doing pedigree-sequence research, and the need for the far more efficient descendent-sequence methods, can be illustrated by a few statistics: The LDS Church has had an active genealogy research program going on for more than 100 years, so they can provide some interesting statistical experience. Within the last few years they have discovered that the genealogy research which has been assembled in their central systems has been duplicated an average of 30 times, with 200 times being common, and 10,000 times being perhaps the largest case. This illustrates the staggering duplication, and thus waste, of valuable researcher time. Of the 1.5 billion entries in the database, only about 50 million names are unique. Looking at this another way, if there were 1.5 billion unique entries in the database, that would easily cover the entire United States and all of Western Europe, instead of just the tiny portion of it represented by the 50 million entries.
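As a quick consistency check on the figures quoted above, the average duplication can be recomputed from the two database totals. The variable names are illustrative only, and the inputs are the rounded numbers given in the text.

```python
TOTAL_ENTRIES = 1_500_000_000  # entries in the central database, per the text
UNIQUE_NAMES = 50_000_000      # unique names among those entries, per the text

average_duplication = TOTAL_ENTRIES / UNIQUE_NAMES
redundant_share = (TOTAL_ENTRIES - UNIQUE_NAMES) / TOTAL_ENTRIES

print(f"average duplication: {average_duplication:.0f} times")
print(f"share of entries that are repeats: {redundant_share:.1%}")
```

The result agrees with the stated average of 30 duplicates per unique name.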

Zero cooperation implies a 37,000 times potential duplication factor
Traditional methods of research in the United States imply a duplication factor of up to 37,000 times. For example, if all 320 million people in the United States each decided to research their pedigrees back 12 generations, they would need to assemble 2.6 trillion names to complete that task. Calculation: Going back 12 generations requires that each person assemble 8192 names, having a total of 4096 surnames. Those 8192 names for each person times 320 million people equals 2,621,440,000,000 names, or about 2.6 trillion names.

There were only about 70 million people who died in the United States before 1930, so if we divide 2.6 trillion by 70 million we get 37,142, or about 37,000. Obviously, it would be far more efficient to do all the research work on those 70 million people just one time, rather than have every person repeat their part of that research themselves and contribute to a massive amount of duplication. These staggering levels of potential duplication help explain why very little progress is made each year in completing the basic genealogy research for the entire United States.
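The calculation above can be reproduced with a short script. The names are illustrative, and the inputs are the rounded figures from the text. Note that summing ancestors over 12 generations gives 8190 names, which the text rounds to 8192, and that dividing the unrounded product by 70 million gives roughly 37,400 rather than 37,142; both come to about 37,000.

```python
POPULATION = 320_000_000          # living US population, per the text
DEATHS_BEFORE_1930 = 70_000_000   # rounded estimate, per the text
NAMES_PER_PEDIGREE = 8192         # figure used in the text for 12 generations


def ancestors_through(generations: int) -> int:
    """Total ancestors in generations 1..n, assuming no shared ancestors."""
    return sum(2 ** g for g in range(1, generations + 1))


def surnames_at(generations: int) -> int:
    """Ancestral lines (surnames) at generation n, assuming all are distinct."""
    return 2 ** generations


total_names = NAMES_PER_PEDIGREE * POPULATION
duplication_factor = total_names / DEATHS_BEFORE_1930

print(ancestors_through(12), surnames_at(12))
print(f"{total_names:,} names, duplication factor about {duplication_factor:,.0f}")
```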

Note: It is not critical to know the actual number of deaths in the US each year before 1930, but a reasonable estimate is useful in estimating the rate of duplication of research using the traditional methods which still dominate the genealogy industry. There do not seem to be any easily accessible mortality rates and tables prepared that show the estimated number of people who have died in the US since 1790, year by year. However, there are actual census totals for each decennial census, plus loosely estimated death rates for the period of 1790 to 1930. Those numbers have been placed together in a simple calculation which yields a total number of deaths of about 71.9 million for that period. A simple rounded number of 70 million is used in calculations in the body of the main article.

Perfect cooperation could finish the United States with two weeks work
Using descendent sequence research and maximum cooperation provides quite a different result. For example, if we asked the 4 million active genealogists in the United States to each do the research on 18 names and store the high-quality results properly in a central database, the 70 million people who died before 1930 could be quickly completed. If we allocated four hours work for each of those names, the entire project could be finished with two weeks work done by each of the participants.
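The workload estimate above can be checked the same way. The 40-hour work week is an assumption added here to convert hours into weeks; the text states only the result of about two weeks.

```python
import math

GENEALOGISTS = 4_000_000         # active US genealogists, per the text
DEATHS_BEFORE_1930 = 70_000_000  # people to be researched, per the text
HOURS_PER_NAME = 4               # research time allotted per name, per the text
HOURS_PER_WEEK = 40              # assumption: a standard full-time work week

names_each = math.ceil(DEATHS_BEFORE_1930 / GENEALOGISTS)
hours_each = names_each * HOURS_PER_NAME
weeks_each = hours_each / HOURS_PER_WEEK

print(f"{names_each} names and {hours_each} hours each, about {weeks_each:.1f} weeks")
```

Each participant's share works out to 18 names and 72 hours, a little under two full-time weeks.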

Changes in efficiency of communication methods
Communication methods among genealogists in modern times began with surface mail, but only in fairly recent times have e-mail communications come along to greatly speed up communications among cooperating genealogists. However, even e-mail communications have their problems, since e-mail addresses are often hard to find, and people may change their e-mail addresses far more often than they change their physical addresses. One of the advantages of a high-quality central genealogy database is that the overwhelming bulk of surface mail and e-mail communications are no longer necessary. Participants in the project simply put their best data into the central database, and anyone seeking that data can quickly find it, evaluate it, and perhaps make use of it (assuming they are given proper authorization). There would rarely be a need to formulate an e-mail request, and, more important, there would rarely be a need for data owners to take the potentially large amounts of time to prepare e-mail replies which might include data attachments. With the availability of an appropriate central database, participants can move from the inefficient ad hoc ways of communicating by surface mail or e-mail, and begin to use sophisticated and industrial-strength cooperation methods.

Thoroughness in finding all historical people
If the goal of a project is to finish a large block of names, such as for a state or for a nation (which also happens to be the most efficient way for all participating genealogists to get their own pedigree research done), one might want to be thorough in finding and recording the names of all historical individuals.

We might notice that only by using descendent-sequence research methods can we be sure that all people who have ever lived are actually documented. Obviously, people who are doing pedigree-sequence research, starting with themselves, are only going to come across those people who had lines of descendents which lasted until the present day. Where a child died without descendents, or their line of descendents did not continue until the current time, that child would not appear on any living person's pedigree chart. But that child was the descendent of someone, and descendent research for that family would naturally include that otherwise missing child.
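The coverage difference can be illustrated with a toy family. The sketch below is a simplification with invented identifiers: it tracks a single parent per person (one surname line) and compares who is found by tracing forward from the ancestor with who is found by tracing backward from living people.

```python
from typing import Dict, List, Set

# Hypothetical family: "b" died without descendents.
children: Dict[str, List[str]] = {
    "root": ["a", "b"],
    "a": ["c"],
    "b": [],
    "c": [],
}
living: Set[str] = {"c"}


def descend(tree: Dict[str, List[str]], root: str) -> Set[str]:
    """Everyone reached by descendent-sequence research from one ancestor."""
    found: Set[str] = set()
    stack = [root]
    while stack:
        person = stack.pop()
        if person not in found:
            found.add(person)
            stack.extend(tree.get(person, []))
    return found


def ascend(tree: Dict[str, List[str]], start: Set[str]) -> Set[str]:
    """Everyone reached by pedigree-sequence research from living people."""
    parent_of = {kid: parent for parent, kids in tree.items() for kid in kids}
    found: Set[str] = set()
    for person in start:
        current = person
        while current is not None and current not in found:
            found.add(current)
            current = parent_of.get(current)
    return found


missed = descend(children, "root") - ascend(children, living)
print(sorted(missed))
```

Here the backward search from the living descendent never encounters "b", while the forward search from "root" does.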

Graphic representations of descendent-sequence database


Figure 1 presents a conceptual view of a descendent sequence database. Figure 2 represents a single-surname descendent structure that might be contributed by one genealogy researcher who specializes in that surname. The idea is that each descendent structure begins with an ancient ancestor, perhaps 12 generations back, and comes forward until today. All the same-surname descendent structures for a nation can be logically placed side-by-side so that marriage connections can be made, as illustrated in figure 3. Women appear twice in this database, once as a daughter and once as a wife. "Same person" connections are made between the women as daughters and women as wives as a means of tying the same-surname descendent structures together. This allows all possible pedigrees to be read out from this descendent-sequence database. Figure 4 is another way of representing the finished database as shown in figure 3, showing example surnames and connections between women in their separate roles as daughters and wives. Figure 5 has two interpretations. 1) It was originally intended to illustrate the massive duplication of research and public presentation of data when tens of thousands of overlapping pedigrees are placed in a central database, especially when attempts are made to merge that voluminous data into a single structure. 2) With a change of viewpoint, it could also represent the fact that all possible pedigrees can be computed from the finished descendent-sequence database. The fact that the database was originally constructed in descendent sequence assures that there will be zero or minimal duplication within the database.

The extensive cooperation made possible by logically placing all of these single-surname descendent structures side-by-side means that the participants can receive a 1000-to-1 reward for their efforts. They put in one unit of data, which is the descendent structure related to their surname, and they receive access to the other 1023 surname structures they need to complete their 10-generation pedigree (which requires 1024 surnames to be complete).
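The 1000-to-1 figure follows from the doubling of ancestral lines in each generation. A minimal sketch, with illustrative function names, treats every line as a distinct surname; in practice some lines share a surname or an ancestor, so this is an upper bound.

```python
def surnames_needed(generations: int) -> int:
    """Ancestral lines at generation n, assuming every line is a distinct surname."""
    return 2 ** generations


def reward_ratio(generations: int) -> int:
    """Surname structures received from others for each one contributed."""
    return surnames_needed(generations) - 1


for n in (10, 12):
    print(n, surnames_needed(n), reward_ratio(n))
```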

Differentiate from reverse genealogy
Descendent sequence genealogy is sometimes referred to as "reverse genealogy" by those who are habituated to the traditional pedigree-sequence research methods. But that "reverse genealogy" term might seem a little strange to our ancestors, since life is inevitably lived in descendent sequence, that is, from parents to children. So our insistence that the only important genealogy data is that which goes from us backwards might seem a bit self-centered. In many other uses of the term genealogy, the assumption is often that the researcher is looking for a single important place from which a thing or idea, or a class of things or ideas had their beginnings – the place from which they descended and possibly spread out to have a wide effect.

We should take note that there are books and courses that point out the methods and advantages of descendent-sequence research. It appears that "reverse genealogy" is not really a separate method for doing genealogy research, but is normally treated simply as a technique for finding living cousins who may be able to help in furthering your pedigree-sequence research, or as a bag of tricks to find ways to go over, under, around, or through "brick walls" – breaks in the normal document chain of child-to-parent pedigree-sequence research.

If one of the most productive techniques for getting past "brick walls" in pedigree-sequence research is the use of descendent-sequence research to find new routes to research success, then we might expect that doing all original research in the inherently far more efficient and collaborative descendent sequence would solve in advance the overwhelming bulk of the "brick wall" situations people encounter today.

Apparently, the powerful mathematics of cooperation made possible by descendent-sequence research has never occurred to anyone in the normal or traditional genealogy research community. If more people became aware of the possibilities, great strides could be made in overall researcher efficiency.

It is computers and the Internet which make these great efficiencies easily possible today, but it is interesting to note that purely manual methods could have been used 100 years ago to get approximately the same effect, saving hundreds of millions of researcher hours of duplicated effort. The "computer" would have consisted of a large room containing 2000 4-drawer file cabinets, attended by 500 clerical workers, and could have completed all of the nation's basic genealogy research in about 17 years. The difficulty, of course, was in someone doing the theoretical work, and then finding an institution willing to fund and oversee this rather ambitious project.

The extremely large public interest in genealogy research these days, especially with the explosion of public record information resources on the Internet, seems to call for big strides forward in productivity for these many researchers. An estimated 4 million genealogists, spending perhaps two hours a day on their hobby or profession, would mean that about 3 billion hours were expended each year in the United States. If we assign a value of $20 an hour to that work, we reach a $60 billion industry. If we add another $6 billion for computers, Internet connections, books, courses, etc., we might assign the genealogy industry a value of $66 billion per year. The huge levels of duplication in the industry cry out for higher productivity solutions.
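The industry estimate above can be reproduced from its stated inputs. The names are illustrative and a 365-day year is assumed. Without the intermediate rounding the totals are slightly lower than the $60 billion and $66 billion quoted, which rest on rounding the hours up to 3 billion first.

```python
GENEALOGISTS = 4_000_000        # estimated active genealogists, per the text
HOURS_PER_DAY = 2               # per the text
DAYS_PER_YEAR = 365             # assumption: research every day of the year
HOURLY_VALUE = 20               # dollars per hour, per the text
OTHER_SPENDING = 6_000_000_000  # computers, Internet, books, courses, per the text

hours_per_year = GENEALOGISTS * HOURS_PER_DAY * DAYS_PER_YEAR
labor_value = hours_per_year * HOURLY_VALUE
industry_value = labor_value + OTHER_SPENDING

print(f"{hours_per_year:,} hours, ${labor_value:,} labor, ${industry_value:,} total")
```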

Implementation
The ideas briefly presented here are being implemented on the http://www.ProgenyLink.com and http://dev.ProgenyLink.com websites. The associated http://www.ProgenySociety.org website is being organized as an educational and philanthropic effort to train researchers and hobbyists in the use of these new techniques. An older website, http://www.GenReg.com, or http://www.GenealogyRegistry.com, was the site of most early testing of concepts and programming, beginning in 2000.

Books
Ruth Gardiner Hall, Descendants of Governor William Bradford (Ann Arbor, Mich., 1951)

Adam Smith, An Inquiry into the Nature and Causes of the Wealth of Nations (London, Methuen & Co., Ltd., 1776. 5th edition 1904.), book 1, chapter 1. Full text at http://www.bibliomania.com/2/1/65/112/frameset.html

E-books
Maureen A. Taylor, Research Strategies: Reverse Genealogy (PDF) $4.00 at http://www.shopfamilytree.com/product/10229/

Kent W. Huff, Doing Genealogy the Henry Ford Way: Assembling High Quality Genealogy Data 2000 Times Faster Using Specialization and Cooperation (2011), an 82-page book explaining in some detail the concepts and mechanisms of efficient and cooperative descendent-sequence research. Free at http://dev.progenylink.com/docs/20110421Doing%20Genealogy%20the%20Henry%20Ford%20WayV02.pdf

Courses
Reverse Genealogy: Working Forward to Break Down Brick Walls, taught by Lisa Louise Cooke http://www.familytreeuniversity.com/reverse-genealogy

Patents
Genealogy Registry System, patent US 6,760,731 B2 issued July 6, 2004.

Efficient Genealogy Registry System, provisional (improvement) patent filed January 2011.