User:The Jedi Math Squirrel/newsandbox

Coverage Error
Coverage error is one type of total survey error. In survey sampling, a sampling frame is the list from which a random sample of the population is drawn. In a census, a sampling frame is still used, but the intent is to include the entire population. Differences between the target population of the survey and the sampling frame result in coverage error.

Twitter provides an example. Suppose a researcher uses Twitter to determine public opinion in the United States on a recent action taken by the nation's current President. The demographics of U.S. Twitter users are not representative of the demographics of the U.S. population, so the data source introduces a type of error called undercoverage. Undercoverage is the error that results when part of the target population is missing from the sampling frame. In addition, not every account belongs to an individual, and one individual might have multiple accounts, so the data source also introduces a type of error called overcoverage. Overcoverage is the error that results when data exist for entities that should not be counted, or when entities are counted more than once. Both undercoverage and overcoverage can bias the results.
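
The two definitions above can be illustrated with a small Python sketch. All names and accounts here are made up for illustration, and the two rates are only one simple way to summarize the mismatch between a frame and a target population, not a standard estimator.

```python
# Hypothetical target population (people whose opinion the survey wants).
target_population = {"ana", "ben", "cara", "dev", "eli"}

# Hypothetical frame: account -> owner. None marks an account that does not
# belong to an individual (for example a bot or an organization).
frame = {
    "@ana": "ana",
    "@ana_alt": "ana",   # second account for the same person
    "@ben": "ben",
    "@newsbot": None,    # not an individual
    "@cara": "cara",
}

owners = [owner for owner in frame.values() if owner is not None]
covered = set(owners) & target_population

# Undercoverage: members of the target population missing from the frame.
undercovered = target_population - covered

# Overcoverage: frame entries outside the target population, plus
# extra entries for people who appear more than once.
ineligible = [acct for acct, owner in frame.items() if owner not in target_population]
duplicates = len(owners) - len(set(owners))

undercoverage_rate = len(undercovered) / len(target_population)
overcoverage_rate = (len(ineligible) + duplicates) / len(frame)

print(sorted(undercovered), ineligible, duplicates)
print(undercoverage_rate, overcoverage_rate)
```

In this toy frame, two of the five people have no account at all, while one frame entry is not a person and one person is listed twice.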

Ways to Quantify Coverage Error
Coverage error can be quantified by using methods from mathematical statistics to identify a plausible statistical model. (Note: numerous journal articles can be cited here. Maybe Mike can add the example from the 2010 census that modeled the frame using a zero-inflated negative binomial model; see Zero-inflated model.)

One method to quantify coverage error is to perform an evaluation study. This approach is similar to mark-recapture methodology. For example, suppose a census has been conducted. After the census is complete, a random sample of areas is drawn from the frame and counted again. The difference between the two counts for each sampled area is used to estimate coverage error.
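
The evaluation-study idea can be sketched in Python as follows. The area names and counts are invented, and treating the recount as the more accurate count is an assumption of this sketch. The second function is the classic Lincoln-Petersen estimator from mark-recapture studies, included only to show the analogy; it is not claimed to be the method used in any particular census.

```python
def net_coverage_error(census_counts, recounts):
    """Compare census counts with recounts over the areas present in both.

    Returns (difference, relative difference). A positive difference means
    the recount found more people than the census (net undercoverage).
    """
    areas = census_counts.keys() & recounts.keys()
    census_total = sum(census_counts[a] for a in areas)
    recount_total = sum(recounts[a] for a in areas)
    diff = recount_total - census_total
    return diff, diff / recount_total


def lincoln_petersen(n_first, n_second, n_matched):
    """Mark-recapture estimate of population size: N = n1 * n2 / m."""
    if n_matched == 0:
        raise ValueError("no matched records; the estimate is undefined")
    return n_first * n_second / n_matched


# Hypothetical counts for three re-counted areas.
census_counts = {"area_1": 480, "area_2": 1020, "area_3": 750}
recounts = {"area_1": 500, "area_2": 1000, "area_3": 800}

diff, rate = net_coverage_error(census_counts, recounts)
print(diff, round(rate, 4))

# Hypothetical matching: 900 people in the census, 100 in the evaluation
# sample, 90 of whom were also found in the census.
print(lincoln_petersen(900, 100, 90))
```

Note that a net figure can hide offsetting errors: area_2 is overcounted while the other two areas are undercounted, so per-area differences are worth inspecting as well.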

Ways to Reduce Coverage Error
One way to reduce coverage error is to rely on multiple sources, either to build the sampling frame or to solicit information. This is called a mixed-mode approach. For example, Washington State University students conducted Student Experience Surveys by building a sampling frame from both street addresses and email addresses. In another example, the 2010 U.S. Census relied primarily on residential mail responses, and field interviewers were then deployed to follow up with nonrespondents. This approach had the added benefit of reducing cost, because the majority of people responded by mail and did not require a field visit.
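
As a rough illustration of building a frame from two sources, the Python sketch below merges a hypothetical address list and a hypothetical email list. It assumes both lists share a unit identifier, which real frames often lack; in practice, record linkage is usually the hard part.

```python
# Hypothetical source lists keyed by a shared unit identifier.
address_frame = {"s01": "12 Elm St", "s02": "9 Oak Ave", "s03": "4 Pine Rd"}
email_frame = {
    "s02": "s02@example.edu",
    "s03": "s03@example.edu",
    "s04": "s04@example.edu",
}

# The combined frame covers every unit that appears in either source,
# listing each unit once with whatever contact modes are available.
combined = {}
for unit in sorted(address_frame.keys() | email_frame.keys()):
    combined[unit] = {
        "address": address_frame.get(unit),
        "email": email_frame.get(unit),
    }

for unit, contact in combined.items():
    print(unit, contact)
```

Either source alone would cover three units; together they cover four, and units present in both sources are not duplicated.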

Another way to reduce coverage error is to use paradata. An example is using paradata to produce a sampling frame of telephone numbers. Suppose the target population is households. Because telephone numbers can belong to businesses, overcoverage is a concern. One method assigns each phone number a score that indicates how likely the number is to belong to a person rather than a business.
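
The text does not say how such scores are computed or used. The Python sketch below assumes, purely for illustration, that each number already has a score between 0 and 1 (higher meaning more likely residential) and that numbers below a chosen cutoff are dropped from the frame. The numbers, scores, and cutoff are all hypothetical.

```python
# Hypothetical scores: estimated likelihood that a number is residential.
scored_numbers = {
    "555-0101": 0.92,
    "555-0102": 0.15,
    "555-0103": 0.68,
    "555-0104": 0.40,
}

# Assumed cutoff; a real study would need to justify this choice.
THRESHOLD = 0.5

# Keep only the numbers judged likely to belong to households.
household_frame = sorted(
    number for number, score in scored_numbers.items() if score >= THRESHOLD
)
print(household_frame)
```

A higher cutoff removes more business numbers (less overcoverage) at the risk of dropping real households (more undercoverage), so the choice is a trade-off.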

2010 Census
Coverage Follow-Up (CFU) and Field Verification (FV) were United States government operations in the 2010 Census that were formed to improve upon the 2000 Census. These operations were intended to address the following types of coverage error: not counting someone who should have been counted; counting someone who should not have been counted; and counting someone who should have been counted, but at the wrong location. Coverage errors in the U.S. Census can cause groups of people to be underrepresented by the government. Of particular concern are "differential undercounts", in which some demographic groups are undercounted. Although the CFU and FV operations improved the accuracy of the 2010 Census, more study was recommended to address the question of differential undercounts.