User:Sigma0 1/ma

Micah Altman is an American social and information scientist who conducts research in social science informatics. He is known for his pioneering work on computational models of electoral districting and for his contributions to the research methodology of social sciences, especially data curation and statistical computing.

Biography
Micah Altman was born on August 31, 1967, in St. Louis.

He studied computer science and political philosophy at Brown University and graduated with a double B.A. degree in 1989. He went to graduate school at the California Institute of Technology, where he studied social science under Morgan Kousser and received a Ph.D. in 1998. After that, he was a postdoctoral researcher in Gary King's research group at Harvard.

Altman started his professional career as a software engineer at Sun Microsystems and Silicon Graphics. Since 1997, he has worked at Harvard University, both in administrative positions, as the associate director of the Harvard-MIT Data Center and the archival director of the Henry A. Murray Archive, and in research positions, as a research fellow at the Center for Basic Research in Social Science and a scientist at the Institute for Quantitative Social Science. He is currently the director of data archiving and acquisition at Harvard University's Institute for Quantitative Social Science, a non-resident senior fellow at the Brookings Institution in Washington, D.C., and, since March 2012, the director of research at the MIT Libraries.

Altman lives in Somerville, Massachusetts, and has two children.

Electoral districting and redistricting
Altman's contributions to electoral districting and redistricting have been both theoretical and practical. For his doctoral research at Caltech, he studied the fundamental limits on automating redistricting. He showed that the number of different ways to partition a region into electoral districts is prohibitively large for all but trivial cases, and that the districting problem is NP-hard and hence likely to be intractable without further constraints and heuristics (Altman 1997).
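The growth in the number of possible plans can be illustrated with a small counting sketch. It is not taken from Altman (1997). It assumes that a plan is any split of n population units into k non-empty districts and ignores contiguity and equal-population requirements, so it counts more plans than would be legally admissible. Under that assumption the count is the Stirling number of the second kind; the function name stirling2 is chosen here for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Ways to split n labeled units into k non-empty, unlabeled groups."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # The last unit either forms a new group on its own,
    # or joins one of the k groups formed by the other n - 1 units.
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)


# Hypothetical sizes, chosen only to show how fast the count grows.
for units, districts in [(4, 2), (10, 3), (50, 5)]:
    print(units, districts, stirling2(units, districts))
```

Even for 50 units and 5 districts the count has more than 30 digits, which is why exhaustive enumeration is impractical for realistic maps.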

This result has two undesirable implications. First, redistricting cannot be fully automated in practice: the choice of constraints and the manual selection of the winning, "optimal" plan from a group of auto-generated plans reintroduce value-laden and politically biased decision making into the redistricting process, which the use of "objective" computer programs was hoped to avoid. Second, automation may legitimize such covert gerrymandering in the eyes of a less knowledgeable public (Altman 1997).

Altman's computational simulations also showed (Altman 1998) that even constraints traditionally considered politically neutral, such as the overall compactness of districts, are not necessarily so: compactness requirements affect political groups differently when the groups are distributed differently across geography. This result was referenced by Supreme Court justices in Vieth v. Jubelirer.

Altman and his colleagues later created the open-source BARD software (Altman and McDonald 2011) and the DistrictBuilder software to enable users to automatically determine district boundaries on the basis of demographic data (voting age, race, median income, education) and other criteria such as district compactness and contiguity. They address the computational complexity of the districting problem by using metaheuristics (such as simulated annealing, genetic algorithms, tabu search and greedy randomized adaptive search) to refine auto-generated or pre-existing plans, and implement performance enhancements such as evaluation caching, explicit memory management, and distributed computing (Altman and McDonald 2011). These solutions minimize the human intervention needed to narrow down the number of district plans but do not fully eliminate it.
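The idea of refining a plan with a metaheuristic can be sketched with a toy simulated annealing loop. This is not BARD's code or interface. It assumes a single objective (population balance), ignores contiguity and compactness, and uses hypothetical names (population_imbalance, anneal) and made-up population figures.

```python
import math
import random


def population_imbalance(assignment, populations, n_districts):
    """Sum of absolute deviations of district populations from the ideal size."""
    totals = [0] * n_districts
    for unit, district in enumerate(assignment):
        totals[district] += populations[unit]
    ideal = sum(populations) / n_districts
    return sum(abs(t - ideal) for t in totals)


def anneal(populations, n_districts, steps=20000, start_temp=50.0, seed=0):
    """Toy simulated annealing: move one unit at a time between districts."""
    rng = random.Random(seed)
    current = [rng.randrange(n_districts) for _ in populations]
    current_cost = population_imbalance(current, populations, n_districts)
    best, best_cost = list(current), current_cost
    for step in range(steps):
        # Linear cooling schedule; the small offset avoids division by zero.
        temp = start_temp * (1 - step / steps) + 1e-9
        unit = rng.randrange(len(populations))
        old_district = current[unit]
        current[unit] = rng.randrange(n_districts)
        new_cost = population_imbalance(current, populations, n_districts)
        delta = new_cost - current_cost
        # Always accept improvements; accept worse plans with a probability
        # that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current_cost = new_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        else:
            current[unit] = old_district  # reject the move
    return best, best_cost


# Hypothetical unit populations, for illustration only.
example_populations = [120, 80, 100, 95, 105, 60, 140, 100]
plan, cost = anneal(example_populations, n_districts=4)
print(plan, cost)
```

Recomputing the full cost after every move keeps the sketch short; an evaluation cache of the kind mentioned above would instead update only the two affected district totals.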

Data curation
Altman's research in data curation is connected to his work at Harvard libraries and data archives, especially the Virtual Data Center project that he led with Sidney Verba and its successor, the Dataverse Network. He has studied ways of improving the methodologies for preserving, archiving and cataloging research data in the social sciences, and methods for distributing and disseminating the data for reuse by other researchers. To yield reliable and comparable results, standard methods of data encoding are needed for data attribution and data citation, and for maximally accurate data verification and replication (Altman, Gill and McDonald 2004).

Altman and King (2007) proposed a standard for citing quantitative data, similar to the existing standards for citing papers and the analyses performed on the data, as no such standard for data citation existed before. The citation information they recommended included two elements. The first is a unique global identifier: a short character string, guaranteed to be unique among all such identifiers, that permanently identifies the data set independent of its location. The second is a universal numeric fingerprint: a fixed-length string of numbers and characters that summarizes all the content in the data set, such that a change in any part of the data would produce a different fingerprint.

The data fingerprints they proposed are based on checksums and can be created by applying hash functions to normalized and approximated data (Altman, Gill and McDonald 2004). They can be used in statistical applications to prevent misinterpretations of data, and to verify content and format during data migration and archiving (Altman 2008). The algorithm for generating the fingerprints has undergone several revisions because the initial versions underestimated the expressive power needed to encode the data (Altman 2008) and the simpler algorithm inherited the weaknesses of the MD5 hash function, which was shown to have several vulnerabilities.
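The general idea of hashing normalized and approximated data can be sketched as follows. This is not the published fingerprint algorithm. The function name toy_fingerprint, the rounding to seven significant digits, the text encoding and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def toy_fingerprint(values, digits=7):
    """Hash a numeric vector after rounding to `digits` significant digits."""
    hasher = hashlib.sha256()
    for value in values:
        # Normalize and approximate: a fixed exponential notation makes
        # 1, 1.0 and 1.00000000001 encode to the same byte string.
        canonical = format(float(value), f".{digits - 1}e")
        hasher.update(canonical.encode("ascii") + b"\n")
    return hasher.hexdigest()


original = [1, 2.5, 300.0]
migrated = [1.0, 2.50000000001, 3e2]  # same data after a format conversion
altered = [1, 2.6, 300.0]             # one value changed

print(toy_fingerprint(original) == toy_fingerprint(migrated))
print(toy_fingerprint(original) == toy_fingerprint(altered))
```

Because the hash is computed over a canonical representation rather than over the raw file, the fingerprint survives a change of storage format but not a change in the data values.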

Statistical computing
Precise predictions are hard to make in the social sciences because experiments involve a great number of variables whose values are often entangled, complex or hard to quantify. Altman, Gill and McDonald (2004), an advanced-level reference book on computational statistics for social scientists, shows that these problems are frequently compounded by measurement errors and numerical inaccuracies that arise in statistical computing. The sources of these errors range from un-modeled measurement errors to software errors in statistical packages, errors in data input, data that is ill-conditioned for a particular model, floating point underflows and overflows, rounding errors, non-random structures in random number generators, local optima or discontinuities in optimization, inappropriate or unlucky choices of starting values and inadequate stopping criteria.
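A standard illustration of how rounding errors can distort a routine statistic is the sample variance. The sketch below is not taken from the book. It compares the one-pass textbook formula with a two-pass computation on made-up data whose values are large relative to their spread; the function names are hypothetical.

```python
def variance_one_pass(xs):
    """Textbook formula (sum of squares minus squared sum); loses digits."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    # The two terms are nearly equal, so their difference is dominated
    # by rounding error (catastrophic cancellation).
    return (sum_x2 - sum_x * sum_x / n) / (n - 1)


def variance_two_pass(xs):
    """Subtract the mean first, then square the small deviations."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Made-up data: the exact sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print(variance_one_pass(data))
print(variance_two_pass(data))
```

In double precision the squares are near 1e18, where adjacent representable numbers are more than a hundred apart, so the one-pass result cannot equal 30, while the two-pass result is computed from small, exactly representable deviations.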

The book shows that knowledge of the numerical methods underlying computerized statistical calculations, and of how they are used in statistical packages, is essential for planning quantitative studies in the social sciences and for making accurate inferences. It offers techniques and diagnostic tests to detect such problems, together with prescriptions for good statistical computing practice that result in greater accuracy, precision, robustness, sensitivity and reproducibility (Altman, Gill and McDonald 2004).

Selected works

 * Micah Altman (1997). The computational complexity of automated redistricting: Is automation the answer? Rutgers Computer and Technology Law Journal, 81-141.
 * Micah Altman (1998). Modeling the effect of mandatory district compactness on partisan gerrymanders. Political Geography, 17(8), 989-1012.
 * Micah Altman, Jeff Gill and Michael P. McDonald (2004). Numerical issues in statistical computing for the social scientist. Wiley Series in Probability and Statistics. Hoboken, NJ: John Wiley & Sons. ISBN 0471236330
 * Micah Altman and Gary King (2007). A Proposed Standard for the Scholarly Citation of Quantitative Data. D-Lib Magazine, 13(3/4). ISSN 1082-9873.
 * Micah Altman (2008). A Fingerprint Method for Scientific Data Verification. Advances in Systems, Computing Sciences and Software Engineering: Proceedings of the International Conference on Systems, Computing Sciences and Software Engineering 2007. Springer Verlag, 311-316.
 * Micah Altman and Michael P. McDonald (2011). BARD: Better automated redistricting. Journal of Statistical Software, 42(4), 1-28.

Recognition
Altman received the 1999 Outstanding Dissertation Award from the Western Political Science Association for his doctoral dissertation "Districting Principles and Democratic Representation". The software he has created has been recognized with awards from the American Political Science Association.

Altman is mentioned in the 57th, 58th and 63rd editions of Who's Who in America.