Human Speechome Project

The Human Speechome Project ("speechome" as an approximate rhyme for "genome") is an effort to closely observe and model the language acquisition of a child over the first three years of life.

The project was conducted at the Massachusetts Institute of Technology's Media Laboratory by Associate Professor Deb Roy. An array of technology was used to observe a single child, Roy's own son, comprehensively but unobtrusively, and the resulting data was used to create computational models intended to yield further insight into language acquisition.

Detail
Most studies of human speech acquisition in children have been done in laboratory settings and with sampling rates of only a couple of hours per week. The need for studies in the more natural setting of the child's home, and at a much higher sampling rate approaching the child's total experience, led to the development of this project concept.

"Just as the Human Genome Project illuminates the innate genetic code that shapes us, the Speechome project is an important first step toward creating a map of how the environment shapes human development and learning," said Frank Moss, director of the Media Lab.

A digital network consisting of eleven video cameras, fourteen microphones, and an array of data capture hardware was installed in the home of the subject. A cluster of ten computers and audio samplers in the basement of the house captured the data. Data from the cluster was moved manually to the MIT campus as necessary for storage in a one-million-gigabyte (one-petabyte) storage facility.

To give the occupants of the house control of the observation system, eight touch-activated displays were wall-mounted throughout the house. These allowed the occupants to stop and start video and/or audio recording, and to permanently erase any number of minutes from the system. Audio recording was turned off throughout the house at night after the child was asleep.

Data was gathered at an average rate of 200 gigabytes per day, which necessitated the development of sophisticated data-mining tools to reduce the analysis effort to a manageable level. Transcribing significant speech added a further labor-intensive dimension.
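
The scale implied by these figures can be illustrated with a rough calculation. The sketch below takes the 200 gigabytes per day and the one-million-gigabyte facility from the text, and assumes (for illustration only) that recording ran every day at the average rate for three 365-day years. The function names are hypothetical, and the result is an upper-bound estimate, not a reported total for the project.

```python
# Figures taken from the text above.
GB_PER_DAY = 200          # average capture rate
CAPACITY_GB = 1_000_000   # one petabyte, expressed as one million gigabytes

# Assumption for illustration: 365-day years, recording every day.
DAYS_PER_YEAR = 365


def total_gb(days, rate=GB_PER_DAY):
    """Raw data volume after `days` of recording at a constant daily rate."""
    return days * rate


def days_until_full(capacity=CAPACITY_GB, rate=GB_PER_DAY):
    """Number of days of recording the storage facility could hold."""
    return capacity / rate


three_years = total_gb(3 * DAYS_PER_YEAR)
print(f"Three years of recording: {three_years} GB")
print(f"Share of storage used: {three_years / CAPACITY_GB:.1%}")
print(f"Days until storage is full: {days_until_full():.0f}")
```

Under these assumptions, three years of continuous capture would come to roughly 219,000 gigabytes, a little over a fifth of the storage facility's stated capacity.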