Geodemographic segmentation

In marketing, geodemographic segmentation is a multivariate statistical classification technique for discovering whether the individuals of a population fall into distinct groups. It makes quantitative comparisons of multiple characteristics, on the assumption that the differences within any group should be smaller than the differences between groups.

Principles
Geodemographic segmentation is based on two simple principles:
 * People who live in the same neighborhood are more likely to have similar characteristics than are two people chosen at random.
 * Neighborhoods can be categorized in terms of the characteristics of the population they contain. Two neighborhoods can be placed in the same category if they contain similar types of people, even when they are geographically far apart.

Clustering algorithms
Different algorithms lead to different results, and there is no single best approach for choosing among them, just as no algorithm offers a theoretical guarantee that its result is correct. One of the most frequently used techniques in geodemographic segmentation is the k-means clustering algorithm; most current commercial geodemographic systems are based on it. However, clustering techniques drawn from artificial neural networks, genetic algorithms, or fuzzy logic are more efficient within large, multidimensional databases (Brimicombe 2007).
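
As an illustration of the k-means idea mentioned above, the following is a minimal sketch and not the method of any commercial system. The area data (median age, income in thousands) are invented, the function name `kmeans` is hypothetical, and using the first k points as starting centroids is an assumption made only to keep the example deterministic. In practice the variables would normally be standardized first so that no single one dominates the distance.

```python
import math

def kmeans(points, k, iterations=100):
    """Lloyd-style k-means on a list of equal-length numeric tuples."""
    # Assumption: the first k points are used as starting centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids

# Hypothetical small areas: (median age, median income in thousands).
areas = [(25, 30), (60, 80), (27, 32), (62, 85), (24, 28), (58, 78)]
labels, centroids = kmeans(areas, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

Each area receives exactly one label, which is the hard assignment that k-means-based classifications produce.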

Neural networks can handle non-linear relationships, are robust to noise, and exhibit a high degree of automation. They make no assumptions about the nature or distribution of the data, and they can help with problems of a geographical nature that have previously proved intractable. One of the best-known and most efficient neural network methods for unsupervised clustering is the Self-Organizing Map (SOM). SOM has been proposed as an improvement over the k-means method because it provides a more flexible approach to census data clustering. Spielman and Thill (2008) used the SOM method to develop a geodemographic clustering of a census dataset for New York City.
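
The SOM idea can be sketched with a very small one-dimensional map. This is an illustrative toy, not the procedure used by Spielman and Thill: the function names `train_som` and `best_matching_unit`, the linear decay schedules, the evenly spaced starting weights, and the sample data (two variables already scaled to the range 0 to 1) are all assumptions made for the example.

```python
import math

def best_matching_unit(nodes, sample):
    """Index of the node whose weight vector is closest to the sample."""
    return min(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))

def train_som(samples, n_nodes=4, epochs=50, lr0=0.5, sigma0=1.0):
    """Train a 1-D self-organizing map; returns the node weight vectors."""
    dims = len(samples[0])
    lows = [min(s[d] for s in samples) for d in range(dims)]
    highs = [max(s[d] for s in samples) for d in range(dims)]
    # Assumption: nodes start evenly spaced between each variable's min and max.
    nodes = [
        [lows[d] + (highs[d] - lows[d]) * i / (n_nodes - 1) for d in range(dims)]
        for i in range(n_nodes)
    ]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs      # decays linearly from 1 towards 0
        lr = lr0 * frac                  # learning rate
        sigma = sigma0 * frac + 0.01     # neighborhood width, kept above zero
        for s in samples:                # fixed order; real code often shuffles
            winner = best_matching_unit(nodes, s)
            for i, node in enumerate(nodes):
                # Gaussian neighborhood: nodes near the winner move the most.
                h = math.exp(-((i - winner) ** 2) / (2 * sigma ** 2))
                for d in range(dims):
                    node[d] += lr * h * (s[d] - node[d])
    return nodes

# Hypothetical areas described by two variables scaled to [0, 1].
samples = [(0.1, 0.2), (0.15, 0.1), (0.2, 0.15),
           (0.8, 0.9), (0.9, 0.85), (0.85, 0.8)]
nodes = train_som(samples)
print([best_matching_unit(nodes, s) for s in samples])
```

Unlike k-means, the nodes are ordered along the map, so areas assigned to neighboring nodes tend to be more alike than areas assigned to distant nodes.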

Another way of characterizing an individual polygon's similarity to all the regions is based on fuzzy logic. In binary logic, membership is a yes–no matter: an object either belongs to a cluster or does not. The basic concept of fuzzy clustering is that an object may belong to more than one cluster, so a spatial unit can be assigned to several clusters with varying membership values. Most studies of geodemographic analysis with fuzzy logic employ the Fuzzy C-Means algorithm and the Gustafson-Kessel algorithm (Feng and Flowerdew 1999).
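
The membership idea can be illustrated with a compact sketch of the standard Fuzzy C-Means update rules. The data, the function names `membership_row` and `fuzzy_c_means`, the choice of starting centers, and the fuzzifier value m = 2 are assumptions made for this example; the Gustafson-Kessel variant, which adapts the distance measure for each cluster, is not shown.

```python
import math

def membership_row(point, centers, m):
    """Membership of one point in every cluster; the values sum to 1."""
    dists = [math.dist(point, c) for c in centers]
    if 0.0 in dists:
        # The point sits exactly on a center: full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    centers = [tuple(c) for c in centers]
    dims = len(points[0])
    for _ in range(iterations):
        memberships = [membership_row(p, centers, m) for p in points]
        new_centers = []
        for i in range(len(centers)):
            # Each center is a mean weighted by membership raised to m.
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dims)
            ))
        centers = new_centers
    # Memberships with respect to the final centers.
    return [membership_row(p, centers, m) for p in points], centers

# Hypothetical areas (two scaled variables); the last one is a mixed area.
areas = [(0.1, 0.2), (0.15, 0.1), (0.2, 0.15),
         (0.8, 0.9), (0.9, 0.85), (0.85, 0.8), (0.5, 0.5)]
# Assumption: one point from each apparent group is used as a starting center.
memberships, centers = fuzzy_c_means(areas, centers=[areas[0], areas[3]])
for area, row in zip(areas, memberships):
    print(area, [round(u, 2) for u in row])
```

Areas close to one center get a membership near 1 in that cluster, while the mixed area in the middle is split roughly evenly between the two, which a hard classification cannot express.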

Systems
Well-known geodemographic segmentation systems include Claritas Prizm (US), CanaCode Lifestyles (Canada), PSYTE HD (Canada), Tapestry (US), CAMEO (UK), ACORN (UK), and MOSAIC (UK). New systems targeting population subgroups are also emerging. For example, Segmentos examines the geodemographic lifestyles of Hispanics in the United States. Both MOSAIC and ACORN use onomastics to infer ethnicity from residents' names.

CanaCode Lifestyle Clusters
CanaCode Lifestyle Clusters is developed by Manifold Data Mining and classifies Canadian postal codes into 18 distinct major lifestyle groups and 110 niche lifestyles. It uses current-year statistics on over 10,000 variables ranging from demographics to socioeconomic factors to expenditures to lifestyle traits (e.g. consumer behaviors) including product usage, media usage, and psychographics.

PSYTE HD
PSYTE HD Canada is a geodemographic market segmentation system that classifies Canadian postal codes and Dissemination Areas into 57 mutually exclusive lifestyle groups and neighborhood types. It is built on the Canadian Census demographic and socioeconomic base, combined with various third-party data inputs. The resulting clusters are intended to give an accurate snapshot of Canadian neighborhoods. PSYTE HD Canada is used for analyzing customer data and potential markets, gaining market intelligence, and interpreting consumer behavior across the Canadian marketplace.

CAMEO system
The CAMEO Classifications are a set of consumer classifications that are used internationally by organisations as part of their sales, marketing and network planning strategies.

CAMEO UK is built at postcode, household and individual level and classifies over 50 million British consumers. It segments the British market into 68 distinct neighbourhood types and 10 key marketing segments.

Internationally, Global CAMEO is the largest consumer segmentation system in the world, covering 40 nations. There is also a single global classification, CAMEO International, which segments across borders.

CAMEO was developed and is maintained by Callcredit Information Group.

Acorn system
A Classification Of Residential Neighbourhoods (Acorn) is developed by CACI in London. It is the only geodemographic tool currently available that is built using current-year data rather than 2011 Census information. Acorn is used to analyse and understand consumers, with the aim of increasing engagement with customers and service users and supporting strategies across all channels. Acorn segments all 1.9 million UK postcodes into 6 categories, 18 groups and 62 types.

MOSAIC system
Mosaic UK is Experian's people classification system. It was originally created by Prof Richard Webber (visiting Professor of Geography at King's College London) in association with Experian. The latest version of Mosaic was released in 2009. It classifies the UK population into 15 main socio-economic groups and, within these, 66 different types.

Mosaic UK is part of a family of Mosaic classifications that covers 29 countries including most of Western Europe, the United States, Australia and the Far East.

Mosaic Global is Experian's global consumer classification tool. It is based on the simple proposition that the world's cities share common patterns of residential segregation. Mosaic Global is a consistent segmentation system that covers over 400 million of the world's households using local data from 29 countries. It has identified 10 types of residential neighbourhood that can be found in each of the countries.

geoSmart system
In Australia, geoSmart is a geodemographic segmentation system based on the principle that people with similar demographic profiles and lifestyles tend to live near each other. It is developed by an Australian supplier of geodemographic solutions, RDA Research.

geoSmart geodemographic segments are produced from the Australian Census (Australian Bureau of Statistics) demographic measures and modeled characteristics, and the system is updated for recent household growth. The clustering creates a single segment code that is represented by a descriptive statement or a thumbnail sketch.

In Australia, geoSmart is mainly used for database segmentation, customer acquisition, trade area profiling and letterbox targeting, although it can be used in a broad range of other applications.

The Output Area Classification
The Output Area Classification (OAC) is the UK Office for National Statistics' (ONS) free and open geodemographic segmentation based upon the UK Census of Population 2011. It uses 41 census variables to assign each census output area to a three-tier classification of 7, 21, and 52 groups.

The perceived advantages of OAC over commercial classifications stem from the fact that the methodology is open and documented, and that the data is freely available to both the public and commercial organizations, subject to licensing conditions.

OAC has a wide variety of potential applications, from geographic analysis to social marketing and consumer profiling. The UK public sector is one of the main users of OAC.

ESRI Community Tapestry
Tapestry classifies US neighborhoods into 67 market segments based on socioeconomic and demographic factors, then consolidates these segments into 14 LifeModes with names such as "High Society", "Senior Styles", and "Factories and Farms". The finest spatial granularity at which the data is produced is the U.S. Census Block Group.

See also
 * Market segmentation