Affinity analysis

Affinity analysis is a data mining technique that uncovers meaningful correlations between entities according to their co-occurrence in a data set. Applied to almost any system or process, it can extract significant knowledge about unexpected trends. Affinity analysis studies attributes that occur together, which helps uncover hidden patterns in large data sets by generating association rules. The association rule mining procedure has two steps: first, it finds all frequent itemsets (sets of attributes that frequently occur together) in a data set; then it generates association rules from those itemsets that satisfy predefined criteria, namely minimum support and confidence, to identify the most important relationships. In the first step, the co-occurrences of attributes in the data set are counted, and the combinations that meet the support threshold are collected as frequent itemsets. This process is repeated with larger candidate itemsets until no additional frequent itemsets are found. An association rule takes the form "if a condition or feature (A) is present, then another condition or feature (B) is also present". The first condition or feature (A) is called the antecedent and the second (B) is called the consequent. Two metrics are central to association rule mining: support and confidence. The Apriori algorithm is commonly used to reduce the search space of the problem.
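The frequent itemset step described above can be illustrated with a minimal level-wise sketch in Python. It is not a reference implementation: the function name frequent_itemsets, the sample baskets, and the 0.6 support threshold are assumptions made for illustration. The pruning step relies on the Apriori property that every subset of a frequent itemset must itself be frequent.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = {}
    k = 1
    while current:  # stop when a level yields no frequent itemsets
        frequent.update(current)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
    return frequent


# Hypothetical baskets, used only to exercise the sketch.
baskets = [
    {"shampoo", "conditioner", "soap"},
    {"shampoo", "conditioner"},
    {"shampoo", "soap"},
    {"conditioner", "soap"},
    {"shampoo", "conditioner", "towel"},
]
result = frequent_itemsets(baskets, min_support=0.6)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With this sample data, the pairs containing soap fall below the threshold, so no three-item candidate is ever generated; this is the search-space reduction that the pruning step provides.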

The support of an association rule is defined as the frequency with which the antecedent and consequent appear together in a data set. Confidence expresses the reliability of the rule and is given by the ratio of the data records containing both A and B to the data records containing A. The minimum thresholds for support and confidence are inputs to the model. Given these definitions, affinity analysis can develop rules that predict the occurrence of an event based on the occurrence of other events. This data mining method has been explored in different fields including disease diagnosis, market basket analysis, retail industry, higher education, and financial analysis. In retail, affinity analysis is used to perform market basket analysis, in which retailers seek to understand the purchase behavior of customers. This information can then be used for purposes of cross-selling and up-selling, in addition to influencing sales promotions, loyalty programs, store design, and discount plans.
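As a small worked illustration of the two metrics, the following Python sketch computes support as the fraction of records containing both A and B, and confidence as the number of records containing both A and B divided by the number containing A. The function names, the sample records, and the threshold values are assumptions chosen for the example, not part of any particular library.

```python
def count_containing(records, itemset):
    """Number of records that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(r) for r in records)


def support(records, antecedent, consequent):
    """Fraction of records in which antecedent and consequent appear together."""
    both = set(antecedent) | set(consequent)
    return count_containing(records, both) / len(records)


def confidence(records, antecedent, consequent):
    """Records containing both A and B, relative to records containing A."""
    n_antecedent = count_containing(records, antecedent)
    if n_antecedent == 0:
        return 0.0  # the rule never applies, so treat it as unsupported
    both = set(antecedent) | set(consequent)
    return count_containing(records, both) / n_antecedent


# Hypothetical records, used only to exercise the sketch.
records = [
    {"diapers", "beer"},
    {"diapers", "beer", "milk"},
    {"diapers", "milk"},
    {"beer"},
    {"milk", "bread"},
]

rule_support = support(records, {"diapers"}, {"beer"})        # 2 of 5 records
rule_confidence = confidence(records, {"diapers"}, {"beer"})  # 2 of 3 diaper records

# Assumed minimum thresholds; in practice these are inputs chosen by the analyst.
MIN_SUPPORT, MIN_CONFIDENCE = 0.3, 0.6
keep_rule = rule_support >= MIN_SUPPORT and rule_confidence >= MIN_CONFIDENCE
print(rule_support, rule_confidence, keep_rule)
```

Note that confidence is not symmetric: the rule with beer as antecedent and diapers as consequent has the same support but is divided by the number of records containing beer instead.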

Application of affinity analysis techniques in retail
Market basket analysis might tell a retailer that customers often purchase shampoo and conditioner together, so putting both items on promotion at the same time would not create a significant increase in revenue, while a promotion involving just one of the items would likely drive sales of the other.

Market basket analysis may provide the retailer with information to understand the purchase behavior of a buyer. This information enables the retailer to understand the buyer's needs and rearrange the store's layout accordingly, develop cross-promotional programs, or even capture new buyers (much like the cross-selling concept). An apocryphal early example is the story of a supermarket chain that discovered in its analysis that male customers who bought diapers often bought beer as well, moved the diapers close to the beer coolers, and saw sales increase dramatically. Although this urban legend is only an example that professors use to illustrate the concept to students, the explanation of this imaginary phenomenon might be that fathers who are sent out to buy diapers often buy a beer as well, as a reward. This kind of analysis is supposedly an example of the use of data mining. A widely used example of cross-selling on the web with market basket analysis is Amazon.com's use of "customers who bought book A also bought book B", e.g. "People who read History of Portugal were also interested in Naval History".

Market basket analysis can be used to divide customers into groups. A company could look at what other items people purchase along with eggs, and classify them as baking a cake (if they are buying eggs along with flour and sugar) or making omelets (if they are buying eggs along with bacon and cheese). This identification could then be used to drive other programs. Similarly, it can be used to divide products into natural groups. A company could look at what products are most frequently sold together and align their category management around these cliques.
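One simple way to look for products that are frequently sold together is to count how often each pair of items appears in the same basket. The Python sketch below does this with the standard library only; the function name pair_counts and the sample baskets are assumptions made for illustration, and a real analysis would usually go on to apply support and confidence thresholds to these raw counts.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


# Hypothetical baskets, used only to exercise the sketch.
baskets = [
    ["eggs", "flour", "sugar"],
    ["eggs", "bacon", "cheese"],
    ["eggs", "flour", "sugar", "butter"],
    ["bacon", "cheese"],
]

counts = pair_counts(baskets)
for pair, n in counts.most_common(4):
    print(pair, n)
```

In this sample, flour and sugar co-occur with eggs in the baking baskets, while bacon and cheese form a separate group, which mirrors the cake versus omelet distinction in the text.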

Business use of market basket analysis has significantly increased since the introduction of electronic point of sale. Amazon uses affinity analysis for cross-selling when it recommends products to people based on their purchase history and the purchase history of other people who bought the same item. Family Dollar plans to use market basket analysis to help maintain sales growth while moving towards stocking more low-margin consumable goods.

Application of affinity analysis techniques in clinical diagnosis
An important clinical application of affinity analysis is that it can be performed on patient medical records in order to generate association rules. The resulting association rules can be further assessed to find conditions and features that co-occur across a large body of records. Understanding whether there is an association between the different factors contributing to a condition is crucial for administering effective preventive or therapeutic interventions. In evidence-based medicine, finding the co-occurrence of symptoms that are associated with developing tumors or cancers can help diagnose the disease at its earliest stage. In addition to exploring the association between different symptoms in a patient related to a specific disease, the possible correlations between various diseases contributing to another condition can also be identified using affinity analysis.