
Fusion Adaptive Resonance Theory (fusion ART) is a generalization of self-organizing neural networks known as Adaptive Resonance Theory for learning recognition categories (or cognitive codes) across multiple pattern channels. It unifies a number of neural network models, supports a myriad of learning paradigms, notably unsupervised learning, supervised learning, and reinforcement learning, and can be applied for domain knowledge integration, memory representation, and modelling of high level cognition.

Overview
Fusion Adaptive Resonance Theory is a natural extension of the original Adaptive Resonance Theory (ART) models, developed by Stephen Grossberg and Gail A. Carpenter, from a single pattern field to multiple pattern channels. Whereas the original ART models perform unsupervised learning of recognition nodes in response to incoming input patterns, fusion ART learns multi-channel mappings simultaneously across multi-modal pattern channels in an online and incremental manner.

The Learning Model
Fusion ART employs a multi-channel architecture, comprising a category field $$F_2$$ connected to a fixed number K of pattern channels or input fields $$F_1^{c1},\dots,F_1^{cK}$$ through bidirectional conditionable pathways. The model unifies a number of network designs, most notably Adaptive Resonance Theory (ART), Adaptive Resonance Associative Map (ARAM), and Fusion Architecture for Learning and COgNition (FALCON), developed over the past decades for a wide range of functions and applications.



The generic network dynamics of fusion ART, based on fuzzy ART operations, is summarized as follows.

Input vectors:
Let $$\vec{I}^{ck}= (I_1^{ck},I_2^{ck},\ldots,I_n^{ck}) $$ denote the input vector,

where $$I_i^{ck} \in [0,1]$$ indicates the input i to channel ck. With complement coding, the input vector $$\vec{I}^{ck}$$ is augmented with a complement vector $$\vec{I}^{cck}$$ such that $$I_i^{cck}=1-I_i^{ck}.$$
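In code, complement coding simply appends the complement $$1-I_i^{ck}$$ of each component to the input vector, so an n-dimensional input becomes a 2n-dimensional coded vector. A minimal Python sketch (the function name is illustrative):

```python
def complement_code(I):
    """Return the complement-coded vector [I, 1 - I] for inputs in [0, 1]."""
    return list(I) + [1.0 - v for v in I]

# A 2-dimensional input becomes a 4-dimensional complement-coded vector
# whose components always sum to n, normalizing the input magnitude.
coded = complement_code([0.2, 1.0])
```

Complement coding prevents the category proliferation that can occur when inputs with small norms repeatedly fail the vigilance test.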

Activity vectors: Let $$\vec{x}^{ck}$$ denote the $$F_1^{ck}$$ activity vector for $$k=1,\ldots,K$$. Let $$\vec{y}$$ denote the $$F_2$$ activity vector.

Weight vectors: Let $$\vec{w}_j^{ck}$$ denote the weight vector associated with the jth node in $$F_2$$ for learning the input patterns in $$F_1^{ck}$$ for $$k=1,\ldots,K$$. Initially, $$F_2$$ contains only one uncommitted node, and its weight vectors contain all 1's.

Parameters: The fusion ART dynamics is determined by choice parameters $$\alpha^{ck}>0$$, learning rate parameters $$\beta^{ck} \in [0,1]$$, contribution parameters $$\gamma^{ck} \in [0,1]$$, and vigilance parameters $$\rho^{ck} \in [0,1]$$ for $$k=1,\ldots,K$$.

As a natural extension of ART, fusion ART responds to incoming patterns in a continuous manner. It is important to note that at any point in time, fusion ART does not require input to be present in all the pattern channels. For those channels not receiving input, the input vectors are initialized to all 1s. The fusion ART pattern processing cycle comprises five key stages, namely code activation, code competition, activity readout, template matching, and template learning, as described below.

Code activation: Given the input vectors $$\vec{I}^{c1},\dots,\vec{I}^{cK}$$, for each $$F_2$$ node j, the choice function $$T_j$$ is computed as follows:

$$T_j = \sum_{k=1}^K \gamma^{ck} \frac{|\vec{I}^{ck} \wedge \vec{w}_j^{ck}|}{\alpha^{ck}+|\vec{w}_j^{ck}|}$$

where the fuzzy AND operation $$\wedge$$ is defined by $$(\vec{p} \wedge \vec{q})_i \equiv \min(p_i,q_i)$$, and the norm $$|\cdot|$$ is defined by $$|\vec{p}| \equiv \sum_i p_i$$ for vectors $$\vec{p}$$ and $$\vec{q}$$.
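The choice function can be sketched directly from these definitions; the Python below is a minimal illustration (function and parameter names are assumptions, not part of any reference implementation):

```python
def fuzzy_and(p, q):
    """Fuzzy AND: element-wise minimum of two vectors."""
    return [min(pi, qi) for pi, qi in zip(p, q)]

def norm(p):
    """L1 norm |p| = sum of components (inputs are non-negative)."""
    return sum(p)

def choice(I_channels, w_j, gamma, alpha):
    """Choice function T_j for one F2 node j, summed over its K channels.

    I_channels : list of K input vectors I^{ck}
    w_j        : list of K weight vectors w_j^{ck}
    gamma      : contribution parameters gamma^{ck}
    alpha      : choice parameters alpha^{ck}
    """
    return sum(
        g * norm(fuzzy_and(I, w)) / (a + norm(w))
        for I, w, g, a in zip(I_channels, w_j, gamma, alpha)
    )
```

With one channel, input $$[1,0]$$, weights $$[1,1]$$, $$\gamma=1$$, and $$\alpha=0.1$$, the choice value is $$1/2.1 \approx 0.476$$.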

Code competition: A code competition process follows, under which the $$F_2$$ node with the highest choice function value is identified. The winner is indexed at J, where $$T_J = \max\{T_j : \text{for all } F_2 \text{ nodes } j\}.$$

When a category choice is made at node J, $$y_J=1$$; and $$y_j=0$$ for all $$j \neq J$$. This indicates a winner-take-all strategy.

Activity readout: The chosen $$F_2$$ node J performs a readout of its weight vectors to the input fields $$F_1^{ck}$$ such that $$\vec{x}^{ck} = \vec{I}^{ck} \wedge \vec{w}_J^{ck}.$$

Template matching: Before the activity readout is stabilized and node J can be used for learning, a template matching process checks that the weight templates of node J are sufficiently close to their respective input patterns. Specifically, resonance occurs if for each channel k, the match function $$m_J^{ck}$$ of the chosen node J meets its vigilance criterion:

$$m_J^{ck} = \frac{|\vec{I}^{ck} \wedge \vec{w}_J^{ck}|}{|\vec{I}^{ck}|} \geq \rho^{ck}.$$

If any of the vigilance constraints is violated, a mismatch reset occurs, in which the value of the choice function $$T_J$$ is set to 0 for the duration of the input presentation. With a match tracking process, at the beginning of each input presentation, the vigilance parameter $$\rho^{ck}$$ in each channel ck equals a baseline vigilance $$\bar{\rho}^{ck}$$. When a mismatch reset occurs, the $$\rho^{ck}$$ of all pattern channels are increased simultaneously until one of them is slightly larger than its corresponding match function $$m_J^{ck}$$, causing a reset. The search process then selects another $$F_2$$ node J under the revised vigilance criterion until a resonance is achieved.

Template learning: Once a resonance occurs, for each channel ck, the weight vector $$\vec{w}_J^{ck}$$ is modified by the following learning rule:

$$\vec{w}_J^{ck {\rm (new)}} = (1-\beta^{ck}) \vec{w}_J^{ck {\rm (old)}} + \beta^{ck} (\vec{I}^{ck} \wedge \vec{w}_J^{ck {\rm (old)}}).$$
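The learning rule above is a convex combination of the old template and its fuzzy AND with the input; with $$\beta^{ck}=1$$ (fast learning), the template jumps straight to $$\vec{I}^{ck} \wedge \vec{w}_J^{ck}$$. A minimal sketch (function name is illustrative):

```python
def learn(w_old, I, beta):
    """Template learning: w_new = (1 - beta) * w_old + beta * (I ^ w_old).

    Since min(i, w) <= w, templates can only shrink component-wise,
    so each category monotonically generalizes over the inputs it codes.
    """
    return [(1.0 - beta) * w + beta * min(i, w) for i, w in zip(I, w_old)]
```

For example, with fast learning an all-1 template presented with input $$[0.5, 1.0]$$ becomes $$[0.5, 1.0]$$.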

When an uncommitted node is selected for learning, it becomes committed and a new uncommitted node is added to the $$F_2$$ field. Fusion ART thus expands its network architecture dynamically in response to the input patterns.

Types of Fusion ART
The network dynamics described above can be used to support a wide variety of learning operations. The following sections show how fusion ART can be applied to a number of traditionally distinct learning tasks.

Original ART Models
With a single pattern channel, the fusion ART architecture reduces to the original ART model. Using a selected vigilance value $$\rho$$, an ART model learns a set of recognition nodes in response to an incoming stream of input patterns in a continuous manner. Each recognition node in the $$F_2$$ field learns to encode a template pattern representing the key characteristics of a set of patterns. ART has been widely used in the context of unsupervised learning for discovering pattern groupings. Please refer to the ART literature for a review of ART's functionalities, interpretations, and applications.

Adaptive Resonance Associative Map
By synchronizing pattern coding across multiple pattern channels, fusion ART learns to encode associative mappings across distinct pattern spaces. A specific instance of fusion ART with two pattern channels, known as Adaptive Resonance Associative Map (ARAM), learns multi-dimensional supervised mappings from one pattern space to another. An ARAM system consists of an input field $$F_1^a$$, an output field $$F_1^b$$, and a category field $$F_2$$. Given a set of feature vectors presented at $$F_1^a$$ with their corresponding class vectors presented at $$F_1^b$$, ARAM learns a predictive model (encoded by the recognition nodes in $$F_2$$) that associates combinations of key features with their respective classes.

Fuzzy ARAM, based on fuzzy ART operations, has been successfully applied to numerous machine learning tasks, including personal profiling, document classification, personalized content management, and DNA gene expression analysis. In many benchmark experiments, ARAM has demonstrated predictive performance superior to that of many state-of-the-art machine learning systems, including C4.5, backpropagation neural networks, K-Nearest Neighbour, and Support Vector Machines.

Fusion ART with Domain Knowledge
During learning, fusion ART formulates recognition categories of input patterns across multiple channels. The knowledge that fusion ART discovers during learning is compatible with symbolic rule-based representation. Specifically, the recognition categories learned by the $$F_2$$ category nodes are compatible with a class of IF-THEN rules that map a set of input attributes (antecedents) in one pattern channel to a disjoint set of output attributes (consequents) in another channel. Due to this compatibility, at any point of the incremental learning process, instructions in the form of IF-THEN rules can be readily translated into the recognition categories of a fusion ART system. The rules are conjunctive in the sense that the attributes in the IF clause and in the THEN clause have an AND relationship. Augmenting a fusion ART network with domain knowledge through explicit instructions serves to improve learning efficiency and predictive accuracy.

The fusion ART rule insertion strategy is similar to that used in Cascade ARTMAP, a generalization of ARTMAP that performs domain knowledge insertion, refinement, and extraction. For direct knowledge insertion, the IF and THEN clauses of each instruction (rule) are translated into a pair of vectors A and B respectively. The vector pairs derived are then used as training patterns for insertion into a fusion ART network. During rule insertion, the vigilance parameters are set to 1 to ensure that each distinct rule is encoded by one category node.
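One simple way to realize this translation is to mark each attribute named in the IF clause in vector A and each attribute in the THEN clause in vector B. The sketch below is illustrative only; attribute vocabularies and names are assumptions, not part of the Cascade ARTMAP specification:

```python
def rule_to_vectors(if_attrs, then_attrs, channel_a, channel_b):
    """Translate a conjunctive IF-THEN rule into a binary vector pair (A, B).

    channel_a / channel_b list the attributes represented in the two
    pattern channels; an attribute is 1 if it appears in the rule clause.
    """
    A = [1.0 if a in if_attrs else 0.0 for a in channel_a]
    B = [1.0 if b in then_attrs else 0.0 for b in channel_b]
    return A, B

# Hypothetical rule: IF rainy THEN umbrella.
A, B = rule_to_vectors({"rainy"}, {"umbrella"},
                       ["rainy", "sunny"], ["umbrella"])
```

Presenting each (A, B) pair as a training pattern with vigilance 1 then commits one category node per distinct rule.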

For details on integrating domain knowledge into fusion ART, please refer to a recent paper.

Fusion Architecture for Learning and COgNition (FALCON)
Reinforcement learning is a paradigm wherein an autonomous system learns to adjust its behaviour based on reinforcement signals received from the environment. An instance of fusion ART, known as FALCON (Fusion Architecture for Learning and COgNition), learns mappings simultaneously across multi-modal input patterns, involving states, actions, and rewards, in an online and incremental manner. Compared with other ART-based reinforcement learning systems, FALCON presents a truly integrated solution in the sense that there is no implementation of a separate reinforcement learning module or Q-value table. Using competitive coding as the underlying principle of computation, the network dynamics encompasses a range of learning paradigms, including unsupervised learning, supervised learning, as well as reinforcement learning.

FALCON employs a three-channel architecture, comprising a category field $$F_2$$ and three pattern fields, namely a sensory field $$F_1^{c1}$$ for representing current states, a motor field $$F_1^{c2}$$ for representing actions, and a feedback field $$F_1^{c3}$$ for representing reward values. A class of FALCON networks, known as TD-FALCON, incorporates Temporal Difference (TD) methods to estimate and learn the value function Q(s,a), which estimates the value of taking a certain action a in a given state s.

The general sense-act-learn algorithm for TD-FALCON is summarized as follows. Given the current state s, the FALCON network is used to predict the value of performing each available action a in the action set A based on the corresponding state vector $$\vec{s}$$ and action vector $$\vec{a}$$. The value functions are then processed by an action selection strategy (also known as a policy) to select an action. Upon receiving feedback (if any) from the environment after performing the action, a TD formula is used to compute a new estimate of the Q-value for performing the chosen action in the current state. The new Q-value is then used as the teaching signal (represented as reward vector R) for FALCON to learn the association of the current state and the chosen action to the estimated value.
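The specific TD formula is not given above; a common choice in TD-FALCON-style systems is the one-step Q-learning update, sketched here with illustrative learning rate and discount values (both are assumptions, not prescribed by the text):

```python
def td_update(q, reward, q_next_max, alpha=0.5, gamma=0.9):
    """One-step TD estimate in the style of Q-learning:

        Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))

    The updated Q-value would serve as the teaching signal (reward
    vector R) for the FALCON network's feedback channel.
    """
    return q + alpha * (reward + gamma * q_next_max - q)
```

For example, starting from Q = 0 with an immediate reward of 1 and no future value, one update moves the estimate halfway to the target, giving 0.5.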