User:Tukun1980/sandbox/Common Sense Inference Memory

Introduction
The Common Sense Inference Memory System (CSIM) is an inference engine that reasons about the behaviors and properties of entities or concepts in the world, expressed as text written in natural language. It simulates the procedure by which the human brain processes textual information, using machine learning techniques. Its goals are to interact with humans by giving feedback or answers to queries, and to organize and summarize documents by parsing sentences and making inferences about the main events the documents describe.

Structure of the CSIM
The CSIM follows a procedure similar to that of the human brain. It consists of three modules: the Language Processing Module (LPM), the Memory Module (MM), and the Inference Module (IM).

The LPM parses input sentences and extracts the behaviors or properties of an entity. These behaviors take the form of verb phrases.
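As a toy sketch of this extraction step, the snippet below splits a simple declarative sentence at its verb to obtain an (entity, verb phrase) pair. The verb lexicon and the split-on-first-verb heuristic are purely illustrative assumptions; a real LPM would use a full syntactic parser.

```python
# Toy LPM step: extract (entity, verb phrase) from a simple
# subject-verb-object sentence. VERBS is a made-up mini lexicon.
VERBS = {"chases", "eats", "sleeps", "barks"}

def extract_behavior(sentence):
    """Return (entity, verb_phrase) for a simple declarative sentence."""
    words = sentence.rstrip(".").split()
    for i, w in enumerate(words):
        if w in VERBS:
            entity = " ".join(words[:i])    # everything before the verb
            behavior = " ".join(words[i:])  # the verb phrase
            return entity, behavior
    return None

print(extract_behavior("the dog chases the cat"))
# ('the dog', 'chases the cat')
```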

The MM stores the information from all inputs. An Extended Semantic Network is used to store this information, where the nodes represent the entities in the sentences and the edges represent either a relation between two entities or a property of a single entity.
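A minimal sketch of such a semantic network is an adjacency map from each entity to its labelled edges. The class and method names below are illustrative, not part of any published CSIM API.

```python
from collections import defaultdict

class SemanticNetwork:
    """Toy semantic network: entity -> list of (relation, target) edges."""
    def __init__(self):
        self.edges = defaultdict(list)

    def add_relation(self, entity, relation, other):
        # 'other' may be another entity or a property value
        self.edges[entity].append((relation, other))

    def neighbors(self, entity):
        return self.edges[entity]

net = SemanticNetwork()
net.add_relation("dog", "chases", "cat")    # relation between two entities
net.add_relation("dog", "is_a", "animal")   # property of one entity
print(net.neighbors("dog"))
# [('chases', 'cat'), ('is_a', 'animal')]
```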

There is a Working Memory (similar to that of humans), which stores the most recently input information in order to construct a full scenario. When a behavior of an entity is read in, relevant behaviors are also collected from previous data or from a common sense knowledge base to reconstruct the memory of the entity. Another kind of memory, long-term memory, stores all input data as previous experience; this data can be used as training data for the Inference Module.
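The Working Memory idea can be sketched as a bounded buffer of recent behaviors that is augmented with related facts pulled from a common-sense store whenever a new behavior arrives. The toy knowledge base and class below are invented for illustration.

```python
from collections import deque

# Toy common-sense store: behavior -> related behaviors (made up)
COMMON_SENSE = {
    "dog barks": ["dog guards house", "dog hears noise"],
}

class WorkingMemory:
    def __init__(self, capacity=5):
        # bounded buffer: only the most recent items are kept
        self.buffer = deque(maxlen=capacity)

    def read(self, behavior):
        self.buffer.append(behavior)
        # pull relevant common-sense behaviors to rebuild the scenario
        for related in COMMON_SENSE.get(behavior, []):
            self.buffer.append(related)
        return list(self.buffer)

wm = WorkingMemory()
print(wm.read("dog barks"))
# ['dog barks', 'dog guards house', 'dog hears noise']
```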

The IM takes the behaviors/properties from the Working Memory in the MM and collects relevant common sense knowledge to build a Bayesian Network (BN). Since the input differs across scenarios, the nodes, structure, and parameters of the BN also depend on the input. The probability of each node of the BN is calculated from the input evidence. Nodes with high probabilities are selected and passed to the MM. The LPM then constructs new sentences from these inferences and outputs them as feedback to what CSIM reads.
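The selection step can be illustrated with a two-node toy chain (barks → guards house): given the evidence, each candidate behavior is scored and kept only if its probability exceeds a threshold. The CPT values and threshold here are invented for illustration.

```python
# Toy IM step: P(guards house | barks) for a two-node chain, then
# threshold-based selection of high-probability behavior nodes.
P_GUARD_GIVEN_BARKS = {True: 0.9, False: 0.1}  # made-up CPT entries

def select_nodes(evidence, threshold=0.5):
    """Return behavior nodes whose posterior exceeds the threshold."""
    p = P_GUARD_GIVEN_BARKS[evidence["barks"]]
    return [("guards house", p)] if p >= threshold else []

print(select_nodes({"barks": True}))   # [('guards house', 0.9)]
print(select_nodes({"barks": False}))  # []
```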

Machine Learning Techniques
Because the nodes in the Bayesian Network can differ each time CSIM reads in sentences, the conditional probability tables (CPTs) also vary. In some cases there is no CPT for a given node family in the BN; this is caused by missing data in previous texts.

The EM algorithm is applied to obtain the CPTs. Let $$ph_i$$ be the $$i$$-th phrase that describes a behavior/property of an entity, with $$ph_i = 1$$ if the behavior occurs and $$ph_i = 0$$ otherwise.

Let $$\theta$$ be the parameters of the CPTs that define a BN, $$PH = \{ph_1, ph_2,\cdots, ph_n\}$$ be the observed behaviors, and $$PH' = \{ph'_1, ph'_2,\cdots, ph'_m\}$$ be the unobserved behaviors. For a given node family in the BN, the joint probability is

$$P(ph_1, ph_2,\cdots, ph_n,ph'_1, ph'_2,\cdots, ph'_m|\theta) = \prod_{x \in PH \cup PH'} P(x|\mathbf{parent(x)})$$
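This factorisation can be checked numerically on a toy three-node chain $$ph_1 \to ph_2 \to ph_3$$: the joint probability is the product of each node's probability given its parents. The CPT entries below are invented for illustration.

```python
# Made-up CPTs for a chain ph1 -> ph2 -> ph3; each entry gives
# P(node = 1 | parent values).
CPT = {
    "ph1": {(): 0.6},
    "ph2": {(1,): 0.8, (0,): 0.2},
    "ph3": {(1,): 0.7, (0,): 0.1},
}
PARENTS = {"ph1": (), "ph2": ("ph1",), "ph3": ("ph2",)}

def node_prob(node, value, assignment):
    """P(node = value | its parents' values in the assignment)."""
    pa = tuple(assignment[p] for p in PARENTS[node])
    p1 = CPT[node][pa]
    return p1 if value == 1 else 1.0 - p1

def joint(assignment):
    """Product over all nodes of P(x | parents(x))."""
    p = 1.0
    for node, value in assignment.items():
        p *= node_prob(node, value, assignment)
    return p

print(round(joint({"ph1": 1, "ph2": 1, "ph3": 1}), 4))
# 0.336  (= 0.6 * 0.8 * 0.7)
```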

The EM algorithm proceeds as follows:

Let the likelihood function be:

 * $$L(\boldsymbol\theta; \mathbf{PH}, \mathbf{PH'}) = p(\mathbf{PH}, \mathbf{PH'}|\boldsymbol\theta)$$

E step:

 * $$Q(\boldsymbol\theta|\boldsymbol\theta^{(t)}) = \operatorname{E}_{\mathbf{PH'}|\mathbf{PH},\boldsymbol\theta^{(t)}}\left[ \log L (\boldsymbol\theta;\mathbf{PH},\mathbf{PH'}) \right]$$

M step:

 * $$\boldsymbol\theta^{(t+1)} = \underset{\boldsymbol\theta}{\operatorname{arg\,max}} \ Q(\boldsymbol\theta|\boldsymbol\theta^{(t)})$$
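The two steps above can be sketched for a single node family $$ph_1 \to ph_2$$ in which $$ph_1$$ is missing from some records. The E step fills in the expected value of the hidden $$ph_1$$; the M step re-estimates the CPT parameters from the expected counts. The data set and starting parameters are invented for illustration.

```python
# Records of (ph1, ph2); ph1 is None when unobserved.
data = [(1, 1), (1, 1), (0, 0), (None, 1), (None, 0), (1, 0)]

p1 = 0.5               # current estimate of P(ph1 = 1)
p2 = {1: 0.5, 0: 0.5}  # current estimate of P(ph2 = 1 | ph1)

for _ in range(50):
    # E step: expected value of the hidden ph1 in each record,
    # i.e. P(ph1 = 1 | ph2, current parameters)
    w = []
    for ph1, ph2 in data:
        if ph1 is not None:
            w.append(float(ph1))
        else:
            lik1 = p1 * (p2[1] if ph2 else 1 - p2[1])
            lik0 = (1 - p1) * (p2[0] if ph2 else 1 - p2[0])
            w.append(lik1 / (lik1 + lik0))

    # M step: re-estimate the parameters from expected counts
    p1 = sum(w) / len(w)
    den1 = sum(w)           # expected count of ph1 = 1
    den0 = len(w) - den1    # expected count of ph1 = 0
    num1 = sum(wi for wi, (_, ph2) in zip(w, data) if ph2 == 1)
    num0 = sum(1 - wi for wi, (_, ph2) in zip(w, data) if ph2 == 1)
    p2 = {1: num1 / den1, 0: num0 / den0}

print(round(p1, 3), {k: round(v, 3) for k, v in p2.items()})
```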

Application
The CSIM system generates useful information based on common sense knowledge. It can be fine-tuned and applied to human-machine interaction systems, choosing the best behavior for robots with the help of knowledge bases such as ConceptNet or Cyc.