User:Hmerlino/Books/Text Mining

Introduction

 * INTRODUCTION
 * Text mining
 * General Architecture for Text Engineering
 * Unstructured data
 * Document-term matrix
 * Bag-of-words model
 * Vector space model
 * Tf–idf
 * Generalized vector space model
 * Information retrieval
 * Okapi BM25
 * Rocchio algorithm
 * Inverted index
 * Web crawler
 * Concept map
 * Metadata
 * Language model
 * Hidden Markov model
 * Baum–Welch algorithm
 * Viterbi algorithm


 * CLUSTERING HIGH-DIMENSIONAL DATA
 * Document clustering
 * Clustering high-dimensional data
 * Biclustering
 * Mixture model


 * INFORMATION EXTRACTION AND NLP
 * Information extraction
 * Knowledge extraction
 * Natural language processing
 * Part of speech
 * Part-of-speech tagging
 * Named-entity recognition
 * Automatic summarization
 * Sentiment analysis
 * OpenNLP
 * UIMA


 * DIMENSIONALITY REDUCTION AND MODELING
 * Principal component analysis
 * Curse of dimensionality
 * Singular value decomposition
 * Latent variable
 * Latent semantic analysis
 * Probabilistic latent semantic analysis
 * Latent Dirichlet allocation
 * Factor analysis
 * Non-negative matrix factorization
 * Regularization (mathematics)


 * TEXT CLASSIFICATION
 * Naive Bayes spam filtering
 * Naive Bayes classifier
 * Logistic regression
 * String kernel