User:Lixinso/datamining

=Fundamentals=

Env Setup
=Statistics=

Bayes' theorem

Euclidean Distance
=Programming=

IBM SPSS
=Machine Learning=

Vocabulary Mapping
=Text Mining / NLP=

Corpus
=Big Data=

HDFS
==Data Replication Principles==

Setup Hadoop (IBM / Cloudera / HortonWorks)

MongoDB, Neo4j
=Visualization=

Tree & Tree Map
==Histogram & Pie (Uni)==

ggplot2
==Uni, Bi & Multivariate Viz==

Data Exploration in R (Hist, Boxplot, etc.)
=ToolBox=

Cassandra, MongoDB
=Data Ingestion=

Using ETL
=Data Munging=