User:Montanabernese/sandbox

Graph Representation Learning
Graph Representation Learning extends Representation Learning to graph-structured data. It is also known as Network Embedding. The main goal of Graph Representation Learning is to learn latent vector representations of nodes (vertices) that reflect the graph's structure and attributes.

Graph structures are mainly used for Machine Learning tasks including node classification, link prediction, community detection and network similarity. In the traditional Machine Learning approach, one has to build specific features that describe the graph structure, such as node degree, PageRank score, the degrees of neighbors, or the PageRank of neighbors. However, feature selection is one of the most difficult and challenging parts of a Machine Learning task. As a solution to this problem, Graph Representation Learning learns such features automatically, without requiring manual feature extraction and selection.
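
The hand-crafted features mentioned above can be sketched on a hypothetical toy graph; the adjacency lists and the plain power-iteration PageRank below are illustrative assumptions, not part of any particular library:

```python
# Hand-crafted graph features on a toy graph: node degree and PageRank
# computed by plain power iteration (illustrative sketch only).

graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}  # adjacency lists

# Node degree: one scalar feature per node.
degree = {v: len(nbrs) for v, nbrs in graph.items()}

def pagerank(graph, damping=0.85, iters=50):
    """Power-iteration PageRank over a graph with no dangling nodes."""
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(iters):
        new = {v: (1 - damping) / n for v in graph}
        for v, nbrs in graph.items():
            share = damping * rank[v] / len(nbrs)
            for u in nbrs:
                new[u] += share   # each node passes rank to its neighbors
        rank = new
    return rank

pr = pagerank(graph)
# The hub node 2 (highest degree) ends up with the largest PageRank score.
```

Features like `degree` and `pr` would then be fed to a downstream classifier; representation learning replaces this manual step.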

Modeling Approaches
Graph representation learning methods can be classified into three categories: Generative models, which learn the underlying connectivity distribution in the graph; Discriminative models, which predict the probability of an edge existing between a pair of vertices; and Joint models, which use Generative and Discriminative models together.

Generative Models
Generative graph representation learning models assume that, for each vertex $$v_c$$, there exists an underlying true connectivity distribution $$p_{\text{true}}(v|v_c)$$ which implies $$v_c$$'s connectivity preference (or relevance distribution) over all other vertices in the graph. The edges in the graph can thus be viewed as observed samples generated by these conditional distributions, and these generative models learn vertex embeddings by maximizing the likelihood of edges in the graph.
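
A minimal sketch of this view (not any specific published model): model $$p(v|v_c)$$ as a softmax over embedding inner products and score observed edges by their log-likelihood. The random embeddings and the toy edge list are illustrative assumptions:

```python
# Generative view: a softmax connectivity distribution over all vertices,
# parameterized by vertex embeddings Z (random placeholders here).
import numpy as np

rng = np.random.default_rng(0)
num_nodes, dim = 5, 4
Z = rng.normal(size=(num_nodes, dim))           # one embedding per vertex

def p_v_given_vc(Z, vc):
    """Softmax distribution p(v | v_c) over all other vertices."""
    scores = Z @ Z[vc]                          # inner products with v_c
    scores[vc] = -np.inf                        # no self-loops
    e = np.exp(scores - np.max(scores[np.isfinite(scores)]))
    return e / e.sum()

edges = [(0, 1), (1, 2), (2, 3)]                # observed samples
log_likelihood = sum(np.log(p_v_given_vc(Z, vc)[v]) for vc, v in edges)
# Training would adjust Z to maximize this log-likelihood.
```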

Discriminative Models
Different from generative models, Discriminative graph representation learning models do not treat edges as generated from an underlying conditional distribution, but aim to learn a classifier for predicting the existence of edges directly. Typically, discriminative models consider two vertices $$v_i$$ and $$v_j$$ jointly as features, and predict the probability of an edge existing between the two vertices, i.e., $$p(\text{edge}|(v_i,v_j))$$, based on the training data in the graph.
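
A common instance of this idea, sketched here with placeholder embeddings, scores a vertex pair with the sigmoid of an embedding inner product; the embeddings and the specific scoring function are illustrative assumptions:

```python
# Discriminative view: output p(edge | (v_i, v_j)) for a vertex pair.
import numpy as np

rng = np.random.default_rng(1)
Z = rng.normal(size=(5, 4))                     # placeholder embeddings

def edge_probability(Z, i, j):
    """Sigmoid of the inner product of the two vertex embeddings."""
    return 1.0 / (1.0 + np.exp(-Z[i] @ Z[j]))

p = edge_probability(Z, 0, 1)
# A real model would fit Z (or a classifier on the pair's features)
# to labeled positive and negative vertex pairs.
```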

Joint Models
Although generative and discriminative models are generally two disjoint classes of graph representation learning methods, many studies use the two jointly.

Algorithmic Approaches
Most Graph Representation Learning algorithms are unsupervised, and they fall into two main categories: matrix factorization based and random walk based.

Matrix Factorization
Matrix factorization is inspired by classic techniques for dimensionality reduction, which optimize a loss function of the form
 * $$ \text{L} = ||Z^T Z -S ||^2_2$$

where $$S$$ is a matrix containing proximity measures and $$Z$$ is the matrix of node embeddings. The goal of matrix factorization based methods is to learn embeddings for each node such that the inner product between the learned embedding vectors approximates some deterministic measure of graph proximity. While Singular Value Decomposition (SVD) is applicable in the linear case, Laplacian Eigenmaps are used for non-linear structures.
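
The loss above can be illustrated with a truncated SVD on a toy proximity matrix; taking $$S$$ to be a small adjacency matrix is an assumption made here for the sketch:

```python
# Matrix factorization sketch: recover a d-dimensional embedding matrix Z
# from a truncated SVD of a proximity matrix S, so that Z^T Z approximates S.
import numpy as np

S = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)       # symmetric proximity matrix

U, sigma, Vt = np.linalg.svd(S)
d = 2                                            # embedding dimension
Z = np.sqrt(sigma[:d])[:, None] * Vt[:d]         # d x n embedding matrix

approx = Z.T @ Z                                 # low-rank reconstruction of S
loss = np.linalg.norm(approx - S) ** 2           # the objective L above
```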

Random Walk
On the other hand, random walk based methods define embeddings such that nodes have similar vectors if they co-occur on short random walks over the graph, which results in a flexible, stochastic measure of graph proximity. The basic idea is to compute the probability $$p_{G,T}(v_j|v_i)$$ of visiting a node $$v_j$$ on a length-$$T$$ random walk starting at $$v_i$$, usually with $$T \in \{2,\dots,10\}$$. This leads to minimizing the cross-entropy loss
 * $$ \text{L} = \sum_{(v_i,v_j) \in D} -\log\left(p_{G,T}(v_j|v_i)\right)$$
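
The probability $$p_{G,T}(v_j|v_i)$$ can be estimated empirically by sampling walks, as sketched below on an assumed toy graph (this estimates the quantity the embeddings are trained to reproduce; it does not train them):

```python
# Estimate the random-walk visit distribution p_{G,T}(. | v_i) by sampling.
import random

random.seed(42)
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}  # toy adjacency lists

def random_walk(graph, start, T):
    """One uniform random walk of length T starting at `start`."""
    walk = [start]
    for _ in range(T):
        walk.append(random.choice(graph[walk[-1]]))
    return walk

T, num_walks, start = 5, 2000, 0
counts = {v: 0 for v in graph}
for _ in range(num_walks):
    for v in random_walk(graph, start, T)[1:]:   # skip the start node itself
        counts[v] += 1
total = sum(counts.values())
p_hat = {v: c / total for v, c in counts.items()}  # empirical distribution
```

Methods such as DeepWalk then fit embeddings whose softmax scores match these co-occurrence statistics.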

Example Algorithms

 * t-SNE
 * DeepWalk
 * Node2vec
 * LINE
 * SDNE
 * PPNE

Example Research Areas

 * Link prediction
 * Node classification
 * Recommendation
 * Visualization
 * Knowledge graph representation
 * Clustering
 * Text embedding
 * Social network analysis

Article Improvement: List of datasets for machine learning research
Target page: https://en.wikipedia.org/wiki/List_of_datasets_for_machine_learning_research#Twitter_and_tweets

Article Improvement: Difference between Hadoop 2 and Hadoop 3
Target page: https://en.wikipedia.org/wiki/Apache_Hadoop

There are important new features provided by Hadoop 3. While Hadoop 2 has a single NameNode, Hadoop 3 enables having multiple NameNodes, which removes this single point of failure.

In Hadoop 3, there are containers that work on the principle of Docker, which reduces the time spent on application development.

One of the biggest changes is that Hadoop 3 decreases storage overhead with erasure coding, instead of relying solely on replication.

Also, Hadoop 3 permits the use of GPU hardware within the cluster, which is a very substantial benefit for executing Deep Learning algorithms on a Hadoop cluster.

Article Improvement: PyTorch
Target page: https://en.wikipedia.org/wiki/PyTorch

PyTorch provides two high-level features:
 * Tensor computation (like numpy) with strong GPU acceleration
 * Deep Neural Networks built on a tape-based autodiff system

The main elements of PyTorch are:

 * PyTorch Tensors
 * Autograd module
 * Optim module
 * nn module

PyTorch Tensors
Tensors are multidimensional arrays. Tensors in PyTorch are similar to numpy arrays, with the addition that Tensors can also be used on a GPU. PyTorch supports various types of Tensors.
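
A minimal sketch of these points: create a Tensor from a Python list, do numpy-like arithmetic, and move it to the GPU if one is available.

```python
# PyTorch Tensors: numpy-like arrays that can also live on a GPU.
import torch

x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
y = x * 2 + 1                      # elementwise arithmetic: [[3, 5], [7, 9]]

if torch.cuda.is_available():      # optionally move the Tensor to the GPU
    y = y.to("cuda")

z = y.sum()                        # 3 + 5 + 7 + 9 = 24
```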

Autograd Module
PyTorch uses a technique called automatic differentiation. A recorder logs which operations have been performed, and then replays them backward to compute the gradients. This technique is especially powerful when building neural networks, as it spares the user from deriving the parameter gradients by hand.
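
The record-and-replay mechanism can be sketched in a few lines: operations on a Tensor with `requires_grad=True` are recorded, and `backward()` replays them to fill in `.grad`.

```python
# Autograd: record the forward operations, replay them backward for gradients.
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()     # y = 2^2 + 3^2 = 13; the operations are recorded
y.backward()           # replay backward to compute dy/dx

print(x.grad)          # dy/dx = 2x -> tensor([4., 6.])
```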

Optim Module
torch.optim is a module that implements various optimization algorithms used for building neural networks. Most of the commonly used methods are already supported, so there is no need to build them from scratch.
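
As a minimal sketch, one gradient-descent step with the built-in SGD optimizer on a single parameter looks like this:

```python
# torch.optim: one SGD step on a toy quadratic objective.
import torch

w = torch.tensor([1.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

loss = ((w - 3.0) ** 2).sum()  # minimized at w = 3
loss.backward()                # gradient: 2 * (w - 3) = -4
optimizer.step()               # w <- w - 0.1 * (-4) = 1.4
optimizer.zero_grad()          # clear gradients before the next step
```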

nn Module
PyTorch autograd makes it easy to define computational graphs and take gradients, but raw autograd can be a bit too low-level for defining complex neural networks. This is where the nn module can help.
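
For example, a small multilayer perceptron can be assembled from built-in layers without writing any autograd code by hand (the layer sizes below are arbitrary choices for the sketch):

```python
# nn module: compose built-in layers instead of wiring raw autograd ops.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(4, 8),   # 4 input features -> 8 hidden units
    nn.ReLU(),
    nn.Linear(8, 1),   # 8 hidden units -> 1 output
)

x = torch.randn(10, 4)         # a batch of 10 examples
out = model(x)                 # forward pass through the whole stack
```

Calling `out.sum().backward()` would then populate gradients for every parameter in `model`, ready for a torch.optim optimizer.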