
Stochastic Approach for Link-Structure Analysis (SALSA) is a web page ranking algorithm designed by R. Lempel and S. Moran to assign high scores to hub and authority web pages based on the quantity of hyperlinks among them.

SALSA is inspired by two other link-based ranking algorithms, namely HITS and PageRank, in the following ways:
 * like HITS, the algorithm assigns two scores to each web page: a hub score and an authority score. An authority is a page which is significantly more relevant to a given topic than other pages, whereas a hub is a page which contains many links to authorities;
 * like HITS, SALSA also works on a focused subgraph which is topic-dependent. This focused subgraph is obtained by first finding a set of pages that are relevant to a given topic (e.g. the top-n pages returned by a text-based search algorithm) and then augmenting this set with the web pages that link directly to them and the web pages that are linked directly from them. Because of this selection process, the hub and authority scores are topic-dependent;
 * like PageRank, the algorithm computes the scores by simulating a random walk through a Markov chain that represents the graph of web pages. SALSA, however, works with two different Markov chains: a chain of hubs and a chain of authorities.
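
As a rough illustration of the random walk on the chain of authorities, the sketch below assumes the usual formulation of a SALSA step: from an authority, follow a uniformly chosen in-link backward to a hub, then follow a uniformly chosen out-link of that hub forward to an authority. The function name `salsa_authority_walk`, the edge-list input and the fixed iteration count are illustrative assumptions, not part of the algorithm's definition.

```python
def salsa_authority_walk(edges, iterations=100):
    """Approximate authority scores by iterating the authority chain.

    `edges` is a list of (source, target) hyperlinks in the focused subgraph
    (hypothetical input format chosen for this sketch).
    """
    out_links, in_links = {}, {}
    for src, dst in set(edges):  # ignore duplicate links
        out_links.setdefault(src, []).append(dst)
        in_links.setdefault(dst, []).append(src)

    authorities = sorted(in_links)  # pages with at least one in-link
    # Start the walk from a uniform distribution over the authorities.
    score = {a: 1.0 / len(authorities) for a in authorities}

    for _ in range(iterations):
        nxt = {a: 0.0 for a in authorities}
        for a, mass in score.items():
            hubs = in_links[a]
            for h in hubs:                      # step backward to a hub
                targets = out_links[h]
                share = mass / (len(hubs) * len(targets))
                for b in targets:               # step forward to an authority
                    nxt[b] += share
        score = nxt
    return score


if __name__ == "__main__":
    example = [("h1", "a1"), ("h1", "a2"), ("h2", "a2"),
               ("h3", "a2"), ("h3", "a3")]
    for page, value in salsa_authority_walk(example).items():
        print(page, round(value, 4))
```

On this small connected example the scores approach 0.2, 0.6 and 0.2 for `a1`, `a2` and `a3`, i.e. their in-degrees (1, 3, 1) divided by the 5 links. The chain of hubs can be iterated in the same way by swapping the roles of in-links and out-links.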

Properties
SALSA can be seen as an improvement on HITS in two respects.

SALSA is computationally lighter than HITS, since its ranking is equivalent to a weighted in/out-degree ranking. Computational cost is a crucial factor because HITS and SALSA are computed at query time and can therefore significantly affect the response time of a search engine. By contrast, query-independent algorithms such as PageRank can be computed offline.
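
The degree-based shortcut can be sketched as follows, under the simplifying assumption that the focused subgraph forms a single connected component; in that case the weighting reduces to dividing each degree by the total number of links. With several components, a per-component weight would also be needed, which this sketch omits. The function name `degree_scores` and the edge-list input are hypothetical.

```python
from collections import Counter


def degree_scores(edges):
    """Return (hub, authority) scores from out-degrees and in-degrees.

    Assumes a single connected component (an assumption of this sketch),
    so each score is simply a degree divided by the number of links.
    """
    links = set(edges)          # ignore duplicate links
    total = len(links)
    out_degree = Counter(src for src, _ in links)
    in_degree = Counter(dst for _, dst in links)
    hub = {page: d / total for page, d in out_degree.items()}
    authority = {page: d / total for page, d in in_degree.items()}
    return hub, authority


if __name__ == "__main__":
    example = [("h1", "a1"), ("h1", "a2"), ("h2", "a2"),
               ("h3", "a2"), ("h3", "a3")]
    hub, authority = degree_scores(example)
    print(sorted(authority.items(), key=lambda item: -item[1]))
```

No iteration is involved: a single pass over the links is enough, which is what makes this kind of ranking cheap to compute at query time.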

SALSA is less vulnerable than HITS to the Tightly Knit Community (TKC) effect. A TKC is a topological structure within the Web in which a small set of pages is highly interconnected. The presence of TKCs in a focused subgraph is known to negatively affect the detection of meaningful authorities by HITS.