User:Anil Kumar Sahani/sandbox

INTRODUCTION:
The Internet has become a vast information source in recent years. To help ordinary users find desired data on the Internet, many search engines [3] have been created. Each search engine has a corresponding database that defines the set of documents it can search. Usually, an index for all documents in the database is created and stored in the search engine. For each term, which represents a content word or a combination of several content words, this index can quickly identify the documents that contain the term. The pre-existence of this index is critical for the search engine to answer user queries efficiently.
Two types of search engines exist. General-purpose search engines attempt to provide searching capabilities for all documents on the Internet or on the Web. WebCrawler, HotBot, Lycos and Alta Vista are a few such well-known search engines. Special-purpose search engines focus on documents in confined domains, such as the documents of an organization or documents on a specific topic of interest. A large number of special-purpose search engines are currently running on the Internet.
Rather than relying on a single search engine for everything, a more practical approach to providing search services for the entire Internet [13] is the multi-level approach. At the bottom level are the local search engines. These search engines can be grouped, based on the relatedness of their databases, to form next-level search engines (called metasearch engines). Lower-level metasearch engines can themselves be grouped to form higher-level metasearch engines. This process can be repeated until there is only one metasearch engine at the top. A metasearch engine [3] is essentially an interface, and it does not maintain its own index on documents. However, a sophisticated metasearch engine may maintain information about the contents of the (meta)search engines at a lower level to provide better service.
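
The term-to-documents index described in this introduction can be illustrated with a minimal sketch. It assumes whitespace tokenization and a small in-memory dictionary; the names `build_index` and `lookup` and the sample documents are hypothetical and are not taken from any particular search engine.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase term to the set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def lookup(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the first term's documents, then intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample database, for illustration only.
docs = {
    1: "metasearch engine interface",
    2: "local search engine index",
    3: "index of documents",
}
index = build_index(docs)
print(lookup(index, "search engine"))
```

Because the index is built before any query arrives, answering a query only requires set lookups and intersections instead of a scan over every document.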
When a metasearch engine receives a user query, it first passes the query to the appropriate (meta)search engines at the next level, recursively, until real search engines are encountered. It then collects and reorganizes the results from the real search engines, possibly going through metasearch engines at lower levels. A two-level search engine organization is illustrated in Figure 1. The advantages of this approach are as follows: (a) user queries can (eventually) be evaluated against smaller databases in parallel, resulting in reduced response time; (b) updates to indexes can be localized, i.e., the index of a local search engine is updated only when documents in its database are modified (local updates may need to be propagated to upper-level metadata that represent the contents of local databases, but the propagation can be done infrequently because the metadata are typically statistical in nature and can tolerate a certain degree of inaccuracy); (c) local information can be gathered more easily and in a timelier manner; (d) the demand on storage space and processing power at each local search engine is more manageable. In other words, many problems associated with employing a single super search engine can be overcome or greatly alleviated when this multi-level approach is used.
When the number of search engines invokable by a metasearch engine is large, a serious inefficiency may arise. Typically, for a given query, only a small fraction of all search engines contain documents useful to the query. As a result, if every search engine is blindly invoked for each user query, substantial unnecessary network traffic is created when the query is sent to search engines that hold nothing useful. In addition, local resources are wasted when those databases are searched.
Author: Anil Kumar, PSIT Kanpur. For more detail, see: https://www.ijarcsse.com/docs/papers/Volume_4/12_December2014/V4I12-0376.pdf
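
The recursive query routing described in this section, together with the idea of not invoking search engines that cannot help, can be sketched as follows. This is only an illustration under simple assumptions: each engine's metadata is a plain set of terms (real metadata is typically statistical), queries are single terms, and children are called one after another rather than in parallel. The class names `LocalEngine` and `MetaEngine` and the sample data are hypothetical.

```python
class LocalEngine:
    """A real search engine with its own small document database."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = documents  # doc id -> text
        self.queries = 0            # how many times this engine was invoked
        # Assumed metadata: the set of all terms found in the database.
        self.terms = {t for text in documents.values() for t in text.lower().split()}

    def may_contain(self, term):
        return term.lower() in self.terms

    def search(self, term):
        self.queries += 1
        term = term.lower()
        return [(self.name, doc_id)
                for doc_id, text in sorted(self.documents.items())
                if term in text.lower().split()]


class MetaEngine:
    """An interface over lower-level engines; it keeps no document index."""

    def __init__(self, name, children):
        self.name = name
        self.children = children

    def may_contain(self, term):
        return any(child.may_contain(term) for child in self.children)

    def search(self, term):
        results = []
        for child in self.children:
            # Forward the query only to children whose metadata suggests a hit.
            if child.may_contain(term):
                results.extend(child.search(term))
        return results


# Hypothetical three-level organization, for illustration only.
sports = LocalEngine("sports", {1: "football scores", 2: "tennis results"})
news = LocalEngine("news", {1: "election results"})
top = MetaEngine("top", [MetaEngine("media", [sports, news])])
football_hits = top.search("football")
```

In this sketch the query for "football" reaches only the `sports` engine, so `news.queries` stays at zero; blindly invoking every engine would have sent the query to both.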