Aug 7, 2010

The way search engines work

Web search engines work by storing information about many web pages, which they retrieve from the World Wide Web itself. The pages are fetched by a web crawler, an automated web browser that follows every link it sees. The contents of each page are then analyzed to determine how it should be indexed (for example, words are extracted from the title, headings, or special fields called meta tags). Data about web pages is stored in an index database for use in later searches. Some search engines, such as Google, store all or part of the source page (called a cache) as well as information about the web page itself.
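The crawl-then-index step can be illustrated with a small sketch. This is not how any particular engine is implemented: the `PAGES` dictionary is a made-up in-memory stand-in for the web, and `crawl` and `build_index` are hypothetical helper names chosen for this example.

```python
from collections import defaultdict, deque
import re

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example": ("Search engines index web pages", ["b.example", "c.example"]),
    "b.example": ("A crawler follows links between pages", ["a.example"]),
    "c.example": ("Meta tags describe a page", []),
}

def crawl(start, pages):
    """Visit every page reachable from start, following each link once."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in pages[url][1]:
            # Skip links that are unknown or already scheduled.
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return order

def build_index(urls, pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url in urls:
        for word in re.findall(r"[a-z0-9]+", pages[url][0].lower()):
            index[word].add(url)
    return index

index = build_index(crawl("a.example", PAGES), PAGES)
print(sorted(index["pages"]))
```

A real crawler would fetch pages over HTTP and parse HTML. The structure that matters here is the word-to-pages mapping, often called an inverted index.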

When a user visits a search engine and enters a query, usually in the form of keywords, the engine looks up its index and returns a list of the web pages that best match the criteria, usually accompanied by a brief summary containing the document title and sometimes part of the text.
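A keyword lookup of this kind might look like the following sketch. It assumes a tiny made-up document set (`DOCS`) and requires every keyword to appear in a page. Real engines use far more elaborate matching, and the names `tokenize` and `search` are illustrative only.

```python
import re

# Hypothetical toy data: page text keyed by URL.
DOCS = {
    "a.example": "Search engines index web pages",
    "b.example": "A crawler follows links between web pages",
    "c.example": "Meta tags describe a page",
}

def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Inverted index built once, ahead of any query: word -> set of URLs.
INDEX = {}
for url, text in DOCS.items():
    for word in tokenize(text):
        INDEX.setdefault(word, set()).add(url)

def search(query, snippet_len=30):
    """Return (url, snippet) pairs for pages containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    # Intersect the URL sets of all keywords; an unknown word matches nothing.
    matches = set.intersection(*(INDEX.get(w, set()) for w in words))
    return [(url, DOCS[url][:snippet_len]) for url in sorted(matches)]

for url, snippet in search("web pages"):
    print(url, "-", snippet)
```

The snippet here is simply the start of the page text, a stand-in for the brief summary a real results page would show.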

There is also another type of search engine: the real-time search engine, such as Orase. Engines like this do not use an index. The information they need is collected only when a new search is started. Compared with the index-based systems used by engines like Google, real-time systems have several advantages: information is always up to date, there are (almost) no dead links, and fewer system resources are required. (Google uses nearly 100,000 computers, Orase only one.) But there is also a disadvantage: a search takes longer to complete.

The usefulness of a search engine depends on the relevance of the results it returns. Although there may be millions of web pages containing a given word or phrase, some pages may be more relevant, popular, or authoritative than others. Most search engines use ranking methods so that the "best" results appear first. How an engine decides which pages match best, and in what order to show them, varies widely from one engine to another, and the details are generally not disclosed. The methods also change over time as Internet usage changes and new techniques evolve.
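Since actual ranking methods vary and are not disclosed, the following is only a toy illustration of what "ranking" means: it scores each page by how often the query words occur in it. The term-count scoring and the `rank` function are assumptions made for this example, not any engine's real method.

```python
import re
from collections import Counter

def rank(query, docs):
    """Order URLs by how many times the query words occur in each page."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    scores = {}
    for url, text in docs.items():
        counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
        score = sum(counts[w] for w in words)
        if score > 0:  # Pages with no matching word are left out.
            scores[url] = score
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda u: (-scores[u], u))

# Hypothetical sample pages.
docs = {
    "x.example": "search search engine",
    "y.example": "search results",
    "z.example": "unrelated text",
}
print(rank("search", docs))
```

Real engines combine many more signals, such as the popularity or authority of a page, which is why results differ from one engine to another.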

Most web search engines are commercial ventures supported by advertising revenue, and as a result some of them employ the controversial practice of allowing advertisers to pay to have their pages ranked higher in search results.
