A search engine is, in effect, an information retrieval (IR) system: it builds an index and then matches search terms against that index.
Search engines have three major components. The first is the spider, also called the crawler. The spider visits a web page, reads it, and then follows links to other pages within the site. This is what it means when someone refers to a site being "spidered" or "crawled."
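The read-then-follow-links behavior of a spider can be illustrated with a minimal sketch. Everything here is an assumption made for the example: the `SITE` dictionary stands in for a real website (a real spider would fetch pages over HTTP), and `crawl` and `LinkExtractor` are hypothetical names, not part of any search engine's code.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical in-memory "site": path -> HTML source of that page.
SITE = {
    "/": '<a href="/about">About</a> <a href="/products">Products</a>',
    "/about": '<a href="/">Home</a>',
    "/products": '<a href="/products/widget">Widget</a> <a href="/missing">Gone</a>',
    "/products/widget": "<p>A widget.</p>",
}


def crawl(start, fetch):
    """Breadth-first crawl: read a page, then follow its links to unseen pages."""
    seen = {start}
    queue = deque([start])
    pages = {}
    while queue:
        url = queue.popleft()
        source = fetch(url)
        if source is None:  # broken link: nothing to read
            continue
        pages[url] = source
        parser = LinkExtractor()
        parser.feed(source)
        for link in parser.links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return pages


crawled = crawl("/", SITE.get)
print(sorted(crawled))
```

The `seen` set is what keeps the spider from looping forever when pages link back to each other, as the home page and the "About" page do here.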
Everything the spider finds goes into the second part of the search engine, the index. The index, sometimes called the catalog, is like a giant book containing a copy of every web page that the spider finds. If a web page changes, the index is updated with the new information.
Sometimes it can take a while for new pages or changes that the spider finds to be added to the index. Thus, a web page may have been "spidered" but not yet "indexed." Until it is indexed (added to the index), it is not available to those searching with the search engine.
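The gap between "spidered" and "indexed" can be shown with a small sketch. It assumes a simple inverted index (a map from each word to the pages containing it) and a hypothetical `pending` queue that holds spidered pages until `flush` is called; real search engines are far more elaborate, and the class and method names are invented for illustration.

```python
import re
from collections import defaultdict


class Index:
    """Toy inverted index with a queue of spidered-but-not-yet-indexed pages."""

    def __init__(self):
        self.pending = {}                 # url -> text, spidered but not indexed
        self.postings = defaultdict(set)  # term -> set of urls containing it
        self.terms_by_url = {}            # url -> terms currently indexed for it

    def add_spidered(self, url, text):
        """Record what the spider found; the page is not searchable yet."""
        self.pending[url] = text

    def flush(self):
        """Move pending pages into the index, replacing any older copy."""
        for url, text in self.pending.items():
            for term in self.terms_by_url.pop(url, ()):
                self.postings[term].discard(url)  # drop the stale entry
            terms = set(re.findall(r"[a-z0-9]+", text.lower()))
            for term in terms:
                self.postings[term].add(url)
            self.terms_by_url[url] = terms
        self.pending.clear()

    def lookup(self, term):
        return set(self.postings.get(term.lower(), set()))


index = Index()
index.add_spidered("/a", "Fresh coffee beans")
before = index.lookup("coffee")  # spidered only: no match yet
index.flush()
after = index.lookup("coffee")   # indexed: the page is now found

# The page changes; once re-indexed, the old words no longer match.
index.add_spidered("/a", "Fresh tea leaves")
index.flush()
print(before, after, index.lookup("coffee"), index.lookup("tea"))
```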
Search engine software is the third part of a search engine. This program sifts through the millions of pages recorded in the index to find matches to a search and ranks them in the order it believes is most relevant.
The steps a search engine takes to display a formatted, ranked results page are:
- Create an index.
- Receive a query: a set of search terms and commands.
- Look in the index file for matches.
- Gather the matching page entries and rank them by relevance.
- Return the results page in HTML to the searcher’s web browser.
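The five steps above can be sketched end to end. This is an illustration under stated assumptions, not how any particular engine works: `PAGES` is a made-up collection, the "index" is simply a count of words per page, and relevance is taken to be the number of times the query words appear. The function names are hypothetical.

```python
import html
import re
from collections import Counter

# Hypothetical page collection: url -> page text.
PAGES = {
    "/coffee": "Coffee beans and coffee grinders",
    "/tea": "Tea leaves and tea pots",
    "/drinks": "Coffee and tea compared",
}


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def create_index(pages):
    """Step 1: build the index (here, word counts for each page)."""
    return {url: Counter(tokenize(text)) for url, text in pages.items()}


def search(index, query):
    """Steps 2-4: take a query, find matching pages, rank them by relevance."""
    terms = tokenize(query)
    scores = {}
    for url, counts in index.items():
        score = sum(counts[term] for term in terms)
        if score > 0:  # keep only pages that match at least one term
            scores[url] = score
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))


def render(query, urls):
    """Step 5: return the results as HTML for the searcher's browser."""
    items = "".join(
        f'<li><a href="{html.escape(url)}">{html.escape(url)}</a></li>'
        for url in urls
    )
    return f"<h1>Results for {html.escape(query)}</h1><ol>{items}</ol>"


results = search(create_index(PAGES), "coffee")
page = render("coffee", results)
print(results)
print(page)
```

Escaping the query with `html.escape` before placing it in the page is a small but important detail, since the query text comes straight from the user.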
How different types of search engines use the above steps to search for a page:
1. Crawler-Based Search Engines
Crawler-based search engines, such as Google, create their listings automatically. They "crawl" or "spider" the web, then people search through what they have found.
If you modify your web pages, crawler-based search engines eventually find these changes, and that can affect how you are listed. Page titles, body copy and other elements all play a role.
2. Human-Powered Directories
A human-powered directory, such as the Open Directory, depends on humans for its listings. You submit a short description of your entire site to the directory, or editors write one for sites they review. A search looks for matches only in the descriptions submitted. Altering your web pages has no effect on your listing. A good site with good content might be more likely to get reviewed for free than a poor site.
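A short sketch makes the difference from a crawler-based engine concrete: the directory search below looks only at the submitted descriptions and never at the pages themselves. The listings, the page text, and the `directory_search` function are all invented for illustration.

```python
# Hypothetical directory listings: one human-written description per site.
DIRECTORY = [
    {"url": "example.com", "description": "Handmade oak furniture"},
    {"url": "example.org", "description": "Guides to growing tomatoes"},
]

# Hypothetical current page text. The directory search never reads this.
PAGE_CONTENT = {
    "example.com": "Handmade oak furniture, and now also fresh tomatoes",
}


def directory_search(listings, query):
    """Return sites whose description contains every word of the query."""
    terms = query.lower().split()
    matches = []
    for entry in listings:
        words = entry["description"].lower().split()
        if all(term in words for term in terms):
            matches.append(entry["url"])
    return matches


# example.com now mentions tomatoes on its page, but its listing does not,
# so only example.org is returned.
print(directory_search(DIRECTORY, "tomatoes"))
```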
How Google displays search results:
Google uses roughly 100 algorithms, whose results are combined to produce a search ranking. The factors are applied in this order:
- Finding pages with keywords that match the search
- Ranking the page based on page content factors, including keywords and keyword density
- Measuring inbound anchor text
- Multiplying the result by PageRank to produce the final listing.
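The order of factors above can be illustrated with a toy calculation. This is emphatically not Google's formula: the weighting of anchor text (0.1 per matching link), the way the two content signals are added, and the PageRank values are all assumptions chosen only to show how a final multiplication by PageRank can reorder pages.

```python
def tokenize(text):
    return text.lower().split()


def keyword_density(text, term):
    """Share of the page's words that are the search term."""
    words = tokenize(text)
    return words.count(term) / len(words) if words else 0.0


def anchor_matches(anchors, term):
    """Number of inbound links whose anchor text contains the term."""
    return sum(1 for anchor in anchors if term in tokenize(anchor))


def rank(pages, term):
    scored = []
    for url, page in pages.items():
        density = keyword_density(page["text"], term)
        if density == 0:
            continue  # step 1: keep only pages that match the search
        # Steps 2-3: content factors plus inbound anchor text (assumed weight).
        content_score = density + 0.1 * anchor_matches(page["inbound_anchors"], term)
        # Step 4: multiply by PageRank for the final score.
        scored.append((content_score * page["pagerank"], url))
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in ordered]


# Hypothetical pages with made-up PageRank values.
PAGES = {
    "/a": {"text": "coffee coffee shop",
           "inbound_anchors": ["best coffee"], "pagerank": 0.2},
    "/b": {"text": "coffee guide for beginners",
           "inbound_anchors": ["coffee guide", "learn coffee"], "pagerank": 0.5},
    "/c": {"text": "tea guide", "inbound_anchors": [], "pagerank": 0.9},
}

ranking = rank(PAGES, "coffee")
print(ranking)
```

In this example "/a" has the higher keyword density, but "/b" ranks first because of its stronger inbound links and higher PageRank, while "/c" is excluded despite its high PageRank because it does not match the search at all.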
Google is highly recommended as a first stop in your search for whatever you are looking for.