Search Engines
To improve your site’s rankings on search engines, it helps to have a basic understanding of how search engines work.
Search engines do not perform live or real-time searches of the web. Instead, they search their own database, which contains ‘snapshots’ of millions, even billions, of web pages. An engine attempts to copy and organize the information on the web into this database; when you enter a search query, it searches the database and returns results chosen by an algorithm, with each result pointing to the URL of a web page.
This search process occurs in three stages: crawling, indexing and retrieving.
1) Crawling: The engine’s robotic crawlers, also known as spiders, visit each available web page and make a partial or full copy of it (a process known as caching). Spiders travel from page to page by following the links on each page.
2) Indexing: The engine removes or devalues duplicate pages, removes ‘spam’ pages, and then catalogues and indexes each page according to the content of the page, which includes both text and markup (code).
3) Retrieving: Once a page is indexed, it is available for retrieval in search results. A retrieval algorithm determines the results and their order; each engine has its own algorithm, which is why results vary from engine to engine.
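The three stages can be sketched in a few lines of Python. This is only a toy illustration: the `WEB` dictionary, its `*.example` URLs and the function names are made up for the example, and a real engine works at a vastly larger scale with far more elaborate ranking.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
WEB = {
    "a.example": ("used books for sale", ["b.example", "c.example"]),
    "b.example": ("rare books and maps", ["a.example"]),
    "c.example": ("used cars", []),
}

def crawl(start):
    """Stage 1: follow links breadth-first and cache each page's text."""
    cache, queue = {}, deque([start])
    while queue:
        url = queue.popleft()
        if url in cache or url not in WEB:
            continue  # already cached, or the link leads nowhere
        text, links = WEB[url]
        cache[url] = text
        queue.extend(links)
    return cache

def build_index(cache):
    """Stage 2: map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in cache.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def retrieve(index, query):
    """Stage 3: return URLs containing every query word, sorted by URL."""
    sets = [index.get(word, set()) for word in query.lower().split()]
    return sorted(set.intersection(*sets)) if sets else []

cache = crawl("a.example")
index = build_index(cache)
print(retrieve(index, "used books"))  # ['a.example']
```

Note that the query never touches `WEB` itself: `retrieve` only consults the index built from the cached copies, which is the point made above about engines searching their own database rather than the live web.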
Performing a search
Let us use Google to perform a quick search for “used books”. In 0.2 seconds, the first search engine results page (SERP) is generated, featuring the top ten of about 200 million results related to the query. Google now also features ‘web options’, enabling you to narrow your results down in a number of ways. Each result also has a link to its cached copy on Google, and a link to a list of similar results.
Before these results are generated, Google’s spiders have already accessed all these pages, cached them in its database and organized them. Each page is parsed and stored in Google’s database as a collection of words, which are used to determine what the page is about. Each page also carries its own metadata, such as its age, type (news, forums, shopping etc.), popularity and authority.
When you search for “used books”, Google searches through its own database to look for:
- pages that contain the exact phrase “used books”,
- pages where the words “used” and “books” appear close together,
- pages that contain both words, though not necessarily close together,
- pages that contain other variations of both words, such as “use” and “book”,
- pages that are linked by other pages with “used books” in the link text, and
- pages that are linked by other pages with “used” and “books” in the link text.
One or more of these criteria are satisfied by over 200 million web pages in Google’s massive database.
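As a rough sketch, the checks listed above could be expressed like this. Everything here is an assumption made for illustration: the function names, the `window` of three words used to define “close together”, and the deliberately crude stemmer. Google’s actual matching logic is not public.

```python
def crude_stem(word):
    """Toy stemmer (assumption): 'used' -> 'use', 'books' -> 'book'."""
    if word.endswith("ed"):
        return word[:-1]
    if word.endswith("s"):
        return word[:-1]
    return word

def contains_phrase(words, terms):
    """True if the list `terms` appears as a consecutive run inside `words`."""
    n = len(terms)
    return any(words[i:i + n] == terms for i in range(len(words) - n + 1))

def match_criteria(body, anchor_texts, query, window=3):
    """Return the names of the criteria a page satisfies for the query."""
    words = body.lower().split()
    terms = query.lower().split()
    if not terms:
        return []
    matched = []

    if contains_phrase(words, terms):
        matched.append("exact phrase")

    # For each query term, the positions at which it occurs in the body.
    positions = [[i for i, w in enumerate(words) if w == t] for t in terms]
    if all(positions):
        # Close together: every other term occurs within `window` words
        # of some occurrence of the first term.
        if any(all(any(abs(p - q) <= window for q in other)
                   for other in positions[1:])
               for p in positions[0]):
            matched.append("words close together")
        matched.append("all words present")

    page_stems = {crude_stem(w) for w in words}
    if all(crude_stem(t) in page_stems for t in terms):
        matched.append("word variations")

    # Link text of inbound links, split into words.
    anchors = [a.lower().split() for a in anchor_texts]
    if any(contains_phrase(a, terms) for a in anchors):
        matched.append("phrase in link text")
    if any(all(t in a for t in terms) for a in anchors):
        matched.append("all words in link text")

    return matched

result = match_criteria("we sell used and rare books",
                        ["cheap used books"], "used books")
print(result)
```

A page needs to satisfy only one of these checks to count as a candidate result; how the candidates are ordered is a separate question.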
Ranking Factors
The order in which these millions of results are returned depends on their relative relevance to your query. In short, Google aims to return the most relevant results first, and the least relevant last. The calculation of this relevance is of utmost importance to website developers and optimizers.
In order to determine a page’s rank for a specific query, two main criteria are used by all major search engines:
1. Keyword relevance: how central is the search term to the theme or meaning of the content on the page?
2. Page trust: how popular and trusted is the page on which the term appears?
These criteria are further broken down into over 200 individual factors, and search engines conceal their exact algorithms from users and webmasters in order to avoid manipulation and spamming.
Keyword Relevance
Keyword relevance depends heavily on on‐page factors. In addition to the raw text or body of the page, search engines use a number of criteria to understand what the page is about:
- the title of the page: does the keyword appear in the title of the page?
- the prominence and placement of keywords on the page: is the keyword emphasized on the page by being used in headings, bold text, italicized text, link text, bulleted lists or larger text?
- the meta description of the page: does the ‘description’ meta tag in the page’s markup contain the keywords?
- keyword density: does the keyword appear a number of times?
- anchor text (link text): do inbound links to the page contain the keyword?
A page where “used books” appears once or twice in the body of the page will be considered less relevant than a page where it appears in content headings, which will be less relevant than a page titled “used books” which also uses it in the meta description and main body of the page.
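The ordering described above can be mimicked with a simple weighted score. The weights and field names below are invented purely for illustration; search engines do not publish theirs, and a real score would combine many more signals.

```python
# Illustrative weights only (assumption): a keyword in the title counts
# for more than one in a heading, which counts for more than body text.
WEIGHTS = {"title": 5.0, "headings": 3.0, "meta_description": 2.0, "body": 1.0}

def relevance(page, keyword):
    """Sum the weights of every page field that contains the keyword."""
    keyword = keyword.lower()
    return sum(weight for field, weight in WEIGHTS.items()
               if keyword in page.get(field, "").lower())

# Three hypothetical pages matching the three cases in the text.
body_only = {"title": "Welcome", "body": "We stock used books and more."}
in_headings = {"title": "Welcome", "headings": "Used books",
               "body": "Browse our used books."}
title_and_meta = {"title": "Used books",
                  "meta_description": "Buy used books online.",
                  "body": "Thousands of used books in stock."}

for page in (body_only, in_headings, title_and_meta):
    print(relevance(page, "used books"))
# prints 1.0, then 4.0, then 8.0
```

Because every field is weighted, a page cannot reach the top score through body text alone, however often the keyword is repeated there.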
Page Trust
If rankings depended only, or even heavily, on on-page factors, it would be quite easy to manipulate search engines. Since search engines don’t have a human understanding of meaning, it would be easy to create spam pages where keywords appear in the right places (title, headings, bold text etc.). Just a few years ago, it was not uncommon for a top result in Google to be an irrelevant page, where keywords were either hidden or used without relevant meaning.
In order to lower the ranking of such low‐quality results, off‐page factors have gained a great deal of importance in search rankings. In short, search engines now put a lot of weight on what other pages or websites think of a page.
Since spiders (and humans) navigate the web through links, search engines use linking as a way to determine the reliability of the linked page. By linking to another page, a web page leads humans and spiders to it, so a backlink (inbound link to a page) may be considered a vote of confidence for the page, and the words which appear in the link text are used in determining what the linked page is about.
A page’s reliability or trust depends on a number of factors:
- The authority of the domain: how reliable and trusted is the main domain? A page on Wikipedia, for example, is much more reliable than a page on a new or low-traffic website. Domain authority is determined by a number of factors, including age, traffic and link popularity.
- Number of backlinks: how many pages are linking to this page, and from how many different sources (domains)?
- The authority of the linking page: how important is the page from which it is linked? A link from a high authority website (such as .edu or .gov sites) is more valuable than a link from an unreliable source such as a blog.
Google has its own method, known as PageRank™, of evaluating the backlinks to a page. Each indexed page is assigned a weighted number between 0 and 10 which signifies its link popularity. You can check the PageRank of a page with an online PageRank checker or by installing the Google Toolbar. Note that the published PageRank is updated only every few months and can sometimes vary unpredictably, so it should not be relied upon.
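To make the ‘vote of confidence’ idea concrete, here is a minimal sketch of the textbook PageRank iteration. It is a simplification and not Google’s production algorithm: the damping value of 0.85, the fixed number of iterations and the three-page link graph are all assumptions chosen for the example. The values it produces are fractions that sum to 1, not the toolbar score mentioned above.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal rank
    for _ in range(iterations):
        # Every page receives a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link graph: a and b both link to c, and c links back to a.
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this graph, c ends up with the highest rank because two pages link to it, a comes second because its single backlink comes from the well-ranked c, and b, which nobody links to, keeps only the baseline share.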

















