How Do Search Engines Work - Web Crawlers
It is the search engines that ultimately bring your website to
the notice of prospective customers. It therefore helps to know
how these search engines actually work and how they present
information to the person initiating a search.
There are basically two types of search engines. The first type
is powered by robots called crawlers or spiders.
Search engines use spiders to index websites. When you submit
your website pages to a search engine by completing its
submission page, the search engine's spider will crawl your
site. A 'spider' is an automated program run by the search
engine system. A spider visits a website, reads the content on
the actual site and the site's meta tags, and follows the links
on the site. The spider then returns all that information to a
central repository, where the data is indexed. It will visit
each link you have on your website and index those sites as
well. Some spiders will only index a certain number of pages on
your site, so don't create a site with 500 pages!
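
The crawl described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions:
the names PageParser, crawl, fetch, max_pages and the toy SITE
dictionary are hypothetical, and the "web" here is an in-memory
dictionary rather than real HTTP requests. It is not how any
particular search engine implements its spider.

```python
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects visible text, meta tags, and outgoing links from one page."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.meta = {}
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl.

    `fetch(url)` is a hypothetical callable returning the page's HTML,
    or None if the page cannot be retrieved. `max_pages` models the
    per-site page limit some spiders apply (the value is an assumption).
    """
    repository = {}          # stands in for the central repository
    queue = [start_url]
    seen = {start_url}
    while queue and len(repository) < max_pages:
        url = queue.pop(0)
        html = fetch(url)
        if html is None:
            continue         # dead link: nothing to store
        parser = PageParser()
        parser.feed(html)
        repository[url] = {
            "text": " ".join(parser.text),
            "meta": parser.meta,
            "links": parser.links,
        }
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return repository


# Toy in-memory "web" used instead of real network access.
SITE = {
    "/": '<html><head><meta name="keywords" content="shoes, boots"></head>'
         '<body><p>Welcome</p><a href="/boots">Boots</a>'
         '<a href="/missing">Gone</a></body></html>',
    "/boots": '<html><body><p>Leather boots</p><a href="/">Home</a></body></html>',
}

repo = crawl("/", SITE.get)
```

Calling crawl("/", SITE.get, max_pages=1) stops after the first
page, which mirrors the point above about spiders that only
index a limited number of pages.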
The spider will periodically return to the sites to check for
any information that has changed. The frequency with which this
happens is determined by the operators of the search engine.
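
One simple way a returning spider could tell whether a page has
changed is to compare a fingerprint of the newly fetched content
with the one stored on the previous visit. The sketch below
assumes this hashing approach purely for illustration; the
function names and sample data are hypothetical, and real
engines may detect changes quite differently.

```python
import hashlib


def fingerprint(content):
    """Return a SHA-256 hex digest of the page content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def pages_to_reindex(stored, fetched):
    """Return URLs that are new or whose content no longer matches
    the fingerprint recorded on the previous visit."""
    changed = []
    for url, content in fetched.items():
        if stored.get(url) != fingerprint(content):
            changed.append(url)
    return changed


# Fingerprints saved on the last visit (hypothetical data).
stored = {"/": fingerprint("Welcome"), "/boots": fingerprint("Leather boots")}

# Content seen on the current visit.
fetched = {"/": "Welcome", "/boots": "Leather and suede boots", "/sale": "Sale"}

changed = pages_to_reindex(stored, fetched)
```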
The index a spider builds is almost like a book: it contains
the table of contents, the actual content, and the links and
references for all the websites the spider finds during its
search. A spider may index up to a million pages a day.
Examples: Excite, Lycos, AltaVista and Google.
When you ask a search engine to locate information, it is
actually searching through the index it has created, not the
Web itself. Different search engines produce different rankings
because not every search engine uses the same algorithm to
search through the indices.
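
To make this concrete, here is a minimal sketch of searching a
prebuilt index rather than the pages themselves, with two
interchangeable scoring functions standing in for two engines'
algorithms. Everything in it (build_index, search, by_frequency,
by_density and the sample PAGES) is a hypothetical illustration,
not any real engine's ranking method.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: word -> {url: occurrence count}.
    Also record each page's length in words."""
    index = defaultdict(dict)
    lengths = {}
    for url, text in pages.items():
        words = tokenize(text)
        lengths[url] = len(words)
        for word in words:
            index[word][url] = index[word].get(url, 0) + 1
    return index, lengths


def search(index, lengths, query, score):
    """Return URLs containing every query word, best score first.
    Only the index is consulted; the pages are never re-read."""
    words = tokenize(query)
    if not words:
        return []
    postings = [index.get(word, {}) for word in words]
    matches = set(postings[0]).intersection(*postings[1:])

    def page_score(url):
        count = sum(p[url] for p in postings)
        return score(count, lengths[url])

    # Ties are broken alphabetically so the order is deterministic.
    return sorted(matches, key=lambda url: (-page_score(url), url))


def by_frequency(count, length):
    """Engine A (assumed): more occurrences rank higher."""
    return count


def by_density(count, length):
    """Engine B (assumed): occurrences relative to page length."""
    return count / length


PAGES = {
    "/boots": "Boots boots boots and shoes for sale in leather and suede",
    "/shop": "Boots shop",
    "/hats": "Hats only",
}
index, lengths = build_index(PAGES)

ranking_a = search(index, lengths, "boots", by_frequency)
ranking_b = search(index, lengths, "boots", by_density)
```

Both rankings are drawn from the same index, yet the two scoring
functions order the matching pages differently.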
One of the things that a search engine algorithm scans for is
the frequency and location of keywords on a web page, but it
can also detect artificial keyword stuffing, or spamdexing. The
algorithms then analyze the way that pages link to other pages
on the Web. By checking how pages link to each other, an engine
can determine what a page is about when the keywords of the
linked pages are similar to the keywords on the original page.
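
As a rough illustration of scoring by keyword frequency and
location, the sketch below counts a keyword in the title and
body, weights title matches more heavily, and gives no credit
when the keyword makes up a suspiciously large share of the
body. The weights, the density threshold and the function names
are all assumptions chosen for the example; actual engines do
not publish their values.

```python
import re

TITLE_WEIGHT = 3.0   # assumed: a title match counts three times
BODY_WEIGHT = 1.0    # assumed: a body match counts once
MAX_DENSITY = 0.25   # assumed: above this share, treat as stuffing


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(title, body, keyword):
    """Score a page for one keyword by frequency and location.
    Returns 0.0 when the body looks keyword-stuffed."""
    keyword = keyword.lower()
    title_hits = tokenize(title).count(keyword)
    body_words = tokenize(body)
    body_hits = body_words.count(keyword)
    density = body_hits / len(body_words) if body_words else 0.0
    if density > MAX_DENSITY:
        return 0.0   # likely spamdexing: no credit for this keyword
    return TITLE_WEIGHT * title_hits + BODY_WEIGHT * body_hits


honest = keyword_score(
    "Leather boots",
    "We sell boots made of leather and repair old boots too",
    "boots",
)
stuffed = keyword_score(
    "Boots boots boots",
    "boots boots cheap boots boots best boots",
    "boots",
)
```

Here the honest page earns credit for one title match and two
body matches, while the stuffed page scores zero because the
keyword accounts for most of its body text.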