Copyright B-SeenOnTop LLC, Haverford PA 19041. All rights reserved. "B-SeenOnTop" and its logo are trademarks of B-SeenOnTop LLC. All other trademarks, names, and logos are the property of their respective holders. This page: how search engines work. Serving: Ardmore, Mainline, and Center City Philadelphia, Philly, Bryn Mawr, Haverford, Villanova, Gladwyne, Narberth, Wynnewood, Bala Cynwyd, Paoli, Devon, Penn Valley, Radnor, Wayne, Lower Merion.
This page is about search engines. How do they work? What functions do they perform? What happens when a person conducts a search?
Our Goal is to have your website...
B-SeenOnTop of Search Engine Results!
How Search Engines Work
Not clear? Have questions? Please contact us. One of our goals is to help demystify search engine optimization, and to make our clients at least partially self-sufficient.
Toll Free: 1-877-691-8989 Local: 484-437-7977
Last Revised: 03/12/11
A search engine is a large and complex computer system that is used to find and index content on the Internet. Google is the best-known search engine and is used to conduct roughly 65 percent of searches worldwide. There are about 30 recognized general-purpose search engines, and countless special-purpose and single-site search engines across the Web as a whole.
Search engines perform three major functions:
They find (spider or crawl) web pages.
They inventory (index) page content.
They capture user queries (keyword phrases), search their index, and rank order and display findings (results).
A highly simplified, 30,000-foot visual representation of these search engine functions is displayed below. Notice that two major streams of activity take place, one on top and one on the bottom: (1) the spider and index functions; and (2) the search function.
[Diagram: How Search Engines Work]
(1) Spider and Index Functions
Computer programs called spiders or bots continuously crawl the web by following identified links, page by page. When new and/or updated web page content is found, a search engine database is populated (indexed) with every single word, every image, and every incoming, internal and outgoing link to the page. Much more is captured, but these are the important parts.
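To make the indexing step a little more concrete, here is a minimal sketch in Python of how a program might record the words, images, and links found on a single page. It uses only the standard library. The class name PageRecorder, the function record_page, and the sample page are made up for illustration; a real search engine captures far more and does it very differently.

```python
from html.parser import HTMLParser
import re


class PageRecorder(HTMLParser):
    """Collects the words, image sources, and link targets found on one page."""

    def __init__(self):
        super().__init__()
        self.words = []
        self.images = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])

    def handle_data(self, data):
        # Lowercase the visible text and split it into simple word tokens.
        self.words.extend(re.findall(r"[a-z0-9]+", data.lower()))


def record_page(html):
    """Return a small record of what a (hypothetical) spider saw on the page."""
    recorder = PageRecorder()
    recorder.feed(html)
    recorder.close()
    return {
        "words": recorder.words,
        "images": recorder.images,
        "links": recorder.links,
    }


# A made-up page used only for this example.
sample_html = (
    "<html><body><h1>Green Tea</h1>"
    "<p>Fresh green tea leaves.</p>"
    '<img src="/img/tea.jpg" alt="tea">'
    '<a href="/about.html">About us</a></body></html>'
)

record = record_page(sample_html)
print(record["words"])
print(record["images"])
print(record["links"])
```

The links collected here are what a spider would follow next, and the word list is what would be stored in the index for this page.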
Spiders return, on average, every four to six weeks to check for updated content. If site content is updated more frequently, the spiders return more frequently. This is one reason why blogs are popular. By maintaining a blog, you are "teaching" search engine spiders to return to your site more frequently to check for new and updated content.
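Search engines do not publish how they schedule return visits, so the following is only a sketch of the general idea: visit sooner when a page has changed, and wait longer when it has not. The halving and doubling rule, the 1-day and 42-day limits, and the function names are all assumptions chosen for illustration.

```python
import hashlib

MIN_DAYS = 1
MAX_DAYS = 42  # about six weeks, the upper end of the range mentioned above


def content_fingerprint(text):
    """Return a short, stable fingerprint of the page content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def next_interval(current_days, old_fingerprint, new_text):
    """Hypothetical rule: halve the wait if the page changed, otherwise double it."""
    new_fingerprint = content_fingerprint(new_text)
    if new_fingerprint != old_fingerprint:
        days = max(MIN_DAYS, current_days // 2)
    else:
        days = min(MAX_DAYS, current_days * 2)
    return days, new_fingerprint


# Simulate three return visits to a blog that starts posting regularly.
interval = 28
fingerprint = content_fingerprint("first post")
for visit_text in ["first post", "second post", "third post"]:
    interval, fingerprint = next_interval(interval, fingerprint, visit_text)
    print(interval)
```

In this simulation the wait grows while the page is unchanged and shrinks once new posts start appearing, which is the "teaching" effect described above.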
Not every page is indexed by a search engine. Spiders are relatively unsophisticated and can sometimes have difficulty finding or traversing a website. Orphan pages (pages without any inbound links) simply won't be found. Dynamic pages (constructed on the fly using parameters input by human searchers) sometimes aren't indexed. Spiders can also be specifically instructed not to crawl or index certain files and/or directories on a site, using a robots.txt file or robots meta tags. These are all common reasons why some websites have little or no visibility on the web. If you are curious, you can type your web page address into a spider simulation tool and see exactly what a spider sees when it visits your web page. This is a good way to validate that nothing in your site design is impeding the indexing of your site.
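One widely used way to give spiders such instructions is a robots.txt file placed at the root of a site. As a rough illustration, the sketch below uses Python's standard urllib.robotparser module to check which addresses a well-behaved spider would be allowed to fetch. The robots.txt content, the example.com addresses, and the spider name "ExampleBot" are made up for this example.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt file that blocks one directory and one file.
robots_txt = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/secret.html
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

urls = [
    "http://www.example.com/index.html",
    "http://www.example.com/private/report.html",
    "http://www.example.com/drafts/secret.html",
]

for url in urls:
    # can_fetch() reports whether the named spider may request the address.
    print(url, parser.can_fetch("ExampleBot", url))
```

If an important page shows up as blocked in a check like this, the robots.txt file is a likely reason it is missing from search results.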
In our example above, Page C is not stored in the index or database because there are no inbound links to the page. A page must have at least one incoming link, from a page the spider already knows about, before the spider can find it by crawling.
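A simple way to spot pages like Page C is to count the inbound links for every page on a site. The sketch below does this for a tiny, hypothetical link map; the page names and the links between them are assumptions for illustration and are not taken from the diagram.

```python
# Hypothetical link map: each page lists the pages it links to.
links = {
    "A": ["B"],
    "B": ["A"],
    "C": ["A"],  # Page C links out, but no page links to it
}


def find_orphans(link_graph):
    """Return the pages that no other page links to."""
    inbound = {page: 0 for page in link_graph}
    for source, targets in link_graph.items():
        for target in targets:
            # Ignore self-links and links to pages outside the map.
            if target != source and target in inbound:
                inbound[target] += 1
    return sorted(page for page, count in inbound.items() if count == 0)


print(find_orphans(links))
```

Any page reported here would need a link from an already-indexed page before a spider could reach it by following links.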
(2) Search Function
In the bottom stream of our diagram, the third function, search, occurs on demand when a user wants to find information on the web. Note that search engines never actually search the web at the time of the query. It would simply take too long. In reality, the search is conducted using the previously described search engine index or database. The keywords in the search phrase are looked up in the index. When a match is found, an algorithm is run to determine its relevance. The details of these "ranking algorithms" are closely held secrets of the search engine companies.
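The lookup step can be sketched in a few lines of Python. Because real ranking algorithms are secret, the example below substitutes a deliberately naive stand-in: it simply counts how many times the query words appear on each page. The page names and text are made up, and nothing here reflects how any actual search engine scores results.

```python
from collections import Counter, defaultdict

# Hypothetical pages that have already been crawled.
pages = {
    "page_a": "search engines index web pages",
    "page_b": "spiders crawl web pages and follow links between pages",
}

# Build the index ahead of time: word -> {page: number of occurrences}.
index = defaultdict(Counter)
for page, text in pages.items():
    for word in text.lower().split():
        index[word][page] += 1


def search(query):
    """Look up each query word in the index and rank pages by total matches."""
    scores = Counter()
    for word in query.lower().split():
        for page, count in index.get(word, {}).items():
            scores[page] += count
    return [page for page, _ in scores.most_common()]


print(search("web pages"))
print(search("unicorn"))
```

Notice that search() never reads the pages themselves. It only consults the index that was built beforehand, which is why the lookup can be fast.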
Using our Page C example again, the word "green" is the subject of a search. No match is found because the page containing the word "green" is not captured in the search engine index.
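The Page C example can be played out end to end with a toy model. In the sketch below, a spider starts at Page A, follows links, and indexes what it finds. The three pages, their text, and their links are assumptions invented for illustration; only the idea that Page C contains the word "green" and has no inbound links comes from the example above.

```python
from collections import deque

# Hypothetical miniature web: page -> (text, outgoing links).
web = {
    "A": ("red apples", ["B"]),
    "B": ("blue skies", ["A"]),
    "C": ("green grass", ["A"]),  # no page links to C
}


def crawl_and_index(seed):
    """Follow links outward from the seed page and index every word found."""
    index = {}
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        text, outgoing = web[page]
        for word in text.split():
            index.setdefault(word, set()).add(page)
        for target in outgoing:
            if target in web and target not in seen:
                seen.add(target)
                queue.append(target)
    return index


index = crawl_and_index("A")
print(index.get("green", set()))  # Page C was never reached, so nothing matches
print(index.get("blue", set()))
```

The page containing "green" exists, but since the spider never reached it, the word is missing from the index and the search comes back empty.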