How do search engines work?

I’m not talking about the browser itself but the internal workings of a search engine. Incidentally, the engines won’t tell you precisely how they do it, as their custom algorithms are their core business. However, we can look at how search engines gather their data, the main type being crawler-based technology.

Search engines, such as Google, create listings automatically. They crawl or spider the web, and index what they find. If you change your web pages, crawler-based search engines eventually find these changes.

Crawler-based search engines have three elements. First is the spider, also called the crawler. The spider visits a web page, reads it, and then follows links to other pages within the site. Spiders return to sites on a regular basis to capture updated information.
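
To make the idea concrete, here is a minimal sketch of a spider in Python, assuming a hypothetical start page at example.com. A real crawler adds politeness rules (robots.txt, rate limits) and large-scale scheduling; this only shows the core loop of fetching a page, reading it, and following its links within the same site.

    # Minimal spider sketch: fetch, read, follow same-site links.
    from html.parser import HTMLParser
    from urllib.parse import urljoin, urlparse
    from urllib.request import urlopen

    class LinkExtractor(HTMLParser):
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            # Collect the href of every anchor tag on the page.
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def crawl(start_url, max_pages=10):
        site = urlparse(start_url).netloc
        queue, seen, pages = [start_url], set(), {}
        while queue and len(pages) < max_pages:
            url = queue.pop(0)
            if url in seen:
                continue
            seen.add(url)
            try:
                html = urlopen(url).read().decode("utf-8", errors="ignore")
            except OSError:
                continue  # skip pages that fail to load
            pages[url] = html  # hand the page off to the indexer
            parser = LinkExtractor()
            parser.feed(html)
            for link in parser.links:
                absolute = urljoin(url, link)
                if urlparse(absolute).netloc == site:  # stay within the site
                    queue.append(absolute)
        return pages

    # "https://example.com/" is a placeholder, not a real crawl target.
    pages = crawl("https://example.com/")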

What the spider finds gets passed to the second part of the search engine, the index. The index, sometimes called the catalogue, is like a giant book containing a copy of every web page the spider finds. If a web page changes, the book is updated with the new information. It can take a while for those changes to appear, as a freshly crawled page takes time to be added to the catalogue.
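
A toy version of that giant book is an inverted index: a map from each word to the pages it appears on. The sketch below (plain Python, made-up URLs) also shows why updates lag: a page’s entries are only rewritten once the spider delivers a fresh copy.

    # Minimal inverted index: word -> set of page URLs.
    import re
    from collections import defaultdict

    index = defaultdict(set)   # word -> pages containing it
    documents = {}             # url -> latest crawled text

    def index_page(url, text):
        # Drop the page's old entries before adding the new ones,
        # so re-indexing a changed page replaces its stale words.
        old_words = set(re.findall(r"[a-z]+", documents.get(url, "").lower()))
        for word in old_words:
            index[word].discard(url)
        documents[url] = text
        for word in set(re.findall(r"[a-z]+", text.lower())):
            index[word].add(url)

    index_page("https://example.com/a", "search engines crawl the web")
    index_page("https://example.com/b", "spiders index every web page")
    print(index["web"])  # both pages contain "web"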

The search engine software is the third part. This is the program that sifts through the millions of pages recorded in the index, finds matches to a search, and ranks them in order of what it believes is most relevant.
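
Here is a correspondingly minimal sketch of that third element, again in Python with hypothetical pages. Real engines weigh hundreds of ranking signals; this one scores relevance purely by how often the query words occur on each page.

    # Minimal search software: match pages, rank by term frequency.
    import re
    from collections import defaultdict

    def build_index(documents):
        index = defaultdict(set)
        for url, text in documents.items():
            for word in set(re.findall(r"[a-z]+", text.lower())):
                index[word].add(url)
        return index

    def search(query, index, documents):
        words = set(re.findall(r"[a-z]+", query.lower()))
        # Collect every page that matches at least one query word.
        matches = set()
        for word in words:
            matches |= index.get(word, set())
        # Score each match by how often the query words occur on it.
        def score(url):
            text = documents[url].lower()
            return sum(text.count(word) for word in words)
        return sorted(matches, key=score, reverse=True)

    documents = {
        "https://example.com/a": "web spiders crawl the web for search engines",
        "https://example.com/b": "this page mentions search just once",
    }
    index = build_index(documents)
    # Page /a ranks first: it mentions the query words more often.
    print(search("web search", index, documents))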

You can manually intervene in this process and submit details about your site via the search engine’s online submission forms. Although this step is human-initiated, the spider then follows the same process as above and adds your content to the index.