Search Engines

From Isopedia

History

The creation of the ARPANET in 1969, built for the United States military, gave rise to a new method of communication. In 1972, the ARPANET was demonstrated publicly, and the groundwork for what we now know as the "Internet" was laid. Soon thereafter, the public exchange of information over the network grew rapidly. The problem was that people using the network had no way of searching for content: sites and files were found only by word of mouth or by references from other sites. This problem was first addressed in 1990 by Alan Emtage, a student at McGill University in Montreal. This very first tool for searching the internet was called "Archie", which stands for "archives" without the "v". The program downloaded the directory listings of all the files located on public FTP (File Transfer Protocol) sites, creating a searchable database of filenames.

The first web robot was created in 1993 by MIT student Matthew Gray. It was called the World Wide Web Wanderer and was initially used to measure the size of the web. Later it was used to collect URLs, creating the first database of websites, called Wandex. Later that same year, Aliweb was created by Martijn Koster in order to build a directory just for the web. Webmasters would submit their own descriptions of their websites, allowing a more accurate listing. The new problem that arose was that the submission process was tedious, so many websites were never listed. A few months later, new robots (now called spiders) were created: JumpStation, the World Wide Web Worm, and the Repository-Based Software Engineering spider.

Soon the internet began to look profitable, and funding and investment flowed into it. Excite began in 1993 and introduced concept-based searching; it incorporated and went online in 1995. In 1994, Yahoo was created by Jerry Yang and David Filo. It began as a listing of their favorite websites, so its database was relatively small at first, but it later perfected the search directory. WebCrawler was also introduced in 1994 and added a further degree of accuracy by indexing the entire text of a web page. Lycos introduced ranked-relevance retrieval, prefix matching, and word proximity. AltaVista began in 1995 and was the first search engine to allow natural-language queries and advanced searching techniques. Around this time, multimedia searching was introduced. Inktomi was started in 1996 and later introduced a directory search engine powered by "concept induction" technology. Inktomi was purchased by Yahoo in 2003. AskJeeves, Northern Light, and the famous Google were all launched in 1997. Google uses PageRank, which estimates the importance of a page by looking at how many other pages link to it and how important those linking pages are. In 1998, MSN Search and the Open Directory were started. The Open Directory uses a human-edited directory of the web.

Search engines continue to grow, but not quickly enough. The internet is growing faster than the databases of search engines such as Google can. As search engines continue to evolve, it will be interesting to see how quickly and how well they can perform the search operations they were designed for.


Technical Information

The backbone of a search engine consists of several computers working together to search the web. They use web robot software, commonly known as spiders, to accomplish this task. The web robot sends a request to a server for a specific page and then downloads that page in its entirety. A spider can request thousands of pages simultaneously. The content and address of each page are then relayed to an index. The software also records any links found on the page and uses them for additional crawling. Following links in this way drastically reduces the time it takes for the engine to build up a stockpile of websites. However, the process presents several challenges due to the vastness of the internet and the constant updating of websites. Indexing software catalogs the information based on every word in the document. The pages are grouped by keyword, and the keywords are arranged in alphabetical order. Once the information is organized, it is transferred to the engine's database, which stores all of the websites to be used in response to a query. The user interacts with the database by searching for keywords. After receiving a query, the search engine compiles a list of websites containing these keywords and provides hyperlinks with their web addresses. The index allows for quick results, since the engine does not have to search each website in the database individually.
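The crawl, index, and query steps described above can be illustrated with a minimal Python sketch. It is not how any particular search engine is implemented. The names PAGES, LinkParser, crawl, build_index, and search are hypothetical, and a small in-memory dictionary of made-up example pages stands in for real HTTP requests so that the sketch runs without a network.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser
import re

# Assumption: a tiny in-memory "web" (URL -> HTML) replaces real HTTP fetches.
PAGES = {
    "http://a.example/": '<a href="http://b.example/">B</a> search engines index the web',
    "http://b.example/": '<a href="http://a.example/">A</a> '
                         '<a href="http://c.example/">C</a> spiders crawl the web',
    "http://c.example/": "an index makes search fast",
}


class LinkParser(HTMLParser):
    """Collects the link targets and the visible text of one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text.append(data)


def crawl(seed):
    """Breadth-first crawl from a seed URL; returns {url: visible text}."""
    seen, queue, docs = {seed}, deque([seed]), {}
    while queue:
        url = queue.popleft()
        html = PAGES.get(url)  # stand-in for downloading the page
        if html is None:
            continue
        parser = LinkParser()
        parser.feed(html)
        parser.close()
        docs[url] = " ".join(parser.text)
        # Record links found on the page for additional crawling.
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return docs


def build_index(docs):
    """Maps every word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Returns the sorted URLs that contain every keyword in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)


docs = crawl("http://a.example/")
index = build_index(docs)
print(sorted(index)[:5])           # keywords in alphabetical order
print(search(index, "the web"))    # pages containing both keywords
```

Because each keyword maps directly to the pages that contain it, a query is answered with a few dictionary lookups and set intersections instead of a scan of every stored page. A production engine would add ranking, persistent storage, and polite crawling rules, none of which are shown here.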




Team Members

  • Colin Fernandez
  • Chris Giroux
  • Shaun McCumber
  • Joe Lombardi