Development of the University of Illinois Web

Technology History and Background

Kaitlin Duck Sherwood

For years, computers have held the promise of the "paperless office", and for years, "promise" has been the operative word. Computers have instead tended to increase paper consumption. Electronic communications technologies - from the telegraph to electronic mail - have greatly improved the ability of people to interact with others. However, until very recently, there were very poor tools for obtaining information without the active, co-temporal intervention of the information provider. This is not true in the paper-based information world: anybody can go into a library or bookstore and peruse a book, walk into a car dealership and pick up promotional materials, or turn on the radio and hear an opera.

Until a few years ago, the only tool available on the Internet for distributing information to an unidentified audience was ftp (File Transfer Protocol). Using ftp software, people could access documents on any computer in the world, as long as the remote computer was connected to the Internet and configured with the proper software.

The ftp transactions used a very simple version of what is called a client-server architecture. One piece of software would be used by the person who wanted to get the information (the client), and another piece of software would be continuously running on the machine with the information (the server). To give an analogy, the server is like a butler for the computer. When a request is presented at the entrance to the computer, the server examines the request and determines if that request is understandable and can be fulfilled. If so, the server gets the information and presents it back to the requester. Thus clients can get information from the remote computer without being able to "enter" the computer themselves.

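To make the request-and-response pattern concrete, here is a minimal sketch in Python (a modern illustration, not software of the era); the port number and the one-line "GET greeting" request are invented for the example. The server waits for a request, decides whether it can fulfil it, and answers; the client connects, asks, and reads the reply.

    import socket
    import threading
    import time

    def serve(port=9999):
        # The "butler": wait at the door, read one request line, answer it.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
            srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            srv.bind(("127.0.0.1", port))
            srv.listen(1)
            conn, _addr = srv.accept()
            with conn:
                request = conn.recv(1024).decode().strip()
                if request == "GET greeting":               # a request it understands
                    conn.sendall(b"Hello from the server\n")
                else:                                       # a request it cannot fulfil
                    conn.sendall(b"ERROR unknown request\n")

    def ask(port=9999):
        # The client: present a request and read back the answer.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
            cli.connect(("127.0.0.1", port))
            cli.sendall(b"GET greeting\n")
            print(cli.recv(1024).decode().strip())          # "Hello from the server"

    threading.Thread(target=serve, daemon=True).start()
    time.sleep(0.2)                                         # give the server time to start
    ask()
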
A big advantage of a client-server architecture is hardware and operating system independence. If the interface between the client software and the server software is very well defined, client and server software can be written for several different types of computers. A Macintosh can use ftp client software to connect to an ftp server running on a Cray, or vice versa. Computers were no longer constrained to communicating only with other computers of the same type.

Unfortunately, finding and retrieving interesting resources with ftp was complicated. Looking at a graph of stock prices, for example, might take steps like the following (a scripted sketch of the same chore follows the list):

1. Learn, perhaps from a colleague or a mailing list, which remote machine held the file and where on that machine it was stored.
2. Connect to that machine with ftp client software and log in, usually as the "anonymous" guest user.
3. Hunt through the directory hierarchy, with little more than terse file names to go by.
4. Set the transfer mode appropriately and transfer the file to the local machine.
5. Disconnect, and then start up a completely separate program just to view the file.

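Purely as an illustration of how many separate steps were involved, here is a rough sketch of the same chore using Python's standard ftplib module; the host name, directory, and file name are invented.

    from ftplib import FTP

    # Each manual step above has a direct counterpart here.
    ftp = FTP("ftp.example.edu")                    # connect to the remote machine
    ftp.login("anonymous", "user@example.edu")      # log in as the guest user
    ftp.cwd("/pub/finance/charts")                  # hunt through the directories
    with open("stocks.gif", "wb") as f:
        ftp.retrbinary("RETR stocks.gif", f.write)  # transfer the file in binary mode
    ftp.quit()                                      # disconnect
    # ...and only now start a completely separate viewer to look at the image.
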
Needless to say, the appeal of such operations was limited. In 1991, the University of Minnesota began development of a different client-server system called gopher. Gopher clients presented a text-based, menu-driven, hierarchical organization of files, and allowed users to examine plain text files with the same software that they used to navigate through the menus. Furthermore, gopher provided the ability to search through all of a server's files for user-specified character strings, and allowed menu entries to point to other gopher sites. This allowed people to aggregate information by topic, even for information that was geographically dispersed.

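The conversation between a gopher client and server was simple: the client connected (traditionally to port 70), sent a selector string naming the menu or file it wanted, and read back tab-delimited menu lines until a lone "." marked the end. The sketch below is a modern Python illustration of that exchange, not period software; gopher.example.edu is a placeholder host name.

    import socket

    def gopher_menu(host, selector="", port=70):
        # Send one selector line, collect the server's reply.
        with socket.create_connection((host, port)) as s:
            s.sendall(selector.encode() + b"\r\n")
            data = b""
            while chunk := s.recv(4096):
                data += chunk

        items = []
        for line in data.decode(errors="replace").splitlines():
            if line == ".":                  # lone "." marks the end of the menu
                break
            kind = line[:1]                  # '0' = text file, '1' = submenu, '7' = search
            fields = line[1:].split("\t")    # display string, selector, host, port
            items.append((kind, fields[0]))
        return items

    # for kind, title in gopher_menu("gopher.example.edu"):
    #     print(kind, title)
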
However, gopher clients were unable to access ftp sites. The negotiation (called a protocol) that took place between client and server was different enough that ftp servers couldn't understand gopher requests, and gopher clients didn't even allow users to ask for information from an ftp site. To go back to the butler analogy, it was as if a courier speaking English went to the back door while the Chinese-speaking butler was waiting at the front door. Consumers or providers who wanted to be sure of their ability to communicate with everyone would need software for both protocols - essentially having two couriers or two butlers, one speaking English and one speaking Chinese.

Meanwhile, at Europe's CERN physics laboratory, a very small team had been working for about a year on a hypertext system, dubbed the World-Wide Web, for sharing research information. Tim Berners-Lee and Robert Cailliau envisioned a distributed hypertext system for publishing documents over the Internet to allow for better communication and archiving of knowledge in their organization.

CERN has an interesting environment. Not only does it have heterogeneous hardware, software, and data formats, as is true in many companies, but many researchers are there for relatively short periods. This large population of transients leads to a form of organizational amnesia. Important information is lost every time a researcher returns to his or her "home" institution.

Berners-Lee and Cailliau felt that a system allowing easier publishing of information would help stem this continual disappearance of organizational memory. Because of the environment they were in, it was clear to them that they needed a system that would be extremely flexible, open, and extensible. The system they developed, the World-Wide Web, had all of the features of gopher in this regard, plus three more: documents could contain hypertext links pointing to other documents anywhere on the Internet, clients could retrieve information with other protocols (such as ftp and gopher) as well as the Web's own, and client and server could negotiate the format of the data to be delivered.

While format negotiation is practically unused today, the fact that the server reported the data's format back to the client meant that browsers could start up "helper applications". A browser did not have to be capable of displaying, say, chemical structure data itself, as long as it was capable of starting another program that could.

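A rough sketch of the idea, in modern Python with an invented helper table standing in for the configuration files early browsers used: the client fetches a document, looks at the format the server reported, renders it directly if it can, and otherwise hands the data to an external program.

    import subprocess
    import urllib.request

    NATIVE = {"text/plain", "text/html"}          # formats this toy browser renders itself

    HELPERS = {                                   # everything else goes to a helper program
        "chemical/x-pdb": "rasmol",               # e.g. an external chemical-structure viewer
        "image/gif": "xv",                        # e.g. an external image viewer
    }

    def display(url):
        with urllib.request.urlopen(url) as resp:
            ctype = resp.headers.get_content_type()   # the format reported by the server
            body = resp.read()
        if ctype in NATIVE:
            print(body.decode(errors="replace"))      # "render" it ourselves (crudely)
        elif ctype in HELPERS:
            path = "/tmp/helper-input"
            with open(path, "wb") as f:
                f.write(body)
            subprocess.run([HELPERS[ctype], path])    # start the helper application
        else:
            print("No way to display", ctype)
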
Because the specifications were an open standard, other people could and did develop WWW software. Two early browsers that came from outside CERN were Viola and Cello. Viola was developed by Pei Wei at Berkeley in January of 1993, and Cello was developed by Thomas Bruce at Cornell later in 1993. Both browsers used a page-oriented, multifont, point-and-click interface.

Meanwhile, two programmers at the University of Illinois' National Center for Supercomputing Applications (NCSA) developed a browser called Mosaic. Marc Andreessen and Eric Bina jointly developed this browser not only to have point-and-click capabilities, like Viola and Cello, but also to display graphics in-line rather than in a separate window. This feature, good timing, and the resources available at NCSA to develop, publish, and support binary, ready-to-use versions for X Window, Mac, and PC clients made Mosaic far more popular than any of its contemporaries.

Rob McCool, who was also with NCSA, developed a web server called httpd. Collaborating with the Mosaic team, McCool developed a mechanism that allowed the server to execute a program and send its output to the browser as if it were a static file. This Common Gateway Interface (CGI) allowed dynamic data to be transmitted over the Web. Things as mundane as the local time in various places around the world, and as exotic as an interactive map of the world at arbitrary scale, were placed onto the Web using such CGI programs. Kevin Hughes, who was a student at Honolulu Community College, wrote an important CGI program that allowed for "clickable" images: regions could be defined inside an inline image, and clicking in such a region would take the user to another URL. This program was incorporated into NCSA's httpd distribution.

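A CGI program could be almost trivially small. The sketch below, a modern Python stand-in for the "local time" sort of program described above (the script itself is invented, not taken from NCSA's distribution), shows the whole convention: the server runs the program, the program prints a header naming the format of what follows, then a blank line, then the data, and the server relays all of it to the browser as if it were a static file.

    #!/usr/bin/env python3
    # Minimal CGI program: the web server executes this script and sends
    # everything it prints back to the requesting browser.
    import time

    print("Content-Type: text/html")    # tell the browser what format follows
    print()                             # a blank line ends the header
    print("<html><body>")
    print("<h1>Local time on this server</h1>")
    print("<p>" + time.strftime("%Y-%m-%d %H:%M:%S") + "</p>")
    print("</body></html>")

Dropped into the server's cgi-bin directory, a script like this produces a different page on every request, which is exactly the sort of dynamic data described above.
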
By this point, the Web had all the components needed for a universal user interface for unknown-audience document publishing. Independence of physical location, hardware, operating system, software, and transfer protocol for both servers and clients, and format independence for the data meant that practically anybody could easily publish and/or view information on the Web.

Furthermore, the Web being an open and extensible system meant that obsolescence would not be an issue for some time, if ever. New protocols and new data formats might spring up, but they would augment, not eliminate, earlier versions. This, and the ready availability of free browsers, servers, and helper applications that were easy to install and use, led to the Web's remarkable growth around the world.


