Thursday 19 March 2009

Hypertext, Hypermedia and the Semantic Web: The Web Itself

Sir Tim Berners-Lee described computing in 1980 as a world of ‘incompatible networks, disk formats, data formats and character encoding schemes’. This was particularly frustrating ‘given that... to a greater extent computers were being used directly for most information handling, and so almost anything one might want to know was almost certainly recorded magnetically somewhere’, (Berners-Lee 1996).

The ‘Design Criteria’ of the World Wide Web, described in Sir Tim Berners-Lee’s 1996 paper, make very interesting reading:
  1. ‘An information system must be able to record random associations between any arbitrary objects, unlike most database systems’.
  2. ‘To make a link from one system to another should be an incremental effort, not requiring un-scalable operations such as the merging of databases’.
  3. ‘Any attempt to constrain users as a whole to the use of particular languages or operating systems was always doomed to failure’.
  4. ‘Information must be available on all platforms, including future ones’.
  5. ‘Any attempt to constrain the mental model users have of data into a given pattern was always doomed to failure’.
  6. ‘Entering or correcting [information] must be trivial for the person directly knowledgeable’.
The Web is formed around three common standards: the Address Space, Hyper-Text Transfer Protocol (HTTP) and Hyper-Text Mark-up Language (HTML), all originally designed by Sir Tim Berners-Lee.

The Web was designed around a principle of minimal constraint, so that it could be incrementally improved by future developers. Additionally, the Web’s standards needed to be modular and support information-hiding, so that anybody designing anything on top of those standards did not have to know how the standards actually worked, (Berners-Lee 1996).
‘A test of this ability was to replace them with older specifications, and demonstrate the ability to intermix those with the new. Thus, the old FTP protocol could be intermixed with the new HTTP protocol in the address space, and conventional text documents could be intermixed with the new hypertext documents’, (Berners-Lee 1996).
Also, as a further example, we can look at HTTP’s ability to carry images (JPG, PNG), VRML or even Java code.
‘Typically, hypertext systems were built around a database of links. This did not scale... However, it did guarantee that links would be consistent and links to documents would be removed when documents were removed. The removal of this feature was the principle compromise made in the [World Wide Web] architecture... allowing references to be made without consultation with the destination, allowed the scalability which the later growth of the web exploited’, (Berners-Lee 1996).
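The trade-off in that quotation can be illustrated with a small sketch. This is only a toy model, not anything from the paper: the `documents` dictionary and the function names are hypothetical. Links live inside the source document, so adding one never involves the destination, and removing a document leaves any links that point at it dangling.

```python
# Hypothetical in-memory "web": document name -> list of link targets.
# There is deliberately no central database of links.
documents = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": [],
}


def add_link(docs, source, target):
    """Add a link by editing only the source; the target is never consulted."""
    docs[source].append(target)


def remove_document(docs, name):
    """Remove a document without touching the links that point at it."""
    del docs[name]


def dangling_links(docs):
    """Return (source, target) pairs whose target does not exist."""
    return [
        (source, target)
        for source, targets in docs.items()
        for target in targets
        if target not in docs
    ]


# The target need not exist, and nobody has to agree to the link.
add_link(documents, "c.html", "elsewhere.html")
# Nothing cleans up the link from a.html to b.html.
remove_document(documents, "b.html")

print(dangling_links(documents))
```

A system built around a link database would instead have to update that shared database on every `add_link` and `remove_document`, which is the step that ‘did not scale’.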
File Transfer Protocol (FTP) existed when the web was first developed, but was ‘not optimal for the web, in that it was too slow and not sufficiently rich in features’, (Berners-Lee 1996). So the Hyper-Text Transfer Protocol (HTTP) was created.

Universal Resource Identifiers (URIs) are the primary element of Web architecture. ‘Any new space of any kind which has some kind of identifying, naming or addressing syntax can be mapped into a printable syntax and given a prefix’, (Berners-Lee 1996). ‘URIs are generally treated as opaque strings: client software is not allowed to look inside them and to draw conclusions about the object referenced’, (Berners-Lee 1996). ‘HTTP URIs are resolved... by splitting them into two halves. The first half is applied to the Domain Name Service to discover a suitable server, and the second half is an opaque string which is handed to that server’, (Berners-Lee 1996).
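The ‘two halves’ resolution described above can be sketched with Python’s standard `urllib.parse` module. This is a minimal illustration under stated assumptions, not how any particular browser does it: the function name `split_http_uri` and the example address are made up, and the DNS lookup itself is left out.

```python
from urllib.parse import urlsplit


def split_http_uri(uri):
    """Split an HTTP URI into (host, opaque remainder)."""
    parts = urlsplit(uri)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"not an HTTP URI: {uri!r}")
    # First half: the host name, which would be looked up via DNS.
    host = parts.hostname
    # Second half: handed to the server as-is; the client does not
    # try to interpret what it means.
    opaque = parts.path or "/"
    if parts.query:
        opaque += "?" + parts.query
    return host, opaque


host, opaque = split_http_uri("http://example.org/docs/design.html?rev=2")
print(host)
print(opaque)
```

The scheme check at the top also reflects the ‘prefix’ idea: the part before the colon says which address space the rest of the string belongs to, so an `ftp:` address would be handled by different rules.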

Hyper-Text Markup Language (HTML) was defined as the data format to be transmitted over HTTP. HTML was based around SGML in order to encourage its adoption by those already using SGML.

The initial prototype browser was written on NeXTStep in late 1990. It allowed HTML to be edited as well as browsed. Because few people used NeXT machines, its visibility was limited, so in 1991 a read-only ‘line mode’ browser was written. This enabled the early web to be viewed on a range of systems. As more people became involved, full browsers were written.

In 1993, rumours suggested that the Web’s competitor, ‘Gopher’, was to become a licensed product. As a result, many people and organisations moved their hypermedia systems over to the WWW instead.

The World Wide Web Consortium (W3C) was formed in 1994. The rest is history...
