Includes TCP/IP, FTP and HTTP due to assignment requirements:

[Embedded presentation: ‘Web Dev Research’, nathomas82]
‘One of a growing number of start-ups that have emerged to supply social computing technologies (especially wikis) to enterprises.’
‘”For a long time,” says Mayfield, “personal productivity tools and applications – the kind that Microsoft makes – have been centred on a single user who generates documents. You also have highly structured enterprise systems designed and implemented from the top down – in many ways as an instrument of control – with rigid work flow, business rules, and ontologies that users must fit themselves into. The problem is that users don’t like using those kinds of tools, and what they end up doing is trying to circumvent them. That’s why ninety percent of collaboration exists in emails.”’
‘Mayfield argues that traditional organizations have reached a point where e-mail itself is breaking. “You could argue that ten or twenty percent of e-mail is productive”.’
‘Mayfield thinks the solution is collaboration tools that adapt to the habits of workplace teams and social networks rather than the other way around.’

So, in a similar way to the genealogy software I envisage, each employee could be represented by an XML or Web page object. FOAF or similar networks can then be created and used as required to provide information about how the employees relate to each other, without prescribing a hierarchy of any kind.
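As a purely illustrative sketch of what this might look like, two employees could be described using FOAF’s RDF vocabulary, with foaf:knows recording an association between them without implying any hierarchy. The example.com URIs and names are invented for the example:

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:foaf="http://xmlns.com/foaf/0.1/">
      <!-- each employee is a first-class Web resource at its own (hypothetical) URI -->
      <foaf:Person rdf:about="http://example.com/staff/alice">
        <foaf:name>Alice</foaf:name>
        <!-- foaf:knows records an association, not a reporting line -->
        <foaf:knows rdf:resource="http://example.com/staff/bob"/>
      </foaf:Person>
      <foaf:Person rdf:about="http://example.com/staff/bob">
        <foaf:name>Bob</foaf:name>
      </foaf:Person>
    </rdf:RDF>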
‘Associating a URI with a resource means that anyone can link to it, refer to it, or retrieve a representation of it’, (Shadbolt et al 2006).
‘Much of the motivation for the Semantic Web comes from the value locked in relational databases. To release this value, database objects must be exported to the Web as first-class objects and therefore must be mapped into a system of URIs’, (Shadbolt et al 2006).

So, I envisage that any genealogical object (such as a person, family, source, repository, place, note or media item) must exist as an individual XML file on the Web that can then be linked to as desired or required.
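A minimal sketch of one such object file, using an invented vocabulary and hypothetical example.com URIs. The point is simply that the person lives at its own URI and references related objects by their URIs rather than embedding them:

    <!-- lives at http://example.com/genealogy/person/1234.xml (hypothetical URI) -->
    <person xmlns="http://example.com/ns/genealogy"
            id="http://example.com/genealogy/person/1234">
      <name>John Smith</name>
      <!-- related objects (places, sources) are referenced by URI, not embedded -->
      <birth date="1852-03-01" place="http://example.com/genealogy/place/42"/>
      <source ref="http://example.com/genealogy/source/7"/>
    </person>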
‘[Folksonomies] represent a structure that emerges organically when individuals manage their own information requirements. Folksonomies arise when a large number of people are interested in particular information and are encouraged to describe it – or tag it’, (Shadbolt et al 2006).
‘Rather than a centralized form of classification, users can assign keywords to documents or other information sources’, (Shadbolt et al 2006).

This links with my vision of future online genealogy: objects could be linked by tagging, with a description of each relationship specified in the tag.
‘But folksonomies serve very different purposes from ontologies. Ontologies are attempts to more carefully define parts of the data world and to allow mappings and interactions between data held in different formats. Ontologies refer by virtue of URIs; tags use words’, (Shadbolt et al 2006).

I don’t see this as an either/or situation; I think we need both. An ontology is needed to define a standard for the basic types of genealogical link (e.g. parent, spouse, sibling) and to ensure compatibility between systems. A folksonomy has the particular advantage that it can cover the inadequacies of the ontology.
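A rough sketch of how such an ontology might declare the basic link types in OWL. The namespace and property names are my own invention, not any existing genealogy standard:

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
             xmlns:owl="http://www.w3.org/2002/07/owl#"
             xml:base="http://example.com/ns/gen-ontology">
      <owl:Class rdf:ID="Person"/>
      <!-- the basic, standardised link types -->
      <owl:ObjectProperty rdf:ID="hasParent">
        <rdfs:domain rdf:resource="#Person"/>
        <rdfs:range rdf:resource="#Person"/>
      </owl:ObjectProperty>
      <!-- spouse and sibling links hold in both directions -->
      <owl:SymmetricProperty rdf:ID="hasSpouse"/>
      <owl:SymmetricProperty rdf:ID="hasSibling"/>
    </rdf:RDF>

Free-form tags could then sit alongside these standard properties wherever the ontology has no suitable term.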
‘When Berners-Lee developed the Web, he took the salient ideas of hypertext and SGML syntax and removed complexities such as backward hyperlinks. At the time, many criticized their absence from HTML because, without them, pages can simply vanish and links can break. But the need to control both the linking and linked pages is a burden to authoring, sharing, and copying’, (McCool 2006).

In my research (Thomas 2009), the idea of splitting one GEDCOM file into multiple XML files, one XML file for each object, has raised similar concerns. In my opinion, though, the benefits outlined in this blog outweigh this problem.
‘Early forms of HTML paid no regard to SGML document-type definitions (DTDs). Berners-Lee simply ignored these difficult to create and understand declarations of how markup tags are used’, (McCool 2006).

In a similar way, it does not really matter exactly how people define their data in XML, as long as ontologies exist so that we can associate the XML tags used by different systems.
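For instance, if two (hypothetical) vendors used different tag names for the same relationship, a small mapping ontology could declare them equivalent using OWL’s standard owl:equivalentProperty construct:

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:owl="http://www.w3.org/2002/07/owl#">
      <!-- two invented vendor vocabularies name the same concept differently;
           one statement makes them interchangeable for a reasoner -->
      <owl:ObjectProperty rdf:about="http://vendor-a.example.com/ns#mother">
        <owl:equivalentProperty
            rdf:resource="http://vendor-b.example.com/ns#hasMother"/>
      </owl:ObjectProperty>
    </rdf:RDF>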
‘[Folksonomies have] no notion of synonyms or disambiguation... For a Web community with simple, easy-to-use authoring tools that support synonyms, disambiguation, and categories, we can look to Wikipedia... Wikipedia calls synonyms redirect pages, and disambiguation is explicitly handled via special pages’, (McCool 2006).

Wikipedia differs from my vision in that there is no XML-based data behind the presentation layer. Where synonyms and disambiguation are not implied by the ontology, presentation-layer pages (HTML) could be used, just as in Wikipedia, to address the issues described in the above paragraph.
‘The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation. The first steps in weaving the Semantic Web into the structure of the existing Web are already under way. In the near future, these developments will usher in significant new functionality as machines become much better able to process and “understand” the data that they merely display at present’, (Berners-Lee et al 2001).

What are the design principles of the Semantic Web?
‘The essential property of the World Wide Web is its universality. The power of a hypertext link is that “anything can link to anything”... Like the Internet, the Semantic Web will be as decentralized as possible... Decentralization requires compromises: the Web had to throw away the ideal of total consistency of all its interconnections, ushering in the infamous message “Error 404: Not Found” but allowing unchecked exponential growth’, (Berners-Lee et al 2001).

We can trace this property back to the original Web’s design principles, particularly the Web’s ability to record random associations between objects.
‘The Web was designed to be a universal space of information, so when you make a bookmark or a hypertext link, you should be able to make that link to absolutely any piece of information that can be accessed using networks. The universality is essential to the Web: it loses its power if there are certain types of things to which you can’t link’, (Berners-Lee 1998).
‘The second part of the dream was... The computer re-enters the scene visibly as a software agent, doing anything it can to help us deal with the bulk of data, to take over the tedium of anything that can be reduced to a rational process, and to manage the scale of our human systems’, (Berners-Lee 1998).

How will the Semantic Web work?
‘For the semantic web to function, computers must have access to structured collections of information and sets of inference rules that they can use to conduct automated reasoning’, (Berners-Lee et al 2001).
‘Traditional knowledge-representation systems typically have been centralized, requiring everyone to share exactly the same definition of common concepts such as “parent” or “vehicle”. But central control is stifling, and increasing the size and scope of such a system rapidly becomes unmanageable’, (Berners-Lee et al 2001).
‘For example, a genealogy system, acting on a database of family trees, might include the rule “a wife of an uncle is an aunt”. Even if the data could be transferred from one system to another, the rules, existing in a completely different form, usually could not’, (Berners-Lee et al 2001).

The point Sir Tim Berners-Lee is making here is that data is usually stored on a range of different systems, and the ‘semantic’ rules that define objects exist in a variety of formats depending on the information system used to store the data. It then becomes impossible to associate semantic rule sets when moving data from one information system to another.
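As an aside, the quoted ‘wife of an uncle is an aunt’ rule can now be expressed portably: OWL 2’s property chains (a standard that postdates the article quoted) state exactly that hasUncle followed by hasWife implies hasAunt. A sketch, with invented property URIs:

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:owl="http://www.w3.org/2002/07/owl#">
      <!-- if X hasUncle Y and Y hasWife Z, a reasoner infers X hasAunt Z -->
      <owl:ObjectProperty rdf:about="http://example.com/ns/gen#hasAunt">
        <owl:propertyChainAxiom rdf:parseType="Collection">
          <owl:ObjectProperty rdf:about="http://example.com/ns/gen#hasUncle"/>
          <owl:ObjectProperty rdf:about="http://example.com/ns/gen#hasWife"/>
        </owl:propertyChainAxiom>
      </owl:ObjectProperty>
    </rdf:RDF>

Because the rule itself is just more RDF, it travels between systems in the same way the data does.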
‘Moreover, these systems usually carefully limit the questions that can be asked so that the computer can answer reliably or answer at all. The problem is reminiscent of [Kurt] Gödel’s [incompleteness] theorem from mathematics: any system that is complex enough to be useful also encompasses unanswerable questions... Semantic Web researchers, in contrast, accept that paradoxes and unanswerable questions are a price that must be paid to achieve versatility. We make the language for the rules as expressive as needed to allow the Web to reason as widely as desired’, (Berners-Lee et al 2001).
‘Early in the Web’s development, detractors pointed out that it could never be a well-organized library; without a central database and tree structure, one would never be sure of finding everything. They were right’, (Berners-Lee et al 2001).
‘The challenge of the Semantic Web, therefore, is to provide a language that expresses both data and rules for reasoning about the data and that allows rules from any existing knowledge-representation system to be exported onto the Web’, (Berners-Lee et al 2001).

The earlier example about a genealogy system is particularly close to my heart: GEDCOM is certainly a ‘traditional knowledge-representation system’. New genealogy formats are being created, each with its own way of defining data. We must therefore find a way of associating data...
‘XML’s power comes from the fact that it can be used regardless of the platform, language, or data store of the system using it to expose datasets’, (Evjen et al 2007).
‘XML is considered ideal for data representation purposes because it enables developers to structure XML documents as they see fit. For this reason, it is also a bit chaotic. Sending self-structured XML documents between dissimilar systems doesn’t make a lot of sense – it requires custom building of both the exposure and consumption models for each communication pair’, (Evjen et al 2007).

So really, everyone can create their own definition of how to represent data using XML. Again, the genealogy developers are doing exactly this, so how will we associate their data?
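To make the problem concrete, here are two hypothetical, self-structured records of the same person. Neither system could consume the other’s XML without custom mapping code:

    <!-- System A's record (invented structure) -->
    <individual birthYear="1852">
      <fullName>John Smith</fullName>
    </individual>

    <!-- System B's record of the same person (also invented) -->
    <person>
      <name given="John" surname="Smith"/>
      <born>1852</born>
    </person>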
‘Meaning is expressed by RDF, which encodes it in sets of triples, each triple being rather like the subject, verb and object of an elementary sentence. These triples can be written using XML tags. In RDF, a document makes assertions that particular things (people, Web pages or whatever) have properties (such as “is a sister of”, “is the author of”) with certain values (another person, another Web page). This structure turns out to be a natural way to describe the vast majority of the data processed by machines’, (Berners-Lee et al 2001).
‘Subject and object are each identified by a Universal Resource Identifier (URI), just as used in a link on a Web page. (URLs, Uniform Resource Locators, are the most common type of URI). The verbs are also identified by URIs, which enables anyone to define a new concept, a new verb, just by defining a URI for it somewhere on the Web’, (Berners-Lee et al 2001).

So, as a genealogist, one URI (Web page) can represent one person; another URI represents another person; and I can link them together using an RDF file at an intermediate location (a third URI) that defines their relationship.
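A sketch of that intermediate RDF file, using hypothetical URIs. The subject is one person’s URI, the verb is a relationship URI defined somewhere on the Web, and the object is the other person’s URI:

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:gen="http://example.com/ns/gen#">
      <!-- one triple: person/1234 -- hasParent --> person/5678 -->
      <rdf:Description rdf:about="http://example.com/genealogy/person/1234">
        <gen:hasParent rdf:resource="http://example.com/genealogy/person/5678"/>
      </rdf:Description>
    </rdf:RDF>

Note that this file need not live at either person’s URI; it can be published anywhere, by anyone, which is precisely the decentralisation the Semantic Web is after.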
‘Two databases may use different identifiers for what is in fact the same concept... A program that wants to compare or combine information across the two databases has to know that these two terms are being used to mean the same thing. Ideally, the program must have a way to discover such common meanings for whatever databases it encounters’, (Berners-Lee et al 2001).

Yes, as I said earlier, I may want to link with another genealogy held in a different system elsewhere, which uses its own XML and RDF structures.
‘A solution to this problem is provided by the third basic component of the Semantic Web, collections of information called ontologies... [In terms of the Semantic Web] an ontology is a document or file that formally defines the relations among terms. The most typical kind of ontology for the Web has a taxonomy and a set of inference rules’, (Berners-Lee et al 2001).
‘The taxonomy defines classes of objects and relations among them... Classes, subclasses and relations among entities are a very powerful tool for Web use. We can express a large number of relations among entities by assigning properties to classes and allowing subclasses to inherit such properties’, (Berners-Lee et al 2001).
‘Inference rules in ontologies supply further power... A program could then readily deduce, for instance, that a Cornell University address, being in Ithaca, must be in New York State, which is in the U.S., and therefore should be formatted to U.S. standards. The computer doesn’t truly “understand” any of this information, but it can now manipulate the terms much more effectively in ways that are useful and meaningful to the human user’, (Berners-Lee et al 2001).
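The geographic deduction quoted above can be sketched in RDF/OWL. Declaring a (hypothetical) locatedIn property transitive is all a reasoner needs in order to conclude that Ithaca, being in New York State, is also in the U.S.:

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:owl="http://www.w3.org/2002/07/owl#"
             xmlns:geo="http://example.com/ns/geo#">
      <!-- transitivity: locatedIn(A,B) and locatedIn(B,C) imply locatedIn(A,C) -->
      <owl:TransitiveProperty rdf:about="http://example.com/ns/geo#locatedIn"/>
      <rdf:Description rdf:about="http://example.com/ns/geo#Ithaca">
        <geo:locatedIn rdf:resource="http://example.com/ns/geo#NewYorkState"/>
      </rdf:Description>
      <rdf:Description rdf:about="http://example.com/ns/geo#NewYorkState">
        <geo:locatedIn rdf:resource="http://example.com/ns/geo#USA"/>
      </rdf:Description>
    </rdf:RDF>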
‘Ontologies can enhance the functioning of the Web in many ways. They can be used in a simple fashion to improve the accuracy of Web searches and the search program can look for only those pages that refer to a precise concept instead of all the ones using ambiguous keywords. More advanced applications will use ontologies to relate the information on a page to the associated knowledge structures and inference rules’, (Berners-Lee et al 2001).

Ontologies can be defined using the Web Ontology Language (OWL). Isn’t that neat?
‘Another vital feature will be digital signatures, which are encrypted blocks of data that computers and agents can use to verify that the attached information has been provided by a specific trusted source’, (Berners-Lee et al 2001).

Digital signatures can be used to sign the objects (XML files) or links (RDF files) to ensure their validity.
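For example, the W3C’s XML Signature standard could be used to sign a genealogical object file. The skeleton below uses the standard’s real element names and algorithm identifiers, but the referenced URI is hypothetical and the digest and signature values are placeholders:

    <Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
      <SignedInfo>
        <CanonicalizationMethod
            Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
        <SignatureMethod
            Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
        <!-- the Reference points at the object being vouched for -->
        <Reference URI="http://example.com/genealogy/person/1234.xml">
          <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
          <DigestValue>...</DigestValue>
        </Reference>
      </SignedInfo>
      <SignatureValue>...</SignatureValue>
    </Signature>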
‘Proxy caches... will be able to check that they are really acting in accordance with the publisher’s wishes when it comes to re-distributing material [e.g. distribution controls selected dependent on the publisher’s certificate]. A browser will be able to get an assurance, before imparting personal information in a Web form, on how that information will be used [a digitally signed Web service]. People will be able to endorse Web pages that they perceive to be of value [a digitally signed hyperlink]. Search engines will be able to take such endorsements into account and give results that are perceived to be of much higher quality’, (Berners-Lee 1998).
‘When we have this, we will be able to ask the computer not just for information, but why we should believe it. Imagine an ‘Oh, yeah?’ button on your browser’, (Berners-Lee 1998).
‘A test of this ability was to replace them with older specifications, and demonstrate the ability to intermix those with the new. Thus, the old FTP protocol could be intermixed with the new HTTP protocol in the address space, and conventional text documents could be intermixed with the new hypertext documents’, (Berners-Lee 1996).

Also, as a further example, we can look at HTTP’s ability to carry images (JPEG, PNG), 3D content (VRML) or even Java code.
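This versatility comes from HTTP’s Content-Type header, which labels whatever the response body happens to be; the client simply dispatches on it. An illustrative exchange (the URL and sizes are invented):

    GET /photos/family.png HTTP/1.1
    Host: example.com

    HTTP/1.1 200 OK
    Content-Type: image/png
    Content-Length: 51234

    ...binary PNG data...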
‘Typically, hypertext systems were built around a database of links. This did not scale... However, it did guarantee that links would be consistent and links to documents would be removed when documents were removed. The removal of this feature was the principal compromise made in the [World Wide Web] architecture... allowing references to be made without consultation with the destination, allowed the scalability which the later growth of the web exploited’, (Berners-Lee 1996).

File Transfer Protocol (FTP) existed when the Web was first developed, but was ‘not optimal for the web, in that it was too slow and not sufficiently rich in features’, (Berners-Lee 1996). So the Hyper-Text Transfer Protocol (HTTP) was created.