Developed by Mel Leggatt and Notis Toufexis. Taught by Mel Leggatt
In this session we’ll look at how the Web has evolved and what possibilities it currently offers. In particular, we’ll be examining how web pages work and the differences between Web 1.0 and Web 2.0. In this week’s practical we’ll look at how to begin to create basic web pages. By the end of the practical you’ll have your own live version of your first web page.
- W3C’s Tutorial pages
- Wikipedia’s entry on the dot-com bubble and how it burst
- Tim O’Reilly’s “What is Web 2.0?” and his subsequent Web Squared: Web 2.0 Five Years On, co-written with John Battelle
- Wikipedia’s entry on blogging
- WikiMedia MetaWiki and their list of the largest Wikis
- Delicious social bookmarking
- Stephen Fry’s Videojug commentary on Web 2.0
- Mike Wesch’s YouTube video Web 2.0 The Machine is Us/ing Us
- Everytrail.com as an example of ‘mashed’ data
- Yahoo and Google’s RSS feeders
- O’Reilly’s Head First Labs site for Head First HTML with CSS and XHTML (accompanies the recommended reading)
- Freeman, E. and Freeman, E. (2006) Head First HTML with CSS and XHTML, O’Reilly Media
In the beginning...
There was Web 1.0, though of course it wasn’t called that until Web 2.0 came along. As we’ve already discovered, in its earliest days the Web consisted of a few academics sharing information across a handful of servers. The information was textual and so the mechanisms to facilitate this sharing didn’t have to be particularly complex. The users of the Web were also genuine enthusiasts and so spending a little time learning how to use the technology didn’t seem a hardship.
The Anatomy of an HTML Document
Given the relative simplicity of the information that was going into these early web pages, ‘tagging’ different parts of the data to make them display differently in a web browser was simple and scalable. The mechanism by which the text was formatted came to be known as the Hyper Text Markup Language (HTML). Sets of tags were developed to allow authors to, for example, start and end paragraphs, define different levels of header in their documents and emphasise text in a number of different ways. The best way to begin to understand HTML is to take a look at some sample markup and see how a browser renders it.
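A minimal sketch of that kind of markup (the sample wording here is invented for illustration) might be:

```html
<h2>A Little Sample Text</h2>
<p>This is an ordinary paragraph of text, in which
<i>this phrase</i> will be rendered in italics.</p>
```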
As you can see, it’s all pretty intuitive once we tell you that the <p></p> tags signify the start and end of a paragraph, the <h2></h2> tags tell the browser to render the enclosed text as a level 2 header, and any text enclosed by the <i></i> tags will render in italics. It’s extremely important to remember that all tags must close. Those that act on a given piece of content surround it, as in the case of the italics above. Other tags, such as the br tag that’s used to insert a line break in a web page, self-close, as they don’t directly act on any piece of content themselves (note the space between the tag name and the /; it’s important).
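For instance, a self-closing line break might be written like this:

```html
<p>First line of an address<br />
Second line of an address</p>
```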
Clearly, simply tagging the text in a document isn’t enough to tell the server, workstation and its browser that this is a web page, so each HTML document also has the following basic tags to tell these systems what sort of document it is and what they have to do with it.
Each document begins and ends with <html></html> tags and within them it contains two sections. The header section contains various descriptive tags that say something about the document itself, the body contains the content.
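In outline, then, every HTML document follows a skeleton like this:

```html
<html>
  <head>
    <!-- descriptive information about the document goes here -->
  </head>
  <body>
    <!-- the content itself goes here -->
  </body>
</html>
```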
Here’s a sample header that contains basic descriptive information about the document and its content including, for example, the character set and some keywords and descriptive text.
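As a sketch (the title, keywords and description here are invented):

```html
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <meta name="keywords" content="sample, first web page, HTML" />
  <meta name="description" content="A first, very basic web page." />
  <title>My First Web Page</title>
</head>
```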
Mixing all these elements creates a first basic web page.
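Combining the header above with the earlier sample content gives something like:

```html
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta name="keywords" content="sample, first web page, HTML" />
    <meta name="description" content="A first, very basic web page." />
    <title>My First Web Page</title>
  </head>
  <body>
    <h2>A Little Sample Text</h2>
    <p>This is an ordinary paragraph of text, in which
    <i>this phrase</i> will be rendered in italics.</p>
  </body>
</html>
```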
Which renders in the browser with the default styling: the heading large and bold, with the paragraph as plain text beneath it.
So, you’ve got a basic HTML file that will display some unformatted text. It’s great progress, but not very exciting. In particular, web pages that don’t link to anywhere else are pretty boring. Linking to other parts of the same web page, other documents on your site or other parts of the Web is easy. The tag you use to achieve this is the <a> or anchor tag.
As with other tags any content enclosed within the anchor tag is selected for a particular action, in this case it’ll become the piece of text (or indeed graphic) that becomes the link you click on.
Here’s how to link to another part of the same document. This is useful if, for example, you’ve got various sections in your homepage, including qualifications and publications. Creating a named anchor at the top of your page would allow you to put links to it from the different sections, making it easy for users to navigate around the page. The hash at the start of the href value tells the browser that this is a link to another part of the same document.
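As a sketch, assuming an anchor named ‘top’ at the head of the page:

```html
<!-- the named anchor at the top of the page -->
<a name="top"></a>

<!-- further down the page, a link back to it -->
<a href="#top">Back to top</a>
```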
Linking to other pages on your own website is, in fact, even simpler as there’s no need to define the <a name>. You only need to tell the browser which page to go to and this is achieved by the following code. If the page is in a different directory you can specify that in the tag too.
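For example (the file and directory names here are invented):

```html
<!-- a page in the same directory -->
<a href="publications.html">My publications</a>

<!-- a page in a subdirectory -->
<a href="papers/mypaper.html">A recent paper</a>
```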
External or Explicit Links
Finally, the process of linking to a page on another web site is almost identical, the only difference being that you have to be explicit and specify the entire web address, or uniform resource locator (URL).
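For instance, using the reserved example.com domain as a stand-in:

```html
<a href="http://www.example.com/index.html">A page on another site</a>
```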
So now the code for our entire webpage looks like this.
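As a sketch (the sample text, file names and addresses are all invented):

```html
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta name="keywords" content="sample, first web page, HTML" />
    <meta name="description" content="A first, very basic web page." />
    <title>My First Web Page</title>
  </head>
  <body>
    <a name="top"></a>
    <h2>A Little Sample Text</h2>
    <p>This is an ordinary paragraph of text, in which
    <i>this phrase</i> will be rendered in italics.</p>
    <p><a href="#top">Back to top</a><br />
    <a href="publications.html">My publications</a><br />
    <a href="http://www.example.com/index.html">A page on another site</a></p>
  </body>
</html>
```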
And in the browser the links appear in the familiar underlined, coloured style, ready to click.
A note about Mailto: Links
Perhaps slightly confusingly, the anchor tag is also the element that allows you to put a live link to your email address in your web page so that users don’t have to cut and paste your address into their mail application. Note the code that replaces the @ symbol to prevent your email address being harvested for spam. More on working with special characters in web sites later on.
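As a sketch, with an invented address, using the &#64; character entity in place of the @ symbol:

```html
<a href="mailto:j.smith&#64;example.ac.uk">Email me</a>
```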
So the content is slowly getting richer, but the layout isn’t particularly attractive. The next crucial set of tags you need to know about relate to lists. There are two sorts of list, ordered (<ol></ol>) and unordered (<ul></ul>). In both cases the individual list items are defined by the <li></li> tag. It’s worth noting that lists can also be nested.
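As a sketch (the list items are invented), showing both sorts of list plus a nested one:

```html
<!-- an unordered (bulleted) list -->
<ul>
  <li>Apples</li>
  <li>Pears</li>
</ul>

<!-- an ordered (numbered) list with a nested list inside one item -->
<ol>
  <li>Introduction</li>
  <li>Qualifications
    <ul>
      <li>BA</li>
      <li>MPhil</li>
    </ul>
  </li>
</ol>
```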
Including the lists, the web page now renders the new items as bulleted and numbered entries.
The next important structural element is the table. Tables offer the opportunity to arrange information laterally and to be more imaginative in your layout throughout your web page. Much of the functionality of tables has been replaced by the DIV tag, but for now it’s worth mentioning the role of tables in, for want of a better term, Web 1.0 layout. The key elements here are the table, table row and table cell, or data, tags namely <table>, <tr> and <td>.
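As a sketch of such a table, a single row with two adjacent cells, one holding a list and the other an image (gorilla.jpg is the image named in the example that follows; the list items are invented):

```html
<table>
  <tr>
    <td>
      <ul>
        <li>Apples</li>
        <li>Pears</li>
      </ul>
    </td>
    <td>
      <img src="gorilla.jpg" alt="A gorilla" />
    </td>
  </tr>
</table>
```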
Adding this code to our web page and enclosing the lists and an image in a single-row table with two adjacent cells means they now sit side by side on the page.
And you can put anything in an individual table cell, so now we’ve managed to incorporate an image into our web site. For images, the appropriate piece of code is the imaginatively named <img> tag, and you should be able to spot the line in the code above that names the image source as gorilla.jpg.
Many of the earliest web pages used this basic form of HTML, with content, style and structure intermingled. As users began to experiment with enriching the content of their web pages, the limitations of the model became apparent: grinding repetition became the bane of anyone developing more than a very basic web page. Something was going to have to be done.
Cascading Style Sheets
The solution lay in the development of Cascading Style Sheets, which allowed web page authors to separate style from content and structure. Note: in the same way that saving files with a .html extension identifies them to servers and workstations as web pages, so it is with Cascading Style Sheets, or .css files.
So how do you tell your web page to use a .css file? Back to the header...
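You add a link element to the head section of each page. As a sketch, showing both a relative (local) and an absolute (remote) reference (camstyle.css is the stylesheet mentioned below; the remote address is invented):

```html
<!-- a relative link to a stylesheet stored alongside the page -->
<link rel="stylesheet" type="text/css" href="camstyle.css" />

<!-- an absolute link to a stylesheet stored remotely -->
<link rel="stylesheet" type="text/css" href="http://www.example.com/styles/camstyle.css" />
```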
Below is an example of how the headers in an HTML file are described in a Cascading Style Sheet (CSS).
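As a sketch (the colour value is an invented example):

```css
h1 {
  font-weight: bold;   /* bold */
  font-size: 120%;     /* 120% of the base font size */
  color: #990000;      /* a colour expressed in terms the browser understands */
}
```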
Here you can see that any text enclosed by the h1 tag will be bold, will appear at 120% of the base font size and will be a given colour, expressed in terms that a web browser understands. Change the way you want all your h1 headers to look in this file, and this will affect every single instance of an h1 tag in every web page you’ve told to use this .css file.
Merely adding the reference to the stylesheet in the head section of the web page produces a quite different look, in this case using the old camstyle.css. CSS files can be stored locally or remotely; you just need to make sure you use the appropriate form of link (relative or absolute) when you reference them.
Working with these basic elements will allow you to produce attractive, basic web pages – enough to get you through Project One. Further guidance on HTML is available through the World Wide Web Consortium’s tutorials pages.
Today, adhering to the standards is much simpler than it used to be. Whereas at one time we encouraged our students to write their own HTML code by hand, today commercial applications like Adobe Dreamweaver, which is available on the PWF, and their non-commercial counterparts generate ‘clean’ code by default and shouldn’t let you stray from the standards. For information on the non-commercial packages we recommend, see the course Links page on the sidebar. You can check how clean your code is by using the W3C’s online validator, and of course we strongly recommend that you do.
That’s a whistle-stop tour of the technical underpinnings of the markup language used to describe basic webpages. There are many, many tags and elements in addition to the few we’ve looked at so far and you’ll encounter more of them as the course progresses.
Having given you a feeling for the technology that got us through the early years of the Web, for the rest of this session we’re going to take a look at the differences between Web 1.0 and Web 2.0 and why you need to understand them.
Web 1.0 – Web 2.0
So, what was Web 1.0 all about, and why is Web 2.0 so different? When Tim Berners-Lee first developed the Web, nobody ever imagined it was going to take off in the way it did. It scaled as far as it could, from supporting a small group of academics sharing information with each other over the network to allowing more users to create web pages with commercial GUI applications. It was never intended to provide the functionality to serve the needs of millions of people who wanted to publish or do commerce online without having to engage with the technology in a significant way.
Even with relatively user-friendly applications, a certain degree of enthusiasm was still necessary: you had either to invest time and effort in learning to use an Open Source application or to spend significant amounts of money on a commercial GUI one. Whichever product you used, you would still end up with a web site with static content, capable of supporting media such as audio and video only with, at least initially, some fairly specialist effort and resources. This model left the majority of Web users as consumers rather than active participants, and the production of content in the hands of a (relative) minority.
With the benefit of hindsight, it’s fairly safe to speculate that Web 2.0 was born out of the dot-com crash of 2001. At the time many writers suggested that the Web might have had its day, as it had been revealed to be an over-hyped and, as far as business was concerned, over-valued flash in the pan. As with so many technological revolutions, the crash merely served to flush out the pretenders and leave the field clear for the success stories to dust themselves off, feel a certain sense of relief at having survived, and move on.
Of course, Web 1.0 wasn’t referred to as such until Web 2.0 was thought of. The term Web 2.0 came out of a brainstorming session at an O’Reilly Media conference in 2004 (yes, that’s the same O’Reilly Media who publish the books of which we’re so fond). The conclusion was that the Web had undergone a revolution, one that rested not so much on any dramatic changes in the underlying technologies as on how software developers empowered end users to engage with them.
The following comparisons are taken from Tim O’Reilly’s article on Web 2.0 and clarify how he and his colleagues saw the difference between Web 1.0 and Web 2.0: DoubleClick gave way to Google AdSense, Ofoto to Flickr, Britannica Online to Wikipedia, personal websites to blogging, and directories (‘taxonomy’) to tagging (‘folksonomy’).
With Web 2.0 we’ve moved away from a traditional model of publishers making content available to consumers, to a much more dynamic, participatory model in which the majority of Web users have the opportunity to update their own media-rich web sites as often as they like. Whereas Web 1.0 was characterised by users downloading material, Web 2.0 is defined by a shift in emphasis towards information flowing both ways. The possibilities for publishing, sharing and collaborating are endless, as are those for manipulating and merging data. At present the costs of providing all these services aren’t being passed on to the consumer directly, and Web users appear to have almost limitless access to any number of online resources. We’ll go on to discuss some of the key technologies underpinning Web 2.0 and the possibilities and problems they present.
Personal pages to blogs
Weblogs are the direct evolutionary descendants of static personal web pages. In the very early days of the Web, in order to have your own web site you’d probably have had to set up your own domain and pay an Internet Service Provider (ISP) to host it, so you’d have to be quite an enthusiast. Later on, as companies such as Yahoo! became more established, they were able to offer additional services that included website hosting and storage space. Even then, though, the free options were quite limited and you were encouraged to pay a monthly fee for a more practical upgrade.
Most people would have an initial flurry of activity in setting up their web site, following which they might choose to record significant events like holidays, but apart from that many individuals’ web sites were pretty static. Some enthusiasts might update their pages more frequently and provide their visitors with clear information on what sections had most recently changed, but this had to happen manually, and so the majority of people weren’t particularly rigorous, or just didn’t bother at all.
Slowly though, the concept of the online diary began to emerge, and by the mid-1990s blogs as we know them today were starting to appear, though they were still the domain of the enthusiast. By the early 2000s more and more diaries were being published electronically, and resources emerged for users who were more interested in simply keeping an online diary than in the technology itself. The rest, as they say, is history, as sites such as blogger.com and wordpress.com began to host millions of blogs. Today Technorati claims to track over 112 million blogs. As we’ve already mentioned, some blogs take the form of personal diaries online whilst others represent the main portal for the work of independent journalists – all of which, of course, provide the capacity for visitors to post comments and observations. Visit Wikipedia’s entry on blogging for further information on the different forms of blog and their evolutionary timeline.
Wikis
Wikis, of which perhaps Wikipedia itself is the most famous example, provide truly collaborative online environments. Currently, Wikipedia has more than 75,000 active contributors. The first wiki, WikiWikiWeb, was developed by Ward Cunningham in 1994, and throughout the 2000s web users’ engagement with wikis has soared.
Wiki critics claim that, as anyone can edit them, their factual reliability is questionable; supporters counter this claim by reasoning that the user community will spot and correct any misleading or malicious content. Certainly, Wikipedia seems to overcome most of these issues, though other wikis may be less successful, and wiki vandalism is indeed a known phenomenon.
As with blogs, the beauty of wikis is that, for better or worse, ordinary web users can update and modify their content easily without the need for anything other than a standard web browser. Most wikis take a soft approach to security, meaning that the emphasis is placed on making it easier to undo mistakes or malicious content rather than trying to prevent it through draconian security measures. Of course, this approach only works if the wiki is regularly maintained. The WikiMedia MetaWiki maintains an online listing of the largest wikis.
Social bookmarking and networking
Social bookmarking evolved from your personal bookmark listing in your favourite browser on your own computer. Pretty soon, Web enthusiasts realised that storing bookmarks locally is limiting because, of course, once you’re away from your machine, you’re also away from your favourites. As a result, mechanisms to facilitate the online storage of bookmarks developed to make them globally accessible.
Web 2.0 has taken the whole concept a stage further and sites like del.icio.us now allow you to share your bookmarks online with an entire community. Social bookmarking can provide a useful alternative to search engines when you’re trying to find information online and would like to limit yourself to web sites evaluated by real people; the downside is that with no controlled vocabulary to ensure that similar sites are described in similar ways the whole process is still somewhat hit and miss.
Social networking sites such as Facebook, MySpace and Bebo provide you with the opportunity to share much more about yourself than just your favourite web sites. Social networks have amalgamated the services provided by a number of earlier Web technologies, including, for example, posting information and pictures of yourself on a static web page and spending time online with friends, or with people with similar interests, in Usenet discussion groups.
At present it’s difficult to predict how social networking sites will evolve. How might they contribute to the whole mashup process (see below)? Even in their very short lifetime (short even by Web standards) they’ve demonstrated an incredibly transient user-base, as one sibling’s MySpace is their younger brother or sister’s Facebook. They could prove to be the next great online innovation, or it could be that individually they prove too ephemeral to be of interest to financiers and the phenomenon runs out of steam for lack of support.
YouTube, Flickr and all that
Services such as Flickr and YouTube provide previously unimagined opportunities to, as YouTube puts it, ‘broadcast yourself’. In many ways they represent the most socially fascinating Web 2.0 phenomenon. From the days of the Roman Empire, and in principle even earlier, the concept of bread and circuses has, to some extent, held sway: the authorities provide accessible, cheap entertainment to keep the majority of the population content, and for the most part this works. This is clearly a one-way process, with editorial control always lying with, for example, the programme makers and, in many regimes, the government itself.
These sites, with their Web 2.0 model of two-way data exchange, turn all this on its head. The masses (for want of a better term) are entertaining themselves – and on a global scale. The content might be highly variable but, as the actor Stephen Fry puts it in his online commentary on Web 2.0, for every mad person with a banner there will be some excellent street theatre. Editorial control, again for better or worse, is in chaos, and traditional media companies are finding themselves struggling to maintain their market share of the leisure industry. Big business might own the infrastructure, but to what extent can it influence or control the content when, if the ‘service’ slips, there’s always another YouTube on the virtual horizon? Maybe traditional media companies will regain control, maybe governments will try to exert some form of editorial control beyond suppressing illegal content, or perhaps the genie is out of the bottle and can’t be put back. Nobody can honestly say.
Mashups
Mashups are a recent hybrid Web application phenomenon whereby the content of two or more sites is merged. At the time of writing this often means some combination of map and photographic data: for example, on everytrail.com you can download trip information, get GPS coordinates and view images from walks and trails around the world, or indeed (because this is Web 2.0) upload your own trip to share, which others can then edit and enhance with their own images and information. Mashups provide us with the opportunity to create new and exciting web sites merging previously separate content, but they may also make it difficult for us to control what happens to the data we put online.
Keeping track of it all: Really Simple Syndication (RSS)
With the kind of information overload that Web 2.0 promises, powerful tools to help you manage how you access your favourite online content are increasingly important. RSS feeds allow you to keep track of frequently changing content, such as blog entries and news headlines, so you can automatically see updates to your top-rated web sites without having to check them manually. There are lots of RSS feeders out there: visit www.versiontracker.com and search for RSS on your favourite platform to get an idea of the options available to you. Which you prefer is pretty much a matter of personal taste. Some of the biggest names online are also providing their own services; for example, Google now provide their own Reader, as do Yahoo.
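Under the hood, an RSS feed is simply an XML file listing a site’s most recent items; a minimal sketch (with invented titles and addresses) looks like this:

```xml
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
  <channel>
    <title>My Weblog</title>
    <link>http://www.example.com/blog/</link>
    <description>Recent posts from my weblog</description>
    <item>
      <title>A new post</title>
      <link>http://www.example.com/blog/a-new-post.html</link>
    </item>
  </channel>
</rss>
```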
The implications of Web 2.0
Nobody can truly claim to be able to predict where the application of Web 2.0 technologies will take us. On the upside, they provide opportunities for sharing, trading and collaborating that previous generations could probably not even have dreamed of. On the downside, serious concerns are emerging regarding privacy and security. For instance, how might you control how the data (your data) that you’ve put on your social networking site is presented or used once it’s been mashed into any number of hybrid sites? At the moment the simple answer is that you can’t. Undoubtedly, this new version of the Web offers many undreamed-of opportunities and problems, and its future is far from predictable.
In the practical work in the workshop we’ll look at one of the friendly GUI web editors on the PWF – one where you can also view the raw HTML code. You’ll be provided with templates for both your first HTML and Cascading Style Sheet files – you won’t have to write everything from scratch! You’re encouraged to experiment with hacking these templates to create your own customised design. Additional online resources will be provided on, for example, working with accented and special characters and colours so that you can refer to them over the vacation. We’ll also show you how to make these web pages live on the PWF, so you’ll have the first version of your web site online for all the world to see by the end of the session – and with the work you’ve already done on the course, some content with which to populate them.
Why do we make you learn to understand code by hand for Project One? Learning something of the underpinning technology will provide you with the valuable and necessary skills to customise, hack and, perhaps most crucially, fix your own, or a company’s, web site.
This week there is also a piece of homework. The (not very beautiful) web page at http://www.mml.cam.ac.uk/call/chucol/broken.html has 10 deliberate mistakes in it. Using the principles of XHTML tagging that we’ve discussed in the lecture, see how many you can spot. Whichever web browser you’re using will give you the option to view the source code of the page under one of its menus.