AN HTML TUTORIAL
----------------
Rahul Simha


INTRODUCTION
------------

* History:
  File transfers were possible with the earliest versions of internet
  software; the program that handles file transfers, ftp, still
  survives more or less intact. As internet users started to place
  files for public access and advertise their wares on bulletin boards,
  some people started distributing "lists" of hot ftp material. Along
  came archie, a program to locate ftp sites using a keyword search.
  This program worked by collecting data at archie servers (located at
  some sites) and allowing archie clients to connect to the servers to
  search. The next development was gopher (created at the University of
  Minnesota), a utility which combined several tools (a file viewer,
  ftp and telnet) in a single easy-to-use menu-driven interface.

  At the same time, the publishing industry had been experimenting with
  so-called hypertext documents (electronic documents with nonlinear
  organization of data) on single machines. A standard called SGML
  (Standard Generalized Markup Language) was developed for marking up
  documents in plain ascii text (similar to LaTeX, troff etc). Ideally,
  SGML would be integrated with TCP/IP to provide links across the
  network. But SGML is large and complex. Thus came HTML (HyperText
  Markup Language), a much simpler markup language developed at CERN in
  Switzerland for documents that link to one another across a TCP/IP
  network.

  The whole idea in using HTML is to display more than plain text, that
  is, formatted text and images. For this, a "browser" is needed, most
  often one written for a windowing system, such as xmosaic for
  X-windows.


HTML BASICS
-----------

* Basic structure of an HTML document:
  HTML documents are written in ascii text, with commands specified by
  particular sequences of characters. Commands in HTML usually consist
  of 3 components: a start tag, a middle, and a stop tag. For example,
  to specify the title of a document, such as "Red Riding Hood", you
  would use the title command: "<title> Red Riding Hood </title>".
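  The body of a sample document might contain a line such as "She set
  up the soon-to-be hottest internet site:". Other basic commands show
  how to end a paragraph (using "<p>" all by itself), how to set off a
  phrase in italics (using "<i> whatever phrase </i>"), and how to
  create horizontal graphic lines (using "<hr>").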
  Command          Description
  -----------------------------------------------------------
  <code>           Code - similar to teletype
  <kbd>            Keyboard - similar to teletype
  <samp>           Sample - similar to teletype
  <dfn>            Definition - for definitions
  <p>              End a paragraph, start a new one
  <br>             Line break - start a new line
  <pre>            Preformatted text - format exactly as entered
                   in ascii
  <blockquote>     To set apart a quote

* Itemized lists in HTML
  To create an itemized list, consider the following example, titled
  "Red Riding Hood's shopping list":
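
  The HTML source for the shopping list was lost from this copy of the
  tutorial. One way to write it is sketched below. It assumes that the
  title is a second-level heading and that both the outer list and the
  nested lists are unordered; the original may have used an ordered
  list at either level.

```html
<h2>Red Riding Hood's shopping list</h2>
<ul>
  <li>Picnic basket</li>
  <li>Iced tea</li>
  <li>Red items
    <!-- A list placed inside an item becomes a nested list. -->
    <ul>
      <li>Red delicious apples</li>
      <li>Red sneakers</li>
      <li>Red jacket with hood</li>
    </ul>
  </li>
  <li>Safety items
    <ul>
      <li>Magnesium flare</li>
      <li>Cellular phone</li>
      <li>Uzi</li>
    </ul>
  </li>
</ul>
```

  Changing an "<ul>" and its matching "</ul>" to "<ol>" and "</ol>"
  would number that level instead of bulleting it.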
  Displayed by a browser, the list looks roughly like this:

    - Picnic basket
    - Iced tea
    - Red items
        - Red delicious apples
        - Red sneakers
        - Red jacket with hood
    - Safety items
        - Magnesium flare
        - Cellular phone
        - Uzi

  Observe that an unordered list is defined by
  "<ul> list of things </ul>" and an ordered list by
  "<ol> list of ordered items </ol>". Each item is specified by a
  "<li>". Unordered lists are bulleted and ordered lists are numbered.
  There are also definition lists, written with "<dl>", "<dt>" (for a
  term) and "<dd>" (for its definition), which display like this:

    HTML
        A language spoken by nerds
    Java
        A language spoken by major nerds

  You get the idea.

* Special characters:
  To indicate special characters, an ampersand command is used. For
  example, "<" is the less-than symbol that starts an HTML command tag.
  Typing "&lt;" (without the quotes) will result in the less-than
  symbol being displayed. Here are some useful character codes (others
  are available, for example, umlauts and accents for European
  languages):

  Character code     Description
  -----------------------------------------------------------
  &lt;               < - the less than symbol
  &gt;               > - the greater than symbol
  &amp;              & - the ampersand symbol


ADDING LINKS TO DOCUMENTS
-------------------------

* A simple example:
  Suppose we have created the following html file (called, say,
  "red.html") in the current directory, along with the file "home.html"
  and a subdirectory "early" containing the files "birth.html" and
  "preschool.html".
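
  The source of red.html is missing from this copy, so what follows is
  a hedged reconstruction built from the file names given above and the
  text the page displays. Which lines are headings and which are links
  is an assumption; only "home.html", "early/birth.html" and
  "early/preschool.html" are named in the text.

```html
<html>
<head>
<title>Red Riding Hood</title>
</head>
<body>
<h1>Red Riding Hood</h1>

<h2>Red's Early Years</h2>
<!-- Relative addresses: files in the subdirectory "early" -->
<a href="early/birth.html">Birth</a>
<a href="early/preschool.html">Pre-School</a>

<h2>Red Goes to High-School</h2>

<!-- A file in the same directory as red.html -->
<a href="home.html">Red Sets up a Homepage</a>
</body>
</html>
```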
  A browser displays red.html roughly as follows:

    Red Riding Hood
    Red's Early Years
    Birth
    Pre-School
    Red Goes to High-School
    Red Sets up a Homepage

  Notice the new command used in red.html - the anchor command - with
  the general form
  "<a href="address"> some text that will be highlighted </a>".
  When a browser displays this document, the text between the anchor
  tags will be underlined or highlighted. Mouse clicks on this portion
  will result in following the address to a new HTML document. The
  start tag provides the information needed to find the new html
  document (which has its own header, body etc).

* What is a URL?
  A URL (Uniform Resource Locator) is a document name that contains
  complete access information: the protocol used to fetch the document,
  where it is (internet address), the path name (sequence of
  directories) and other information. For example, consider this URL:

    http://www.cs.wm.edu:80/tales/fairy/modern/masterlist.html

  It specifies the following:

    http                - the protocol used to fetch the document
                          (HTTP, as spoken by web servers)
    www.cs.wm.edu       - the internet address or system name
    80                  - the port at which the httpd daemon is
                          listening (most often the port is 80, the
                          default port, and is left out of the URL)
    tales/fairy/modern  - a path name leading to a file
    masterlist.html     - a file name

  It is also possible to specify a query in a URL (for database
  searches). This is explained later.

* Two types of anchors:
  There are reference anchors (the anchor in the example above) and
  named anchors (see below). Named anchors allow you to mark a place in
  the text to point to. For example, suppose the file "stories.html"
  contains a number of stories, whereas "masterlist.html" contains a
  list of story titles such as "Red Riding Hood". Each title has a link
  to the appropriate story in stories.html; that is, each title in
  masterlist.html is enclosed in a reference anchor whose href is
  "stories.html".

  Let's assume that stories.html contains the tales (each with
  hyperlinks to other files). Now, by clicking on any story title in
  masterlist.html, the browser will take you to the top of the
  stories.html file. You then have to scroll down to the story you
  want.
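  To avoid this problem, we simply mark the beginning of each story in
  the file stories.html and use the mark in the href specification. For
  example, in stories.html, let us mark the title "Red Riding Hood" by
  enclosing it in a named anchor, that is, an anchor that carries a
  "name" where a reference anchor carries an "href".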
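
  The two halves of a named anchor might look like the sketch below.
  The mark name "red" is hypothetical; the name used in the original
  example did not survive in this copy. Newer HTML prefers an "id"
  attribute on any element over "name" on an anchor, though browsers
  are expected to accept both.

```html
<!-- In stories.html: mark the place where the story begins. -->
<a name="red">Red Riding Hood</a>

<!-- In masterlist.html: point at the mark, using the hash symbol. -->
<a href="stories.html#red">Red Riding Hood</a>

<!-- Within stories.html itself, the file name can be left out. -->
<a href="#red">Jump to Red Riding Hood</a>
```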
  Now, in the appropriate href part in masterlist.html, we specify this
  mark by appending a hash symbol and the name of the mark to the file
  name; the link still displays as "Red Riding Hood". Observe the hash
  symbol being used to specify a named anchor. You can use named
  anchors for rapid movement within a single HTML document.

* Relative addressing:
  Suppose the address of the current document is

    http://www.cs.wm.edu/tales/fairy/modern/masterlist.html

  Then, we have seen that links in the file masterlist.html are created
  by giving an address in the href part of an anchor. We can either
  provide a full address or a partial or relative address. Above, we
  saw an example of a relative address: the "Red Riding Hood" link gave
  only the file name "stories.html" (plus the mark), which the browser
  resolves against the directory of the current document. We could have
  also given the complete address, which begins with

    http://www.cs.wm.edu/tales/fairy/modern/stories.html


IMAGES AND OTHER THINGS
-----------------------

* What is MIME?
  MIME (Multipurpose Internet Mail Extensions) is a standard for
  labelling many well-known file formats. The idea is that the browser
  need not handle every format itself and can instead call a "plug-in",
  a program that knows what to do with the data. Thus, for "postscript"
  files, a postscript viewer is called by the browser. You can, by
  setting options in the browser, decide which application programs
  (plug-ins) handle which file extensions. Here are some common
  extensions (some of which, like .gif, are directly handled by the
  browser).

    gif  - .gif files are graphics or bitmap files in the GIF
           (Graphics Interchange Format) format
    jpeg - a bitmap format for still images
    mpeg - format for motion pictures (not yet supported)
    ps   - postscript
    pdf  - the format used by Adobe Acrobat documents
    au   - Unix audio files

* How to display in-lined images:
  In-lined images are images displayed within the HTML document (as
  opposed to spawning a viewer). Consider the following example, which
  displays an image in the file mypicture.gif alongside the text "The
  Next President of the United Brewpub Tasters of America".
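
  A minimal sketch of such an in-lined image follows. The file name
  and the alternate string "my mugshot" come from the tutorial; the
  alignment value "bottom" is an assumption, since the original value
  is not preserved, and "align" on images is an old-style attribute
  that newer HTML replaces with style sheets.

```html
<!-- src: the image file; alt: text shown when images are unavailable -->
<img src="mypicture.gif" align="bottom" alt="my mugshot">
The Next President of the United Brewpub Tasters of America
```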
  With the "<img>" command, we specify the source file
  ("mypicture.gif"), an alignment for the first following line of text,
  and an alternate ascii string ("my mugshot") for browsers that don't
  support images.

* Adding links to images (simulating buttons):
  This is easy: simply enclose the entire image command inside an
  anchor command. For example:
  A browser displays the picture together with the lines

    The Next President of the United Brewpub Tasters of America
    Click on my picture to get my biodata

  To create cool HTML documents, hunt around for the standard images
  such as a hand (for backwards).


HOMEPAGES:
----------

  You now want to know where to place files that others can view: your
  homepages.

* In Unix, you need to create a subdirectory off of your main directory
  and call it "public_html".

* In the subdirectory public_html, create a file called "index.html".
  When someone accesses your homepage by just giving your username, it
  is this file that is brought up. Thus, the URL

    http://www.cs.wm.edu/~simha

  is really the file public_html/index.html, which the local http
  daemon knows to get. (You don't have to understand this last point).
  Make sure that you grant public access to this directory and to the
  files you place in the directory.

* You can now place all other files you want others to access in the
  directory public_html. For example, if I create my CV in an html file
  called cv.html, put the file in the directory public_html, and refer
  to it by the URL

    http://www.cs.wm.edu/~simha/cv.html

  then others can `open' this URL and get the file.


ADVANCED STUFF
--------------

* Creating tables:
  Here is an example of a table in ascii that we will write in HTML:

                        BEER RATING
                     Chilled     Cool
    ----------------------------------
    Dork Pilsener      5.6       6.7
    Blech Dark         7.3       7.1
    Bugwizer           1.2       0.5
    Handiken           4.4       6.5

  Now, in HTML:
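
  The HTML for this table is missing from this copy; a sketch that
  matches the explanation in this section is given below. The empty
  heading cells in the first column and the value "1" for "border" are
  assumptions (older browsers also accept the bare keyword "border").

```html
<table border="1">
  <tr>
    <th></th>
    <!-- One heading spanning the two rating columns -->
    <th colspan="2">BEER RATING</th>
  </tr>
  <tr>
    <th></th>
    <th>Chilled</th>
    <th>Cool</th>
  </tr>
  <tr><td>Dork Pilsener</td><td>5.6</td><td>6.7</td></tr>
  <tr><td>Blech Dark</td><td>7.3</td><td>7.1</td></tr>
  <tr><td>Bugwizer</td><td>1.2</td><td>0.5</td></tr>
  <tr><td>Handiken</td><td>4.4</td><td>6.5</td></tr>
</table>
```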
  A browser displays the table roughly as:

                        BEER RATING
                     Chilled     Cool
    Dork Pilsener      5.6       6.7
    Blech Dark         7.3       7.1
    Bugwizer           1.2       0.5
    Handiken           4.4       6.5

  And now the explanation. Use the "border" keyword if you want lines
  boxing in the table and its rows and columns. Each row of the table
  is enclosed in table row tags ("<tr>" and "</tr>"). If no headings
  are desired, simply place each entry within table data tags, for
  example, "<td> Blech Dark </td>". Use table heading tags ("<th>") for
  headings; to span multiple columns use the "colspan" keyword, as
  above. To span multiple rows, use "rowspan" (not shown in above
  example).

* Including files:
  Editing a large HTML document every time a change is required can be
  annoying. Better to use a bunch of includes and edit the includes
  whenever needed. Some includes will never change; others will be
  changed frequently (perhaps even by programs). For example, consider
  a homepage that displays:

    The Next President of the United Brewpub Tasters of America
    Click on my picture to get my biodata
    What's new in this homepage
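
  The tutorial's include example is not preserved here. The sketch
  below assumes that server-side includes are meant, in the form
  supported by servers such as NCSA httpd and Apache. The file names
  "biodata.html" and "whatsnew.html" are hypothetical, and the server
  must be configured to process includes (often only for files ending
  in ".shtml") before the directive has any effect.

```html
<h1>The Next President of the United Brewpub Tasters of America</h1>
<a href="biodata.html"><img src="mypicture.gif" alt="my mugshot"></a>
Click on my picture to get my biodata
<hr>
<h2>What's new in this homepage</h2>
<!-- The server replaces the next line with the contents of the file. -->
<!--#include file="whatsnew.html" -->
```

  With this arrangement, only whatsnew.html needs editing when news
  changes; the surrounding page stays untouched.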