A Brief Introduction to Internet, The World Wide Web, Web Browsers, Web Servers,
Uniform Resource Locators, MIME, HTTP
HTML5: Evolution of HTML and XHTML, Basic Syntax, Document Structure, Links,
Images, Multimedia, Lists, Tables, Creating Forms. Cascading Style Sheets.

• Introduction to Internet :
The Internet is a worldwide system of interconnected computer networks. The computers
and networks on it exchange information using TCP/IP (Transmission Control
Protocol/Internet Protocol). They are connected via telecommunications networks, and
the Internet can be used for e-mail, transferring files and accessing information on
the World Wide Web.
The World Wide Web is a system of Internet servers that use HTTP (Hypertext Transfer
Protocol) to transfer documents formatted in HTML (Hypertext Markup Language). These
documents are viewed with web browsers such as Netscape, Safari, Google Chrome
and Internet Explorer. Hypertext enables a document to be connected to other documents
on the web through hyperlinks, so a reader can move from one document to another by
following hyperlinked text found within web pages.

• History of Internet :
The development of the Internet dates from 1969. The Advanced Research Projects
Agency (ARPA) wanted to develop a distributed, decentralized network, so that
communications could be maintained in the aftermath of a nuclear attack. Up until around
1969, networks had been point-to-point networks. The desire to have a distributed,
decentralized network was important so that if one node of the system was down,
information could still travel through the available nodes. The network that ARPA created
was termed ARPANET, and eventually became the Internet. The adoption of Transmission
Control Protocol/Internet Protocol (TCP/IP) as the Internet standard protocol in 1983
heralded the coming to full maturity of the Internet (Dilligan, 1998). At that time, there
were several hundred nodes on the Internet, but all shared an affiliation with the military or
ARPA. In 1986, access to the Internet was allowed to all universities, and shortly thereafter,
to the public at large. By 1990, the number of sites on the Internet had grown to over three
hundred thousand (Dilligan, 1998).
• World Wide Web:

The World Wide Web (abbreviated WWW or the Web) is an information space where
documents and other web resources are identified by Uniform Resource Locators (URLs),
interlinked by hypertext links, and can be accessed via the Internet. The World Wide Web
has been central to the development of the Information Age and is the primary tool billions
of people use to interact on the Internet. Web pages are primarily text documents
formatted and annotated with Hypertext Markup Language (HTML). In addition to
formatted text, web pages may contain images, video, audio, and software components that
are rendered in the user's web browser as coherent pages of multimedia content.

• Difference between the Internet and the World Wide Web :

The terms Internet and World Wide Web, although often used synonymously, are different.
The term Internet is a nominalised abbreviation of Internetworking, and came into use in
1982. The Internet identifies not a single network, but a vast network of networks. These
networks communicate with each other via the existing telecommunications networks. The
Internet offers several different services including email and File Transfer Protocol (FTP).
The World Wide Web, commonly known as “the Web,” is the largest and fastest growing
area of the Internet (Worsley, 2000). The Web uses the network of the Internet to access
and link Web sites. The Internet essentially provides the infrastructure over which the Web
is able to operate (Figure 1). The Web is a way of organizing information so that any
workstation or computer around the world can access it through the Internet via any means
of connectivity.
• Web Browsers :
A web browser is a software application for retrieving, presenting and traversing
information resources on the World Wide Web, and is found on virtually all modern
computers. An information resource is identified by a Uniform Resource Identifier (URI/URL)
and may be a web page, image, video or other piece of content. Hyperlinks present in resources
enable users to navigate their browsers easily to related resources. Although browsers are
primarily intended to use the World Wide Web, they can also be used to access information
provided by web servers in private networks or files in file systems. The major web browsers are
Firefox, Google Chrome, Internet Explorer/Microsoft Edge, Opera, and Safari.
• URL:
A Uniform Resource Identifier (URI) is a string of characters used to identify a name or a
resource on the Internet. A URI identifies a resource either by location, or by name, or
both. A URI has two specializations known as URL and URN. A Uniform Resource
Locator (URL) is a subset of the Uniform Resource Identifier (URI) that specifies where
an identified resource is available and the mechanism for retrieving it. A URL defines how
the resource can be obtained. It does not have to be an HTTP URL (http://); a URL can also
begin with (ftp://) or (smb://). A Uniform Resource Name (URN) is a Uniform Resource
Identifier (URI) that uses the URN scheme, and does not imply availability of the
identified resource.

A Uniform Resource Locator (URL), informally termed a web address, is a
reference to a web resource that specifies its location on a computer network and a
mechanism for retrieving it. Every HTTP URL conforms to the syntax of a generic URI. A
generic URI is of the form:
scheme:[//[user[:password]@]host[:port]][/path][?query][#fragment]
It comprises:
The scheme, consisting of a sequence of characters beginning with a letter and followed by
any combination of letters, digits, plus (+), period (.), or hyphen (-). Although schemes are
case-insensitive, the canonical form is lowercase and documents that specify schemes must
do so with lowercase letters. It is followed by a colon (:). Examples of popular schemes
include http, ftp, mailto, file, data, and irc. URI schemes should be registered with the
Internet Assigned Numbers Authority (IANA), although non-registered schemes are used in
practice.
Two slashes (//). These are required by some schemes and not by others. When
the authority component (explained below) is absent, the path component cannot begin
with two slashes.
An authority part, comprising:
o An optional authentication section of a user name and password, separated by a
colon, followed by an at symbol (@)
o A "host", consisting of either a registered name (including but not limited to a
hostname), or an IP address. IPv4 addresses must be in dot-decimal notation, and
IPv6 addresses must be enclosed in brackets ([ ]).
o An optional port number, separated from the hostname by a colon

A path, which contains data, usually organized in hierarchical form, that appears as a
sequence of segments separated by slashes. Such a sequence may resemble or map
exactly to a file system path, but does not always imply a relation to one. The path
must begin with a single slash (/) if an authority part was present, and may also if one
was not, but must not begin with a double slash.

An optional query, separated from the preceding part by a question mark (?), containing a
query string of non-hierarchical data. Its syntax is not well defined, but by convention is
most often a sequence of attribute–value pairs separated by a delimiter:

Query delimiter    Example
Ampersand (&)      key1=value1&key2=value2
Semicolon (;)      key1=value1;key2=value2

An optional fragment, separated from the preceding part by a hash (#). The fragment
contains a fragment identifier providing direction to a secondary resource, such as a section
heading in an article identified by the remainder of the URI. When the primary resource is
an HTML document, the fragment is often an id attribute of a specific element, and web
browsers will scroll this element into view.
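
The components described above can be pulled apart with Python's standard urllib.parse module. The following is a minimal sketch, not a complete URI validator; the URL itself (user name, password, host, port, path and fragment) is a made-up example chosen only so that every component is present.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL, invented for illustration so that every component appears.
url = "http://user:secret@www.example.com:8080/docs/index.html?key1=value1&key2=value2#section2"

parts = urlsplit(url)

# Map each component named in the text to the value urlsplit extracted.
components = {
    "scheme": parts.scheme,
    "username": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}

# The query string is conventionally a sequence of key=value pairs joined by "&".
query_pairs = parse_qs(parts.query)

for name, value in components.items():
    print(f"{name:9} {value}")
print("pairs    ", query_pairs)
```

Note that urlsplit only separates the components; it does not check that the resource exists or that the scheme is registered.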

• MIME :
Multipurpose Internet Mail Extensions (MIME) is an Internet standard that extends the
format of email to support:
❖ Text in character sets other than ASCII
❖ Non-text attachments: audio, video, images, application programs etc.
❖ Message bodies with multiple parts
❖ Header information in non-ASCII character sets
MIME type names follow a given format, type/subtype:
o image/jpeg is an example of a MIME type where image is the media type, and
jpeg is the subtype identifier.
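
As a small illustration of the type/subtype format, Python's standard email package can read a Content-Type header and report the two halves separately. This is only a sketch; the one-line message below is a made-up input, not taken from a real mail.

```python
from email import message_from_string

# Hypothetical message consisting of a single Content-Type header and an empty body.
raw = "Content-Type: image/jpeg\n\n"
msg = message_from_string(raw)

full_type = msg.get_content_type()       # the whole MIME type
media_type = msg.get_content_maintype()  # the part before the slash
subtype = msg.get_content_subtype()      # the part after the slash

print(full_type, media_type, subtype)
```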
• HTTP :
The Hypertext Transfer Protocol (HTTP) is an application protocol for distributed,
collaborative, and hypermedia information systems. HTTP is the foundation of data
communication for the World Wide Web.
Hypertext is structured text that uses logical links (hyperlinks) between nodes containing
text. HTTP is the protocol to exchange or transfer hypertext.
Development of HTTP was initiated by Tim Berners-Lee at CERN in 1989.
• Request Phase :
An HTTP client sends an HTTP request to a server in the form of a request message, which
has the following format:
o A Request-Line
o Zero or more header (General|Request|Entity) fields followed by CRLF
o An empty line (i.e., a line with nothing preceding the CRLF) indicating the end of
the header fields
o Optionally a message-body
The Request-Line begins with a method token, followed by the Request-URI and the
protocol version, and ends with CRLF. The elements are separated by space (SP) characters.
Request-Line = Method SP Request-URI SP HTTP-Version CRLF
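
The Request-Line grammar can be illustrated with a few lines of Python. This is a minimal sketch that only splits the line on single spaces and checks the CRLF terminator; the function name parse_request_line is an assumption made for this example, and a real server performs many more checks.

```python
from typing import Tuple


def parse_request_line(line: str) -> Tuple[str, str, str]:
    """Split 'Method SP Request-URI SP HTTP-Version CRLF' into its three elements."""
    if not line.endswith("\r\n"):
        raise ValueError("Request-Line must end with CRLF")
    parts = line[:-2].split(" ")
    if len(parts) != 3:
        raise ValueError("expected exactly three elements separated by single spaces")
    method, request_uri, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError("malformed HTTP-Version")
    return method, request_uri, version


method, request_uri, version = parse_request_line(
    "GET /pub/WWW/TheProject.html HTTP/1.1\r\n"
)
print(method, request_uri, version)
```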
Request Method
The request method indicates the method to be performed on the resource identified by
the given Request-URI. The method is case-sensitive and should always be mentioned in
uppercase. The following table lists all the supported methods in HTTP/1.1.

S.N. Method and Description

1 GET
The GET method is used to retrieve information from the given server using a given URI. Requests
using GET should only retrieve data and should have no other effect on the data.

2 HEAD
Same as GET, but it transfers the status line and the header section only.

3 POST
A POST request is used to send data to the server, for example, customer information, file upload, etc.
using HTML forms.

4 PUT
Replaces all the current representations of the target resource with the uploaded content.

5 DELETE
Removes all the current representations of the target resource given by URI.

6 CONNECT
Establishes a tunnel to the server identified by a given URI.

7 OPTIONS
Describes the communication options for the target resource.

8 TRACE
Performs a message loop back test along with the path to the target resource.

The Request-URI is a Uniform Resource Identifier and identifies the resource upon which to
apply the request.
Request-URI = "*" | absoluteURI | abs_path | authority
The asterisk * is used when an HTTP request does not apply to a particular resource, but to the
server itself, and is only allowed when the method used does not necessarily apply to a
resource. For example:
OPTIONS * HTTP/1.1
The absoluteURI is used when an HTTP request is being made to a proxy. The proxy is
requested to forward the request or service it from a valid cache, and return the response.
The most common form of Request-URI is that used to identify a resource on an origin server or
gateway. For example, a client wishing to retrieve a resource directly from the origin server
would create a TCP connection to port 80 of the host "" and send the following lines:
GET /pub/WWW/TheProject.html HTTP/1.1
Host:
Note that the absolute path cannot be empty; if none is present in the original URI, it MUST be
given as "/" (the server root).
Request Header Fields
The request-header fields allow the client to pass additional information about the request,
and about the client itself, to the server. These fields act as request modifiers.
Some important Request-header fields that can be used based on the requirement:
o Accept-Charset If-None-Match
o Accept-Encoding If-Range
o Accept-Language If-Unmodified-Since
o Authorization Max-Forwards
o Expect Proxy-Authorization
o From Range
o Host Referer
o If-Match TE
o If-Modified-Since User-Agent
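
To show how the Request-Line, the header fields and the terminating empty line fit together, here is a minimal sketch that assembles a request message as text. The helper name build_request, the host name www.example.com and the User-Agent value are assumptions introduced for this example only; nothing is sent over the network.

```python
CRLF = "\r\n"


def build_request(method, request_uri, headers, body=""):
    """Assemble Request-Line, header fields, an empty line and an optional body."""
    lines = [f"{method} {request_uri} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends with CRLF; one extra CRLF marks the end of the header fields.
    return CRLF.join(lines) + CRLF + CRLF + body


request = build_request(
    "GET",
    "/pub/WWW/TheProject.html",
    {
        "Host": "www.example.com",        # hypothetical host name
        "Accept-Language": "en",
        "User-Agent": "demo-client/0.1",  # hypothetical client name
    },
)

# Split the message at the empty line to separate header section and body.
head, _, body = request.partition(CRLF + CRLF)
print(head)
```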
• Response Phase :
After receiving and interpreting a request message, a server responds with an HTTP
response message:
o A Status-Line
o Zero or more header (General|Response|Entity) fields followed by CRLF
o An empty line (i.e., a line with nothing preceding the CRLF) indicating the end of
the header fields
o Optionally a message-body
Message Status-Line
A Status-Line consists of the protocol version followed by a numeric status code and its associated
textual phrase. The elements are separated by space SP characters.

Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

A server supporting HTTP version 1.1 will return the following version information:

HTTP-Version = HTTP/1.1

Status Code
The Status-Code element is a 3-digit integer where the first digit defines the class of
response and the last two digits do not have any categorization role. There are 5 values for the first digit:
S.N. Code and Description

1 1xx: Informational
It means the request was received and the process is continuing.

2 2xx: Success
It means the action was successfully received, understood, and accepted.

3 3xx: Redirection
It means further action must be taken in order to complete the request.

4 4xx: Client Error
It means the request contains incorrect syntax or cannot be fulfilled.

5 5xx: Server Error
It means the server failed to fulfill an apparently valid request.

HTTP status codes are extensible and HTTP applications are not required to understand the meaning of
all registered status codes. A list of all the status codes has been given in a separate chapter for your
reference.
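
Because only the first digit of the Status-Code carries the class, a client can classify a response it does not otherwise recognise. The sketch below parses a Status-Line and looks up the class; the function name parse_status_line and the sample line are assumptions made for illustration.

```python
# The five classes from the table above, keyed by the first digit of the code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def parse_status_line(line):
    """Split 'HTTP-Version SP Status-Code SP Reason-Phrase CRLF' and classify the code."""
    # The reason phrase may itself contain spaces, so split at most twice.
    version, code_text, reason = line.rstrip("\r\n").split(" ", 2)
    code = int(code_text)
    return version, code, reason, STATUS_CLASSES.get(code // 100, "Unknown")


result = parse_status_line("HTTP/1.1 404 Not Found\r\n")
print(result)
```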

Response Header Fields

The response-header fields allow the server to pass additional information about the response which
cannot be placed in the Status- Line. These header fields give information about the server and about
further access to the resource identified by the Request-URI.
o Accept-Ranges Retry-After
o Age Server
o ETag Vary
o Location WWW-Authenticate
o Proxy-Authenticate

• Introduction to XHTML :
What is HTML?
HTML is a Standard Generalized Markup Language (SGML) application conforming to International
Standard ISO 8879. HTML is widely regarded as the standard publishing language of the World Wide
Web.
What is SGML?
SGML is a language for describing markup languages, particularly those used in electronic document
exchange, document management, and document publishing. HTML is an example of a language
defined in SGML.
What is XML?
XML stands for EXtensible Markup Language. XML is a markup language much like HTML and it was
designed to describe data. XML tags are not predefined. You must define your own tags according to
your needs.
XHTML stands for EXtensible HyperText Markup Language. It is the next step in the evolution of the
internet. XHTML 1.0 is the first document type in the XHTML family. XHTML is almost identical to
HTML 4.01 with only a few differences; it is a cleaner and stricter version of HTML 4.01. XHTML was
developed by the World Wide Web Consortium (W3C) to help web developers make the transition from
HTML to XML. By migrating to XHTML today, web developers can enter the XML world with all of its
benefits, while still remaining confident in the backward and future compatibility of the content.

• Why Use XHTML?

• XHTML documents are XML conforming, so they are readily viewed, edited, and validated with
standard XML tools.
• XHTML documents can be written to operate better than they did before in existing browsers as
well as in new browsers.
• XHTML documents can utilize applications such as scripts and applets that rely upon either the
HTML Document Object Model or the XML Document Object Model.
• XHTML gives you a more consistent, well-structured format so that your webpages can be easily
parsed and processed by present and future web browsers.
• You can easily maintain, edit, convert and format your document in the long run.
• Since XHTML is an official standard of the W3C, your website becomes more compatible with
many browsers and it is rendered more accurately.
• Basic Syntax :
HTML elements are represented by tags.
HTML tags label pieces of content such as "heading", "paragraph", "table", and so on.
Browsers do not display the HTML tags, but use them to render the content of the
page.
Example
<!DOCTYPE html>
<html>
<head><title>Page Title</title></head>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
</body>
</html>
Example Explained

• The <!DOCTYPE html> declaration defines this document to be HTML5

• The <html> element is the root element of an HTML page
• The <head> element contains meta information about the document
• The <title> element specifies a title for the document
• The <body> element contains the visible page content
• The <h1> element defines a large heading
• The <p> element defines a paragraph

HTML tags are element names surrounded by angle brackets:
<tagname>content goes here...</tagname>

• HTML tags normally come in pairs like <p> and </p>

• The first tag in a pair is the start tag, the second tag is the end tag
• The end tag is written like the start tag, but with a forward slash inserted before the tag name
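
The pairing of start and end tags can be observed with Python's standard html.parser module, which reports each tag as it is read. This is a minimal sketch; the class name TagLogger is an assumption made for this example, and the markup is the heading and paragraph from the example above.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record start tags, end tags and text in the order they are encountered."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Skip whitespace-only text between tags.
        if data.strip():
            self.events.append(("text", data.strip()))


logger = TagLogger()
logger.feed("<h1>My First Heading</h1><p>My first paragraph.</p>")
logger.close()

for kind, value in logger.events:
    print(kind, value)
```

Each element shows up as a start event, its content, and an end event carrying the same tag name.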

HTML Basics :

HTML Headings

HTML headings are defined with the <h1> to <h6> tags.

<h1> defines the most important heading. <h6> defines the least important heading:

<h1>This is heading 1</h1>
<h2>This is heading 2</h2>
<h3>This is heading 3</h3>

HTML Paragraphs

HTML paragraphs are defined with the <p> tag:

<p>This is a paragraph.</p>
<p>This is another paragraph.</p>

HTML Links

HTML links are defined with the <a> tag:

<a href="">This is a link</a>

The link's destination is specified in the href attribute.

Attributes are used to provide additional information about HTML elements.

HTML <blockquote> for Quotations

The HTML <blockquote> element defines a section that is quoted from another source.

Browsers usually indent <blockquote> elements.

<p>Here is a quote from WWF's website:</p>
<blockquote cite="">
For 50 years, WWF has been protecting the future of nature.
The world's leading conservation organization,
WWF works in 100 countries and is supported
million members in the United States and close to 5 million
</blockquote>

HTML also defines special elements for defining text with a special meaning.

HTML uses elements like <b> and <i> for formatting output, like bold or italic text.

Formatting elements were designed to display special types of text:

• <b> - Bold text

• <strong> - Important text
• <i> - Italic text
• <em> - Emphasized text
• <mark> - Marked text
• <small> - Small text
• <del> - Deleted text
• <ins> - Inserted text
• <sub> - Subscript text
• <sup> - Superscript text

HTML Computer Code Formatting

HTML normally uses variable letter size and spacing.

This is not what we want when displaying computer code.

HTML <code> For Computer Code

The HTML <code> element defines a piece of programming code:

<code>
var x =
var y = 6;
document.getElementById("demo").innerHTML = x + y;
</code>

HTML Comment Tags

Comment tags are used to insert comments in the HTML source code.

You can add comments to your HTML source by using the following syntax:

<!-- Write your comments here -->

HTML Line Breaks

The HTML <br> element defines a line break.

Use <br> if you want a line break (a new line) without starting a new paragraph:

<p>This is<br>a paragraph<br>with line breaks.</p>

The HTML <pre> Element

The HTML <pre> element defines preformatted text.

The text inside a <pre> element is displayed in a fixed-width font (usually Courier), and it preserves
both spaces and line breaks:

<pre>
My Bonnie lies over the ocean.

My Bonnie lies over the sea.

My Bonnie lies over the ocean.

Oh, bring back my Bonnie to me.
</pre>


HTML Entities

Reserved characters in HTML must be replaced with character entities.

Characters that are not present on your keyboard can also be replaced by entities.

Some characters are reserved in HTML.

If you use the less than (<) or greater than (>) signs in your text, the browser might mix them with
tags.

Character entities are used to display reserved characters in HTML.

A character entity looks like this:

&entity_name;



Some Other Useful HTML Character Entities

Result  Description                          Entity Name  Entity Number
        non-breaking space                   &nbsp;       &#160;
<       less than                            &lt;         &#60;
>       greater than                         &gt;         &#62;
&       ampersand                            &amp;        &#38;
"       double quotation mark                &quot;       &#34;
'       single quotation mark (apostrophe)   &apos;       &#39;
¢       cent                                 &cent;       &#162;
£       pound                                &pound;      &#163;
¥       yen                                  &yen;        &#165;
€       euro                                 &euro;       &#8364;
©       copyright                            &copy;       &#169;
®       registered trademark                 &reg;        &#174;
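
The replacement of reserved characters by entities, and the reverse, can be tried out with Python's standard html module. This is a small sketch; the sample strings are made up for the example.

```python
import html

# Text containing the reserved characters <, >, & and the double quote.
text = 'if (a < b && b > c) print("ok")'

# escape() replaces reserved characters with character entities.
escaped = html.escape(text)
print(escaped)

# unescape() turns entity names and entity numbers back into characters.
restored = html.unescape(escaped)
symbols = html.unescape("&copy; &euro; &#163;")
print(restored)
print(symbols)
```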

HTML <meta> Tag

Metadata is data (information) about data.

The <meta> tag provides metadata about the HTML document. Metadata will not be displayed on
the page, but will be machine parsable.

Meta elements are typically used to specify page description, keywords, author of the document, last
modified, and other metadata.

The metadata can be used by browsers (how to display content or reload page), search
engines (keywords), or other web services.


Describe metadata within an HTML document:

<meta charset="UTF-8">
<meta name="description" content="Free Web tutorials">
<meta name="keywords" content="HTML,CSS,XML,JavaScript">
<meta name="author" content="John Doe">
<meta name="viewport" content="width=device-width, initial-scale=1.0">

HTML <hr> Tag

The <hr> tag defines a thematic break in an HTML page (e.g. a shift of topic).

The <hr> element is used to separate content (or define a change) in an HTML page.


Use the <hr> tag to define a thematic change in the content:

<p>HTML is a language for describing web pages.</p>
<hr>
<p>CSS defines how to display HTML elements.</p>

HTML Images

In HTML, images are defined with the <img> tag.

The <img> tag is empty, it contains attributes only, and does not have a closing tag.

The src attribute specifies the URL (web address) of the image:

<img src="url" alt="some_text" style="width:width;height:height;">

The alt Attribute

The alt attribute provides an alternate text for an image, if the user for some reason cannot view it
(because of slow connection, an error in the src attribute, or if the user uses a screen reader).

<!DOCTYPE html>

<h2>Spectacular Mountain</h2>
<img src="pic_mountain.jpg" alt="Mountain View" style="width:304px;height:228px;">


XHTML Document Validation :

<!DOCTYPE ... > Is Mandatory

An XHTML document must have an XHTML DOCTYPE declaration.

The <html>, <head>, <title>, and <body> elements must also be present, and the xmlns attribute in
<html> must specify the xml namespace for the document.

This example shows an XHTML document with a minimum of required tags:

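<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Title of document</title></head>
<body>some content</body>
</html>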

HTML Lists

HTML List Example

An Unordered List:

• Item
• Item
• Item
• Item

An Ordered List:

1. First item
2. Second item
3. Third item
4. Fourth item

Unordered HTML List

An unordered list starts with the <ul> tag. Each list item starts with the <li> tag.

The list items will be marked with bullets (small black circles) by default.



HTML Description Lists

HTML also supports description lists.

A description list is a list of terms, with a description of each term.

The <dl> tag defines the description list, the <dt> tag defines the term (name), and the <dd> tag
describes each term:

<dl>
<dt>Coffee</dt><dd>- black hot drink</dd>
<dt>Milk</dt><dd>- white cold drink</dd>
</dl>

HTML Tables

Defining an HTML Table

An HTML table is defined with the <table> tag.

Each table row is defined with the <tr> tag. A table header is defined with the <th> tag. By default,
table headings are bold and centered. A table data/cell is defined with the <td> tag.

<table style="width:100%">

HTML Table - Cells that Span Many Columns

To make a cell span more than one column, use the colspan attribute:

<table style="width:100%">

HTML Table - Cells that Span Many Rows

To make a cell span more than one row, use the rowspan attribute:

<table style="width:100%">

▪ Use the HTML <table> element to define a table

▪ Use the HTML <tr> element to define a table row
▪ Use the HTML <td> element to define a table data
▪ Use the HTML <th> element to define a table heading
▪ Use the HTML <caption> element to define a table caption
▪ Use the CSS border property to define a border
▪ Use the CSS border-collapse property to collapse cell borders
▪ Use the CSS padding property to add padding to cells
▪ Use the CSS text-align property to align cell text
▪ Use the CSS border-spacing property to set the spacing between cells
▪ Use the colspan attribute to make a cell span many columns
▪ Use the rowspan attribute to make a cell span many rows
▪ Use the id attribute to uniquely define one table
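
To make the row and cell structure concrete, the following sketch generates table markup from a list of rows, using colspan for a cell that covers several columns. The function name render_table and the sample names, phone number and e-mail address are invented for illustration; this is one possible way to produce such markup, not a prescribed one.

```python
from html import escape


def render_table(header, rows):
    """Build <table> markup; a cell is either text or a (text, colspan) pair."""
    lines = ['<table style="width:100%">']
    # One <tr> holding a <th> per column heading.
    lines.append("<tr>" + "".join(f"<th>{escape(h)}</th>" for h in header) + "</tr>")
    for row in rows:
        cells = []
        for cell in row:
            text, span = cell if isinstance(cell, tuple) else (cell, 1)
            attr = f' colspan="{span}"' if span > 1 else ""
            cells.append(f"<td{attr}>{escape(text)}</td>")
        lines.append("<tr>" + "".join(cells) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


table_html = render_table(
    ["Name", "Phone", "Email"],
    [
        ["Ada", "555-0100", "ada@example.com"],  # hypothetical sample data
        [("No contact details on file", 3)],     # one cell spanning all three columns
    ],
)
print(table_html)
```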
The <form> Element

The HTML <form> element defines a form that is used to collect user input:

<form>
form elements
</form>

An HTML form contains form elements.

Form elements are different types of input elements, like text fields, checkboxes, radio buttons,
submit buttons, and more.

The <input> Element

The <input> element is the most important form element.

The <input> element can be displayed in several ways, depending on the type attribute.

Here are some examples:

Type                    Description
<input type="text">     Defines a one-line text input field
<input type="radio">    Defines a radio button (for selecting one of many choices)
<input type="submit">   Defines a submit button (for submitting the form)

The <select> Element

The <select> element defines a drop-down list:

<select name="cars">
<option value="volvo">Volvo</option>
<option value="saab">Saab</option>
<option value="fiat">Fiat</option>
<option value="audi">Audi</option>
</select>
The <textarea> Element

The <textarea> element defines a multi-line input field (a text area):

<textarea name="message" rows="10" cols="30">
The cat was playing in the garden.
</textarea>

Input Type Submit

<input type="submit"> defines a button for submitting form data to a form-handler.

The form-handler is typically a server page with a script for processing input data.
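
When a form is submitted, the browser sends the field names and values to the form-handler as key=value pairs, commonly in the same form as a URL query string. The sketch below reproduces that encoding with Python's urllib.parse; the field names cars and message are taken from the select and textarea examples, while the chosen values are assumed user input.

```python
from urllib.parse import urlencode, parse_qsl

# Assumed user input for the "cars" drop-down and the "message" text area.
form_values = {
    "cars": "volvo",
    "message": "The cat was playing in the garden.",
}

# Encode the way a submitted form is encoded: pairs joined by "&", spaces as "+".
encoded = urlencode(form_values)
print(encoded)

# A form-handler would decode the pairs again before processing them.
decoded = parse_qsl(encoded)
print(decoded)
```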

Input Type Reset

<input type="reset"> defines a reset button that will reset all form values to their default values.

• HTML5 Elements :
HTML5 Audio
Audio on the Web

Before HTML5, audio files could only be played in a browser with a plug-in (like
Flash).

The HTML5 <audio> element specifies a standard way to embed audio in a web page.

<audio controls>
<source src="horse.ogg" type="audio/ogg">
<source src="horse.mp3" type="audio/mpeg">
Your browser does not support the audio element.
</audio>

HTML5 Video
Playing Videos in HTML

Before HTML5, a video could only be played in a browser with a plug-in (like
Flash).

The HTML5 <video> element specifies a standard way to embed a video in a web page.

<video width="320" height="240" controls>
<source src="movie.mp4" type="video/mp4">
<source src="movie.ogg" type="video/ogg">
Your browser does not support the video tag.
</video>

HTML <time> Tag

The <time> tag defines a human-readable date/time.

This element can also be used to encode dates and times in a machine-readable way so that user
agents can offer to add birthday reminders or scheduled events to the user's calendar, and search
engines can produce smarter search results.


How to define a time and a date:

<p>We open at <time>10:00</time> every morning.</p>

<p>I have a date on <time datetime="2008-02-14 20:00">Valentines day</time>.</p>
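
The value of the datetime attribute is what makes the date machine-readable. As a small sketch of what a program reading the page might do with it, the attribute value from the example can be parsed with Python's datetime module; the format string is an assumption matching this particular value.

```python
from datetime import datetime

# Attribute value copied from the <time> example above.
value = "2008-02-14 20:00"

# Parse year-month-day and hour:minute into a datetime object.
moment = datetime.strptime(value, "%Y-%m-%d %H:%M")

print(moment.isoformat())
print(moment.year, moment.month, moment.day, moment.hour)
```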

• Syntactic Differences between HTML and XHTML :

The Most Important Differences of XHTML from HTML:

Document Structure

• XHTML DOCTYPE is mandatory

• The xmlns attribute in <html> is mandatory
• <html>, <head>, <title>, and <body> are mandatory

XHTML Elements

• XHTML elements must be properly nested

• XHTML elements must always be closed
• XHTML elements must be in lowercase
• XHTML documents must have one root element

XHTML Attributes

• Attribute names must be in lower case

• Attribute values must be quoted
• Attribute minimization is forbidden
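
Several of these rules follow from XHTML being XML: a fragment that breaks them is not well-formed and an XML parser rejects it. The sketch below checks a few made-up fragments with Python's xml.etree.ElementTree; the function name is_well_formed and the sample markup are assumptions for this example. It does not check the lowercase rule or the DOCTYPE, which need validation against the XHTML DTD rather than a well-formedness test.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML with a single root element."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments, one per rule from the list above.
samples = {
    "closed, nested, quoted": '<p class="note">Line one<br />Line two</p>',
    "element not closed": "<p>Line one<br>Line two</p>",
    "improper nesting": "<b><i>bold italic</b></i>",
    "unquoted attribute value": "<p class=note>text</p>",
    "minimized attribute": "<input checked />",
    "two root elements": "<p>one</p><p>two</p>",
}

results = {label: is_well_formed(markup) for label, markup in samples.items()}
for label, ok in results.items():
    print(f"{label:26} {'well-formed' if ok else 'rejected'}")
```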

