Lesson 5: Web Based E Commerce Architecture: Topic: Uniform Resource Locator
Topic:
Introduction, Web System Architecture, Generation of Dynamic Web Pages, Cookies, Summary, Exercise
other over the internet. This protocol is called the Hypertext Transfer Protocol (HTTP).
Objectives
After this lecture the students will be able to:
Understand web based E Commerce architecture
As you will have understood, the web system together with the Internet forms the basic infrastructure for supporting e-commerce. In this lecture we will discuss in detail the components that a web-based system consists of, assuming that you have a basic knowledge of the network architecture of the Internet (i.e. the layered model of the Internet).
https: secure hypertext transfer protocol
ftp: file transfer protocol
telnet: telnet protocol for accessing a remote host
The domain_name, port, directory and resource specify the domain name of the destination computer, the port number of the connection, the directory of the resource and the requested resource, respectively. For example, the URL of the welcome page (main.html) of our VBS may be written as http://www.vbs.com/welcome/main.html. In this example, the protocol is http, the domain_name is www.vbs.com, and the directory is welcome (i.e., the file main.html is stored under the directory called welcome). Note that in this example the port is omitted because the default port for the protocol is used; that is, formally the URL should be specified as http://www.vbs.com:80/welcome/main.html, where 80 specifies the port for HTTP as explained later. In some protocols (e.g. TELNET) where the user name and password are required, the URL can be specified as follows:
protocol://username:password@domain_name:port/directory/resource
where username and password specify the user name and password, respectively.
Let us consider a general overview of HTTP before discussing its details. This protocol is used for the web client and the web server to communicate with each other.
Overview of the Hypertext Transfer Protocol
Suppose that you access the URL of the VBS, http://www.vbs.com/welcome/main.html, by clicking the corresponding hyperlink. This is what happens in terms of the interactions between the web browser and the web server. Using the URL of the hyperlink, the web browser (or web client) obtains the IP address of the VBS through the DNS. After receiving the reply, the web client establishes a TCP connection to port 80 of the web server. Note that port 80 is the default port for HTTP. Then it issues a GET command (more specifically, GET /welcome/main.html) to retrieve the web page main.html from the web server. The web server then returns the corresponding
for displaying information to the user as well as collecting the user's input to the system. Serving as the client, the web browser also interacts with the web server using HTTP.
Web server: It is one of the main components of the service system. It interacts with the web client as well as the backend system.
Application server : It is the other main component of the
server and the web client to exchange information with each other.
Fig 5.1 Web System Architecture
As the web client and the web server are not connected directly, we need a protocol for them to talk or communicate with each
3.231/3A.231/3B.231
file to the browser. In HTTP/1.0, the TCP connection is then closed. In HTTP/1.1, the connection is kept open in order to support multiple requests. The browser then shows the text in the hypertext file. It also obtains the images in the hypertext file from their respective URLs and displays them. This is why you see the text first and the images later: the images take a longer time to download.
In many companies, a proxy web server is set up for security and other administrative reasons. In this case, users need to access other web servers via the proxy web server. Basically, a user's browser issues a request to the proxy web server first and then the proxy web server retrieves the specific web page on behalf of the user. Having retrieved the web page, the proxy web server returns it to the user's browser for display. Essentially, the proxy web server acts as an application gateway for enhancing security.
A proxy web server can have both positive and negative effects on web performance. On the positive side, it can be used to keep cache copies of web pages so that if subsequent users require these web pages, they can be returned to the users almost immediately. In other words, the retrieval time can be greatly reduced. However, the proxy web server can also become a bottleneck if the system is not well planned.
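Let us see how the URL format introduced earlier in this lesson can be taken apart in practice. The following is only a minimal sketch in Python using the standard urllib.parse module. The function name split_url, the DEFAULT_PORTS table and the telnet host name in the usage note are our own illustrative assumptions and are not part of the lesson.

```python
from urllib.parse import urlsplit

# Well-known default ports for the protocols listed in this lesson.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21, "telnet": 23}


def split_url(url):
    """Break a URL into the parts named in this lesson (illustrative helper)."""
    parts = urlsplit(url)
    # Split the path into the directory and the requested resource.
    directory, _, resource = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "domain_name": parts.hostname,
        # Fall back to the protocol's default port when the URL omits it.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "directory": directory.lstrip("/"),
        "resource": resource,
    }


info = split_url("http://www.vbs.com/welcome/main.html")
for name, value in info.items():
    print(name, "=", value)
```

Calling split_url with a URL of the form telnet://username:password@host.example:23/ (a made-up host) fills in the username and password entries in the same way, while the URL with the explicit port 80 gives the same result as the one that omits it.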
E-COMMERCE
Request method
Description
GET
It gets or retrieves a web page.
HEAD
It requests the header information of the web page. In other words, the response is the same as that for GET with the body or the content of the web page removed.
POST
It posts additional data to the web server in the HTTP request message. The additional data is attached after the headers.
Table 5.1 Request methods in HTTP/1.0
As described in Table 5.1, Request_method specifies the request method used. Resource_address is essentially the URL that specifies the location of the requested resource in the web server. HTTP/Version_number tells the web server which version of HTTP the web client is using. There are three types of headers for passing additional information to the web server, namely General_header, Request_header, and Entity_header. They are described in Tables 5.2, 5.3, and 5.4, respectively. Finally, the web client can post additional data to the server after the Blank_line. This is used in conjunction with the POST request method. Let us look at the following example of an HTTP request message.
GET /vbs.html HTTP/1.0
Accept: image/gif, image/jpeg, */*
JPEG and GIF are different encoding techniques that compress an image for transmission and storage so as to reduce the number of bytes (size) needed to represent the image.
As discussed in the previous section, the basic operation of HTTP is as follows. The web client (e.g. your web browser or even a robot program) makes a TCP connection to a web server at port 80. Subsequently, an HTTP request consisting of the specific request, required headers and additional data is forwarded to the web server. After processing the request, the web server returns an HTTP response consisting of the status, additional headers, and the requested resource such as a web page. A new version of HTTP called HTTP/1.1 is also becoming popular.
HTTP Request
The general format of the client request is as follows:
Request_method Resource_address HTTP/Version_number
General_header(s)
Request_header(s)
Entity_header(s)
Blank_line
Entity (Additional_data)
This request message means that the client wants to get a document called vbs.html from the server. The document is located at the root directory of the server. Version 1.0 of HTTP is used. The client can accept any content type as indicated by */*, but for image content GIF is preferred to JPEG. Note that no additional data is enclosed in this HTTP request, as it uses the GET method.
Header name
Description
Date
It specifies when (i.e. date and time) the message was generated.
Pragma
This header is for specifying implementation-specific directives. For example, if the client does not want to receive a cached copy of the requested resource, it will specify Pragma: no-cache
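The general format of the client request can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function name build_request and its parameters are hypothetical, and lines are ended with a carriage return and line feed as is usual for HTTP messages.

```python
def build_request(method, resource, headers=None, body="", version="1.0"):
    """Assemble an HTTP request message as a string (illustrative helper)."""
    # Request line: Request_method Resource_address HTTP/Version_number
    lines = [f"{method} {resource} HTTP/{version}"]
    # One line per header, in the form "Name: value".
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line separates the headers from the optional additional data.
    return "\r\n".join(lines) + "\r\n\r\n" + body


request = build_request("GET", "/vbs.html", {"Accept": "image/gif, image/jpeg, */*"})
print(request)
```

The same helper can carry additional data for the POST method by passing it as the body argument; with GET, as here, the body is simply left empty.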
HTTP Response
Having processed the web client's request, the web server returns a response to the client. The general format of the response is as follows.
HTTP/Version_number Status_code Result_message (Status line)
General_header(s)
Response_header(s)
Entity_header(s)
Blank_line
Entity_body (e.g., web page)
Again, HTTP/Version_number indicates the version of HTTP that the server is using. The Status_code indicates the result of the request. The common status codes are given in Table 7.1. The headers General_header(s), Response_header(s), and Entity_header(s) are used to pass additional information to the web client. General_header and Entity_header have been described in Tables 5.2 and 5.4, respectively. Response_header is described in Table 7.2. Following the headers, the response data is enclosed as the Entity_body. Usually this is a hypertext file.
Header name
Description
Authorization
Used with the WWW-Authenticate response header described later, it provides authentication information to the web server. HTTP provides a basic authentication scheme by encoding the username and password in Base64 format.
From
This header provides the contact e-mail address. (e.g., the e-mail address of the person who generates the request)
If-Modified-Since
It asks the web server to provide the requested resource only if it has been modified since the specified time in the header.
Referer
It indicates where (i.e. the URL) the client obtained the current address. By using this header, a web server can trace back the previous link(s), e.g., for maintenance or administrative purposes.
User-Agent
It provides information on the user agent (e.g. web browser) used by the web client.
Table 5.3 Request headers in HTTP/1.0
Header name
Description
Allow
It indicates the request methods (e.g. GET, POST, and HEAD) allowed.
Content-Encoding
It specifies the encoding method (e.g. compression method) applied to the content.
Content-Length
It indicates the length of the content in number of octets.
Content-Type
It indicates the content type or MIME type of the content, e.g., text/html means an HTML document in text format.
Expires
It specifies when (i.e. date and time) the content expires.
Last-Modified
It specifies when the content (web page) was last modified.
Table 5.4 Entity headers in HTTP/1.0
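The basic authentication scheme mentioned for the Authorization request header can be illustrated as follows. This is a minimal sketch in Python; the function names and the user name and password shown are made-up examples, not values from the lesson.

```python
import base64


def basic_authorization(username, password):
    """Build an Authorization header line for the basic scheme (illustrative)."""
    # The user name and password are joined by a colon and Base64-encoded.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Authorization: Basic {token}"


def decode_basic(header_line):
    """Recover the user name and password from such a header line."""
    token = header_line.split("Basic ", 1)[1]
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


header = basic_authorization("henry", "secret")  # hypothetical credentials
print(header)
print(decode_basic(header))
```

Note that decode_basic needs no key at all: Base64 is an encoding, not encryption, so anyone who can read the request can recover the password. This is one reason why the basic scheme is normally combined with a secure connection.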
Status code and result message
Meaning
200 OK
This refers to the normal case in which the request is OK or successful.
201 Created
The request is processed and the resource is created as requested.
204 No content
The request is processed but no content is available for the client.
301 Moved permanently
The resource has been moved permanently to the URL given in the Location header.
302 Moved temporarily
The resource has been moved temporarily to the URL given in the Location header. As it is only a temporary relocation, future requests should still be sent to the current URL.
304 Not modified
The requested web page is not returned to the client as it has not been modified since the time specified in the If-Modified-Since header.
400 Bad request
The request could not be understood by the web server, e.g. because of bad syntax.
401 Unauthorized
Used in conjunction with the WWW-Authenticate header field, it indicates that user authentication is required.
403 Forbidden
Access is forbidden, e.g. the user does not have the access rights.
404 Not found
The requested resource is not found, possibly because it has been deleted from the web server.
This response message means that the web server is using version 1.0 of HTTP. The request has been processed successfully. The server is Microsoft-IIS/4.0. The current date and time are 30 Sep. 2000 and 09:30:00, respectively. The response document is an HTML file in text format and the file size is 600 bytes. This file has not been modified since 09:00:00 on 30 Sep. 2000.
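To see how a web client might read such a response, consider the following minimal Python sketch. The SAMPLE message is our own reconstruction from the description above and is an assumption: the GMT time zone and the placeholder body are not given in the lesson, and the Content-Length of 600 is copied from the description rather than computed from the placeholder body. The function name parse_response is also hypothetical.

```python
def parse_response(message):
    """Split an HTTP response into status line parts, headers and body."""
    # The blank line separates the headers from the entity body.
    head, _, body = message.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # Status line: HTTP/Version_number Status_code Result_message
    version, status_code, result_message = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        # Split at the first colon only, since values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return version, int(status_code), result_message, headers, body


# Hypothetical response reconstructed from the description in the text.
SAMPLE = (
    "HTTP/1.0 200 OK\r\n"
    "Server: Microsoft-IIS/4.0\r\n"
    "Date: Sat, 30 Sep 2000 09:30:00 GMT\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 600\r\n"
    "Last-Modified: Sat, 30 Sep 2000 09:00:00 GMT\r\n"
    "\r\n"
    "<html>...</html>"
)

version, status, result, headers, body = parse_response(SAMPLE)
print(version, status, result)
print(headers["Server"], headers["Content-Type"], headers["Content-Length"])
```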
For example, ?Input=%2F%7Ehenry%2Flecture2%2Dnotes.html is equivalent to attaching a name called Input with the value /~henry/lecture2-notes.html to the URL, because %2F is /, %7E is ~ and %2D is -.
An alternative way to pass data to the web server is by using the POST command. In this case, data is appended after the headers in the HTTP request message. For example, if we use the POST command to pass data to the above booksearch program, you will find the following in the HTTP request message:
POST /servlet/booksearch HTTP/1.0
Accept: */*

title=ecommerce&year=2000
Note that the data is appended after a blank line following the header. In this example, there is only one header, Accept: */*. It specifies that the web client is willing to accept any content type.
All of you might have heard about cookies. Let us discuss what cookies basically are.
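The URL encoding and the POST data described above can be reproduced with Python's standard urllib.parse module. This is only a sketch: the variable names are ours, and the Content-Length header is an addition that does not appear in the lesson's example (servers generally need it to know how much data follows the blank line).

```python
from urllib.parse import parse_qs, unquote, urlencode

# Decoding the escaped value: %2F is "/", %7E is "~" and %2D is "-".
decoded = unquote("%2F%7Ehenry%2Flecture2%2Dnotes.html")
print(decoded)

# Parsing the whole name=value pair that follows the "?" in the URL.
query = parse_qs("Input=%2F%7Ehenry%2Flecture2%2Dnotes.html")
print(query)

# Encoding form data for the body of a POST request.
form_data = urlencode({"title": "ecommerce", "year": 2000})

# Assembling the POST message; Content-Length is our addition (see above).
post_message = (
    "POST /servlet/booksearch HTTP/1.0\r\n"
    "Accept: */*\r\n"
    f"Content-Length: {len(form_data)}\r\n"
    "\r\n"
    + form_data
)
print(post_message)
```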
Cookies
HTTP is a stateless protocol. That means the web server will not keep the user's state or the user's information. For example, when a web server receives an HTTP request, it does not know whether this request comes from a previous client or a new client. In other words, there is no way to tell whether or not the current request is related to a previous request. In many e-commerce applications, knowing the user's state is an important requirement. For example, in a shopping cart application, the server needs to know the content of the user's shopping cart in order to display the items to the user correctly.
To address this important issue, Netscape proposed a method called cookies for a web server to save state data at the web client. The original specification is stored at http://www.netscape.com/newsref/std/cookie_spec.html and it has now been standardized. A maximum of 20 cookies are allowed at each domain and each cookie is limited to 4 KB to prevent overloading the memory of the client's computer.
If a web server wants a web client to save a cookie, it will send the Set-Cookie header in the HTTP response. The Set-Cookie header is of the form
Set-Cookie: Name=Value
where Name and Value are the name and value of the cookie, respectively. Whenever required, the client will include the cookie in the HTTP request header using the following format:
Cookie: Name=Value
This allows the user's information to be passed to the server.
Let us look at how cookies can be used to implement a simple shopping cart for our VBS. Suppose that there are already two items in the shopping cart. The first item (Item1) has a product code of 11111 and the second item (Item2) has a product code of 22222. When the client sends an HTTP request to put another item (say an item with product code 33333) into the shopping cart, the server can set a cookie by including the following cookie header:
Set-Cookie: Item3=33333
It means that the third item has a product code of 33333. In the next HTTP request, the client needs to send the following cookie headers to the server:
Cookie: Item1=11111
Cookie: Item2=22222
Cookie: Item3=33333
By reading the cookies, the server knows the content of the shopping cart so that it can be displayed in the returned web page accordingly. Besides the name and value, the following extra information can be provided for the cookie(s). It can be added to the Set-Cookie header as shown in the later example.
Comment - provides information on the cookie (e.g. its use)
Furthermore, a web client can send the next request without waiting for the response to the previous request. In other words, HTTP/1.1 allows pipelining of requests and responses. If a web client wants to close a connection, it can specify the close option in the Connection request header, i.e., Connection: close.
Efficient use of IP Addresses: Currently many small organizations use a web hosting service from ISPs. For example, we may put the VBS in an ISP's web server so that we do not need to set up and look after a web server ourselves. In HTTP/1.1, a Host header must be included in the HTTP request message to specify the host name in the web server. This enables different organizations to share the same IP address of the web server, thus allowing the efficient use of IP addresses.
Range Request: HTTP/1.1 allows a web client to retrieve part of a file by using the Range header. For example, if the connection is broken while the web client is receiving a large file, it can request the web server to send the file from the break point. Furthermore, the range request function is useful when the web client wants only a portion of a large file.
Cache Control: The purpose of caching is to shorten the retrieval time of web pages. It is done by maintaining a cache copy of the previous responses in the web browser or the proxy server so that future requests can be served by the cache copies rather than by the origin servers. HTTP/1.0 only supports basic cache control. For example, by using the Expires header, the origin server can tell the proxy server when a cache copy should be removed. Furthermore, the web client can tell the proxy server that it does not want a cache copy of the response by using the Pragma: no-cache header. In HTTP/1.1, a Cache-Control header is included to provide better cache control and cache functions.
Support for Proxy Authentication: HTTP/1.1 provides the Proxy-Authenticate and Proxy-Authorization headers for enabling proxy authentication. In principle, they work in a similar manner to the WWW-Authenticate and Authorization headers in HTTP/1.0, respectively. However, the Proxy-Authenticate and Proxy-Authorization headers are used on a hop-by-hop basis.
Better support for Data Compression: HTTP/1.1 provides better support for data compression. In particular, a web client can specify the encoding methods, such as the compression scheme(s), that are supported and preferred by using the Accept-Encoding header.
Better support for language(s): In HTTP/1.1, a web client can specify the language(s) that are acceptable and preferred by using the Accept-Language header.
Support for Content Integrity: In HTTP/1.1, content integrity can be supported by the Content-MD5 header.
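Some of the HTTP/1.1 headers described above can be illustrated with a short Python sketch. The function names, the file name /catalog.pdf and the byte offset are hypothetical. The sketch assumes that the Range header uses the bytes=start- form to ask for everything from a given offset onwards, and that the Content-MD5 value is the Base64 encoding of the MD5 digest of the body.

```python
import base64
import hashlib


def resume_request(host, resource, bytes_received):
    """Build an HTTP/1.1 request that resumes a download from the break point."""
    lines = [
        f"GET {resource} HTTP/1.1",
        # The Host header names the site when several share one IP address.
        f"Host: {host}",
        # Ask only for the part of the file that has not been received yet.
        f"Range: bytes={bytes_received}-",
        # Close the otherwise persistent connection after this response.
        "Connection: close",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


def content_md5(body):
    """Compute a Content-MD5 value for a message body given as bytes."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


print(resume_request("www.vbs.com", "/catalog.pdf", 1024))
print("Content-MD5:", content_md5(b"<html>...</html>"))
```

A receiver can recompute the digest over the body it actually received and compare it with the header value; a mismatch suggests that the content was changed or damaged in transit.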
Expires - specifies when the cookie will expire
Max-age - specifies the cookie's lifetime in seconds
Path - specifies the URLs to which the web client should return the cookie(s)
Secure - specifies that the cookie is returned only if the connection is secure
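The shopping cart exchange described earlier in this section can be sketched from the server's point of view as follows. This is only an illustration using Python's standard http.cookies module; the function names read_cart and add_item and the Item naming rule are our own assumptions based on the example in the text.

```python
from http.cookies import SimpleCookie


def read_cart(cookie_headers):
    """Collect product codes from the values of the Cookie request headers."""
    cart = {}
    for value in cookie_headers:
        jar = SimpleCookie()
        # load() accepts one or more Name=Value pairs separated by semicolons.
        jar.load(value)
        for name, morsel in jar.items():
            if name.startswith("Item"):
                cart[name] = morsel.value
    return cart


def add_item(cart, product_code):
    """Return the Set-Cookie header that stores one more item at the client."""
    name = f"Item{len(cart) + 1}"
    return f"Set-Cookie: {name}={product_code}"


# The client already holds two items and asks for a third one.
cart = read_cart(["Item1=11111", "Item2=22222"])
header = add_item(cart, "33333")
print(header)

# On the next request the client returns all three cookies.
print(read_cart(["Item1=11111", "Item2=22222", "Item3=33333"]))
```

Many browsers send all the cookies in a single Cookie header separated by semicolons; read_cart handles that form as well, because each header value is parsed as a whole.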
Here is a simple example. Suppose that the VBS web server wants to create a cookie called Credit=111 in order to remember the user's credit. The Set-Cookie header is
Set-Cookie: Credit=111; secure; expires=Thursday, 07-Dec-2000 10:00:00 GMT; domain=.vbs.com; path=/
The expiry date of the cookie is 07-Dec-2000, 10:00:00 GMT. The cookie is effective under the domain name vbs.com. Note that path=/ means that the cookie applies to any directory under the root directory of the server.
In the discussion above we have used HTTP version 1.0. Let us see how HTTP/1.1 differs from it.
HTTP/1.1
In HTTP/1.1, many enhancements are included to improve the performance of HTTP, to enhance its functionality, and to eliminate the limitations of HTTP/1.0. Generally speaking, HTTP/1.1 works in a similar manner to HTTP/1.0 except that many additional headers are added, so HTTP/1.1 is backward compatible with HTTP/1.0. Some of the major enhancements are summarized as follows:
Additional Request Methods: Four additional request methods are added as described in Table 7.3. However, they are less commonly used than the GET, POST, and HEAD request methods.
3. How do the web client and the web server communicate with each other?
4. What do you understand by caching?
5. Explain cookies. What additional information can be provided while setting a cookie?
Request method
Description of the request
PUT
Put the specified resource to the web server.
DELETE
Delete the specified resource from the web server.
OPTIONS
Return the options available from the web server.
TRACE
Loop back a request, e.g., for diagnostic purposes.
Summary:
This lesson covered the general architecture of a web-based e-commerce system.
Basically, it consists of the following components: web browser, web server, application server, backend system and the Internet.
A web page is given an address called a Uniform Resource Locator (URL).
The web client and the web server communicate with each other using the Hypertext Transfer Protocol (HTTP).
Data can be passed to the web server by appending it after the URL or embedding it inside the HTTP request message. This can be used to generate dynamic web pages.
As HTTP is stateless, cookies can be used to keep track of a user's state. This is important for many e-commerce applications such as building a shopping cart.
The major enhancements in HTTP/1.1 are: persistent connections and pipelining, efficient use of IP addresses, range requests, cache control, support for proxy authentication, better support for data compression, better support for language(s), support for content integrity, and additional request methods.
Exercise:
1. What are the various components in web system architecture?
2. Explain the following terms:
HTTP
URL
3.231/3A.231/3B.231 Copy Right: Rai University 23
TUTORIAL 2:
Question 1: How is e-commerce defined?
Answer: E-commerce is defined as the value of goods and services sold online. The term online includes the use of the internet, intranet, extranet, as well as proprietary networks that run systems such as Electronic Data Interchange (EDI).
Question 2: Does E-Stats cover the entire economy?
Answer: No. E-Stats covers manufacturing, merchant wholesale trade, retail trade, and selected service industries. These sectors and industries are the same as those covered by existing annual Census Bureau surveys. Sectors and industries not covered include agriculture, mining, construction, and utilities as well as non-merchant wholesalers and parts of the service sector.
Question 3: Is the value of e-commerce included in the estimates of total economic activity provided in your ongoing surveys?
Answer: Yes.
Question 4: Are e-commerce sales of retail businesses with both a physical and internet presence, commonly referred to as brick and click businesses, included in the Electronic Shopping and Mail Order Houses industry estimates?
Answer: If the brick and click business has a separate business unit set up for internet sales and is not selling motor vehicles, then its e-commerce sales are included in the Electronic Shopping and Mail Order Houses industry estimates. Otherwise, the e-commerce sales are included with the NAICS industry classification for the brick part of the company.
Question 5: What is the difference between merchant wholesalers and non-merchant wholesalers?
Answer: Merchant wholesalers take title to the goods they sell and include wholesale merchants, distributors, jobbers, drop shippers, and import/export merchants. These businesses typically maintain their own warehouse, where they receive and handle goods for their customers. Non-merchant wholesalers arrange for the purchase or sale of goods owned by others and do not take title to the goods they sell. Examples of non-merchant wholesalers include manufacturers' sales branches and offices, agents, brokers, commission agents, and electronic marketplaces.
Question 6: Are the sales of online marketplaces (eMarketplaces) included in the e-commerce estimates?
Answer: Only sales from eMarketplaces that take title to the goods they sell are included. Generally, most eMarketplaces arrange for the purchase or sale of goods owned by others and do not take title to the goods they sell. This type of eMarketplace is considered to be a non-merchant wholesaler and would be excluded from the estimates in this report.
Question 7: What other types of Nonstore Retailers are there in addition to Electronic Shopping and Mail Order Houses?
Answer: The category also includes Direct Selling Establishments and Vending Machine Operators. Direct Selling Establishments typically go to the customer's location rather than the customer coming to them (e.g., door-to-door sales, home parties) and include businesses such as heating oil dealers making residential deliveries and mobile food services.
Question 8: Can the e-commerce categories be separated into B2B and B2C?
Answer: Although the surveys did not collect separate data, one can approximate relative shares by using some simplifying assumptions. If one assumes all manufacturing and wholesale is entirely B2B and all retail and service is B2C, then more than 94% of total e-commerce was B2B.
Question 9: How do you account for firms that go out of business?
Answer: Our surveys are updated each year to add new businesses and to delete ones no longer in business. Once we receive notification that a firm has ceased operation we drop it from our survey. Results are included up until the point the firm ceased operation.
Question 10: How frequently will E-Stats be published?
Answer: We plan to publish the E-Stats E-commerce Report annually in March.