Conversation over HTTP

PeterLeow

As a daily routine, you launch a web browser, type some text that reads like http://peterleowblog.com into the address bar, and wait while a web page loads. A typical web page consists of text, images, and links to other web pages. You can then navigate to another web page by clicking on one of those links. For ordinary users, that is all they care about. For web developers, however, there is more to this run-of-the-mill practice than meets the eye: the unseen conversation that takes place between the browser and the web server, triggered by each page request from the user.

The protocol that governs the communication between the browser and the web server is none other than the well-known HTTP (HyperText Transfer Protocol). HTTP, up to version 1.1 which is used throughout this article, is a textual and stateless protocol, meaning that the server does not remember prior communications from one request to the next. A typical HTTP session starts with the client, usually a web browser, establishing a connection to the web server, followed by a series of request-response cycles where, in a nutshell:

Step 1. The client sends its request formatted as an HTTP request message to the web server via a URL, e.g. http://www.example.com, and waits for the response.

Step 2. On receiving a request, the web server at the URL processes the request and sends its answer back to the client formatted as an HTTP response message.

Step 3. Repeat from Step 1 for each subsequent request.

Apart from serving static HTML files, the web server may return dynamic content that is generated on the fly by server-side scripts and database operations, with the help of other software such as the PHP engine and MySQL.

Both request and response messages share a similar structure: each consists of lines of text, separated by CRLF (carriage return followed by line feed), and organized into three sections: a start line at the beginning, a header section in the middle that contains some header fields and ends with a blank line, and a data section at the end that contains the payload, if any.
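To make the three sections concrete, here is a minimal Python sketch that splits a raw HTTP message into its start line, header fields, and payload. The function name split_http_message is made up for illustration, and the sketch ignores details such as repeated header names, so treat it as a demonstration rather than a full parser.

```python
def split_http_message(raw):
    """Split a raw HTTP message (bytes) into start line, headers, and payload.

    Hypothetical helper for illustration only: a repeated header name
    simply overwrites the earlier value.
    """
    # The first blank line (CRLF CRLF) ends the header section.
    head, _, payload = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Each header field is written as "Name: value".
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return start_line, headers, payload


raw_request = (
    b"GET /testsite/index.html HTTP/1.1\r\n"
    b"Host: localhost\r\n"
    b"Accept: text/html\r\n"
    b"\r\n"
)

start_line, headers, payload = split_http_message(raw_request)
print(start_line)  # the start line section
print(headers)     # the header section as a dictionary
print(payload)     # the data section (empty for this request)
```

The same function works on a response message, where the start line is the status line and the payload is the returned content.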

Seeing is Believing

Let's walk through an example: on the document root of your local web server, create a directory called testsite that contains an HTML file named index.html and an image file named ball.png. The HTML file contains the following HTML markup:

<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>HTTP Headers</title>
</head>
<body>
<h1>HTTP Headers</h1>
<img src="ball.png">
</body>
</html>

In a browser, this HTML file will be rendered as a web page as shown in Figure 1.

Figure 1: index.html

To get the page shown in Figure 1, start the web server, then enter this URL in the browser address bar:

http://localhost/testsite/index.html

This way you do not get to see the raw conversation that took place over HTTP. Let's skip the browser part and access the index.html via a telnet session from a text terminal, such as the Command Prompt of Windows, instead. Follow me...

HTTP over Telnet

In the Command Prompt of Windows, type:

telnet localhost 80

and press Enter to open a connection to the web server on port 80. Next, copy and paste the following text (including the ending blank line which is mandatory to signal the end of the header section) to the terminal and hit the Enter key:

GET /testsite/index.html HTTP/1.1
Host: localhost
Accept: text/html

Unknowingly, you have just composed and submitted an HTTP request message (which is usually done by the browser) to the web server.

On receiving the HTTP request message for the index.html file, the web server locates index.html, embeds its content in an HTTP response message, and returns that message to the client. In the terminal, the response may read like the following text:

HTTP/1.1 200 OK
Date: Thu, 16 Nov 2017 16:40:10 GMT
Server: Apache/2.4.23 (Win32) OpenSSL/1.0.2h PHP/5.6.28
Last-Modified: Thu, 16 Nov 2017 16:28:27 GMT
ETag: "a4-55e1c1c1e1486"
Accept-Ranges: bytes
Content-Length: 164
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html

<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>HTTP Headers</title>
</head>
<body>
<h1>HTTP Headers</h1>
<img src="ball.png">
</body>
</html>

View the whole process as animated in Figure 2:

Figure 2: HTTP over Telnet
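If telnet is not at hand, the same raw exchange can be reproduced with a few lines of Python. The sketch below opens a TCP connection, sends the request text, and returns whatever the server sends back. Two things are assumptions of this sketch and not part of the telnet example: the extra Connection: close header, added so that the server hangs up after one response, and the small StandInHandler server, which only exists so that the sketch runs without Apache installed.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# The request typed into telnet, with explicit CRLF line endings.
# "Connection: close" is an addition so that reading until EOF terminates.
REQUEST = (
    "GET /testsite/index.html HTTP/1.1\r\n"
    "Host: localhost\r\n"
    "Accept: text/html\r\n"
    "Connection: close\r\n"
    "\r\n"  # the mandatory blank line that ends the header section
)


def send_raw_request(host, port, request_text):
    """Send raw request text and return the raw response text."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request_text.encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", errors="replace")


class StandInHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the real web server."""

    def do_GET(self):
        body = b"<h1>HTTP Headers</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def demo():
    """Start the stand-in server on a free port and talk to it once."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), StandInHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return send_raw_request("127.0.0.1", server.server_address[1], REQUEST)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

To talk to your own web server instead of the stand-in, call send_raw_request("localhost", 80, REQUEST) while the server is running.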

Note:

An HTTP request message starts with a request line that consists of the method to be applied to the resource (GET), the identifier of the resource (/testsite/index.html), and the HTTP protocol version in use (HTTP/1.1). This is followed by some request headers that give the web server additional information about the request in the form of headerFieldName: value pairs (Host: localhost and Accept: text/html). Observe that the header section ends with a blank line, after which the data section follows if there is any payload to be sent to the server. In this example there is none. We will explore another example of an HTTP request message with payload shortly.


Note:

An HTTP response message starts with a status line that consists of the HTTP protocol version in use (HTTP/1.1), a 3-digit status code (200), and the reason phrase (OK) associated with the status code. This is followed by some response headers that contain additional information about the response in the form of headerFieldName: value pairs (Server: Apache/2.4.23 (Win32) OpenSSL/1.0.2h PHP/5.6.28, Content-Length: 164, etc.). The payload, which is the content of the requested resource (index.html), comes after the blank line (with nothing but a CRLF) that indicates the end of the response header section.
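As a small illustration of how a client might read a status line, the Python sketch below splits it into its three parts and names the class of the status code from its first digit. The helper names parse_status_line and status_class are made up for this example. The standard reason phrases come from Python's built-in http.HTTPStatus.

```python
from http import HTTPStatus


def parse_status_line(status_line):
    """Split a status line such as 'HTTP/1.1 200 OK' into its three parts."""
    parts = status_line.split(" ", 2)
    version, code = parts[0], int(parts[1])
    # The reason phrase may be missing, so fall back to an empty string.
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason


def status_class(code):
    """Name the class of a status code based on its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")


version, code, reason = parse_status_line("HTTP/1.1 200 OK")
print(version, code, reason, "->", status_class(code))

# The standard library knows the usual reason phrase for each code.
print(HTTPStatus(404).phrase)
```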

HTTP over Web

Instead of the cumbersome terminal, you can check out the raw HTTP conversation using the developer tools provided by modern browsers. A quick way to access them is to open a web page in Chrome or Firefox and hit Ctrl+Shift+I on Windows / Linux or Command+Option+I on Mac. You will be greeted with the developer tools window opening at the bottom of the browser, as shown for Chrome in Figure 3.

Figure 3: Developer Tools

The developer tools comprise a set of functional tools that allow developers to, among other things, inspect the DOM, edit CSS, debug scripts, and profile a web page. They are accessible from the list of tabs in the toolbar of the developer tools window. Clicking on a tab opens a corresponding panel where you can perform the tasks provided by that tool. Here, we are only interested in the Network tab. The Network panel under the Network tab provides information on the network activity that occurs on a web page, including HTTP headers, responses, cookies, etc.

With the Network panel open and the All filter option selected, enter http://localhost/testsite/index.html in the browser address bar and hit Enter. You should get a screen that looks like that in Figure 4:

Figure 4: HTTP Request and Response for index.html

The screen in Figure 4 reveals the HTTP request and response messages under the Headers tab pertaining to index.html as indicated in the Name panel. The payload of the response is shown separately under the Response tab as shown in Figure 5:

Figure 5: The Response Payload for index.html

Wait, the story hasn't ended yet. Did you notice ball.png appearing below index.html in the Name panel? If you are curious, click on it, and you will see another set of HTTP request and response messages under the Headers tab that looks like those shown in Figure 6:

Figure 6: HTTP Request and Response for ball.png

When the browser that interprets index.html comes across the image markup, i.e. <img src="ball.png">, it initiates a new request-response cycle with the web server to request that image resource. The address of the web page from which the link to the requested image originates (index.html in this case, linking to ball.png) can be found in the request header field called Referer as shown:

Referer: http://localhost/testsite/index.html

The same goes for any external resources, such as audio, video, CSS files, JavaScript files, plug-ins, and so on, that are specified in a web page. In other words, a complete download of a web page may take several request-response cycles, depending on the number of external resources specified. This is illustrated in Figure 7:

Figure 7: HTTP Request-Response Cycles
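To get a feel for how many cycles a page needs, the following Python sketch scans a page's markup for external resources and resolves each one against the page address, much as a browser would before requesting it. The class name ResourceCollector is made up for this example, and the list of tags it checks is a deliberately short, non-exhaustive assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect the URLs of external resources referenced by a page.

    Hypothetical helper: it only looks at a handful of tags, whereas a
    real browser follows many more rules.
    """

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link":
            # Only style sheets are counted among <link> tags here.
            url = attrs.get("href") if attrs.get("rel") == "stylesheet" else None
        elif tag in ("img", "script", "audio", "video"):
            url = attrs.get("src")
        else:
            url = None
        if url:
            # Resolve a relative address such as "ball.png" against the page URL.
            self.resources.append(urljoin(self.page_url, url))


INDEX_HTML = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>HTTP Headers</title>
</head>
<body>
<h1>HTTP Headers</h1>
<img src="ball.png">
</body>
</html>"""

collector = ResourceCollector("http://localhost/testsite/index.html")
collector.feed(INDEX_HTML)

print(collector.resources)
# One cycle for the page itself plus one per external resource.
print("request-response cycles:", 1 + len(collector.resources))
```

For index.html this gives two cycles, matching the two entries, index.html and ball.png, seen in the Name panel.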

Mimicking HTTP's Conversation in Real World

If the web browser and the web server were real human beings, how would the HTTP conversation have taken place in natural language? Try this:

Browser: Hi, I'm Mozilla (User-Agent: Mozilla/5.0), can you send me the HTML file (Accept: text/html) at http://localhost/testsite/index.html (GET /testsite/index.html HTTP/1.1)?

Server: Hi, I'm Apache (Server: Apache/2.4.23). I have succeeded in finding the file (HTTP/1.1 200 OK). It is in HTML text (Content-Type: text/html). It reads ....(payload).

Browser: Hi, I'm Mozilla (User-Agent: Mozilla/5.0), may I have the image file (Accept: image/*) at http://localhost/testsite/ball.png (GET /testsite/ball.png HTTP/1.1) which is referenced in http://localhost/testsite/index.html (Referer: http://localhost/testsite/index.html)?

Server: Hi, I'm Apache (Server: Apache/2.4.23). I have succeeded in finding the file (HTTP/1.1 200 OK). It is an image (Content-Type: image/png).

In the pseudo conversation above, parts of the sentences are annotated with the corresponding HTTP request line, status line, or headers in parentheses. Computers are generally weak at handling unstructured natural language. To overcome this, HTTP request and response messages are organized and structured into different header fields, each of which carries a pre-defined role and meaning in the whole HTTP process.

HTTP Request with Payload

So far, you have seen an example of an HTTP request without payload. Let's walk through one that has a payload. A request with payload is usually initiated by a user submitting data via an HTML form. In the testsite directory, add an HTML file called enquiry.html that contains the following HTML markup:

<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Books Enquiry</title>
</head>
<body>
<h1>Books Enquiry</h1>
<form action="response.php" method="get">
  My Name:<br>
  <input type="text" name="name">
  <br><br>
  Title of Book:<br>
  <input type="text" name="booktitle">
  <br><br>
  <input type="submit" value="Submit">
</form> 
</body>
</html>

In a browser, this HTML file will be rendered as a web page with two text fields and one submit button as shown in Figure 8.

Figure 8: enquiry.html

Now, enter a name and a book title, say Peter Leow and Hands-on with PHP, into the respective text fields, and hit the Submit button. This initiates an HTTP request to the web server using the GET method (specified in the method attribute of the <form> tag) with the entered name and book title as payload. The request is picked up by response.php (specified in the action attribute of the <form> tag), which contains the following script:

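<?php
// This is very rudimentary code for demo purposes only.
// Read the submitted fields (empty string if missing) and escape them
// so that user input cannot inject HTML into the page.
$name = htmlspecialchars(isset($_REQUEST["name"]) ? $_REQUEST["name"] : "");
$booktitle = htmlspecialchars(isset($_REQUEST["booktitle"]) ? $_REQUEST["booktitle"] : "");

// Assume there is code here that searches a database and finds the book

echo "Dear $name<br><br>The book titled \"$booktitle\" is currently on loan.";
?>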

On receiving the request, response.php reads the book title received, supposedly searches for it in a database, and then generates a reply based on the outcome of the search. An example output of response.php is shown in Figure 9.

Figure 9: response.php

As shown in Figure 9, payload data sent via the GET method are represented in the form of name=value pairs (name=Peter+Leow&booktitle=Hands-on+with+PHP), delimited by the & symbol, and URL-encoded (for example, spaces are replaced with +). They are visibly appended to the URL as query string parameters, and they appear on the request line of the HTTP request message as follows:

GET /testsite/response.php?name=Peter+Leow&booktitle=Hands-on+with+PHP HTTP/1.1
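The encoding that the browser applies to the form fields can be reproduced with Python's standard urllib.parse module. The sketch below builds the query string from the two field values, places it on a request line, and then decodes it again the way a server-side script would before using the values. The variable names are chosen for this example only.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The two fields of enquiry.html with the values entered by the user.
form_data = {"name": "Peter Leow", "booktitle": "Hands-on with PHP"}

# urlencode joins name=value pairs with & and replaces spaces with +.
query_string = urlencode(form_data)
request_target = "/testsite/response.php?" + query_string
request_line = "GET " + request_target + " HTTP/1.1"
print(request_line)

# On the receiving side, the query string is split and decoded again.
received = parse_qs(urlsplit(request_target).query)
print(received)
```

Note that parse_qs returns a list of values for each name, because the same name may legitimately appear more than once in a query string.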

In place of enquiry.html, you can also call response.php with the data via a telnet session, as you did previously.

Copy and paste the following text (including the ending blank line which is mandatory to signal the end of the header section) to the telnet console and hit the Enter key:

GET /testsite/response.php?name=Peter+Leow&booktitle=Hands-on+with+PHP HTTP/1.1
Host: localhost
Accept: text/html

You should receive the following response from response.php:

HTTP/1.1 200 OK
Date: Thu, 16 Nov 2017 17:40:10 GMT
Server: Apache/2.4.23 (Win32) OpenSSL/1.0.2h PHP/5.6.28
X-Powered-By: PHP/5.6.28
Content-Length: 80
Content-Type: text/html; charset=UTF-8

Dear Peter Leow<br><br>The book titled "Hands-on with PHP" is currently on loan.

Alternatively, you can send data to the web server using the POST method. The following HTTP request message uses the POST method to send data as payload to response.php via a telnet session:

POST /testsite/response.php HTTP/1.1
Host: localhost
Content-Length: 43
Content-Type: application/x-www-form-urlencoded

name=Peter+Leow&booktitle=Hands-on+with+PHP

Like its GET counterpart, payload data sent via the POST method are represented in the form of name=value pairs (name=Peter+Leow&booktitle=Hands-on+with+PHP), delimited by the & symbol, and URL-encoded (spaces are replaced with +). Unlike its GET counterpart, where the data are visibly appended to the URL, payload data sent via the POST method are placed after the blank line that ends the request header section, so they do not show up in the address bar. The Content-Length header (43 here) tells the server how many bytes of payload to expect.
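The POST message above can also be assembled programmatically, which makes it easy to see where the Content-Length value comes from. In this Python sketch the function name build_post_request is made up for illustration. It encodes the form fields, counts the bytes of the payload, and puts the payload after the blank line.

```python
from urllib.parse import urlencode


def build_post_request(path, host, form_data):
    """Build a raw HTTP POST request message (bytes) for form data."""
    # The payload is the URL-encoded form data.
    body = urlencode(form_data).encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {len(body)}\r\n"  # number of payload bytes
        "Content-Type: application/x-www-form-urlencoded\r\n"
        "\r\n"  # blank line: end of the header section
    )
    return head.encode("ascii") + body


message = build_post_request(
    "/testsite/response.php",
    "localhost",
    {"name": "Peter Leow", "booktitle": "Hands-on with PHP"},
)
print(message.decode("ascii"))
```

If the declared Content-Length does not match the actual number of payload bytes, the server may wait for more data or cut the payload short, which is why it is computed here instead of typed by hand.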

If you were to mimic an HTTP request with payload in the real world, it might go like this:

enquiry.html: Hi, I'm Mozilla (User-Agent: Mozilla/5.0), my name is Peter Leow (name=Peter+Leow), I am looking for this book titled "Hands-on with PHP" (booktitle=Hands-on+with+PHP). Is it available in your library (GET /testsite/response.php?name=Peter+Leow&booktitle=Hands-on+with+PHP HTTP/1.1)?

(On receiving the request from enquiry.html, response.php looked up the book and found out that it was on loan.)

response.php: Dear Peter Leow, I'm response.php (HTTP/1.1 200 OK) from Apache (Server: Apache/2.4.23). The book titled "Hands-on with PHP" which you are looking for is currently on loan (Dear Peter Leow<br><br>The book titled "Hands-on with PHP" is currently on loan.).

End of Conversation

More often than not, users pay no attention to the raw conversation that takes place over HTTP, since it just happens and works out of the box between the browser and the web server. For developers, however, understanding the mechanism of HTTP enables them to, among other things,

  • Set values of response headers through server-side scripting to implement certain useful functionalities that would otherwise not be possible. Some of these functionalities include redirecting the browser to a specific URL and downloading resources as files instead of displaying them on screen.

  • Perform analytics based on data gathered from request headers received by the server.
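As a rough illustration of both points, the Python sketch below builds two response messages whose behaviour is driven by headers alone, and tallies a request header across several requests. In practice a server-side script would set these headers through its own language's facilities. The function names, the choice of the 302 status code for the redirect, and the sample data are assumptions made for this example.

```python
from collections import Counter


def build_response(status_line, headers, body=b""):
    """Assemble a raw HTTP response message (bytes)."""
    lines = [status_line]
    lines += [f"{name}: {value}" for name, value in headers]
    lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def redirect_response(location):
    """The Location header tells the browser which URL to request next."""
    return build_response("HTTP/1.1 302 Found", [("Location", location)])


def download_response(filename, data):
    """Content-Disposition: attachment asks the browser to save the payload."""
    headers = [
        ("Content-Type", "application/octet-stream"),
        ("Content-Disposition", f'attachment; filename="{filename}"'),
    ]
    return build_response("HTTP/1.1 200 OK", headers, data)


def tally_header(requests_headers, header_name):
    """Count how often each value of a request header was seen."""
    return Counter(h.get(header_name, "(missing)") for h in requests_headers)


print(redirect_response("http://localhost/testsite/index.html").decode("ascii"))
print(download_response("ball.png", b"...image bytes...")[:120])

# Made-up sample of request headers, as a server might have logged them.
sample = [
    {"User-Agent": "Mozilla/5.0"},
    {"User-Agent": "Mozilla/5.0"},
    {"User-Agent": "curl/8.0"},
]
print(tally_header(sample, "User-Agent"))
```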

The article Conversation over HTTP appeared first on Peter Leow's Code Blog.
