Using OWASP Reform for Output Encoding

In a previous post I discussed the shortcomings of the Apache Commons Lang StringEscapeUtils class with regard to output encoding in a JavaScript context. In that same post, I mentioned two other projects that do output encoding correctly in this context, as well as in others. One of these was OWASP Reform, which is now part of the OWASP Encoding Project. In this post I hope to give a few short examples that will help you get started using this project.

The first thing that you will need to do is visit the OWASP Encoding Project webpage and download the zip archive of the source. As of this writing, the latest version is Reform 0.12. After you unzip the archive, you will notice that Reform has actually been ported to many different languages. In my examples I will be focusing on the Java implementation, but the concepts should apply to any implementation you choose to use.

As the project is distributed as source, I copied the Reform.java file into a new Eclipse project. From there, I created a very simple JSP that should get you started. Here is the code:

<%@ page language="java" contentType="text/html; charset=ISO-8859-1"
    pageEncoding="ISO-8859-1"%>
<%@ page import="org.owasp.reform.Reform"%>

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>OWASP Reform Example</title>
<script>
var jsParam = <%= Reform.JsString((String) request.getParameter("jsParam")) %>;
alert("You passed in: " + jsParam);
</script>
</head>
<body>
You passed in: <%= Reform.HtmlEncode((String) request.getParameter("htmlParam")) %>
</body>
</html>

As you can see, all that is required is to import the correct library, in this case org.owasp.reform.Reform, and call the methods that are right for the context of the data. The most important thing to remember about output encoding is that context matters.

Reform also has other encoding functions for contexts such as HTML attributes, XML, XML attributes, and VBScript. I hope this helps.
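
To make the "context matters" point a little more concrete, here is a minimal sketch of what an HTML-context encoder does in principle. This is not Reform's source; the class and method names (HtmlEncodeSketch, htmlEncode) are my own placeholders, and the set of characters left untouched (ASCII letters, digits, and the space) is an assumption chosen to keep the example short.

```java
class HtmlEncodeSketch {

    // Hypothetical helper, not part of Reform: leaves ASCII letters, digits,
    // and spaces alone and turns everything else into a decimal entity (&#NN;).
    static String htmlEncode(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            boolean safe = (cp >= 'a' && cp <= 'z')
                    || (cp >= 'A' && cp <= 'Z')
                    || (cp >= '0' && cp <= '9')
                    || cp == ' ';
            if (safe) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: &#60;script&#62;alert&#40;1&#41;&#60;&#47;script&#62;
        System.out.println(htmlEncode("<script>alert(1)</script>"));
    }
}
```

The result is safe to drop into an HTML body, but the same string would be the wrong thing to emit inside a script block, which is why a separate JavaScript encoder exists.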

Effective Account Lockout

If your application processes credit card information of any type, there are two requirements in PCI DSS 1.2 that should scare you to death - "8.5.13 Limit repeated access attempts by locking out the user ID after not more than six attempts" and "8.5.14 Set the lockout duration to a minimum of 30 minutes or until [an] administrator enables the user ID."

In my opinion, account lockout is one of those security controls that should be used sparingly and only in the most extreme circumstances. My reasoning for this stance is that almost any lockout mechanism can be used by an attacker to purposefully deny legitimate users access to a site. In its simplest form, this attack can be accomplished with a simple shell script that submits the proper parameter names over HTTP or HTTPS through cURL. Knowing this, you should be asking yourself, "Why would PCI require me to implement something that could have such a dramatic effect on the availability of my site?" The short answer, from what I can glean, is that they are more concerned about preventing a breach of sensitive information than about you performing any business transactions. In essence, the lockout becomes the ultimate fail-closed approach to security, in that if no one can access the site then the data stays safe.

So how can you overcome this obstacle, while still maintaining your compliance?  Here are a few possibilities that I have come up with.  Note that with any of the solutions below, the number of failed attempts needs to be reset to 0 upon successful login.

1. Implement the lockout mechanism to maintain a count of the number of failed login attempts in the user's unauthenticated session. If this count ever reaches the threshold (six in this case), then lock the account by setting some flag in the database. Although this appears to work, it has a major design flaw. The count of failed attempts is maintained in the session until it is written to the database when the threshold is met. For all but the most incompetent attacker, this implementation is easily bypassed. All that is required is to submit five passwords, send a new request without the session cookie (which effectively gives you a new session token with the number of failed attempts set at 0), and submit five more passwords with the new session, repeating until the password is found. Needless to say, this implementation is broken.

2. Modify the mechanism above to store the number of failed attempts in the database upon each failed attempt. This actually gets us closer to what we really want, because it allows us to persist the number of failed attempts, but it can increase overhead dramatically with numerous small updates to the database, which are not always efficient.

3. Implement a mechanism that does not perform a lockout, but "meets the spirit of the control."  (Disclaimer: By no means am I trying to say here that you should try to skirt PCI, or that it is even possible.  What I am trying to show here is something that you may be able to implement that demonstrates you are implementing a compensating control that effectively provides the same level of protection as an account lockout mechanism without the potentially large negative impact to your legitimate users.) 

Obviously, the control is being required so that sites will have something to prevent an attacker from easily brute forcing users' passwords. My question is, what about a mechanism that progressively delays the success/fail response, thereby making an attacker wait longer with each request to find out if the supplied password was correct? Here is a sample with completely arbitrary delay offsets:
    Request 1  -->  Immediately return success/fail
    Request 2  -->  Wait 2 seconds, then return success/fail
    Request 3  -->  Wait 5 seconds, then return success/fail
    Request 4  -->  Wait 10 seconds, then return success/fail
    Request 5  -->  Wait 20 seconds, then return success/fail
    Request 6  -->  Wait 40 seconds, then return success/fail
    Request 7  -->  Wait 80 seconds, then return success/fail
    Request 8+  -->  Wait 160 seconds, then return success/fail

The key to this entire scheme is that the attacker still has to wait the full time delay before being told whether the password was correct or not. At no time other than the first request should the mechanism immediately return.

4. Modify option #3 and combine it with option #2 in such a way that the database is updated with each failed attempt, but you still maintain the progressive delay, basing the time to wait on the latest value pulled from the database. In addition, you could still lock out the account if a threshold was ever met. Here is some pseudocode with completely arbitrary delay offsets and an assumed threshold of 6:
    Request  -->  if count < 1 then
                           count => 1; 
                           return success/fail;
                       else if count < 2 then 
                           count => 2; wait 2 seconds;
                           return success/fail;
                       else if count < 3 then 
                           count => 3; wait 5 seconds;
                           return success/fail;
                       else if count < 4 then 
                           count => 4; wait 10 seconds;
                           return success/fail;
                       else if count < 5 then 
                           count => 5; wait 20 seconds;
                           return success/fail;
                       else if count < 6 then
                           count => 6; account_locked_flg => 1;
                           wait 40 seconds;
                           return success/fail;
                       else
                           count => count + 1;
                           wait 40 seconds;
                           return account_locked_message;
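
For anyone who would rather read Java than pseudocode, here is a rough sketch of option #4. Everything in it is an assumption for illustration: the class name ProgressiveDelaySketch, the method names, and above all the in-memory map, which stands in for the database table that would hold the persisted count. The delays are the same arbitrary offsets as above. The method only returns the number of seconds to wait, so the caller is responsible for actually delaying the response. Unlike the pseudocode, the sketch spells out the reset to 0 on a successful login.

```java
import java.util.HashMap;
import java.util.Map;

class ProgressiveDelaySketch {

    static final int LOCKOUT_THRESHOLD = 6;

    // Arbitrary delays in seconds, indexed by the number of prior failed attempts.
    static final int[] DELAY_SECONDS = {0, 2, 5, 10, 20, 40};

    // Stand-in for the database table that persists the failed-attempt count.
    private final Map<String, Integer> failedAttempts = new HashMap<>();

    // Delay for a given number of prior failures, capped at the last entry.
    int delayFor(int priorFailures) {
        int index = Math.min(priorFailures, DELAY_SECONDS.length - 1);
        return DELAY_SECONDS[index];
    }

    boolean isLocked(String userId) {
        return failedAttempts.getOrDefault(userId, 0) >= LOCKOUT_THRESHOLD;
    }

    // Records one login attempt and returns how many seconds the caller
    // should wait before sending any response back to the client.
    int recordAttempt(String userId, boolean passwordCorrect) {
        int prior = failedAttempts.getOrDefault(userId, 0);
        int delay = delayFor(prior);
        if (prior < LOCKOUT_THRESHOLD && passwordCorrect) {
            failedAttempts.put(userId, 0);          // reset on successful login
        } else {
            failedAttempts.put(userId, prior + 1);  // persist every failure
        }
        return delay;
    }

    public static void main(String[] args) {
        ProgressiveDelaySketch sketch = new ProgressiveDelaySketch();
        for (int attempt = 1; attempt <= 7; attempt++) {
            int delay = sketch.recordAttempt("alice", false);
            System.out.println("Attempt " + attempt + ": wait " + delay
                    + "s, locked=" + sketch.isLocked("alice"));
        }
    }
}
```

Because the count lives outside the session, throwing away the session cookie no longer resets anything, which is the flaw that sinks option #1.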

Overall, I like option #3 the best, but if you want to combine these options in other ways I haven't thought of here, or have other solutions that you have come up with, feel free to drop me a comment.

Getting Output Encoding Right

As you may know, one of the best ways to protect your application from cross-site scripting is to use proper output encoding when displaying user-editable data on the screen. Here, I attempt to show that although this sounds simple enough, it can actually be a very daunting task.

When it comes to output encoding in a web application, there are two main contexts that the developer needs to be aware of.  These are the HTML context and the JavaScript context.  These contexts are special in that the encoding scheme used for one does not apply to the other.  As such, care should be taken to ensure that the proper scheme is applied correctly on every page.

So, for the sake of argument, let us say that we are doing an assessment of a Java-based web application and discover that the developer has written code similar to the following:

<%@ page language="java" contentType="text/html; charset=UTF-8"
    pageEncoding="UTF-8"%>

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Test Encode</title>
<script>
var param = <%= request.getParameter("js") %>;
alert(param);
</script>
</head>
<body>
There is JavaScript on this page.
</body>
</html>



Obviously, this example code is vulnerable to cross-site scripting. So, in our report we point this out, and afterwards the developer comes back with the following change to the code:



...
<script>
var param = <%= StringEscapeUtils.escapeJavaScript(request.getParameter("js")) %>;
alert(param);
</script>
...


Upon inspection, it appears as though the developer has done the right thing and found a widely distributed library that will do the output encoding for him. This is almost always better than trying to write a library/function yourself unless you REALLY understand the subject matter. In this case, the developer chose the Apache Commons Lang StringEscapeUtils class. This appears to be a good choice as it provides methods for both of the contexts mentioned above, namely escapeHtml and escapeJavaScript. There is only one problem: the escapeJavaScript method has a very poor implementation. Why? Well, the code comments say it all in this case:



/**
 * ...
 * <p>The only difference between Java strings and JavaScript
 * strings is that in JavaScript, a single quote must be
 * escaped.</p>
 * ...
 */



What?! Maybe if you are parsing, but in no way is that statement true when it comes to encoding. According to the JavaScript reference, encoding in a JavaScript context should escape all characters with a character code less than 256 in the format \xHH, where HH is the hexadecimal representation of the character code. For example, the null character, code 0, becomes \x00, the newline character, code 10, becomes \x0A, etc. So where does StringEscapeUtils go wrong? Let's look at the code.



The escapeJavaScript method is defined as follows:

public static String escapeJavaScript(String str) {
    return escapeJavaStyleString(str, true);
}



After more looking, we come to the function in question:



private static void escapeJavaStyleString(Writer out, String str,
        boolean escapeSingleQuote) throws IOException {

    if (out == null) {
        throw new IllegalArgumentException("The Writer must not be null");
    }
    if (str == null) {
        return;
    }

    int sz;
    sz = str.length();
    for (int i = 0; i < sz; i++) {
        char ch = str.charAt(i);

        // handle unicode
        if (ch > 0xfff) { out.write("\\u" + hex(ch)); }
        else if (ch > 0xff) { out.write("\\u0" + hex(ch)); }
        else if (ch > 0x7f) { out.write("\\u00" + hex(ch)); }
        else if (ch < 32) {
            switch (ch) {
                case '\b': out.write('\\'); out.write('b');
                    break;
                case '\n': out.write('\\'); out.write('n');
                    break;
                case '\t': out.write('\\'); out.write('t');
                    break;
                case '\f': out.write('\\'); out.write('f');
                    break;
                case '\r': out.write('\\'); out.write('r');
                    break;
                default :
                    if (ch > 0xf) { out.write("\\u00" + hex(ch)); }
                    else { out.write("\\u000" + hex(ch)); }
                    break;
            }
        }
        else {
            switch (ch) {
                case '\'': if (escapeSingleQuote) {
                        out.write('\\');
                    }
                    out.write('\'');
                    break;
                case '"': out.write('\\'); out.write('"');
                    break;
                case '\\': out.write('\\'); out.write('\\');
                    break;
                default : out.write(ch);
                    break;
            }
        }
    }
}


If you look closely at this code, you will notice that it fails to follow the specification for JavaScript encoding. This code is only worried about the ' (single quote), " (double quote), and \ (backslash) as well as a few of the control characters. The problem is that it fails to consider characters that have special meaning in JavaScript such as ; (semicolon), ( ) (parentheses), etc.



So, what does this mean?  A call to the application with the js parameter set to 1; alert(String.fromCharCode(88,83,83)) results in the following code being generated on the client:



...
<script>
  var param = 1; alert(String.fromCharCode(88,83,83));
  alert(param);
</script>
...



Now, there are two alerts happening instead of one.  Change this attack from alert(something) to eval(something really bad) and you still face the same problem.



So what is the solution?  Don't use StringEscapeUtils.escapeJavaScript() and hope to be safe from XSS in the JavaScript context.  Use something more robust (and implemented correctly) like the OWASP Reform project or the OWASP ESAPI.
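
To show what a stricter encoder does differently, here is a minimal sketch of the \xHH scheme described earlier. It is not the source of Reform or ESAPI; the names JsEncodeSketch and jsString are hypothetical, and the choice to pass only ASCII letters and digits through unescaped, and to wrap the result in single quotes, is my assumption for the sake of a short example.

```java
class JsEncodeSketch {

    // Hypothetical helper: returns a single-quoted JavaScript string literal.
    // ASCII letters and digits pass through; any other character below 256
    // becomes \xHH, and anything at or above 256 becomes \uHHHH.
    static String jsString(String input) {
        if (input == null) {
            return "''";
        }
        StringBuilder out = new StringBuilder("'");
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            boolean alphanumeric = (ch >= 'a' && ch <= 'z')
                    || (ch >= 'A' && ch <= 'Z')
                    || (ch >= '0' && ch <= '9');
            if (alphanumeric) {
                out.append(ch);
            } else if (ch < 256) {
                out.append(String.format("\\x%02X", (int) ch));
            } else {
                out.append(String.format("\\u%04X", (int) ch));
            }
        }
        return out.append("'").toString();
    }

    public static void main(String[] args) {
        // prints: '1\x3B\x20alert\x281\x29'
        System.out.println(jsString("1; alert(1)"));
    }
}
```

With an encoder along these lines, the payload from the example above ends up as an inert string literal assigned to param. The semicolon and parentheses never reach the JavaScript parser as syntax, so only the original alert runs.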

Protection Against Forceful Browsing

After a recent assessment I performed, I have been thinking about effective ways for an application to mitigate the risk of forceful browsing without adding useless overhead or otherwise complicated validation routines that may or may not work in the end. If you are unfamiliar with forceful browsing, see the CAPEC site for more information.

The application in question had a certain flow implemented to add a user to the system.  As this was obviously an administrative function, the developers validated that the user accessing the page was in the administrator role - at least they did on the first page.  Due to the lack of role checking on the subsequent pages in the flow, it was possible to add any user to the system by simply making a GET request to the last page in the flow.  Simple.

This bug got me to thinking.  What is the generic problem that leads to applications having problems with forceful browsing?  To me it comes down to a broken access control implementation.  If you look at what forceful browsing is really exploiting, it comes down to a user pointing his/her browser to a URL that they guess may be there and will allow them to do things they do not have permission to do.  In situations where the client is not a browser but some thin client GUI app, the scenario is the same.  The only difference would be that the request from the client is intercepted and submitted to a differing back-end system.  In either case, the same goal is achieved.

So, how do you fix this?  The answer is quite simple - implement authorization at every entry point into the application/system.  Remember, a flow is only a flow because that is the way you coded the data submissions.  Unless you are utilizing some BPM Suite that enforces process step checking for you (or you implemented similar checks yourself), your flow can happen in almost any order a user wants it to.
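
As a rough illustration of what "authorization at every entry point" can look like, here is a small sketch. The URL paths, role names, and the class EntryPointAuthSketch are all made up for the example, and a real application would more likely hang this check off a servlet filter or its framework's security layer than call a static method. The two ideas it tries to show are that every step of the flow carries its own role requirement, and that any path nobody thought to list is denied by default.

```java
import java.util.Map;
import java.util.Set;

class EntryPointAuthSketch {

    // Hypothetical entry points: each step of the add-user flow is listed
    // with the role it requires, not just the first page.
    static final Map<String, String> REQUIRED_ROLE = Map.of(
            "/admin/addUser/step1", "administrator",
            "/admin/addUser/step2", "administrator",
            "/admin/addUser/confirm", "administrator",
            "/home", "user");

    // Default deny: a path with no entry in the table is never authorized.
    static boolean isAuthorized(String path, Set<String> userRoles) {
        String required = REQUIRED_ROLE.get(path);
        return required != null && userRoles.contains(required);
    }

    public static void main(String[] args) {
        Set<String> ordinaryUser = Set.of("user");
        // Jumping straight to the last page of the flow is rejected.
        System.out.println(isAuthorized("/admin/addUser/confirm", ordinaryUser));
    }
}
```

Run as written, the check on the last page of the flow comes back false for the ordinary user, which is exactly the request that succeeded in the application I assessed.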

At the end of the day, solutions to forceful browsing should not be complicated. On the other hand, they should not be as simple as checking the Referer header either. Keep access control in mind when writing your applications. If you do, they will, by nature of you really putting thought into it, be more secure, as long as you implement what you carefully thought about, that is.