10 Steps to Supportable Web Applications

September 23rd, 2008
So I've been developing large-scale web applications for a few years now, and I thought I'd share a few points I believe every coder needs to think about.

If you're just starting out you won't know why you NEED to do these things, but anybody who has had to revisit their own or somebody else's poor code will soon tell you that it's far better to get it right and scalable from the start than to go back over everything and try to change it 6 months down the line.

So, in no particular order...

1) Don't rush to start

Think things through before you write anything. This goes hand in hand with point 10: if you think something through to begin with and do it right the first time, it'll be easy to expand upon and improve later.
If you rush straight in you'll get halfway through, get stuck and have to start all over again.

2) Plan to scale the codebase

Many problems in developing applications come from having "organic" code. We've all seen it: code that just grows and grows into a behemoth of unsupportability.
All projects, whether big or small, need to be factored into well organised, scalable applications, From! The! Start!
No silly naming of functions, such as having a function called "set()". It needs to be "set_foo()" at a minimum so that you can have "set_bar()" as well. Having just "set()" is going to get confusing VERY quickly, so don't do it!
Consider using an MVC-like framework, either your own or a lightweight open source one.

3) Separate application logic from markup

Now doing this may sound simple, but doing it *well* will inevitably lead you to an MVC framework or a templating "engine" like Smarty.
This will look much prettier and give you (or a non-techy designer) complete freedom to change your design without touching any code that could potentially break the application.
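
As a rough illustration of the idea in plain PHP, without Smarty or a framework, the sketch below keeps data gathering and markup in two separate functions. The names get_latest_posts() and render_post_list() and the sample data are made up for this example.

```php
<?php
// Application logic: builds plain data and knows nothing about HTML.
// get_latest_posts() and its sample data are hypothetical.
function get_latest_posts() {
    return array(
        array('title' => 'Hello <world>', 'date' => '2008-09-23'),
        array('title' => 'Second post', 'date' => '2008-09-18'),
    );
}

// Presentation: the only place that produces markup.
// Every value is escaped on output with htmlspecialchars().
function render_post_list(array $posts) {
    $html = "<ul>\n";
    foreach ($posts as $post) {
        $html .= '<li>' . htmlspecialchars($post['title'], ENT_QUOTES, 'UTF-8')
               . ' (' . htmlspecialchars($post['date'], ENT_QUOTES, 'UTF-8') . ")</li>\n";
    }
    return $html . "</ul>\n";
}

echo render_post_list(get_latest_posts());
```

A designer can change render_post_list() (or the template file it would become in a real project) without going anywhere near the code that fetches the posts.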

PHP Parser - Filtering Cross Site Scripting (XSS)

September 18th, 2008
So for the last few days I've been seriously stressing about the implications of XSS (cross-site scripting) in a project that I've been working on. If you're a web developer and you don't know what XSS is all about, you're in trouble. Google it.

There's also a great website over at http://ha.ckers.org/xss.html that gives you a huge list of many of the known XSS methods.

There is a plethora of PHP classes out there that work on forums and such with a limited subset of XHTML, but I need to cover as much as possible. Before people start shouting at me: an approach using BBCode or Textile just isn't possible here (and it's ugly, don't get me started).

Whilst trying to find a decent PHP function to parse out these threats in the simplest manner possible, I ended up combining a few to come up with what's below.

Download file (strip_xss.txt)
function strip_xss($str, $allowed=null){
	if (!$allowed){
		$allowed = array('<h1>','<h2>','<h3>','<h4>','<h5>','<h6>','<b>','<i>','<u>','<a>','<ul>','<ol>','<li>','<pre>','<hr>','<blockquote>','<img>','<font>','<span>','<table>','<thead>','<th>','<tr>','<td>','<em>','<strong>','<applet>','<div>','<center>','<ins>','<del>','<kbd>','<dd>','<tbody>','<tfoot>','<big>','<button>','<input>','<option>','<textarea>','<fieldset>','<form>','<legend>','<code>');
	}
	$disabled = array('onabort', 'onactivate', 'onafterprint', 'onafterupdate', 'onbeforeactivate', 'onbeforecopy', 'onbeforecut', 'onbeforedeactivate', 'onbeforeeditfocus', 'onbeforepaste', 'onbeforeprint', 'onbeforeunload', 'onbeforeupdate', 'onblur', 'onbounce', 'oncellchange', 'onchange', 'onclick', 'oncontextmenu', 'oncontrolselect', 'oncopy', 'oncut', 'ondataavailable', 'ondatasetchanged', 'ondatasetcomplete', 'ondblclick', 'ondeactivate', 'ondrag', 'ondragdrop', 'ondragend', 'ondragenter', 'ondragleave', 'ondragover', 'ondragstart', 'ondrop', 'onerror', 'onerrorupdate', 'onfilterupdate', 'onfinish', 'onfocus', 'onfocusin', 'onfocusout', 'onhelp', 'onkeydown', 'onkeypress', 'onkeyup', 'onlayoutcomplete', 'onload', 'onlosecapture', 'onmousedown', 'onmouseenter', 'onmouseleave', 'onmousemove', 'onmouseout', 'onmouseover', 'onmouseup', 'onmousewheel', 'onmove', 'onmoveend', 'onmovestart', 'onpaste', 'onpropertychange', 'onreadystatechange', 'onreset', 'onresize', 'onresizeend', 'onresizestart', 'onrowexit', 'onrowsdelete', 'onrowsinserted', 'onscroll', 'onselect', 'onselectionchange', 'onselectstart', 'onstart', 'onstop', 'onsubmit', 'onunload');
	
	// remove all non-printable characters; TAB (\x09), LF (\x0a) and CR (\x0d) are allowed
	// this prevents some character re-spacing such as <java\0script>
	// note that you have to handle splits with \n, \r, and \t later since they *are* allowed in some inputs
	// (no commas inside the character class: there they would match literally and strip commas from the input)
	$str = preg_replace('/[\x00-\x08\x0b\x0c\x0e-\x1f]/', '', $str);
	
	// straight replacements, the user should never need these since they're normal characters
	// this prevents attacks like <IMG SRC=&#X40&#X61&#X76&#X61&#X73&#X63&#X72&#X69&#X70&#X74&#X3A&#X61&#X6C&#X65&#X72&#X74&#X28&#X27&#X58&#X53&#X53&#X27&#X29>
	$search = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890!@#$%^&*()~`";:?+/={}[]-_|\'\\';
	for ($i = 0; $i < strlen($search); $i++) {
		// hex form, e.g. &#x0040 for @: 0{0,8} matches up to eight optional padding zeros, ;? matches the optional trailing ;
		$str = preg_replace('/(&#[xX]0{0,8}'.dechex(ord($search[$i])).';?)/i', $search[$i], $str);
		// decimal form, e.g. &#00064 for @, with the same optional padding and optional trailing ;
		$str = preg_replace('/(&#0{0,8}'.ord($search[$i]).';?)/', $search[$i], $str);
	}
	
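	// strip_tags() removes every tag that is not whitelisted; the callback then cleans the inside of each tag that is left
	// (preg_replace_callback replaces the /e modifier, which was removed in PHP 7)
	$events = implode('|', $disabled);
	$str = preg_replace_callback('/<(.*?)>/', function ($m) use ($events) {
		// drop javascript: URLs and quoted on* handlers, then collapse runs of whitespace
		$inner = preg_replace(array('/javascript:[^"\']*/i', '/(' . $events . ')[ \t\n]*=[ \t\n]*["\'][^"\']*["\']/i', '/\s+/'), array('', '', ' '), $m[1]);
		return '<' . $inner . '>';
	}, strip_tags($str, implode('', $allowed)));
	// final pass for handlers without quoted values; case-insensitive so ONCLICK is caught too
	return preg_replace('/\s(' . $events . ').*?([\s>])/i', '\\2', $str);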
}
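
The entity-decoding loop in strip_xss() runs two regular expressions for every character in $search. A possible alternative, sketched here as an assumption rather than a drop-in replacement, is a single pass with preg_replace_callback() that decodes any numeric character reference falling in printable ASCII. The helper name decode_ascii_entities() is made up for this example.

```php
<?php
// Decode numeric character references (&#106; or &#x6A;, with optional zero
// padding and an optional trailing ;) for printable ASCII only.
// decode_ascii_entities() is a hypothetical helper, not part of strip_xss().
function decode_ascii_entities($str) {
    return preg_replace_callback(
        '/&#(?:[xX]0*([0-9a-fA-F]{1,2})(?![0-9a-fA-F])|0*([0-9]{1,3})(?![0-9]));?/',
        function ($m) {
            // group 2 is set for the decimal form, group 1 for the hex form
            $code = (isset($m[2]) && $m[2] !== '') ? (int) $m[2] : hexdec($m[1]);
            // leave anything outside printable ASCII untouched
            return ($code >= 0x20 && $code <= 0x7e) ? chr($code) : $m[0];
        },
        $str
    );
}

echo decode_ascii_entities('&#106;avascript'), "\n";
```

Entities outside printable ASCII, such as &#233;, are passed through unchanged, so accented characters the user typed on purpose are not affected.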

What I've yet to come up with is a way of stopping people putting in things such as..
<img src="http://yoursite.com/admin/users/deleteall" />
Then whenever an admin or anyone else already logged in to the app viewed this page, the request would be executed as them, and as far as the server can tell it would be perfectly legitimate (this kind of attack is known as cross-site request forgery, or CSRF). Obviously there isn't a page that deletes all users, but you can see the problem, right?

Anybody who finds an improvement or a bug, please please please add it back here so everyone can benefit. I'll update the code as we go!
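
One common defence against this kind of forged request, offered here as a sketch and not something the original code does, is to make destructive actions POST-only and require a secret per-session token that another site cannot guess. The function names and the 'csrf_token' key below are made up for this example, and random_bytes() assumes PHP 7 or later.

```php
<?php
// Sketch of a per-session anti-CSRF token. The function names and the
// 'csrf_token' key are hypothetical. Requires PHP 7+ for random_bytes().

// Return the token stored in the session array, creating one if needed.
function csrf_token(array &$session) {
    if (empty($session['csrf_token'])) {
        $session['csrf_token'] = bin2hex(random_bytes(32));
    }
    return $session['csrf_token'];
}

// Accept a request only if it is a POST carrying the session's token.
function csrf_check(array $session, $method, array $post) {
    if ($method !== 'POST') {
        return false; // an <img> tag can only trigger a GET
    }
    if (empty($session['csrf_token']) || !isset($post['csrf_token']) || !is_string($post['csrf_token'])) {
        return false;
    }
    // hash_equals() compares the two strings in constant time
    return hash_equals($session['csrf_token'], $post['csrf_token']);
}
```

In a real application you would pass $_SESSION, $_SERVER['REQUEST_METHOD'] and $_POST, and print csrf_token($_SESSION) into a hidden field of every form that changes data.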

Helen and Olly - Answer Me This!

September 10th, 2008
I'm an avid listener of the "Answer Me This!" podcast (http://answermethispodcast.com). They recently set a challenge for listeners to post a video and ask them a question. Here's mine...



Love you!

Women Drivers - a rant

September 10th, 2008
--- Start Rant ---

Ok ok, this might not JUST be women drivers, but I've only ever experienced it with women ;-) and don't think for a moment that this is the only gripe I have with women drivers, just hear me out on this point...

Take the diagram to the left. I'm in car 1, she's in car 2. I'm stationary, giving way; she's driving down the road and can easily get past me. There are no cars behind me and no reason not to go straight through, but..

She pulls into the space (number 3) and flashes me to go through! What the fuck?? Then once I'd gone past she drove off. Why would you do that?

Why? why? why?

--- End Rant ---

"Pushing" the web

September 6th, 2008
So recently I've been delving into the marvellous world of pushing data to web browsers.

Ok, before I get flamed, what I mean by pushing (for now) is the browser requesting data and the server sending new events as they happen. Other technologies in this area include Comet and Orbited. Many services already use various methods to implement this, such as Mibbit, GMail/GDocs, Facebook, Highrise and others.

While this isn't a discussion about why I didn't use the existing approaches, I will say that part of the reason is that some of those servers are based on Java or Twisted (Python), which are bulky, and I felt it could be simplified.

For now the general thought process is that the client's web browser requests data from the server via AJAX, JSONP, Flash or an IFrame, and when the server has some data to send, it plops it out, the web browser reads it and everyone is happy...
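
The server side of that request/wait/respond cycle can be sketched as a simple long-polling loop. This is only an illustration under assumptions: wait_for_events() and its $fetch callback are made-up names, and the fake event source stands in for whatever really produces the data.

```php
<?php
// Long-polling sketch: hold the request open until there is something to send
// or a timeout passes. wait_for_events() and its $fetch callback are hypothetical.
function wait_for_events(callable $fetch, $timeoutSeconds = 25, $pollMicroseconds = 250000) {
    $deadline = microtime(true) + $timeoutSeconds;
    do {
        $events = $fetch();           // e.g. query a database or a message queue
        if (!empty($events)) {
            return $events;           // something happened: answer straight away
        }
        usleep($pollMicroseconds);    // nothing yet: wait a little and look again
    } while (microtime(true) < $deadline);
    return array();                   // timed out; the browser simply asks again
}

// Example: a fake event source that has data on the third check
$calls = 0;
$events = wait_for_events(function () use (&$calls) {
    $calls++;
    return $calls >= 3 ? array('new message') : array();
}, 2, 1000);
echo json_encode($events), "\n";      // prints ["new message"]
```

Bear in mind that every waiting browser ties up a server process for the whole wait in a setup like this, which is one of the reasons dedicated push servers exist.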

Limitations of these methods...
© Ross Scrivener 2008 | Contact