PHP Parser - Filtering Cross Site Scripting (XSS)

September 18th, 2008
For the last few days I've been seriously stressing about the implications of XSS (cross-site scripting) in a project that I've been working on. If you're a web developer and you don't know what XSS is all about, you're in trouble. Google it.

There's also a great website over at http://ha.ckers.org/xss.html that gives you a huge list of the known XSS methods.

There is a plethora of PHP classes out there that work for forums and the like with a limited subset of XHTML, but I need to cover as much as possible. Before people start shouting at me: an approach using BBCode or Textile just isn't possible here (and it's ugly, don't get me started).

Whilst trying to find a decent PHP function to parse out these threats in the simplest manner possible, I ended up combining a few to come up with what's below.

Download file (strip_xss.txt)
function strip_xss($str, $allowed=null){
	if (!$allowed){
		// <applet> is left out on purpose: it embeds and runs external code
		$allowed = array('<h1>','<h2>','<h3>','<h4>','<h5>','<h6>','<b>','<i>','<u>','<a>','<ul>','<ol>','<li>','<pre>','<hr>','<blockquote>','<img>','<font>','<span>','<p>','<br>','<table>','<thead>','<th>','<tr>','<td>','<em>','<strong>','<div>','<center>','<ins>','<del>','<kbd>','<dd>','<tbody>','<tfoot>','<big>','<button>','<input>','<option>','<textarea>','<fieldset>','<form>','<legend>','<code>');
	}
	$disabled = array('onabort', 'onactivate', 'onafterprint', 'onafterupdate', 'onbeforeactivate', 'onbeforecopy', 'onbeforecut', 'onbeforedeactivate', 'onbeforeeditfocus', 'onbeforepaste', 'onbeforeprint', 'onbeforeunload', 'onbeforeupdate', 'onblur', 'onbounce', 'oncellchange', 'onchange', 'onclick', 'oncontextmenu', 'oncontrolselect', 'oncopy', 'oncut', 'ondataavailable', 'ondatasetchanged', 'ondatasetcomplete', 'ondblclick', 'ondeactivate', 'ondrag', 'ondragdrop', 'ondragend', 'ondragenter', 'ondragleave', 'ondragover', 'ondragstart', 'ondrop', 'onerror', 'onerrorupdate', 'onfilterupdate', 'onfinish', 'onfocus', 'onfocusin', 'onfocusout', 'onhelp', 'onkeydown', 'onkeypress', 'onkeyup', 'onlayoutcomplete', 'onload', 'onlosecapture', 'onmousedown', 'onmouseenter', 'onmouseleave', 'onmousemove', 'onmouseout', 'onmouseover', 'onmouseup', 'onmousewheel', 'onmove', 'onmoveend', 'onmovestart', 'onpaste', 'onpropertychange', 'onreadystatechange', 'onreset', 'onresize', 'onresizeend', 'onresizestart', 'onrowexit', 'onrowsdelete', 'onrowsinserted', 'onscroll', 'onselect', 'onselectionchange', 'onselectstart', 'onstart', 'onstop', 'onsubmit', 'onunload');
	
	// remove all non-printable characters. LF (0x0a), CR (0x0d) and TAB (0x09) are allowed
	// this prevents some character re-spacing such as <java\0script>
	// note that you have to handle splits with \n, \r, and \t later since they *are* allowed in some inputs
	// (no commas inside the character class, otherwise literal commas get stripped from the input too)
	$str = preg_replace('/[\x00-\x08\x0b\x0c\x0e-\x1f]/', '', $str);
	
	// straight replacements, the user should never need these since they're normal characters
	// this prevents things like <IMG SRC=&#X40&#X61&#X76&#X61&#X73&#X63&#X72&#X69&#X70&#X74&#X3A&#X61&#X6C&#X65&#X72&#X74&#X28&#X27&#X58&#X53&#X53&#X27&#X29>
	$search = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890!@#$%^&*()~`";:?+/={}[]-_|\'\\';
	for ($i = 0; $i < strlen($search); $i++) {
		// hex entity, e.g. &#x0040 for @
		// 0{0,8} matches up to eight optional padding zeros, ;? matches the optional ;
		$str = preg_replace('/(&#[xX]0{0,8}'.dechex(ord($search[$i])).';?)/i', $search[$i], $str);
		// decimal entity, e.g. &#00064 for @, with the same optional zeros and ;
		$str = preg_replace('/(&#0{0,8}'.ord($search[$i]).';?)/', $search[$i], $str);
	}
	
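	// strip every tag that isn't in the allowed list
	$str = strip_tags($str, implode('', $allowed));
	$handlers = implode('|', $disabled);
	
	// clean the inside of each remaining tag: drop javascript: URLs and quoted event handlers, collapse whitespace
	// preg_replace_callback() replaces the old /e modifier, which was removed in PHP 7 and evaluated user input as code
	$str = preg_replace_callback('/<(.*?)>/i', function ($m) use ($handlers) {
		$inner = preg_replace(
			array('/javascript:[^"\']*/i', '/(' . $handlers . ')[ \t\n]*=[ \t\n]*["\'][^"\']*["\']/i', '/\s+/'),
			array('', '', ' '),
			$m[1]
		);
		return '<' . $inner . '>';
	}, $str);
	
	// remove any handlers that are left over, e.g. unquoted ones like <b onclick=alert(1)>
	return preg_replace('/\s(' . $handlers . ').*?([\s>])/i', '\\2', $str);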
}
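
To see why the entity loop in the function matters, here's a minimal sketch of that one step on its own. The name decode_basic_entities() is made up for this example and isn't part of strip_xss(), and the character list is cut down to just what the sample payload needs. It assumes nothing beyond stock PHP.

```php
<?php
// Hypothetical helper for illustration only, not part of strip_xss().
// It runs the same hex/decimal entity replacement over a shortened character list.
function decode_basic_entities($str) {
	$search = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890:()\'';
	for ($i = 0; $i < strlen($search); $i++) {
		$code = ord($search[$i]);
		// hex form, e.g. &#x6A; or &#X006a (trailing ; optional)
		$str = preg_replace('/&#x0{0,8}' . dechex($code) . ';?/i', $search[$i], $str);
		// decimal form, e.g. &#106; or &#0000106 (trailing ; optional)
		$str = preg_replace('/&#0{0,8}' . $code . ';?/', $search[$i], $str);
	}
	return $str;
}

// The "j" and the "a" are hidden as entities, so a plain search for
// "javascript:" would miss this until the entities are decoded.
$payload = '<img src="&#106;&#x61;vascript:alert(1)">';
echo decode_basic_entities($payload), "\n";
```

Once the entities are turned back into normal characters, the later javascript: replacement has something it can actually match.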

What I've yet to come up with is a way of stopping people putting in things such as:
<img src="http://yoursite.com/admin/users/deleteall" />
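Then whenever an admin, or anyone else already logged in to the app, viewed this page, their browser would request that URL as them, and as far as the server can tell it's a perfectly legitimate request (this is known as cross-site request forgery, or CSRF). Obviously there isn't a page that deletes all users, but you can see the problem, right?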
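
One common defence, which strip_xss() doesn't attempt, is to make sure anything destructive only runs on a POST that carries a secret per-session token, since an img tag can only fire a GET and can't know the token. Below is a rough sketch under my own assumptions: csrf_token() and csrf_check() are hypothetical names, the field name csrf_token is arbitrary, and it needs PHP 7 or later for random_bytes().

```php
<?php
// Hypothetical helpers for illustration only; the names are not from any library.

// Return the token stored in the session array, creating it on first use.
function csrf_token(array &$session) {
	if (empty($session['csrf_token'])) {
		$session['csrf_token'] = bin2hex(random_bytes(32));
	}
	return $session['csrf_token'];
}

// True only for a POST request that carries the session's token.
function csrf_check(array $session, $method, array $post) {
	if ($method !== 'POST') {
		return false; // an <img> tag can only trigger a GET
	}
	if (empty($session['csrf_token']) || !isset($post['csrf_token']) || !is_string($post['csrf_token'])) {
		return false;
	}
	// hash_equals() compares in constant time
	return hash_equals($session['csrf_token'], $post['csrf_token']);
}

// In a real page $session would be $_SESSION and $post would be $_POST.
$session = array();
$token = csrf_token($session);

var_dump(csrf_check($session, 'GET', array()));                        // the <img> request
var_dump(csrf_check($session, 'POST', array('csrf_token' => $token))); // the real admin form
```

The first call prints bool(false) and the second bool(true). The admin form would put the token in a hidden field, and the delete page would refuse to do anything unless csrf_check() passes.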

Anybody who finds an improvement or a bug, please please please add it back here so everyone can benefit. I'll update the code as we go!