Closed

4.3.0: Unicode DoubleWidth Chars are not Sanitized

description

http://www.securityfocus.com/archive/1/390751
http://stackoverflow.com/questions/8326846/convert-ascii-chars-to-unicode-fullwidth-latin-letters-in-python

When I call GetSafeHtmlFragment, the input is not sanitized:

Sanitizer.GetSafeHtmlFragment("〈script〉KillAllHumans();〈/script〉")

Output: 〈script〉KillAllHumans();〈/script〉
Closed Jul 14, 2014 at 5:17 PM by bdorrans
The sanitizer is now retired. No issues will be addressed.

Having said that, browsers don't treat the double-width less-than and greater-than characters as the standard 7-bit ones, so this won't cause problems as long as you stick with the standard templates and UTF-8.

comments

montago wrote Jul 14, 2014 at 9:46 PM

The browsers are not the problem (at first).

It's the database with ASCII data types, which down-converts Unicode to ASCII. The double-width characters then become ordinary angle brackets, which results in an XSS attack.
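
To make the scenario above concrete, here is a minimal Python sketch of the folding step. It is an illustration only: `fold_to_ascii` is a hypothetical helper, and Unicode NFKC normalization is used as a stand-in for whatever lossy conversion a given database applies to an ASCII column, which may behave differently. The payload uses the fullwidth forms U+FF1C and U+FF1E, written as escapes so the code points are unambiguous.

```python
import unicodedata

# Fullwidth less-than (U+FF1C) and greater-than (U+FF1E), written as escapes.
FULLWIDTH_PAYLOAD = "\uFF1Cscript\uFF1EKillAllHumans();\uFF1C/script\uFF1E"


def fold_to_ascii(text: str) -> str:
    """Hypothetical stand-in for a lossy Unicode-to-ASCII column conversion.

    Assumption: the store folds compatibility characters the way NFKC does.
    A real database may use a different mapping, or substitute '?' instead.
    """
    folded = unicodedata.normalize("NFKC", text)
    # Anything still outside ASCII is replaced with '?'.
    return folded.encode("ascii", errors="replace").decode("ascii")


folded = fold_to_ascii(FULLWIDTH_PAYLOAD)

print("<" in FULLWIDTH_PAYLOAD)  # no ASCII angle bracket before folding
print(folded)                    # markup-significant characters after folding
```

Under that assumption, a string that contains no ASCII angle brackets when it is checked can contain them by the time it is read back.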

You said the sanitizer is retired. Has something replaced it?

bdorrans wrote Jul 14, 2014 at 10:25 PM

Well, in that case I'd say use nvarchar, or encode at the point of output. An HTML sanitizer wouldn't help with SQL misinterpretation: HTML != SQL.
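
As a rough sketch of what "encode at the point of output" can look like, the Python snippet below escapes a stored value at the moment it is written into HTML, regardless of what happened to it in storage. `render_comment` is a hypothetical helper named only for this example; it is not part of the sanitizer or of any recommendation in the release notes.

```python
import html


def render_comment(stored_value: str) -> str:
    """Hypothetical helper: HTML-encode a stored value when it is emitted."""
    # html.escape converts &, <, > and (with quote=True) both quote characters.
    return "<p>" + html.escape(stored_value, quote=True) + "</p>"


# A value that already contains ASCII angle brackets when read back.
print(render_comment("<script>KillAllHumans();</script>"))
# prints: <p>&lt;script&gt;KillAllHumans();&lt;/script&gt;</p>
```

Because the encoding happens last, it does not depend on which characters the input contained before it was stored.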

There's no replacement. We have recommendations in the release notes, but we don't pick or choose a favourite approach.