non-BMP characters incorrectly encoded

Description

From julian.r...@googlemail.com on March 01, 2013 11:36:39

What steps will reproduce the problem?
1. Run the test case below.

What is the expected output? What do you see instead?
The test case encodes a non-BMP character, which Java represents internally as two characters (a surrogate pair), yet which needs to be serialized as a single HTML character.

What version of the product are you using? On what operating system?
2.0.1, Windows

Please provide any additional information below.
Test:

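import org.owasp.esapi.ESAPI;

public class NonBmpTest {
    public static void main(String[] args) {
        // U+2F804 lies outside the BMP, so Java stores it as a surrogate pair.
        String test = new String(new int[]{0x2f804}, 0, 1);
        System.out.println(test + " " + test.length()); // length() is 2
        System.out.println(ESAPI.encoder().encodeForHTML(test));
    }
}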

Note: this problem was reported over two years ago in http://ainthek.blogspot.de/2010/09/orgowaspesapicodecshtmlentitycodecjava.html but apparently has not been fixed.
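
For illustration, here is a minimal sketch of the behaviour the report asks for: walking the string by code point rather than by char, so that a surrogate pair becomes one numeric character reference. The class CodePointEntityDemo and the method encodeByCodePoint are hypothetical names made up for this example. They are not part of ESAPI and do not show how HTMLEntityCodec works or how a fix should be implemented. The sketch also assumes that every character other than an ASCII letter or digit is written as a hexadecimal reference, which is a simplification.

```java
public class CodePointEntityDemo {

    // Hypothetical helper, NOT part of ESAPI: encodes everything except
    // ASCII letters and digits as a hexadecimal numeric character reference.
    static String encodeByCodePoint(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); ) {
            // codePointAt joins a high/low surrogate pair into one code point.
            int cp = input.codePointAt(i);
            if (cp < 0x80 && Character.isLetterOrDigit(cp)) {
                out.append((char) cp);
            } else {
                out.append("&#x").append(Integer.toHexString(cp)).append(';');
            }
            // Advance by 2 chars for a supplementary code point, else by 1.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String test = new String(new int[]{0x2f804}, 0, 1);
        System.out.println(test.length());           // prints 2
        System.out.println(encodeByCodePoint(test)); // prints &#x2f804;
    }
}
```

A loop that reads the string one char at a time would see the two surrogates separately, which is the failure mode described above.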

Original issue: http://code.google.com/p/owasp-esapi-java/issues/detail?id=294

Environment

None

Status

Assignee

Unassigned

Reporter

Max Gelman

Priority
