It is often necessary to escape special HTML characters in user input to prevent cross-site scripting (XSS) attacks.
Initially I thought the JDK provided a method for this somewhere, like the htmlentities() function in PHP, but I failed to find one. All I found was a class called “URLEncoder”, which I don’t think can do this job.
I don’t want to reinvent the wheel, as I believe there must be some Java packages available that do this job. Googling “java encode html” didn’t lead me straight to the right Java package (at least not the one I’d like to use).
After a while, I finally found a package I’d like to use. It’s from the Apache Commons project and is called “Commons Lang”. The method “StringEscapeUtils.escapeHtml(…)” does the encoding, while its counterpart unescapeHtml does the decoding. So, I don’t have to write my own method…
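To make it concrete what "escaping HTML" means here, this is a minimal hand-written sketch using only the JDK. It is not the Commons Lang implementation; the class name HtmlEscapeSketch and its escapeHtml method are made up for illustration, and it only handles the five characters that are significant in markup (a library method may cover many more entities).

```java
// Illustrative sketch only: a hypothetical helper, NOT the Commons Lang code.
final class HtmlEscapeSketch {

    private HtmlEscapeSketch() {
        // static helper, no instances
    }

    /** Replaces the five characters that have special meaning in HTML markup. */
    static String escapeHtml(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
        System.out.println(escapeHtml("<script>alert(\"x\")</script>"));
    }
}
```

In real code I would still reach for the library method rather than something like this; the sketch is just there to show the kind of transformation involved.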
What about this class: java.net.URLDecoder?
Check out the Java docs; I believe it has been there for a long time.
http://java.sun.com/j2se/1.4.2/docs/api/java/net/URLDecoder.html
URLDecoder and URLEncoder only deal with URLs. For example, URLEncoder encodes a space as “+” and “<” as %3C. However, escaping HTML content is a different thing.
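A quick way to see the difference is to run some markup through URLEncoder. This is a small sketch using only java.net.URLEncoder from the JDK; the class name UrlVersusHtmlDemo and the urlEncode wrapper are just names picked for the demo. URLEncoder follows the form-encoding rules, so a space comes out as “+” and the markup characters become percent escapes, which is not what a browser needs inside an HTML page.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Illustrative demo only: shows what URL encoding does to HTML-looking input.
final class UrlVersusHtmlDemo {

    private UrlVersusHtmlDemo() {
        // static helper, no instances
    }

    /** Thin wrapper around URLEncoder.encode that hides the checked exception. */
    static String urlEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints: a+%3Cb%3E+%26+c
        System.out.println(urlEncode("a <b> & c"));
    }
}
```

The output is fine as part of a query string, but an HTML page needs entities such as &lt; and &amp; instead, which is why a separate HTML escaper is required.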
FYI: I think JTidy is another way to do what you want.
thank you robin
thank you alls lol
JTidy is evil,
it also escapes spaces, numbers, and everything else that is not a letter, so that the resulting string is totally unreadable.
Also, it escapes to the numeric Unicode form like &#xxxx; rather than to standard HTML entities like &amp;.
So use only Apache, it rules!
I mean “… to standard HTML entities like &amp; …”
You should also try the standard JSTL c:out tag. The c:out tag has an attribute escapeXml which, when set to true (the default), escapes <, >, &, " and '.
Example: