(PHP 5 >= 5.1.0)
htmlspecialchars_decode — Converts special HTML entities back to characters
This function is the opposite of htmlspecialchars(). It converts special HTML entities back to characters.
The converted entities are: &amp;, &quot; (when ENT_NOQUOTES is not set), &#039; (when ENT_QUOTES is set), &lt; and &gt;.
The string to decode.
The quote style. One of the following constants can be used:
Constant name | Description |
---|---|
ENT_COMPAT | Converts double quotes and leaves single quotes alone (default) |
ENT_QUOTES | Converts both double and single quotes |
ENT_NOQUOTES | Leaves both double and single quotes unconverted |
Returns the decoded string.
Example #1 A htmlspecialchars_decode() example
<?php
$str = "<p>this -&gt; &quot;</p>\n";
echo htmlspecialchars_decode($str);
// note that here the quotes aren't converted
echo htmlspecialchars_decode($str, ENT_NOQUOTES);
?>
The above example will output:
<p>this -> "</p>
<p>this -> &quot;</p>
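To make the effect of the three quote-style constants concrete, here is a small sketch that decodes the same string with each of them. The input string is made up for illustration, and the flag is always passed explicitly because the default quote style is not the same in every PHP version.

```php
<?php
// Illustrative input: both kinds of quote entities plus &lt;, &gt; and &amp;.
$encoded = '&lt;a title=&quot;x&quot; rel=&#039;y&#039;&gt; &amp;';

// ENT_COMPAT: decodes &quot; but leaves &#039; alone.
$compat = htmlspecialchars_decode($encoded, ENT_COMPAT);

// ENT_QUOTES: decodes both &quot; and &#039;.
$quotes = htmlspecialchars_decode($encoded, ENT_QUOTES);

// ENT_NOQUOTES: leaves both quote entities untouched.
$noquotes = htmlspecialchars_decode($encoded, ENT_NOQUOTES);

echo $compat, "\n";   // <a title="x" rel=&#039;y&#039;> &
echo $quotes, "\n";   // <a title="x" rel='y'> &
echo $noquotes, "\n"; // <a title=&quot;x&quot; rel=&#039;y&#039;> &
?>
```

In all three cases &lt;, &gt; and &amp; are decoded; only the handling of the quote entities differs.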
Keep in mind that you should never trust user input - particularly for "mixed-bag" input containing a combination of plain text and markup or scripting code.
Why?
Well, consider someone sending '&amp;<script>alert('XSS');</script>' to your PHP script:
<?php
$var = "&amp;<script>alert('XSS');</script>";
$var = (htmlspecialchars_decode($var) == $var) ? htmlspecialchars($var) : $var;
echo $var;
?>
Since '&amp;' decodes into '&', (htmlspecialchars_decode($var) == $var) will be false, so $var is returned without being escaped. As a consequence, the script tags are untouched, and you've just opened yourself to XSS.
There is, unfortunately, no reliable way that I know of to determine whether HTML is already escaped that does not come with this caveat. Rather than trying to catch the case 'I've already encoded this', you are better off avoiding double-escaping by simply escaping the HTML as close to the actual output as you can, e.g. in the view of an MVC application.
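As a rough illustration of "escape as close to the output as possible", the sketch below keeps the raw value untouched and escapes it exactly once, at the moment it is printed. The helper name e() is hypothetical (it is not a PHP built-in), and UTF-8 is assumed as the page encoding.

```php
<?php
// Hypothetical helper (not part of PHP): escape a value for HTML output.
// Call it only at output time, and never store its result back into the data.
function e($value)
{
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// The same hostile input as in the note above, kept raw until it is printed.
$var = "&amp;<script>alert('XSS');</script>";

$safe = e($var);
echo $safe; // &amp;amp;&lt;script&gt;alert(&#039;XSS&#039;);&lt;/script&gt;
?>
```

The browser then shows the input literally, including the text "&amp;", and no script runs. Because the escaping happens in one place only, there is no need to guess whether a string was escaped earlier.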
or of course:
<?php
$var = "Blue & yellow make green.";
$var = (htmlspecialchars_decode($var) == $var) ? htmlspecialchars($var) : $var;
echo $var; // outputs Blue &amp; yellow make green.
// you can do it a bunch of times, it still won't screw you!
$var = (htmlspecialchars_decode($var) == $var) ? htmlspecialchars($var) : $var;
$var = (htmlspecialchars_decode($var) == $var) ? htmlspecialchars($var) : $var;
echo $var; // still outputs Blue &amp; yellow make green.
?>
Put it in a function. Add it to the method of some abstract data class.
If you use `htmlspecialchars()` to change things like the ampersand (&) into its HTML equivalent (&amp;), you might run into a situation where you mistakenly pass the same string to the function twice, resulting in things appearing on your website like, as I call it, the ampersanded amp: "&amp;amp;", which the browser displays as "&amp;". Clearly nobody wants "&amp;" on his or her web page where there is supposed to be just an ampersand. Here's a quick and easy trick to make sure this doesn't happen:
<?php
$var = "This is a string that could be passed to htmlspecialchars multiple times.";
if (htmlspecialchars_decode($var) == $var) {
$var = htmlspecialchars($var);
}
echo $var;
?>
Now, if you're dealing with text that is a mixed bag (contains both already-encoded HTML entities and unencoded special characters), you're on your own.
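For the double-encoding problem described in this note, newer PHP versions (5.2.3 and later) also offer the fourth parameter of htmlspecialchars(), double_encode. The sketch below assumes UTF-8 input and shows that passing false leaves existing entities alone while still escaping everything else. It is meant as an alternative to the compare-and-encode trick, not as a description of what the note's author used.

```php
<?php
// double_encode = false: existing entities such as &amp; are not encoded again.
$once  = htmlspecialchars('Blue & yellow', ENT_QUOTES, 'UTF-8', false);
$twice = htmlspecialchars($once, ENT_QUOTES, 'UTF-8', false);

echo $once, "\n";  // Blue &amp; yellow
echo $twice, "\n"; // Blue &amp; yellow (unchanged by the second pass)

// Mixed input: the entity is kept as-is, but the tag is still escaped.
$mixed = htmlspecialchars('&amp;<b>', ENT_QUOTES, 'UTF-8', false);
echo $mixed, "\n"; // &amp;&lt;b&gt;
?>
```

Note that this cannot tell a literal "&amp;" typed by a user from an already-encoded ampersand, so escaping once at output time remains the more predictable approach.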
The example for "htmlspecialchars_decode()" below sadly does not work for all PHP4 versions.
Quote from the PHP manual:
"get_html_translation_table() will return the translation table that is used internally for htmlspecialchars() and htmlentities()."
But it does NOT! At least not for PHP version 4.4.2.
This was already reported in a bug report (http://bugs.php.net/bug.php?id=25927), but it was marked as BOGUS.
Proof:
Code:
--------------------
<?php
var_dump(get_html_translation_table(HTML_SPECIALCHARS,ENT_QUOTES));
var_dump(htmlspecialchars('\'',ENT_QUOTES));
?>
--------------------
Output:
--------------------
array
'"' => '&quot;'
''' => '&#39;'
'<' => '&lt;'
'>' => '&gt;'
'&' => '&amp;'
'&#039;'
--------------------
This comment now is not to report this bug again (though I really believe it is one), but to complete the example and warn people of this pitfall.
To make sure your htmlspecialchars_decode fake for PHP4 works, you should do something like this:
<?php
function htmlspecialchars_decode($string,$style=ENT_COMPAT)
{
$translation = array_flip(get_html_translation_table(HTML_SPECIALCHARS,$style));
if($style === ENT_QUOTES){ $translation['&#039;'] = '\''; }
return strtr($string,$translation);
}
?>
Br, Thomas
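The same idea can be tried out on a current PHP version with the sketch below. The function name decode_special_chars_sketch is hypothetical, chosen so that it does not clash with the built-in htmlspecialchars_decode(), and it maps both spellings of the single-quote entity (&#039; and &#39;) so that it does not matter which one the translation table reports.

```php
<?php
// Hypothetical stand-in for the PHP 4 replacement discussed above.
function decode_special_chars_sketch($string, $style = ENT_COMPAT)
{
    // Flip the table so that entities become keys and characters become values.
    $translation = array_flip(get_html_translation_table(HTML_SPECIALCHARS, $style));

    // With ENT_QUOTES, accept both spellings of the single-quote entity.
    if (($style & ENT_QUOTES) === ENT_QUOTES) {
        $translation['&#039;'] = "'";
        $translation['&#39;']  = "'";
    }

    return strtr($string, $translation);
}

$raw     = "It's <b>\"bold\"</b> & more";
$encoded = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
$decoded = decode_special_chars_sketch($encoded, ENT_QUOTES);

var_dump($decoded === $raw); // bool(true)
?>
```

Because strtr() works in a single pass, a double-encoded sequence such as &amp;lt; is decoded one level only, to &lt;.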
The following replacement for PHP 4 is a little more complete, as the quote_style is taken into account as well:
if (!function_exists("htmlspecialchars_decode")) {
function htmlspecialchars_decode($string, $quote_style = ENT_COMPAT) {
return strtr($string, array_flip(get_html_translation_table(HTML_SPECIALCHARS, $quote_style)));
}
}
This should be the best way to do it.
(Reposted because the other one seems a bit slower and because those who used the code below called it htmlspecialchars_decode_php4)
<?php
if ( !function_exists('htmlspecialchars_decode') )
{
function htmlspecialchars_decode($text)
{
return strtr($text, array_flip(get_html_translation_table(HTML_SPECIALCHARS)));
}
}
?>
Here is how you can get this function in PHP < 5.1. Just make sure this definition comes before the first place where you call the function.
if (!function_exists('htmlspecialchars_decode')) {
function htmlspecialchars_decode($str, $options = ENT_COMPAT) {
$trans = get_html_translation_table(HTML_SPECIALCHARS, $options);
$decode = array();
foreach ($trans AS $char=>$entity) {
$decode[$entity] = $char;
}
$str = strtr($str, $decode);
return $str;
}
}
The code supplied by or-k at or-k dot com (14-Sep-2005 09:15) is better served using html_entity_decode() for PHP>=4.3.0.
geoffers@gmail (14-Jul-2005 01:38) offers the best htmlspecialchars_decode() for php4 users.
that also works with &auml;, &quot; and so on.
get_html_translation_table(HTML_ENTITIES) => offers more characters than HTML_SPECIALCHARS
function htmlspecialchars_decode_PHP4($uSTR)
{
return strtr($uSTR, array_flip(get_html_translation_table(HTML_ENTITIES, ENT_QUOTES)));
}
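To show the difference between the two tables that this note relies on, the sketch below compares the built-in htmlspecialchars_decode() with html_entity_decode() on the same input. It assumes PHP 5.1 or later and UTF-8 as the target encoding; the sample string is made up.

```php
<?php
$encoded = '&auml; &amp; &quot;x&quot; &lt;b&gt;';

// Only the special characters are decoded; &auml; is left as an entity.
$special = htmlspecialchars_decode($encoded, ENT_QUOTES);

// All named HTML entities are decoded, including &auml;.
$all = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

echo $special, "\n"; // &auml; & "x" <b>
echo $all, "\n";     // ä & "x" <b>
?>
```

Which one is appropriate depends on whether the text was encoded with htmlspecialchars() or with htmlentities().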
[Update of previous note, having noticed I forgot to put in quote style]
PHP4 Compatible function:
<?php
function htmlspecialchars_decode_php4 ($str, $quote_style = ENT_COMPAT) {
return strtr($str, array_flip(get_html_translation_table(HTML_SPECIALCHARS, $quote_style)));
}
?>
For PHP4 Compatibility:
<?php
function htmlspecialchars_decode_php4 ($str) {
return strtr($str, array_flip(get_html_translation_table(HTML_SPECIALCHARS)));
}
?>