(PHP 4, PHP 5)
imap_utf8 — Converts text to UTF-8
Converts the given string to UTF-8.
Returns a UTF-8 encoded string.
This function strictly observes parts of RFC 2047 to the extent that it may not decode encoded-words produced by non-compliant composing agents.
1. It is strict about a 76-character maximum length for encoded-words. Gmail may produce an encoded-word longer than this, and this will be passed through without decoding.
2. It is strict about the rule that encoded-words be separated by linear-white-space from adjacent text. Encoded-words not so separated will be ignored.
3. Invalid "B" encodings will cause this function to return without decoding ANY encoded-words in the string (including any valid ones).
But...
4. Invalid "Q" encodings are decoded with invalid characters (e.g. > 127) still in place.
The advantage of this function over mb_decode_mimeheader() is that it will decode any character set, including non-existent ones (by assuming UTF-8).
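To see which of the rules above a given header might trip over, a small diagnostic can help. The following is only a sketch in plain PHP; findEncodedWordProblems() is a hypothetical helper, not part of the IMAP extension, and it only approximates the checks described in this note. The default limit of 76 mirrors the figure quoted above; RFC 2047 itself states a limit of 75 characters, so adjust the parameter as needed.

```php
<?php
// Hypothetical helper (not part of the IMAP extension): lists encoded-words
// in a header value that the strict rules described above would likely reject.
function findEncodedWordProblems($header, $maxLen = 76)
{
    $problems = array();
    // Loose match for =?charset?encoding?text?=
    $pattern = '/=\?[^?\s]+\?([bBqQ])\?([^?\s]*)\?=/';
    preg_match_all($pattern, $header, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

    foreach ($matches as $m) {
        $word  = $m[0][0];
        $start = $m[0][1];
        $end   = $start + strlen($word);

        // Rule 1: the encoded-word is longer than the allowed maximum
        if (strlen($word) > $maxLen) {
            $problems[] = array('word' => $word, 'issue' => 'too long');
        }

        // Rule 2: it must be separated from adjacent text by white space
        // (the start and end of the string count as separators here)
        $before = $start > 0 ? $header[$start - 1] : ' ';
        $after  = $end < strlen($header) ? $header[$end] : ' ';
        if (strpos(" \t\r\n", $before) === false || strpos(" \t\r\n", $after) === false) {
            $problems[] = array('word' => $word, 'issue' => 'not separated by white space');
        }

        // Rule 3: a "B" encoded-word must contain valid base64
        if (strtoupper($m[1][0]) === 'B' && base64_decode($m[2][0], true) === false) {
            $problems[] = array('word' => $word, 'issue' => 'invalid base64');
        }
    }
    return $problems;
}
```

An empty array means none of the three checks fired; it does not guarantee that imap_utf8() will decode the header.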
imap_utf8() returns the string in uppercase; see http://bugs.php.net/bug.php?id=44098
My function to decode the string:
<?php
// Decode a MIME header and convert every part to $charset
function decodeMimeStr($string, $charset = "UTF-8")
{
    $newString = '';
    $elements = imap_mime_header_decode($string);
    foreach ($elements as $element)
    {
        // Parts without an encoded-word report the charset "default";
        // treat them as ISO-8859-1
        $from = ($element->charset == 'default') ? 'iso-8859-1' : $element->charset;
        $newString .= iconv($from, $charset, $element->text);
    }
    return $newString;
}
?>
You may also try this example:
$_subject = "=?iso-8859-1?b?THV6b26uMTQ=?= dot CoM";
echo utf8_decode(imap_utf8($_subject));
OUTPUT: Luzon®14 dot CoM
Regards,
Tuso/Nazareth
Also see:
<?php
imap_mime_header_decode($string);
?>
Don't try to decode subjects with this one... imap_mime_header_decode does a much better job at that.
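If the IMAP extension is not available at all, the iconv extension has a comparable decoder, iconv_mime_decode(). The sketch below assumes iconv is installed; decodeHeaderToUtf8() is a hypothetical wrapper name chosen for this example, and the choice to fall back to the raw header on failure is an assumption, not something the notes above prescribe.

```php
<?php
// Sketch: decode a MIME header to UTF-8 using the iconv extension
// instead of the IMAP extension. The function name is hypothetical.
function decodeHeaderToUtf8($header)
{
    // CONTINUE_ON_ERROR asks iconv to carry on past malformed encoded-words
    $decoded = iconv_mime_decode($header, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, 'UTF-8');

    // Fall back to the undecoded header if decoding failed outright
    return $decoded === false ? $header : $decoded;
}

$subject = decodeHeaderToUtf8("=?iso-8859-1?b?THV6b26uMTQ=?= dot CoM");
```

The result is already UTF-8, so no separate utf8_decode() or iconv() step is needed unless another charset is wanted.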
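# This version of the script is correct, with the right expression for removing the markers.
# Please ignore the later one.
function decode_utf8($str){
    # patterns (eregi_replace() no longer exists in PHP 7, so preg with the "i" flag is used)
    $err = "/(=\?.{10,13}q\?_?|\?=)/i";
    $pat = "/=([0-9A-F]{2})/";
    # strip the encoded-word markers from the string
    $str = preg_replace($err, "", $str);
    # decode each =XX escape with a callback instead of eval(),
    # which must never be run on untrusted header text (closures need PHP 5.3+)
    $str = preg_replace_callback($pat, function ($m) {
        return chr(hexdec($m[1]));
    }, $str);
    # return
    return $str;
}
# Note: this script can be reduced to 3 lines.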
Here's my new better function... ;)
function decode_ISO88591($string)
{
    // Strip the encoded-word header (in any letter case) and footer
    $string = str_ireplace("=?iso-8859-1?q?", "", $string);
    $string = str_replace("?=", "", $string);
    // Decode every =XX escape in a single pass, so that a decoded "="
    // (from =3D) is never mistaken for the start of another escape
    // (closures need PHP 5.3+)
    return preg_replace_callback(
        "/=([0-9A-F]{2})/i",
        function ($m) { return chr(hexdec($m[1])); },
        $string
    );
}
Hope this helps. :)
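A shorter route to the same result may be PHP's built-in quoted_printable_decode(), which already understands =XX escapes. The sketch below is an assumption-laden variant, not the function above: decodeQEncodedWords() is a hypothetical name, it only handles ISO-8859-1 "Q" encoded-words, and it additionally turns "_" into a space, as RFC 2047 defines for "Q" encoding.

```php
<?php
// Sketch: decode ISO-8859-1 "Q" encoded-words inside a header value.
// The function name is hypothetical; closures need PHP 5.3+.
function decodeQEncodedWords($header)
{
    return preg_replace_callback(
        '/=\?iso-8859-1\?q\?([^?]*)\?=/i',
        function ($m) {
            // In "Q" encoding an underscore stands for a space;
            // the remaining =XX escapes are plain quoted-printable
            return quoted_printable_decode(str_replace('_', ' ', $m[1]));
        },
        $header
    );
}

$subject = decodeQEncodedWords("=?ISO-8859-1?Q?B=F8lla?=, med =?ISO-8859-1?Q?=F8l?= i baggen");
```

The returned string is in ISO-8859-1, not UTF-8. Text outside the encoded-words is left untouched.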
I had issues decoding strings with Spanish characters. I found the encodeISO88591() function in a user note in this manual.
After having problems with imap_utf8() and utf8_decode(), I decided to create a function decodeISO88591() for decoding strings encoded with encodeISO88591().
Here's the code (I've translated the comments and the variable and array names from Spanish to English for better understanding):
function encodeISO88591($string)
{
    // ISO-8859-1 string header
    $stringISO = "=?iso-8859-1?q?";
    // Walk the string byte by byte
    for ($i = 0; $i < strlen($string); $i++)
    {
        $code = ord($string[$i]);
        // Bytes 1 to 127 are copied as-is; NUL and bytes above 127 are
        // written as "=" followed by two uppercase hexadecimal digits
        if ($code < 1 || $code > 127)
        {
            $stringISO .= sprintf("=%02X", $code);
        }
        else
        {
            $stringISO .= $string[$i];
        }
    }
    // Caveat: "=", "?", "_" and spaces are not escaped, so the result is only
    // a valid RFC 2047 encoded-word when the input contains none of them
    // ISO-8859-1 string footer
    $stringISO .= "?= ";
    return $stringISO;
}
// And this is my function decodeISO88591()
function decodeISO88591($string)
{
    // ISO-8859-1 string header and footer are deleted
    $string = str_replace("=?iso-8859-1?q?", "", $string);
    $string = str_replace("?= ", "", $string);
    // Encoded values are decoded in one pass: each "=XX" becomes the
    // ISO-8859-1 byte with that hexadecimal value, so no lookup table
    // is needed (closures need PHP 5.3+)
    return preg_replace_callback(
        "/=([0-9A-F]{2})/i",
        function ($m) { return chr(hexdec($m[1])); },
        $string
    );
}
I hope this helps somebody. :)
To fully decode a header that contains "national" (non-ASCII) characters, use these two functions together:
$subject = "=?ISO-8859-1?Q?B=F8lla?=, med =?ISO-8859-1?Q?=F8l?= i baggen";
$subject = imap_utf8($subject);
$subject = utf8_decode($subject);
This returns the subject as an 8-bit ISO-8859-1 string.
That is, you need both functions.
knutm :-)
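Note that utf8_decode() is deprecated as of PHP 8.2. A possible replacement for the second step is sketched below; utf8ToLatin1() is a hypothetical helper name, and the sketch assumes that either the mbstring or the iconv extension is installed.

```php
<?php
// Sketch: convert UTF-8 to ISO-8859-1 without utf8_decode().
// The function name is hypothetical.
function utf8ToLatin1($utf8)
{
    if (function_exists('mb_convert_encoding')) {
        return mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
    }
    // //TRANSLIT asks iconv to approximate characters that
    // have no ISO-8859-1 equivalent instead of failing
    return iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $utf8);
}

// Usage with the example above (requires the IMAP extension):
// $subject = utf8ToLatin1(imap_utf8($subject));
```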