(PHP 4, PHP 5)
imap_mime_header_decode — Decode MIME header elements
Decodes MIME-encoded header lines that contain non-ASCII characters (see RFC 2047).
The MIME-encoded text
The decoded text parts are returned as an array of objects, each of which has a text and a charset property.
For parts that are not encoded (plain ASCII), the charset property is set to 'default'.
Example #1 imap_mime_header_decode() example
<?php
$text = "=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@example.com>";
$elements = imap_mime_header_decode($text);
for ($i=0; $i<count($elements); $i++) {
echo "Charset: {$elements[$i]->charset}\n";
echo "Text: {$elements[$i]->text}\n\n";
}
?>
The above example will output:
Charset: ISO-8859-1
Text: Keld Jørn Simonsen

Charset: default
Text:  <keld@example.com>

In this example we get two text parts back: the first is encoded in the Western European ISO-8859-1 character set, and the second is a plain ASCII string.
Beware of multi-line subjects that contain whitespace which is not part of the subject itself but is needed as functional characters by mail clients.
For example, a mail header can contain content like
Subject: =?iso-8859-1?Q?WG=3A_Mobilit=E4t_verschlechtert_--=3E_174?=
	=?iso-8859-1?Q?6?=
(the two encoded words are separated by a carriage return and a tab).
For this example, imap_mime_header_decode returns 5 parts rather than the expected 2:
- The first contains the space between "Subject:" and the subject itself.
- The second is the first encoded text.
- The third is the line break within the header lines. It was inserted to keep the line length RFC-compliant and is NOT part of the original subject entered by the sender.
- The fourth is the second part of the subject.
- The fifth is a line break. The last character is always a line break when other header lines or the mail body follow, so it is needed in the header but is not part of the original subject.
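One way to deal with the five parts described above is to drop the whitespace-only parts that sit between two encoded words and trim the result. The following is only a minimal sketch: joinHeaderParts() is a hypothetical helper name, and the parts are built by hand here (with a shortened ASCII subject) so the snippet runs without the IMAP extension. In real code the array would come from imap_mime_header_decode().

```php
<?php
// Hypothetical helper (not part of the IMAP extension): joins the objects
// returned by imap_mime_header_decode() and skips folding whitespace.
function joinHeaderParts(array $parts)
{
    $parts = array_values($parts);
    $count = count($parts);
    $out = '';
    for ($i = 0; $i < $count; $i++) {
        $part = $parts[$i];
        // A 'default' part made up of whitespace only.
        $isBlank = $part->charset === 'default' && trim($part->text) === '';
        // True when both neighbours are encoded words.
        $betweenEncoded = $i > 0 && $i < $count - 1
            && $parts[$i - 1]->charset !== 'default'
            && $parts[$i + 1]->charset !== 'default';
        if ($isBlank && $betweenEncoded) {
            continue; // whitespace between two encoded words is folding only
        }
        $out .= $part->text;
    }
    // Remove the leading space and the trailing line break.
    return trim($out);
}

// Hand-built stand-in for the five parts described in the note above.
$parts = array(
    (object) array('charset' => 'default', 'text' => ' '),
    (object) array('charset' => 'iso-8859-1', 'text' => 'WG: Status 174'),
    (object) array('charset' => 'default', 'text' => "\r\n\t"),
    (object) array('charset' => 'iso-8859-1', 'text' => '6'),
    (object) array('charset' => 'default', 'text' => "\n"),
);
echo joinHeaderParts($parts), "\n";
```

Note that this only joins the raw bytes of each part. Converting them to a common charset is a separate step.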
Diego's example works well, but it has a (very) small mistake in the code. Here is the corrected version:
<?php
//return supported encodings in lowercase.
function mb_list_lowerencodings() { $r=mb_list_encodings();
for ($n=sizeOf($r); $n--; ) { $r[$n]=strtolower($r[$n]); } return $r;
}
// Receive a string with a mail header and returns it
// decoded to a specified charset.
// If the charset specified into a piece of text from header
// isn't supported by "mb", the "fallbackCharset" will be
// used to try to decode it.
function decodeMimeString($mimeStr, $inputCharset='utf-8', $targetCharset='utf-8', $fallbackCharset='iso-8859-1') {
$encodings=mb_list_lowerencodings();
$inputCharset=strtolower($inputCharset);
$targetCharset=strtolower($targetCharset);
$fallbackCharset=strtolower($fallbackCharset);
$decodedStr='';
$mimeStrs=imap_mime_header_decode($mimeStr);
for ($n=sizeOf($mimeStrs), $i=0; $i<$n; $i++) {
$mimeStr=$mimeStrs[$i];
$mimeStr->charset=strtolower($mimeStr->charset);
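// Append the text as-is when it is already in the target charset.
if (($mimeStr->charset == 'default' && $inputCharset == $targetCharset)
|| $mimeStr->charset == $targetCharset) {
$decodedStr.=$mimeStr->text;
} else {
// Otherwise convert it, using the fallback charset if "mb" does not know the part's charset.
$decodedStr.=mb_convert_encoding(
$mimeStr->text, $targetCharset,
(in_array($mimeStr->charset, $encodings) ?
$mimeStr->charset : $fallbackCharset)
);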
}
} return $decodedStr;
}
?>
The previous comment (from hans) seems to make no sense at all, since it does not change the encoding and may result in a "multi-encoding" string (which the browser and anything else will of course be unable to render).
I use a small function to decode the whole header to a specified encoding. It is as follows:
<?php
//return supported encodings in lowercase.
function mb_list_lowerencodings() { $r=mb_list_encodings();
for ($n=sizeOf($r); $n--; ) { $r[$n]=strtolower($r[$n]); } return $r;
}
// Receive a string with a mail header and returns it
// decoded to a specified charset.
// If the charset specified into a piece of text from header
// isn't supported by "mb", the "fallbackCharset" will be
// used to try to decode it.
function decodeMimeString($mimeStr, $inputCharset='utf-8', $targetCharset='utf-8', $fallbackCharset='iso-8859-1') {
$encodings=mb_list_lowerencodings();
$inputCharset=strtolower($inputCharset);
$targetCharset=strtolower($targetCharset);
$fallbackCharset=strtolower($fallbackCharset);
$decodedStr='';
$mimeStrs=imap_mime_header_decode($mimeStr);
for ($n=sizeOf($mimeStrs), $i=0; $i<$n; $i++) {
$mimeStr=$mimeStrs[$i];
$mimeStr->charset=strtolower($mimeStr->charset);
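// Append the text as-is when it is already in the target charset.
if (($mimeStr->charset == 'default' && $inputCharset == $targetCharset)
|| $mimeStr->charset == $targetCharset) {
$decodedStr.=$mimeStr->text;
} else {
// Otherwise convert it, using the fallback charset if "mb" does not know the part's charset.
$decodedStr.=mb_convert_encoding(
$mimeStr->text, $targetCharset,
(in_array($mimeStr->charset, $encodings) ?
$mimeStr->charset : $fallbackCharset)
);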
}
} return $decodedStr;
}
?>
Hope it helps.
This is obvious, but nevertheless here is a "flat" version:
<?php
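// Concatenates the text of all decoded parts, ignoring their charsets.
// (Add "private" again if you use this as a class method.)
function flatMimeDecode($string) {
$array = imap_mime_header_decode($string);
$str = "";
foreach ($array as $part) {
$str .= $part->text;
}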
return $str;
}
?>
In response to Sven dot Dickert at planb dot de: if you encounter problems with "=?utf-8?Q?" appearing in your headers, I found that simply using "imap_utf8($string)" decoded the "$string" properly and solved my problem perfectly.
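A minimal sketch of that approach follows. The wrapper name decodeHeaderUtf8() is hypothetical, and the fallback to iconv_mime_decode() is an assumption added here (not part of the note) so the snippet still runs where the IMAP extension is not loaded.

```php
<?php
// Hypothetical wrapper: decodes a MIME-encoded header value to UTF-8.
function decodeHeaderUtf8($header)
{
    if (function_exists('imap_utf8')) {
        // The approach from the note: let the IMAP extension do the work.
        return imap_utf8($header);
    }
    // Assumed fallback, not from the note: the iconv extension's decoder.
    return iconv_mime_decode($header, 0, 'UTF-8');
}

echo decodeHeaderUtf8('=?utf-8?Q?Petra_M=C3=BCller?='), "\n";
```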
<?php
function decode_iso88591($string){
if (strpos(strtolower($string), '=?iso-8859-1') === false) {
return $string;
}
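// Encoded-word layout: =?charset?encoding?text?=
$string = explode('?', $string);
// In Q encoding an underscore stands for a space (RFC 2047), so restore it before decoding.
return strtolower($string[2]) == 'q' ? quoted_printable_decode(str_replace('_', ' ', $string[3])) : base64_decode($string[3]);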
}
?>
This function does not encode a-z or 0-9, like the function posted by bandpay at hotmail dot com
function encode_iso88591($string) {
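// Encode only if the string contains something other than letters, digits or spaces.
if( preg_match('/[^A-Za-z0-9 ]/', $string) ) {
$text = "=?iso-8859-1?q?";
for( $i = 0 ; $i < strlen($string) ; $i++ ) {
if( preg_match('/[^A-Za-z0-9]/', $string[$i]) ) {
// Always emit two hex digits (dechex() alone would turn "\n" into "=a").
$text .= sprintf("=%02X", ord($string[$i]));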
} else {
$text .= $string[$i];
}
}
return $text."?=";
} else return $string;
}
I solved the utf-7 and utf-8 problem with imap_mime_header_decode using the function below. Note that it uses the iconv function for the charset conversion.
Replace "EUC-KR" in the iconv call with the charset you want, such as "iso-8859-1".
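function mime_decode($s) {
// Start with an empty result so the concatenation below does not use an undefined variable.
$decoded = '';
$elements = imap_mime_header_decode($s);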
for($i = 0;$i < count($elements);$i++) {
$charset = $elements[$i]->charset;
$text =$elements[$i]->text;
if(!strcasecmp($charset, "utf-8") ||
!strcasecmp($charset, "utf-7"))
{
$text = iconv($charset, "EUC-KR", $text);
}
$decoded = $decoded . $text;
}
return $decoded;
}
Beware! imap_mime_header_decode in 4.0.6 does _not_ conform to RFC 2047. The string =?utf-7?Q?Petra_M+APw-ller?= will not be converted into Petra Müller because the charset utf-7 is unknown. The same goes for =?utf-8?Q?Petra_M=C3=BCller?= and the charset utf-8.
I wrote a simple ks_c_5601-1987 (2-byte) encoder:
function encode_ksc5601( $string )
{
$encode = base64_encode( $string );
$text = "=?ks_c_5601-1987?B?";
$text = $text.$encode."?=";
return $text;
}
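The same pattern works for other charsets if the charset name is passed in as a parameter. This is only a sketch: encodeMimeB() is a hypothetical name, and the input is assumed to be encoded in the given charset already, since no conversion is done.

```php
<?php
// Hypothetical generalization of the encoder above: builds a single
// B-encoded (base64) word for any charset name.
function encodeMimeB($string, $charset)
{
    // $string must already be in $charset; this only wraps the bytes.
    return '=?' . $charset . '?B?' . base64_encode($string) . '?=';
}

echo encodeMimeB('abc', 'ks_c_5601-1987'), "\n";
```

RFC 2047 limits each encoded word to 75 characters, which this sketch does not enforce, so long strings would still need to be split into several encoded words.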