



mb_encode_mimeheader

Encode string for MIME header (PHP 4 >= 4.0.6, PHP 5)
string mb_encode_mimeheader ( string str [, string charset [, string transfer_encoding [, string linefeed [, int indent]]]] )

mb_encode_mimeheader() encodes a given string str by the MIME header encoding scheme. Returns a converted version of the string represented in ASCII.

charset specifies the name of the character set in which str is represented. The default value is determined by the current NLS setting (mbstring.language).

transfer_encoding specifies the scheme of MIME encoding. It should be either "B" (Base64) or "Q" (Quoted-Printable). Falls back to "B" if not given.
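
To make the difference between the two schemes concrete, here is a minimal sketch that encodes the same string both ways. It assumes the mbstring extension is loaded and that the input is UTF-8; the sample text and variable names are illustrative only.

```php
<?php
// Assumption: the source string is UTF-8, so tell mbstring as much.
mb_internal_encoding('UTF-8');

$subject = 'Äpfel und Birnen';  // illustrative sample text

// "B" wraps Base64 data in an encoded-word: =?charset?B?...?=
$b = mb_encode_mimeheader($subject, 'UTF-8', 'B');

// "Q" uses a Quoted-Printable-like form: =?charset?Q?...?=
$q = mb_encode_mimeheader($subject, 'UTF-8', 'Q');

echo $b, "\n", $q, "\n";
```

Both results contain only ASCII characters. "Q" tends to stay partly readable for mostly-Latin text, while "B" is usually more compact for text that is mostly non-ASCII.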

linefeed specifies the EOL (end-of-line) marker with which mb_encode_mimeheader() performs line folding (an RFC term for breaking a line longer than a certain length into multiple lines; the length is currently hard-coded to 74 characters). Falls back to "\r\n" (CRLF) if not given.
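
Folding is easiest to see with a string that cannot fit on one line. The following sketch, which assumes mbstring is available and the input is UTF-8, encodes a long subject and splits the result on the linefeed to show the individual physical lines. The sample text is made up for illustration.

```php
<?php
mb_internal_encoding('UTF-8');

// Illustrative long subject: 40 repetitions of a two-character word.
$long = str_repeat('日本', 40);

// Pass the default linefeed explicitly so the fold points are easy to find.
$encoded = mb_encode_mimeheader($long, 'UTF-8', 'B', "\r\n");

// Each element is one physical header line.
$lines = explode("\r\n", $encoded);

foreach ($lines as $n => $line) {
    printf("%d: %s (%d bytes)\n", $n, $line, strlen($line));
}
```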

Example 1399. mb_encode_mimeheader() example

<?php
$name = ""; // kanji
$mbox = "kru";
$doma = "gtinn.mon";
$addr = mb_encode_mimeheader($name, "UTF-7", "Q") . " <" . $mbox . "@" . $doma . ">";
echo $addr;
?>


Note:

This function isn't designed to break lines at higher-level contextual break points (word boundaries, etc.). This behaviour may clutter up the original string with unexpected spaces.

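The indent parameter (the indentation of the first line, that is, the number of characters in the header before str) was added as of PHP 5.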

See also mb_decode_mimeheader().

Code Examples / Notes » mb_encode_mimeheader

nigrez

True, the function is broken (PHP 5.1, encoding from UTF-8 with the pl_PL charset). Below is a version of the proposed _mb_mime_encode that is about 15% faster. Its signature is also more like the other mb_* functions, and it doesn't trigger any errors, warnings, or notices.
<?php
function mb_mime_header($string, $encoding=null, $linefeed="\r\n") {
 if(!$encoding) $encoding = mb_internal_encoding();
 $encoded = '';
 while($length = mb_strlen($string, $encoding)) { // measure with the same encoding used by mb_substr
   $encoded .= "=?$encoding?B?"
            . base64_encode(mb_substr($string,0,24,$encoding))
            . "?=$linefeed";
   $string = mb_substr($string,24,$length,$encoding);
 }
 return $encoded;
}
?>


stormflycut

A tip for anyone who uses national characters and has problems with UTF-8, for example in a mail subject: before calling mb_encode_mimeheader() with UTF-8, call mb_internal_encoding('UTF-8').
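
The likely reason is that the function reads str in the current internal encoding and converts it to charset, so a mismatched internal encoding garbles the result. A minimal sketch of the suggested order of calls, assuming the subject string is UTF-8 and mbstring is loaded (the sample subject is illustrative):

```php
<?php
// Assumption: $subject arrives as UTF-8 (e.g. from a UTF-8 form or source file).
$subject = 'Zażółć gęślą jaźń';

// Set the internal encoding first so mbstring reads the bytes as UTF-8.
mb_internal_encoding('UTF-8');

$header = 'Subject: ' . mb_encode_mimeheader($subject, 'UTF-8', 'B');

echo $header, "\n";
```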

masataka

The second parameter, charset, is a character encoding name, but the default must be UTF-8 on PHP 4.3.1.

gullevek

Read this FIRST: http://bugs.php.net/bug.php?id=23192 because mb_encode_mimeheader() is BUGGY!
A workaround for the broken multibyte output with overly long ISO-2022-JP subjects:
$pos = 0;
$split = 36; // after 36 single-byte characters, a following multibyte character gets broken
$_string = ''; // initialise the accumulator to avoid an undefined-variable notice
while ($pos < mb_strlen($string, $encoding))
{
 $output = mb_strimwidth($string, $pos, $split, "", $encoding);
 $pos += mb_strlen($output, $encoding);
 $_string .= (($_string) ? ' ' : '') . mb_encode_mimeheader($output, $encoding);
}
$string = $_string;
It is not the best, but it works.


chappy

In countries that use characters outside US-ASCII, this is a very good example for sending mail:
mb_internal_encoding('iso-8859-2');
setlocale(LC_CTYPE, 'hu_HU');
function encode($str, $charset) {
    $str = mb_encode_mimeheader(trim($str), $charset, 'Q', "\n\t");
    return $str;
}
print encode('the text with spec. chars: ő Ű Ő ű, á', 'iso-8859-2');
It creates a 7-bit string.


paravoid

If mb_ version doesn't work for you in MIME-B mode:
function encode_mimeheader($string, $charset=null, $linefeed="\r\n") {
    if (!$charset)
        $charset = mb_internal_encoding();
    $start = "=?$charset?B?";
    $end = "?=";
    $encoded = '';
    /* Each line must have length <= 75, including $start and $end */
    $length = 75 - strlen($start) - strlen($end);
    /* Average multi-byte ratio */
    $ratio = mb_strlen($string, $charset) / strlen($string);
    /* Base64 has a 4:3 ratio */
    $magic = $avglength = floor(3 * $length * $ratio / 4);
    for ($i=0; $i <= mb_strlen($string, $charset); $i+=$magic) {
        $magic = $avglength;
        $offset = 0;
        /* Recalculate magic for each line to be 100% sure */
        do {
            $magic -= $offset;
            $chunk = mb_substr($string, $i, $magic, $charset);
            $chunk = base64_encode($chunk);
            $offset++;
        } while (strlen($chunk) > $length);
        if ($chunk)
            $encoded .= ' '.$start.$chunk.$end.$linefeed;
    }
    /* Chomp the first space and the last linefeed */
    $encoded = substr($encoded, 1, -strlen($linefeed));
    return $encoded;
}
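
The length arithmetic in the comments above can be checked in isolation. This sketch computes how many raw bytes fit into a single Base64 encoded-word under the 75-character limit the note uses. The function name maxPayloadBytes is hypothetical and not part of mbstring.

```php
<?php
// Hypothetical helper: raw bytes that fit in one "B" encoded-word.
function maxPayloadBytes(string $charset, int $limit = 75): int
{
    // Fixed overhead: "=?charset?B?" in front and "?=" behind.
    $overhead = strlen("=?$charset?B?") + strlen('?=');

    // Characters left for the Base64 text itself.
    $room = $limit - $overhead;

    // Base64 emits 4 characters per 3 input bytes, so only whole
    // 4-character groups count.
    return intdiv($room, 4) * 3;
}

echo maxPayloadBytes('UTF-8'), "\n";        // 45
echo maxPayloadBytes('ISO-2022-JP'), "\n";  // 42
```

These are byte budgets, not character counts. A multibyte character must not be split across two encoded-words, so the number of characters per line can be lower, which is why the function above re-checks each chunk.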


iwakura

I think mb_encode_mimeheader() still has a bug. Here is sample code that works around it:
function mb_encode_mimeheader2($string, $encoding = "ISO-2022-JP") {
    $string_array = array();
    $pos = 0;
    $row = 0;
    $mode = 0;

    while ($pos < mb_strlen($string)) {
        $word = mb_strimwidth($string, $pos, 1);
        if (!$word) {
            $word = mb_strimwidth($string, $pos, 2);
        }
        if (mb_ereg_match("[ -~]", $word)) { // ascii
            if ($mode != 1) {
                $row++;
                $mode = 1;
                $string_array[$row] = NULL;
            }
        } else { // multibyte
            if ($mode != 2) {
                $row++;
                $mode = 2;
                $string_array[$row] = NULL;
            }
        }
        $string_array[$row] .= $word;
        $pos++;
    }

    //echo "<pre>";
    //print_r($string_array);
    //echo "</pre>";

    foreach ($string_array as $key => $value) {
        $value = mb_convert_encoding($value, $encoding);
        $string_array[$key] = mb_encode_mimeheader($value, $encoding);
    }

    //echo "<pre>";
    //print_r($string_array);
    //echo "</pre>";

    return implode("", $string_array);
}
is not the best, but it works


chappy

I found a bad function.
<?php
function encodeHeader($input, $charset = 'ISO-8859-2')
{
    preg_match_all('/(\\w*[\\x80-\\xFF]+\\w*)/', $input, $matches);
    foreach ($matches[1] as $value) {
        $replacement = preg_replace('/([\\x80-\\xFF])/e', '"=" . strtoupper(dechex(ord("\\1")))', $value);
        $input = str_replace($value, '=?' . $charset . '?Q?' . $replacement . '?=', $input);
    }
    return $input;
}
?>
This function should be used:
<?php
function encodeHeader($input, $charset = 'ISO-8859-2')
{
    $m = preg_match_all('/(\w*[\x80-\xFF]+\w*)/', $input, $matches);
    if ($m) $input = mb_encode_mimeheader($input, $charset, 'Q');
    return $input;
}
?>


mortoray

At least for Q encoding, this function is unsafe and does not encode correctly. Raw characters which appear as RFC2047 sequences are simply left as is.
Ex:
mb_encode_mimeheader( '=?iso-8859-1?q?this=20is=20some=20text?=' );
returns '=?iso-8859-1?q?this=20is=20some=20text?='
This is the exact same string, which is obviously not a correct encoding of the source string; mb_encode_mimeheader() does not do any kind of escaping.
Consequently, the following condition is not always true:
   mb_decode_mimeheader( mb_encode_mimeheader( $text ) ) == $text
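
One way to notice such input before it reaches the encoder is a simple pattern check. This is only a sketch: the helper name is hypothetical, the pattern is a loose approximation of the RFC 2047 encoded-word syntax, and what to do with a match (reject, escape, or force encoding) is left to the caller.

```php
<?php
// Hypothetical guard: does the text contain something shaped like
// an RFC 2047 encoded-word (=?charset?B-or-Q?data?=)?
function looksLikeEncodedWord(string $text): bool
{
    return preg_match('/=\?[^?\s]+\?[bq]\?[^?\s]*\?=/i', $text) === 1;
}

var_dump(looksLikeEncodedWord('=?iso-8859-1?q?this=20is=20some=20text?=')); // bool(true)
var_dump(looksLikeEncodedWord('plain subject line'));                       // bool(false)
```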

