mb_encode_mimeheader
Encode string for MIME header
(PHP 4 >= 4.0.6, PHP 5)
mb_encode_mimeheader() encodes a given string str by the MIME header encoding scheme. Returns a converted version of the string represented in ASCII.
charset specifies the name of the character set in which str is represented. The default value is determined by the current NLS setting (mbstring.language).
transfer_encoding specifies the scheme of MIME encoding. It should be either "B" (Base64) or "Q" (Quoted-Printable). Falls back to "B" if not given.
linefeed specifies the EOL (end-of-line) marker with which mb_encode_mimeheader() performs line-folding (an RFC term: the act of breaking a line longer than a certain length into multiple lines. The length is currently hard-coded to 74 characters). Falls back to "\r\n" (CRLF) if not given.
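A minimal sketch of a call that passes the three optional parameters described above. The display name and address are made up for illustration, and the snippet assumes that the mbstring extension is loaded and that the file is saved as UTF-8.

```php
<?php
// Assumption: mbstring is loaded and this source file is saved as UTF-8.
mb_internal_encoding('UTF-8');

// Hypothetical display name and address, used only for illustration.
$name = 'Zoë Müller';
$addr = 'zoe@example.com';

// Same input, both transfer encodings, CRLF as the line-folding marker.
$encodedB = mb_encode_mimeheader($name, 'UTF-8', 'B', "\r\n"); // Base64
$encodedQ = mb_encode_mimeheader($name, 'UTF-8', 'Q', "\r\n"); // Quoted-Printable

// Only the display name needs MIME encoding; the address itself is ASCII.
echo 'From: ', $encodedB, ' <', $addr, ">\n";
echo 'From: ', $encodedQ, ' <', $addr, ">\n";
```

"B" tends to be more compact for text that is mostly non-ASCII, while "Q" keeps ASCII letters readable in the raw header.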
Note:
This function isn't designed to break lines at higher-level contextual break points (word boundaries, etc.). This behaviour may clutter up the original string with unexpected spaces.
The indent parameter was added as of PHP 5.
See also mb_decode_mimeheader().
Code Examples / Notes
nigrez
True, the function is broken (PHP 5.1, encoding from UTF-8 with the pl_PL charset). Below is a version of the proposed _mb_mime_encode that is about 15% faster. Its signature is also more like the other mb_* functions, and it doesn't trigger any errors, warnings, or notices.
<?php
function mb_mime_header($string, $encoding = null, $linefeed = "\r\n") {
    if (!$encoding) $encoding = mb_internal_encoding();
    $encoded = '';
    // Measure the string in $encoding, the same encoding mb_substr() uses below
    while ($length = mb_strlen($string, $encoding)) {
        // Encode the next 24 characters as one Base64 encoded-word
        $encoded .= "=?$encoding?B?" . base64_encode(mb_substr($string, 0, 24, $encoding)) . "?=$linefeed";
        // Drop the characters that were just encoded
        $string = mb_substr($string, 24, $length, $encoding);
    }
    return $encoded;
}
?>
stormflycut
A solution for those who use national characters and have problems with UTF-8, for example in a mail subject: before you call mb_encode_mimeheader() with UTF-8, set mb_internal_encoding('UTF-8').
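The advice above can be sketched as follows. The subject text is a made-up example, and the snippet assumes mbstring is loaded and the file is saved as UTF-8. Saving and restoring the previous internal encoding is an optional extra, not something the note requires.

```php
<?php
// Assumption: mbstring is loaded and this source file is saved as UTF-8.
$previous = mb_internal_encoding();  // remember the current setting
mb_internal_encoding('UTF-8');       // tell mbstring how $subject is encoded

// Hypothetical subject containing Polish characters.
$subject = 'Zażółć gęślą jaźń';
$encodedSubject = mb_encode_mimeheader($subject, 'UTF-8', 'B');

mb_internal_encoding($previous);     // restore the previous setting

echo 'Subject: ', $encodedSubject, "\n";
```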
masataka
The second parameter, charset, is the character encoding name, but the default must be UTF-8 on PHP 4.3.1.
gullevek
Read this FIRST: http://bugs.php.net/bug.php?id=23192 because mb_encode_mimeheader is BUGGY! Here is a workaround for the broken multibyte output with overly long subjects in ISO-2022-JP:
$_string = ''; // initialise the accumulator to avoid an undefined-variable notice
$pos = 0;
$split = 36; // after 36 single-byte characters, a following multibyte character gets broken
while ($pos < mb_strlen($string, $encoding)) {
    // Take the next chunk of at most $split columns
    $output = mb_strimwidth($string, $pos, $split, "", $encoding);
    $pos += mb_strlen($output, $encoding);
    // Encode each chunk separately and join the chunks with a space
    $_string .= (($_string) ? ' ' : '') . mb_encode_mimeheader($output, $encoding);
}
$string = $_string;
It is not the best, but it works.
chappy
In countries where non-US-ASCII characters are used, this is a very good example for sending mail:
mb_internal_encoding('iso-8859-2');
setlocale(LC_CTYPE, 'hu_HU');
function encode($str, $charset) {
    // Q encoding, folded with a newline followed by a tab
    $str = mb_encode_mimeheader(trim($str), $charset, 'Q', "\n\t");
    return $str;
}
print encode('the text with spec. chars: ő Ű Ő ű, á', 'iso-8859-2');
It creates a 7-bit string.
paravoid
If the mb_ version doesn't work for you in MIME-B mode:
function encode_mimeheader($string, $charset = null, $linefeed = "\r\n") {
    if (!$charset) $charset = mb_internal_encoding();
    /* Nothing to encode; this also avoids a division by zero below */
    if ($string === '') return '';
    $start = "=?$charset?B?";
    $end = "?=";
    $encoded = '';
    /* Each line must have length <= 75, including $start and $end */
    $length = 75 - strlen($start) - strlen($end);
    /* Average multi-byte ratio */
    $ratio = mb_strlen($string, $charset) / strlen($string);
    /* Base64 has a 4:3 ratio */
    $magic = $avglength = (int) floor(3 * $length * $ratio / 4);
    for ($i = 0; $i <= mb_strlen($string, $charset); $i += $magic) {
        $magic = $avglength;
        $offset = 0;
        /* Recalculate magic for each line to be 100% sure */
        do {
            $magic -= $offset;
            $chunk = mb_substr($string, $i, $magic, $charset);
            $chunk = base64_encode($chunk);
            $offset++;
        } while (strlen($chunk) > $length);
        if ($chunk) $encoded .= ' ' . $start . $chunk . $end . $linefeed;
    }
    /* Chomp the first space and the last linefeed */
    $encoded = substr($encoded, 1, -strlen($linefeed));
    return $encoded;
}
iwakura
I think mb_encode_mimeheader still has a bug. Here is sample code:
function mb_encode_mimeheader2($string, $encoding = "ISO-2022-JP") {
    $string_array = array();
    $pos = 0;
    $row = 0;
    $mode = 0;
    while ($pos < mb_strlen($string)) {
        // Try a width of 1 first; fall back to 2 for full-width characters
        $word = mb_strimwidth($string, $pos, 1);
        if ($word === '') { // strict check, so that the character "0" is not treated as empty
            $word = mb_strimwidth($string, $pos, 2);
        }
        if (mb_ereg_match("[ -~]", $word)) {
            // ascii
            if ($mode != 1) {
                $row++;
                $mode = 1;
                $string_array[$row] = NULL;
            }
        } else {
            // multibyte
            if ($mode != 2) {
                $row++;
                $mode = 2;
                $string_array[$row] = NULL;
            }
        }
        $string_array[$row] .= $word;
        $pos++;
    }
    //echo "<pre>";
    //print_r($string_array);
    //echo "</pre>";
    // Convert and encode each ASCII or multibyte run separately
    foreach ($string_array as $key => $value) {
        $value = mb_convert_encoding($value, $encoding);
        $string_array[$key] = mb_encode_mimeheader($value, $encoding);
    }
    //echo "<pre>";
    //print_r($string_array);
    //echo "</pre>";
    return implode("", $string_array);
}
It is not the best, but it works.
chappy
I found a bad function.
<?php
// Note: the /e modifier used here was removed in PHP 7
function encodeHeader($input, $charset = 'ISO-8859-2') {
    preg_match_all('/(\\w*[\\x80-\\xFF]+\\w*)/', $input, $matches);
    foreach ($matches[1] as $value) {
        $replacement = preg_replace('/([\\x80-\\xFF])/e', '"=" . strtoupper(dechex(ord("\\1")))', $value);
        $input = str_replace($value, '=?' . $charset . '?Q?' . $replacement . '?=', $input);
    }
    return $input;
}
?>
This function should be used:
<?php
function encodeHeader($input, $charset = 'ISO-8859-2') {
    // Encode only when the input contains at least one non-ASCII byte
    $m = preg_match_all('/(\w*[\x80-\xFF]+\w*)/', $input, $matches);
    if ($m) $input = mb_encode_mimeheader($input, $charset, 'Q');
    return $input;
}
?>
mortoray
At least for Q encoding, this function is unsafe and does not encode correctly. Raw characters which appear as RFC 2047 sequences are simply left as is. Example:
mb_encode_mimeheader( '=?iso-8859-1?q?this=20is=20some=20text?=' );
returns
'=?iso-8859-1?q?this=20is=20some=20text?='
The exact same string, which is obviously not the encoding for the source string. That is, mb_encode_mimeheader does not do any type of escaping, so the following condition is not always true:
mb_decode_mimeheader( mb_encode_mimeheader( $text ) ) == $text
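One way to check the round-trip condition above on your own PHP build is a small helper like the one below. This is only a sketch: roundTrips() is a hypothetical name, the sample strings are illustrative, and the result for the second sample may differ between PHP versions, so no particular output is claimed here.

```php
<?php
// Assumption: mbstring is loaded. roundTrips() is a hypothetical helper name.
mb_internal_encoding('UTF-8');

// Returns true when decoding the encoded header gives back the original text.
function roundTrips($text, $charset = 'UTF-8', $scheme = 'Q')
{
    $encoded = mb_encode_mimeheader($text, $charset, $scheme);
    return mb_decode_mimeheader($encoded) === $text;
}

$samples = array(
    'hello',
    '=?iso-8859-1?q?this=20is=20some=20text?=', // already looks like an encoded-word
);

foreach ($samples as $sample) {
    echo var_export($sample, true), ' => ', roundTrips($sample) ? 'ok' : 'MISMATCH', "\n";
}
```

If untrusted text can reach a header, running such a check (or rejecting input that contains "=?") before sending is a cautious option.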