[PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

Posted by Rasmus Lerdorf 
htmlspecialchars(), htmlentities(), html_entity_decode() and
get_html_translation_table() all take an encoding parameter that used to
default to iso-8859-1. We changed the default in PHP 5.4 to UTF-8. This
is a much more sensible default and in the case of the encoding
functions more secure as it prevents invalid UTF-8 from getting through.
If you use 8859-1 as the default but your app is actually in UTF-8 or
worse, some encoding that isn't low-ascii compatible then
htmlspecialchars()/htmlentities() aren't doing what you think they are
and you have a glaring security hole in your app.

However, people are understandably lazy and don't want to think about
this stuff. They don't want to explicitly provide their input encoding
to these calls. We provided a solution to this and a way to write
portable apps and that was to pass in an empty string "" as the
encoding. If we saw this we would set the input encoding to match the
output encoding specified by the "default_charset" ini setting. We
couldn't just default to this default_charset because input and output
encodings may very well be different and we would risk making existing
apps insecure. For example an app using BIG5/CJK for its output encoding
might very well be pulling data from 8859/UTF-8 data sources and if we
invisibly switched htmlspecialchars/entities to match their output
encoding we would have problems. Invisibly switching them from 8859-1 to
UTF-8 could still be problematic, but at least it fails safe in that
it doesn't let invalid UTF-8 through and encodes low-ascii the same way
it did before.
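
A minimal sketch of the "encodes low-ascii the same way" point, assuming a PHP version that accepts an explicit charset argument. The sample strings are made up for illustration, and the flags are passed explicitly so the result does not depend on version-specific defaults.

```php
<?php
// The markup-significant characters are single ASCII bytes in both
// charsets, so they are escaped identically whichever one is named.
$markup = '<a href="x">Tom & Jerry</a>';

$asLatin1 = htmlspecialchars($markup, ENT_QUOTES, 'ISO-8859-1');
$asUtf8   = htmlspecialchars($markup, ENT_QUOTES, 'UTF-8');

var_dump($asLatin1 === $asUtf8);

// Well-formed multi-byte UTF-8 passes through untouched; only the
// special characters are replaced.
$text = "na\xC3\xAFve <b>";   // "naïve <b>" encoded as UTF-8
echo htmlspecialchars($text, ENT_QUOTES, 'UTF-8'), "\n";
```

The difference between the two modes only shows up for bytes above 0x7F, which is where the UTF-8 mode starts rejecting malformed input.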

The problem is that there is a lot of legacy code out there that doesn't
explicitly set the encoding on those calls and it is a lot of work to go
through and specify it on each call. I still personally prefer to have
people be explicit here, but I think it is slowing 5.4 adoption (see bug
61354).
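
One way legacy code could become explicit without touching every call by hand is to route the calls through a small wrapper. This is only a sketch: APP_CHARSET and escape_html() are hypothetical names, not part of PHP, and the charset value is an assumption about the application.

```php
<?php
// Hypothetical constant and helper; neither is part of PHP.
const APP_CHARSET = 'UTF-8';   // use 'ISO-8859-1' for a Latin-1 application

function escape_html($value, $charset = APP_CHARSET)
{
    // Pass the flags and the charset explicitly so the result does not
    // depend on which PHP version supplies the defaults.
    return htmlspecialchars((string) $value, ENT_QUOTES, $charset);
}

echo escape_html('<b>"hi"</b>'), "\n";
```

A search-and-replace from htmlspecialchars( to escape_html( would then leave a single place where the encoding is decided.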

In PHP 6 we tried to introduce separate input, script and output
encoding settings. Currently in 5.4 we don't have that, but we have
those 3 separately for mbstring and for iconv:

iconv.input_encoding
iconv.internal_encoding
iconv.output_encoding
mbstring.http_input
mbstring.internal_encoding
mbstring.http_output

Ideally we should be getting rid of the per-feature encoding settings
and have a single set of them that we refer to when we need them. This
is one of these places where we really need a default input encoding
setting. We could have it check mbstring.http_input, but there is a
wrinkle here that it has a fancy "auto" setting which we don't really
want in this case. So we could set it to iconv.input_encoding, but that
seems rather random and unintuitive.
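
To see what the six per-extension directives listed above are currently set to on a given installation, something like the following sketch can be used. The values depend entirely on the local php.ini and on which extensions are loaded, so no output is shown.

```php
<?php
// The six directive names are the ones listed in the message above.
$directives = array(
    'iconv.input_encoding',
    'iconv.internal_encoding',
    'iconv.output_encoding',
    'mbstring.http_input',
    'mbstring.internal_encoding',
    'mbstring.http_output',
);

$settings = array();
foreach ($directives as $name) {
    // ini_get() returns false when the directive is not registered,
    // for example because the extension is not loaded.
    $value = ini_get($name);
    $settings[$name] = ($value === false) ? '(not available)' : $value;
}

foreach ($settings as $name => $value) {
    printf("%-28s %s\n", $name, $value);
}
```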

So do we create a new default_input_encoding ini directive mid-stream in
5.4 for this? Of course with the longer-term in mind that this will be
part of a unified set of encoding settings in 5.5 and beyond.

-Rasmus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
On 23/08/12 17:06, Rasmus Lerdorf wrote:
> htmlspecialchars(), htmlentities(), html_entity_decode() and
> get_html_translation_table() all take an encoding parameter that used to
> default to iso-8859-1. We changed the default in PHP 5.4 to UTF-8. This
> is a much more sensible default and in the case of the encoding
> functions more secure as it prevents invalid UTF-8 from getting through.
> If you use 8859-1 as the default but your app is actually in UTF-8 or
> worse, some encoding that isn't low-ascii compatible then
> htmlspecialchars()/htmlentities() aren't doing what you think they are
> and you have a glaring security hole in your app.
>
> However, people are understandably lazy and don't want to think about
> this stuff. They don't want to explicitly provide their input encoding
> to these calls. We provided a solution to this and a way to write
> portable apps and that was to pass in an empty string "" as the
> encoding. If we saw this we would set the input encoding to match the
> output encoding specified by the "default_charset" ini setting. We
> couldn't just default to this default_charset because input and output
> encodings may very well be different and we would risk making existing
> apps insecure. For example an app using BIG5/CJK for its output encoding
> might very well be pulling data from 8859/UTF-8 data sources and if we
> invisibly switched htmlspecialchars/entities to match their output
> encoding we would have problems. Invisibly switching them from 8859-1 to
> UTF-8 could still be problematic, but at least it fails safe in that
> it doesn't let invalid UTF-8 through and encodes low-ascii the same way
> it did before.
>
> The problem is that there is a lot of legacy code out there that doesn't
> explicitly set the encoding on those calls and it is a lot of work to go
> through and specify it on each call. I still personally prefer to have
> people be explicit here, but I think it is slowing 5.4 adoption (see bug
> 61354).
>
> In PHP 6 we tried to introduce separate input, script and output
> encoding settings. Currently in 5.4 we don't have that, but we have
> those 3 separately for mbstring and for iconv:
>
> iconv.input_encoding
> iconv.internal_encoding
> iconv.output_encoding
> mbstring.http_input
> mbstring.internal_encoding
> mbstring.http_output
>
> Ideally we should be getting rid of the per-feature encoding settings
> and have a single set of them that we refer to when we need them. This
> is one of these places where we really need a default input encoding
> setting. We could have it check mbstring.http_input, but there is a
> wrinkle here that it has a fancy "auto" setting which we don't really
> want in this case. So we could set it to iconv.input_encoding, but that
> seems rather random and unintuitive.
>
> So do we create a new default_input_encoding ini directive mid-stream in
> 5.4 for this? Of course with the longer-term in mind that this will be
> part of a unified set of encoding settings in 5.5 and beyond.
>
> -Rasmus
>
Personally, I think you should have just two encodings: page_encoding
and internal_encoding. The former is for form input and page output
(could be latin-1, for instance), and internal_encoding is the internal
representation (default to utf-8 - you can deal with all of, say,
latin-1, as well as unicode entities). Input and output, on the web at
least, are almost always going to match.

--
Andrew Faulds
http://ajf.me/


On 08/23/2012 09:09 AM, Andrew Faulds wrote:
> Personally, I think you should have just two encodings: page_encoding
> and internal_encoding. The former is for form input and page output
> (could be latin-1, for instance), and internal_encoding is the internal
> representation (default to utf-8 - you can deal with all of, say,
> latin-1, as well as unicode entities). Input and output, on the web at
> least, are almost always going to match.

No, we need 3. The internal/script encoding doesn't have to be the same
as the input encoding. It isn't common in the Western world, but
elsewhere people do write their scripts in their local encoding which
may very well be different from their input and/or output encodings.

-Rasmus

On Thu, Aug 23, 2012 at 12:06 PM, Rasmus Lerdorf <[email protected]> wrote:
> So do we create a new default_input_encoding ini directive mid-stream in
> 5.4 for this? Of course with the longer-term in mind that this will be
> part of a unified set of encoding settings in 5.5 and beyond.

Yes! This is a fantastic idea.

Adam

On 23/08/12 17:15, Rasmus Lerdorf wrote:
> On 08/23/2012 09:09 AM, Andrew Faulds wrote:
>> Personally, I think you should have just two encodings: page_encoding
>> and internal_encoding. The former is for form input and page output
>> (could be latin-1, for instance), and internal_encoding is the internal
>> representation (default to utf-8 - you can deal with all of, say,
>> latin-1, as well as unicode entities). Input and output, on the web at
>> least, are almost always going to match.
> No, we need 3. The internal/script encoding doesn't have to be the same
> as the input encoding. It isn't common in the Western world, but
> elsewhere people do write their scripts in their local encoding which
> may very well be different from their input and/or output encodings.
>
> -Rasmus
Oh, you mean script encoding, form input/page output encoding and
internal representation?

Because I don't see a need for differing default input (i.e. file/form
input) and default output (i.e. page/file output) encodings.

--
Andrew Faulds
http://ajf.me/


On 23/08/12 18:06, Rasmus Lerdorf wrote:
> htmlspecialchars(), htmlentities(), html_entity_decode() and
> get_html_translation_table() all take an encoding parameter that used to
> default to iso-8859-1. We changed the default in PHP 5.4 to UTF-8. This
> is a much more sensible default and in the case of the encoding
> functions more secure as it prevents invalid UTF-8 from getting through.
> If you use 8859-1 as the default but your app is actually in UTF-8 or
> worse, some encoding that isn't low-ascii compatible then
> htmlspecialchars()/htmlentities() aren't doing what you think they are
> and you have a glaring security hole in your app.
I don't see how passing utf-8 as latin1 creates a security hole. The
characters you want to replace (&'"<>) are the same in utf-8 as in
latin1, and the function can't be tricked by desynchronization. If it
were passing latin1 to a function expecting utf-8 or "some encoding that
isn't low-ascii compatible" then I would see the hole, but not here.


> So do we create a new default_input_encoding ini directive mid-stream in
> 5.4 for this? Of course with the longer-term in mind that this will be
> part of a unified set of encoding settings in 5.5 and beyond.
Yes


On 08/24/2012 02:23 PM, Ángel González wrote:
> On 23/08/12 18:06, Rasmus Lerdorf wrote:
>> htmlspecialchars(), htmlentities(), html_entity_decode() and
>> get_html_translation_table() all take an encoding parameter that used to
>> default to iso-8859-1. We changed the default in PHP 5.4 to UTF-8. This
>> is a much more sensible default and in the case of the encoding
>> functions more secure as it prevents invalid UTF-8 from getting through.
>> If you use 8859-1 as the default but your app is actually in UTF-8 or
>> worse, some encoding that isn't low-ascii compatible then
>> htmlspecialchars()/htmlentities() aren't doing what you think they are
>> and you have a glaring security hole in your app.
> I don't see how passing utf-8 as latin1 creates a security hole. The
> characters you want to replace (&'"<>) are the same in utf-8 as in
> latin1, and the function can't be tricked by desynchronization. If it
> were passing latin1 to a function expecting utf-8 or "some encoding that
> isn't low-ascii compatible" then I would see the hole, but not here.

In 8859-1 no chars are invalid so anything that doesn't get encoded will
get passed through as-is. For example the byte 0xE0 is a perfectly valid
8859-1 character (à), but if the page is actually UTF-8 then this
becomes the first byte of a 3-byte UTF-8 character. IE is famous for
having a really weak Unicode parser and at least IE6/7 would see the
0xE0 and combine it with the next 2 bytes to form the UTF-8 char.

So, if you had code like this:

$str = htmlspecialchars($str); // Assuming iso-8859-1
echo '<a href="'.$str.'">';

You now have a problem because if the last byte of $str was character
0xE0 now IE will swallow the closing " and > characters in your output
leaving you in a very weird state. IE still thinks you are inside an
attribute in the <a> tag, but you think you are outside in regular HTML
mode and whatever you output next will now be filtered with the wrong
context and you have a potential XSS.

When htmlspecialchars() is in UTF-8 mode it will not allow invalid UTF-8
byte sequences through and you are safe from this particular problem.
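
A small sketch of the difference, using a made-up string that ends in the byte 0xE0. The flags are passed explicitly as ENT_QUOTES, without ENT_SUBSTITUTE or ENT_IGNORE, because those flags change how invalid input is handled and newer PHP versions include ENT_SUBSTITUTE in the default flags.

```php
<?php
// A string whose final byte is 0xE0: "à" in ISO-8859-1, but in UTF-8 only
// the lead byte of a three-byte sequence, so the string is not valid UTF-8.
$str = "page\xE0";

$latin1 = htmlspecialchars($str, ENT_QUOTES, 'ISO-8859-1');
$utf8   = htmlspecialchars($str, ENT_QUOTES, 'UTF-8');

// In ISO-8859-1 mode every byte is a valid character, so the dangling
// lead byte is passed through unchanged.
var_dump($latin1 === $str);

// In UTF-8 mode the incomplete sequence is rejected and the function
// returns an empty string instead of the input.
var_dump($utf8 === '');
```

Both comparisons evaluate to true, which is the "fails safe" behaviour: the bad byte never reaches the href attribute.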

-Rasmus

Hi,

I'm +1 for having internal/input/output/script encoding settings at the
PHP or Zend level.

If the default is the problem, we should make default_charset default
to UTF-8 and use it as the default for internal/input/output/script
and for functions that are affected by encoding.

When the XSS advisory was released in Feb. 2000, it stated that the
encoding MUST be specified in the HTTP response header. Setting
default_charset is the best practice from a security perspective anyway.

If we use default_charset as the default encoding, the transition to 5.4
might be easier.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net


On 25/08/12 00:50, Rasmus Lerdorf wrote:
> In 8859-1 no chars are invalid so anything that doesn't get encoded will
> get passed through as-is. For example the byte 0xE0 is a perfectly valid
> 8859-1 character (à), but if the page is actually UTF-8 then this
> becomes the first byte of a 3-byte UTF-8 character. IE is famous for
> having a really weak Unicode parser and at least IE6/7 would see the
> 0xE0 and combine it with the next 2 bytes to form the UTF-8 char.
>
> So, if you had code like this:
>
> $str = htmlspecialchars($str); // Assuming iso-8859-1
> echo '<a href="'.$str.'">';
>
> You now have a problem because if the last byte of $str was character
> 0xE0 now IE will swallow the closing " and > characters in your output
> leaving you in a very weird state. IE still thinks you are inside an
> attribute in the <a> tag, but you think you are outside in regular HTML
> mode and whatever you output next will now be filtered with the wrong
> context and you have a potential XSS.
>
> When htmlspecialchars() is in UTF-8 mode it will not allow invalid UTF-8
> byte sequences through and you are safe from this particular problem.
>
> -Rasmus
I see. Thank you very much.
Even worse, HTML5 doesn't seem to have any provision for that, as it works
with characters. A user agent would have to protect itself from this by
making that kind of utf-8 character a hard error instead of trying to
recover from it.



Hi,

2012/8/26 Ángel González <[email protected]>:
> Even worse, HTML5 doesn't seem to have any provision for that, as it works
> with characters. A user agent would have to protect itself from this by
> making that kind of utf-8 character a hard error instead of trying to
> recover from it.

Right. I would like to have this behavior.
However, who is going to use a browser that raises a fatal error
for bad encoding while others just render it as safely as possible?

Sending a valid encoding is the programmer's task. Enforcing and
setting default_charset is more secure and has been best practice since 2000.

Why don't we set the default? I think UTF-8 is good for most users.

BTW, Ruby on Rails depends on Ruby's exception for badly encoded
output. We could do the same thing with an output buffer and
mb_check_encoding(), but the programmer should validate inputs and
ensure valid encoding in the first place. Output-time validation should
be a fail-safe, IMHO.
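
A rough sketch of validating input up front. The helper name is_valid_utf8() is hypothetical, and the PCRE fallback is an addition for installations without mbstring; it relies on the /u modifier making preg_match() fail on malformed subjects.

```php
<?php
// Hypothetical helper: true when $s is well-formed UTF-8.
function is_valid_utf8($s)
{
    if (function_exists('mb_check_encoding')) {
        return mb_check_encoding($s, 'UTF-8');
    }
    // Fallback without mbstring: in /u mode preg_match() returns false
    // for a malformed UTF-8 subject, and 1 when the empty pattern matches.
    return preg_match('//u', $s) === 1;
}

var_dump(is_valid_utf8("na\xC3\xAFve"));  // well-formed two-byte sequence
var_dump(is_valid_utf8("page\xE0"));      // dangling lead byte
```

Rejecting the request when the check fails keeps the output-time check as a second line of defence rather than the only one.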

If there aren't any better ideas, I'm willing to write a patch for this,
i.e. set the default for default_charset to UTF-8, create system-wide
input/output/internal encoding settings and use them as defaults.

Any ideas?

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

On 08/25/2012 12:59 PM, Ángel González wrote:
> I see. Thank you very much.
> Even worse, HTML5 doesn't seem to have any provision for that, as it works
> with characters. A user agent would have to protect itself from this by
> making that kind of utf-8 character a hard error instead of trying to
> recover from it.

We essentially treat it as a hard error because these functions will
return an empty string if they see any invalid chars. They won't try to
fix them in any way. This is what people are complaining about, by the
way. In most cases they are actually sending stuff out in UTF-8 but
were relying on the html* functions passing everything through, so
while they look at it as a BC break, it is actually fixing a security
problem in their applications.

Now if they really are using iso-8859-1 as their input and output
encodings, then yes, we have broken things for them and they will need
to specify their charset. This is the case I was wondering whether we
could improve, making their lives easier by adding a
default_input_encoding setting that these functions would use.

-Rasmus

Hi!

> In PHP 6 we tried to introduce separate input, script and output
> encoding settings. Currently in 5.4 we don't have that, but we have
> those 3 separately for mbstring and for iconv:
>
> iconv.input_encoding
> iconv.internal_encoding
> iconv.output_encoding
> mbstring.http_input
> mbstring.internal_encoding
> mbstring.http_output
>
> Ideally we should be getting rid of the per-feature encoding settings
> and have a single set of them that we refer to when we need them. This

I agree, having unified set of encodings would be a good thing. However,
I have a feeling most of the people won't really understand what these
three do, and would never bother to set them. From my experience, people
don't even bother to set PHP timezone, even though PHP complains each
time date function is accessed. So these will be left as default in
99.999% of cases.

> So do we create a new default_input_encoding ini directive mid-stream in
> 5.4 for this? Of course with the longer-term in mind that this will be
> part of a unified set of encoding settings in 5.5 and beyond.

What happens to these 6 directives? Will we now have 9 directives for
setting the encoding? This reminds me of: http://xkcd.com/927/. Having
yet more settings is not really a solution to the problem of too many
different settings. So unless we deprecate all others in 5.5 and have
people use only generic ones it's not very useful. If we do deprecate
them, we need some kind of migration path - i.e. if you set
iconv.input_encoding what actually happens? If you set
default_input_encoding will it also set mbstring.http_input - or will it
affect mbstring without actually setting it?
I guess we'd need a good detailed RFC on this :)

--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

On 08/26/2012 02:57 PM, Yasuo Ohgaki wrote:
> Hi,
>
> 2012/8/27 Stas Malyshev <[email protected]>:
>> Hi!
>>
>>> In PHP 6 we tried to introduce separate input, script and output
>>> encoding settings. Currently in 5.4 we don't have that, but we have
>>> those 3 separately for mbstring and for iconv:
>>>
>>> iconv.input_encoding
>>> iconv.internal_encoding
>>> iconv.output_encoding
>>> mbstring.http_input
>>> mbstring.internal_encoding
>>> mbstring.http_output
>>>
>>> Ideally we should be getting rid of the per-feature encoding settings
>>> and have a single set of them that we refer to when we need them. This
>>
>> I agree, having unified set of encodings would be a good thing. However,
>> I have a feeling most of the people won't really understand what these
>> three do, and would never bother to set them. From my experience, people
>> don't even bother to set PHP timezone, even though PHP complains each
>> time date function is accessed. So these will be left as default in
>> 99.999% of cases.
>
> I agree. Other than applications that are made by CJK native, I rarely
> see them set.
>
>>
>>> So do we create a new default_input_encoding ini directive mid-stream in
>>> 5.4 for this? Of course with the longer-term in mind that this will be
>>> part of a unified set of encoding settings in 5.5 and beyond.
>>
>> What happens to these 6 directives? Will we now have 9 directives for
>> setting the encoding? This reminds me of: http://xkcd.com/927/. Having
>> yet more settings is not really a solution to the problem of too many
>> different settings. So unless we deprecate all others in 5.5 and have
>> people use only generic ones it's not very useful. If we do deprecate
>> them, we need some kind of migration path - i.e. if you set
>> iconv.input_encoding what actually happens? If you set
>> default_input_encoding will it also set mbstring.http_input - or will it
>> affect mbstring without actually setting it?
>> I guess we'd need a good detailed RFC on this :)
>
> If I write a patch for it, I'll modify iconv.*/mbstring.* to use php.* (or zend.*).
> When default_charset is set and the other settings are null, use it as
> the default for everything, including htmlentities(), mb_*(), etc.
>
> default_charset will be the single encoding configuration if the user
> uses a single encoding for the application.
>
> How to deal with iconv.*/mbstring.*:
> master: remove iconv.*/mbstring.*
> 5.4: iconv.*/mbstring.* remain for compatibility and are used if they are set.
>
> We could remove iconv.*/mbstring.* for 5.4. It's a big change for CJK
> users but they will be okay with it. Almost all users are using a single
> encoding for their application anyway.
>
> I think removing iconv.*/mbstring.* for master and 5.4 would be nicer.
> Any opinions?

We can't remove them in 5.4. We can add new ones without breaking
anything and we can make mbstring/iconv/html* use those if they are set
and then mark the mbstring/iconv settings as deprecated in master.

-Rasmus

Hi,

2012/8/27 Stas Malyshev <[email protected]>:
> Hi!
>
>> In PHP 6 we tried to introduce separate input, script and output
>> encoding settings. Currently in 5.4 we don't have that, but we have
>> those 3 separately for mbstring and for iconv:
>>
>> iconv.input_encoding
>> iconv.internal_encoding
>> iconv.output_encoding
>> mbstring.http_input
>> mbstring.internal_encoding
>> mbstring.http_output
>>
>> Ideally we should be getting rid of the per-feature encoding settings
>> and have a single set of them that we refer to when we need them. This
>
> I agree, having unified set of encodings would be a good thing. However,
> I have a feeling most of the people won't really understand what these
> three do, and would never bother to set them. From my experience, people
> don't even bother to set PHP timezone, even though PHP complains each
> time date function is accessed. So these will be left as default in
> 99.999% of cases.

I agree. Other than applications that are made by CJK native, I rarely
see them set.

>
>> So do we create a new default_input_encoding ini directive mid-stream in
>> 5.4 for this? Of course with the longer-term in mind that this will be
>> part of a unified set of encoding settings in 5.5 and beyond.
>
> What happens to these 6 directives? Will we now have 9 directives for
> setting the encoding? This reminds me of: http://xkcd.com/927/. Having
> yet more settings is not really a solution to the problem of too many
> different settings. So unless we deprecate all others in 5.5 and have
> people use only generic ones it's not very useful. If we do deprecate
> them, we need some kind of migration path - i.e. if you set
> iconv.input_encoding what actually happens? If you set
> default_input_encoding will it also set mbstring.http_input - or will it
> affect mbstring without actually setting it?
> I guess we'd need a good detailed RFC on this :)

If I write a patch for it, I'll modify iconv.*/mbstring.* to use php.* (or zend.*).
When default_charset is set and the other settings are null, use it as
the default for everything, including htmlentities(), mb_*(), etc.

default_charset will be the single encoding configuration if the user
uses a single encoding for the application.

How to deal with iconv.*/mbstring.*:
master: remove iconv.*/mbstring.*
5.4: iconv.*/mbstring.* remain for compatibility and are used if they are set.

We could remove iconv.*/mbstring.* for 5.4. It's a big change for CJK
users but they will be okay with it. Almost all users are using a single
encoding for their application anyway.

I think removing iconv.*/mbstring.* for master and 5.4 would be nicer.
Any opinions?

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

Hi,

I've created an RFC page so that this discussion will not be forgotten.

https://wiki.php.net/rfc/default_encoding

Please edit the RFC page if needed.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
