[PHP-DEV] default_encoding RFC – aftermath

Christoph M. Becker
[PHP-DEV] default_encoding RFC – aftermath
August 12, 2018 07:10PM
Hi!

Quite a while ago we accepted the “Use default_charset As Default
Character Encoding” RFC[1], which also contains the clause “Old
iconv.*/mbstring.* php.ini parameters will be removed for master PHP6”
(obviously that was written before we decided to skip PHP 6, and as such
would have to be applied to PHP 7). I'm quite happy, though, that this
hasn't happened yet, since it seems that the encoding names may be
different between the extensions. At least the php.net manual states[2]:

| Some systems (like IBM AIX) use "ISO8859-1" instead of "ISO-8859-1" so
| this value has to be used in configuration options and function
| parameters.

It appears to be worthwhile to review the deprecation of the respective
iconv.* ini options. Not quite sure about the mbstring.* ini options.

Also, if we stick with the deprecation of the iconv.* ini settings, we
should also deprecate iconv_set_encoding(), which currently changes the
deprecated iconv.* ini options, and as such throws a deprecation
warning. However, if the ini options were removed, what should the
function do? Nothing? Change the php.*_encoding settings? In my
opinion, both would be rather confusing. iconv_get_encoding() has a
similar, albeit minor, issue.

[1] https://wiki.php.net/rfc/default_encoding
[2] http://php.net/manual/en/iconv.configuration.php

--
Christoph M. Becker

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Andrea Faulds
[PHP-DEV] Re: default_encoding RFC – aftermath
August 12, 2018 08:00PM
Hi Christoph,

Christoph M. Becker wrote:
> I'm quite happy, though, that this
> hasn't happened yet, since it seems that the encoding names may be
> different between the extensions. At least the php.net manual states[2]:
>
> | Some systems (like IBM AIX) use "ISO8859-1" instead of "ISO-8859-1" so
> | this value has to be used in configuration options and function
> | parameters.
>
> It appears to be worthwhile to review the deprecation of the respective
> iconv.* ini options. Not quite sure about the mbstring.* ini options.

Could this not be solved by making all the extensions support all the
encoding name aliases or so? That wouldn't be difficult to implement and
wouldn't break anything.
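
A minimal sketch of what such alias support could look like, written in Python purely for illustration and not taken from any PHP extension. The table contents and the names `ALIASES` and `canonical_encoding` are assumptions made for this example; a real table would need to be far larger.

```python
# Illustrative alias table: canonical name -> spelling variants.
# The entries are examples only, not an authoritative list.
ALIASES = {
    "ISO-8859-1": ("ISO8859-1", "ISO_8859-1", "LATIN1", "L1"),
    "UTF-8": ("UTF8",),
}


def _build_lookup(aliases):
    """Flatten the alias table into an upper-cased name -> canonical map."""
    lookup = {}
    for canonical, variants in aliases.items():
        for name in (canonical, *variants):
            lookup[name.upper()] = canonical
    return lookup


_LOOKUP = _build_lookup(ALIASES)


def canonical_encoding(name):
    """Return the canonical name for a known spelling, or None if unknown."""
    return _LOOKUP.get(name.strip().upper())
```

With this table, `canonical_encoding("iso8859-1")` yields `"ISO-8859-1"`, while an unlisted name such as `"KOI8-R"` yields `None`. The sketch only normalizes towards one canonical spelling; it does not say which spelling a given platform's iconv actually accepts.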

Thanks!

--
Andrea Faulds
https://ajf.me/

Christoph M. Becker
[PHP-DEV] Re: default_encoding RFC – aftermath
August 12, 2018 11:20PM
Hi Andrea!

On 12.08.2018 at 19:57, Andrea Faulds wrote:

> Christoph M. Becker wrote:
>
>> I'm quite happy, though, that this
>> hasn't happened yet, since it seems that the encoding names may be
>> different between the extensions.  At least the php.net manual states[2]:
>>
>> | Some systems (like IBM AIX) use "ISO8859-1" instead of "ISO-8859-1" so
>> | this value has to be used in configuration options and function
>> | parameters.
>>
>> It appears to be worthwhile to review the deprecation of the respective
>> iconv.* ini options.  Not quite sure about the mbstring.* ini options.
>
> Could this not be solved by making all the extensions support all the
> encoding name aliases or so? That wouldn't be difficult to implement and
> wouldn't break anything.

I think that would only be possible if we knew how they have to be
mapped. Consider the AIX example above; I'm not even sure if this is
(still) valid, but should we have special maps for AIX? Should we try
to detect this during configuration? I'm not sure if that's even
reliably possible, since some iconv implementations (which ones?) do not
support errno[1], so we can't detect whether iconv_open() failed with
EINVAL, or some other error. As it is now, it's simply up to the user
to pass a supported encoding name.
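
The detection idea can be sketched as a probe that tries each known spelling in turn. This is a hypothetical illustration in Python, not code from php-src: `pick_supported_name` and `is_supported` are invented names, and `is_supported` stands in for "iconv_open() accepted this name".

```python
def pick_supported_name(candidates, is_supported):
    """Return the first candidate spelling accepted by is_supported, else None."""
    for name in candidates:
        if is_supported(name):
            return name
    return None


def aix_like(name):
    """Stand-in for a platform that only knows the unhyphenated spelling."""
    return name in {"ISO8859-1", "UTF-8"}
```

Here `pick_supported_name(["ISO-8859-1", "ISO8859-1"], aix_like)` selects `"ISO8859-1"`. The probe is only as reliable as the predicate: if a failed open cannot be attributed to an unsupported name (no usable errno), a `False` result may mean something else entirely, and the probe may reject a spelling that is in fact supported.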

[1] <https://github.com/php/php-src/blob/php-7.3.0beta1/ext/iconv/config.m4#L125>

--
Christoph M. Becker

Nikita Popov
Re: [PHP-DEV] default_encoding RFC – aftermath
August 13, 2018 10:30AM
On Sun, Aug 12, 2018 at 7:00 PM, Christoph M. Becker wrote:

> Hi!
>
> Quite a while ago we accepted the “Use default_charset As Default
> Character Encoding” RFC[1], which also contains the clause “Old
> iconv.*/mbstring.* php.ini parameters will be removed for master PHP6”
> (obviously that was written before we decided to skip PHP 6, and as such
> would have to be applied to PHP 7). I'm quite happy, though, that this
> hasn't happened yet, since it seems that the encoding names may be
> different between the extensions. At least the php.net manual states[2]:
>
> | Some systems (like IBM AIX) use "ISO8859-1" instead of "ISO-8859-1" so
> | this value has to be used in configuration options and function
> | parameters.
>
> It appears to be worthwhile to review the deprecation of the respective
> iconv.* ini options. Not quite sure about the mbstring.* ini options.
>
> Also, if we stick with the deprecation of the iconv.* ini settings, we
> should also deprecate iconv_set_encoding(), which currently changes the
> deprecated iconv.* ini options, and as such throws a deprecation
> warning. However, if the ini options were removed, what should the
> function do? Nothing? Change the php.*_encoding settings? In my
> opinion, both would be rather confusing. iconv_get_encoding() has a
> similar, albeit minor, issue.
>
> [1] https://wiki.php.net/rfc/default_encoding
> [2] http://php.net/manual/en/iconv.configuration.php
>

IIRC the new ini settings also don't work with Zend Multibyte yet. That
would need to be fixed before we drop them.

Nikita