[PHP-DEV] Complete case-sensitivity in PHP

Posted by C.Koy 
C.Koy
[PHP-DEV] Complete case-sensitivity in PHP
April 20, 2012 12:30PM
Hi,

This post is about bug #18556 (https://bugs.php.net/bug.php?id=18556)
which is a decade old.

As the recent comments on that page indicate, there's not a
deterministic way to resolve this issue, apart from eliminating
tolower() calls for function/class names during lookup. Hence totally
case-sensitive PHP.

Before opposing with "No, this will break a lot of existing code!", note
that I'm not suggesting a static permanent change in the engine; rather
a runtime option that will need to be enabled (cli option, INI setting),
without which PHP will work as before.

Since I'm not well versed in the workings of the Zend engine, I solicit the
wisdom/experience of people on this list: Is this doable in a practical
way, without making grand changes in Zend?

best regards,




--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Matthew Weier O'Phinney
[PHP-DEV] Re: Complete case-sensitivity in PHP
April 20, 2012 03:20PM
On 2012-04-20, "C.Koy" <[email protected]> wrote:
> This post is about bug #18556 (https://bugs.php.net/bug.php?id=18556)
> which is a decade old.
>
> As the recent comments on that page indicate, there's not a
> deterministic way to resolve this issue, apart from eliminating
> tolower() calls for function/class names during lookup. Hence totally
> case-sensitive PHP.
>
> Before opposing with "No, this will break a lot of existing code!",
> note that I'm not suggesting a static permanent change in the engine;
> rather a runtime option that will need to be enabled (cli option, INI
> setting), without which PHP will work as before.
>
> Since I'm not well versed in the workings of Zend engine, I solicit
> the wisdom/experience of people in this list: Is this doable in a
> practical way, without making grand changes in Zend?

It's not just about changes to the engine. If you introduce a runtime
option that switches behavior, you then get a portability problem --
code runs fine in one context, but not the other.

--
Matthew Weier O'Phinney
Project Lead | matthew@zend.com
Zend Framework | http://framework.zend.com/
PGP key: http://framework.zend.com/zf-matthew-pgp-key.asc

Tom Boutell
Re: [PHP-DEV] Re: Complete case-sensitivity in PHP
April 20, 2012 03:30PM
Yup - a one time transition would be preferable to that.

--
Tom Boutell
P'unk Avenue
215 755 1330
punkave.com
window.punkave.com

Arvids Godjuks
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 20, 2012 03:30PM
In past years such switches were deprecated and removed (most of them in
5.3; in 5.4 the rest are finally gone for good). So any solution
involving a switch that modifies how code is executed will hit a wall of
resistance. It's a lesson that was learned the hard way.

So the answer may be to make PHP case-sensitive. There will be broken
code, probably a lot. But that can be fixed, and I personally always
write with respect to character case, so that will be no problem for me.

Nikita Popov
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 20, 2012 03:50PM
On Fri, Apr 20, 2012 at 12:20 PM, C.Koy <[email protected]> wrote:
> Hi,
>
> This post is about bug #18556 (https://bugs.php.net/bug.php?id=18556) which
> is a decade old.
>
> As the recent comments on that page indicate, there's not a deterministic
> way to resolve this issue, apart from eliminating tolower() calls for
> function/class names during lookup. Hence totally case-sensitive PHP.
>
> Before opposing with "No, this will break a lot of existing code!", note
> that I'm not suggesting a static permanent change in the engine; rather a
> runtime option that will need to be enabled (cli option, INI setting),
> without which PHP will work as before.
>
> Since I'm not well versed in the workings of Zend engine, I solicit the
> wisdom/experience of people in this list: Is this doable in a practical way,
> without making grand changes in Zend?
I'm not sure whether I really get the issue, but it seems the
problem is that PHP is using locale-aware lowercasing
functions in the core. Couldn't the issue be fixed by replacing those
with locale-unaware functions? Why does one have to change PHP's general
case-sensitivity handling for that?

Nikita

Arvids Godjuks
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 20, 2012 04:00PM
Because you can write a function name, say, in Cyrillic and it will just
work.

Sherif Ramadan
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 20, 2012 05:10PM
>>Because you can write a function name, say, in Cyrillic and it will just work.
>
> PHP deals with strings on a binary level though. To PHP a function
> name of Áãç, for example, is just a sequence of 8-bit bytes. So
> "\xc3\x81\xc3\xa3\xc3\xa7" is all it sees, right? I'm not sure I
> follow what the problem is.

Sorry this might not have been sent to the list properly in my last reply.

John LeSueur
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 20, 2012 05:30PM
On Fri, Apr 20, 2012 at 9:01 AM, Sherif Ramadan <[email protected]> wrote:

> >>Because you can write a function name, say, in Cyrillic and it will just
> >>work.
> >
> > PHP deals with strings on a binary level though. To PHP a function
> > name of Áãç, for example, is just a sequence of 8-bit bytes. So
> > "\xc3\x81\xc3\xa3\xc3\xa7" is all it sees, right? I'm not sure I
> > follow what the problem is.
>
>
But in order to be case insensitive, PHP needs to know that strtolower("A")
== 'a'. So if you use Cyrillic for userland functions/classes, PHP needs a
Cyrillic-aware strtolower function. Then the problem is that core
classes/functions need to use a plain ASCII strtolower for case
insensitivity. So you cannot both write code in Cyrillic and interface with
plain ASCII internals. One possible, but less than optimal, solution is to
first try a locale-aware strtolower, then try a plain ASCII strtolower when
looking up symbols.
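
The two-step lookup suggested above can be sketched in userland PHP. This is only an illustration under stated assumptions, not engine code: asciiLower(), turkishLower() and lookupSymbol() are hypothetical helpers, the symbol table is a plain array, and turkishLower() merely simulates what a locale-aware tolower() would return under a Turkish locale. The file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helpers for illustration; none of these are PHP or Zend APIs.

// Locale-independent lowercasing: only the bytes A-Z are touched.
function asciiLower(string $name): string
{
    return strtr($name, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz');
}

// Stand-in for a locale-aware tolower() under a Turkish locale,
// where "I" lowercases to dotless "ı" and "İ" lowercases to "i".
function turkishLower(string $name): string
{
    return asciiLower(strtr($name, ['I' => 'ı', 'İ' => 'i']));
}

// Try the locale-aware key first, then fall back to the plain ASCII key.
function lookupSymbol(array $table, string $name, callable $localeLower): ?string
{
    foreach ([$localeLower($name), asciiLower($name)] as $key) {
        if (isset($table[$key])) {
            return $table[$key];
        }
    }
    return null;
}

$table = [
    'imagecreate' => 'internal function', // registered with ASCII rules
    'ızgara'      => 'userland function', // registered under the Turkish rules
];

// Found only on the ASCII fallback, because the Turkish key is "ımagecreate".
var_dump(lookupSymbol($table, 'ImageCreate', 'turkishLower'));
// Found on the first, locale-aware attempt.
var_dump(lookupSymbol($table, 'Izgara', 'turkishLower'));
```

The cost of this design is a second hash lookup on every miss of the first key, which is why it is described above as less than optimal.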

John
Johannes Schlüter
Re: [PHP-DEV] Re: Complete case-sensitivity in PHP
April 20, 2012 05:50PM
On Fri, 2012-04-20 at 09:21 -0400, Tom Boutell wrote:
> Yup - a one time transition would be preferable to that.

Then the question is: Why? What's the benefit from a break? Is there a
need to have imagecreatefrompng() next to a ImageCreateFromPNG()? Or is
the reason to be a tiny bit faster? Or is the reason to break
"everything" "just" for consistency?

johannes



Sherif Ramadan
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 20, 2012 05:50PM
> But in order to be case insensitive, PHP needs to know that strtolower("A")
> == 'a'. So if you use Cyrillic for userland functions/classes, PHP needs a
> Cyrillic-aware strtolower function. Then the problem is that core
> classes/functions need to use a plain ASCII strtolower for case
> insensitivity. So you cannot both write code in Cyrillic and interface with
> plain ASCII internals. One possible, but less than optimal, solution is to
> first try a locale-aware strtolower, then try a plain ASCII strtolower when
> looking up symbols.
>
> John

I can see the confusion about PHP's case-sensitivity and how it mixes
and matches between case-insensitive functions/classes/(arguably even
constants), and case-sensitive variable names, for example.

Its naming rules are a little bit inconsistent in that regard. I just
don't see a point in making it completely locale aware. The fact that
you can do soefunc() and SOMEFUNC() and still invoke the same function
is a benefit. And I suppose for those using UTF-8 encoded function
names it might be convenient to make them case-sensitive as well. I'm
not going to argue that it's not. I'm just going to say that it
doesn't seem to be a significant problem.

C.Koy
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 20, 2012 07:40PM
On 4/20/2012 6:44 PM, Sherif Ramadan wrote:
>
> Its naming rules are a little bit inconsistent in that regard. I just
> don't see a point in making it completely locale aware. The fact that
> you can do soefunc() and SOMEFUNC() and still invoke the same function
> is a benefit. And I suppose for those using UTF-8 encoded function
> names it might be convenient to make them case-sensitive as well. I'm
> not going to argue that it's not. I'm just going to say that it
> doesn't seem to be a significant problem.
>

The lowercase of a multi-byte character is not the lowercase of
individual bytes comprising it.
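
A small PHP sketch of why this matters. It assumes a single-byte lowercase routine that follows ISO-8859-1 rules; latin1ByteLower() is a hypothetical stand-in for such a routine, not a function from PHP or the engine. Applied byte by byte to the UTF-8 encoding of "Á", it produces neither "á" nor even valid UTF-8.

```php
<?php
// Hypothetical byte-wise lowercase using ISO-8859-1 rules (illustration only).
function latin1ByteLower(string $s): string
{
    $out = '';
    for ($i = 0, $n = strlen($s); $i < $n; $i++) {
        $b = ord($s[$i]);
        $isAsciiUpper  = ($b >= 0x41 && $b <= 0x5A);
        $isLatin1Upper = ($b >= 0xC0 && $b <= 0xDE && $b !== 0xD7);
        if ($isAsciiUpper || $isLatin1Upper) {
            $b += 0x20; // upper and lower case are 0x20 apart in Latin-1
        }
        $out .= chr($b);
    }
    return $out;
}

$upper   = "\xC3\x81"; // "Á" encoded as UTF-8
$lower   = "\xC3\xA1"; // "á" encoded as UTF-8
$mangled = latin1ByteLower($upper);

echo bin2hex($mangled), "\n";              // e381
var_dump($mangled === $lower);             // bool(false)
var_dump(preg_match('//u', $mangled) === 1); // bool(false): no longer valid UTF-8
```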







Kris Craig
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 20, 2012 08:00PM
On Fri, Apr 20, 2012 at 8:44 AM, Sherif Ramadan <[email protected]> wrote:

> > But in order to be case insensitive, PHP needs to know that
> > strtolower("A") == 'a'. So if you use Cyrillic for userland
> > functions/classes, PHP needs a Cyrillic-aware strtolower function.
> > Then the problem is that core classes/functions need to use a
> > plain ASCII strtolower for case insensitivity. So you cannot both
> > write code in Cyrillic and interface with plain ASCII internals.
> > One possible, but less than optimal, solution is to first try a
> > locale-aware strtolower, then try a plain ASCII strtolower when
> > looking up symbols.
> >
> > John
>
> I can see the confusion about PHP's case-sensitivity and how it mixes
> and matches between case-insensitive functions/classes/(arguably even
> constants), and case-sensitive variable names, for example.
>
> Its naming rules are a little bit inconsistent in that regard. I just
> don't see a point in making it completely locale aware. The fact that
> you can do soefunc() and SOMEFUNC() and still invoke the same function
> is a benefit.


Could you elaborate? Aside from making PHP forgiving of typos and overall
laziness on the part of the coder, and of course BC notwithstanding, I'm
not sure I understand what benefit there is to preserving this inconsistent
behavior.


> And I suppose for those using UTF-8 encoded function
> names it might be convenient to make them case-sensitive as well. I'm
> not going to argue that it's not. I'm just going to say that it
> doesn't seem to be a significant problem.
>

When I was at Microsoft, I got into a little argument with some folks from
the Windows division about this very issue-- except, in this case, it was
about case-sensitivity in the filesystem. They essentially made the same
argument; i.e. "Why would you want 'Find.exe' and 'find.exe' to be two
separate things?!" I countered that I may want to add a library to the
PATH that contains a file with the same name, such as UnxUtils. "Why would
you want to do that?! Windows' find.exe is way better than the Unix one,
anyway!".... And then my brain exploded.

Turkish localization notwithstanding (I confess that I know absolutely
*nothing* about that lol), one possible use-case could be if you're
including an external library/framework that contains a function with the
same name but different case. I'm not sure how likely that is, mind you,
but I can see that as one potential benefit. Either way, I guess my point
is that the arguments for/against this seem to parallel the arguments for
Windows-style fso case-insensitivity vs. Unix-style fso case-sensitivity.

--Kris


Matthew Weier O'Phinney
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 20, 2012 08:20PM
On 2012-04-20, Kris Craig <[email protected]> wrote:
> On Fri, Apr 20, 2012 at 8:44 AM, Sherif Ramadan <[email protected]> wrote:
> > > But in order to be case insensitive, PHP needs to know that
> > > strtolower("A") == 'a'. So if you use Cyrillic for userland
> > > functions/classes, PHP needs a Cyrillic-aware strtolower function.
> > > Then the problem is that core classes/functions need to use a
> > > plain ASCII strtolower for case insensitivity. So you cannot both
> > > write code in Cyrillic and interface with plain ASCII internals.
> > > One possible, but less than optimal, solution is to first try a
> > > locale-aware strtolower, then try a plain ASCII strtolower when
> > > looking up symbols.
> >
> > I can see the confusion about PHP's case-sensitivity and how it mixes
> > and matches between case-insensitive functions/classes/(arguably even
> > constants), and case-sensitive variable names, for example.
> >
> > Its naming rules are a little bit inconsistent in that regard. I just
> > don't see a point in making it completely locale aware. The fact that
> > you can do soefunc() and SOMEFUNC() and still invoke the same function
> > is a benefit.
>
> Could you elaborate? Aside from making PHP forgiving of typos and overall
> laziness on the part of the coder, and of course BC notwithstanding, I'm
> not sure I understand what benefit there is to preserving this inconsistent
> behavior.

To make extensible and flexible systems, it's not uncommon to
dynamically determine class and function or method names. This task
is far simpler and less expensive if you don't need to worry about the
casing of these names.

As an example, I often see code like the following:

    public function setOptions(array $options)
    {
        foreach ($options as $key => $value) {
            $method = 'set' . $key;
            if (!method_exists($this, $method)) {
                continue;
            }

            $this->$method($value);
        }
    }

This is trivial to implement and understand, and there is no need to
transform the value of $key to an appropriately cased value in order to
ensure the method name exists.

Making method names case sensitive would break a ton of code, and
require a ton of computational overhead as well as validation to make
code like the above work.
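
For readers who want to try this, here is a self-contained version of the pattern. The Widget class, its setTitle()/getTitle() methods and the option names are made up for the illustration; only method_exists() and the dynamic call are the point.

```php
<?php
// Hypothetical class used only to exercise the setOptions() pattern.
class Widget
{
    private $title = '';

    public function setTitle(string $title): void
    {
        $this->title = $title;
    }

    public function getTitle(): string
    {
        return $this->title;
    }

    public function setOptions(array $options): void
    {
        foreach ($options as $key => $value) {
            $method = 'set' . $key; // "settitle", not "setTitle"
            if (!method_exists($this, $method)) {
                continue; // unknown option, e.g. "colour"
            }
            // Method names are matched case-insensitively, so this calls setTitle().
            $this->$method($value);
        }
    }
}

$w = new Widget();
$w->setOptions(['title' => 'Hello', 'colour' => 'red']);
echo $w->getTitle(), "\n"; // Hello
```

With case-sensitive method names, the lowercase key 'title' would no longer resolve to setTitle() unless the key were first converted, for example with ucfirst().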


--
Matthew Weier O'Phinney
Project Lead | matthew@zend.com
Zend Framework | http://framework.zend.com/
PGP key: http://framework.zend.com/zf-matthew-pgp-key.asc

Sherif Ramadan
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 20, 2012 08:50PM
> Could you elaborate?  Aside from making PHP forgiving of typos and overall
> laziness on the part of the coder, and of course BC notwithstanding, I'm not
> sure I understand what benefit there is to preserving this inconsistent
> behavior.
>

Kris,

Sorry, first to be clear I made a typo there, but what I'm saying is
PHP currently doesn't care if you do the following:

<?php

function Foo() { }

/* these all work the same obviously */
foo(); Foo(); FOO();

?>

However, it does care if you do the following:

<?php

$foo = 'bar';
$Foo = 'baz';
$FOO = 'quix';

?>

I'm not saying I'm in favor of making PHP case-insensitive. I think
it's fine that it is case-sensitive since it deals with mostly
everything on a binary level. I think the fact that it currently does
allow ASCII case-insensitivity for method/function/class/(and
partially constant) names is somewhat confusing though. It should
probably be either all case-sensitive or all case-insensitive. You can
argue that there is little reason for wanting $foo and $FOO to be
two different things, but hey, that's the arguable part.

I can't see much reason for breaking any of this now though. I don't
know about others, but I've rarely ever written or worked with PHP
code where function/method names were in anything other than ASCII and
I or anyone else cared about case-sensitivity there. I say it's fine
the way it is and I don't really see anyone presenting a valid use
case for why it should change.

Just my two cents.

C.Koy
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 20, 2012 09:00PM
On 4/20/2012 8:57 PM, Kris Craig wrote:
>
> Turkish localization notwithstanding (I confess that I know absolutely *
> nothing* about that lol), one possible use-case could be if you're
> including an external library/framework that contains a function with the
> same name but different case. I'm not sure how likely that is, mind you,
> but I can see that as one potential benefit. Either way, I guess my point
> is that the arguments for/against this seem to parallel the arguments for
> Windows-style fso case-insensitivity vs. Unix-style fso case-sensitivity.
>

Java, C#, Python, Ruby... are all case-sensitive. This is not a feature
to be (mis-)used so that one can have a function named myfunc() and
MyFunc() in the same code base.
Case-insensitive class/function/interface names are a source of confusion
for everyone with non-PHP development experience. There's not a modern OO
platform that defines an interface named 'IDispatch' and later allows it
to be referenced as 'idispatch' or 'iDispatch'. And PHP is becoming more
OO with every major release.
Overall, full case-sensitivity seems to be a natural step in PHP's
evolution.





C.Koy
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 20, 2012 09:00PM
On 4/20/2012 9:48 PM, C.Koy wrote:
>
> Java, C#, Python, Ruby... are all case-sensitive. This is not a feature
> to be (mis-)used so that one can have a function named myfunc() and
> MyFunc() in the same code base.
> Case-insensitive class/function/interface names is a confusion for
> everyone with non-PHP development experience. There's not a modern OO
> platform that defines an interface named 'IDispatch' and later allows it
> to be referenced as 'idispatch' or 'iDispatch'. And PHP is becoming more
> OO with every major release.
> Overall, full case-sensitivity seems to be a natural step in PHP's
> evolution.
>

Let me add this: case-insensitivity is a burden for tool developers.
For example, any ctags-based editor/IDE out there won't find the
definition of myfunc() when you request "Goto tag's definition" if it's
defined as MyFunc().







Guillaume Rossolini
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 20, 2012 11:20PM
Hi there,

Out of curiosity, how would one migrate a codebase for full case
sensitivity in PHP? They would need to rewrite their calls of core
functions, plus PECL functions. Those are easy enough to spot, but there
are also custom extensions. True, one could maybe parse the .h files to get
the functions, classes, interfaces etc. as long as there are .h files or
similar, and rewrite everything based on that. However, there are also
userland functions, classes and other identifiers that could be defined
anywhere, from files on disk to an encrypted database somewhere. Plus, as
Matthew pointed out, identifiers are not always fully written in the code:
they can be concatenated or aliased, and these calls would need to be
rewritten too (and good luck finding them all).

I am sure I have not listed here all the challenges in migrating for this
new functionality, but I hope it will be enough that we do _not_ get case
sensitivity for functions/classes/interfaces/etc. in PHP. The cost truly
outweighs the benefits. I can understand why bug #18556 should be fixed,
but I don't understand why the solution should be to make PHP a fully case
sensitive language.

@C.Koy: until now, tools have been able to cope with PHP's case
insensitivity just fine. I have no idea how difficult it is to do, but
obviously they can do it. And anyway, I use tools _because_ they do some of
the work for me, so that's just a tribute to their usefulness, isn't it?
Regarding other languages, it has been stated before on this list that PHP
is its own language.

Regards,

--
Guillaume Rossolini
Stefan Neufeind
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 21, 2012 01:20AM
On 04/20/2012 08:48 PM, C.Koy wrote:
> On 4/20/2012 8:57 PM, Kris Craig wrote:
>>
>> Turkish localization notwithstanding (I confess that I know absolutely *
>> nothing* about that lol), one possible use-case could be if you're
>> including an external library/framework that contains a function with the
>> same name but different case. I'm not sure how likely that is, mind you,
>> but I can see that as one potential benefit. Either way, I guess my
>> point
>> is that the arguments for/against this seem to parallel the arguments for
>> Windows-style fso case-insensitivity vs. Unix-style fso case-sensitivity.
>
> Java, C#, Python, Ruby... are all case-sensitive. This is not a feature
> to be (mis-)used so that one can have a function named myfunc() and
> MyFunc() in the same code base.
> Case-insensitive class/function/interface names is a confusion for
> everyone with non-PHP development experience. There's not a modern OO
> platform that defines an interface named 'IDispatch' and later allows it
> to be referenced as 'idispatch' or 'iDispatch'. And PHP is becoming more
> OO with every major release.
> Overall, full case-sensitivity seems to be a natural step in PHP's
> evolution.

I also have the feeling that cleaner code with consistent case would be
a benefit. While I admit we can't change that from one day to the other
(as we couldn't with other changes) I think we might possibly add a
special kind of "deprecation" where the non-matching case would still
work but (if you activate those deprecation-warnings) would trigger
warnings so you can clean up your code.
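
A rough userland sketch of what such a "deprecation" could look like. This is not an engine feature: callWithCaseCheck() is a hypothetical wrapper, and it leans on ReflectionFunction::getName() returning the name as written in the declaration.

```php
<?php
// Hypothetical wrapper for illustration; the engine offers nothing like this.
function Foo(): string
{
    return 'foo called';
}

function callWithCaseCheck(string $name, ...$args)
{
    // The lookup itself stays case-insensitive; getName() gives the declared spelling.
    $declared = (new ReflectionFunction($name))->getName();
    if ($declared !== $name) {
        trigger_error(
            "Calling $declared() as $name() relies on case-insensitive lookup",
            E_USER_DEPRECATED
        );
    }
    return $name(...$args);
}

// Collect the notices instead of printing them.
$notices = [];
set_error_handler(function (int $level, string $message) use (&$notices): bool {
    $notices[] = $message;
    return true;
}, E_USER_DEPRECATED);

callWithCaseCheck('Foo'); // matching case: silent
callWithCaseCheck('foo'); // non-matching case: still works, but is reported
restore_error_handler();

print_r($notices);
```

The call with the wrong case still succeeds, which matches the idea of a transition period in which code keeps running while the mismatches are logged.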

Various projects that I work on take explicit care of the case when
auto-creating classnames etc. - not because they must but because they
want to be consistent. And that's the thing I like about it.


Regards,
Stefan

Johannes Schlüter
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 21, 2012 02:00AM
On Sat, 2012-04-21 at 01:16 +0200, Stefan Neufeind wrote:
> I think we might possibly add a
> special kind of "deprecation" where the non-matching case would still
> work but (if you activate those deprecation-warnings) would trigger
> warnings so you can clean up your code.

yay - two lookups instead of one ;-)

and what would you do in this strange, but possible case:
The functions Foo() and FOO() exist but the user calls foo()?

johannes


Galen Wright-Watson
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 21, 2012 03:40AM
On Fri, Apr 20, 2012 at 3:20 AM, C.Koy <[email protected]> wrote:

>
> As the recent comments on that page indicate, there's not a deterministic
> way to resolve this issue, apart from eliminating tolower() calls for
> function/class names during lookup. Hence totally case-sensitive PHP.
>
>
What about instead creating a special-purpose Zend function to normalize
class names (zend_normalize_class_name, or zend_classname_tolower)? This
function would examine the current locale and, if it's a problematic one,
convert the string to lower case on its own (calling zend_tolower on
non-problematic characters). Alternatively, zend_normalize_class_name could
switch LC_CTYPE to an appropriate locale (e.g. "UTF-8"; the locale could be
determined at compile time), call zend_str_tolower_copy, then switch back
before returning. Then, any appropriate function (e.g.
zend_resolve_class_name, zend_lookup_class_ex, class_exists, class_alias)
would call zend_normalize_class_name instead of zend_str_tolower_copy/
zend_str_tolower_dup.

The two problems with this approach are
1) additional time-cost. However, if done right, this should have little
impact.
2) break class names using words in the locale-language. For example, a
class named "IzgaraGörünümü" would be converted to "izgaragörünümü", rather
than "ızgaragörünümü". However, this impact should be less than that caused
by the current bug.

Does this bug pop up for locales other than Turkish, Azerbaijani and Kurdish?
Robert Cummings
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 21, 2012 04:20PM
On 12-04-20 07:56 PM, Johannes Schlüter wrote:
> On Sat, 2012-04-21 at 01:16 +0200, Stefan Neufeind wrote:
>> I think we might possibly add a
>> special kind of "deprecation" where the non-matching case would still
>> work but (if you activate those deprecation-warnings) would trigger
>> warnings so you can clean up your code.
>
> yay - two lookups instead of one ;-)
>
> and what would you do in this strange, but possible case:
> The functions Foo() and FOO() exist but the user calls foo()?

Cry! :)

Cheers,
Rob.
Stefan Neufeind
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 21, 2012 07:00PM
On 04/21/2012 01:56 AM, Johannes Schlüter wrote:
> On Sat, 2012-04-21 at 01:16 +0200, Stefan Neufeind wrote:
>> I think we might possibly add a
>> special kind of "deprecation" where the non-matching case would still
>> work but (if you activate those deprecation-warnings) would trigger
>> warnings so you can clean up your code.
>
> yay - two lookups instead of one ;-)
>
> and what would you do in this strange, but possible case:
> The functions Foo() and FOO() exist but the user calls foo()?

Tell him it doesn't exist and suggest to go fix his code ...

PS: That was a rhetorical question? :-)


Regards,
Stefan

Yasuo Ohgaki
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 21, 2012 11:00PM
Hi,

I'm just curious whether anyone has run a benchmark.
A PoC would be simple enough.

Some figures are needed for a decision.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

C.Koy
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 22, 2012 11:00AM
On 4/21/2012 4:37 AM, Galen Wright-Watson wrote:
> What about instead creating a special-purpose Zend function to normalize
> class names (zend_normalize_class_name, or zend_classname_tolower)? This
> function would examine the current locale and, if it's a problematic one,
> convert the string to lower case on its own (calling zend_tolower on
> non-problematic characters). Alternatively, zend_normalize_class_name could
> switch LC_CTYPE to an appropriate locale (e.g. "UTF-8"; the locale could be
> determined at compile time), call zend_str_tolower_copy, then switch back
> before returning. Then, any appropriate function (e.g.
> zend_resolve_class_name, zend_lookup_class_ex, class_exists, class_alias)
> would call zend_normalize_class_name instead of zend_str_tolower_copy/
> zend_str_tolower_dup.

In plain words/pseudo-code, adding an "if statement" at a certain step
should suffice, like:

1. lowercase the name;
2. if the effective locale is tr_XY, then replace every "ı" with "i";
3. look up the name;

For those who have nothing to do with Turkish locales, that should incur
the overhead of an "if" condition only.
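
The three steps above, written out as a PHP sketch rather than as engine code. Everything here is hypothetical: lowerAscii(), lowerForLocale() and normalizeName() are illustrative names, the locale is passed in as a string instead of being read from the runtime, and lowerForLocale() only simulates a locale-aware tolower().

```php
<?php
// Hypothetical names throughout; this is not how the Zend engine is written.

function lowerAscii(string $s): string
{
    return strtr($s, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz');
}

// Simulated locale-aware lowercase: under a Turkish locale "I" becomes dotless "ı".
function lowerForLocale(string $s, string $locale): string
{
    if (strncmp($locale, 'tr_', 3) === 0) {
        $s = strtr($s, ['I' => 'ı', 'İ' => 'i']);
    }
    return lowerAscii($s);
}

function normalizeName(string $name, string $locale): string
{
    $key = lowerForLocale($name, $locale);      // step 1: lowercase the name
    if (strncmp($locale, 'tr_', 3) === 0) {
        $key = str_replace('ı', 'i', $key);     // step 2: undo the dotless "ı"
    }
    return $key;                                // step 3: caller looks this key up
}

$classes = ['iterator' => true];

var_dump(lowerForLocale('Iterator', 'tr_TR'));                 // "ıterator": would miss
var_dump(isset($classes[normalizeName('Iterator', 'tr_TR')])); // bool(true)
var_dump(isset($classes[normalizeName('Iterator', 'en_US')])); // bool(true)
var_dump(normalizeName('IzgaraGörünümü', 'tr_TR'));            // "izgaragörünümü"
```

The last line shows the trade-off raised earlier in the thread: a Turkish class name is keyed as "izgaragörünümü" rather than "ızgaragörünümü".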


But I did not start this thread to discuss such a bug fix, because:

1. It does not take a genius to figure it out, and should take minutes
to implement for someone experienced in the internals. Given the 10-year
span and dozens of comments/complaints on the bug's entry, it's hard to
say this issue went unnoticed. So I had to conclude that such a fix has
quietly been overruled for performance and/or other undisclosed reasons.
2. Absent bug #18556, case-sensitive PHP has merits, as I stated in another
post and as several people voiced opinions in favor. Case-sensitive PHP is
worth considering.

> Does this bug pop-up for locales other than Turkish, Azerbaijani and
> Kurdish?

Theoretically, this problem occurs for any set of locales that lowercase
the same letter differently, whenever the PHP script switches among those
locales during its execution.
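
A small Python simulation of that failure mode, under the assumption that a name is stored under one locale's lowercasing and looked up under another's. The `functions` table, `declare` and `exists` are hypothetical stand-ins, not engine code:

```python
# Hypothetical sketch of the failure mode; not engine code.

def c_lower(name: str) -> str:
    """Lowercasing as in the C locale: "I" becomes "i"."""
    return name.lower()


def tr_lower(name: str) -> str:
    """Lowercasing as in a Turkish locale: "I" becomes dotless "ı"."""
    return name.replace("I", "ı").lower()


functions = {}  # stand-in for the engine's function table


def declare(name: str, lower) -> None:
    functions[lower(name)] = name


def exists(name: str, lower) -> bool:
    return lower(name) in functions


declare("Info", c_lower)              # stored under the key "info"
found_before = exists("INFO", c_lower)
# The script now switches to a Turkish locale:
found_after = exists("INFO", tr_lower)  # looks for "ınfo" instead
```

Here `found_before` is true and `found_after` is false, although nothing about the declared function has changed.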

best regards,






Galen Wright-Watson
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 22, 2012 10:40PM
2012/4/22 C.Koy <[email protected]>

> On 4/21/2012 4:37 AM, Galen Wright-Watson wrote:
>
>> What about instead creating a special-purpose Zend function to normalize
>> class names (zend_normalize_class_name, or zend_classname_tolower)? This
>> function would examine the current locale and, if it's a problematic one,
>> convert the string to lower case on its own (calling zend_tolower on
>> non-problematic characters). Alternatively, zend_normalize_class_name could
>> switch LC_CTYPE to an appropriate locale (e.g. "UTF-8"; the locale could be
>> determined at compile time), call zend_str_tolower_copy, then switch back
>> before returning. Then, any appropriate function (e.g.
>> zend_resolve_class_name, zend_lookup_class_ex, class_exists, class_alias)
>> would call zend_normalize_class_name instead of zend_str_tolower_copy/
>> zend_str_tolower_dup.
>>
>
> In plain words/pseudo-code, adding an "if statement" at a certain step
> should suffice, like:
>
> 1. lowercase the name;
> 2. if the effective locale is tr_XY, then replace every "ı" with "i";
> 3. look up the name;
>
> For those who have nothing to do with Turkish locales, that should incur
> the overhead of an "if" condition only.
>

The fix would need to be applied to at least four functions, so adding a
new function would be more maintainable. Also, there are locales that don't
begin with "tr_" or have "TR" in the locale name, so the condition would
need to be more complex.

Converting "I" or "ı" separately from lowercase conversion is less
performant than either option I describe, as it requires an extra loop,
which is why I didn't bother suggesting it. I suspect switching the locale
is most performant, as it doesn't require additional tests, though I
haven't examined the cost of setting the locale.
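
As a rough illustration of the single-pass idea, here is a Python sketch of a locale-independent normalizer that lowercases only A-Z and never consults the locale. It is a simplification of the first option (it leaves every non-ASCII character untouched), and `ascii_lower` is a hypothetical name, not an existing Zend function:

```python
# Hypothetical sketch; ascii_lower is not an existing Zend function.
# The table maps only A-Z to a-z, so the current locale is irrelevant.
_ASCII_LOWER = {ord(u): ord(u) + 32 for u in "ABCDEFGHIJKLMNOPQRSTUVWXYZ"}


def ascii_lower(name: str) -> str:
    """Lowercase A-Z in one pass; all other characters pass through as-is."""
    return name.translate(_ASCII_LOWER)
```

Because "I" always maps to "i" here, `ascii_lower("INFO")` gives the same key no matter which locale the script has set, and no second loop over the string is needed.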


> But, I did not start this thread to discuss such bug fix, because:
>
> 1. It does not take a genius to figure it out, and should take minutes to
> implement for someone experienced in the internals. Given the 10 year span
> and dozens of comments/complaints on the bug's entry, it's hard to say this
> issue went unnoticed. So I had to conclude that such fix has quietly been
> overruled for performance and/or other undisclosed reasons.
>

Why does it matter if a solution is simple? If anything, that a fix "does
not take a genius" is an argument in its favor, if it also solves the
problem.

If it's already been rejected privately, it's time to bring the reasons
into the open (which is why I asked). If not, it should be considered
publicly.


> 2. Absent bug #18556, case-sensitive PHP has merits as I stated in other
> post and several people voiced opinions in favor. Case-sensitive PHP is
> worth considering.
>

It is, but it's also a major BC break, hence perhaps better suited for
PHP6. Case-sensitivity is also a much bigger issue than this bug. A custom
conversion function, on the other hand, produces the minimum impact of any
option I've read. As such, it's hopefully a solution for this bug that
everyone can agree on.


>> Does this bug pop-up for locales other than Turkish, Azerbaijani and
>> Kurdish?
>>
>
> Theoretically, this problem occurs for any locales sharing a letter
> lowercase of which is different from each other's, and the PHP script
> changes its locale among these locales throughout its execution.
>

The abstract property that makes a locale problematic is obvious. I was
looking for specific locales, as they need to be identified for a complete
solution.

Yasuo Ohgaki
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 22, 2012 11:20PM
Hi,

2012/4/23 Galen Wright-Watson <[email protected]>:
>> 2. Absent bug #18556, case-sensitive PHP has merits as I stated in other
>> post and several people voiced opinions in favor. Case-sensitive PHP is
>> worth considering.
>
> It is, but it's also a major BC break, hence perhaps better suited for
> PHP6. Case-sensitivity is also a much bigger issue than this bug. A custom
> conversion function, on the other hand, produces the minimum impact of any
> option I've read. As such, it's hopefully a solution for this bug that
> everyone can agree on.

A conversion script could be provided.
It would be a rather simple script built on the tokenizer.

Anyway, if we are going to change the function naming rules, consistent
module function names should be considered at the same time.
createimage(), htmlentities(), etc. should be create_image()/html_entities().
There is an alias system, so this is just a matter of defining aliases for them.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

C.Koy
Re: [PHP-DEV] Complete case-sensitivity in PHP
April 23, 2012 12:30PM
On 4/22/2012 11:32 PM, Galen Wright-Watson wrote:
> 2012/4/22 C.Koy<[email protected]>
>
>> On 4/21/2012 4:37 AM, Galen Wright-Watson wrote:
>
>> But, I did not start this thread to discuss such bug fix, because:
>>
>> 1. It does not take a genius to figure it out, and should take minutes to
>> implement for someone experienced in the internals. Given the 10 year span
>> and dozens of comments/complaints on the bug's entry, it's hard to say this
>> issue went unnoticed. So I had to conclude that such fix has quietly been
>> overruled for performance and/or other undisclosed reasons.
>>
>
> Why does it matter if a solution is simple?

It doesn't matter; you've misunderstood.
On the contrary, common sense dictates that a simple solution should be
applied (considering how deep in the PHP stack we're talking about).
And that's what makes me curious and confused about why this bug still
exists. See, I'm drawing a conclusion from what little information I
have, and stating the reasoning it's based on (the first two statements).
Overall, that and the item following it were an explanation of "why I'm
suggesting a major feature change as the solution to a specific bug",
although no one directly asked me to.

>
> If it's already been rejected privately, it's time to bring the reasons
> into the open (which is why I asked). If not, it should be considered
> publicly.

A comment dated 2002-09-26 on the bug's page states the bug is fixed. The
next comment, dated 2006-02-17, states it reappeared.
I don't know who did what 10 or 6 years ago, but the fix has been revoked.
Why? That was the main reason I deemed this bug not fixable, and hence
suggested other ways to resolve it.

> The abstract property that makes a locale problematic is obvious. I was
> looking for specific locales, as they need to be identified for a complete
> solution.
>

I'm not a locale expert. Given the public complaints/bugs, we can, in
practice, assume this affects Turkish and Azerbaijani only. (I don't
know about Kurdish.)

best regards,




