Welcome! Log In Create A New Profile

Advanced

[PHP-DEV] [VOTE] Make compact function reports undefined passed variables

Posted by Gabriel Caruso 
Hello there.

I'd like to announce that I've opened the RFC for voting until 06/18.

https://wiki.php.net/rfc/compact

Thanks.
--
Gabriel Caruso
Hi Gabriel,


compact(), extract(), parse_str() (with 1 argument) and get_defined_vars() are bad functions, because they access local variables indirectly.

They might be considered to be removed in the next major PHP version, despite of this fix.


Thanks. Dmitry.

________________________________
From: Gabriel Caruso <[email protected]>
Sent: Wednesday, June 6, 2018 3:58:24 PM
To: PHP Internals
Subject: [PHP-DEV] [VOTE] Make compact function reports undefined passed variables

Hello there.

I'd like to announce that I've opened the RFC for voting until 06/18.

https://wiki.php.net/rfc/compact

Thanks.
--
Gabriel Caruso
Am 09.06.2018 um 12:03 schrieb Dmitry Stogov:
> They might be considered to be removed in the next major PHP version, despite of this fix.

+1

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
On 09.06.2018 at 12:03, Dmitry Stogov wrote:

> compact(), extract(), parse_str() (with 1 argument) and get_defined_vars() are bad functions, because they access local variables indirectly.

While I agree that extract() and parse_str() can be dangerous, I don't
understand why compact() and get_defined_vars() are “bad”. If the issue
is that these functions allow to access (local) variables by their name
(given as string), that appears to be not uncommon in PHP; cf. variable
variables and $GLOBALS.

> They might be considered to be removed in the next major PHP version, despite of this fix.

That would certainly require the RFC process, and in my opinion, a
deprecation phase would be very appropriate. Note that parse_str()
without second argument is already deprecated as of PHP 7.2.0 and
scheduled for removal in the next major version[1].

[1]
<https://wiki.php.net/rfc/deprecations_php_7_2?s[]=parse&s[]=str#parse_str_without_second_argument>

--
Christoph M. Becker

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Em sáb, 9 de jun de 2018 às 11:27, Christoph M. Becker <[email protected]>
escreveu:

> On 09.06.2018 at 12:03, Dmitry Stogov wrote:
>
> > compact(), extract(), parse_str() (with 1 argument) and
> get_defined_vars() are bad functions, because they access local variables
> indirectly.
>
> While I agree that extract() and parse_str() can be dangerous, I don't
> understand why compact() and get_defined_vars() are “bad”.. If the issue
> is that these functions allow to access (local) variables by their name
> (given as string), that appears to be not uncommon in PHP; cf. variable
> variables and $GLOBALS.
>

​Agree here. I think that compact() should be modified just to work like
compact($a, $b, $c) instead of compact('a', 'b', 'c'). It is very useful
for template engines.


>
> > They might be considered to be removed in the next major PHP version,
> despite of this fix.
>
> That would certainly require the RFC process, and in my opinion, a
> deprecation phase would be very appropriate. Note that parse_str()
> without second argument is already deprecated as of PHP 7.2.0 and
> scheduled for removal in the next major version[1].
>
> [1]
> <
> https://wiki.php.net/rfc/deprecations_php_7_2?s[]=parse&s[]=str#parse_str_without_second_argument
> >
>
> --
> Christoph M. Becker
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
>
>

--
David Rodrigues
>
> Hi Gabriel,
>
>> compact(), extract(), parse_str() (with 1 argument) and
>> get_defined_vars() are bad functions, because they access local variables
>> indirectly.
>>
>> They might be considered to be removed in the next major PHP version,
>> despite of this fix.
>>
>>
>> Thanks. Dmitry.
>>
>>
>
> Hello Dmitry.
>
> Thanks for this feedback. When I decided to create this RFC adding a
> warning, many of friends actually suggested me creating an RFC depracting
> `compact`, and complaing with the same arguments as yours.
>
> Do you think we should do this already in PHP 7.3?
>
> Thanks.
>
--
Gabriel Caruso
>>
>> Hi Gabriel,
>>
>>> compact(), extract(), parse_str() (with 1 argument) and
>>> get_defined_vars() are bad functions, because they access local variables
>>> indirectly.
>>>
>>> They might be considered to be removed in the next major PHP version,
>>> despite of this fix.
>>>
>>>
>>> Thanks. Dmitry.
>>>
>>>
>>
>> Hello Dmitry.
>>
>> Thanks for this feedback. When I decided to create this RFC adding a
>> warning, many of friends actually suggested me creating an RFC depracting
>> `compact`, and complaing with the same arguments as yours.
>>
>> Do you think we should do this already in PHP 7.3?
>>
>> Thanks.
>>
>--
>Gabriel Caruso

Deprecating compact() would hurt workflows where you use the same variable names
several times in different contexts, even though they are only declared locally
for the purposes of compact() itself.

Below is an example of what I mean. The array used for compact() is the same one
that is iterated over outside of class A. Further, we use the variables that are
compacted earlier in A::c(), so it is convenient to use them as local variables.

class A
{
private static $b = [
'a',
'b',
'c',
'd',
];

public static function b()
{
return self::$b;
}

public function d()
{
return 2;
}

public function c()
{
$d = $this->d();
$a = pow($d, $d + 1);
$c = $a ^ 0b1100;
$b = $a - $d;

return new B(
compact(
$this->b()
)
);
}
}

class B
{
private $a;

public function __construct($a)
{
$this->a = $a;
}

public function a(string $name)
{
return $this->a[$name];
}
}

$a = (new A())->c();

foreach (A::b() as $b) {
echo $a->a($b) . PHP_EOL;
}

The alternative would be manipulating array elements directly, like this:

public function c()
{
$e['d'] = $this->d();
$e['a'] = pow($e['d'], $e['d'] + 1);
$e['c'] = $e['a'] ^ 0b1100;
$e['b'] = $e['a'] - $e['d'];

return new B($e);
}

That is far more cumbersome. So, compact() has legitimate uses sometimes.

Cheers,

--
Zach Hoffman
________________________________________
From: Gabriel Caruso <[email protected]>
Sent: Saturday, June 9, 2018 12:25
To: Dmitry Stogov
Cc: PHP Internals
Subject: Re: [PHP-DEV] [VOTE] Make compact function reports undefined passed variables

>
> Hi Gabriel,
>
>> compact(), extract(), parse_str() (with 1 argument) and
>> get_defined_vars() are bad functions, because they access local variables
>> indirectly.
>>
>> They might be considered to be removed in the next major PHP version,
>> despite of this fix.
>>
>>
>> Thanks. Dmitry.
>>
>>
>
> Hello Dmitry.
>
> Thanks for this feedback. When I decided to create this RFC adding a
> warning, many of friends actually suggested me creating an RFC depracting
> `compact`, and complaing with the same arguments as yours.
>
> Do you think we should do this already in PHP 7.3?
>
> Thanks.
>
--
Gabriel Caruso

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
The existence of anything that internally relies on `get_defined_vars()` is
a blocker for applying further optimisations to the engine (think stack
frames), which is probably why Dmitry suggested its removal.

On Sun, 10 Jun 2018, 20:42 Hoffman, Zachary Robert, <[email protected]>
wrote:

> >>
> >> Hi Gabriel,
> >>
> >>> compact(), extract(), parse_str() (with 1 argument) and
> >>> get_defined_vars() are bad functions, because they access local
> variables
> >>> indirectly.
> >>>
> >>> They might be considered to be removed in the next major PHP version,
> >>> despite of this fix.
> >>>
> >>>
> >>> Thanks. Dmitry.
> >>>
> >>>
> >>
> >> Hello Dmitry.
> >>
> >> Thanks for this feedback. When I decided to create this RFC adding a
> >> warning, many of friends actually suggested me creating an RFC
> depracting
> >> `compact`, and complaing with the same arguments as yours.
> >>
> >> Do you think we should do this already in PHP 7.3?
> >>
> >> Thanks.
> >>
> >--
> >Gabriel Caruso
>
> Deprecating compact() would hurt workflows where you use the same variable
> names
> several times in different contexts, even though they are only declared
> locally
> for the purposes of compact() itself.
>
> Below is an example of what I mean. The array used for compact() is the
> same one
> that is iterated over outside of class A. Further, we use the variables
> that are
> compacted earlier in A::c(), so it is convenient to use them as local
> variables.
>
> class A
> {
> private static $b = [
> 'a',
> 'b',
> 'c',
> 'd',
> ];
>
> public static function b()
> {
> return self::$b;
> }
>
> public function d()
> {
> return 2;
> }
>
> public function c()
> {
> $d = $this->d();
> $a = pow($d, $d + 1);
> $c = $a ^ 0b1100;
> $b = $a - $d;
>
> return new B(
> compact(
> $this->b()
> )
> );
> }
> }
>
> class B
> {
> private $a;
>
> public function __construct($a)
> {
> $this->a = $a;
> }
>
> public function a(string $name)
> {
> return $this->a[$name];
> }
> }
>
> $a = (new A())->c();
>
> foreach (A::b() as $b) {
> echo $a->a($b) . PHP_EOL;
> }
>
> The alternative would be manipulating array elements directly, like this:
>
> public function c()
> {
> $e['d'] = $this->d();
> $e['a'] = pow($e['d'], $e['d'] + 1);
> $e['c'] = $e['a'] ^ 0b1100;
> $e['b'] = $e['a'] - $e['d'];
>
> return new B($e);
> }
>
> That is far more cumbersome. So, compact() has legitimate uses sometimes.
>
> Cheers,
>
> --
> Zach Hoffman
> ________________________________________
> From: Gabriel Caruso <[email protected]>
> Sent: Saturday, June 9, 2018 12:25
> To: Dmitry Stogov
> Cc: PHP Internals
> Subject: Re: [PHP-DEV] [VOTE] Make compact function reports undefined
> passed variables
>
> >
> > Hi Gabriel,
> >
> >> compact(), extract(), parse_str() (with 1 argument) and
> >> get_defined_vars() are bad functions, because they access local
> variables
> >> indirectly.
> >>
> >> They might be considered to be removed in the next major PHP version,
> >> despite of this fix.
> >>
> >>
> >> Thanks. Dmitry.
> >>
> >>
> >
> > Hello Dmitry.
> >
> > Thanks for this feedback. When I decided to create this RFC adding a
> > warning, many of friends actually suggested me creating an RFC depracting
> > `compact`, and complaing with the same arguments as yours.
> >
> > Do you think we should do this already in PHP 7.3?
> >
> > Thanks.
> >
> --
> Gabriel Caruso
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
>
>
>The existence of anything that internally relies on `get_defined_vars()` is a
>blocker for applying further optimisations to the engine (think stack frames),
>which is probably why Dmitry suggested its removal.

I did not know it was an optimization issue. I suppose I could substitute a userland equivalent:

function compact_(array $compact, array $vars)
{
return array_intersect_key($vars, array_flip($compact));
}

which changes A::c() to

function c()
{
$d = $this->d();
$a = pow($d, $d + 1);
$c = $a ^ 0b1100;
$b = $a - $d;

return new B(
compact_(
$this->b(),
get_defined_vars()
)
);
}

That is not so bad. As far as I am concerned, deprecate away.
On 10/06/2018 19:41, Hoffman, Zachary Robert wrote:
> Deprecating compact() would hurt workflows where you use the same variable names
> several times in different contexts, even though they are only declared locally
> for the purposes of compact() itself.

I am not convinced that building logic around variable names is ever a
good thing. Variable names are primarily a label used by the programmer
to write source code, and in my opinion that's all they should ever be.
Programmers should be free to rename them within their scope, and
compilers should be free to erase them when optimising a production build.


> public function c()
> {
> $d = $this->d();
> $a = pow($d, $d + 1);
> $c = $a ^ 0b1100;
> $b = $a - $d;
>
> return new B(
> compact(
> $this->b()
> )
> );
> }

Your two classes here are clearly collaborating on the processing of
some specific collection of values, which have something in common with
each other; they are not really passing 4 variables, but some structure
which has 4 fields. It might be nice in some cases if that could be a
by-value "struct" type, but the most appropriate in current PHP would be
to define a third class and pass an instance of that from class A to
class B.


> foreach (A::b() as $b) {
> echo $a->a($b) . PHP_EOL;
> }

This is a poor example, because if this was really the end result, then
the two classes aren't even using the data with the same meaning, and
class A might as well return a plain array - if the keys are no longer
important, it could simply return [$a, $b, $c, $d].


> The alternative would be manipulating array elements directly, like this:
>
> public function c()
> {
> $e['d'] = $this->d();
> $e['a'] = pow($e['d'], $e['d'] + 1);
> $e['c'] = $e['a'] ^ 0b1100;
> $e['b'] = $e['a'] - $e['d'];
>
> return new B($e);
> }
>
> That is far more cumbersome.

I agree that this is slightly harder to read (and the version I would
prefer where $e was an object rather than an array would look similar).
However, this feels like an argument for some syntactic sugar, like the
"with" block found in some languages:

with($e) {
   // Inside this block $a is actually compiled to mean $e->a
}

Not that this can be entirely de-sugared by a compiler or static
analysis tool, so can be completely ignored by optimisation, and
reliably processed by code inspection and refactoring aids. This isn't
true of compact(), variable-variables, etc, because by exchanging
strings and variable names you are writing fundamentally dynamic code.

Regards,

--
Rowan Collins
[IMSoP]


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
On Sun, 2018-06-10 at 23:49 +0100, Rowan Collins wrote:
> On 10/06/2018 19:41, Hoffman, Zachary Robert wrote:
> > Deprecating compact() would hurt workflows where you use the same
> > variable names
> > several times in different contexts, even though they are only
> > declared locally
> > for the purposes of compact() itself.
>
> I am not convinced that building logic around variable names is ever
> a
> good thing. Variable names are primarily a label used by the
> programmer
> to write source code, and in my opinion that's all they should ever
> be.
> Programmers should be free to rename them within their scope, and
> compilers should be free to erase them when optimising a production
> build.

That is fair, I can see the merit in that.

> > public function c()
> > {
> > $d = $this->d();
> > $a = pow($d, $d + 1);
> > $c = $a ^ 0b1100;
> > $b = $a - $d;
> >
> > return new B(
> > compact(
> > $this->b()
> > )
> > );
> > }
>
> Your two classes here are clearly collaborating on the processing of
> some specific collection of values, which have something in common
> with
> each other; they are not really passing 4 variables, but some
> structure
> which has 4 fields. It might be nice in some cases if that could be
> a
> by-value "struct" type, but the most appropriate in current PHP would
> be
> to define a third class and pass an instance of that from class A to
> class B.

It is true that, in the example I gave, it would make more sense to just pass the array itself, or some other structure. However, the end goal is different. Read on for details.

However, even if those 4 values end up being stored in a structure, I don't want to manipulate that structure within that class method itself, I want to perform computations with local variables. If I have to separately store those variables to the structure at the end of the method, that is fine.

>
>
> > foreach (A::b() as $b) {
> > echo $a->a($b) . PHP_EOL;
> > }
>
> This is a poor example, because if this was really the end result,
> then
> the two classes aren't even using the data with the same meaning,
> and
> class A might as well return a plain array - if the keys are no
> longer
> important, it could simply return [$a, $b, $c, $d].

You're right, that is a poor example. In my project where I do something like that, that is the first of several steps in sending an email from the Laravel framework (artisan command -> event -> listener -> mailable). The key/value array ends up being re-associated at a later step and passed into a view, where the array is extracted (each array key becomes a variable with the same name).

I did not think to include the other steps because I was trying to keep the discussion away from any framework in particular, but now that I'm thinking about it, Laravel views is my only use case for `compact()` in the first place.

I end up using `compact()` in the same way when rendering Blade views from Laravel controllers, which is easier to demonstrate, so I made a small Laravel project to demonstrate `compact()` being useful. It is viewable at https://gitlab.com/zrhoffman/compact-example . The project only consists of 2 short files (which are copied into a Laravel base install), which I will paste here anyway.

First is the Laravel route to the base URL:

Route::get('', function () {
$base = 2;
$power = pow($base, $base + 1);
$xor = $power ^ 0b1100;
$sum = $xor + $base;

return view('index', compact([
'power',
'sum',
'xor',
'base',
]));
});

And after that is the Blade view, which renders the variables in a non-iterative way, meaning that storing them in an array would not actually help us..

<!doctype html>
<html lang="{{ app()->getLocale() }}">
<body style="text-align: center">
<p>{{ $base }}<sup>{{ $base }} + 1</sup> ^ 0b1100 + {{ $base }}
= {{ $power }} ^ 0b1100 + {{ $base }}
= {{ $xor }} + {{ $base }}
= {{ $sum }}</p>
</body>

I included a small installer script in the project in case you want to see it rendered, but you can probably get the general idea just by looking at it. Hopefully, this demonstrates that, for the purposes of the view itself, there is no advantage to storing all of these values in a single structure.

>
>
> > The alternative would be manipulating array elements directly, like
> > this:
> >
> > public function c()
> > {
> > $e['d'] = $this->d();
> > $e['a'] = pow($e['d'], $e['d'] + 1);
> > $e['c'] = $e['a'] ^ 0b1100;
> > $e['b'] = $e['a'] - $e['d'];
> >
> > return new B($e);
> > }
> >
> > That is far more cumbersome.
>
> I agree that this is slightly harder to read (and the version I
> would
> prefer where $e was an object rather than an array would look
> similar).
> However, this feels like an argument for some syntactic sugar, like
> the
> "with" block found in some languages:
>
> with($e) {
> // Inside this block $a is actually compiled to mean $e->a
> }
>
> Not that this can be entirely de-sugared by a compiler or static
> analysis tool, so can be completely ignored by optimisation, and
> reliably processed by code inspection and refactoring aids. This
> isn't
> true of compact(), variable-variables, etc, because by exchanging
> strings and variable names you are writing fundamentally dynamic
> code.

The other reason I mentioned Laravel is that storing variables by key in an array and extracting them later is intrinsic to Laravel's process for rendering views, and writing "fundamentally dynamic code" is pretty much inescapable in this case.

We could still avoid it on our end by rewriting the return statement of the Laravel route to this:

return view('index', [
'power' => $power,
'sum' => $sum,
'xor' => $xor,
'base' => $base,
]);

Since I am necessarily expecting those variable names on the other side anyway, this seems like extra effort worth avoiding.

Cheers,

--
Zach Hoffman
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
On 11 June 2018 at 07:14, Hoffman, Zachary Robert <[email protected]> wrote:

>
> However, even if those 4 values end up being stored in a structure, I
> don't want to manipulate that structure within that class method itself, I
> want to perform computations with local variables. If I have to separately
> store those variables to the structure at the end of the method, that is
> fine.
>

If the variables have to share a name with the fields in the structure,
they're not really local variables - you can't rename them, add more
intermediate steps, etc, without breaking their magic affinity with the
named fields.



> I end up using `compact()` in the same way when rendering Blade views from
> Laravel controllers, which is easier to demonstrate, so I made a small
> Laravel project to demonstrate `compact()` being useful. It is viewable at
> https://gitlab.com/zrhoffman/compact-example . The project only consists
> of 2 short files (which are copied into a Laravel base install), which I
> will paste here anyway.
>

I can see why this code is tempting, but it looks very "closely coupled" to
me: *why* are the local variables in the controller named the same thing as
in the view? What if the word "power" is ambiguous in the context of the
controller, and so you want to rename that variable $exponent? Because
you've assumed you can use compact(), you can't do that without tracing the
variable through the whole application - these are now effectively global
variables, rather than local ones.


We could still avoid it on our end by rewriting the return statement of the
> Laravel route to this:
>
> return view('index', [
> 'power' => $power,
> 'sum' => $sum,
> 'xor' => $xor,
> 'base' => $base,
> ]);
>
> Since I am necessarily expecting those variable names on the other side
> anyway, this seems like extra effort worth avoiding.
>

The fact that the keys and values look so similar here is a coincidence of
the current implementation, and making that explicit is not wasted effort.
If you refactor the code and rename the local variable $power to $exponent,
even the most naive refactoring tool will give you working code:

return view('index', [
'power' => $exponent,
'sum' => $sum,
'xor' => $xor,
'base' => $base,
]);

This is a huge advantage in maintainability of the code - it's easier to
understand, harder to break, and more able to evolve.




> The other reason I mentioned Laravel is that storing variables by key in
> an array and extracting them later is intrinsic to Laravel's process for
> rendering views, and writing "fundamentally dynamic code" is pretty much
> inescapable in this case.
>

Yes, I've seen Blade (and templating generally) as a justification for
dynamic variable names before, and I can see the appeal of reusing the PHP
engine as the templating engine, with a different set of variables in
scope. It could, however, be considered syntax sugar rather than truly
dynamic variable names - a template compiler could convert "$foo" into
"$params['foo']", just as it converts other Blade-specific language
features.


Regards,
--
Rowan Collins
[IMSoP]
On Jun 10, 2018 10:46 PM, "Hoffman, Zachary Robert" <[email protected]> wrote:
>The existence of anything that internally relies on `get_defined_vars()` is a
>blocker for applying further optimisations to the engine (think stack frames),
>which is probably why Dmitry suggested its removal.

I did not know it was an optimization issue.

Accessing variables by name is a problem for optimimisation and JIT-ing, because we lose data-flow dependencies. But, this is not the end of the world. We just don't optimize functions that use this "bad" pattern.

Thanks. Dmitry.




I suppose I could substitute a userland equivalent:

function compact_(array $compact, array $vars)
{
return array_intersect_key($vars, array_flip($compact));
}

which changes A::c() to

function c()
{
$d = $this->d();
$a = pow($d, $d + 1);
$c = $a ^ 0b1100;
$b = $a - $d;

return new B(
compact_(
$this->b(),
get_defined_vars()
)
);
}

That is not so bad. As far as I am concerned, deprecate away.
Sorry, only registered users may post in this forum.

Click here to login