Welcome! Log In Create A New Profile

Advanced

[PHP-DEV] array_values should be a no-op for packed layout arrays

Posted by Jesse Schalken 
I noticed this commit
https://github.com/facebook/hhvm/commit/4167d66fcfa81a1c2a19e853282fac968d9cd454
recently in HHVM, which makes array_values() return the same array back if
it's packed instead of building a copy.

My understanding is PHP 7 also has packed arrays like HHVM, and my reading
of PHP's array_values() implementation (here
<https://github.com/php/php-src/blob/c6982995504b6e21e8a5ade29cfb16a55196dc43/ext/standard/array.c#L4000>;)
is that it always creates a copy of the array, even if it's packed.

Would this be a worthwhile optimisation to make in PHP as well?
On Mon, Mar 13, 2017 at 6:34 AM, Jesse Schalken <[email protected]>
wrote:

> I noticed this commit
> <https://github.com/facebook/hhvm/commit/4167d66fcfa81a1c2a1
> 9e853282fac968d9cd454>
> recently in HHVM, which makes array_values() return the same array back if
> it's packed instead of building a copy.
>
> My understanding is PHP 7 also has packed arrays like HHVM, and my reading
> of PHP's array_values() implementation (here
> <https://github.com/php/php-src/blob/c6982995504b6e21e8a5ade
> 29cfb16a55196dc43/ext/standard/array.c#L4000>)
> is that it always creates a copy of the array, even if it's packed.
>
> Would this be a worthwhile optimisation to make in PHP as well?
>



That would be nice if array_values() wouldn't reindex the array.

But it does.

$a = [12=> 'foo', 42 => 'bar']; // packed array
$b = array_values($a);

$b // [0 => 'foo', 42 => 'bar']; // keys have been changed and are not
kept from source


So, array_values() must allocate a new array, and fill it in with the
values of source, because array_values() reindexes the source array from
key 0 ; whatever was the original array.


Note that the allocation is very light, in PHP 7, we allocate a full buffer
at once , and not one buffer for each slot. The performance inpact of such
an allocation is really really tiny.


Julien.Pauli
On Tue, Mar 14, 2017 at 11:21 AM, Julien Pauli <[email protected]> wrote:
>> I noticed this commit
>> <https://github.com/facebook/hhvm/commit/4167d66fcfa81a1c2a1
>> 9e853282fac968d9cd454>
>> recently in HHVM, which makes array_values() return the same array back if
>> it's packed instead of building a copy.
>>
>> My understanding is PHP 7 also has packed arrays like HHVM, and my reading
>> of PHP's array_values() implementation (here
>> <https://github.com/php/php-src/blob/c6982995504b6e21e8a5ade
>> 29cfb16a55196dc43/ext/standard/array.c#L4000>)
>> is that it always creates a copy of the array, even if it's packed.
>>
>> Would this be a worthwhile optimisation to make in PHP as well?
>>
>
> That would be nice if array_values() wouldn't reindex the array.
>
> But it does.
>
> $a = [12=> 'foo', 42 => 'bar']; // packed array
> $b = array_values($a);
>
> $b // [0 => 'foo', 42 => 'bar']; // keys have been changed and are not
> kept from source
>
> So, array_values() must allocate a new array, and fill it in with the
> values of source, because array_values() reindexes the source array from
> key 0 ; whatever was the original array.
>
> Note that the allocation is very light, in PHP 7, we allocate a full buffer
> at once , and not one buffer for each slot. The performance inpact of such
> an allocation is really really tiny.
>
Minor nit: [12=>'foo', 42=>'bar'] is not a packed array.

[0=>'foo', 2=>'bar'] is however, so your primary point stands.
However it should be simple enough to detect when a packed array is
also vector-like (indexed from 0 to n-1) and make this minor
optimization.

-Sara

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
On Wed, Mar 15, 2017 at 3:48 AM, Sara Golemon <[email protected]> wrote:
>
> Minor nit: [12=>'foo', 42=>'bar'] is not a packed array.
>
> [0=>'foo', 2=>'bar'] is however, so your primary point stands.
> However it should be simple enough to detect when a packed array is
> also vector-like (indexed from 0 to n-1) and make this minor
> optimization.
>
> -Sara
>

I didn't realise packed arrays could still have gaps. If that's the case,
why can't [12=>'foo', 42=>'bar'] be packed (index 0-11 and 13-41 are
undefined)?

I agree checking if the array is both packed and without gaps would still
make a worthwhile optimisation.
On Tue, Mar 14, 2017 at 6:25 PM, Jesse Schalken <[email protected]> wrote:
> I didn't realise packed arrays could still have gaps. If that's the case,
> why can't [12=>'foo', 42=>'bar'] be packed (index 0-11 and 13-41 are
> undefined)?
>
It can be, but in practice isn't likely to be since that's a lot of
empty space for a two element array. It will end up mixed pretty
quickly before you get to that point.

> I agree checking if the array is both packed and without gaps would still
> make a worthwhile optimisation.
>
Oh yeah, btw... six hours ago...
https://github.com/php/php-src/commit/c74bc87c74f48bc55541b3bf2fc67d595f58a3b5

-Sara

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
On Wed, Mar 15, 2017 at 10:39 AM, Sara Golemon <[email protected]> wrote:

>
> Oh yeah, btw... six hours ago...
> https://github.com/php/php-src/commit/c74bc87c74f48bc55541b3bf2fc67d
> 595f58a3b5
>
> -Sara
>

Awesome, thanks Sara.
On Wed, Mar 15, 2017 at 12:58 AM, Jesse Schalken <[email protected]>
wrote:

>
> On Wed, Mar 15, 2017 at 10:39 AM, Sara Golemon <[email protected]> wrote:
>
>>
>> Oh yeah, btw... six hours ago...
>> https://github.com/php/php-src/commit/c74bc87c74f48bc55541b3
>> bf2fc67d595f58a3b5
>>
>> -Sara
>>
>
> Awesome, thanks Sara.
>
>

Nice Sara.

Yep my example wasn't that good , but such an array with holes could be
packed, thus pretty unlikely.

For interested souls , I detailed that in an article of mine, targeting 7.0
, at http://jpauli.github.io/2016/04/08/hashtables.html



Julien.P
Sorry, only registered users may post in this forum.

Click here to login