Benjamin Coutu
[PHP-DEV] Apply substr() optimization to array_slice()
October 27, 2017 02:20PM
Hello everyone,

Please consider these two statements:

substr($string, 0, $length);
array_slice($array, 0, $length, true);

Currently, with substr(), if $offset is zero and $length is smaller or equal to the original string length we just increase the reference count of the original string and return it via RETURN_STR_COPY.
In that case we completely save the allocation of a new string.

Now, array_slice() could be optimized similarly, but currently (unless the resulting array is expected to be empty) we always create a new array no matter if we actually have to.
The same mechanism as used with substr() could be applied to array_slice(), given that $offset is zero and of course only if $preserve_keys is true.

A patch would look like this:

if (length <= num_in && offset == 0 && preserve_keys) {
/* Copy the original array */
ZVAL_COPY(return_value, input);
return;
}

I'd appreciate if someone could commit this. Thanks.

Cheers,

Benjamin

--

Bejamin Coutu
ben.coutu@zeyos.com

ZeyOS, Inc.
http://www.zeyos.com


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Dennis Birkholz
Re: [PHP-DEV] Apply substr() optimization to array_slice()
October 27, 2017 04:20PM
Hello Benjamin,

Am 27.10.2017 um 14:12 schrieb Benjamin Coutu:
> Hello everyone,
>
> Please consider these two statements:
>
> substr($string, 0, $length);
> array_slice($array, 0, $length, true);
>
> Currently, with substr(), if $offset is zero and $length is smaller or equal to the original string length we just increase the reference count of the original string and return it via RETURN_STR_COPY.
> In that case we completely save the allocation of a new string.

the optimization actually only returns the same string if the supplied
length is the length of the string, see:
https://lxr.room11.org/xref/php-src%40master/ext/standard/string.c#2436

> Now, array_slice() could be optimized similarly, but currently (unless the resulting array is expected to be empty) we always create a new array no matter if we actually have to.
> The same mechanism as used with substr() could be applied to array_slice(), given that $offset is zero and of course only if $preserve_keys is true.
>
> A patch would look like this:
>
> if (length <= num_in && offset == 0 && preserve_keys) {
> /* Copy the original array */
> ZVAL_COPY(return_value, input);
> return;
> }

So the array_slice optimization should only do the some, otherwise the
returned array would be longer than desired.

Greets,
Dennis

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Sara Golemon
Re: [PHP-DEV] Apply substr() optimization to array_slice()
October 27, 2017 04:20PM
On Fri, Oct 27, 2017 at 8:12 AM, Benjamin Coutu <[email protected]> wrote:
> Now, array_slice() could be optimized similarly, but currently
> (unless the resulting array is expected to be empty) we always
> create a new array no matter if we actually have to.
>
Pushed, with an additional escape hatch for vector-like arrays (which
are implicitly like preserve_keys). In the future though, please use
the PR process. Thanks.

-Sara

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Nikita Popov
Re: [PHP-DEV] Apply substr() optimization to array_slice()
October 27, 2017 06:20PM
On Fri, Oct 27, 2017 at 4:16 PM, Sara Golemon <[email protected]> wrote:

> On Fri, Oct 27, 2017 at 8:12 AM, Benjamin Coutu <[email protected]>
> wrote:
> > Now, array_slice() could be optimized similarly, but currently
> > (unless the resulting array is expected to be empty) we always
> > create a new array no matter if we actually have to.
> >
> Pushed, with an additional escape hatch for vector-like arrays (which
> are implicitly like preserve_keys). In the future though, please use
> the PR process. Thanks.
>

Unfortunately these optimizations are subtly incorrect in the current form,
because arrays have a bunch of additional hidden state. See
https://bugs.php.net/bug.php?id=75433 for a (not yet fixed) issue that
resulted from similar optimizations in 7.2. We'll have to review all the
places where we apply optimizations like these and make sure that we're not
introducing incorrect behavior wrt the next free element or internal array
pointer.

Nikita
Sara Golemon
Re: [PHP-DEV] Apply substr() optimization to array_slice()
October 27, 2017 06:40PM
On Fri, Oct 27, 2017 at 12:10 PM, Nikita Popov <[email protected]> wrote:
> On Fri, Oct 27, 2017 at 4:16 PM, Sara Golemon <[email protected]> wrote:
>>
>> On Fri, Oct 27, 2017 at 8:12 AM, Benjamin Coutu <[email protected]>
>> wrote:
>> > Now, array_slice() could be optimized similarly, but currently
>> > (unless the resulting array is expected to be empty) we always
>> > create a new array no matter if we actually have to.
>> >
>> Pushed, with an additional escape hatch for vector-like arrays (which
>> are implicitly like preserve_keys). In the future though, please use
>> the PR process. Thanks.
>
>
> Unfortunately these optimizations are subtly incorrect in the current form,
> because arrays have a bunch of additional hidden state. See
> https://bugs.php.net/bug.php?id=75433 for a (not yet fixed) issue that
> resulted from similar optimizations in 7.2. We'll have to review all the
> places where we apply optimizations like these and make sure that we're not
> introducing incorrect behavior wrt the next free element or internal array
> pointer.
>
It took awhile to unwind the repro case given in the bug report, but
this seems to be a simpler example:
https://3v4l.org/rqphD

Basically, that next insert index for the output of array_values()
should be reset, and with the optimization it's not.

For array_values() the quick and dirty fix is adding "&& nextIndex ==
num_elements" psuedocode. In the case of array_slice, the same will
be true, so I agree we should be careful about applying such
optimizations.

I'll clean up these uses now, and would suggest something like:

zend_array* zend_hash_duplicate(zend_array* input, zend_bool
preserve_keys); type API which can be a certral place for making that
kind of short-circuit versus slow-copy decision when called from
places like array_values() and array_slice().

-Sara

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Benjamin Coutu
Re: [PHP-DEV] Apply substr() optimization to array_slice()
October 27, 2017 11:20PM
Yes, abstracting such additional checks through something like zend_hash_duplicate() would make sense, but IMHO it should be named differently, e.g. zend_hash_copy_lazy. So to be analogous to zend_string_copy(), but not to be confused with the existing zend_hash_copy().

By the way: array_pad() in L4320, array_unique() in L4476 and array_diff() in L5415 also return the array without those kind of checks (analougous to what I have proposed for the array_slice() case). These existing bugs would have to be fixed as well.

- Ben -

========== Original ==========
From: Sara Golemon <[email protected]>
To: Nikita Popov <[email protected]>
Date: Fri, 27 Oct 2017 18:34:15 +0200
Subject: Re: [PHP-DEV] Apply substr() optimization to array_slice()

>
>
> On Fri, Oct 27, 2017 at 12:10 PM, Nikita Popov <[email protected]> wrote:
> > On Fri, Oct 27, 2017 at 4:16 PM, Sara Golemon <[email protected]> wrote:
> >>
> >> On Fri, Oct 27, 2017 at 8:12 AM, Benjamin Coutu <[email protected]>
> >> wrote:
> >> > Now, array_slice() could be optimized similarly, but currently
> >> > (unless the resulting array is expected to be empty) we always
> >> > create a new array no matter if we actually have to.
> >> >
> >> Pushed, with an additional escape hatch for vector-like arrays (which
> >> are implicitly like preserve_keys). In the future though, please use
> >> the PR process. Thanks.
> >
> >
> > Unfortunately these optimizations are subtly incorrect in the current form,
> > because arrays have a bunch of additional hidden state. See
> > https://bugs.php.net/bug.php?id=75433 for a (not yet fixed) issue that
> > resulted from similar optimizations in 7.2. We'll have to review all the
> > places where we apply optimizations like these and make sure that we're not
> > introducing incorrect behavior wrt the next free element or internal array
> > pointer.
> >
> It took awhile to unwind the repro case given in the bug report, but
> this seems to be a simpler example:
> https://3v4l.org/rqphD
>
> Basically, that next insert index for the output of array_values()
> should be reset, and with the optimization it's not.
>
> For array_values() the quick and dirty fix is adding "&& nextIndex ==
> num_elements" psuedocode. In the case of array_slice, the same will
> be true, so I agree we should be careful about applying such
> optimizations.
>
> I'll clean up these uses now, and would suggest something like:
>
> zend_array* zend_hash_duplicate(zend_array* input, zend_bool
> preserve_keys); type API which can be a certral place for making that
> kind of short-circuit versus slow-copy decision when called from
> places like array_values() and array_slice().
>
> -Sara


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Dmitry Stogov
Re: [PHP-DEV] Apply substr() optimization to array_slice()
October 30, 2017 12:00PM
The optimization also broke Zend/tests/bug72598.phpt and Zend/tests/bug72598_2.phpt

________________________________
From: php@golemon.com <[email protected]> on behalf of Sara Golemon <[email protected]>
Sent: Friday, October 27, 2017 7:34:15 PM
To: Nikita Popov
Cc: Benjamin Coutu; PHP Internals; Dmitry Stogov
Subject: Re: [PHP-DEV] Apply substr() optimization to array_slice()

On Fri, Oct 27, 2017 at 12:10 PM, Nikita Popov <[email protected]> wrote:
> On Fri, Oct 27, 2017 at 4:16 PM, Sara Golemon <[email protected]> wrote:
>>
>> On Fri, Oct 27, 2017 at 8:12 AM, Benjamin Coutu <[email protected]>
>> wrote:
>> > Now, array_slice() could be optimized similarly, but currently
>> > (unless the resulting array is expected to be empty) we always
>> > create a new array no matter if we actually have to.
>> >
>> Pushed, with an additional escape hatch for vector-like arrays (which
>> are implicitly like preserve_keys). In the future though, please use
>> the PR process. Thanks.
>
>
> Unfortunately these optimizations are subtly incorrect in the current form,
> because arrays have a bunch of additional hidden state. See
> https://bugs.php.net/bug.php?id=75433 for a (not yet fixed) issue that
> resulted from similar optimizations in 7.2. We'll have to review all the
> places where we apply optimizations like these and make sure that we're not
> introducing incorrect behavior wrt the next free element or internal array
> pointer.
>
It took awhile to unwind the repro case given in the bug report, but
this seems to be a simpler example:
https://3v4l.org/rqphD

Basically, that next insert index for the output of array_values()
should be reset, and with the optimization it's not.

For array_values() the quick and dirty fix is adding "&& nextIndex ==
num_elements" psuedocode. In the case of array_slice, the same will
be true, so I agree we should be careful about applying such
optimizations.

I'll clean up these uses now, and would suggest something like:

zend_array* zend_hash_duplicate(zend_array* input, zend_bool
preserve_keys); type API which can be a certral place for making that
kind of short-circuit versus slow-copy decision when called from
places like array_values() and array_slice().

-Sara
Sara Golemon
Re: [PHP-DEV] Apply substr() optimization to array_slice()
October 30, 2017 04:20PM
On Mon, Oct 30, 2017 at 6:51 AM, Dmitry Stogov <[email protected]> wrote:
> The optimization also broke Zend/tests/bug72598.phpt and Zend/tests/bug72598_2.phpt
>
Damn. I had a feeling references would rear their ugly head here.

Fuggit, reverted. We can take another swing at it later, but it's not
a vital optimization.

-Sara

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Nikita Popov
Re: [PHP-DEV] Apply substr() optimization to array_slice()
October 30, 2017 04:20PM
On Mon, Oct 30, 2017 at 4:09 PM, Sara Golemon <[email protected]> wrote:

> On Mon, Oct 30, 2017 at 6:51 AM, Dmitry Stogov <dmitry[email protected]> wrote:
> > The optimization also broke Zend/tests/bug72598.phpt and
> Zend/tests/bug72598_2.phpt
> >
> Damn. I had a feeling references would rear their ugly head here.
>
> Fuggit, reverted. We can take another swing at it later, but it's not
> a vital optimization.
>
> -Sara
>

This is really more of a bug in call_user_func_array(). It should be
treating rc=1 references as non-references and throw a warning in that case
as well. That may not be very popular behavior though...

Nikita
Sorry, only registered users may post in this forum.

Click here to login