[PHP-DEV] scalar type-casting

Posted by Rasmus Schultz 
Rasmus Schultz
[PHP-DEV] scalar type-casting
April 09, 2017 11:40AM
Since PHP 7.0, I've started using scalar type-hints a lot, and it got me
thinking about scalar type-casting.

After poking through existing RFCs, it turns out I was going to propose
precisely the thing that Anthony Ferrara proposed in 2012:

https://wiki.php.net/rfc/object_cast_to_types

In the light of scalar type-hints, I feel this RFC is now much more
relevant and useful than it was then.

One thing in this RFC jumps out at me though:

"when an internal function accepts a typed parameter by reference, if the
magic cast method is defined on the passed in object, an error is raised as
the cast cannot be performed because of the reference."

I'm afraid this is inconsistent with the behavior of built-in scalar
type-casts.

For example:

function add_one(int &$value) { $value += 1; }
$one = "1";
add_one($one);
var_dump($one); // int(2)

That is, an implicit type-cast made by passing e.g. a string to an int
reference argument has the side-effect of overwriting the input variable.

This behavior may be "okay" in the case of scalars, which, for the most
part, can just ping-pong between actual types - like, if someone were to
subsequently append to what they thought was a string in the above example,
the string-turned-integer would just convert itself back to a string.
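
A minimal sketch of that "ping-pong", assuming PHP 7.1 or later (for the void return type) and the default weak typing mode; the $type_* variable names are only for illustration:

```php
<?php
// Weak (default) typing mode: no declare(strict_types=1) in this file.
function add_one(int &$value): void
{
    $value += 1;
}

$one = "1";
add_one($one);                       // "1" is coerced to int and written back to $one
$type_after_call = gettype($one);    // "integer"

$one .= "0";                         // concatenation converts the int back to a string
$type_after_append = gettype($one);  // "string"

var_dump($type_after_call, $type_after_append, $one);
```

The caller's variable changes type twice without any explicit cast appearing in the calling code.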

The situation would be very different with an object being passed as
argument and cast to integer - if the object was simply replaced with an
integer as a side-effect, clearly this would have much more serious
ramifications than with scalars which can probably be cast back and forth
between various scalar types.

I'm guessing, at the time when scalar type-hints were introduced, you
likely weighed the pros and cons while designing this behavior and decided
it's "good enough", since it's damn near impossible to define another
rational behavior that is side-effect free and would also do something
meaningful with references (?)

It seems that references are once again the culprit that inspired "weird"
design-decisions such as side-effects.

I would call again for the deprecation/removal of references, but I know
that's a major language BC break and very unlikely to bear fruit, so I
won't suggest that.

Instead, I would like you to consider another, much smaller BC break, much
less likely to affect most code: rather than type-casting values when
passed by reference, instead type-check them, and error on mismatch.

That is, in the example above, add_one($one) would trigger an error,
because the variable you passed by reference isn't of the correct type.

I would need to refactor that code slightly, and introduce an intermediary
variable that is actually an integer, then call the function - or in other
words, I would need to write code that expresses what really happens, the
fact that the function operates on an integer variable in the calling scope:

$one = "1";
$one_int = (int) $one;
add_one($one_int);

This is much safer and much more transparent than the potentially very
surprising side-effect of having your local variable overwritten with a
different type.
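
The proposed behavior can be approximated in userland today. This is only a sketch: add_one_checked() is a hypothetical name, and the check is written by hand because the language itself would coerce a typed reference parameter in weak mode:

```php
<?php
// Hypothetical helper: emulates "type-check, don't cast" for a by-reference int.
function add_one_checked(&$value): void
{
    if (!is_int($value)) {
        throw new TypeError(
            'add_one_checked() expects an int variable, ' . gettype($value) . ' given'
        );
    }
    $value += 1;
}

$one = "1";
try {
    add_one_checked($one);   // rejected: $one holds a string
} catch (TypeError $e) {
    // $one is untouched here; it is still the string "1"
}

$one_int = (int) $one;       // the caller states the conversion explicitly
add_one_checked($one_int);
var_dump($one, $one_int);    // string(1) "1", int(2)
```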

The problem I'm describing is pretty serious for the one type-cast that
exists at present: __toString()

Example:

class Foo { public function __toString() { return "foo"; } }
function append_to(string &$str) { $str .= "_bar"; }
$foo = new Foo();
append_to($foo);
var_dump($foo); // string(7) "foo_bar"

In this example, the caller's instance of Foo gets wiped out and replaced
by a string - the "ping pong type-casting" that saved us in the previous
example won't save us this time.

While the side-effects for scalars being replaced by scalars may be "okay"
under most circumstances, I think this kind of side-effect is pretty
unnatural and surprising for any non-scalar type.

Most of the time, arguments are not by-reference, so I think changing
this will likely have a pretty minimal impact on real-world code - and the
workaround (as in the previous example) is pretty easy to implement, and
could likely be fully automated by e.g. PhpStorm, PHP_CodeSniffer's phpcbf
tool, etc.

With this change, what Anthony proposed in 2012 becomes feasible, I think?

(And perhaps it becomes feasible to (later) think about completing the
type-casting feature with support for casting between class/interface
types, but that's another subject...)
Rowan Collins
Re: [PHP-DEV] scalar type-casting
April 09, 2017 01:10PM
On 9 April 2017 10:30:02 BST, Rasmus Schultz wrote:
>Example:
>
>class Foo { public function __toString() { return "foo"; } }
>function append_to(string &$str) { $str .= "_bar"; }
>$foo = new Foo();
>append_to($foo);
>var_dump($foo); // string(7) "foo_bar"
>
>In this example, the caller's instance of Foo gets wiped out and
>replaced
>by a string

While this looks surprising in the form you've written it, it should only really be a surprise to the function author, not the caller. If the caller sees only the signature, then the function can do *literally anything* to their passed by reference variable. The caller is giving full control and "ownership" of that variable, and shouldn't make any assumptions about what it will look like when it comes back.

For example, you don't even need PHP7 to do this:

class Foo { public function __toString() { return "foo"; } }
function append_to(&$str) { $str = (string)$str . "_bar"; }
$foo = new Foo();
append_to($foo);
var_dump($foo); // string(7) "foo_bar"

I don't think it's any more unreasonable for a reference parameter to change type in the parameter handling of your example function (with strict_types off) than inside the body of my example function. Of course, by setting strict_types=1, the caller can change the implicit cast to an implicit assertion, and get an error in your example; it won't save them from my example, though.
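
A minimal sketch of the strict_types case, assuming PHP 7.1 or later and that the caller and the function live in the same file (the $rejected flag is only for illustration):

```php
<?php
declare(strict_types=1);   // must be the first statement in the calling file

class Foo
{
    public function __toString(): string
    {
        return "foo";
    }
}

function append_to(string &$str): void
{
    $str .= "_bar";
}

$foo = new Foo();
$rejected = false;
try {
    append_to($foo);       // strict mode: an object is not accepted for a string parameter
} catch (TypeError $e) {
    $rejected = true;
}

var_dump($rejected, $foo instanceof Foo);   // bool(true), bool(true)
```

Here the call fails before the function body runs, so the caller's $foo is still a Foo instance afterwards.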

Regards,

--
Rowan Collins
[IMSoP]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Rasmus Schultz
Re: [PHP-DEV] scalar type-casting
April 09, 2017 07:10PM
> If the caller sees only the signature, then the function can do
> *literally anything* to their passed by reference variable. The caller is
> giving full control and "ownership" of that variable, and shouldn't make
> any assumptions about what it will look like when it comes back.

Understood, only this isn't the function doing something - it's the
language.

But I guess the best thing under any circumstances is to simply avoid using
references entirely - you won't have these problems then.

The main point was, I'd still very much like to see Anthony's RFC revived
:-)

So I guess this should be updated:

"when an internal function accepts a typed parameter by reference, if the
magic cast method is defined on the passed in object, an error is raised as
the cast cannot be performed because of the reference."

Perhaps to:

"when a function receives a type-hinted parameter by reference, if the
magic cast method is defined on the passed in object, the variable in the
calling scope is immediately overwritten with the result of the type-cast."

I don't happen to like it, but that's consistent with the existing
behavior, right?


Yasuo Ohgaki
Re: [PHP-DEV] scalar type-casting
April 10, 2017 04:10AM
Hi Rasmus,

DbC is not what you need, but it could solve your issue more
efficiently, i.e. with faster execution, not shorter code.

https://wiki.php.net/rfc/dbc2

With DbC, the caller is responsible for passing correct parameters.

On Sun, Apr 9, 2017 at 6:30 PM, Rasmus Schultz wrote:

>
> $one = "1";
> $one_int = (int) $one;
> add_one($one_int);
>


// Proposed DbC syntax (see the RFC above); not valid in current PHP.
function add_one(&$value)
    require (is_int($value))
{
    $value += 1;
}

// Caller has responsibility to pass correct parameters.
$one = filter_var($_GET['var'], FILTER_VALIDATE_INT);
add_one($one);
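
The caller-side validation can be sketched in current PHP with filter_var(). This is only an illustration: require_int_input() is a hypothetical wrapper, and the literal "1" stands in for request input such as $_GET['var']:

```php
<?php
// Hypothetical wrapper: returns an int, or throws if the raw input
// does not validate as an integer.
function require_int_input($raw): int
{
    $value = filter_var($raw, FILTER_VALIDATE_INT);
    if ($value === false) {
        throw new InvalidArgumentException('Expected an integer value');
    }
    return $value;
}

function add_one(int &$value): void
{
    $value += 1;
}

$one = require_int_input("1");   // stands in for something like $_GET['var']
add_one($one);                   // $one is already an int, so nothing is coerced
var_dump($one);                  // int(2)
```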




> class Foo { public function __toString() { return "foo"; } }
> function append_to(string &$str) { $str .= "_bar"; }
> $foo = new Foo();
> append_to($foo);
> var_dump($foo); // string(7) "foo_bar"



class Foo { public function __toString() { return "foo"; } }

// Proposed DbC syntax; not valid in current PHP.
function append_to(&$str)
    require (is_string($str))
{
    $str .= "_bar";
}

$foo = new Foo();

// Caller has responsibility to pass correct parameters, but here it does not
append_to($foo); // Error at DbC precondition check in append_to()
var_dump($foo); // Cannot reach here in dev mode



I really like parameter type checks.
The problem is that type checks make execution slower.
Another problem is that type checks alone are not enough for much code.

With DbC support, we can specify any expressions. Therefore, we can
check much more complex requirements for functions/methods at
development time.



e.g.
function save_age($user_age)
    require (is_int($user_age))
    require ($user_age >= 0)
    require ($user_age < 150)
{
    save_to_somewhere($user_age);
}
// Note: all input parameters must be validated as correct values for the
// app, e.g. with filter_var() etc.
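
In current PHP, something close to these preconditions can be approximated with assert(), which the zend.assertions ini setting can switch off outside development. This is only a sketch, not the RFC's mechanism; the function body is a placeholder that returns the value instead of persisting it:

```php
<?php
// Preconditions written with assert(); whether they run, and how a failure
// is reported, depends on the zend.assertions and assert.* configuration.
function save_age(int $user_age): int
{
    assert($user_age >= 0, 'age must not be negative');
    assert($user_age < 150, 'age must be below 150');

    // Placeholder for real persistence (hypothetical); just hand the value back.
    return $user_age;
}

$saved = save_age(42);
var_dump($saved);   // int(42)
```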


What you really need might be DbC.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
Rasmus Schultz
Re: [PHP-DEV] scalar type-casting
April 10, 2017 10:20AM
My concern is actually neither performance nor brevity - my concern is, can
you read the code and actually understand what it does, can you write code
without running into surprising side-effects, and so on.

DbC might have merit in terms of performance, but perhaps not so much in a
scripting language - if performance was critical to a given project, I
would not be using a scripting language. The addition of DbC and marginally
better performance for certain specific use-cases wouldn't change that for
me.


Yasuo Ohgaki
Re: [PHP-DEV] scalar type-casting
April 10, 2017 10:30PM
Hi Rasmus,

On Mon, Apr 10, 2017 at 5:18 PM, Rasmus Schultz wrote:

> My concern is actually neither performance nor brevity - my concern is,
> can you read the code and actually understand what it does, can you write
> code without running into surprising side-effects, and so on.


Users must not write code that has side effects, just as they must not do
with assert().

DbC has two main merits:
- ensuring program correctness through pre/post conditions (and invariants)
during development.
- better performance and security.

With DbC, it's easy to write and maintain _all_ "necessary and sufficient
conditions" for _every_ function/method, which ensures program correctness.

Unit tests can't achieve what DbC can, i.e. it is not feasible to write unit
tests covering all "necessary and sufficient conditions" for every single
function/method. Invariant checks are even more difficult.

The most important merit of DbC is ensured program correctness, then
security. Performance would be the least important for PHP, as you mentioned.

P.S. DbC is not a replacement for unit tests. Without unit tests,
pre/post/invariant conditions cannot be checked easily or repeatedly.

--
Yasuo Ohgaki
yohgaki@ohgaki.net