Thomas Punt
[PHP-DEV] [RFC] Flexible Heredoc and Nowdoc Syntaxes
October 13, 2017 11:50AM
Morning internals,


I'd like to propose an RFC to make the heredoc and nowdoc syntaxes more flexible[1]. Any thoughts?


Thanks,

Tom


[1]: https://wiki.php.net/rfc/flexible_heredoc_nowdoc_syntaxes
Fleshgrinder
Re: [PHP-DEV] [RFC] Flexible Heredoc and Nowdoc Syntaxes
October 13, 2017 06:40PM
On 10/13/2017 11:40 AM, Thomas Punt wrote:
> Morning internals,
>
>
> I'd like to propose an RFC to make the heredoc and nowdoc syntaxes more flexible[1]. Any thoughts?
>
>
> Thanks,
>
> Tom
>
>
> [1]: https://wiki.php.net/rfc/flexible_heredoc_nowdoc_syntaxes
>

Hi Tom!

I love it, definitely should go in. I am sure there are some special
cases that will be discovered in this discussion, but it's an awesome
feature.

--
Richard "Fleshgrinder" Fussenegger
Ryan Pallas
Re: [PHP-DEV] [RFC] Flexible Heredoc and Nowdoc Syntaxes
October 13, 2017 09:30PM
On Fri, Oct 13, 2017 at 3:40 AM, Thomas Punt <[email protected]> wrote:

> Morning internals,
>
>
> I'd like to propose an RFC to make the heredoc and nowdoc syntaxes more
> flexible[1]. Any thoughts?
>

I really like this, and I don't think it's that hard to not use the marker
in the body at the far left of a line unattached to other characters. In my
experience, people that use these syntaxes (like me) already make sure not
to use something within the body.


>
>
> Thanks,
>
> Tom
>
>
> [1]: https://wiki.php.net/rfc/flexible_heredoc_nowdoc_syntaxes
>
Christopher Jones
Re: [PHP-DEV] [RFC] Flexible Heredoc and Nowdoc Syntaxes
October 24, 2017 02:50AM
On 13/10/17 8:40 pm, Thomas Punt wrote:
> Morning internals,
>
>
> I'd like to propose an RFC to make the heredoc and nowdoc syntaxes more flexible[1]. Any thoughts?
>
>
> Thanks,
>
> Tom
>
>
> [1]: https://wiki.php.net/rfc/flexible_heredoc_nowdoc_syntaxes
>

I like the added flexibility in placement of the end token, but I think requiring only tabs or spaces, and stripping whitespace from all {here|now}doc
lines is error prone and adds unnecessary complexity.

Chris

--
http://twitter.com/ghrd


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Thomas Punt
Re: [PHP-DEV] [RFC] Flexible Heredoc and Nowdoc Syntaxes
October 24, 2017 11:30AM
Hi Christopher,


> I like the added flexibility in placement of the end token, but I think requiring only tabs or spaces, and stripping whitespace from all {here|now}doc
> lines is error prone and adds unnecessary complexity.

I agree that the requirement for using either tabs or spaces is not necessary, but
I included it because it does help with readability when looking at the indentation
level of the heredoc and nowdoc (and subsequently how much whitespace will
be stripped from each line). With respect to the stripping of whitespace, however,
I feel that this is definitely necessary. If it was not stripped, then indenting the
closing token and body will cause a lot of whitespace to prepend every line in
the body of text. This is definitely not desirable, and may cause programmers to
continue to not indent the body of the heredoc/nowdoc, which leads us back to
where we currently are of having indentation of code ruined with such syntaxes.


Other languages follow these semantics of stripping whitespace from new lines
according to the indentation of the closing marker, such as Elixir (normal """ syntax)
and Ruby (special <<~ syntax).
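
For illustration only, here is a minimal Python sketch of the stripping rule described above: the indentation of the closing marker decides how many leading whitespace characters are removed from every body line. It is a model of the semantics, not the RFC's implementation; the function name and the sample body are made up for the example, and it assumes the body and the marker use the same kind of whitespace.

```python
def strip_by_closing_marker(body_lines, closing_indent):
    """Remove as many leading characters as the closing marker's indentation."""
    n = len(closing_indent)
    stripped = []
    for line in body_lines:
        if line.strip() == "":
            # Simplification: whitespace-only lines become empty lines.
            stripped.append("")
            continue
        if line[:n].strip(" \t") != "":
            # Assumption for this sketch: a body line may not be indented
            # less than the closing marker.
            raise ValueError("body line is indented less than the closing marker")
        stripped.append(line[n:])
    return "\n".join(stripped)


# Body and closing marker are both indented by 8 spaces in the source file.
body = [
    "        <ul>",
    "            <li>item</li>",
    "        </ul>",
]
result = strip_by_closing_marker(body, " " * 8)
print(result)
```

Only the relative indentation inside the body (the extra four spaces before `<li>`) survives, which is what lets the body follow the indentation of the surrounding code.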


Thanks,

Tom
Nikita Popov
Re: [PHP-DEV] [RFC] Flexible Heredoc and Nowdoc Syntaxes
October 24, 2017 12:10PM
On Tue, Oct 24, 2017 at 11:27 AM, Thomas Punt <[email protected]> wrote:

> Hi Christopher,
>
>
> > I like the added flexibility in placement of the end token, but I think
> requiring only tabs or spaces, and stripping whitespace from all
> {here|now}doc
> > lines is error prone and adds unnecessary complexity.
>
> I agree that the requirement for using either tabs or spaces is not
> necessary, but
> I included it because it does help with readability when looking at the
> indentation
> level of the heredoc and nowdoc (and subsequently how much whitespace will
> be stripped from each line).


It's not just a question of readability. You just can't strip a mixed
space/tab indentation unless you specify a tab width. If one line is
indented with two tabs and the other with 16 spaces, what do you strip? All
16 spaces (ts=8)? Only 8 (ts=4)? Only 4 (ts=2)? Unless we want to specify
the One True Tab Width or introduce an ini setting for this, it's not
really possible to handle this in a reasonable way.
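
The arithmetic can be checked with a short Python sketch. It only measures how many columns two tabs occupy at different tab widths; the helper name is hypothetical and nothing here reflects how PHP's lexer works.

```python
def visual_width(indent, tab_width):
    """Number of columns the indentation occupies at the given tab width."""
    return len(indent.expandtabs(tab_width))


two_tabs = "\t\t"
sixteen_spaces = " " * 16

for ts in (8, 4, 2):
    cols = visual_width(two_tabs, ts)
    # The number of spaces "equivalent" to two tabs changes with the tab width.
    print(f"ts={ts}: two tabs cover {cols} of the {len(sixteen_spaces)} space columns")
```

The same two bytes of indentation correspond to 16, 8, or 4 columns, so there is no single answer to how many spaces should be stripped from the other line.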

> With respect to the stripping of whitespace, however,
> I feel that this is definitely necessary. If it was not stripped, then
> indenting the
> closing token and body will cause a lot of whitespace to prepend every
> line in
> the body of text. This is definitely not desirable, and may cause
> programmers to
> continue to not indent the body of the heredoc/nowdoc, which leads us back
> to
> where we currently are of having indentation of code ruined with such
> syntaxes.
>
>
> Other languages follow these semantics of stripping whitespace from new
> lines
>
> according to the indentation of the closing marker, such as Elixir (normal
> """ syntax)
>
> and Ruby (special <<~ syntax).
>
>
> Thanks,
>
> Tom
>
Christopher Jones
Re: [PHP-DEV] [RFC] Flexible Heredoc and Nowdoc Syntaxes
October 25, 2017 07:40AM
On 24/10/17 8:27 pm, Thomas Punt wrote:
>
> Hi Christopher,
>
>
> > I like the added flexibility in placement of the end token, but I think requiring only tabs or spaces, and stripping whitespace from all
> {here|now}doc
> > lines is error prone and adds unnecessary complexity.
>
> I agree that the requirement for using either tabs or spaces is not necessary, but
> I included it because it does help with readability when looking at the indentation
> level of the heredoc and nowdoc (and subsequently how much whitespace will
> be stripped from each line). With respect to the stripping of whitespace, however,
> I feel that this is definitely necessary.

If developers accidentally add/subtract leading space from the closing token then the whole string changes; this can lead to subtle bugs and annoyances.

Chris

> If it was not stripped, then indenting the
> closing token and body will cause a lot of whitespace to prepend every line in
> the body of text. This is definitely not desirable, and may cause programmers to
> continue to not indent the body of the heredoc/nowdoc, which leads us back to
> where we currently are of having indentation of code ruined with such syntaxes.
>
> Other languages follow these semantics of stripping whitespace from new lines
>
> according to the indentation of the closing marker, such as Elixir (normal """ syntax)
>
> and Ruby (special <<~ syntax).
>
>
> Thanks,
>
> Tom
>

--
http://twitter.com/ghrd
Thomas Punt
Re: [PHP-DEV] [RFC] Flexible Heredoc and Nowdoc Syntaxes
October 27, 2017 08:40PM
Hi,


> If developers accidentally add/subtract leading space from the closing token then the whole string changes;


Yes, this is a feature of the chosen semantics. The indentation level of the body can be chosen based upon the current indentation level of the code (for which, the closing marker should be lined up to), not the indentation level from the start of the line (which may cause developers to indent the body text less to prevent leading whitespace, leading us back to the current situation of having indentation levels ruined by these syntaxes).


> this can lead to subtle bugs and annoyances.

I think this clause is a little too exaggerated. Once a developer understands that the closing token guides the indentation level of the body text, then the cause of the change in whitespace should be pretty obvious (if it's not already visually obvious from the fact that they must have broken the indentation level of their own code).
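
To make the effect under discussion concrete, here is a small Python model (not PHP, and not the RFC's patch) in which the closing marker's indentation width is the only input that changes. The function name and the sample body are hypothetical, and the sketch assumes space-only indentation.

```python
def dedent_to_marker(body_lines, marker_indent_width):
    """Drop marker_indent_width leading characters from every body line."""
    return "\n".join(line[marker_indent_width:] for line in body_lines)


body = [
    "        SELECT *",
    "        FROM users",
]

aligned = dedent_to_marker(body, 8)  # marker lined up with the body
shifted = dedent_to_marker(body, 4)  # marker accidentally moved 4 columns left

print(repr(aligned))
print(repr(shifted))
```

With the marker aligned the result has no leading whitespace; moving it four columns to the left leaves four spaces in front of every line, so the change shows up uniformly across the whole string.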

-Tom
Stephen Reay
Re: [PHP-DEV] [RFC] Flexible Heredoc and Nowdoc Syntaxes
October 28, 2017 10:10AM
> On 24 Oct 2017, at 4:58 pm, Nikita Popov <[email protected]> wrote:
>
> On Tue, Oct 24, 2017 at 11:27 AM, Thomas Punt <[email protected]> wrote:
>
>> Hi Christopher,
>>
>>
>>> I like the added flexibility in placement of the end token, but I think
>> requiring only tabs or spaces, and stripping whitespace from all
>> {here|now}doc
>>> lines is error prone and adds unnecessary complexity.
>>
>> I agree that the requirement for using either tabs or spaces is not
>> necessary, but
>> I included it because it does help with readability when looking at the
>> indentation
>> level of the heredoc and nowdoc (and subsequently how much whitespace will
>> be stripped from each line).
>
>
> It's not just a question of readability. You just can't strip a mixed
> space/tab indentation unless you specify a tab width. If one line is
> indented with two tabs and the other with 16 spaces, what do you strip? All
> 16 spaces (ts=8)? Only 8 (ts=4)? Only 4 (ts=2)? Unless we want to specify
> the One True Tab Width or introduce an ini setting for this, it's not
> really possible to handle this in a reasonable way.
>
> With respect to the stripping of whitespace, however,
>> I feel that this is definitely necessary. If it was not stripped, then
>> indenting the
>> closing token and body will cause a lot of whitespace to prepend every
>> line in
>> the body of text. This is definitely not desirable, and may cause
>> programmers to
>> continue to not indent the body of the heredoc/nowdoc, which leads us back
>> to
>> where we currently are of having indentation of code ruined with such
>> syntaxes.
>>
>>
>> Other languages follow these semantics of stripping whitespace from new
>> lines
>>
>> according to the indentation of the closing marker, such as Elixir (normal
>> """ syntax)
>>
>> and Ruby (special <<~ syntax).
>>
>>
>> Thanks,
>>
>> Tom
>>

Hi Nikita,

I disagree. To me this change would simply mean that the literal exact white-space string preceding the end marker is removed as a prefix from all lines.

So if you have an end marker indented by two tab characters, but all the ‘content’ lines of the heredoc are indented by 8 spaces, nothing is removed.

While I agree that features shouldn’t be surprising, I think converting tabs to spaces or vice-versa *would* be surprising.
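
A rough Python sketch of that literal-prefix interpretation follows. It is only a model of the rule as described in this message; the function name is invented for the example, and no tab-to-space conversion takes place.

```python
def strip_literal_prefix(body_lines, marker_indent):
    """Remove marker_indent only where a line starts with exactly that string."""
    result = []
    for line in body_lines:
        if line.startswith(marker_indent):
            result.append(line[len(marker_indent):])
        else:
            # No exact match at the start of the line: leave it untouched.
            result.append(line)
    return result


marker_indent = "\t\t"  # end marker indented by two tab characters

# Content indented by 8 spaces: the prefix does not match, nothing is removed.
spaces_body = strip_literal_prefix(["        foo"], marker_indent)

# Content indented by tabs: the two leading tabs are removed from each line.
tabs_body = strip_literal_prefix(["\t\tfoo", "\t\t\tbar"], marker_indent)

print(spaces_body)
print(tabs_body)
```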

Also, sidenote: ‘heredoc’ is listed as a spelling mistake with a suggestion of ‘heretic’. Maybe the computers know more than we do.

Cheers

Stephen
Thomas Punt
Re: [PHP-DEV] [RFC] Flexible Heredoc and Nowdoc Syntaxes
October 28, 2017 01:20PM
Hi Stephen,

> I disagree. To me this change would simply mean that the literal exact white-space string preceding the end marker is removed as a prefix from all lines.
>
> So if you have an end marker indented by two tab characters, but all the ‘content’ lines of the heredoc are indented by 8 spaces, nothing is removed.

So let's say the ending marker is indented with [space][tab][space] and the body is indented with [tab][space][tab], by your logic, a [space] and a [tab] should be removed from the body's indentation? Or nothing at all?

The problem with such rules is that they bring more complication to the implementation when it comes down to the finer details. It is far better to simply disallow such nonsense to begin with.

A nice benefit of this (choosing a stricter approach to begin with) is that, should we see advantages to bringing more leniency to the current semantics (such as enabling the mixture of tabs and spaces), then we can enable this later without causing any new BC breaks. Whereas if we introduce a really loose-style syntax to begin with, then we cannot make it stricter later on without introducing new BC breaks.
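
As a sketch of what the stricter approach could look like, the Python below rejects any heredoc whose indentation mixes tabs and spaces instead of trying to interpret it. This is an assumption-laden model: the function name, the error message, and the choice to check the body and closing marker together are all illustrative, not taken from the RFC's implementation.

```python
def check_indentation(lines):
    """Return the set of indentation characters used; raise if tabs and spaces are mixed."""
    seen = set()
    for line in lines:
        # Leading run of spaces and tabs on this line.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        seen.update(indent)
    if seen == {" ", "\t"}:
        raise ValueError("indentation mixes tabs and spaces")
    return seen


# Consistent indentation passes.
print(check_indentation(["    body line", "    END"]))

# The [space][tab][space] marker with a [tab][space][tab] body is rejected outright.
try:
    check_indentation(["\t \tbody line", " \t END"])
except ValueError as err:
    print("rejected:", err)
```

Rejecting the mixed case up front means no stripping rule has to be defined for it, which is also what keeps the door open for relaxing the rule later.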

-Tom
Stephen Reay
Re: [PHP-DEV] [RFC] Flexible Heredoc and Nowdoc Syntaxes
October 28, 2017 02:20PM
> On 28 Oct 2017, at 18:17, Thomas Punt <[email protected]> wrote:
>
> Hi Stephen,
>
> > I disagree. To me this change would simply mean that the literal exact white-space string preceding the end marker is removed as a prefix from all lines.
> >
> > So if you have an end marker indented by two tab characters, but all the ‘content’ lines of the heredoc are indented by 8 spaces, nothing is removed.
>
> So let's say the ending marker is indented with [space][tab][space] and the body is indented with [tab][space][tab], by your logic, a [space] and a [tab] should be removed from the body's indentation? Or nothing at all?
>
> The problem with such rules is that they bring more complication to the implementation when it comes down to the finer details. It is far better to simply disallow such nonsense to begin with.
>
> A nice benefit of this (choosing a stricter approach to begin with) is that, should we see advantages to bringing more leniency to the current semantics (such as enabling the mixture of tabs and spaces), then we can enable this later without causing any new BC breaks. Whereas if we introduce a really loose-style syntax to begin with, then we cannot make it stricter later on without introducing new BC breaks.
>
> -Tom

Hi Tom

In your scenario, nothing at all would be removed. I'd imagine an exact substring anchored to the start-of-line is what's matched for any replacement.

Strict removal like that is the only thing I can see that makes sense - any conversion is likely to lead to developer surprise.

Cheers

Stephen