[PHP-DEV] [RFC] Reproducible Builds Support

Posted by Jelle van der Waa 
Jelle van der Waa
[PHP-DEV] [RFC] Reproducible Builds Support
December 11, 2017 10:20PM
Hi all,

Debian, Arch Linux and other distros are trying to get fully
reproducible builds. There are some issues in PHP's codebase which make
builds unreproducible. Reproducibility is currently tested in Arch Linux
by building PHP twice, in two different environments, varying the
hostname, system time, etc. [1]

One issue is PHP_BUILD_DATE, which makes the build non-reproducible.
I've made a PR which uses SOURCE_DATE_EPOCH, which is set in the
reproducible build environment. This should keep the current
functionality intact, while adding support for reproducible builds. [2]
[3]
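
A minimal sketch of the idea in shell, not the code from the PR: prefer
SOURCE_DATE_EPOCH when it is set and fall back to the current clock
otherwise. The helper name build_date and the output format are
assumptions made for illustration, and the -d "@seconds" form assumes
GNU date.

```shell
# Hypothetical helper: print a build date, preferring SOURCE_DATE_EPOCH.
# Assumes GNU date; BSD date would need "-r <seconds>" instead of -d.
build_date() {
    if [ -n "${SOURCE_DATE_EPOCH:-}" ]; then
        # Reproducible: the timestamp comes from the environment, in UTC.
        LC_ALL=C date -u -d "@${SOURCE_DATE_EPOCH}" '+%Y-%m-%d %H:%M:%S'
    else
        # Fallback: whatever the clock says at build time.
        LC_ALL=C date -u '+%Y-%m-%d %H:%M:%S'
    fi
}
```

With SOURCE_DATE_EPOCH=0 this prints 1970-01-01 00:00:00 regardless of
when or where the build runs.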

Another issue is the php_uname() function, which embeds the hostname of
the build machine; since the hostname is varied per build, this makes
the build non-reproducible. This is caused by the following line:

configure.ac:PHP_UNAME=`uname -a | xargs` required in:
ext/standard/info.c: php_uname = PHP_UNAME;

This is there as a fallback, as the php.net documentation describes:

"On some older UNIX platforms, it may not be able to determine the
current OS information in which case it will revert to displaying the OS
PHP was built on. This will only happen if your uname() library call
either doesn't exist or doesn't work."

I would argue that this is strange, unexpected behaviour; maybe it
should throw an exception instead? Or could it show only "Linux" as the
fallback, i.e. basically PHP_OS? Ideas?

The last issue is that phar.phar is non-reproducible, and I am not sure
what the cause is. I'm not sure how the binary data in phar.phar is
generated.

[1] https://tests.reproducible-builds.org/archlinux/extra/php/php-7.2.0-2-x86_64.pkg.tar.xz.html
[2] https://github.com/php/php-src/pull/2965
[3] https://reproducible-builds.org/specs/source-date-epoch/

Thanks,

--
Jelle van der Waa

Arch Linux Developer

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Stanislav Malyshev
Re: [PHP-DEV] [RFC] Reproducible Builds Support
December 12, 2017 08:40AM
Hi!

> One issue is PHP_BUILD_DATE, which makes the build non-reproducible.
> I've made a PR which uses SOURCE_DATE_EPOCH, which is set in the
> reproducible build environment. This should keep the current
> functionality intact, while adding support for reproducible builds. [2]
> [3]

SOURCE_DATE_EPOCH (or any other variable) looks like a good way to make
it predictable.

> Another issue is the php_uname() function, which embeds the hostname of
> the build machine; since the hostname is varied per build, this makes
> the build non-reproducible. This is caused by the following line:
>
> configure.ac:PHP_UNAME=`uname -a | xargs` required in:
> ext/standard/info.c: php_uname = PHP_UNAME;

I think the best solution here would be to have another variable to
override this.
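
One way such an override could look, sketched in shell: an environment
variable takes precedence, and the existing `uname -a | xargs` call
remains the default. Reusing the name PHP_UNAME for the environment
variable and the helper name resolve_php_uname are assumptions for
illustration; the thread does not settle on a name.

```shell
# Hypothetical override: honour PHP_UNAME from the environment if it is
# set, otherwise fall back to what configure.ac computes today.
resolve_php_uname() {
    if [ -n "${PHP_UNAME:-}" ]; then
        # Value supplied by the packager, identical on every build host.
        printf '%s\n' "$PHP_UNAME"
    else
        # Existing behaviour: includes the hostname of the build machine.
        uname -a | xargs
    fi
}
```

A distribution build script would then export the variable once before
running configure, and ordinary builds would be unaffected.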

> I would argue that this is strange, unexpected behaviour; maybe it
> should throw an exception instead? Or could it show only "Linux" as the
> fallback, i.e. basically PHP_OS? Ideas?

If those old systems run PHP and need uname, changing stuff there is
probably harder and more expensive than on other systems. With this in
mind, I'd rather not mess with it, especially for a purpose that can
easily be achieved without it.

--
Stas Malyshev
smalyshev@gmail.com

Jelle van der Waa
Re: [PHP-DEV] [RFC] Reproducible Builds Support
December 12, 2017 10:00PM
On 12/11/17 at 11:33pm, Stanislav Malyshev wrote:
> Hi!
>
> > One issue is PHP_BUILD_DATE, which makes the build non-reproducible.
> > I've made a PR which uses SOURCE_DATE_EPOCH, which is set in the
> > reproducible build environment. This should keep the current
> > functionality intact, while adding support for reproducible builds. [2]
> > [3]
>
> SOURCE_DATE_EPOCH (or any other variable) looks like a good way to make
> it predictable.
>
> > Another issue is the php_uname() function, which embeds the hostname of
> > the build machine; since the hostname is varied per build, this makes
> > the build non-reproducible. This is caused by the following line:
> >
> > configure.ac:PHP_UNAME=`uname -a | xargs` required in:
> > ext/standard/info.c: php_uname = PHP_UNAME;
>
> I think the best solution here would be to have another variable to
> override this.

The issue with this approach would be that every distribution has to set
this variable. I know it's the same with SOURCE_DATE_EPOCH, but that is
well established.
>
> > I would argue that this is strange, unexpected behaviour; maybe it
> > should throw an exception instead? Or could it show only "Linux" as the
> > fallback, i.e. basically PHP_OS? Ideas?
>
> If those old systems run PHP and need uname, changing stuff there is
> probably harder and more expensive than on other systems. With this in
> mind, I'd rather not mess with it, especially for a purpose that can
> easily be achieved without it.

Hmmm, true, but the fallback being the hostname of the machine PHP was
built on seems a little bit odd, doesn't it?

--
Jelle van der Waa

Levi Morrison
Re: [PHP-DEV] [RFC] Reproducible Builds Support
December 12, 2017 10:20PM
On Mon, Dec 11, 2017 at 2:11 PM, Jelle van der Waa wrote:
> Hi all,
>
> Debian, Arch Linux and other distros are trying to get fully
> reproducible builds. There are some issues in PHP's codebase which make
> builds unreproducible. Reproducibility is currently tested in Arch Linux
> by building PHP twice, in two different environments, varying the
> hostname, system time, etc. [1]
>
> One issue is PHP_BUILD_DATE, which makes the build non-reproducible.
> I've made a PR which uses SOURCE_DATE_EPOCH, which is set in the
> reproducible build environment. This should keep the current
> functionality intact, while adding support for reproducible builds. [2]
> [3]

It looks good to me.

> Another issue is the php_uname() function, which embeds the hostname of
> the build machine; since the hostname is varied per build, this makes
> the build non-reproducible. This is caused by the following line:
>
> configure.ac:PHP_UNAME=`uname -a | xargs` required in:
> ext/standard/info.c: php_uname = PHP_UNAME;
>
> This is there as a fallback, as the php.net documentation describes:
>
> "On some older UNIX platforms, it may not be able to determine the
> current OS information in which case it will revert to displaying the OS
> PHP was built on. This will only happen if your uname() library call
> either doesn't exist or doesn't work."
>
> I would argue that this is strange, unexpected behaviour; maybe it
> should throw an exception instead? Or could it show only "Linux" as the
> fallback, i.e. basically PHP_OS? Ideas?

I wouldn't throw an exception here. It seems PHP_OS is
under-documented; maybe PHP_OS_FAMILY is better:

> The operating system family PHP was built for. Either of 'Windows', 'BSD', 'Darwin', 'Solaris', 'Linux' or 'Unknown'. Available as of PHP 7.2.0.

However, I really don't think we should change this for already
released PHP versions. We should ask our maintainers how they feel
about changing it in an x.y.NEXT patch. My inclination is to do this for
PHP 7.3 and beyond and accept that official PHP sources of earlier
versions will not produce reproducible builds.

> The last issue is that phar.phar is non-reproducible, and I am not sure
> what the cause is. I'm not sure how the binary data in phar.phar is
> generated.

Phars are like `tars` that are also valid PHP files. This means there
are probably modification times, etc, set in there. Not sure what else
would need to be changed.

Stanislav Malyshev
Re: [PHP-DEV] [RFC] Reproducible Builds Support
December 13, 2017 01:10AM
Hi!

>> I think the best solution here would be to have another variable to
>> override this.
>
> The issue with this approach would be that every distribution has to set
> this variable. I know it's the same with SOURCE_DATE_EPOCH, but that is
> well established.

Only the distros that want a reproducible build of PHP. But I assume
they need to do some special magic to initiate a reproducible build
anyway; if so, we could document the procedure for setting up a
reproducible build in some readme file, and make it easy to set up. They
won't need to set it up for all builds, just for the PHP build, and since
most use special scripts to build PHP anyway, it shouldn't be too hard
to add.

>> If those old systems run PHP and need uname, changing stuff there is
>> probably harder and more expensive than on other systems. With this in
>> mind, I'd rather not mess with it, especially for a purpose that can
>> easily be achieved without it.
>
> Hmmm, true, but the fallback being the hostname of the machine PHP was
> built on seems a little bit odd, doesn't it?

Yes, but I'd follow the "Chesterton's fence" principle here. Maybe we
could use some ifdefs and configure magic to ensure this is actually not
happening on the kind of systems where reproducible builds are run?
--
Stas Malyshev
smalyshev@gmail.com

Jelle van der Waa
Re: [PHP-DEV] [RFC] Reproducible Builds Support
December 14, 2017 10:10AM
On 12/12/17 at 02:12pm, Levi Morrison wrote:
> On Mon, Dec 11, 2017 at 2:11 PM, Jelle van der Waa wrote:
> > Hi all,
> >
> > Debian, Arch Linux and other distros are trying to get fully
> > reproducible builds. There are some issues in PHP's codebase which make
> > builds unreproducible. Reproducibility is currently tested in Arch Linux
> > by building PHP twice, in two different environments, varying the
> > hostname, system time, etc. [1]
> >
> > One issue is PHP_BUILD_DATE, which makes the build non-reproducible.
> > I've made a PR which uses SOURCE_DATE_EPOCH, which is set in the
> > reproducible build environment. This should keep the current
> > functionality intact, while adding support for reproducible builds. [2]
> > [3]
>
> It looks good to me.
>
> > Another issue is the php_uname() function, which embeds the hostname of
> > the build machine; since the hostname is varied per build, this makes
> > the build non-reproducible. This is caused by the following line:
> >
> > configure.ac:PHP_UNAME=`uname -a | xargs` required in:
> > ext/standard/info.c: php_uname = PHP_UNAME;
> >
> > This is there as a fallback, as the php.net documentation describes:
> >
> > "On some older UNIX platforms, it may not be able to determine the
> > current OS information in which case it will revert to displaying the OS
> > PHP was built on. This will only happen if your uname() library call
> > either doesn't exist or doesn't work."
> >
> > I would argue that this is strange, unexpected behaviour; maybe it
> > should throw an exception instead? Or could it show only "Linux" as the
> > fallback, i.e. basically PHP_OS? Ideas?
>
> I wouldn't throw an exception here. It seems PHP_OS is
> under-documented; maybe PHP_OS_FAMILY is better:

The difference between PHP_OS and PHP_OS_FAMILY is strange indeed. I'll
have to do some further digging.

>
> > The operating system family PHP was built for. Either of 'Windows', 'BSD', 'Darwin', 'Solaris', 'Linux' or 'Unknown'. Available as of PHP 7.2.0.
>
> However, I really don't think we should change this for already
> released PHP versions. We should ask our maintainers how they feel
> about changing it in an x.y.NEXT patch. My inclination is to do this for
> PHP 7.3 and beyond and accept that official PHP sources of earlier
> versions will not produce reproducible builds.

Indeed, as an Arch Linux developer I'm fine with these changes landing
in the next release, with no backporting.

> > The last issue is that phar.phar is non-reproducible, and I am not sure
> > what the cause is. I'm not sure how the binary data in phar.phar is
> > generated.
>
> Phars are like `tars` that are also valid PHP files. This means there
> are probably modification times, etc, set in there. Not sure what else
> would need to be changed.

Thanks for the information, I'll see if I can do some more digging.

--
Jelle van der Waa

Jordi Boggiano
Re: [PHP-DEV] [RFC] Reproducible Builds Support
December 15, 2017 11:20AM
On 2017-12-14 10:02 AM, Jelle van der Waa wrote:
>>> The last issue is that phar.phar is non-reproducible, and I am not sure
>>> what the cause is. I'm not sure how the binary data in phar.phar is
>>> generated.
>>
>> Phars are like `tars` that are also valid PHP files. This means there
>> are probably modification times, etc, set in there. Not sure what else
>> would need to be changed.
>
> Thanks for the information, I'll see if I can do some more digging.

I have had similar issues with Phar files when I tried to make Composer
builds reproducible. The cause is that the Phar extension uses the
current Unix timestamp as the filemtime for all files in the table of
contents (at least when using addFromString), so on every build the TOC
is different, and hence so is the signature at the end.
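
The same effect can be seen with plain tar, which is only an analogy
here and not the Phar format: once every entry's mtime is pinned to a
fixed value, the archive checksum stops depending on when the files were
written. The helper name pinned_archive_sum is hypothetical, and the
flags assume GNU tar (1.28 or later for --sort) plus sha256sum.

```shell
# Hypothetical helper: checksum of a tar archive of directory $1 with
# every entry's mtime pinned to $2 (seconds since the epoch).
# Assumes GNU tar and sha256sum from coreutils.
pinned_archive_sum() {
    # --sort=name fixes entry order; --owner/--group drop builder identity.
    tar --sort=name --mtime="@$2" --owner=0 --group=0 --numeric-owner \
        -cf - -C "$1" . | sha256sum | cut -d' ' -f1
}
```

Two runs over the same content with the same pinned value give the same
checksum even if the files were touched in between; changing the pinned
value changes the checksum.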

I built a tool to fix this which just overwrites the TOC timestamps with
whatever you want and then updates the signature. If it helps, you can
find it here:

https://github.com/Seldaek/phar-utils

Example usage in Composer:

https://github.com/composer/composer/blob/84f5a1a7e8293978a718663dfac399e83f093e9e/src/Composer/Compiler.php#L161-L164

I guess an alternative fix would be for someone to actually fix the Phar
extension so addFromString has a filemtime parameter you can pass the
desired mtime to. I have not checked whether addFile suffers from the
same issue or not, but possibly it needs to be fixed to read the mtime
from the file you add.

Best,
Jordi

--
Jordi Boggiano
@seldaek - http://seld.be

Sebastian Bergmann
Re: [PHP-DEV] [RFC] Reproducible Builds Support
December 15, 2017 12:00PM
Am 15.12.2017 um 11:13 schrieb Jordi Boggiano:
> I guess an alternative fix would be for someone to actually fix the Phar
> extension so addFromString has a filemtime parameter you can pass the
> desired mtime to. I have not checked whether addFile suffers from the same
> issue or not, but possibly it needs to be fixed to read the mtime from the
> file you add.

+1

Jelle van der Waa
Re: [PHP-DEV] [RFC] Reproducible Builds Support
December 24, 2017 12:30PM
On 12/15/17 at 11:54am, Sebastian Bergmann wrote:
> Am 15.12.2017 um 11:13 schrieb Jordi Boggiano:
> > I guess an alternative fix would be for someone to actually fix the Phar
> > extension so addFromString has a filemtime parameter you can pass the
> > desired mtime to. I have not checked whether addFile suffers from the same
> > issue or not, but possibly it needs to be fixed to read the mtime from the
> > file you add.
>
> +1

I'm not sure timestamps are the issue; the created phar.phar binary is
non-reproducible, as can be seen in this diff. I'll do some more
digging :)

https://tests.reproducible-builds.org/archlinux/extra/php/php-7.2.0-2-x86_64.pkg.tar.xz.html

--
Jelle van der Waa
