Welcome! Log In Create A New Profile

Advanced

[PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)

Posted by Yasuo Ohgaki 
Yasuo Ohgaki
[PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
December 31, 2016 12:30AM
Hi all,

I thought we must fix due to proposed PHPMailer bug fix patch. (See below
for detail) Previous discussion went wrong because of compatibility
misunderstanding. There is _no_ additional BC issue. Please keep in mind
this.

This is simple change proposal replacing weak entropy to string one.


Background:

uniqid() returns timestamp or timestamp+random value which is based on
current timestamp. It is obvious that current uniqid() has fatal problems,
uniqid() does not produce acceptable unique ID unlike the name implies.
https://php.net/uniqid


What are the fatal problems:

1) uniqid() (w/o more_entropy option) returns timestamp simply, tries
to guarantee uniqueness by sleeping short amount of time for a
process. This behavior may create duplicates among processes and by
system clock adjustment. i.e. NTP or like. This issue is not addressed
by proposed patch.

2) uniqid('', TRUE) (w/ more_entropy option) should return acceptable
unique ID, but it's not. This is due to more_entropy is generated by
timestamp, i.e. combined_lcg(), and issue like 1) exists. Proposed
patch fixes this issue.

Use of combined LCG for uniqid() entropy does not make sense now.


Solution:

Current PHP has php_random_int/bytes() which are much better
entropy than combined_lcg(). php_random_int/bytes() use CSPRNG.


Note:

There is no additional compatibility issue at all with
php_random_int/bytes(). Previously merged patch discussion regarding
uniqid() improvement was ruined compatibility FUD. open_basedir
restriction is irrelevant to internal functions. PRNG failure is
unlikely, and the system is unusable anyway if it fails. e.g. random
functions, session, crypt related functions do not work at all.

Although stronger entropy is needed, uniqid('', TRUE) will have
acceptable uniqueness similar to UUID by replacing CSPRNG entropy.
https://en.wikipedia.org/wiki/Universally_unique_identifier

This kind of change is simple enough change to be merged as "Simple
improvement for new release" or even "Simple bug fix for released
versions", IMO. I'm posting this mail to ask comments because of
previous discussion for this issue.

Following patch is basically the same as patch I committed and
reverted before. I suppose nobody insists RFC for this change now.

I'll merge the patch to master (7.2) if there is no comment.


Patch:

$ git diff
diff --git a/ext/standard/uniqid.c b/ext/standard/uniqid.c
index f429e6d..80dacdb 100644
--- a/ext/standard/uniqid.c
+++ b/ext/standard/uniqid.c
@@ -35,7 +35,7 @@
#include <sys/time.h>
#endif

-#include "php_lcg.h"
+#include "php_random.h"
#include "uniqid.h"

/* {{{ proto string uniqid([string prefix [, bool more_entropy]])
@@ -77,7 +77,9 @@ PHP_FUNCTION(uniqid)
* digits for usecs.
*/
if (more_entropy) {
- uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec,
usec, php_combined_lcg() * 10);
+ zend_long rand;
+ php_random_int(1000000000, 9999999999, &rand, 1);
+ uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec,
usec, (double)rand/10000000000);
} else {
uniqid = strpprintf(0, "%s%08x%05x", prefix, sec, usec);
}



Proposed patch for PHPMailer command injection issue:

I found following code(patch) for PHPMailer security issue.
https://core.trac.wordpress.org/attachment/ticket/37210/0001
-Upgrade-PHPMailer-from-5.2.14-to-5.2.19.patch

2086 * Create unique ID
2087 * @return string
2088 */
2089 protected function generateId() {
2090 return md5(uniqid(time()));
2024 2091 }
2025 2092
2026 2093 /**
……class PHPMailer
2034 2101 {
2035 2102 $body = '';
2036 2103 //Create unique IDs and preset boundaries
2037 $this->uniqueid = md5(uniqid(time()));
2104 $this->uniqueid = $this->generateId();
2038 2105 $this->boundary[1] = 'b1_' . $this->uniqueid;
2039 2106 $this->boundary[2] = 'b2_' . $this->uniqueid;
2040 2107 $this->boundary[3] = 'b3_' . $this->uniqueid;

Although I never recommend such code, the ID is good enough for this
specific usage. I think we should remove the goccha, "uniqid() is not
unique". This code explains why.


Related RFC:

Improve uniqid() uniqueness
https://wiki.php.net/rfc/uniqid

IMHO, we should enable more_entropy by default, with stronger entropy
prefered.


Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
Kazuo Oishi
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 01, 2017 07:10PM
Hi,

> I'll merge the patch to master (7.2) if there is no comment.
>
>
> Patch:
>
> $ git diff
> diff --git a/ext/standard/uniqid.c b/ext/standard/uniqid.c
> index f429e6d..80dacdb 100644
> --- a/ext/standard/uniqid.c
> +++ b/ext/standard/uniqid.c
> @@ -35,7 +35,7 @@
> #include <sys/time.h>
> #endif
>
> -#include "php_lcg.h"
> +#include "php_random.h"
> #include "uniqid.h"
>
> /* {{{ proto string uniqid([string prefix [, bool more_entropy]])
> @@ -77,7 +77,9 @@ PHP_FUNCTION(uniqid)
> * digits for usecs.
> */
> if (more_entropy) {
> - uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec,
> usec, php_combined_lcg() * 10);
> + zend_long rand;
> + php_random_int(1000000000, 9999999999, &rand, 1);
> + uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec,
> usec, (double)rand/10000000000);
> } else {
> uniqid = strpprintf(0, "%s%08x%05x", prefix, sec, usec);
> }
>

I think it would be

php_random_int(0, 9999999999, &rand, 1);
uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec, usec, (double)rand/1000000000);

[Please note that "(double)rand/10000000000" is changed to
"(double)rand/1000000000".]


I don't oppose this patch, but I cannot understand your argument.

> I thought we must fix due to proposed PHPMailer bug fix patch. (See below
> for detail) Previous discussion went wrong because of compatibility
> misunderstanding. There is _no_ additional BC issue. Please keep in mind
> this.
....
> Proposed patch for PHPMailer command injection issue:
>
> I found following code(patch) for PHPMailer security issue.
> https://core.trac.wordpress.org/attachment/ticket/37210/0001
> -Upgrade-PHPMailer-from-5.2.14-to-5.2.19.patch
>
> 2086 * Create unique ID
> 2087 * @return string
> 2088 */
> 2089 protected function generateId() {
> 2090 return md5(uniqid(time()));
> 2024 2091 }
> 2025 2092
> 2026 2093 /**
> ……class PHPMailer
> 2034 2101 {
> 2035 2102 $body = '';
> 2036 2103 //Create unique IDs and preset boundaries
> 2037 $this->uniqueid = md5(uniqid(time()));
> 2104 $this->uniqueid = $this->generateId();
> 2038 2105 $this->boundary[1] = 'b1_' . $this->uniqueid;
> 2039 2106 $this->boundary[2] = 'b2_' . $this->uniqueid;
> 2040 2107 $this->boundary[3] = 'b3_' . $this->uniqueid;
>
> Although I never recommend such code, the ID is good enough for this
> specific usage. I think we should remove the goccha, "uniqid() is not
> unique". This code explains why.

Obviously, this is not related to your patch. "we must fix due to
proposed PHPMailer bug fix patch" is "FUD". Behavior of uniqid without
$more_entropy=TRUE is not changed.

What's your intention?


BTW, I agree that uniqid() is good enough to use as MIME boundary key, if
collision to body content is checked. But non-timestamp-based, simple
random value generator is more useful for this purpose.


> IMHO, we should enable more_entropy by default, with stronger entropy
> prefered.

This is BC-break, and I don't think such change is wanted.

(IMHO, what is wanted is new simple random string generator that uses
only specified characters.)

--
Kazuo Oishi

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Lauri Kenttä
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 03, 2017 09:00PM
On 2016-12-31 01:20, Yasuo Ohgaki wrote:
> + zend_long rand;
> + php_random_int(1000000000, 9999999999, &rand, 1);
> + uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec,
> usec, (double)rand/10000000000);

Your code is broken. It produces 0.10000000 - 0.99999999 when it should
produce 0.00000000 - 9.99999999. Also, you have integer overflow on
32-bit systems.

Why do you mess with oversized integers and doubles and at all? It would
be cleaner and simpler to use just regular 32-bit integers like this:

+ zend_long rand;
+ php_random_int(0, 999999999, &rand, 1);
+ uniqid = strpprintf(0, "%s%08x%05x%d.%08d", prefix, sec,
usec, rand % 10, rand / 10);

Also, your argument about PHPMailer has nothing to do with your main
complaint about lcg_value, since collisions of lcg_value are not the
problem there.

Why don't you put your effort into a more useful solution such as
random_string or something?
random_string(PHP_STRING_HEX_LOWER, 32) would produce md5-style output.
random_string(PHP_STRING_BASE64, 32) would produce a lot more entropy.
random_string("my_charset", 20) would cover the general case.
random_array([1,2,3], 20) could extend this to arbitrary arrays.

--
Lauri Kenttä

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Yasuo Ohgaki
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 07, 2017 02:00AM
Hi Kazuo,

On Mon, Jan 2, 2017 at 3:03 AM, Kazuo Oishi <[email protected]> wrote:

> > I thought we must fix due to proposed PHPMailer bug fix patch. (See below
> > for detail) Previous discussion went wrong because of compatibility
> > misunderstanding. There is _no_ additional BC issue. Please keep in mind
> > this.
> ...
> > Proposed patch for PHPMailer command injection issue:
> >
> > I found following code(patch) for PHPMailer security issue.
> > https://core.trac.wordpress.org/attachment/ticket/37210/0001
> > -Upgrade-PHPMailer-from-5.2.14-to-5.2.19.patch
> >
> > 2086 * Create unique ID
> > 2087 * @return string
> > 2088 */
> > 2089 protected function generateId() {
> > 2090 return md5(uniqid(time()));
> > 2024 2091 }
> > 2025 2092
> > 2026 2093 /**
> > ……class PHPMailer
> > 2034 2101 {
> > 2035 2102 $body = '';
> > 2036 2103 //Create unique IDs and preset boundaries
> > 2037 $this->uniqueid = md5(uniqid(time()));
> > 2104 $this->uniqueid = $this->generateId();
> > 2038 2105 $this->boundary[1] = 'b1_' . $this->uniqueid;
> > 2039 2106 $this->boundary[2] = 'b2_' . $this->uniqueid;
> > 2040 2107 $this->boundary[3] = 'b3_' . $this->uniqueid;
> >
> > Although I never recommend such code, the ID is good enough for this
> > specific usage. I think we should remove the goccha, "uniqid() is not
> > unique". This code explains why.
>
> Obviously, this is not related to your patch. "we must fix due to
> proposed PHPMailer bug fix patch" is "FUD". Behavior of uniqid without
> $more_entropy=TRUE is not changed.
>

You misunderstand the mail.
PHPMailer and uniqid() fix is unrelated, but uniqid() is misused proposed
patch in obvious way.


>
> What's your intention?
>

The point we should learn from the code is, it is clear that users
misunderstand how uniqid() works. You'll find number of such usages if you
search net.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
Yasuo Ohgaki
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 07, 2017 02:20AM
Hi Lauri,

On Wed, Jan 4, 2017 at 4:56 AM, Lauri Kenttä <[email protected]> wrote:

> On 2016-12-31 01:20, Yasuo Ohgaki wrote:
>
>> + zend_long rand;
>> + php_random_int(1000000000, 9999999999, &rand, 1);
>> + uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec,
>> usec, (double)rand/10000000000);
>>
>
> Your code is broken. It produces 0.10000000 - 0.99999999 when it should
> produce 0.00000000 - 9.99999999. Also, you have integer overflow on 32-bit
> systems.
>
> Why do you mess with oversized integers and doubles and at all? It would
> be cleaner and simpler to use just regular 32-bit integers like this:
>
> + zend_long rand;
> + php_random_int(0, 999999999, &rand, 1);
> + uniqid = strpprintf(0, "%s%08x%05x%d.%08d", prefix, sec,
> usec, rand % 10, rand / 10);
>
> Also, your argument about PHPMailer has nothing to do with your main
> complaint about lcg_value, since collisions of lcg_value are not the
> problem there.
>

+ php_random_int(1000000000, 9999999999, &rand, 1);

This should be

+ php_random_int(0, 9999999999, &rand, 1);

Anyway, I forgot about 32 bit systems. Thank you for catching this.

My original patch is better, then,

[[email protected] PHP-master]$ git show 48f1a17886d874dc90867c669481804de90
commit 48f1a17886d874dc90867c669481804de90509e8
Author: Yasuo Ohgaki <[email protected]>
Date: Tue Oct 18 09:04:57 2016 +0900

Fix bug #47890 #73215 uniqid() should use better random source

diff --git a/ext/standard/uniqid.c b/ext/standard/uniqid.c
index f429e6d..207cf01 100644
--- a/ext/standard/uniqid.c
+++ b/ext/standard/uniqid.c
@@ -35,9 +35,11 @@
#include <sys/time.h>
#endif

-#include "php_lcg.h"
+#include "php_random.h"
#include "uniqid.h"

+#define PHP_UNIQID_ENTROPY_LEN 10
+
/* {{{ proto string uniqid([string prefix [, bool more_entropy]])
Generates a unique ID */
#ifdef HAVE_GETTIMEOFDAY
@@ -77,7 +79,22 @@ PHP_FUNCTION(uniqid)
* digits for usecs.
*/
if (more_entropy) {
- uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec, usec,
php_combined_lcg() * 10);
+ int i;
+ unsigned char c, entropy[PHP_UNIQID_ENTROPY_LEN+1];
+
+ for(i = 0; i < PHP_UNIQID_ENTROPY_LEN;) {
+ php_random_bytes_throw(&c, sizeof(c));
+ /* Avoid modulo bias */
+ if (c > 249) {
+ continue;
+ }
+ entropy = c % 10 + '0';
+ i++;
+ }
+ /* Set . for compatibility */
+ entropy[1] = '.';
+ entropy[PHP_UNIQID_ENTROPY_LEN] = '\0';
+ uniqid = strpprintf(0, "%s%08x%05x%s", prefix, sec, usec,
entropy);
} else {
uniqid = strpprintf(0, "%s%08x%05x", prefix, sec, usec);
}


I'll revert the revert commit to master.


> Why don't you put your effort into a more useful solution such as
> random_string or something?
> random_string(PHP_STRING_HEX_LOWER, 32) would produce md5-style output.
> random_string(PHP_STRING_BASE64, 32) would produce a lot more entropy.
> random_string("my_charset", 20) would cover the general case.
> random_array([1,2,3], 20) could extend this to arbitrary arrays.
>

They are useful. I would like to have such functions, too. However, they
don't fix uniqid() problem. i.e. They are out of scope of uniqid()
improvement. As I mentioned, we should get rid of useless goccha like
uniqid() is merely a system timestamp. i.e. more_entropy option should be
enabled by default. Additionally, we may provide stronger entropy such as
128 bits random value.

Do you think uniqid() is good function that we should keep as it is now,
forever? I guess you do not. Why not improve it

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
Yasuo Ohgaki
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 07, 2017 02:30AM
Hi Kazuo,

On Sat, Jan 7, 2017 at 9:54 AM, Yasuo Ohgaki <[email protected]> wrote:

> You misunderstand the mail.
> PHPMailer and uniqid() fix is unrelated, but uniqid() is misused proposed
> patch in obvious way.
>
>
>>
>> What's your intention?
>>
>
> The point we should learn from the code is, it is clear that users
> misunderstand how uniqid() works. You'll find number of such usages if you
> search net.


There is uniqid() improvement RFC
https://wiki.php.net/rfc/uniqid

The proposed patch for PHPmailer proves we should improve 'more_entropy'
option and enable it by default. That's my point. Enabling 'more_entropy'
option by default will be handled by RFC process, but we can simply improve
randomness of 'more_entropy' w/o BC for now. This is what I proposed.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
Niklas Keller
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 07, 2017 08:10PM
2017-01-07 2:15 GMT+01:00 Yasuo Ohgaki <[email protected]>:

> Hi Lauri,
>
> On Wed, Jan 4, 2017 at 4:56 AM, Lauri Kenttä <[email protected]>
> wrote:
>
> > On 2016-12-31 01:20, Yasuo Ohgaki wrote:
> >
> >> + zend_long rand;
> >> + php_random_int(1000000000, 9999999999, &rand, 1);
> >> + uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec,
> >> usec, (double)rand/10000000000);
> >>
> >
> > Your code is broken. It produces 0.10000000 - 0.99999999 when it should
> > produce 0.00000000 - 9.99999999. Also, you have integer overflow on
> 32-bit
> > systems.
> >
> > Why do you mess with oversized integers and doubles and at all? It would
> > be cleaner and simpler to use just regular 32-bit integers like this:
> >
> > + zend_long rand;
> > + php_random_int(0, 999999999, &rand, 1);
> > + uniqid = strpprintf(0, "%s%08x%05x%d.%08d", prefix, sec,
> > usec, rand % 10, rand / 10);
> >
> > Also, your argument about PHPMailer has nothing to do with your main
> > complaint about lcg_value, since collisions of lcg_value are not the
> > problem there.
> >
>
> + php_random_int(1000000000, 9999999999, &rand, 1);
>
> This should be
>
> + php_random_int(0, 9999999999, &rand, 1);
>
> Anyway, I forgot about 32 bit systems. Thank you for catching this.
>
> My original patch is better, then,
>
> [[email protected] PHP-master]$ git show 48f1a17886d874dc90867c669481804de90
> commit 48f1a17886d874dc90867c669481804de90509e8
> Author: Yasuo Ohgaki <[email protected]>
> Date: Tue Oct 18 09:04:57 2016 +0900
>
> Fix bug #47890 #73215 uniqid() should use better random source
>
> diff --git a/ext/standard/uniqid.c b/ext/standard/uniqid.c
> index f429e6d..207cf01 100644
> --- a/ext/standard/uniqid.c
> +++ b/ext/standard/uniqid.c
> @@ -35,9 +35,11 @@
> #include <sys/time.h>
> #endif
>
> -#include "php_lcg.h"
> +#include "php_random.h"
> #include "uniqid.h"
>
> +#define PHP_UNIQID_ENTROPY_LEN 10
> +
> /* {{{ proto string uniqid([string prefix [, bool more_entropy]])
> Generates a unique ID */
> #ifdef HAVE_GETTIMEOFDAY
> @@ -77,7 +79,22 @@ PHP_FUNCTION(uniqid)
> * digits for usecs.
> */
> if (more_entropy) {
> - uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec, usec,
> php_combined_lcg() * 10);
> + int i;
> + unsigned char c, entropy[PHP_UNIQID_ENTROPY_LEN+1];
> +
> + for(i = 0; i < PHP_UNIQID_ENTROPY_LEN;) {
> + php_random_bytes_throw(&c, sizeof(c));
> + /* Avoid modulo bias */
> + if (c > 249) {
> + continue;
> + }
> + entropy = c % 10 + '0';
> + i++;
> + }
> + /* Set . for compatibility */
> + entropy[1] = '.';
> + entropy[PHP_UNIQID_ENTROPY_LEN] = '\0';
> + uniqid = strpprintf(0, "%s%08x%05x%s", prefix, sec, usec,
> entropy);
> } else {
> uniqid = strpprintf(0, "%s%08x%05x", prefix, sec, usec);
> }
>
>
> I'll revert the revert commit to master.
>
>
> > Why don't you put your effort into a more useful solution such as
> > random_string or something?
> > random_string(PHP_STRING_HEX_LOWER, 32) would produce md5-style output.
> > random_string(PHP_STRING_BASE64, 32) would produce a lot more entropy.
> > random_string("my_charset", 20) would cover the general case.
> > random_array([1,2,3], 20) could extend this to arbitrary arrays.
> >
>
> They are useful. I would like to have such functions, too. However, they
> don't fix uniqid() problem. i.e. They are out of scope of uniqid()
> improvement. As I mentioned, we should get rid of useless goccha like
> uniqid() is merely a system timestamp. i.e. more_entropy option should be
> enabled by default. Additionally, we may provide stronger entropy such as
> 128 bits random value.
>
> Do you think uniqid() is good function that we should keep as it is now,
> forever? I guess you do not. Why not improve it


Because it's a BC break. A new function with the deprecation of the old one
is way easier for developers to handle. They don't need version checks.
Maybe it will work on older versions, but will behave differently.

Regards, Niklas
Yasuo Ohgaki
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 07, 2017 09:30PM
Hi Niklas,

On Sun, Jan 8, 2017 at 4:08 AM, Niklas Keller <[email protected]> wrote:

> 2017-01-07 2:15 GMT+01:00 Yasuo Ohgaki <[email protected]>:
>
>> Hi Lauri,
>>
>> On Wed, Jan 4, 2017 at 4:56 AM, Lauri Kenttä <[email protected]>
>> wrote:
>>
>> > On 2016-12-31 01:20, Yasuo Ohgaki wrote:
>> >
>> >> + zend_long rand;
>> >> + php_random_int(1000000000, 9999999999, &rand, 1);
>> >> + uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec,
>> >> usec, (double)rand/10000000000);
>> >>
>> >
>> > Your code is broken. It produces 0.10000000 - 0.99999999 when it should
>> > produce 0.00000000 - 9.99999999. Also, you have integer overflow on
>> 32-bit
>> > systems.
>> >
>> > Why do you mess with oversized integers and doubles and at all? It would
>> > be cleaner and simpler to use just regular 32-bit integers like this:
>> >
>> > + zend_long rand;
>> > + php_random_int(0, 999999999, &rand, 1);
>> > + uniqid = strpprintf(0, "%s%08x%05x%d.%08d", prefix, sec,
>> > usec, rand % 10, rand / 10);
>> >
>> > Also, your argument about PHPMailer has nothing to do with your main
>> > complaint about lcg_value, since collisions of lcg_value are not the
>> > problem there.
>> >
>>
>> + php_random_int(1000000000, 9999999999, &rand, 1);
>>
>> This should be
>>
>> + php_random_int(0, 9999999999, &rand, 1);
>>
>> Anyway, I forgot about 32 bit systems. Thank you for catching this.
>>
>> My original patch is better, then,
>>
>> [[email protected] PHP-master]$ git show 48f1a17886d874dc90867c669481804de90
>> commit 48f1a17886d874dc90867c669481804de90509e8
>> Author: Yasuo Ohgaki <[email protected]>
>> Date: Tue Oct 18 09:04:57 2016 +0900
>>
>> Fix bug #47890 #73215 uniqid() should use better random source
>>
>> diff --git a/ext/standard/uniqid.c b/ext/standard/uniqid.c
>> index f429e6d..207cf01 100644
>> --- a/ext/standard/uniqid.c
>> +++ b/ext/standard/uniqid.c
>> @@ -35,9 +35,11 @@
>> #include <sys/time.h>
>> #endif
>>
>> -#include "php_lcg.h"
>> +#include "php_random.h"
>> #include "uniqid.h"
>>
>> +#define PHP_UNIQID_ENTROPY_LEN 10
>> +
>> /* {{{ proto string uniqid([string prefix [, bool more_entropy]])
>> Generates a unique ID */
>> #ifdef HAVE_GETTIMEOFDAY
>> @@ -77,7 +79,22 @@ PHP_FUNCTION(uniqid)
>> * digits for usecs.
>> */
>> if (more_entropy) {
>> - uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec,
>> usec,
>> php_combined_lcg() * 10);
>> + int i;
>> + unsigned char c, entropy[PHP_UNIQID_ENTROPY_LEN+1];
>> +
>> + for(i = 0; i < PHP_UNIQID_ENTROPY_LEN;) {
>> + php_random_bytes_throw(&c, sizeof(c));
>> + /* Avoid modulo bias */
>> + if (c > 249) {
>> + continue;
>> + }
>> + entropy = c % 10 + '0';
>> + i++;
>> + }
>> + /* Set . for compatibility */
>> + entropy[1] = '.';
>> + entropy[PHP_UNIQID_ENTROPY_LEN] = '\0';
>> + uniqid = strpprintf(0, "%s%08x%05x%s", prefix, sec, usec,
>> entropy);
>> } else {
>> uniqid = strpprintf(0, "%s%08x%05x", prefix, sec, usec);
>> }
>>
>>
>> I'll revert the revert commit to master.
>>
>>
>> > Why don't you put your effort into a more useful solution such as
>> > random_string or something?
>> > random_string(PHP_STRING_HEX_LOWER, 32) would produce md5-style output..
>> > random_string(PHP_STRING_BASE64, 32) would produce a lot more entropy.
>> > random_string("my_charset", 20) would cover the general case.
>> > random_array([1,2,3], 20) could extend this to arbitrary arrays.
>> >
>>
>> They are useful. I would like to have such functions, too. However, they
>> don't fix uniqid() problem. i.e. They are out of scope of uniqid()
>> improvement. As I mentioned, we should get rid of useless goccha like
>> uniqid() is merely a system timestamp. i.e. more_entropy option should be
>> enabled by default. Additionally, we may provide stronger entropy such as
>> 128 bits random value.
>>
>> Do you think uniqid() is good function that we should keep as it is now,
>> forever? I guess you do not. Why not improve it
>
>
> Because it's a BC break. A new function with the deprecation of the old
> one is way easier for developers to handle. They don't need version checks.
> Maybe it will work on older versions, but will behave differently.
>

Recent rand() BC is more severe. i.e. rand() is predictable PRNG and
backend was replaced to MT rand by 7.1.

BC is important, but I couldn't find code that will be broken by
"more_entropy" by default. uniqid() is popular function yet misused a lot.
I don't think this will change because removing uniqid() is more severe BC
and RFC for it wouldn't pass. We knew slight change could break something,
but question is "Does it worth it". Wordpress's PHPMailer patch explains it
worths.

Anyway, are there any objections for revert the revert patch that uses
CSPRNG for uniqid() entropy?

48f1a17886d874dc90867c669481804de90509e8

I'll wait few days more.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
Lauri Kenttä
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 08, 2017 06:40PM
On 2017-01-07 03:15, Yasuo Ohgaki wrote:
> + php_random_int(1000000000, 9999999999, &rand, 1);
>
> This should be
>
> + php_random_int(0, 9999999999, &rand, 1);

No, it shouldn't. That fixes none of the reported problems. You still
have too many numbers (integer overflow) and still produce 0.abcdefgh
instead of a.bcdefghi.

If you can't fix it, maybe you shouldn't be doing it in the first
place...

--
Lauri Kenttä

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Kazuo Oishi
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 08, 2017 07:20PM
Hi,

>>> + zend_long rand;
>>> + php_random_int(1000000000, 9999999999, &rand, 1);
>>> + uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec,
>>> usec, (double)rand/10000000000);
>>
>> Your code is broken. It produces 0.10000000 - 0.99999999 when it should
>> produce 0.00000000 - 9.99999999. Also, you have integer overflow on 32-bit
>> systems.
>>
>> Why do you mess with oversized integers and doubles and at all? It would
>> be cleaner and simpler to use just regular 32-bit integers like this:
>>
>> + zend_long rand;
>> + php_random_int(0, 999999999, &rand, 1);
>> + uniqid = strpprintf(0, "%s%08x%05x%d.%08d", prefix, sec,
>> usec, rand % 10, rand / 10);
....
> My original patch is better, then,
>
> [[email protected] PHP-master]$ git show 48f1a17886d874dc90867c669481804de90
> commit 48f1a17886d874dc90867c669481804de90509e8
> Author: Yasuo Ohgaki <[email protected]>
> Date: Tue Oct 18 09:04:57 2016 +0900
>
> Fix bug #47890 #73215 uniqid() should use better random source
>
> diff --git a/ext/standard/uniqid.c b/ext/standard/uniqid.c
> index f429e6d..207cf01 100644
> --- a/ext/standard/uniqid.c
> +++ b/ext/standard/uniqid.c
> @@ -35,9 +35,11 @@
> #include <sys/time.h>
> #endif
>
> -#include "php_lcg.h"
> +#include "php_random.h"
> #include "uniqid.h"
>
> +#define PHP_UNIQID_ENTROPY_LEN 10
> +
> /* {{{ proto string uniqid([string prefix [, bool more_entropy]])
> Generates a unique ID */
> #ifdef HAVE_GETTIMEOFDAY
> @@ -77,7 +79,22 @@ PHP_FUNCTION(uniqid)
> * digits for usecs.
> */
> if (more_entropy) {
> - uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec, usec,
> php_combined_lcg() * 10);
> + int i;
> + unsigned char c, entropy[PHP_UNIQID_ENTROPY_LEN+1];
> +
> + for(i = 0; i < PHP_UNIQID_ENTROPY_LEN;) {
> + php_random_bytes_throw(&c, sizeof(c));
> + /* Avoid modulo bias */
> + if (c > 249) {
> + continue;
> + }
> + entropy = c % 10 + '0';
> + i++;
> + }
> + /* Set . for compatibility */
> + entropy[1] = '.';
> + entropy[PHP_UNIQID_ENTROPY_LEN] = '\0';
> + uniqid = strpprintf(0, "%s%08x%05x%s", prefix, sec, usec,
> entropy);
> } else {
> uniqid = strpprintf(0, "%s%08x%05x", prefix, sec, usec);
> }
>
>
> I'll revert the revert commit to master.

No. Lauri's version is better.

Your php_random_bytes_throw() version is significantly slow. Lauri's
version is faster and cleaner.

[original uniqid() using php_combined_lcg]
$ time ./php_uniqid_orig -r 'for($i=0; $i<1000000;$i++) uniqid("",true);'
real 0m0.366s
user 0m0.350s
sys 0m0.010s

[your php_random_bytes_throw version (commit 48f1a17886d874dc90867c669481804de90509e8)]
$ time ./php_uniqid_yohgaki -r 'for($i=0; $i<1000000;$i++) uniqid("",true);'
real 0m4.509s
user 0m0.430s
sys 0m4.070s

[Lauri's php_random_int version]
$ time ./php_uniqid_lauri -r 'for($i=0; $i<1000000;$i++) uniqid("",true);'
real 0m0.664s
user 0m0.260s
sys 0m0.400s


--
Kazuo Oishi

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Yasuo Ohgaki
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 08, 2017 09:10PM
On Mon, Jan 9, 2017 at 2:29 AM, Lauri Kenttä <[email protected]> wrote:

> On 2017-01-07 03:15, Yasuo Ohgaki wrote:
>
>> + php_random_int(1000000000, 9999999999, &rand, 1);
>>
>> This should be
>>
>> + php_random_int(0, 9999999999, &rand, 1);
>>
>
> No, it shouldn't. That fixes none of the reported problems. You still have
> too many numbers (integer overflow) and still produce 0.abcdefgh instead of
> a.bcdefghi.
>
> If you can't fix it, maybe you shouldn't be doing it in the first place....


Did you read my mail?
Please read mail again.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
Yasuo Ohgaki
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 08, 2017 09:40PM
Hi Kazuo,

On Mon, Jan 9, 2017 at 3:15 AM, Kazuo Oishi <[email protected]> wrote:

> No. Lauri's version is better.
>
> Your php_random_bytes_throw() version is significantly slow. Lauri's
> version is faster and cleaner.
>
> [original uniqid() using php_combined_lcg]
> $ time ./php_uniqid_orig -r 'for($i=0; $i<1000000;$i++) uniqid("",true);'
> real 0m0.366s
> user 0m0.350s
> sys 0m0.010s
>
> [your php_random_bytes_throw version (commit 48f1a17886d874dc90867c66948180
> 4de90509e8)]
> $ time ./php_uniqid_yohgaki -r 'for($i=0; $i<1000000;$i++)
> uniqid("",true);'
> real 0m4.509s
> user 0m0.430s
> sys 0m4.070s
>
> [Lauri's php_random_int version]
> $ time ./php_uniqid_lauri -r 'for($i=0; $i<1000000;$i++) uniqid("",true);'
> real 0m0.664s
> user 0m0.260s
> sys 0m0.400s
>

Interesting result. AFAIK, I didn't get significant difference when I made
the patch.
What is your system? It seems your PRNG is significantly slow.

BTW, my patch is extension in mind. i.e. Use of non numeric chars

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
Yasuo Ohgaki
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 08, 2017 09:40PM
On Mon, Jan 9, 2017 at 5:31 AM, Yasuo Ohgaki <[email protected]> wrote:

> On Mon, Jan 9, 2017 at 3:15 AM, Kazuo Oishi <[email protected]> wrote:
>
>> No. Lauri's version is better.
>>
>> Your php_random_bytes_throw() version is significantly slow. Lauri's
>> version is faster and cleaner.
>>
>> [original uniqid() using php_combined_lcg]
>> $ time ./php_uniqid_orig -r 'for($i=0; $i<1000000;$i++) uniqid("",true);'
>> real 0m0.366s
>> user 0m0.350s
>> sys 0m0.010s
>>
>> [your php_random_bytes_throw version (commit
>> 48f1a17886d874dc90867c669481804de90509e8)]
>> $ time ./php_uniqid_yohgaki -r 'for($i=0; $i<1000000;$i++)
>> uniqid("",true);'
>> real 0m4.509s
>> user 0m0.430s
>> sys 0m4.070s
>>
>> [Lauri's php_random_int version]
>> $ time ./php_uniqid_lauri -r 'for($i=0; $i<1000000;$i++) uniqid("",true);'
>> real 0m0.664s
>> user 0m0.260s
>> sys 0m0.400s
>>
>
> Interesting result. AFAIK, I didn't get significant difference when I made
> the patch.
> What is your system? It seems your PRNG is significantly slow.
>

The performance will be improved by reducing multiple PRNG calls to 1.
I'll modify patch later, could you test it with your system?

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
Yasuo Ohgaki
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 08, 2017 10:20PM
On Mon, Jan 9, 2017 at 5:07 AM, Yasuo Ohgaki <[email protected]> wrote:

> On Mon, Jan 9, 2017 at 2:29 AM, Lauri Kenttä <[email protected]>
> wrote:
>
>> On 2017-01-07 03:15, Yasuo Ohgaki wrote:
>>
>>> + php_random_int(1000000000, 9999999999, &rand, 1);
>>>
>>> This should be
>>>
>>> + php_random_int(0, 9999999999, &rand, 1);
>>>
>>
>> No, it shouldn't. That fixes none of the reported problems. You still
>> have too many numbers (integer overflow) and still produce 0.abcdefgh
>> instead of a.bcdefghi.
>>
>> If you can't fix it, maybe you shouldn't be doing it in the first place....
>
>
> Did you read my mail?
> Please read mail again.
>

Anyway, I agree your way is optimal for 9 digit chars entropy. I don't care
about extending entropy strength, longer length and use of non digits, for
now. Are we OK with the patch Lauri proposed?

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
Kazuo Oishi
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 09, 2017 01:30AM
Hi,

>>> [original uniqid() using php_combined_lcg]
>>> $ time ./php_uniqid_orig -r 'for($i=0; $i<1000000;$i++) uniqid("",true);'
>>> real 0m0.366s
>>> user 0m0.350s
>>> sys 0m0.010s
>>>
>>> [your php_random_bytes_throw version (commit
>>> 48f1a17886d874dc90867c669481804de90509e8)]
>>> $ time ./php_uniqid_yohgaki -r 'for($i=0; $i<1000000;$i++)
>>> uniqid("",true);'
>>> real 0m4.509s
>>> user 0m0.430s
>>> sys 0m4.070s
>>>
>>> [Lauri's php_random_int version]
>>> $ time ./php_uniqid_lauri -r 'for($i=0; $i<1000000;$i++) uniqid("",true);'
>>> real 0m0.664s
>>> user 0m0.260s
>>> sys 0m0.400s
>>
>> Interesting result. AFAIK, I didn't get significant difference when I made
>> the patch.
>> What is your system? It seems your PRNG is significantly slow.

Core i7-5600U 2.60GHz
Linux version 4.8.10, gcc version 4.9.3, gentoo

> The performance will be improved by reducing multiple PRNG calls to 1.
> I'll modify patch later, could you test it with your system?

Sure. But as you said, Lauri's version would be optimal.

--
Kazuo Oishi

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Yasuo Ohgaki
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 09, 2017 07:20AM
Hi Kazuo,

On Mon, Jan 9, 2017 at 9:27 AM, Kazuo Oishi <[email protected]> wrote:

> >>> [original uniqid() using php_combined_lcg]
> >>> $ time ./php_uniqid_orig -r 'for($i=0; $i<1000000;$i++)
> uniqid("",true);'
> >>> real 0m0.366s
> >>> user 0m0.350s
> >>> sys 0m0.010s
> >>>
> >>> [your php_random_bytes_throw version (commit
> >>> 48f1a17886d874dc90867c669481804de90509e8)]
> >>> $ time ./php_uniqid_yohgaki -r 'for($i=0; $i<1000000;$i++)
> >>> uniqid("",true);'
> >>> real 0m4.509s
> >>> user 0m0.430s
> >>> sys 0m4.070s
> >>>
> >>> [Lauri's php_random_int version]
> >>> $ time ./php_uniqid_lauri -r 'for($i=0; $i<1000000;$i++)
> uniqid("",true);'
> >>> real 0m0.664s
> >>> user 0m0.260s
> >>> sys 0m0.400s
> >>
> >> Interesting result. AFAIK, I didn't get significant difference when I
> made
> >> the patch.
> >> What is your system? It seems your PRNG is significantly slow.
>
> Core i7-5600U 2.60GHz
> Linux version 4.8.10, gcc version 4.9.3, gentoo
>

Thanks.
I don't see such difference on my
Fedora 25 Corei7-4770S
gcc version 6.3.1 20161221 (Red Hat 6.3.1-1) (GCC)

I'm curious because I don't see performance issue you have.
I'll send patch next week or so because I'm interested in how modified
patch will perform on your system.


> > The performance will be improved by reducing multiple PRNG calls to 1.
> > I'll modify patch later, could you test it with your system?
>
> Sure. But as you said, Lauri's version would be optimal.


I wrote the same patch right after php_random_int() was implemented and
didn't find any problem. I think I've posted benchmark result in the
previous uniqid() discussion thread. So I checked my patch again and found
it should be

PHP-master]$ git diff
diff --git a/ext/standard/uniqid.c b/ext/standard/uniqid.c
index 22173ae..bbd0e0a 100644
--- a/ext/standard/uniqid.c
+++ b/ext/standard/uniqid.c
@@ -35,7 +35,7 @@
#include <sys/time.h>
#endif

-#include "php_lcg.h"
+#include "php_random.h"
#include "uniqid.h"

/* {{{ proto string uniqid([string prefix [, bool more_entropy]])
@@ -78,7 +78,9 @@ PHP_FUNCTION(uniqid)
* digits for usecs.
*/
if (more_entropy) {
- uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec, usec,
php_combined_lcg() * 10);
+ zend_long rand;
+ php_random_int(0, 999999999, &rand, 1);
+ uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec, usec,
(double)rand/100000000);
} else {
uniqid = strpprintf(0, "%s%08x%05x", prefix, sec, usec);
}

Notice that int values are less than a billion which is inside of signed 32
bit int range. This version is as fast as php_combined_lcg() version on my
system. Both versions executes a million uniqid() calls about 0.36 sec.

$ php -r '$s = microtime(TRUE);for($i=0;$i<1000000;$i++) uniqid("", TRUE);
echo microtime(TRUE) - $s;'
0.36102104187012

So above patch would be the final patch. I don't expect issues but if there
is performace issue on some systems, we may consider Lauri's integer
computation version.

I should have been disturbed by something when I wrote the silly patch.
Sorry for confusions.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
Lauri Kenttä
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 09, 2017 03:10PM
On 2017-01-09 08:08, Yasuo Ohgaki wrote:
> Hi Kazuo,
>
> On Mon, Jan 9, 2017 at 9:27 AM, Kazuo Oishi <[email protected]> wrote:
>
>>>>> [original uniqid() using php_combined_lcg]
>>>>> $ time ./php_uniqid_orig -r 'for($i=0; $i<1000000;$i++)
>> uniqid("",true);'
>>>>> real 0m0.366s
>>>>> user 0m0.350s
>>>>> sys 0m0.010s
>>>>>
>>>>> [your php_random_bytes_throw version (commit
>>>>> 48f1a17886d874dc90867c669481804de90509e8)]
>>>>> $ time ./php_uniqid_yohgaki -r 'for($i=0; $i<1000000;$i++)
>>>>> uniqid("",true);'
>>>>> real 0m4.509s
>>>>> user 0m0.430s
>>>>> sys 0m4.070s
>>>>>
>>>>> [Lauri's php_random_int version]
>>>>> $ time ./php_uniqid_lauri -r 'for($i=0; $i<1000000;$i++)
>> uniqid("",true);'
>>>>> real 0m0.664s
>>>>> user 0m0.260s
>>>>> sys 0m0.400s
>>>>
>>>> Interesting result. AFAIK, I didn't get significant difference
>> when I made
>>>> the patch.
>>>> What is your system? It seems your PRNG is significantly slow.
>>
>> Core i7-5600U 2.60GHz
>> Linux version 4.8.10, gcc version 4.9.3, gentoo
>
> Thanks.
> I don't see such difference on my
> Fedora 25 Corei7-4770S
> gcc version 6.3.1 20161221 (Red Hat 6.3.1-1) (GCC)
>
> I'm curious because I don't see performance issue you have.
> I'll send patch next week or so because I'm interested in how modified
> patch will perform on your system.
>
>>> The performance will be improved by reducing multiple PRNG calls
>> to 1.
>>> I'll modify patch later, could you test it with your system?
>>
>> Sure. But as you said, Lauri's version would be optimal.
>
> I wrote the same patch right after php_random_int() was implemented
> and didn't find any problem. I think I've posted benchmark result in
> the previous uniqid() discussion thread. So I checked my patch again
> and found it should be
>
> PHP-master]$ git diff
> diff --git a/ext/standard/uniqid.c b/ext/standard/uniqid.c
> index 22173ae..bbd0e0a 100644
> --- a/ext/standard/uniqid.c
> +++ b/ext/standard/uniqid.c
> @@ -35,7 +35,7 @@
> #include <sys/time.h>
> #endif
>
> -#include "php_lcg.h"
> +#include "php_random.h"
> #include "uniqid.h"
>
> /* {{{ proto string uniqid([string prefix [, bool more_entropy]])
> @@ -78,7 +78,9 @@ PHP_FUNCTION(uniqid)
> * digits for usecs.
> */
> if (more_entropy) {
> - uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec,
> usec, php_combined_lcg() * 10);
> + zend_long rand;
> + php_random_int(0, 999999999, &rand, 1);
> + uniqid = strpprintf(0, "%s%08x%05x%.8F", prefix, sec,
> usec, (double)rand/100000000);
> } else {
> uniqid = strpprintf(0, "%s%08x%05x", prefix, sec,
> usec);
> }
>
> Notice that int values are less than a billion which is inside of
> signed 32 bit int range. This version is as fast as php_combined_lcg()
> version on my system. Both versions executes a million uniqid() calls
> about 0.36 sec.
>
> $ php -r '$s = microtime(TRUE);for($i=0;$i<1000000;$i++) uniqid("",
> TRUE); echo microtime(TRUE) - $s;'
>
> 0.36102104187012
>
> So above patch would be the final patch. I don't expect issues but if
> there is performace issue on some systems, we may consider Lauri's
> integer computation version.

I can confirm that the integer version is faster (2,6 seconds vs 3,6
seconds for 1000000 uniqid calls). Testing on ARM (Raspberry Pi) gives
similar results. It's not surprising, since converting to float/double
is not free and formatting a floating-point number is also slower than
formatting an integer.

Using floating-point numbers has exactly zero benefits, while now at
least two people have reported that integers are faster. I think it's
also much easier to read and understand integers, and your bugs tell the
same tale. So do you have some actual arguments for your version, or is
this just ”not invented here”?

Also, I must say that I'm neither for nor agains this change in general;
I'm discussing only the implementation.

--
Lauri Kenttä

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Kazuo Oishi
Re: [PHP-DEV] Use decent entropy for uniqid($prefix, TRUE)
January 09, 2017 05:00PM
Lauri Kenttä <[email protected]> writes:
>> signed 32 bit int range. This version is as fast as php_combined_lcg()
>> version on my system. Both versions executes a million uniqid() calls
>> about 0.36 sec.
>>
>> $ php -r '$s = microtime(TRUE);for($i=0;$i<1000000;$i++) uniqid("",
>> TRUE); echo microtime(TRUE) - $s;'
>>
>> 0.36102104187012
>>
>> So above patch would be the final patch. I don't expect issues but if
>> there is performace issue on some systems, we may consider Lauri's
>> integer computation version.
>
> I can confirm that the integer version is faster (2,6 seconds vs 3,6 seconds for
> 1000000 uniqid calls). Testing on ARM (Raspberry Pi) gives similar results. It's
> not surprising, since converting to float/double is not free and formatting a
> floating-point number is also slower than formatting an integer.

I have same result; integer version is 0.65 seconds and double version
is 0.82 seconds.

--
Kazuo Oishi

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Sorry, only registered users may post in this forum.

Click here to login