But your original statement implied that clang would *use* that
particular piece of code, which didn't make much sense. Were you
just asking about how the __OpenBSD__ macro is defined, without
reference to srand?
On Thu, 08 Jan 2026 22:46:42 -0800, Keith Thompson wrote:
But your original statement implied that clang would *use* that
particular piece of code, which didn't make much sense. Were you
just asking about how the __OpenBSD__ macro is defined, without
reference to srand?
Well, under OpenBSD I plan on using:
#ifdef __OpenBSD__
srand_deterministic(seed);
#else
srand(seed);
#endif
But what I was asking is whether or not gcc would recognize
the __OpenBSD__ macro (why wouldn't it, I'm assuming) since clang
is the default compiler there.
But also about srand()... you've got me really wondering why
OpenBSD would deviate from the standard as they have. I get
that those folks disagree because it's deterministic, but
being deterministic is the accepted standard behaviour for srand().
Only speaking for myself here: rather than srand_deterministic()
and an srand() that's not deterministic under OpenBSD, it
would've made more sense to have implemented srand_non_deterministic()
and left srand() alone. That design decision on their part only
muddies the waters in my thinking. Live & learn =)
On Wed, 07 Jan 2026 08:41:25 -0800
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Michael S <already5chosen@yahoo.com> writes:
On Tue, 23 Dec 2025 17:54:05 -0000 (UTC)
antispam@fricas.org (Waldek Hebisch) wrote:
[...]
There is a paper "PCG: A Family of Simple Fast Space-Efficient
Statistically Good Algorithms for Random Number Generation"
by M. O'Neill where she gives a family of algorithms and runs
several statistical tests against known algorithms. Mersenne
Twister does not look good in the tests. If you have enough (128) bits,
LCGs do pass the tests. A bunch of generators with 64-bit state also
pass the tests. So the only reason to prefer Mersenne Twister is
that it is implemented in available libraries. Otherwise it is
not so good: it has a large state and needs more execution time
than the alternatives.
I don't know. Testing randomness is a complicated matter.
How can I be sure that L'Ecuyer and Simard's TestU01 suite tests
things that I personally care about and does not test
things that are of no interest to me? Especially the latter.
Do you think any of the tests in the TestU01 suite are actually
counter-indicated? As long as you don't think any TestU01 test
makes things worse, there is no reason not to use all of them.
You are always free to disregard tests you don't care about.
Except that it's difficult psychologically.
The battery of tests gains a position of authority in your mind.
Well, maybe you specifically are resistant, but I am not. Nor is
Melissa O'Neill, it seems.
To illustrate my point, I will tell you a story about myself.
A sort of confession.
[very large portion]
One important point that I seem to have figured out recently is that the
only practical way to produce a PRNG that is both solid and very fast,
and that adheres to standard language APIs with 32-bit (and, to a
somewhat smaller extent, 64-bit) output, is to use buffering. I.e. most
of the time the generator simply reads a pre-calculated word from the
buffer, and only once per N iterations runs the actual PRNG algorithm,
probably in a loop, often in SIMD. In order for this approach to be
effective, the buffer can't be particularly small: 32 bytes (256 bits)
appears to be an absolute minimum. The buffer, and the counter that
manages buffering, are parts of the generator state. That alone sets a
practical lower limit on the size of the generator and diminishes the
significance of the difference between PRNGs with "algorithmic" state of
64, 128 or even 256 bits.
Michael S <already5chosen@yahoo.com> writes:
[...]
I have read through your whole posting several times, and also
looked through your other postings in this thread. Despite my
efforts I am still not sure what you think or what you're trying to
say.
Let me put it as a question. Do you think there is a good and
objective test for measuring the quality of a PRNG? If so what test
(or tests) do you think would suffice? Here "quality" is meant as
some sort of numeric measure, which could be a monotonic metric (as
in "the larger the number the higher the quality") or just a simple pass/fail.
If you don't think there is any such test, how do you propose that
PRNGs be evaluated?
On Tue, 03 Feb 2026 05:26:47 -0800
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
[...]
Let me put it as a question. Do you think there is a good and
objective test for measuring the quality of a PRNG? If so what test
(or tests) do you think would suffice? Here "quality" is meant as
some sort of numeric measure, which could be a monotonic metric (as
in "the larger the number the higher the quality") or just a simple
pass/fail.
I don't think that it is possible to create generic empirical tests
of this sort.
What is possible is a test that measures a specific property that is
known to be important for a specific use.
If you don't think there is any such test, how do you propose that
PRNGs be evaluated?
I don't know.
But I believe in one philosophical principle that I first learned about
40 years ago, in a field unrelated to PRNGs or to Computer Science:
when you don't know how to measure the quality of your product, it is
advisable to ask your consumer. Do not ask a random consumer; ask the
one who requested improvement of the property that you are having
difficulty measuring. He is the one most likely to know.
The field where I first heard of that principle was the manufacturing
of ultra-pure chemicals.
The key property of a (pseudo) random number generator is that the
values produced exhibit no discernible pattern.
On 18/02/2026 07:47, Tim Rentsch wrote:
The key property of a (pseudo) random number generator is that the
values produced exhibit no discernible pattern.
For a PRNG, they exhibit the pattern of following the sequence of the PRNG!
Is it that, for any finite sequence of numbers from a PRNG, without
information about where it came from and how many numbers came before,
you can't predict the next number better than chance?
On 18/02/2026 12:21, Tristan Wibberley wrote:
On 18/02/2026 07:47, Tim Rentsch wrote:
The key property of a (pseudo) random number generator is that the
values produced exhibit no discernible pattern.
For a PRNG, they exhibit the pattern of following the sequence of the PRNG!
As a deterministic function, a PRNG will obviously follow the pattern of
its generating function. But the aim is to have no /discernible/
pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
could be identified without knowledge of where they came from - and thus
no way to predict the next number, 9, in the sequence. But there is a pattern there - it's the 90th - 100th digits of the decimal expansion of pi.
On 2026-02-19 04:01, David Brown wrote:
On 18/02/2026 12:21, Tristan Wibberley wrote:
On 18/02/2026 07:47, Tim Rentsch wrote:
The key property of a (pseudo) random number generator is that the
values produced exhibit no discernible pattern.
For a PRNG, they exhibit the pattern of following the sequence of the PRNG!
As a deterministic function, a PRNG will obviously follow the pattern of
its generating function. But the aim is to have no /discernible/
pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
could be identified without knowledge of where they came from - and thus
no way to predict the next number, 9, in the sequence. But there is a
pattern there - it's the 90th - 100th digits of the decimal expansion of pi.
I think you're being overoptimistic. I suspect that the pattern could be
identified, exactly, without knowing how it was generated. That's
because every possible pattern has infinitely many different ways in
which it can be produced. One of those other ways might be easier to
describe than the way in which the numbers were actually produced, in
which case that simpler way might be guessed more easily than the actual
one - possibly a lot more easily.
David Brown <david.brown@hesbynett.no> writes:
[...]
As a deterministic function, a PRNG will obviously follow the pattern
of its generating function. But the aim is to have no /discernible/
pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
could be identified without knowledge of where they came from - and
thus no way to predict the next number, 9, in the sequence. But there
is a pattern there - it's the 90th - 100th digits of the decimal
expansion of pi.
A Google search for 342117067 gives numerous hits referring to the
digits of pi.
How likely is it that someone would guess a formula that happened to
generate the decimal digits of pi, without more knowledge than a part
of the sequence? I don't believe it is possible to quantify such a probability, but I would expect it to be very low.
On 2026-02-19 14:47, David Brown wrote:
...
How likely is it that someone would guess a formula that happened to
generate the decimal digits of pi, without more knowledge than a part
of the sequence? I don't believe it is possible to quantify such a
probability, but I would expect it to be very low.
I'm thinking of the kind of software that looks for patterns in
something, such as compression utilities. A compression utility
basically converts a long string of numbers into a much shorter string
that can be expanded by the decompression utility to recover the
original pattern. If you look at the algorithms such code uses, you
realize that they do not attempt to recreate the process that
originally generated the long string; they just, in effect,
characterize the resulting sequence.
On 19/02/2026 23:39, Keith Thompson wrote:
David Brown <david.brown@hesbynett.no> writes:
[...]
As a deterministic function, a PRNG will obviously follow the pattern[...]
of its generating function. But the aim is to have no /discernible/
pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
could be identified without knowledge of where they came from - and
thus no way to predict the next number, 9, in the sequence. But there
is a pattern there - it's the 90th - 100th digits of the decimal
expansion of pi.
A Google search for 342117067 gives numerous hits referring to the
digits of pi.
That is using knowledge of where the sequence comes from - something else's knowledge rather than your own, but it's the same principle.
On Fri, 2/20/2026 3:16 AM, David Brown wrote:
On 19/02/2026 23:39, Keith Thompson wrote:
David Brown <david.brown@hesbynett.no> writes:
[...]
As a deterministic function, a PRNG will obviously follow the pattern
of its generating function. But the aim is to have no /discernible/
pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
could be identified without knowledge of where they came from - and
thus no way to predict the next number, 9, in the sequence. But there
is a pattern there - it's the 90th - 100th digits of the decimal
expansion of pi.
A Google search for 342117067 gives numerous hits referring to the
digits of pi.
That is using knowledge of where the sequence comes from - something else's knowledge rather than your own, but it's the same principle.
"In the following sequence, what is the next digit 7,7,7,7,7,7,7,7,7 ? " :-)
PI=3.
1415926535 8979323846 2643383279 5028841971 6939937510
...
7777777772 4846769425 9310468643 5260899021 0266057232 # Line 517834
I suspect seeing that, that's not good.
Using pgmp-chudnovsky.c , and dumping pi as a binary float to a file,
I get this:
(text version of PI) 100,000,022 bytes
PI-Binary.bin 41,524,121 bytes exponent and limbs
PI-Binary.bin.7Z 41,526,823 bytes 7Z Ultra compression, running on 1 core
The entropy property looks pretty good, but I doubt I would
be using that for my supply of random numbers :-)
https://gmplib.org/list-archives/gmp-discuss/2008-November/003444.html
https://stackoverflow.com/questions/3318979/how-to-serialize-the-gmp-mpf-type
https://gmplib.org/list-archives/gmp-discuss/2007-November/002981.html
gcc -DNO_FACTOR -fopenmp -Wall -O2 -o pgmp-chudnovsky pgmp-chudnovsky.c -lgmp -lm
Paul
On 23/02/2026 14:32, Paul wrote:
[...]
In a random sequence of decimal digits, you would expect a sequence
of nine identical digits to turn up on average every 10 ^ 8 digits or
so. You calculated 10 ^ 8 digits, so it's not surprising to see that
here.
As for your compression, remember that your text file contains only
the digits 0 to 9, spaces and newlines - 12 different characters in
8-bit bytes. If these were purely randomly distributed, you'd expect
a best compression ratio of log(12) / log(256), or 0.448. But they
are not completely random - your space characters and newlines are
predictably spaced, so you get marginally better compression ratios.
Without spaces and newlines, you'd expect log(10) / log(256)
compression - 0.415241012. What a coincidence - this matches your
"exponent and limbs", and your compressor can't improve on it. (I
downloaded a billion digits of pi and gzip'ed it, for a compression
ratio of 0.469.)
It turns out that the pseudo-randomness here is extremely good.
While it has not been proven that pi is "normal" (that is to say, that
every finite block of digits appears with equal asymptotic frequency),
it is strongly believed to be so.
Of course it's not a great source of entropy for secure random
numbers, but the digits of pi form a fine pseudo-random generator
function (if you don't mind the calculation time).
On Mon, 23 Feb 2026 16:05:45 +0100
David Brown <david.brown@hesbynett.no> wrote:
[...]
It would be interesting to find out if it passes L'Ecuyer's BigCrush.
Of course, one would need far more than a billion decimal digits to
have a chance. Something like 100B hexadecimal digits appears to be a
minimum.
(text version of PI)  100,000,022 bytes
PI-Binary.bin          41,524,121 bytes   exponent and limbs
Weirdly (at least /I/ think it is weird), it is easier to calculate hexadecimal digits of pi than decimal digits.
Of course, confirming that the hexadecimal digits of pi are random
enough to pass such a test does not ensure that the decimal digits
would do so too.
David Brown <david.brown@hesbynett.no> writes:
Of course, confirming that the hexadecimal digits of pi are random
enough to pass such a test does not ensure that the decimal digits
would do so too.
I was puzzled by the "Of course": To me, this is not intuitively clear.
Is there any easy (not too technical) way to "see this"/make it
plausible? My gut feeling (wrongly?) said that the base should not
affect the randomness of a numerical pattern. "Of course" I am aware
(and have taught dozens of numerical beginners) that, say, 0.1 in base
10 has a non-terminating representation in base 2, but neither
representation is "random".
Pointers or simple counter-examples highly welcome!
Axel
Numbers can be simply normal in some bases and not in others. This is
easy to see if we pick related bases, such as base 2 and base 16. For
example, let x be 1/3. Then x is 0.0101010101... in base 2, where the
digits 0 and 1 each occur half the time. But in base 16, it is
0.55555555..., which is clearly very far from normal.