In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
a=b; // equ. to "mov a,b"
The 2nd difference: Assembly contains too many burdensome labels. In C, we use structure, for example:
while(a<b) { // 'while', '(', ')' may be the place for implicit labels
a+=1;
} // '}' is an implicit label
if(a<b) {
} else { // '{', '}' are implicit labels
} // ditto
The 3rd difference: Function calling convention in C is reentrant (mostly).
The 4th difference: Local variable.
(Assembly can theoretically do the same, but I can't recall which assemblers support this feature.)
On 14/04/2026 15:47, wij wrote:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
  a=b;  // equ. to "mov a,b"
What C's 'a=b' equates to in assembly could be anything, depending on
target machine, the types of 'a' and 'b', their scopes and linkage, the
compiler used, and the optimisation levels employed.
The 2nd difference: Assembly contains too many burdensome labels. In C, we use
structure, for example:
  while(a<b) {   // 'while', '(', ')' may be the place for implicit labels
    a+=1;
  }              // '}' is an implicit label
  if(a<b) {
  } else {       // '{', '}' are implicit labels
  }              // ditto
The 3rd difference: Function calling convention in C is reentrant (mostly).
The 4th difference: Local variable.
(Assembly can theoretically do the same, but I can't recall which assemblers
support this feature.)
So basically, C and Assembly are NOT essentially the same. C has far
more abstractions: it is a HLL.
And actually, there are at least a couple of language levels I've used
that sit between Assembly and C.
On Tue, 2026-04-14 at 18:45 +0100, Bart wrote:
On 14/04/2026 15:47, wij wrote:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
  a=b;  // equ. to "mov a,b"
What C's 'a=b' equates to in assembly could be anything, depending on
target machine, the types of 'a' and 'b', their scopes and linkage, the
compiler used, and the optimisation levels employed.
The 2nd difference: Assembly contains too many burdensome labels. In C, we use
structure, for example:
  while(a<b) {   // 'while', '(', ')' may be the place for implicit labels
    a+=1;
  }              // '}' is an implicit label
  if(a<b) {
  } else {       // '{', '}' are implicit labels
  }              // ditto
The 3rd difference: Function calling convention in C is reentrant (mostly).
The 4th difference: Local variable.
(Assembly can theoretically do the same, but I can't recall which assemblers
support this feature.)
So basically, C and Assembly are NOT essentially the same. C has far
more abstractions: it is a HLL.
Anyway, IMO, 'portable assembly' is more descriptive.
'High-Level Language' is anyone's interpretation (prone to mis-interpretation and
misunderstanding).
'Assembly' can also be like C:
// This is 'assembly'
def int=32bit; // Choose right bits for your platform, or leave it for
def char= 8bit; // compiler to decide.
int a;
char b;
a=b; // allow auto promotion
while(a<b) {
a+=1;
}
You also can call the above example 'C'. If so, you still have to know how wide
int/char is (not rare: programmers often struggle over which size to use) while
writing "a=b", eventually. What does the 'abstraction' really mean? Maybe, eventually,
back to int32_t and int8_t after long theoretical/philosophical pondering?
HLL is just 'style' in favor of a specific purpose over another, for me. I am not
saying it is wrong; instead it is very helpful (measured by actual effort and
gain).
And actually, there are at least a couple of language levels I've used
that sit between Assembly and C.
On Tue, 2026-04-14 at 18:45 +0100, Bart wrote:
On 14/04/2026 15:47, wij wrote:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
  a=b;  // equ. to "mov a,b"
What C's 'a=b' equates to in assembly could be anything, depending on
target machine, the types of 'a' and 'b', their scopes and linkage, the
compiler used, and the optimisation levels employed.
The 2nd difference: Assembly contains too many burdensome labels. In C, we use
structure, for example:
  while(a<b) {   // 'while', '(', ')' may be the place for implicit labels
    a+=1;
  }              // '}' is an implicit label
  if(a<b) {
  } else {       // '{', '}' are implicit labels
  }              // ditto
The 3rd difference: Function calling convention in C is reentrant (mostly).
The 4th difference: Local variable.
(Assembly can theoretically do the same, but I can't recall which assemblers
support this feature.)
So basically, C and Assembly are NOT essentially the same. C has far
more abstractions: it is a HLL.
Anyway, IMO, 'portable assembly' is more descriptive.
'High-Level Language' is anyone's interpretation (prone to mis-interpretation and
misunderstanding).
'Assembly' can also be like C:
// This is 'assembly'
def int=32bit; // Choose right bits for your platform, or leave it for
def char= 8bit; // compiler to decide.
int a;
char b;
a=b; // allow auto promotion
while(a<b) {
a+=1;
}
You also can call the above example 'C'.
If so, you still have to know how wide
int/char is (not rare: programmers often struggle over which size to use) while
writing "a=b", eventually. What does the 'abstraction' really mean? Maybe, eventually,
back to int32_t and int8_t after long theoretical/philosophical pondering?
HLL is just 'style' in favor of a specific purpose over another, for me. I am not
saying it is wrong; instead it is very helpful (measured by actual effort and
gain).
But if you want to call C some kind of assembler, even though it is
several levels above actual assembly, then that's up to you.
On 2026-04-14 23:41, Bart wrote:
But if you want to call C some kind of assembler, even though it is
several levels above actual assembly, then that's up to you.
Can you name and describe a couple of these "several levels above
actual assembly"?  (Assembler macros might qualify as one level.)
Beyond the inherent subjective aspects of that or the OP's initial
statement I certainly see "C" closer to the machine than many HLLs.
It certainly depends on where one is coming from; from an abstract
or user-application level or from the machine level.
There was often mentioned here - very much to the displeasure of the
audience - that there's a lot of effort necessary to implement simple
concepts. To jump on that bandwagon: how would, say, Awk's array
construct  map[key] = value  have to be modeled in (native) "C"?
(Note that this simple statement represents an associative array.)
"C" is abstracting from the machine. And the OP's initial statement
"C and assembly are essentially the same" may be nonsense
wij <wyniijj5@gmail.com> writes:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
No, C is not any kind of assembly.  Assembly language and C are
fundamentally different.
An assembly language program specifies a sequence of CPU instructions.
A C program specifies run-time behavior.  (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
  a=b;  // equ. to "mov a,b"
(Or "mov b,a" depending on the assembly syntax.)
Nope.  `a=b` could translate to a lot of different instruction
sequences.  Either or both of the operands could be registers.  There
might or might not be different "mov" instructions for integers, pointers,
floating-point values.  a and b could be large structs, and the
assignment might be translated to a call to memcpy(), or to equivalent
inline code.
Or the assignment might not result in any code at all, if the compiler
can prove that it has no side effects and the value of a is not used.
[...]
On 14/04/2026 19:41, wij wrote:
On Tue, 2026-04-14 at 18:45 +0100, Bart wrote:
On 14/04/2026 15:47, wij wrote:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
  a=b;  // equ. to "mov a,b"
What C's 'a=b' equates to in assembly could be anything, depending on
target machine, the types of 'a' and 'b', their scopes and linkage, the
compiler used, and the optimisation levels employed.
The 2nd difference: Assembly contains too many burdensome labels. In C, we use
structure, for example:
  while(a<b) {   // 'while', '(', ')' may be the place for implicit labels
    a+=1;
  }              // '}' is an implicit label
  if(a<b) {
  } else {       // '{', '}' are implicit labels
  }              // ditto
The 3rd difference: Function calling convention in C is reentrant (mostly).
The 4th difference: Local variable.
(Assembly can theoretically do the same, but I can't recall which assemblers
support this feature.)
So basically, C and Assembly are NOT essentially the same. C has far
more abstractions: it is a HLL.
Anyway, IMO, 'portable assembly' is more descriptive.
'High-Level Language' is anyone's interpretation (prone to misinterpretation and
misunderstanding).
'Assembly' can also be like C:
  // This is 'assembly'
  def int=32bit;   // Choose right bits for your platform, or leave it for
  def char= 8bit;  // compiler to decide.
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You also can call the above example 'C'.
That's because it is pretty much C. It's not like any assembly I've ever
seen!
If so, you still have to know how wide
int/char is (not rare: programmers often struggle over which size to use) while
writing "a=b", eventually. What does the 'abstraction' really mean? Maybe, eventually,
back to int32_t and int8_t after long theoretical/philosophical pondering?
Wrap the above into a viable C function. Paste it into godbolt.org, then
look at the actual assembly that is generated for combinations of
target, compiler and options. All will be different. Some may not even
generate any code for that loop. (You might also try Clang with -emit-llvm.)
Then change the types of a and b, say to floats or pointers, and do it
again. The assembly will change yet again, even though you've modified
nothing else. That is a characteristic of a HLL: you change one small
part, and the generated code changes across the program.
But if you want to call C some kind of assembler, even though it is
several levels above actual assembly, then that's up to you.
HLL is just 'style' in favor of a specific purpose over another, for me. I am not
saying it is wrong; instead it is very helpful (measured by actual effort and
gain).
Below C there are HLAs or high-level assemblers, which at one time were
also called machine-oriented languages, intended for humans to use. And
a little below that might be intermediate languages (ILs), usually machine-generated, intended for compiler backends.
ILs will be target-independent and so portable to some extent. I'd say
that 'portable assembly' fits those better.
(I've implemented, or devised and implemented, all the four levels
discussed here. There are also other languages in this space such as
PL/M, or Forth.)
On Tue, 2026-04-14 at 15:31 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
No, C is not any kind of assembly.  Assembly language and C are
fundamentally different.
An assembly language program specifies a sequence of CPU instructions.
[Repeat] 'Assembly' can also be like C:
// This is 'assembly'
def int=32bit; // Choose right bits for your platform, or leave it for
def char= 8bit; // compiler to decide.
int a;
char b;
a=b; // allow auto promotion
while(a<b) {
a+=1;
}
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have functions, macros).
A C program specifies run-time behavior.  (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
  a=b;  // equ. to "mov a,b"
(Or "mov b,a" depending on the assembly syntax.)
Nope.  `a=b` could translate to a lot of different instruction
sequences.  Either or both of the operands could be registers.  There
might or might not be different "mov" instructions for integers, pointers,
floating-point values.  a and b could be large structs, and the
assignment might be translated to a call to memcpy(), or to equivalent
inline code.
Or the assignment might not result in any code at all, if the compiler
can prove that it has no side effects and the value of a is not used.
[...]
All mentioned could also be implemented in assembly.
Note that I am not saying C is assembly.
C and assembly are essentially the same
wij <wyniijj5@gmail.com> writes:
On Tue, 2026-04-14 at 15:31 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
No, C is not any kind of assembly.  Assembly language and C are
fundamentally different.
An assembly language program specifies a sequence of CPU instructions.
[Repeat] 'Assembly' can also be like C:
  // This is 'assembly'
  def int=32bit;   // Choose right bits for your platform, or leave it for
  def char= 8bit;  // compiler to decide.
Compiler?  You said this was assembly.
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language.  What assembler?
For what CPU?
Is it even for a real assembler?
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have functions, macros).
A C program specifies run-time behavior.  (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
Yes, that's what I said.  And that's the fundamental difference between
assembly and C.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
  a=b;  // equ. to "mov a,b"
(Or "mov b,a" depending on the assembly syntax.)
Nope.  `a=b` could translate to a lot of different instruction
sequences.  Either or both of the operands could be registers.  There
might or might not be different "mov" instructions for integers, pointers,
floating-point values.  a and b could be large structs, and the
assignment might be translated to a call to memcpy(), or to equivalent
inline code.
Or the assignment might not result in any code at all, if the compiler
can prove that it has no side effects and the value of a is not used.
[...]
All mentioned could also be implemented in assembly.
Sure, many C compilers can generate assembly code.  But I question your
claim that an assembler can plausibly generate a call to memcpy() for
something that looks like a simple assignment.
Many assemblers support macros, but the assembly language still
specifies the sequence of CPU instructions.
If you can cite a real-world "assembler" that behaves that way,
there might be something to discuss.
Note that I am not saying C is assembly.
You said that "C and assembly are essentially the same, maybe better
call it 'portable assembly'."  I disagree.
I had a similar discussion here some time ago.  As I recall, the
other participant repeatedly claimed that sophisticated assemblers
that don't generate specified sequences of CPU instructions are
common, but never provided an example.  (I haven't been able to
track down the discussion.)
On 14/04/2026 23:20, Janis Papanagnou wrote:
On 2026-04-14 23:41, Bart wrote:
But if you want to call C some kind of assembler, even though it is
several levels above actual assembly, then that's up to you.
Can you name and describe a couple of these "several levels above
actual assembly"?  (Assembler macros might qualify as one level.)
I said C is several levels above, and mentioned 2 categories and 2
specific ones that can be considered to be in-between.
Namely:
* HLAs (high-level assemblers) of various kinds, as this is a broad
category (see note)
* Intermediate languages (IRs/ILs) such as LLVM IR
* Forth
* PL/M (an old one; there was also C--, now dead)
(Note: the one I implemented was called 'Babbage', devised for the GEC
4000 machines. My task was to port it to DEC PDP10. There's something
about it 2/3 down this page: https://en.wikipedia.org/wiki/GEC_4000_series)
Beyond the inherent subjective aspects of that or the OP's initial
statement I certainly see "C" closer to the machine than many HLLs.
I see it as striving to distance itself from the machine as much as possible!
Certainly until C99 when stdint.h came along.
For example:
* Not committing to actual machine types, widths or representations,
such as a 'byte', or 'twos complement'.
* Being vague about the relations between the different integer types
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
It certainly depends on where one is coming from; from an abstract
or user-application level or from the machine level.
There was often mentioned here - very much to the displeasure of the
audience - that there's a lot of effort necessary to implement simple
concepts. To jump on that bandwagon: how would, say, Awk's array
construct  map[key] = value  have to be modeled in (native) "C"?
(Note that this simple statement represents an associative array.)
"C" is abstracting from the machine. And the OP's initial statement
"C and assembly are essentially the same" may be nonsense.
Actually, describing C as 'portable assembly' annoys me, which is why I
went into some detail.
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
On Tue, 2026-04-14 at 15:31 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
No, C is not any kind of assembly.  Assembly language and C are
fundamentally different.
An assembly language program specifies a sequence of CPU instructions.
[Repeat] 'Assembly' can also be like C:
  // This is 'assembly'
  def int=32bit;   // Choose right bits for your platform, or leave it for
  def char= 8bit;  // compiler to decide.
Compiler?  You said this was assembly.
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language.  What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have functions, macros).
A C program specifies run-time behavior.  (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
Yes, that's what I said.  And that's the fundamental difference between
assembly and C.
How/with what do you specify 'run-time behavior'? Not based on the CPU?
E.g. in C, int types are fixed-size and have range, wrap-around, alignment
and 'atomic'/'overlapping' properties; you cannot really hide them and still
program C/C++ correctly from the high-level concept of 'integer'.
The point is that C has NO WAY to get rid of these (hardware) features, no matter
how high-level one thinks C is or expects C to be.
On Tue, 2026-04-14 at 22:41 +0100, Bart wrote:
HLL is just 'style' in favor of a specific purpose over another, for me. I am not
saying it is wrong; instead it is very helpful (measured by actual effort and
gain).
Below C there are HLAs or high-level assemblers, which at one time were
also called machine-oriented languages, intended for humans to use. And
a little below that might be intermediate languages (ILs), usually
machine-generated, intended for compiler backends.
ILs will be target-independent and so portable to some extent. I'd say
that 'portable assembly' fits those better.
Do you program (read/write) IL directly?
I am talking about the language that humans use directly.
(I've implemented, or devised and implemented, all the four levels
discussed here. There are also other languages in this space such as
PL/M, or Forth.)
I am not talking about compiler technology.
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language.  What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have functions, macros).
A C program specifies run-time behavior.  (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
Yes, that's what I said.  And that's the fundamental difference between
assembly and C.
How/with what do you specify 'run-time behavior'? Not based on the CPU?
E.g. in C, int types are fixed-size and have range, wrap-around, alignment
and 'atomic'/'overlapping' properties; you cannot really hide them and still
program C/C++ correctly from the high-level concept of 'integer'.
The point is that C has NO WAY to get rid of these (hardware) features, no matter
how high-level one thinks C is or expects C to be.
I had a similar discussion here some time ago.  As I recall, the
other participant repeatedly claimed that sophisticated assemblers
that don't generate specified sequences of CPU instructions are
common, but never provided an example.  (I haven't been able to
track down the discussion.)
When I heard 'sophisticated assemblers', I would think of something like my idea of
'portable' assembly, but maybe different.
One of my points should be clear, as stated in the int example above: "... C has NO WAY
to get rid of these (hardware) features, no matter how high-level one thinks C is or
expects C to be."
On 15/04/2026 05:20, wij wrote:
On Tue, 2026-04-14 at 22:41 +0100, Bart wrote:
HLL is just 'style' in favor of a specific purpose over another, for me. I am not
saying it is wrong; instead it is very helpful (measured by actual effort and
gain).
Below C there are HLAs or high-level assemblers, which at one time were
also called machine-oriented languages, intended for humans to use. And
a little below that might be intermediate languages (ILs), usually
machine-generated, intended for compiler backends.
ILs will be target-independent and so portable to some extent. I'd say
that 'portable assembly' fits those better.
Do you program (read/write) IL directly?
I am talking about the language that humans use directly.
It is possible to write IL directly, when a textual form of it exists.
Not many do that, but then not many write assembly these days either,
/because more convenient higher level languages exist/, one of them being C.
Why do /you/ think that people prefer to use C to write programs rather
than assembly, if they are 'essentially the same'?
(I've implemented, or devised and implemented, all the four levels discussed here. There are also other languages in this space such as PL/M, or Forth.)
I am not talking about compiler technology.
You claimed that C and assembler are at pretty much the same level. I'm
saying that they are not only at different levels, but other levels
exist, and I know because I've used them!
A compiler can choose to translate a language to any of those levels, including C (from a higher level language than C usually).
On 15/04/2026 07:05, wij wrote:
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language.  What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
So you've invented an 'assembly' syntax that looks exactly like C, in
order to support your notion that C and assembly are really the same thing!
Real assembly generally uses explicit instructions and labels rather
than the implicit ones used here. It would also have limits on the
complexity of expressions. If your pseudo-assembler supports:
    a = b+c*f(x,y);
then you've invented a HLL.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have functions, macros).
A C program specifies run-time behavior.  (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
Yes, that's what I said.  And that's the fundamental difference between
assembly and C.
How/with what do you specify 'run-time behavior'? Not based on the CPU?
E.g. in C, int types are fixed-size and have range, wrap-around, alignment
and 'atomic'/'overlapping' properties; you cannot really hide them and still
program C/C++ correctly from the high-level concept of 'integer'.
The point is that C has NO WAY to get rid of these (hardware) features, no matter
how high-level one thinks C is or expects C to be.
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges, wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I had a similar discussion here some time ago.  As I recall, the
other participant repeatedly claimed that sophisticated assemblers
that don't generate specified sequences of CPU instructions are
common, but never provided an example.  (I haven't been able to
track down the discussion.)
When I heard 'sophisticated assemblers', I would think of something like my idea of
'portable' assembly, but maybe different.
One of my points should be clear, as stated in the int example above: "... C has NO WAY
to get rid of these (hardware) features, no matter how high-level one thinks C is or
expects C to be."
Starting with C23, C has _BitInt, where you can define a 1000000-bit
integer type if you want. (There may be limits as to how big.)
Or a 37-bit type.
While I don't agree with such a feature for this language (partly
/because/ it is a big departure from machine types), it is a
counter-example to your point.
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
On 15/04/2026 07:05, wij wrote:
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language.  What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
So you've invented an 'assembly' syntax that looks exactly like C, in
order to support your notion that C and assembly are really the same thing!
Exactly. But not really 'invented'. I figured that if anyone wanted to implement
a 'portable assembly', he would find it not much different from C (from the
example shown, 'structured C'). So, in a sense, it is not worth implementing.
Real assembly generally uses explicit instructions and labels rather
than the implicit ones used here. It would also have limits on the
complexity of expressions. If your pseudo-assembler supports:
    a = b+c*f(x,y);
then you've invented a HLL.
You may say that.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have functions, macros).
A C program specifies run-time behavior. (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
Yes, that's what I said. And that's the fundamental difference between
assembly and C.
How/what do you specify 'run-time behavior'? Not based on CPU?
E.g. in C, int types are fixed-size, have range, wrap-around, alignment,
and 'atomic'/'overlapping' properties; you cannot really understand or hide them and
program C/C++ correctly from the high-level concept of 'integer'.
The point is that C has NO WAY to get rid of these (hardware) features, no matter
how high-level one thinks C is or expects C to be.
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges, wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I claim C is (maybe I should use 'may be'; sometimes I feel the conversation
is difficult) 'portable assembly' because C (a subset) could map to 'assembly',
and in a sense has to. E.g.
  int p2;  // p2 is connected to external hardware
  p2=0;
  p2=0;  // significant (the hardware knows the second 'touch' triggers a different
         // action, or it is for delay purposes)
And, with a union, I don't see how 'high-level' can officially explain the way one
reads/writes part of a float object:
  union {
    char carr[sizeof(uint64_t)];  // C++ guarantees sizeof(char)==1
    float f;
  };
I had a similar discussion here some time ago. As I recall, the
other participant repeatedly claimed that sophisticated assemblers
that don't generate specified sequences of CPU instructions are
common, but never provided an example. (I haven't been able to
track down the discussion.)
When I heard 'sophisticated assemblers', I would think something like
my idea of 'portable' assembly, but maybe different.
One of my points should be clear, as stated in the int example above: "... C has NO WAY to
get rid of these (hardware) features, no matter how high-level one
thinks C is or expects C to be."
Starting with C23, C has _BitInt, where you can define a 1000000-bit
integer type if you want. (There may be limits as to how big.)
Or a 37-bit type.
While I don't agree with such a feature for this language (partly /because/ it is a big departure from machine types), it is a counter-example to your point.
Thanks for the example. I did not stress 'C is assembly'; maybe it is
because I saw too many Bonita-type programming concepts that I stress 'portable
assembly' (also I think it may be helpful to others).
My understanding of C is that the development of C is simply from practical needs
(i.e. rarely is C from 'theoretical imagination'). Maybe _BitInt is the same, but
I don't know.
On Wed, 2026-04-15 at 11:21 +0100, Bart wrote:
On 15/04/2026 05:20, wij wrote:
On Tue, 2026-04-14 at 22:41 +0100, Bart wrote:
HLL is just 'style' in favor of one specific purpose over another, for me. I am not
saying it is wrong; instead it is very helpful (measured by actual effort and
gain).
Below C there are HLAs or high-level assemblers, which at one time were
also called machine-oriented languages, intended for humans to use. And
a little below that might be intermediate languages (ILs), usually
machine-generated, intended for compiler backends.
ILs will be target-independent and so portable to some extent. I'd say
that 'portable assembly' fits those better.
Do you program (read/write) IL directly?
I am talking about the language that human uses directly.
It is possible to write IL directly, when a textual form of it exists.
Not many do that, but then not many write assembly these days either,
/because more convenient higher level languages exist/, one of them being C.
Why do /you/ think that people prefer to use C to write programs rather
than assembly, if they are 'essentially the same'?
There are many reasons why people have chosen C. I agree the dominant one
is support and convenience.
(I've implemented, or devised and implemented, all the four levels
discussed here. There are also other languages in this space such as
PL/M, or Forth.)
I am not talking about compiler technology.
You claimed that C and assembler are at pretty much the same level. I'm
saying that they are not only at different levels, but other levels
exist, and I know because I've used them!
A compiler can choose to translate a language to any of those levels,
including C (from a higher level language than C usually).
This argument about 'levels' seems based on the engineering of compilers for multiple languages. My point of view is based on the theory of computation and maybe psychological recognition.
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges,
wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I claim C is (maybe I should use 'may be'; sometimes I feel the conversation
is difficult) 'portable assembly' because C (a subset) could map to 'assembly',
and in a sense has to. E.g.
  int p2;  // p2 is connected to external hardware
  p2=0;
  p2=0;  // significant (the hardware knows the second 'touch' triggers a different
         // action, or it is for delay purposes)
The 4th difference: Local variables.
(Assembly could theoretically do the same, but I am not aware of an assembler that
supports this feature.)
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
On 15/04/2026 07:05, wij wrote:
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language. What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
So you've invented an 'assembly' syntax that looks exactly like C, in
order to support your notion that C and assembly are really the same thing!
Exactly. But not really 'invented'. I figured if anyone wants to implement
a 'portable assembly', he would find it not much different from C (from the
example shown, 'structured C'). So, in a sense, not worth implementing.
Real assembly generally uses explicit instructions and labels rather
than the implicit ones used here. It would also have limits on the
complexity of expressions. If your pseudo-assembler supports:
ÿÿÿ a = b+c*f(x,y);
then you've invented a HLL.
You may say that.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have function, macro)
A C program specifies run-time behavior. (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', no exact instructions.
Yes, that's what I said. And that's the fundamental difference between
assembly and C.
How/what do you specify 'run-time behavior'? Not based on CPU?
E.g. in C, int types are fixed-size, have range, wrap-around, alignment,
and 'atomic'/'overlapping' properties; you cannot really understand or hide them and
program C/C++ correctly from the high-level concept of 'integer'.
The point is that C has NO WAY to get rid of these (hardware) features, no matter
how high-level one thinks C is or expects C to be.
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges,
wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I claim C is (maybe I should use 'may be'; sometimes I feel the conversation
is difficult) 'portable assembly' because C (a subset) could map to 'assembly',
and in a sense has to. E.g.
  int p2;  // p2 is connected to external hardware
  p2=0;
  p2=0;  // significant (the hardware knows the second 'touch' triggers a different
         // action, or it is for delay purposes)
And, with a union, I don't see how 'high-level' can officially explain the way one
reads/writes part of a float object:
  union {
    char carr[sizeof(float)];  // C++ guarantees sizeof(char)==1
    float f;
  };
The 4th difference: Local variables.
(Assembly could theoretically do the same, but I am not aware of an assembler that
supports this feature.)
If you are talking about function-local data, there are multiple ways
to store it in an easy-to-clean-up fashion:
- Volatile registers, for the shortest lived data. Calling other
functions causes them to be overwritten with the function's
return value or irrelevant data.
- Non-volatile registers, for data that need to persist across
function calls. You save the contents of them before using them,
as your caller expects the contents of these registers to be
intact once you return.
- The stack, for long-lived function-local data when you are out of
non-volatile registers. You manipulate a dedicated stack pointer
register to allocate and deallocate space for your data.
- Immediates, and the .rodata (ELF) / .rdata (PE) section, for
constants and tables of constants.
The notion of local variable allows you to ignore all of these in C,
though. Assembly having multiple ways to store local data instead of
one can make things fairly complicated to read, write and debug.
(forwarding to alt.lang.asm because you are comparing C with it)
On 15/04/2026 12:52, wij wrote:
On Wed, 2026-04-15 at 11:21 +0100, Bart wrote:
On 15/04/2026 05:20, wij wrote:
On Tue, 2026-04-14 at 22:41 +0100, Bart wrote:
HLL is just 'style' in favor of one specific purpose over another, for me. I am not
saying it is wrong; instead it is very helpful (measured by actual effort and
gain).
Below C there are HLAs or high-level assemblers, which at one time were
also called machine-oriented languages, intended for humans to use. And
a little below that might be intermediate languages (ILs), usually
machine-generated, intended for compiler backends.
ILs will be target-independent and so portable to some extent. I'd say
that 'portable assembly' fits those better.
Do you program (read/write) IL directly?
I am talking about the language that human uses directly.
It is possible to write IL directly, when a textual form of it exists.
Not many do that, but then not many write assembly these days either,
/because more convenient higher level languages exist/, one of them being C.
Why do /you/ think that people prefer to use C to write programs rather
than assembly, if they are 'essentially the same'?
There are many reasons why people have chosen C. I agree the dominant one
is support and convenience.
(I've implemented, or devised and implemented, all the four levels
discussed here. There are also other languages in this space such as
PL/M, or Forth.)
I am not talking about compiler technology.
You claimed that C and assembler are at pretty much the same level. I'm
saying that they are not only at different levels, but other levels
exist, and I know because I've used them!
A compiler can choose to translate a language to any of those levels, including C (from a higher level language than C usually).
This argument about 'levels' seems based on the engineering of compilers for multiple
languages. My point of view is based on the theory of computation and maybe
psychological recognition.
If you take syntax out of the equation, and then 'lower' what's left
(ie. flatten the various abstractions), then probably you can compare
the behaviour of a lot of languages with assembly.
However, those things are exactly what HLLs are about, while that
removing of syntax and lowering is exactly what compilers do.
That's why we use HLLs and not ASM unless we need to.
On 15/04/2026 14:21, wij wrote:
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges,
wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I claim C is (maybe I should use 'may be'; sometimes I feel the conversation
is difficult) 'portable assembly' because C (a subset) could map to 'assembly',
and in a sense has to. E.g.
  int p2;  // p2 is connected to external hardware
  p2=0;
  p2=0;  // significant (the hardware knows the second 'touch' triggers a different
         // action, or it is for delay purposes)
You are not making any sense. I don't think you understand what C is,
how the language is defined, or how typical C implementations work.
In C, when you write the code above there is /nothing/ to suggest that
there should be two actions.
C compilers can - and many will - combine
the two "p2 = 0;" statements. This is critical to understanding why C
is not in any sense an "assembler".
In assembly languages, if you write
the equivalent of "p2 = 0;" twice, you get the appropriate opcode twice.
In C, the language does not require an operation for the statement "p2 =
0;". It requires that after that statement, any observable behaviour
produced by the program will be as if the value 0 had been assigned to
the object "p2".
Repeating that same requirement does not change it -
the compiler does not have to implement "p2 = 0;" twice. (It is
free to do so twice - or two hundred times if it likes. And if the
value of p2 is not used, it can be completely eliminated.)
Have you actually done any C programming at all?
The 4th difference: Local variables.
(Assembly could theoretically do the same, but I am not aware of an assembler that
supports this feature.)
If you are talking about function-local data, there are multiple ways
to store it in an easy-to-clean-up fashion:
  - Volatile registers, for the shortest-lived data. Calling other
    functions causes them to be overwritten with the function's
    return value or irrelevant data.
  - Non-volatile registers, for data that need to persist across
    function calls. You save the contents of them before using them,
    as your caller expects the contents of these registers to be
    intact once you return.
  - The stack, for long-lived function-local data when you are out of
    non-volatile registers. You manipulate a dedicated stack pointer
    register to allocate and deallocate space for your data.
  - Immediates, and the .rodata (ELF) / .rdata (PE) section, for
    constants and tables of constants.
The notion of local variable allows you to ignore all of these in C,
though. Assembly having multiple ways to store local data instead of
one can make things fairly complicated to read, write and debug.
(forwarding to alt.lang.asm because you are comparing C with it)
On 15.04.2026 at 15:40, makendo wrote:
(forwarding to alt.lang.asm because you are comparing C with it)
Great, but what is wrong with comp.lang.asm? I subscribe to it instead of any other asm-related groups. Is this the wrong approach?
--
Jacek Marcin Jaworski, Pruszcz Gd., woj. Pomorskie, Polska, EU;
tel.: +48-609-170-742, najlepiej w godz.: 5:00-5:55 lub 16:00-17:25; <jmj@energokod.gda.pl>, gpg: 4A541AA7A6E872318B85D7F6A651CC39244B0BFA;
Domowa s. WWW: <https://energokod.gda.pl>;
Mini Netykieta: <https://energokod.gda.pl/MiniNetykieta.html>; Mailowa Samoobrona: <https://emailselfdefense.fsf.org/pl>.
NOTE:
DO NOT TAKE ON "HIDDEN DEBT"! PAY FOR FOSS SOFTWARE AND ONLINE INFORMATION!
READ FOR FREE: "17. Raport Totaliztyczny - Patroni Kontra Bankierzy": <https://energokod.gda.pl/raporty-totaliztyczne/17.%20Patroni%20Kontra%20Bankierzy.pdf>
On Wed, 2026-04-15 at 15:38 +0200, David Brown wrote:
On 15/04/2026 14:21, wij wrote:
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges,
wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I claim C is (maybe I should use 'may be'; sometimes I feel the conversation
is difficult) 'portable assembly' because C (a subset) could map to 'assembly',
and in a sense has to. E.g.
  int p2;  // p2 is connected to external hardware
  p2=0;
  p2=0;  // significant (the hardware knows the second 'touch' triggers a different
         // action, or it is for delay purposes)
You are not making any sense. I don't think you understand what C is,
how the language is defined, or how typical C implementations work.
I switched from C to C++ 30 years ago. But that is 'theoretical'; I see things
from the real-world side. I think you approach 'C' from standard documents; that is
not the way of understanding. You cannot understand the world by reading
the bible.
In C, when you write the code above there is /nothing/ to suggest that
there should be two actions.
As far as I know, 'old-time' C had no optimization.
C compilers can - and many will - combine
the two "p2 = 0;" statements. This is critical to understanding why C
is not in any sense an "assembler".
Not a valid reason.
In assembly languages, if you write
the equivalent of "p2 = 0;" twice, you get the appropriate opcode twice.
An assembler (or the language) could also do the same optimization.
In C, the language does not require an operation for the statement "p2 =
0;". It requires that after that statement, any observable behaviour
produced by the program will be as if the value 0 had been assigned to
the object "p2".
You need a model now by saying so.
Repeating that same requirement does not change it -
the compiler does not have to implement "p2 = 0;" twice. (It is
free to do so twice - or two hundred times if it likes. And if the
value of p2 is not used, it can be completely eliminated.)
Have you actually done any C programming at all?
Nope, I quit C (but I keep watching C, since part of C++ is C)
On Wed, 15 Apr 2026 20:23:52 +0200
"Jacek Marcin Jaworski" <jmj@energokod.gda.pl> wrote:
On 15.04.2026 at 15:40, makendo wrote:
(forwarding to alt.lang.asm because you are comparing C with it)
Great, but what is wrong with comp.lang.asm? I subscribe to it instead of any
other asm-related groups. Is this the wrong approach?
DYM comp.lang.asm.x86?
comp.lang.asm is an empty header for me on eternal september's feed.
The newsgroup comp.lang.asm is generally considered an unmoderated Usenet group. Unlike comp.lang.asm.x86, which is known to be moderated, comp.lang.asm does not have an official moderation process and typically allows posts to appear without prior review.
sig is overlong, and crowded, IMHO.
On 15/04/2026 13:21, wij wrote:
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
On 15/04/2026 07:05, wij wrote:
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language. What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
So you've invented an 'assembly' syntax that looks exactly like C, in
order to support your notion that C and assembly are really the same thing!
Exactly. But not really 'invented'. I figured if anyone wants to implement
a 'portable assembly', he would find it not much different from C (from the
example shown, 'structured C'). So, in a sense, not worth implementing.
Real assembly generally uses explicit instructions and labels rather
than the implicit ones used here. It would also have limits on the complexity of expressions. If your pseudo-assembler supports:
ÿÿÿÿ a = b+c*f(x,y);
then you've invented a HLL.
You may say that.
It sounds like you don't understand the difference between a low-level
language and a high-level one.
These days C might be considered mid-level (I call it a lower-level HLL,
because so many HLLs are much higher level and more abstract).
Compiling a HLL involves lowering it to a different representation, say
from language A to language B.
But just because that translation happens to be routine doesn't mean
that A is essentially B.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have functions, macros).
A C program specifies run-time behavior. (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
Yes, that's what I said. And that's the fundamental difference between
assembly and C.
How/what do you specify 'run-time behavior'? Not based on CPU?
E.g. in C, int types are fixed-size, have range, wrap-around, alignment,
and 'atomic'/'overlapping' properties; you cannot really understand or hide them and
program C/C++ correctly from the high-level concept of 'integer'.
The point is that C has NO WAY to get rid of these (hardware) features, no matter
how high-level one thinks C is or expects C to be.
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges,
wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I claim C is (maybe I should use 'may be'; sometimes I feel the conversation
is difficult) 'portable assembly' because C (a subset) could map to 'assembly',
and in a sense has to. E.g.
  int p2;  // p2 is connected to external hardware
  p2=0;
  p2=0;  // significant (the hardware knows the second 'touch' triggers a different
         // action, or it is for delay purposes)
That won't work in C. 'p2' is likely to be in a register; that extra
write may be elided.
You'd have to use 'volatile' to guard against that. But you still can't
control where p2 is put into memory. C /is/ used for this stuff, but all
sorts of special extensions, or compiler specifics, may be employed.
In assembly it's much easier.
And, with a union, I don't see how 'high-level' can officially explain the way one
reads/writes part of a float object:
  union {
    char carr[sizeof(float)];  // C++ guarantees sizeof(char)==1
    float f;
  };
(Fixed that sizeof.)
I normally use my own systems language. That one is aligned much more
directly to hardware than C is, even though it is marginally higher level.
This is because C is intended to work on any possible hardware, while mine
was created to work with one target at a time.
Also, when I started on mine (c. 1982 rather than 1972), hardware was already standardising on 8-bit bytes, byte-addressed, power-of-two word
sizes, and twos-complement integers.
I don't however consider my language to be a form of assembly, for lots
of reasons already mentioned.
Its compilers use 3 internal representations before it gets to native code:
  HLL source -> AST -> IL -> MCL -> Native
'MCL' is the internal representation of the native code. If I need ASM
output, then MCL can be dumped into a suitable syntax (I support 4
different ASM syntaxes for x64).
This MCL/ASM itself has abstractions, so the same 'MOV' mnemonic is used
for dozens of different move instructions that each have different
binary opcodes.
30 instructions are defined for convenience for common usage, see man page
On 15/04/2026 18:58, wij wrote:
On Wed, 2026-04-15 at 15:38 +0200, David Brown wrote:
On 15/04/2026 14:21, wij wrote:
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges,
wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I claim C is (maybe I should use 'may be'; sometimes I feel the conversation
is difficult) 'portable assembly' because C (a subset) could map to 'assembly',
and in a sense has to. E.g.
  int p2;  // p2 is connected to external hardware
  p2=0;
  p2=0;  // significant (the hardware knows the second 'touch' triggers a different
         // action, or it is for delay purposes)
You are not making any sense. I don't think you understand what C is,
how the language is defined, or how typical C implementations work.
I switched from C to C++ 30 years ago.
I don't think you understand C++ either. In the context of this
discussion, it is not different from C.
But that is 'theoretical'; I see things
from the real-world side. I think you approach 'C' from standard documents; that is
not the way of understanding. You cannot understand the world by reading
the bible.
No, I understand C and C++ from using them in real-world code - as well
as knowing what the code means and what is guaranteed by the language.
Practical experience tells you what works well in practice - but
theoretical knowledge tells you what you can expect so that you are not
just programming by luck and "it worked for me when I tried it".
In C, when you write the code above there is /nothing/ to suggest that
there should be two actions.
As far as I know, 'old-time' C had no optimization.
Nonsense.
Modern C compilers often do more optimisation than older ones, but there
was never a "pre-optimisation" world. Things like eliminating dead
code, or optimising based on knowing that signed integer overflow never
occurs in a correct program, have been around from early tools. I have
used heavily optimising compilers for 30 years.
C compilers can - and many will - combine
the two "p2 = 0;" statements. This is critical to understanding why C
is not in any sense an "assembler".
Not a valid reason.
What do you mean by that? It's a fact, not a "reason".
In assembly languages, if you write
the equivalent of "p2 = 0;" twice, you get the appropriate opcode twice.
An assembler (or the language) could also do the same optimization.
No, assemblers cannot do that - if they did, they would not be
assemblers. An assembler directly translates your instructions from
mnemonic codes (assembly instructions) to binary opcodes. Some
assemblers might have pseudo-instructions that translate to more than
one binary opcode, but always in a specific defined pattern.
In C, the language does not require an operation for the statement "p2 =
0;". It requires that after that statement, any observable behaviour
produced by the program will be as if the value 0 had been assigned to
the object "p2".
You need a model now by saying so.
Again, I don't know what you are trying to say.
Repeating that same requirement does not change it -
the compiler does not have to implement "p2 = 0;" twice. (It is
free to do so twice - or two hundred times if it likes. And if the
value of p2 is not used, it can be completely eliminated.)
Have you actually done any C programming at all?
Nope, I quit C (but I keep watching C, since part of C++ is C)
Okay, have you ever actually done any C++ programming? The languages
share the same philosophy here.
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
On Tue, 2026-04-14 at 15:31 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
In attempting to write a simple language, I had a thought about what a language is
to share. Because I saw many people are stuck in thinking C/C++ (or other
high-level languages) can be so abstract, so unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same; maybe better to call it 'portable assembly'.
No, C is not any kind of assembly. Assembly language and C are
fundamentally different.
An assembly language program specifies a sequence of CPU instructions.
[Repeat] 'Assembly' can also be like C:
  // This is 'assembly'
  def int=32bit;   // Choose the right bits for your platform, or leave it for
  def char= 8bit;  // the compiler to decide.
Compiler? You said this was assembly.
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language. What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have function, macro)
A C program specifies run-time behavior. (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', no exact instructions.
Yes, that's what I said. And that's the fundamental difference between
assembly and C.
How/what do you specify 'run-time behavior'? Not based on CPU?
E.g. in C, int types are fixed-size, have range, wrap-around, alignment,
and 'atomic'/'overlapping' properties; you cannot really understand or hide them and
program C/C++ correctly from the high-level concept of 'integer'.
The point is that C has NO WAY to get rid of these (hardware) features, no matter
how high-level one thinks C is or expects C to be.
When I heard 'sophisticated assemblers', I would think something like
my idea of 'portable' assembly, but maybe different. One of my points
should be clear, as stated in the int example above: "... C has NO WAY to
get rid of these (hardware) features, no matter how high-level one
thinks C is or expects C to be."
On Wed, 2026-04-15 at 15:38 +0200, David Brown wrote:[...]
In C, when you write the code above there is /nothing/ to suggest that
there should be two actions.
As far as I know, 'old-time' C had no optimization.
In assembly languages, if you write
the equivalent of "p2 = 0;" twice, you get the appropriate opcode twice.
Assembly compiler (or language) can also do the same optimization.
On 15/04/2026 01:33, Bart wrote:[...]
Certainly until C99 when stdint.h came along.
I would not draw that distinction - indeed, I see the opposite. Prior
to <stdint.h>, your integer type sizes were directly from the target
machine - with <stdint.h> explicitly sized integer types, they are now
independent of the target hardware.
On Wed, 2026-04-15 at 22:11 +0200, David Brown wrote:[...]
Okay, have you ever actually done any C++ programming?ÿ The languages
share the same philosophy here.
You are really a sick person. A loser in the real world. You just don't know yourself.
I have a gold medal, an aluminum medal and a bronze commemorative plaque (for
solving a riddle of Northrop Corp.). What do you have? Well... a paper (paid for),
and you are still making false memories every day for yourself.
I retired at 37; can you?
Ah, recently you also failed to verify a simple program that proves
the 3x+1 problem. Facts are not made by mouth (like DJT?), loser.
On Wed, 2026-04-15 at 15:06 +0100, Bart wrote:
The boundary of assembly and HLL is not clear to me.
I once wrote a killer-grade commercial assembly program; it may still be running
today after >30 years. My experience is that assembly is not as scary as commonly
thought, just don't think in low level.
wij <wyniijj5@gmail.com> writes:
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
On Tue, 2026-04-14 at 15:31 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
In attempting to write a simple language, I had a thought of what language is
to share. Because I saw many people are stuck in thinking C/C++ (or other
high level language) can be so abstract, unlimited 'high level' to mysteriously
solve various human description of idea.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
No, C is not any kind of assembly. Assembly language and C are
fundamentally different.
An assembly language program specifies a sequence of CPU instructions.
[Repeat] 'Assembly' can also be like C:
 // This is 'assembly'
 def int=32bit;   // Choose right bits for your platform, or leave it for
 def char=8bit;   // compiler to decide.
Compiler? You said this was assembly.
 int a;
 char b;
 a=b;   // allow auto promotion
 while(a<b) {
   a+=1;
 }
You've claimed that that's assembly language. What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
I hadn't. I realize it now that you've admitted it.
In other words, you made it up.
I don't believe there is any real-world assembler that accepts
that syntax. Your example is meaningless.
For every assembler I've used, the assembly language input
unambiguously specifies the sequence of CPU instructions in the
generated object file. Support for macros does not change that;
it just means the mapping is slightly more complicated.
Cite an example of an existing real-world assembler that does not
behave that way, and we might have something interesting to discuss.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have function, macro).
A C program specifies run-time behavior. (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
Yes, that's what I said. And that's the fundamental difference between
assembly and C.
How/what do you specify 'run-time behavior'? Not based on CPU?
The C standard defines "behavior" as "external appearance or action",
which is admittedly vague. Run-time behavior is what happens when the
program is running on the target system. It includes things like input
and output, either to a console or to files.
The C standard specifies the behavior of this program:
    #include <stdio.h>
    int main(void) { puts("hello, world"); }
It does so without reference to any CPU. (Of course some CPU will be
used to implement that behavior.)
E.g. in C, int types are fixed-size, have a range, wrap-around, alignment,
and 'atomic'/'overlapping' properties. You cannot really hide these and
program C/C++ correctly from the high-level concept of 'integer' alone.
The point is that C has NO WAY to get rid of these (hardware) features, no
matter how high-level one thinks C is or expects C to be.
Right, C doesn't directly support abstract mathematical integers.
Of course I agree that C is a lower level language than many others.
Python, for example, has reasonably transparent support for integers
of arbitrary width. Python is a higher level language than C.
(Notably, the Python interpreter is written in C).
That doesn't make C an assembly language.
[...]
When I hear 'sophisticated assemblers', I think of something like
my idea of 'portable' assembly, but maybe different. One of my points
should be clear from the int example above: "... C has NO WAY to
get rid of these (hardware) features, no matter how high-level one
thinks C is or expects C to be."
Again, yes, C is a relatively low-level language. And again,
C is not an assembly language.
And again, if you can cite a real-world example of the kind of
"sophisticated assembler" you're talking about, that would be an
interesting data point.
--ÿ
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
On 15/04/2026 22:12, wij wrote:
On Wed, 2026-04-15 at 15:06 +0100, Bart wrote:
The boundary of assembly and HLL is not clear to me.
That seems to be obvious.
I once wrote a killer-grade commercial assembly program; it may still be running
today after >30 years. My experience is that assembly is not as scary as commonly
thought, just don't think in low level.
It's not that scary. Just unergonomic to code in, taking longer,
being more error prone, much harder to understand, harder to maintain,
much less portable ...
On Wed, 2026-04-15 at 23:52 +0100, Bart wrote:
On 15/04/2026 22:12, wij wrote:
On Wed, 2026-04-15 at 15:06 +0100, Bart wrote:
The boundary of assembly and HLL is not clear to me.
That seems to be obvious.
I once wrote a killer-grade commercial assembly program; it may still be running
today after >30 years. My experience is that assembly is not as scary as commonly
thought, just don't think in low level.
It's not that scary. Just unergonomic to code in, taking longer,
being more error prone, much harder to understand, harder to maintain,
much less portable ...
Skill. Treat assembly as a chunk. Document it well.
Again, yes, C is a relatively low-level language. And again,
C is not an assembly language.
And again, if you can cite a real-world example of the kind of
"sophisticated assembler" you're talking about, that would be an
interesting data point.
I had thought questions like yours might have been due to a language problem.
I did not mean C is (equal to) assembly, but C is-a assembly (logic course 101).
And I hope the following code could clear up some confusion.
The 'assembly' could be 'structured assembly', but then I felt the
result should not be much different from C...
[... comparing C and assembly language ...]
On 16/04/2026 00:30, wij wrote:
On Wed, 2026-04-15 at 23:52 +0100, Bart wrote:
On 15/04/2026 22:12, wij wrote:
On Wed, 2026-04-15 at 15:06 +0100, Bart wrote:
The boundary of assembly and HLL is not clear to me.
That seems to be obvious.
I once wrote a killer-grade commercial assembly program; it may still be running
today after >30 years. My experience is that assembly is not as scary as commonly
thought, just don't think in low level.
It's not that scary. Just unergonomic to code in, taking longer,
being more error prone, much harder to understand, harder to maintain,
much less portable ...
Skill. Treat assembly as a chunk. Document it well.
You're not making sense. It's like saying I should walk everywhere
instead of using my car.
But I don't want to spend two extra hours a day walking and carrying
shopping etc.
What exactly is the benefit of using assembly over a HLL when both can
tackle the task?
When I first started with microprocessors, I first had to build the
hardware, which was programmed in binary. I wrote a hex editor so I
could use a keyboard. Then used that to write an assembler. Then used
the assembler to write a compiler for a simple HLL.
The HLL allowed me to be far more productive than otherwise. Everybody
seems to understand that, except you.
But I have a counter-proposal: why don't you also program in binary
machine code (I'll let you use hex!) instead of assembly? After all
it's just a skill.
wij <wyniijj5@gmail.com> writes:
[...]
[signature snipped]
When you post a followup, please trim quoted text that's not relevant to
your reply. And in particular, don't quote signatures.
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
Gentlemen,
I understand the natural reaction to want to respond to the kind of
statements being made in this thread. I hope y'all can resist this
natural reaction and not respond to people who persist in making
arguments that are basically isomorphic to saying 1 equals 0.
Thank you for your assistance in this matter.
On Wed, 2026-04-15 at 17:14 -0700, Tim Rentsch wrote:
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
Gentlemen,
I understand the natural reaction to want to respond to the kind of
statements being made in this thread. I hope y'all can resist this
natural reaction and not respond to people who persist in making
arguments that are basically isomorphic to saying 1 equals 0.
Thank you for your assistance in this matter.
Maybe you are right. I say A is-a B; some persist in reading it as A is (exactly) B.
I offer help with using assembly; some persist in reading it as urging people to use assembly and give up HLLs. What is going on here?
On Wed, 2026-04-15 at 17:14 -0700, Tim Rentsch wrote:
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
[...]
Maybe you are right. I say A is-a B; some persist in reading it as A is (exactly) B.
I offer help with using assembly; some persist in reading it as urging people to use assembly and give up HLLs. What is going on here?
So what you're saying is that assembly can do anything that any other arbitrary language (that has to eventually compile down to the same
machine code) can do? This should not be surprising to anyone.
C has never been, and was never intended to be, a "portable assembly".
It was designed to reduce the need to write assembly code. There is a
huge difference in these concepts.
On 15/04/2026 01:33, Bart wrote:
On 14/04/2026 23:20, Janis Papanagnou wrote:
On 2026-04-14 23:41, Bart wrote:
But if you want to call C some kind of assembler, even though it is
several levels above actual assembly, then that's up to you.
Can you name and describe a couple of these "several levels above
actual assembly"? (Assembler macros might qualify as one level.)
I said C is several levels above, and mentioned 2 categories and 2
specific ones that can be considered to be in-between.
I agree with a great deal you have written in this thread (at least what
I have read so far). My points below are mainly additional comments
rather than arguments or disagreements. Like you, my disagreement is
primarily with wij.
Namely:
* HLAs (high-level assemblers) of various kinds, as this is a broad
category (see note)
When I used to do significant amounts of assembly programming (often on
"brain-dead" 8-bit CISC microcontrollers), I would make heavy use of
assembler macros as a way of getting slightly "higher level" assembly.
Even with common assembler tools you can write something that is a kind
of HLA. And then for some targets, there are more sophisticated tools
(or you can write them yourself) for additional higher level constructs.
* Intermediate languages (IRs/ILs) such as LLVM IR
LLVM is probably the best candidate for something that could be called a
"portable assembler". It is quite likely that other such "languages"
have been developed and used (perhaps internally in multi-target
compilers), but LLVM's is the biggest and with the widest support.
* Forth
Forth is always a bit difficult to categorise. Many Forth
implementations are done with virtual machines or byte-code
interpreters, raising them above assembly. Others are for stack machine
processors (very common in the 4-bit world) where the assembly /is/ a
small Forth language. A lot of Forth tools compile very directly
(giving you the "you get what your code looks like" aspect of assembly),
others do more optimisation (for the "you get what your code means"
aspect of high level languages).
* PL/M (an old one; there was also C--, now dead)
I never used PL/M - I'm too young for that! C-- was conceived as a
portable intermediary language that compilers could generate to get
cross-target compilation without needing individual target backends. In
practice, ordinary C does a good enough job for many transpilers during
development, then they can move to LLVM for more control and efficiency
if they see it as worth the effort.
(Note: the one I implemented was called 'Babbage', devised for the GEC
4000 machines. My task was to port it to DEC PDP10. There's something
about it 2/3 down this page: https://en.wikipedia.org/wiki/
GEC_4000_series)
Beyond the inherent subjective aspects of that or the OP's initial
statement I certainly see "C" closer to the machine than many HLLs.
I see it as striving to distance itself from the machine as much as
possible!
Yes - as much as possible while retaining efficiency.
Certainly until C99 when stdint.h came along.
I would not draw that distinction - indeed, I see the opposite. Prior
to <stdint.h>, your integer type sizes were directly from the target
machine - with <stdint.h> explicitly sized integer types, they are now
independent of the target hardware.
C has always intended to be as independent from the machine as
practically possible without compromising efficiency. That's why it has
implementation-defined behaviour where it makes a significant difference
(such as the size of integer types), while giving full definitions of
things that can reasonably be widely portable while still being
efficient (and sometimes leaving things undefined to encourage
portability).
For example:
* Not committing to actual machine types, widths or representations,
such as a 'byte', or 'twos complement'.
(With C23, two's complement is the only allowed signed integer
representation. There comes a point where something is so dominant that
even C commits it to the standards.)
* Being vague about the relations between the different integer types
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
That one is more that no one had bothered standardising binary literals.
The people that wanted them, for the most part, are low-level embedded
programmers, and their tools already supported them. (And even then,
they are not much used in practice.) Printing in binary is not
something people often want - it is far too cumbersome for numbers, and
if you want to dump a view of some flag register then a custom function
with letters is vastly more useful.
C is standardised on binary - unsigned integer types would not work well
on a non-binary target.
* Not being allowed to do a dozen things that you KNOW are well-
defined on your target machine, but C says are UB.
That is certainly part of it. Things like signed integer arithmetic
overflow are UB at least partly because C models mathematical integer
arithmetic. It does not attempt to mimic the underlying hardware. This
is clearly "high level language" territory - C defines the behaviour of
an abstract machine in terms of mathematics. It is not an "assembler"
that defines operations in terms of hardware instructions.
It certainly depends on where one is coming from; from an abstract
or user-application level or from the machine level.
There was often mentioned here - very much to the displeasure of the
audience - that there's a lot of effort necessary to implement simple
concepts. To jump on that bandwagon: how would, say, Awk's array
construct  map[key] = value  have to be modeled in (native) "C"?
(Note that this simple statement represents an associative array.)
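To illustrate how much machinery hides behind Awk's one-liner, here is a minimal chained hash table mapping string keys to int values. All names (map_set, map_get, NBUCKETS) are invented for this sketch; a real table would also grow, free entries, and handle allocation failure.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64   /* fixed size for the sketch only */

struct entry { char *key; int value; struct entry *next; };
struct map   { struct entry *buckets[NBUCKETS]; };

static unsigned hash(const char *s) {           /* djb2, reduced */
    unsigned h = 5381;
    while (*s) h = h * 33u + (unsigned char)*s++;
    return h % NBUCKETS;
}

static char *dup_str(const char *s) {           /* strict-ISO strdup */
    char *p = malloc(strlen(s) + 1);
    if (p) strcpy(p, s);
    return p;
}

/* The Awk statement  map[key] = value  becomes: */
static void map_set(struct map *m, const char *key, int value) {
    unsigned h = hash(key);
    for (struct entry *e = m->buckets[h]; e; e = e->next)
        if (strcmp(e->key, key) == 0) { e->value = value; return; }
    struct entry *e = malloc(sizeof *e);
    e->key = dup_str(key);
    e->value = value;
    e->next = m->buckets[h];
    m->buckets[h] = e;
}

static int map_get(const struct map *m, const char *key, int missing) {
    for (struct entry *e = m->buckets[hash(key)]; e; e = e->next)
        if (strcmp(e->key, key) == 0) return e->value;
    return missing;
}
```

Usage: `struct map m = {0}; map_set(&m, "x", 1);` and then `map_get(&m, "x", -1)` returns 1 - roughly forty lines of C for what Awk spells in one.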
"C" is abstracting from the machine. And the OP's initial statement
"C and assembly are essentially the same" may be nonsense
Actually, describing C as 'portable assembly' annoys me which is why I
went into some detail.
Indeed.
C is defined in terms of an abstract machine, not hardware. And the C
source code running on this abstract machine only needs to match up with
the actual binary code on the real target machine in very specific and
limited ways - the "observable behaviour" of the program. That's
basically start, stop, volatile accesses and IO. Everything else
follows the "as if" rule - the compiler needs to generate target code
that works (for observable behaviour) "as if" it had done a direct,
naïve translation of the source.
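A small C sketch of that distinction: plain stores fall under the "as if" rule and may be merged or dropped, while volatile accesses are observable behaviour and must all be emitted. The names here are invented for the example.

```c
/* Under the "as if" rule only observable behaviour is preserved.
   The two stores to 'plain' may be merged into one (or removed if the
   value is never read); the two stores to 'mmio' are volatile
   accesses, hence observable behaviour, and must both be emitted. */
int plain;
volatile int mmio;   /* stand-in for a memory-mapped device register */

void poke_twice(void) {
    plain = 0;
    plain = 0;   /* a compiler may emit a single store here, or none */
    mmio = 0;
    mmio = 0;    /* must compile to two separate stores */
}
```

Comparing the generated code at -O2 for the two variables (e.g. in a disassembly) is an easy way to see the rule in action.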
As I understand the history - and certainly the practice - of the C
language, it is a language with two goals. One is that it should be
possible to write highly portable C code that can be used on a very wide
range of target systems while remaining efficient. The other is that it
should be useable for a lot of target-specific system code.
C has never been, and was never intended to be, a "portable assembly".
It was designed to reduce the need to write assembly code. There is a
huge difference in these concepts.
On 4/15/2026 12:57 AM, David Brown wrote:
[...]
Use C to create the ASM, then GAS it... ;^)
Nope, I quit C (but I keep watching C, since part of C++ is C)
wij <wyniijj5@gmail.com> writes:
[...]
Keep the personal abuse to yourself.
On 15/04/2026 13:21, wij wrote:
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
On 15/04/2026 07:05, wij wrote:
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
   int a;
   char b;
   a=b;   // allow auto promotion
   while(a<b) {
     a+=1;
   }
You've claimed that that's assembly language. What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my
idea.
So you've invented an 'assembly' syntax that looks exactly like C, in
order to support your notion that C and assembly are really the same
thing!
Exactly. But not really 'invented'. I figured if anyone wants to
implement a 'portable assembly', he would find it not much different
from C (from the example shown, 'structured C'). So, in a sense, not
worth implementing.
Real assembly generally uses explicit instructions and labels rather
than the implicit ones used here. It would also have limits on the
complexity of expressions. If your pseudo-assembler supports:
    a = b+c*f(x,y);
then you've invented a HLL.
You may say that. A high level language can dump code for a lower
level one and vice versa.
It sounds like you don't understand the difference between a low-level
language and a high-level one.
On 4/15/2026 4:30 PM, wij wrote:
[...]
Skill. Treat assembly as a chunk. Document it well.
Well crafted asm is not bad. Only used when needed! simple... :^)
I found some of my old asm on the Wayback Machine:
https://web.archive.org/web/20060214112345/http://appcore.home.comcast.net/appcore/src/cpu/i686/ac_i686_gcc_asm.html
On 4/15/2026 11:37 PM, Chris M. Thomasson wrote:
[...]
I found some of my old asm on the Wayback Machine:
https://web.archive.org/web/20060214112345/http://appcore.home.comcast.net/appcore/src/cpu/i686/ac_i686_gcc_asm.html
2005, damn time goes on bye, bye... ;^o
David Brown <david.brown@hesbynett.no> writes:
On 15/04/2026 01:33, Bart wrote:[...]
Certainly until C99 when stdint.h came along.
I would not draw that distinction - indeed, I see the opposite. Prior
to <stdint.h>, your integer type sizes were directly from the target
machine - with <stdint.h> explicitly sized integer types, they are now
independent of the target hardware.
A minor quibble: The sizes of the predefined integer types have
always been determined by the compiler, often mandated by an ABI
for the target platform. The choice is *influenced* by the target
hardware, but not controlled by it. For example, the width of
`long` on x86_64 is likely to be 32 bits on Windows, 64 bits on
other platforms.
David Brown <david.brown@hesbynett.no> wrote:
On 15/04/2026 18:58, wij wrote:
Assembly compiler (or language) can also do the same optimization.
No, assemblers cannot do that - if they did, they would not be
assemblers. An assembler directly translates your instructions from
mnemonic codes (assembly instructions) to binary opcodes. Some
assemblers might have pseudo-instructions that translate to more than
one binary opcode, but always in a specific defined pattern.
Well, an assembler as a program is not a compiler. But people talk
about "assembly language", and you can have a compiler that
takes assembly language as input. This was done by DEC
for VAX assembly. A guy created compilers for 360 assembly,
one targeting 386, another one targeting Java. Such compilers,
to be useful, should do the same optimizations.
Bart <bc@freeuk.com> writes:
[...]
Assembly is a great thing to know. It makes it easier to know what's
going on under the hood of higher level languages, and can even help in
troubleshooting and reasoning about how to make your code more efficient.
Do I think that learning assembly is an asset? Absolutely.
Do I think it's something that a project should be written in directly?
In most cases, absolutely not.
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
On 15.04.2026 at 22:01, Kerr-Mudd, John wrote:
On Wed, 15 Apr 2026 20:23:52 +0200
Jacek Marcin Jaworski <jmj@energokod.gda.pl> wrote:
On 15.04.2026 at 15:40, makendo wrote:
DYM comp.lang.asm.x86?
(forwarding to alt.lang.asm because you are comparing C with it)
Great, but what is wrong with comp.lang.asm? I subscribe to it instead of any
other asm-related groups. Is this the wrong approach?
No!
comp.lang.asm is an empty header for me on eternal september's feed.
After question "is comp.lang.asm moderated?" ecosia.org AI answer today, quote:
The newsgroup comp.lang.asm is generally considered an unmoderated Usenet group. Unlike comp.lang.asm.x86, which is known to be moderated, comp.lang.asm does not have an official moderation process and typically allows posts to appear without prior review.
I see old posts published on comp.lang.asm, and last is yours: "Kenny
Code for DOS", from 2023-04-24, mon. (without any answers).
sig is overlong, and crowded, IMHO.
I have so many things to communicate to Poles - this is the reason for the big
sig. But I try to be laconic.
--
Jacek Marcin Jaworski, Pruszcz Gd., Pomorskie voivodeship, Poland, EU;
tel.: +48-609-170-742, best between 5:00-5:55 or 16:00-17:25; <jmj@energokod.gda.pl>, gpg: 4A541AA7A6E872318B85D7F6A651CC39244B0BFA;
Home WWW page: <https://energokod.gda.pl>;
Mini Netiquette: <https://energokod.gda.pl/MiniNetykieta.html>; Email Self-Defence: <https://emailselfdefense.fsf.org/pl>.
NOTE:
DO NOT TAKE ON "HIDDEN DEBT"! PAY FOR FOSS SOFTWARE AND ONLINE INFORMATION!
READ FOR FREE: "17. Raport Totaliztyczny - Patroni Kontra Bankierzy": <https://energokod.gda.pl/raporty-totaliztyczne/17.%20Patroni%20Kontra%20Bankierzy.pdf>
wij <wyniijj5@gmail.com> writes:
On Wed, 2026-04-15 at 17:14 -0700, Tim Rentsch wrote:
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
Gentlemen,
I understand the natural reaction to want to respond to the kind of
statements being made in this thread. I hope y'all can resist this
natural reaction and not respond to people who persist in making
arguments that are basically isomorphic to saying 1 equals 0.
Thank you for your assistance in this matter.
Maybe you are right. I say A is-a B; one persists in reading A is (exactly) B.
I provide help with using assembly. One persists in reading it as persuading
people to use assembly and give up HLLs. What is going on here?
You say that C is an assembly language. Nobody here thinks that
you're *equating* C and assembly language. It's obvious that
there are plenty of assembly languages that are not C, and nobody
has said otherwise. I have no idea why you think anyone has that
particular confusion.
At least one person has apparently interpreted your defense of
assembly language (that it isn't as scary as some think it is)
as a claim that we should program in assembly language rather
than in HLLs. You're right, that was a misinterpretation of what
you wrote. I considered mentioning that, but didn't bother.
The issue I've been discussing is your claim that C is an assembly
language. It is not.
I do not intend to post again in this thread until and unless you
post something substantive on that issue.
Well crafted asm is not bad. Only used when needed! simple... :^)
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
Gentlemen,
I understand the natural reaction to want to respond to the kind of
statements being made in this thread. I hope y'all can resist this
natural reaction and not respond to people who persist in making
arguments that are basically isomorphic to saying 1 equals 0.
Thank you for your assistance in this matter.
It seems you insist C and assembly have to be exactly what your bible says.
If so, I would say that what the C standard (I cannot read it) says is the
meaning of the terminology of terms in it, not intended to be anything used
in any other situation.
I do not intend to post again in this thread until and unless you
post something substantive on that issue.
On Wed, 2026-04-15 at 19:04 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
On Wed, 2026-04-15 at 17:14 -0700, Tim Rentsch wrote:
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
Gentlemen,
I understand the natural reaction to want to respond to the kind of
statements being made in this thread. I hope y'all can resist this
natural reaction and not respond to people who persist in making
arguments that are basically isomorphic to saying 1 equals 0.
Thank you for your assistance in this matter.
Maybe you are right. I say A is-a B; one persists in reading A is (exactly) B.
I provide help with using assembly. One persists in reading it as persuading
people to use assembly and give up HLLs. What is going on here?
You say that C is an assembly language. Nobody here thinks that
you're *equating* C and assembly language. It's obvious that
there are plenty of assembly languages that are not C, and nobody
has said otherwise. I have no idea why you think anyone has that
particular confusion.
At least one person has apparently interpreted your defense of
assembly language (that it isn't as scary as some think it is)
as a claim that we should program in assembly language rather
than in HLLs. You're right, that was a misinterpretation of what
you wrote. I considered mentioning that, but didn't bother.
The issue I've been discussing is your claim that C is an assembly
language. It is not.
If I said C is assembly, it is in the sense that I have at least shown in the
last post (s_tut2.cpp), where even an 'instruction' can be any function (e.g.
change directory, copy files, launch an editor, ...). And also, what
'computation' is is demonstrated, which includes a suggestion of what C is,
essentially any program, and in this sense what an HLL is. Finally, it could
demonstrate the meaning of and testify to the Church-Turing thesis (my words:
no computation language, including various kinds of math formulas, can exceed
the expressive power of a TM).
It seems you insist C and assembly have to be exactly what your bible says. If
so, I would say that what the C standard (I cannot read it) says is the meaning
of the terminology of terms in it, not intended to be anything used in any
other situation.
I do not intend to post again in this thread until and unless you
post something substantive on that issue.
In article <1e4ef965d5ee27013e0abfd3c5dc18831400ad5f.camel@gmail.com>,
wij <wyniijj5@gmail.com> wrote:
...
It seems you insist C and assembly have to be exactly what your bible says.
If so, I would say what the C standard (I cannot read it) says is the meaning
of the terminology of terms in it, not intended to be anything used in any
other situation.
Keith is the king of this newsgroup. What he says, goes.
The way he defines words is the law, and all must fall in line with that.
You're new around here, so you are probably not familiar with these rules,
but you will be soon (if you choose to stick around).
Kind Keith then stated:
I do not intend to post again in this thread until and unless you
post something substantive on that issue.
For which we are all grateful.
David Brown <david.brown@hesbynett.no> wrote:
On 15/04/2026 18:58, wij wrote:
Assembly compiler (or language) can also do the same optimization.
No, assemblers cannot do that - if they did, they would not be
assemblers. An assembler directly translates your instructions from
mnemonic codes (assembly instructions) to binary opcodes. Some
assemblers might have pseudo-instructions that translate to more than
one binary opcode, but always in a specific defined pattern.
Well, as a program assembler is not a compiler. But people talk
about "assembly language" and you can have a compiler that
takes assembly language as an input. This was done by DEC
for VAX assembly. A guy created compilers for 360 assembly,
one targeting 386, another one targeting Java. Such compilers
to be useful should do the same optimizations.
On Thu, 2026-04-16 at 13:10 +0000, Kenny McCormack wrote:
In article <1e4ef965d5ee27013e0abfd3c5dc18831400ad5f.camel@gmail.com>,
wijÿ <wyniijj5@gmail.com> wrote:
...
It seems you insist C and assembly have to be exactly what your bible says.
If so, I would say what the C standard (I cannot read it) says is the meaning
of the terminology of terms in it, not intended to be anything used in any
other situation.
Keith is the king of this newsgroup. What he says, goes.
The way he defines words is the law, and all must fall in line with that.
Forget about that; facts first. There are LLMs. This is not a court.
As I know, comp.lang.c should be a forum for more general topics than
lang.c.mod, comp.lang.c.std. And, refrain from telling others what they
should do; you are just another participant.
antispam@fricas.org (Waldek Hebisch) writes:
David Brown <david.brown@hesbynett.no> wrote:
On 15/04/2026 18:58, wij wrote:
Assembly compiler (or language) can also do the same optimization.
No, assemblers cannot do that - if they did, they would not be
assemblers. An assembler directly translates your instructions from
mnemonic codes (assembly instructions) to binary opcodes. Some
assemblers might have pseudo-instructions that translate to more than
one binary opcode, but always in a specific defined pattern.
Well, as a program assembler is not a compiler. But people talk
about "assembly language" and you can have a compiler that
takes assembly language as an input. This was done by DEC
for VAX assembly. A guy created compilers for 360 assembly,
one targeting 386, another one targeting Java. Such compilers
to be useful should do the same optimizations.
The C compiler in the GNU Compiler Collection provides
a mechanism to 'take assembly language as an input'
in the form of in-line assembler fragments. It's
useful in some limited cases (machine-level software like
kernels, boot loaders and the like).
antispam@fricas.org (Waldek Hebisch) writes:
David Brown <david.brown@hesbynett.no> wrote:
On 15/04/2026 18:58, wij wrote:
Assembly compiler (or language) can also do the same optimization.
No, assemblers cannot do that - if they did, they would not be
assemblers. An assembler directly translates your instructions from
mnemonic codes (assembly instructions) to binary opcodes. Some
assemblers might have pseudo-instructions that translate to more than
one binary opcode, but always in a specific defined pattern.
Well, as a program assembler is not a compiler. But people talk
about "assembly language" and you can have a compiler that
takes assembly language as an input. This was done by DEC
for VAX assembly. A guy created compilers for 360 assembly,
one targeting 386, another one targeting Java. Such compilers
to be useful should do the same optimizations.
The C compiler in the GNU Compiler Collection provides
a mechanism to 'take assembly language as an input'
in the form of in-line assembler fragments. It's
useful in some limited cases (machine-level software like
kernels, boot loaders and the like).
The Burroughs Large systems (B5500 and descendants) have
never had an assembler; all code is written in a flavor
of Algol (with special syntax extensions required for
the MCP and other privileged applications).
The Burroughs Medium systems COBOL68 compiler supported
the 'ENTER SYMBOLIC' statement, which was followed by
in-line assembler until the LEAVE SYMBOLIC statement.
[snip]
FWIW, I believe that the origins of C had much the same
philosophy: write parts in suitable languages, and link
them together prior to execution.
K&R C had no reason
to support inline assembly and, as far as I have read,
the authors studiously avoided that capability.
On 2026-04-16 17:11, Lew Pitcher wrote:
[snip]
FWIW, I believe that the origins of C had much the same
philosophy: write parts in suitable languages, and link
them together prior to execution.
But was that an outcome of the C-language design, or of
the UNIX operating system concepts with its languages,
toolbox, and linking-editor?
There also seems to have been an asymmetry here with "C",
at least evolving later...
From what I observed, "C" had reached a status to not be
"inter pares". As a comparably low-level language it had
been often used for other languages as the compile-output
to be then handled by any C-compiler. Also HLLs supported
interfaces to access (primarily) "C" modules because of
their (much better) performance and the typically easier
access to system resources.
K&R C had no reason
to support inline assembly and, as far as I have read,
the authors studiously avoided that capability.
Nonetheless it supported the reserved word 'asm' (as I can
read in my old translation of K&R). (Not exactly what I'd
call "studiously avoided".)
Janis
On Thu, 16 Apr 2026 14:38:06 +0000, Scott Lurndal wrote:
antispam@fricas.org (Waldek Hebisch) writes:
David Brown <david.brown@hesbynett.no> wrote:
On 15/04/2026 18:58, wij wrote:
Assembly compiler (or language) can also do the same optimization.
No, assemblers cannot do that - if they did, they would not be
assemblers. An assembler directly translates your instructions from
mnemonic codes (assembly instructions) to binary opcodes. Some
assemblers might have pseudo-instructions that translate to more than
one binary opcode, but always in a specific defined pattern.
Well, as a program assembler is not a compiler. But people talk
about "assembly language" and you can have a compiler that
takes assembly language as an input. This was done by DEC
for VAX assembly. A guy created compilers for 360 assembly,
one targeting 386, another one targeting Java. Such compilers
to be useful should do the same optimizations.
The C compiler in the GNU Compiler Collection provides
a mechanism to 'take assembly language as an input'
in the form of in-line assembler fragments. It's
useful in some limited cases (machine-level software like
kernels, boot loaders and the like).
I believe that the authors of GNU C latched on to an (at the
time) useful extension of the C language, originally implemented
in Ron Cain's "Small C Compiler for the 8080's" (Dr. Dobbs
Journal # 45, 1980) as the #asm/#endasm preprocessor directives.
Ron's K&R C subset compiler didn't compile to machine code;
instead, it compiled to CP/M 8080 assembler (CP/M came with
an 8080 assembler as its only language tool), and so a
source-code assembly "passthrough" was easily implemented.
The Burroughs Large systems (B5500 and descendants) have
never had an assembler; all code is written in a flavor
of Algol (with special syntax extensions required for
the MCP and other privileged applications).
The Burroughs Medium systems COBOL68 compiler supported
the 'ENTER SYMBOLIC' statement, which was followed by
in-line assembler until the LEAVE SYMBOLIC statement.
The IBM language environments that I worked in all
supported static (and later, dynamic) linkage, and my
employer could afford a suite of IBM language tools.
IBMs language tools shared a common object interface,
so it was (relatively) easy to write the Assembly
parts in Assembler, and the HLL parts in the appropriate
HLL (usually, for us, COBOL), and link them together
for execution.
Consequently, none of the high-level languages supported
an "assembly" escape (although COBOL provided extensions
for IBM DB2 relational database interaction).
On 2026-04-16 17:11, Lew Pitcher wrote:
[snip]
FWIW, I believe that the origins of C had much the same
philosophy: write parts in suitable languages, and link
them together prior to execution.
But was that an outcome of the C-language design, or of
the UNIX operating system concepts with its languages,
toolbox, and linking-editor?
On Thu, 16 Apr 2026 17:43:19 +0200, Janis Papanagnou wrote:
On 2026-04-16 17:11, Lew Pitcher wrote:
[snip]
FWIW, I believe that the origins of C had much the same
philosophy: write parts in suitable languages, and link
them together prior to execution.
But was that an outcome of the C-language design, or of
the UNIX operating system concepts with its languages,
toolbox, and linking-editor?
All of the above.
Linkage editors were (and still are) common technology,
as was separation of languages (assembler vs high level
language). Originally, Unix was written in assembler, and
(according to the histories) C was designed (with the existent
language tools in mind) to allow the Unix developers to use
a high-level language in their development. Remember, Bell
Labs wrote more than just Unix in C; C became the lingua-franca
for all the tools and applications, including the text management
tools (TROFF, EQN, SED, AWK, etc) and games (CHESS/CHECKERS/
BACKGAMMON)
I recall reading (but cannot find the reference now) that
Unix (V7 perhaps?) consisted of thousands of lines of C code,
and a few hundred lines of assembly for device drivers.
There also seems to have been an asymmetry here with "C",
at least evolving later...
From what I observed, "C" had reached a status to not be
"inter pares". As a comparably low-level language it had
been often used for other languages as the compile-output
to be then handled by any C-compiler. Also HLLs supported
interfaces to access (primarily) "C" modules because of
their (much better) performance and the typically easier
access to system resources.
K&R C had no reason
to support inline assembly and, as far as I have read,
the authors studiously avoided that capability.
Nonetheless it supported the reserved word 'asm' (as I can
read in my old translation of K&R). (Not exactly what I'd
call "studiously avoided".)
To quote K&R ("The C Programming Language" 1978)
from Appendix A ("C Reference Manual") section 2.3 ("Keywords")
"The 'entry' keyword is not currently implemented by
any compiler, but is reserved for future use. Some
implementations also reserve the words 'fortran' and 'asm'."
I note that, according to that appendix, C had been ported to
PDP 11, Honeywell 6000, IBM 360/370, and Interdata 8/32 systems
at that time, none of them running Unix, to my knowledge.
As
such, the language (at that time in a bit of a plastic state,
being supplied as source code to AT&T customers and educators
alike) may have been altered on a site-by-site basis to suit
the needs of each particular client. As the context of these
keywords was never explained, I find it easier to believe that
the intent for these keywords was as a storage modifier, and
not an inline language change. Something like
extern fortran int F1(); /* use fortran calling convention */
extern asm char *F2(); /* use assembly calling convention */
On 15/04/2026 01:33, Bart wrote:
...
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
The latest draft standard supports %b and %B formats.
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason why C
says the behavior is undefined, rather than requiring that such code be
rejected. Implementations are intended to take advantage of that fact
for code that does not need to be portable.
On 2026-04-16 08:37, Chris M. Thomasson wrote:
Well crafted asm is not bad. Only used when needed! simple... :^)
And in practice a throwaway-product once you change platform.
(I'm shuddering thinking about porting my decades old DSP asm
code to some other platform/CPU architecture.) But I've ported
or re-used old "C" code without much effort. This is a crucial
difference, especially in the light of the thread's theses.
On 16/04/2026 11:28, James Kuyper wrote:
On 15/04/2026 01:33, Bart wrote:
...
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
The latest draft standard supports %b and %B formats.
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason why C
says the behavior is undefined, rather than requiring that such code be
rejected. Implementations are intended to take advantage of that fact
for code that does not need to be portable.
Taking advantage in what way? Doing something entirely unexpected or
unintuitive?
/That's/ the problem!
Bart <bc@freeuk.com> writes:
On 16/04/2026 11:28, James Kuyper wrote:
On 15/04/2026 01:33, Bart wrote:
...
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
The latest draft standard supports %b and %B formats.
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason why C
says the behavior is undefined, rather than requiring that such code be
rejected. Implementations are intended to take advantage of that fact
for code that does not need to be portable.
Taking advantage in what way? Doing something entirely unexpected or
unintuitive?
How ridiculous! If you can figure out a way to take advantage of
unexpected behavior, I'd appreciate knowing what it is.
I was talking
about defining the behavior that the C standard itself leaves undefined,
in ways that make things convenient for the developer.
On 17/04/2026 01:26, James Kuyper wrote:
Bart <bc@freeuk.com> writes:
On 16/04/2026 11:28, James Kuyper wrote:
On 15/04/2026 01:33, Bart wrote:
...
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
The latest draft standard supports %b and %B formats.
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason why C
says the behavior is undefined, rather than requiring that such code be
rejected. Implementations are intended to take advantage of that fact
for code that does not need to be portable.
Taking advantage in what way? Doing something entirely unexpected or
unintuitive?
How ridiculous! If you can figure out a way to take advantage of
unexpected behavior, I'd appreciate knowing what it is.
It was you who mentioned taking advantage.
And by taking advantage, I assume you meant all the unpredictable things
that optimising compilers like to do, because they assume that UB cannot
happen.
Signed integer overflow is the one that everyone knows (though oddly it
is not listed in Appendix J.2, or if it is, it doesn't use the word
'overflow'!).
I think there are other obscure ones to do with the order you read and
write members of unions, or apply type-punning, or what you can do with
pointers.
A common scenario is where someone is implementing a language where such
things are well-defined, and they want to run it on a target machine
where they are also well-defined, but decide to use C as an intermediate
language.
Unfortunately C has other ideas! So this means somehow getting around
the UB in the C that is generated, or stipulating specific compilers or
compiler options.
Or just crossing your fingers and hoping the compiler will not be so crass.
Another scenario is where you're just writing C code and want that same
behaviour.
I was talking
about defining the behavior that the C standard itself leaves undefined,
in ways that make things convenient for the developer.
The developer of the C implementation, or the C application?
I don't often use intermediate C code now, but that code is no longer
portable among C compilers. It is for gcc only, and requires:
    -fno-strict-aliasing
I can't remember exactly why it's needed, but some programs won't work
without it.
(It's used with -O2, also necessary due to much redundancy in the C
code. Without the aliasing option, gcc will warn with: "dereferencing
type-punned pointer will break strict-aliasing rules")
Whatever it is, I don't need anything like that when bypassing C and
going straight to native code.
And you won't need it if writing real assembly.
On 17/04/2026 13:27, Bart wrote:
On 17/04/2026 01:26, James Kuyper wrote:
Bart <bc@freeuk.com> writes:
On 16/04/2026 11:28, James Kuyper wrote:
On 15/04/2026 01:33, Bart wrote:
...
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
The latest draft standard supports %b and %B formats.
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason
why C says the behavior is undefined, rather than requiring that
such code be rejected. Implementations are intended to take
advantage of that fact for code that does not need to be
portable.
Taking advantage in what way? Doing something entirely unexpected
or unintuitive?
How ridiculous! If you can figure out a way to take advantage of
unexpected behavior, I'd appreciate knowing what it is.
It was you who mentioned taking advantage.
And by taking advantage, I assume you meant all the unpredictable
things that optimising compilers like to do, because they assume
that UB cannot happen.
Signed integer overflow is the one that everyone knows (though
oddly it is not listed in Appendix J.2, or if it is, it doesn't use
the word 'overflow'!).
"An exceptional condition occurs during the evaluation of an
expression (6.5.1)"
You are correct that it does not use the word "overflow" - it's a bit
more generic than that.
I think there are other obscure ones to do with the order you read
and write members of unions, or apply type-punning, or what you can
do with pointers.
A common scenario is where someone is implementing a language where
such things are well-defined, and they want to run it on a target
machine where they are also well-defined, but decide to use C as an intermediate language.
That is an extraordinarily /uncommon/ scenario. I know it applies to
you, but you are not a typical C user in this respect.
People who want to use C as an intermediate language need to generate
code that is correct according to C semantics. It does not matter
how well the source language matches the target processor in its
behaviour if the C code in the middle has different ideas. (Indeed,
it does not matter what the target processor semantics are, except
for knowing the efficiency you can hope to achieve.) Thus if you
want wrapping signed integer arithmetic in your source language, you
must generate C code that emulates those semantics - such as by
having casts back and forth to unsigned types,
or using bigger types and then masking,
or writing non-portable code such as adding
"#pragma GCC optimize ("wrapv")" to the generated code.
Unfortunately C has other ideas! So this means somehow getting
around the UB in the C that is generated, or stipulating specific
compilers or compiler options.
Should C semantics be designed to suit millions of general C
developers over several generations, or should they be optimised to
suit a single developer of non-C languages who can't be bothered
adding some casts to his code generator? Hm, that's a difficult
trade-off question...
Or just crossing your fingers and hoping the compiler will not be
so crass.
Another scenerio is where you just writing C code and want that
same behaviour.
That's a great deal more common than the transpiler situation. But
it is still far rarer than many people think. In general, people
don't want their integer arithmetic to overflow - doing so is a bug,
no matter what the results.
I was talking
about defining the behavior that the C standard itself leaves
undefined, in ways that make things convenient for the developer.
The developer of the C implementation, or the C application?
I don't often use intermediate C code now, but that code is no
longer portable among C compilers. It is for gcc only, and requires:
    -fno-strict-aliasing
I recommend adding that as a pragma, not expecting people (yourself)
to remember it as a command-line option.
I can't remember exactly why it's needed, but some programs won't
work without it.
It is needed if you faff around with converting pointer types - lying
to your compiler by saying "this is a pointer to type A" when you are
setting it to the address of an object of type B. Such "tricks" can
be convenient sometimes, more convenient than semantically correct
methods (like unions or using memmove) so I can understand the
appeal. But you should understand clearly that your C code here is
non-portable and has undefined behaviour according to the C standard
- "gcc -fno-strict-aliasing" provides additional semantics that you
can rely on as long as you use that flag.
(It's used with -O2, also necessary due to much redundancy in the C
code. Without the aliasing option, gcc will warn with:
"dereferencing type-punned pointer will break strict-aliasing
rules")
gcc's warning here is slightly inaccurately worded, but very useful.
Whatever it is, I don't need anything like that when bypassing C
and going straight to native code.
And you won't need it if writing real assembly.
Sure. If you don't use C, you don't have to care about C semantics.
On 17/04/2026 13:27, Bart wrote:
Signed integer overflow is the one that everyone knows (though oddly
it is not listed in Appendix J.2, or if it is, it doesn't use the word
'overflow'!).
"An exceptional condition occurs during the evaluation of an expression (6.5.1)"
You are correct that it does not use the word "overflow" - it's a bit
more generic than that.
I think there are other obscure ones to do with the order you read and
write members of unions, or apply type-punning, or what you can do
with pointers.
A common scenario is where someone is implementing a language where
such things are well-defined, and they want to run it on a target
machine where they are also well-defined, but decide to use C as an
intermediate language.
That is an extraordinarily /uncommon/ scenario.
People who want to use C as an intermediate language need to generate
code that is correct according to C semantics.ÿ It does not matter how
well the source language matches the target processor in its behaviour
if the C code in the middle has different ideas.
Unfortunately C has other ideas! So this means somehow getting around
the UB in the C that is generated, or stipulating specific compilers
or compiler options.
Should C semantics be designed to suit millions of general C developers
over several generations, or should they be optimised to suit a single
developer of non-C languages who can't be bothered adding some casts to
his code generator? Hm, that's a difficult trade-off question...
Another scenario is where you are just writing C code and want that same
behaviour.
That's a great deal more common than the transpiler situation. But it
is still far rarer than many people think. In general, people don't
want their integer arithmetic to overflow - doing so is a bug, no matter
what the results.
I was talking
about defining the behavior that the C standard itself leaves undefined,
in ways that make things convenient for the developer.
The developer of the C implementation, or the C application?
I don't often use intermediate C code now, but that code is no longer
portable among C compilers. It is for gcc only, and requires:
    -fno-strict-aliasing
I recommend adding that as a pragma, not expecting people (yourself) to remember it as a command-line option.
It is needed if you faff around with converting pointer types - lying to your compiler by saying "this is a pointer to type A" when you are
setting it to the address of an object of type B.
On Fri, 17 Apr 2026 14:37:47 +0200
David Brown <david.brown@hesbynett.no> wrote:
On 17/04/2026 13:27, Bart wrote:
On 17/04/2026 01:26, James Kuyper wrote:
Bart <bc@freeuk.com> writes:
On 16/04/2026 11:28, James Kuyper wrote:
On 15/04/2026 01:33, Bart wrote:
...
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
The latest draft standard supports %b and %B formats.
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason
why C says the behavior is undefined, rather than requiring that
such code be rejected. Implementations are intended to take
advantage of that fact for code that does not need to be
portable.
Taking advantage in what way? Doing something entirely unexpected
or unintuitive?
How ridiculous! If you can figure out a way to take advantage of
unexpected behavior, I'd appreciate knowing what it is.
It was you who mentioned taking advantage.
And by taking advantage, I assume you meant all the unpredictable
things that optimising compilers like to do, because they assume
that UB cannot happen.
Signed integer overflow is the one that everyone knows (though
oddly it is not listed in Appendix J.2, or if it is, it doesn't use
the word 'overflow'!).
"An exceptional condition occurs during the evaluation of an
expression (6.5.1)"
You are correct that it does not use the word "overflow" - it's a bit
more generic than that.
I think there are other obscure ones to do with the order you read
and write members of unions, or apply type-punning, or what you can
do with pointers.
A common scenario is where someone is implementing a language where
such things are well-defined, and they want to run it on a target
machine where they are also well-defined, but decide to use C as an
intermediate language.
That is an extraordinarily /uncommon/ scenario. I know it applies to
you, but you are not a typical C user in this respect.
People who want to use C as an intermediate language need to generate
code that is correct according to C semantics. It does not matter
how well the source language matches the target processor in its
behaviour if the C code in the middle has different ideas. (Indeed,
it does not matter what the target processor semantics are, except
for knowing the efficiency you can hope to achieve.) Thus if you
want wrapping signed integer arithmetic in your source language, you
must generate C code that emulates those semantics - such as by
having casts back and forth to unsigned types,
That would, indeed, avoid undefined behavior, but it leaves you in the
realm of implementation-defined behavior (6.3.1.3.3).
6.3.1.3 Signed and unsigned integers
1 When a value with integer type is converted to another integer type
other than _Bool, if the value can be represented by the new type, it
is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
3 Otherwise, the new type is signed and the value cannot be represented
in it; either the result is implementation-defined or an
implementation-defined signal is raised.
or using bigger types and then masking,
Which can be problematic when dealing with the widest integer type.
Besides, it's still implementation-defined (the same 6.3.1.3 p3 applies),
unless the generated code is *very* elaborate.
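For what it's worth, the implementation-defined step in the "casts back and forth" approach can itself be avoided with an explicit range adjustment. A sketch (the function name is mine, not from the thread):

```c
#include <stdint.h>

/* Wrapping 32-bit signed addition with no UB and no reliance on the
   implementation-defined unsigned->signed conversion of 6.3.1.3 p3:
   do the arithmetic in unsigned (defined modulo 2^32), then map the
   result into the signed range explicitly. */
static int32_t add_wrap_i32(int32_t a, int32_t b)
{
    uint32_t us = (uint32_t)a + (uint32_t)b;  /* wraps, well-defined */
    if (us <= (uint32_t)INT32_MAX)
        return (int32_t)us;                   /* in range: unchanged */
    /* us is in [2^31, 2^32-1]: subtract 2^31 (now representable),
       then re-add it as a signed offset; no step overflows. */
    return (int32_t)(us - 0x80000000u) + INT32_MIN;
}
```

Every conversion here falls under 6.3.1.3 p1 or p2, both fully defined, so the generated code would be portable to any conforming compiler at the cost of one extra branch (which optimisers typically fold away on two's-complement targets).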
On 17/04/2026 13:37, David Brown wrote:
On 17/04/2026 13:27, Bart wrote:
Signed integer overflow is the one that everyone knows (though oddly
it is not listed in Appendix J.2, or if it is, it doesn't use the
word 'overflow'!).
"An exceptional condition occurs during the evaluation of an
expression (6.5.1)"
You are correct that it does not use the word "overflow" - it's a bit
more generic than that.
I think there are other obscure ones to do with the order you read
and write members of unions, or apply type-punning, or what you can
do with pointers.
A common scenario is where someone is implementing a language where
such things are well-defined, and they want to run it on a target
machine where they are also well-defined, but decide to use C as an
intermediate language.
That is an extraordinarily /uncommon/ scenario.
Lots of languages do this. A few that you may have heard of include
Haxe, Seed7, Nim, FreeBasic and Haskell. Although with some it will be
an option.
Even early C++ did so, but there it had mainly C semantics anyway.
People who want to use C as an intermediate language need to generate
code that is correct according to C semantics. It does not matter how
well the source language matches the target processor in its behaviour
if the C code in the middle has different ideas.
Well, this is the problem.
But the thread is about C being equated to assembly, and this is one of
the differences.
Some UBs are reasonable, others are not because the
behaviour is poorly defined on some rare or obsolete hardware, but would
be fine on virtually anything someone is likely to use.
Unfortunately C has other ideas! So this means somehow getting around
the UB in the C that is generated, or stipulating specific compilers
or compiler options.
Should C semantics be designed to suit millions of general C
developers over several generations, or should they be optimised to
suit a single developer of non-C languages who can't be bothered
adding some casts to his code generator? Hm, that's a difficult
trade-off question...
My generated C now is full of casts. It doesn't help much, partly
because the casts are designed to match the C to the semantics of the
source language (here it is typed IL code transpiled to C), rather than fixing the problems of C.
Example (extract from a larger output; module name is 'h'):
 #define asi64(x) *(i64*)&x
 #define tou64(x) (u64)x
 extern i32 printf(u64 $1, ...);
 extern void exit(i32);
 void h_main();
 int main(int nargs, char** args) {
     h_main();
 }
 void h_main() {
     u64 R1, R2;
     i64 a;
     i64 b;
     i64 c;
     asi64(R1) = b;
     asi64(R2) = c;
     asi64(R1) += asi64(R2);
     a = asi64(R1);
     asi64(R1) = a;
     R2 = tou64("hello %lld\n");
     printf(asu64(R2), asi64(R1));
     R1 = 0;
     exit(R1);
     return;
 }
Another scenario is where you are just writing C code and want that same
behaviour.
That's a great deal more common than the transpiler situation. But it
is still far rarer than many people think. In general, people don't
want their integer arithmetic to overflow - doing so is a bug, no
matter what the results.
They want to do arbitrary conversions and type-punning. They want to use unions in whatever way they like without worrying that it may or may not
be UB.
I was talking
about defining the behavior that the C standard itself leaves
undefined,
in ways that make things convenient for the developer.
The developer of the C implementation, or the C application?
I don't often use intermediate C code now, but that code is no longer
portable among C compilers. It is for gcc only, and requires:
    -fno-strict-aliasing
I recommend adding that as a pragma, not expecting people (yourself)
to remember it as a command-line option.
I've tried pragmas before, but they only worked on gcc/Windows; they
seemed to be ignored on gcc/Linux. For example for '-fno-builtin'. But
maybe I'll try it again.
Still, the entire build process for any of my programs, when expressed
as C, is still one command line involving one source file.
It is needed if you faff around with converting pointer types - lying
to your compiler by saying "this is a pointer to type A" when you are
setting it to the address of an object of type B.
Why would I be lying if I clearly use a cast to change T* (or some
integer X) to U*?
On 17/04/2026 15:49, Bart wrote:
On 17/04/2026 13:37, David Brown wrote:
On 17/04/2026 13:27, Bart wrote:
Signed integer overflow is the one that everyone knows (though oddly
it is not listed in Appendix J.2, or if it is, it doesn't use the
word 'overflow'!).
"An exceptional condition occurs during the evaluation of an
expression (6.5.1)"
You are correct that it does not use the word "overflow" - it's a bit
more generic than that.
I think there are other obscure ones to do with the order you read
and write members of unions, or apply type-punning, or what you can
do with pointers.
A common scenario is where someone is implementing a language where
such things are well-defined, and they want to run it on a target
machine where they are also well-defined, but decide to use C as an
intermediate language.
That is an extraordinarily /uncommon/ scenario.
Lots of languages do this. A few that you may have heard of include
Haxe, Seed7, Nim, FreeBasic and Haskell. Although with some it will be
an option.
Do all these languages support type-punning, unions, and signed integer
arithmetic overflow defined in the way you think? I know Haskell does
not, I don't imagine FreeBasic does, but I can't answer for the others.
Well, this is the problem.
If you feel it is a problem for /you/, then I can't argue against that -
but it is /your/ problem.
  #define asi64(x) *(i64*)&x
This is an extremely bad way to do conversions. It is possibly the
reason you need the "-fno-strict-aliasing" flag. Prefer to use value
casts, not pointer casts, as you do with "tou64".
  extern i32 printf(u64 $1, ...);
  extern void exit(i32);
Why would you declare these standard library functions like that? Using
"printf" will be UB, as the declaration does not match the definition.
It might happen to work on x86, but some platform ABIs pass pointers and
integers in different registers.
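The mismatch can be avoided while keeping the u64-slot convention: declare printf with its real prototype and convert the slots back to the declared parameter types at the call site. A sketch in the style of Bart's example (the function and slot names are mine, and snprintf stands in for printf so the output can be checked):

```c
#include <stdint.h>
#include <stdio.h>

typedef uint64_t u64;
typedef int64_t  i64;

/* Generated-style call: values live in generic u64 slots, but are
   converted back to the real parameter types at the call, so the call
   matches the library prototype on any ABI (pointer and integer
   arguments end up in the right registers). */
static int emit_line(char *out, size_t n)
{
    u64 R1 = (u64)(i64)42;                    /* integer in a slot */
    u64 R2 = (u64)(uintptr_t)"hello %lld\n";  /* pointer stored as u64 */
    return snprintf(out, n, (const char *)(uintptr_t)R2,
                    (long long)(i64)R1);
}
```

The round-trip through uintptr_t for the pointer is the conversion C actually defines for storing addresses in integers; the call itself is then ordinary and well-defined.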
And you are doing all this with uninitialised variables, which is UB.
It's not C that's the problem here. If you see problems, it is because
you are pretending that C is something that it is not, and that you can
write all sorts of risky nonsense.
Why would I be lying if I clearly use a cast to change T* (or some
integer X) to U*?
C requires you to access objects using lvalues of the appropriate type.
But it also allows conversions between various pointer types - that is
how you can have generic and flexible code (such as using malloc returns
for different types). So if you have a pointer "p" of type "T*", and
you write "(U*) p", you are telling the compiler "I know I said p was a
pointer to objects of type T, but in this particular case it is actually
pointing to an object of type U - the value contained in p started off
as the address of a U, before it was converted to a T*". If the thing
your pointer "p" points to is /not/ an object of type U (or other
suitable type, following the compatibility and qualifier rules), then
you are lying to the compiler.
I've no idea. You just said it was uncommon to use C in this way. But
every other amateur compiler project on Reddit forums likes to use a C
target.
On 17/04/2026 01:26, James Kuyper wrote:
Bart <bc@freeuk.com> writes:
On 16/04/2026 11:28, James Kuyper wrote:
On 15/04/2026 01:33, Bart wrote:
...
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
The latest draft standard supports %b and %B formats.
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason why C
says the behavior is undefined, rather than requiring that such code be
rejected. Implementations are intended to take advantage of that fact
for code that does not need to be portable.
Taking advantage in what way? Doing something entirely unexpected or
unintuitive?
How ridiculous! If you can figure out a way to take advantage of
unexpected behavior, I'd appreciate knowing what it is.
It was you who mentioned taking advantage.
And by taking advantage, I assume you meant all the unpredictable
things that optimising compilers like to do, because they assume that
UB cannot happen.
On 17/04/2026 15:45, David Brown wrote:
On 17/04/2026 15:49, Bart wrote:
On 17/04/2026 13:37, David Brown wrote:
On 17/04/2026 13:27, Bart wrote:
Signed integer overflow is the one that everyone knows (though oddly
it is not listed in Appendix J.2, or if it is, it doesn't use the
word 'overflow'!).
"An exceptional condition occurs during the evaluation of an
expression (6.5.1)"
You are correct that it does not use the word "overflow" - it's a bit
more generic than that.
I think there are other obscure ones to do with the order you read
and write members of unions, or apply type-punning, or what you can
do with pointers.
A common scenario is where someone is implementing a language where
such things are well-defined, and they want to run it on a target
machine where they are also well-defined, but decide to use C as an
intermediate language.
That is an extraordinarily /uncommon/ scenario.
Lots of languages do this. A few that you may have heard of include
Haxe, Seed7, Nim, FreeBasic and Haskell. Although with some it will be
an option.
Do all these languages support type-punning, unions, and signed integer
arithmetic overflow defined in the way you think? I know Haskell does
not, I don't imagine FreeBasic does, but I can't answer for the others.
I've no idea. You just said it was uncommon to use C in this way. But
every other amateur compiler project on Reddit forums likes to use a C
target.
Well, this is the problem.
If you feel it is a problem for /you/, then I can't argue against that -
but it is /your/ problem.
It is a problem when using C for this purpose, which wouldn't arise
using a language designed to be used as an intermediate target.
  #define asi64(x) *(i64*)&x
This is an extremely bad way to do conversions. It is possibly the
reason you need the "-fno-strict-aliasing" flag. Prefer to use value
casts, not pointer casts, as you do with "tou64".
For this purpose, the C has to emulate a stack machine, with the stack
slots being a fixed type (u64) which have to contain signed and unsigned
integers, floats or doubles, or any kinds of pointer, or even any
arbitrary struct or array, by value.
One option was to use a union type for each stack element, but I decided
my choice would give cleaner code.
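The union option David alludes to could look like this (a sketch; the type and function names are mine, not from the generated code):

```c
#include <stdint.h>

/* One stack slot that can hold any 64-bit scalar kind by value.
   Reading the member last written is always defined; reading a
   different member reinterprets the stored bytes, which C99 TC3 and
   later explicitly permit for unions (6.5.2.3 and its type-punning
   footnote), so no -fno-strict-aliasing is needed. */
typedef union {
    int64_t  i;
    uint64_t u;
    double   d;
    void    *p;
} Slot;

/* Interpret both slots as signed and add them.  (Note the addition
   itself can still overflow; that is a separate issue.) */
static int64_t add_slots(Slot a, Slot b)
{
    return a.i + b.i;
}
```

Compared with the `*(i64*)&x` macro, a compiler is obliged to track that all members of a Slot share storage, so the aliasing question never arises.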
  extern i32 printf(u64 $1, ...);
  extern void exit(i32);
Why would you declare these standard library functions like that? Using
"printf" will be UB, as the declaration does not match the definition.
On most 64-bit machines these days you have float and non-float register
banks. Pointers are non-floats so can be handled like ints.
In the IL that this C comes from, there are no pointer types. The
convention is to use 'u64' to represent addresses.
It might happen to work on x86, but some platform ABIs pass pointers and
integers in different registers.
The only one I know off-hand is 68K, which has separate data and address
registers, and that might happen, so I'll keep it in mind!
(There is a slim chance I can target 68K from my IL, via an emulator
that I would make, but likely I wouldn't be able to use a C library
anyway. It's funny I remember thinking around 1984 that those dual
register files would make it tricky to compile for.)
And you are doing all this with uninitialised variables, which is UB.
So this is something else. There should be no problem with using
uninitialised data here, other than not being meaningful or useful.
They're uninitialised because my test program didn't bother to do so.
But running the program shouldn't be a problem. Why, what do you think C
might do that is so bad?
Here is the original HLL fragment:
    int a, b, c
    a := b + c
This is the portable IL generated from that:
    i64 x
    i64 y
    i64 z
    load      y
    load      z
    add
    store     x
This uses two stack slots. In the C above, those slots are called R1 and
R2.
The same IL can be turned directly into x64 code:
    # D0-D15 are 64-bit regs
    R.a = D3
    R.b = D4
    R.c = D5
    mov  D0,  R.b
    add  D0,  R.c
    mov  R.a, D0
(This could be reduced to one instruction, but it's no faster.)
AFAIK no hardware exceptions are caused by adding whatever bit patterns
happen to be in those 'b' and 'c' registers.
It's not C that's the problem here. If you see problems, it is because
you are pretending that C is something that it is not, and that you can
write all sorts of risky nonsense.
I use generated C for three things:
* To share my non-C programs with others, who can't/won't use my
compiler binary
* To optimise my non-C programs
* To run my non-C programs on platforms I don't directly support.
When I do that, it seems to work. Eg. Paul Edwards is using my C
compiler, and I distribute it as a 66Kloc file full of code like my
example. But I can also distribute it as NASM, AT&T or MASM assembly.
Why would I be lying if I clearly use a cast to change T* (or some
integer X) to U*?
C requires you to access objects using lvalues of the appropriate type.
But it also allows conversions between various pointer types - that is
how you can have generic and flexible code (such as using malloc returns
for different types). So if you have a pointer "p" of type "T*", and
you write "(U*) p", you are telling the compiler "I know I said p was a
pointer to objects of type T, but in this particular case it is actually
pointing to an object of type U - the value contained in p started off
as the address of a U, before it was converted to a T*". If the thing
your pointer "p" points to is /not/ an object of type U (or other
suitable type, following the compatibility and qualifier rules), then
you are lying to the compiler.
I don't care about any of this. If I take a byte* pointer and cast it to
int* and then write via that version, I expect it to do exactly that,
and not question my choice!
This is exactly how it works in assembly, in my HLLs, and in my ILs.
It's possible that I may have done that erroneously, but that is another
matter. This is not about detecting coding bugs in the source language.
On 17/04/2026 01:26, James Kuyper wrote:
Bart <bc@freeuk.com> writes:
On 16/04/2026 11:28, James Kuyper wrote:
On 15/04/2026 01:33, Bart wrote:
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason why C
says the behavior is undefined, rather than requiring that such code be
rejected. Implementations are intended to take advantage of that fact
for code that does not need to be portable.
Taking advantage in what way? Doing something entirely unexpected or
unintuitive?
How ridiculous! If you can figure out a way to take advantage of
unexpected behavior, I'd appreciate knowing what it is.
It was you who mentioned taking advantage.
And by taking advantage, I assume you meant all the unpredictable things
that optimising compilers like to do, because they assume that UB cannot
happen.
I think there are other obscure ones to do with the order you read and
write members of unions, or apply type-punning, or what you can do with pointers.
A common scenario is where someone is implementing a language where such things are well-defined, and they want to run it on a target machine
where they are also well-defined, but decide to use C as an intermediate language.
Unfortunately C has other ideas! So this means somehow getting around
the UB in the C that is generated, or stipulating specific compilers or
compiler options.
Or just crossing your fingers and hoping the compiler will not be so crass.
Another scenario is where you are just writing C code and want that same
behaviour.
In virtually every case where the C behavior is undefined, you can
I was talking
about defining the behavior that the C standard itself leaves undefined,
in ways that make things convenient for the developer.
The developer of the C implementation, or the C application?
I don't often use intermediate C code now, but that code is no longer portable among C compilers. It is for gcc only, and requires:
   -fno-strict-aliasing
I can't remember exactly why it's needed, but some programs won't work without it.
In attempting writing a simple language, I had a thought of what language is
to share. Because I saw many people are stuck in thinking C/C++ (or other
high level language) can be so abstract, unlimited 'high level' to mysteriously
solve various human description of idea.
C and assembly are essentially the same, maybe better call it 'portable
assembly'.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
a=b; // equ. to "mov a,b"
On Thu, 2026-04-16 at 18:42 +0800, wij wrote:
On Wed, 2026-04-15 at 19:04 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
On Wed, 2026-04-15 at 17:14 -0700, Tim Rentsch wrote:
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
Gentlemen,
I understand the natural reaction to want to respond to the kind of
statements being made in this thread. I hope y'all can resist this
natural reaction and not respond to people who persist in making
arguments that are basically isomorphic to saying 1 equals 0.
Thank you for your assistance in this matter.
Maybe you are right. I say A is-a B, one persists to read A is (exactly) B.
I provide help to using assembly. One persists to read I persuade using
assembly and give up HLL. What is going on here?
You say that C is an assembly language. Nobody here thinks that
you're *equating* C and assembly language. It's obvious that
there are plenty of assembly languages that are not C, and nobody
has said otherwise. I have no idea why you think anyone has that
particular confusion.
At least one person has apparently interpreted your defense of
assembly language (that it isn't as scary as some think it is)
as a claim that we should program in assembly language rather
than in HLLs. You're right, that was a misinterpretation of what
you wrote. I considered mentioning that, but didn't bother.
The issue I've been discussing is your claim that C is an assembly
language. It is not.
If I said C is assembly, it is in the sense that I have at least shown in the
post (s_tut2.cpp), where even 'instruction' can be any function (e.g. change
directory, copy files, launch an editor,...). And also, what is 'computation'
is demonstrated, which includes a suggestion of what C is, essentially any
program, and in this sense what HLL is. Finally, it could demonstrate the
meaning and testify the Church-Turing thesis (my words: no computation
language, including various kinds of math formula, can exceed the expressive
power of TM).
It seems you insist C and assembly have to be exactly what your bible says. If
so, I would say what the C standard (I cannot read it) says is the meaning of
the terminology of terms in it, not intended to be anything used in any other
situation.
I do not intend to post again in this thread until and unless you
post something substantive on that issue.
(continue)
IMO, C standard is like a book of legal terms. Like many symbols in the header
file, it defines one symbol in another symbol. The real meaning is not fixed.
The result is you cannot 'prove' correctness of the source program; even
consistency is a problem.
'Instruction' is low-level? Yes, by definition, but not as one might think.
Instruction could refer to a processing unit (might be like the x87 math
co-processor, which may even be higher level to process expressions,...).
A good chance of C is to find a good function that can be hardwired.
So, the basic feature of HLL is 'structured' (or 'nested') text which removes
labels. Semantics is inventor's imagination. So, avoid bizarre complexity; it
won't add expressive power to the language, just a matter of short or lengthy
expression of programming idea.
On 17/04/2026 18:42, Bart wrote:
I've no idea. You just said it was uncommon to use C in this way. But
every other amateur compiler project on Reddit forums likes to use a C
target.
You didn't simply claim that people were using C as an intermediary
language - you claimed they were doing so specifically for languages
that defined things like type punning, wrapping signed integer
arithmetic, and messing about with pointers.
It is a problem when using C for this purpose, which wouldn't arise
using a language designed to be used as an intermediate target.
C is not designed for that purpose, nor are C compilers.
So if you
don't like C here, don't use it. It is not the fault of C, its language
designers, or toolchain implementers. And if this really were the
problem you seem to think, people would use something else.
As it turns out, people /do/ use something else. There are countless
virtual machines with their own byte-codes, specialised for different
types of source languages. And there is a common intermediary language
used by a lot of tools - LLVM "assembly". This /was/ designed for that
purpose, and does a pretty good job at it.
And if you don't like it (of course you don't like it - you didn't
invent it), find or make something else.
One option was to use a union type for each stack element, but I
decided my choice would give cleaner code.
Oh, right - you knew of a correct solution, but decided instead that something broken would be cleaner.
So you think UB is better than doing things correctly, and then you
complain when C doesn't have the semantics you want?
So this is something else. There should be no problem with using
uninitialised data here, other than not being meaningful or useful.
Again - you are pretending that C means what you think it should mean.
Using uninitialised local data leads, in most cases, to UB in C. If
your language treats uninitialised data as unspecified values, or
default initialised (typically to 0), or has some other determined
behaviour, then you need to implement that behaviour in the generated C
code. You don't get to generate C and pretend it means something
different.
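Concretely, implementing the source language's behaviour here could just mean the code generator emitting explicit initialisers. A sketch in the style of Bart's earlier fragment (the names are mine, not from his generator):

```c
typedef long long          i64;
typedef unsigned long long u64;

/* If the source language defines uninitialised locals as zero, the
   generated C must say so explicitly: reading an uninitialised
   automatic variable is (in most cases) UB in C, so the translator
   cannot leave the zeroing implicit. */
static i64 h_sum(void)
{
    u64 R1 = 0, R2 = 0;   /* scratch slots start defined */
    i64 b = 0, c = 0;     /* source-language default of zero */
    R1 = (u64)b;
    R2 = (u64)c;
    R1 = (u64)((i64)R1 + (i64)R2);
    return (i64)R1;
}
```

The cost is a handful of stores that the C optimiser removes whenever it can prove a definite assignment happens first, so well-written source programs pay nothing for the guarantee.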
On 18/04/2026 14:37, David Brown wrote:
On 17/04/2026 18:42, Bart wrote:
I've no idea. You just said it was uncommon to use C in this
way. But every other amateur compiler project on Reddit forums
likes to use a C target.
You didn't simply claim that people were using C as an intermediary
language - you claimed they were doing so specifically for languages
that defined things like type punning, wrapping signed integer
arithmetic, and messing about with pointers.
The broader picture is being forgotten. The thread is partly about C
being a 'portable assembler', and this is a common notion.
C is famous for being low level; being close to the hardware; for a
1:1 correspondence between types that people work with, and the
operations on those, with the equivalent assembly.
Whether that is correct or not, that is what people think or say, and
what many assume.
It is also what very many want, including me.
Yes, I know. There should have been one that is much better - a HLL,
not the monstrosity that is LLVM. But it doesn't exist.
Oh, right - you knew of a correct solution, but decided instead that
something broken would be cleaner.
Well, it shouldn't BE broken! That's the problem with C.
So you think UB is better than doing things correctly, and then you
complain when C doesn't have the semantics you want?
I'm saying that a lot of things shouldn't be UB. Some people just want
want to write assembly - for a specific machine - but also want HLL conveniences.
So this is something else. There should be no problem with usingAgain - you are pretending that C means what you think it should
unitialised data here, other than not being meaningful or useful.
mean. Using uninitialised local data leads, in most cases, to UB in
C.ÿ If your language treats uninitialised data as unspecified
values, or default initialised (typically to 0), or has some other
determined behaviour, then you need to implement that behaviour in
the generated C code.ÿ You don't get to generate C and pretend it
means something different.
In a real application then using unitialised data, outside of .bss,
would be uncommon, and likely be a bug.
But outside of a real application, such as in fragments of test code
that I work on every day, then variables can be uninitialised,
especially if I'm interested more in the code that is being generated
and will not actually run it.
It seems that a C compiler cannot make that distinction and must
always assume that every program, even in development, is
mission-critical.
In my original example, they weren't initialised in order to keep the
posted examples short.
I still wouldn't call A + B undefined behaviour when A/B are not
initialised; the result is the sum of whatever A and B happen to
contain, and is little different from:
A = rand();
B = rand();
A + B;
On 18/04/2026 14:37, David Brown wrote:
On 17/04/2026 18:42, Bart wrote:
I've no idea. You just said it was uncommon to use C in thisYou didn't simply claim that people were using C as an intermediary
way. But every other amateur compiler project on Reddit forums
likes to use a C target.
language - you claimed they were doing so specifically for languages
that defined things like type punning, wrapping signed integer
arithmetic, and messing about with pointers.
The broader picture is being forgotten. The thread is partly about C
being a 'portable assembler', and this is a common notion.
C is famous for being low level; being close to the hardware; for a
1:1 correspondence between types that people work with, and the
operations on those, with the equivalent assembly.
Whether that is correct or not, that is what people think or say, and
what many assume.
It is also what very many want, including me.
Yes, I know. There should have been one that is much better - a HLL,
not the monstrosity that is LLVM. But it doesn't exist.
Oh, right - you knew of a correct solution, but decided instead that
something broken would be cleaner.
Well, it shouldn't BE broken! That's the problem with C.
So you think UB is better than doing things correctly, and then you
complain when C doesn't have the semantics you want?
I'm saying that a lot of things shouldn't be UB. Some people just want
want to write assembly - for a specific machine - but also want HLL conveniences.
So this is something else. There should be no problem with usingAgain - you are pretending that C means what you think it should
unitialised data here, other than not being meaningful or useful.
mean. Using uninitialised local data leads, in most cases, to UB in
C.ÿ If your language treats uninitialised data as unspecified
values, or default initialised (typically to 0), or has some other
determined behaviour, then you need to implement that behaviour in
the generated C code.ÿ You don't get to generate C and pretend it
means something different.
In a real application then using unitialised data, outside of .bss,
would be uncommon, and likely be a bug.
But outside of a real application, such as in fragments of test code
that I work on every day, then variables can be uninitialised,
especially if I'm interested more in the code that is being generated
and will not actually run it.
It seems that a C compiler cannot make that distinction and must
always assume that every program, even in development, is
mission-critical.
In my original example, they weren't initialised in order to keep the
posted examples short.
I still wouldn't call A + B undefined behaviour when A/B are not
initialised; the result is the sum of whatever A and B happen to
contain, and is little different from:
A = rand();
B = rand();
A + B;
It's a common wrong notion.[...]
If you are talking about function-local data, there are multiple ways to store it in an easy-to-clean-up fashion:
On 18/04/2026 14:37, David Brown wrote:
On 17/04/2026 18:42, Bart wrote:
I've no idea. You just said it was uncommon to use C in this way. But
every other amateur compiler project on Reddit forums likes to use a
C target.
You didn't simply claim that people were using C as an intermediary
language - you claimed they were doing so specifically for languages
that defined things like type punning, wrapping signed integer
arithmetic, and messing about with pointers.
The broader picture is being forgotten. The thread is partly about C
being a 'portable assembler', and this is a common notion.
C is famous for being low level; being close to the hardware; for a 1:1 correspondence between types that people work with, and the operations
on those, with the equivalent assembly.
Whether that is correct or not, that is what people think or say, and
what many assume.
It is also what very many want, including me.
This particular use-case for C as an intermediate language is one
example, a good one as it highlights the issues. But I also want all
those assumptions to be true.
(In my systems language, it is a lot truer than in C. But my language supports a small number of targets, and usually one at a time.)
It is a problem when using C for this purpose, which wouldn't arise
using a language designed to be used as an intermediate target.
C is not designed for that purpose, nor are C compilers.
Yes, I know. There should have been one that is much better - a HLL, not
the monstrosity that is LLVM. But it doesn't exist.
If it did, then it could have served another purpose for which C is currently used and is not ideal either, which is to express APIs of libraries. Currently that is too C-centric and it is a big task to translate into bindings for other languages.
(For example, the headers for GTK2 include about 4000 C macro definitions.)
So if you don't like C here, don't use it. It is not the fault of C, its language designers, or toolchain implementers. And if this really were the problem you seem to think, people would use something else.
There /is/ nothing else. C is the best of a bad bunch.
One option was to use a union type for each stack element, but I
decided my choice would give cleaner code.
Oh, right - you knew of a correct solution, but decided instead that
something broken would be cleaner.
Well, it shouldn't BE broken! That's the problem with C.
So you think UB is better than doing things correctly, and then you
complain when C doesn't have the semantics you want?
I'm saying that a lot of things shouldn't be UB. Some people just want to write assembly - for a specific machine - but also want HLL conveniences.
So this is something else. There should be no problem with using uninitialised data here, other than not being meaningful or useful.
Again - you are pretending that C means what you think it should mean.
Using uninitialised local data leads, in most cases, to UB in C. If your language treats uninitialised data as unspecified values, or default initialised (typically to 0), or has some other determined behaviour, then you need to implement that behaviour in the generated C code. You don't get to generate C and pretend it means something different.
In a real application, using uninitialised data, outside of .bss, would be uncommon, and likely be a bug.
But outside of a real application, such as in fragments of test code that I work on every day, variables can be uninitialised, especially if I'm interested more in the code that is being generated and will not actually run it.
It seems that a C compiler cannot make that distinction and must always assume that every program, even in development, is mission-critical.
In my original example, they weren't initialised in order to keep the
posted examples short.
I still wouldn't call A + B undefined behaviour when A/B are not initialised; the result is the sum of whatever A and B happen to
contain, and is little different from:
    A = rand();
    B = rand();
    A + B;
On 18/04/2026 17:08, Bart wrote:
(Yes, LLVM and the tools around it are big. It takes a lot of effort to make use of them, but you get a lot in return. A "little language" has to grow to a certain size in numbers of toolchain developers and numbers of toolchain users before it can make sense to move to LLVM. But doing so is still a fraction of the work compared to making a serious optimising back-end for multiple targets.)
If it did, then it could have served another purpose for which C is
currently used and is not ideal either, which is to express APIs of
libraries. Currently that is too C-centric and it is a big task to translate into bindings for other languages.
(For example, the headers for GTK2 include about 4000 C macro
definitions.)
And yet in practice C is good enough for almost all cases.
A C compiler expects code written in valid C. Compilers expect code to be run - I don't think that is unreasonable. And when I use a compiler to look at generated assembly for some C code (and I do that quite often), I am using C code that has a meaning if it were to be run.
On 19/04/2026 11:17, David Brown wrote:
On 18/04/2026 17:08, Bart wrote:
(Yes, LLVM and the tools around it are big. It takes a lot of effort to make use of them, but you get a lot in return. A "little language" has to grow to a certain size in numbers of toolchain developers and numbers of toolchain users before it can make sense to move to LLVM.
Actually lots of small projects use LLVM.
But probably people don't realise it is like installing the engine from
a container ship into your small family car.
A C compiler expects code written in valid C. Compilers expect code to be run - I don't think that is unreasonable.
What's not valid about 'a = b + c'?
And when I use a compiler to look at generated assembly for some C code (and I do that quite often), I am using C code that has a meaning if it were to be run.
I'm interested too, but if I compile this in godbolt:
 void F() {
     int a, b, c;
     a = b + c * 8;
 }
then all the C compilers I tried generated code at -O0 which kept those variables in memory.
What does the code look like when a/b/c are kept in registers? I've no idea, because as soon as you try -O1 and above, the whole expression is elided.
If you stick 'static' in front, then the whole function disappears. This
is not very useful when trying to compare code generation across
compilers and languages!
If I do something meaningful with 'a' to keep the expression alive, and initialise b and c, then the whole expression is reduced to a constant.
What do you have to do to see if the expression would be compiled to, for example, 'lea ra, [rb + rc*8]'?
Bart <bc@freeuk.com> writes:
On 18/04/2026 14:37, David Brown wrote:
On 17/04/2026 18:42, Bart wrote:
I've no idea. You just said it was uncommon to use C in this way. But every other amateur compiler project on Reddit forums likes to use a C target.
You didn't simply claim that people were using C as an intermediary language - you claimed they were doing so specifically for languages that defined things like type punning, wrapping signed integer arithmetic, and messing about with pointers.
The broader picture is being forgotten. The thread is partly about C
being a 'portable assembler', and this is a common notion.
It's a common wrong notion.
One person here recently claimed that C is a kind of assembly
language.
Yes, I know. There should have been one that is much better - a HLL,
not the monstrosity that is LLVM. But it doesn't exist.
Given your habit of inventing your own languages and writing your own compilers, I'm surprised you haven't defined your own intermediate
language, something like LLVM IR but suiting your purposes better.
You're complaining about a problem that *you* might be in a position
to address.
On 18/04/2026 17:08, Bart wrote:
The broader picture is being forgotten. The thread is partly about C
being a 'portable assembler', and this is a common notion.
It is a common misconception - and I believe we agree it is a
misconception.
C is famous for being low level; [...]
I describe C as being a relatively low level high-level language. [...]
On 19/04/2026 11:17, David Brown wrote:
On 18/04/2026 17:08, Bart wrote:
(Yes, LLVM and the tools around it are big. It takes a lot of effort to make use of them, but you get a lot in return. A "little language" has to grow to a certain size in numbers of toolchain developers and numbers of toolchain users before it can make sense to move to LLVM.
Actually lots of small projects use LLVM.
But probably people don't realise it is like installing the engine
from a container ship into your small family car.
But doing
so is still a fraction of the work compared to making a serious
optimising back-end for multiple targets.)
If it did, then it could have served another purpose for which C
is currently used and is not ideal either, which is to express
APIs of libraries. Currently that is too C-centric and it is a big
task to translate into bindings for other languages.
(For example, the headers for GTK2 include about 4000 C macro
definitions.)
And yet in practice C is good enough for almost all cases.
It is not even good enough. To get back to GTK2 (which I looked at
in detail some years back), compiling this program:
#include <gtk2.h>
involved processing over 1000 #includes, some 550 discrete headers,
330K lines of declarations, with a bunch of -I options to tell it the
dozen different folders it needs to go and look for those headers.
I was looking at reducing the whole thing to one file - a set of
bindings in my language for the functions, types etc that are exposed.
This file would have been 25Kloc in my language (including those 4000 macros; most would have been simple #defines, but many will have needed manual translation: macros can contain actual C code, not just declarations).
HOWEVER... if such an exercise works for my language, why can't it
work for C too? That is, reduce those 100s of header files and dozens
of folders into a single 25Kloc file, specific to your platform.
Think how much easier it would be to install, or employ, and how
much faster to /compile/!
So why doesn't this happen? The equivalent exercise for SDL2 would
reduce 50Kloc across 80 header files (at least these are in the same
folder) to one 3Kloc file.
A C compiler expects code written in valid C. Compilers expect code to be run - I don't think that is unreasonable.
What's not valid about 'a = b + c'?
And when I use a compiler to look at generated assembly for some C code (and I do that quite often), I am using C code that has a meaning if it were to be run.
I'm interested too, but if I compile this in godbolt:
void F() {
int a, b, c;
a = b + c * 8;
}
then all the C compilers I tried generated code at -O0 which kept
those variables in memory.
What does the code look like when a/b/c are kept in registers? I've no idea, because as soon as you try -O1 and above, the whole expression is elided.
If you stick 'static' in front, then the whole function disappears.
This is not very useful when trying to compare code generation across compilers and languages!
If I do something meaningful with 'a' to keep the expression alive,
and initialise b and c, then the whole expression is reduced to a
constant.
What do you have to do to see if the expression would be compiled to, for example, 'lea ra, [rb + rc*8]'?
On 19/04/2026 13:50, Bart wrote:
On 19/04/2026 11:17, David Brown wrote:
On 18/04/2026 17:08, Bart wrote:
(Yes, LLVM and the tools around it are big. It takes a lot of effort to make use of them, but you get a lot in return. A "little language" has to grow to a certain size in numbers of toolchain developers and numbers of toolchain users before it can make sense to move to LLVM.
Actually lots of small projects use LLVM.
But probably people don't realise it is like installing the engine
from a container ship into your small family car.
The strange thing about the software world is that it does not matter.
I do appreciate liking things to be small, simple and efficient.
Sometimes that is important - in my own work, it is very often important. But often it doesn't matter at all. There are other things more worthy of our time and effort.
Would it be better if the gcc toolchain installation for the cross
compiler I use were 1 MB of installation over 20 files, rather than
whatever it is now?
  void F() {
      int a, b, c;
      a = b + c * 8;
  }
So you want to know how the compiler deals with meaningless code. Why? Do you not know how to write meaningful code?
then all the C compilers I tried generated code at -O0 which kept
those variables in memory.
They are on the stack in memory, yes. You've asked for close to a direct and naïve translation, which gives no insight into what kind of code the compiler can generate and is harder to follow (because it's mostly moving things onto and off from the stack).
What does the code look like when a/b/c are kept in registers? I've no idea, because as soon as you try -O1 and above, the whole expression is elided.
If you stick 'static' in front, then the whole function disappears.
This is not very useful when trying to compare code generation across
compilers and languages!
If I do something meaningful with 'a' to keep the expression alive,
and initialise b and c, then the whole expression is reduced to a
constant.
What do you have to do to see if the expression would be compiled to, for example, 'lea ra, [rb + rc*8]'?
int f(int b, int c)
{
    int a;
    a = b + c * 8;
    return a;
}
If you don't want to use parameters and return values, I recommend declaring externally linked volatile variables and using them as the source and destination of your calculations:
volatile int xa;
volatile int xb;
volatile int xc;
void foo(void) {
    int a, b, c;
    b = xb;
    c = xc;
    a = b + c * 8;
    xa = a;
}
When you ask the compiler "give me an efficient implementation of this code" and the compiler can see that the code does nothing, it generates no code (or just a "ret"). This should not be a surprise. So you might need "tricks" to make the code mean something - access to volatile objects is one of them.
On Sun, 19 Apr 2026 12:50:04 +0100
Bart <bc@freeuk.com> wrote:
It is not even good enough. To get back to GTK2 (which I looked at
in detail some years back), compiling this program:
#include <gtk2.h>
involved processing over 1000 #includes, some 550 discrete headers,
330K lines of declarations, with a bunch of -I options to tell it the
dozen different folders it needs to go and look for those headers.
I was looking at reducing the whole thing to one file - a set of
bindings in my language for the functions, types etc that are exposed.
This file would have been 25Kloc in my language (including those 4000 macros; most would have been simple #defines, but many will have needed manual translation: macros can contain actual C code, not just declarations).
HOWEVER... if such an exercise works for my language, why can't it
work for C too? That is, reduce those 100s of header files and dozens
of folders into a single 25Kloc file, specific to your platform.
Think how much easier it would be to install, or employ, and how
much faster to /compile/!
It would be faster to compile. Probably, meaningfully faster for
compiling large GUI project from scratch with very slow compiler like
gcc. Probably, not meaningfully faster in other situations.
It would not be easier to install or employ unless one happens to be as stubborn as you are.
If I ever want to write code using GTK2 for hobby purpose, which is
extremely unlikely, then all I'd need to do is to type 'pacman -S mingw-w64-ucrt-x86_64-gtk2' at msys2 command prompt. That's all.
For somebody on Debian/Ubuntu it likely would be 'apt-get install
gtk2'. RHEL/Fedora, MSVC command prompt or Mac it would be some other
magic incantation. Except that for the latter two it's probably not
available at all, so even easier.
The point is - it's already so easy that you can't really make it any
easier, at best the same.
On 19/04/2026 13:17, David Brown wrote:
On 19/04/2026 13:50, Bart wrote:
On 19/04/2026 11:17, David Brown wrote:
On 18/04/2026 17:08, Bart wrote:
  void F() {
      int a, b, c;
      a = b + c * 8;
  }
So you want to know how the compiler deals with meaningless code.
Why? Do you not know how to write meaningful code?
I don't want the compiler deciding what's a meaningful program. The
intent here is clear:
* Allocate 3 local slots for int
* Add the contents of two of those, and store into the third.
That is the task. In terms of observable effects, there are at least two: the code that is generated, and the time it might take to execute.
There is also the code size, and the compilation time.
One of my favourite compilation benchmarks is this:
    void F() {
        int a, b=2, c=3, d=4;
        a = b + c * d;
        ....                    // repeat N times
        printf("%d\n", a);
    }
Here initialisation is used because otherwise it causes problems with interpreted languages, for example.
It is amazing how many language implementations have trouble with this, especially with bigger N such as 1000000. The bigger ones usually fare worse.
This program is not meaningful; it is simply a stress test. Two more observable effects are at what N it fails, and whether it crashes or fails gracefully.
then all the C compilers I tried generated code at -O0 which kept
those variables in memory.
They are on the stack in memory, yes. You've asked for close to a direct and naïve translation, which gives no insight into what kind of code the compiler can generate and is harder to follow (because it's mostly moving things onto and off from the stack).
It's easier to follow. Or it would be if the compiler were to generate decent assembly. gcc -O0 produces:
F:
        pushq   %rbp
        movq    %rsp, %rbp
        movl    -4(%rbp), %eax
        leal    0(,%rax,8), %edx
        movl    -8(%rbp), %eax
        addl    %edx, %eax
        movl    %eax, -12(%rbp)
        nop
        popq    %rbp
        ret
What does the code look like when a/b/c are kept in registers? I've no idea, because as soon as you try -O1 and above, the whole expression is elided.
If you stick 'static' in front, then the whole function disappears.
This is not very useful when trying to compare code generation across
compilers and languages!
If I do something meaningful with 'a' to keep the expression alive,
and initialise b and c, then the whole expression is reduced to a
constant.
What do you have to do to see if the expression would be compiled to, for example, 'lea ra, [rb + rc*8]'?
int f(int b, int c)
{
    int a;
    a = b + c * 8;
    return a;
}
If you don't want to use parameters and return values, I recommend declaring externally linked volatile variables and using them as the source and destination of your calculations:
volatile int xa;
volatile int xb;
volatile int xc;
void foo(void) {
    int a, b, c;
    b = xb;
    c = xc;
    a = b + c * 8;
    xa = a;
}
When you ask the compiler "give me an efficient implementation of this code" and the compiler can see that the code does nothing, it generates no code (or just a "ret"). This should not be a surprise. So you might need "tricks" to make the code mean something - access to volatile objects is one of them.
So, you have to spend time fooling the compiler. And then you are never quite sure if it has left something out so that you're not comparing
like with like.
However, this is a perfect example of how even a language and especially
its compilers differ from assembly and assemblers.
It can happen with my compilers too, but on a much smaller scale. For example 'a = 2 + 2' is reduced to 'a = 4'. But it is easier to get around.
On 19/04/2026 16:28, Bart wrote:
On 19/04/2026 13:17, David Brown wrote:
On 19/04/2026 13:50, Bart wrote:
On 19/04/2026 11:17, David Brown wrote:
On 18/04/2026 17:08, Bart wrote:
  void F() {
      int a, b, c;
      a = b + c * 8;
  }
So you want to know how the compiler deals with meaningless code.
Why? Do you not know how to write meaningful code?
I don't want the compiler deciding what's a meaningful program. The
intent here is clear:
No, the intent is not clear. If you are writing in C, and you intend the code to have a definite meaning, you have to write that meaning in C. Break C's rules, and the code does not have meaning as a whole - and compilers cannot be expected to guess what you meant, especially when you ask them to analyse your code carefully to generate optimised output.
* Allocate 3 local slots for int
* Add the contents of two of those, and store into the third.
That is not what you wrote - because that's not what the C means.
As you pointed out yourself, C is not assembly. It does not have a direct meaning like this.
Stress tests of tools can be useful. I would not say something like this is useful as a compilation benchmark - I want my tools to be fast enough for practical use on the real code I write, and don't care how slow they are for totally meaningless and unrealistic code. But if I were writing a tool, I'd like to know how well it handled extreme cases.
(Sometimes generated C code has functions with huge numbers of simple lines, totally unlike code that anyone would write by hand. It would not have pointless repetition of lines, however.)
So, you have to spend time fooling the compiler. And then you are
never quite sure if it has left something out so that you're not
comparing like with like.
Sorry, I thought I was being helpful so that you would understand how to get the results you are asking for from compilers. I am not "fooling" the compiler, I am showing you how to ask the right questions.
On 19/04/2026 16:47, David Brown wrote:
On 19/04/2026 16:28, Bart wrote:
On 19/04/2026 13:17, David Brown wrote:
On 19/04/2026 13:50, Bart wrote:
On 19/04/2026 11:17, David Brown wrote:
On 18/04/2026 17:08, Bart wrote:
  void F() {
      int a, b, c;
      a = b + c * 8;
  }
So you want to know how the compiler deals with meaningless code.
Why? Do you not know how to write meaningful code?
I don't want the compiler deciding what's a meaningful program. The
intent here is clear:
No, the intent is not clear. If you are writing in C, and you intend the code to have a definite meaning, you have to write that meaning in C. Break C's rules, and the code does not have meaning as a whole - and compilers cannot be expected to guess what you meant, especially when you ask them to analyse your code carefully to generate optimised output.
* Allocate 3 local slots for int
* Add the contents of two of those, and store into the third.
That is not what you wrote - because that's not what the C means.
I forgot the scaling of 'c'.
As you pointed out yourself, C is not assembly. It does not have a direct meaning like this.
I don't understand what else it can possibly mean. Get the value of 'b', whatever it happens to be, add the value of 'c' scaled by 8, and store the result into 'a'. The only things to consider are that some intermediate results may lose the top bits.
Is 'a = b' equally undefined? If so then C is even crazier than I'd thought.
Stress tests of tools can be useful. I would not say something like this is useful as a compilation benchmark - I want my tools to be fast enough for practical use on the real code I write, and don't care how slow they are for totally meaningless and unrealistic code.
Meaningless and unrealistic are what stress tests and benchmarks are!
But they can also give useful insights, highlight shortcomings, and can
be used to compare implementations.
I think if I used a real program such as sqlite3.c, you still wouldn't
care about my results.
But if I were writing a tool, I'd like to know how well it handled extreme cases. (Sometimes generated C code has functions with huge numbers of simple lines, totally unlike code that anyone would write by hand. It would not have pointless repetition of lines, however.)
Then it becomes much, much harder to have a simple test that can be used for practically any language.
As a matter of interest, I tried 1 million lines of 'a=b+c*d' now. These
are some results:
  gcc -O0      560 seconds
  Tiny C       1.7 seconds
  bcc          2.0 seconds
  mm           1.9 seconds (non-C); both these run unoptimised code
gcc likely uses some sort of SSA representation, meaning a new variable
for each intermediate result. Here it probably needs 5 million intermediates.
(From memory, it is faster when using optimise flags, because it can eliminate 99.9999% of those assignments, so there is less code to
process later. You know when that happens as the resulting EXE is too
small to be feasible.)
Here's an interesting one:
  void F(){
  L1:
  L2:
  ...
  L1000000:
  ;
  }
Both bcc and tcc crash instantly, because in C, labels have a recursive definition in the grammar, and this causes a stack overflow.
But I can't tell you what gcc does, as I aborted it after 5 minutes.
(In my language, labels are just another statement, and this compiled in
1.5 seconds. However I had to increase the hashtable size as it doesn't
grow as needed.)
So, you have to spend time fooling the compiler. And then you are
never quite sure if it has left something out so that you're not
comparing like with like.
Sorry, I thought I was being helpful so that you would understand how to get the results you are asking for from compilers. I am not "fooling" the compiler, I am showing you how to ask the right questions.
The question was already posed as I wanted in my original fragment.
On 19/04/2026 01:35, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 18/04/2026 14:37, David Brown wrote:
On 17/04/2026 18:42, Bart wrote:
I've no idea. You just said it was uncommon to use C in this way. But every other amateur compiler project on Reddit forums likes to use a C target.
You didn't simply claim that people were using C as an intermediary language - you claimed they were doing so specifically for languages that defined things like type punning, wrapping signed integer arithmetic, and messing about with pointers.
The broader picture is being forgotten. The thread is partly about C being a 'portable assembler', and this is a common notion.
It's a common wrong notion.
One person here recently claimed that C is a kind of assembly language.
'C being portable assembly' keeps coming up, not just here.
Yes, I know. There should have been one that is much better - a HLL, not the monstrosity that is LLVM. But it doesn't exist.
Given your habit of inventing your own languages and writing your own compilers, I'm surprised you haven't defined your own intermediate language, something like LLVM IR but suiting your purposes better. You're complaining about a problem that *you* might be in a position to address.
If you read my post again, you'll see that I did exactly that.
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and using
its value is UB - the code has no meaning right out of the gate.
When you use "b" in an expression, you are /not/ asking C to read the
bits and bytes stored at the address of the object "b". You are asking
for the /value/ of the object "b". How the compiler gets that value is
up to the compiler - it can read the memory, or use a stored copy in a
register, or use program analysis to know what the value is in some
other way. And if the object "b" does not have a value, you are asking
the impossible.
Try asking a human "You have two numbers, b and c. Add them. What is
the answer?".
whatever it happens to be, add the value of 'c' scaled by 8, and store
the result into 'a'. The only things to consider are that some
intermediate results may lose the top bits.
Is 'a = b' equally undefined? If so, then C is even crazier than I'd
thought.
If "a" or "b" are indeterminate, then using them is undefined. I have
two things - are they the same colour? How is that supposed to make sense?
You keep thinking of objects like "b" as a section of memory with a bit
pattern in it. Objects are not that simple in C - C is not assembly.
You mean when the object code is small because the compiler did a good job?
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and using
its value is UB - the code has no meaning right out of the gate.
When you use "b" in an expression, you are /not/ asking C to read the
bits and bytes stored at the address of the object "b". You are
asking for the /value/ of the object "b". How the compiler gets that
value is up to the compiler - it can read the memory, or use a stored
copy in a register, or use program analysis to know what the value is
in some other way. And if the object "b" does not have a value, you
are asking the impossible.
Try asking a human "You have two numbers, b and c. Add them. What is
the answer?".
You have two slates A and B which someone should have wiped clean then written a new number on each.
But that part hasn't been done; they each still have an old number from their last use.
You can still add them together, nothing bad will happen. It just may be
the wrong answer if the purpose of the exercise was to find the sum of
two specific new numbers.
But the purpose may also be to see how good they are at adding. Or at
following instructions.
whatever it happens to be, add the value of 'c' scaled by 8, and
store the result into 'a'. The only things to consider are that
some intermediate results may lose the top bits.
Is 'a = b' equally undefined? If so, then C is even crazier than I'd
thought.
If "a" or "b" are indeterminate, then using them is undefined. I have
two things - are they the same colour? How is that supposed to make
sense?
You keep thinking of objects like "b" as a section of memory with a
bit pattern in it. Objects are not that simple in C - C is not assembly.
Why ISN'T it that simple? What ghastly thing would happen if it was?
"b" will be some location in memory or it might be some register, and it WILL have a value. That value happens to be unknown until it is
initialised.
So accessing it will return garbage (unless you know exactly what you
are doing then it may be something useful).
My original example was something like 'a = b + c' (I think in my
language), converted to my IL, then expressed in very low-level C.
You were concerned that in that C, the values weren't initialised. How
would that have affected the code that C compiler generated from that?
It's starting to appear that the compiler is more of the problem!
Because mine would certainly not be bothered by it and nobody would be
scratching their heads wondering what surprises the compiler might have
in store.
Would the compiler have been happier with this:
    int a, b = F(), c = F();
    a = b + c;
If so, then suppose F was this:
ÿÿÿ int F() {int x; return x;}
When the body of F is not visible, then that cannot possibly affect what
is generated for 'a = b + c'.
So I'm still interested in what possible reason the compiler might have
for generating code that is any different in the absence of
initialisation. Warn about it, sure, but why do anything else?
THIS is why I try to stay away from using C intermediate code.
You mean when the object code is small because the compiler did a good
job?
It may do a good job of eliminating duplicate or redundant code. But
maybe you are measuring how well it copes with a certain quantity of
code, which when synthesised may well be duplicate or redundant.
Then it is not helpful that it discards most of it. How is that supposed
to give an accurate measure of how well it does when it really does need
to do it all?
It's like comparing car A and car B over a course (we've been here
before), but A's driver is using clever shortcuts. Or maybe he doesn't
even bother going anywhere if the course is circular.
That will give A an unfair advantage, and a misleading result. It could
be that B is actually faster, so somebody deciding to buy A based on
this test is going to be disappointed!
On 20/04/2026 01:36, Bart wrote:
In C, "b" is not any specific place. In optimising compilers, the
implementation is unlikely to exist at all until it has a real value,
C programmers don't need to scratch their heads. They simply have to
write meaningful code. It's not rocket science.
(As a C implementer, you should have a better understanding of these
details than C programmers usually need.)
You are arguing that C is difficult to use as an intermediate language
because you don't know what happens when you generate shite sort-of C
code? Just generate valid C code that has meaning, and stop worrying.
So gcc is a more powerful compiler than yours, and that's not fair?
On 20/04/2026 07:25, David Brown wrote:
On 20/04/2026 01:36, Bart wrote:
In C, "b" is not any specific place. In optimising compilers, the
implementation is unlikely to exist at all until it has a real value,
It should come into existence when it is referenced. Then it will have a value.
Here for example:
    int b;
    if (rand()&1) b=0;
    printf("%d", b);
'b' may or may not be initialised. But I expect the 'b' used in that
last line to exist somewhere and for the generated code to access that location. I'd also expect the same if the assignment was commented out.
Someone could write some actual code like my example, with an
unconditional assignment, but for various reasons has to temporarily
comment out that assignment.
It might be a function that is not called. Or it might be in a program
that is not run at all, because the developer is sorting out some build issue.
But according to you, that part of the code is UB, whether the program
is ever run or not, and so the whole thing is undefined.
That would be ludicrous.
C programmers don't need to scratch their heads. They simply have to
write meaningful code. It's not rocket science.
(As a C implementer, you should have a better understanding of these
details than C programmers usually need.)
I implement it in a common sense manner.
I don't say, Ah, 'x' might not
be initialised at this point, so it is UB, therefore I don't need to
bother compiling these remaining 100 lines, then the program will be
smaller and faster!
You are arguing that C is difficult to use as an intermediate language
because you don't know what happens when you generate shite sort-of C
code? Just generate valid C code that has meaning, and stop worrying.
My language allows you to do this:
   int a, b
   a := b
It is well-defined in the language, and I know it is well defined on all
my likely targets. (I think we're back where we started!)
However, we are generating C via an IL. The IL will be something like this:
   local a
   local b
   ...
   load b
   store a
Again, it is perfectly well-defined. Whatever bit-pattern in b is
transfered to a. In assembly, the same thing: b will be in memory or register.
All well and good. UNTIL we decided to involve C! Let's say everything
has u64 type:
   u64 R1;                  # represents the one stack slot used
   u64 a, b;                # our local variables
Now we need to translate that load and store:
   R1 = b;
   a = R1;
This looks really easy, but no, C just has to make it UB.
So, how do you suggest this is fixed? Do I now have to do an in-depth analysis of that IL to figure out whether 'b' was initialised at this
point (there might be 100 lines of IL code in-between, including
conditional code and loops). Even if I find out it wasn't, what do I do about it?
Maybe a simpler solution: zero all locals whether necessary or not:
  u64 a = 0, b = 0;
However, the point of using C may be to get a faster program. I don't
want unnecessary assignments which the C compiler may or may not be able
to elide.
Especially when declaring entire arrays or structs:
  struct $B1 dd = {0};
  struct $B1 ee = {0};
In any case, there will be a million other things that are probably UB
as well. How far do you go in trying to appease C?
So, C as an intermediate is undesirable, but the real reason is the compilers. A simple, dumb compiler like bcc or tcc is preferable (but
mine doesn't optimise and is for Windows, and tcc has its own problems).
This is why we needed C--.
So gcc is a more powerful compiler than yours, and that's not fair?
If it is effectively cheating at benchmarks, then no it isn't.
If you take recursive Fibonacci, then fib(N) is expected to execute
about 2*fib(N) function calls (for versions that start 'if (N<3) return
1').
However, using -Os or -O1, gcc's code only does half of those calls. And
with -O2/-O3, only 5%, via aggressive inlining and TCO (tail-call
optimisation).
So, no, that's not a fair comparison.
Fibonacci is supposed to be a
measure of how many calls/second a language implementation can make, and
the figure you'd get with gcc can be misleading.
(It might as well use memoisation, or be clever enough to convert it to
iterative form, then it can report an infinite number of calls per
second. That's really useful!
So, for gcc and Fibonacci, I now use -fno-inline and another to turn off TCO.)
On 20/04/2026 13:45, Bart wrote:
I implement it in a common sense manner.
"Common sense" is another way of saying "I don't know the actual rules".
It is a shame if we are back where we started - because you started out
wrong. You started out treating C like assembly, and you haven't shown
you understand the difference.
The semantics of your language are important to you - but not to C. The
semantics of whatever targets you use are important to the back-end of
the C compiler, but not to the C language or its semantics.
If it is effectively cheating at benchmarks, then no it isn't.
Again - you are asking the wrong questions in your benchmarks.
You /think/ you are asking the car to drive round a loop. But what you
are writing is asking the car to go from A to B. And then you complain
when gcc figures out it can drive directly from A to B without going
through the loop.
If you want to benchmark a compiler going through the whole path, write
that in the code. Force observable behaviour at the start (with a
volatile access), use code lines that depend on that input and previous
lines, and observe the behaviour at the end (with a volatile write, or a
printf, or something else /real/).
int fibonacci(int n)
{
    if (n <= 2) return 1;
    return fibonacci(n - 1) + fibonacci(n - 2);
}
No, I don't expect the generated code to have 2 * fib(n) recursive
calls. I expect the code to give the same results as if it had made
those calls.
If a compiler can optimise in such a way as to reduce the number of
calls, that's great.
So, for gcc and Fibonacci, I now use -fno-inline and another to turn
off TCO.)
And does that give you any kind of information that is useful for any
purpose? I suspect not.
On 20/04/2026 14:02, David Brown wrote:
On 20/04/2026 13:45, Bart wrote:
I implement it in a common sense manner.
"Common sense" is another way of saying "I don't know the actual rules".
It means doing the obvious thing with no unexpected surprises.
It is a shame if we are back where we started - because you started
out wrong. You started out treating C like assembly, and you haven't
shown you understand the difference.
So why should I listen to you, and why should I care?
Bart <bc@freeuk.com> wrote:
On 19/04/2026 11:17, David Brown wrote:
On 18/04/2026 17:08, Bart wrote:
(Yes, LLVM and the tools around it are big. It takes a lot of effort to
make use of them, but you get a lot in return. A "little language" has
to grow to a certain size in numbers of toolchain developers and numbers
of toolchain users before it can make sense to move to LLVM.
Actually lots of small projects use LLVM.
But probably people don't realise it is like installing the engine from
a container ship into your small family car.
AFAICS people are proud of using a powerful engine and tend to ignore the disadvantages.
In a non-C context, a co-worker on a project uses a
"standard" documentation tool to generate tens of megabytes of
HTML documentation. This needs something like 2 min 30 seconds,
and tens of megabytes of extra packages. I wrote a few hundred lines
of code to do almost the same thing but directly. The amount of
specialized code is similar to what is needed to interface with the
external package. My code does the job in about 1.5 sec.
His reaction was essentially: "Why are you wasting time when the
code works". Actually, there were differing assumptions: he
assumed that code will be run once a few months, so performance
would not matter at all. I would run the code as part of build
and test cycle and 2 min 30 seconds per cycle matters a lot.
The external package has a lot of features, for example it
supports tens (maybe hundreds) of color schemes. But we need only
one color scheme.
Anyway, people believe that by using a major "standard" package
they will somehow get superior features.
What's not valid about 'a = b + c'?
It is incomplete. Why do you not use the equivalent function:
int
add(int b, int c) {
return b + c;
}
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and using
its value is UB - the code has no meaning right out of the gate.
When you use "b" in an expression, you are /not/ asking C to read the
bits and bytes stored at the address of the object "b". You are asking
for the /value/ of the object "b". How the compiler gets that value is
up to the compiler - it can read the memory, or use a stored copy in a
register, or use program analysis to know what the value is in some
other way. And if the object "b" does not have a value, you are asking
the impossible.
Try asking a human "You have two numbers, b and c. Add them. What is
the answer?".
You have two slates A and B which someone should have wiped clean then
written a new number on each.
But that part hasn't been done; they each still have an old number from
their last use.
You can still add them together, nothing bad will happen. It just may be
the wrong answer if the purpose of the exercise was to find the sum of
two specific new numbers.
But the purpose may also be to see how good they are at adding. Or at
following instructions.
whatever it happens to be, add the value of 'c' scaled by 8, and store
the result into 'a'. The only things to consider are that some
intermediate results may lose the top bits.
Is 'a = b' equally undefined? If so, then C is even crazier than I'd
thought.
If "a" or "b" are indeterminate, then using them is undefined. I have
two things - are they the same colour? How is that supposed to make sense?
You keep thinking of objects like "b" as a section of memory with a bit
pattern in it. Objects are not that simple in C - C is not assembly.
Why ISN'T it that simple? What ghastly thing would happen if it was?
"b" will be some location in memory or it might be some register, and it
WILL have a value. That value happens to be unknown until it is initialised.
So accessing it will return garbage (unless you know exactly what you
are doing then it may be something useful).
My original example was something like 'a = b + c' (I think in my
language), converted to my IL, then expressed in very low-level C.
You were concerned that in that C, the values weren't initialised. How
would that have affected the code that C compiler generated from that?
You look at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make an exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
Namely, consider a machine where one bit pattern is illegal
and causes an exception at runtime when read from memory by an
integer load. The compiler could "initialize" all otherwise
uninitialized variables with this bit pattern. So accessing an
uninitialised integer variable would cause a runtime exception.
If you look at more complex examples you may see why the rule
allows more efficient code on ordinary machines. Namely,
look at:
void
f() {
bool b;
printf("b is ");
if (b) {
printf("true\n");
}
if (!b) {
printf("false\n");
}
}
A compiler could contain a function called 'known_false' and omit
code for a conditional statement if the condition (in our case 'b')
is known to be false. How could the compiler know this? The simplest
case is when the condition is a constant. But that is the trivial case.
More interesting cases are when some earlier statement assigns a
constant value to 'b'. But a function may contain "interesting"
control flow, so determining which assignments are executed is tricky.
Instead, the compiler probably would use some kind of approximation,
tracking possible values at different program points. Now,
according to your point of view, an uninitialized variable would
mean "any value is possible". According to C rules an uninitialized
variable can not occur in a correct program, which means that
there must be an assignment later, and when analyzing possible values
at the current statement the answer is "no value". In the function
above, consistently propagating information according to your
rules means that in the conditional 'b' can take any value, so the
compiler must emit the code. Using C rules, 'b' has no
value, so it can not be true and the compiler can delete the conditional
(and the same for the conditional involving '!b').
This example
is still pretty simple, so you may think that your rules are
superior.
But imagine that between the declaration of 'b' and the
conditional there is some hairy code. This code initializes
'b' to false, but only if some conditions are satisfied.
Now, consider a situation where in fact 'b' is always initialized,
but the compiler is too limited to see this. Under C rules the
compiler will assume that 'b' is initialized and conclude
that it is false, allowing it to delete the conditional.
Under your rules the compiler would have to consider the possibility
that 'b' is uninitialized and keep the conditional.
So I'm still interested in what possible reason the compiler might have
for generating code that is any different in the absence of
initialisation. Warn about it, sure, but why do anything else?
As explained, under C rules the compiler can generate more efficient
code.
It may do a good job of eliminating duplicate or redundant code. But
maybe you are measuring how well it copes with a certain quantity of
code, which when synthesised may well be duplicate or redundant.
I very much want my compiler backend to eliminate duplicate or
redundant code inserted by the front end.
Then it is not helpful that it discards most of it.
If you use a C compiler as a backend it is quite helpful.
On 20/04/2026 07:25, David Brown wrote:
On 20/04/2026 01:36, Bart wrote:
In C, "b" is not any specific place. In optimising compilers, the
implementation is unlikely to exist at all until it has a real
value,
It should come into existence when it is referenced. Then it will have
a value.
Here for example:
int b;
if (rand()&1) b=0;
printf("%d", b);
'b' may or may not be initialised. But I expect the 'b' used in that
last line to exist somewhere and for the generated code to access that location. I'd also expect the same if the assignment was commented
out.
So gcc is a more powerful compiler than yours, and that's not fair?
If it is effectively cheating at benchmarks, then no it isn't.
If you take recursive Fibonacci, then fib(N) is expected to execute
about 2*fib(N) function calls (for versions that start 'if (N<3)
return 1').
However, using -Os or -O1, gcc's code only does half of those
calls. And with -O2/-O3, only 5%, via aggressive inlining and TCO
(tail-call optimisation).
So, no, that's not a fair comparison. Fibonacci is supposed to be a
measure of how many calls/second a language implementation can make,
and the figure you'd get with gcc can be misleading.
(It might as well use memoisation, or be clever enough to convert it to
iterative form, then it can report an infinite number of calls per
second. That's really useful!
So, for gcc and Fibonacci, I now use -fno-inline and another to turn
off TCO.)
So why should I listen to you, and why should I care?
Bart <bc@freeuk.com> writes:
[...]
So why should I listen to you, and why should I care?
I don't know, why should you?
You obviously care a great deal, or you wouldn't spend so much
time arguing.
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So, for gcc and Fibonacci, I now use -fno-inline and another to turn
off TCO.)
If I write this program:
#include <stdio.h>
int fib(int n) {
if (n <= 1) {
return 1;
}
else {
return fib(n-2) + fib(n-1);
}
}
int main(void) {
printf("%d\n", fib(10));
}
the implementation's job is to generate code that prints "89".
If it's able to do so by replacing the whole thing with `puts("89");`
*that's a good thing*. That's not cheating. That's good code
generation.
If you want to write a benchmark that avoids certain optimizations,
you need to write it carefully so you get the code you want.
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls? And
how many calls were actually made?
the implementation's job is to generate code that prints "89".
In that case, why bother using very slow recursive Fibonacci?
"(Why would recursive Fibonacci even ever be used as a benchmark when
the iterative method would be much faster in every case?)"
I will give you the answer: it is to compare how implementations cope
with very large numbers of recursive function calls. So if one finds a
way to avoid doing such calls, then it is not a fair comparison.
Nobody is interested in the actual output, other than to check it worked
correctly; what matters is how long it took.
On 20/04/2026 14:49, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
You look at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make an exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
I don't care about exotic hardware. I don't see why its needs should
impact the 99.99% (if not 100%) of actual hardware that people use.
It ought to have made more things implementation defined.
Namely, consider a machine where one bit pattern is illegal
and causes an exception at runtime when read from memory by an
integer load. The compiler could "initialize" all otherwise
uninitialized variables with this bit pattern. So accessing an
uninitialised integer variable would cause a runtime exception.
I acknowledge this somewhere, for the case of floating point numbers. A
poor implementation may have problems. But in the case of XMM registers
on x64, they seem to tolerate arbitrary bit patterns used in floating
point operations.
At worst you end up with a NaN result or something.
And obviously, it is inadvisable to dereference an unknown pointer value.
But you can give all this advice, issue warnings etc, and still not
seize upon such UB as an excuse to invalidate the rest of the program or
for a compiler to choose to do whatever it likes.
If you look at more complex examples you may see why the rule
allows more efficient code on ordinary machines. Namely,
look at:
void
f() {
    bool b;
    printf("b is ");
    if (b) {
        printf("true\n");
    }
    if (!b) {
        printf("false\n");
    }
}
A compiler could contain a function called 'known_false' and omit
code for a conditional statement if the condition (in our case 'b')
is known to be false. How could the compiler know this? The simplest
case is when the condition is a constant. But that is the trivial case.
More interesting cases are when some earlier statement assigns a
constant value to 'b'. But a function may contain "interesting"
control flow, so determining which assignments are executed is tricky.
Instead, the compiler probably would use some kind of approximation,
tracking possible values at different program points. Now,
according to your point of view, an uninitialized variable would
mean "any value is possible". According to C rules an uninitialized
variable can not occur in a correct program, which means that
there must be an assignment later, and when analyzing possible values
at the current statement the answer is "no value". In the function
above, consistently propagating information according to your
rules means that in the conditional 'b' can take any value, so the
compiler must emit the code. Using C rules, 'b' has no
value, so it can not be true and the compiler can delete the conditional
(and the same for the conditional involving '!b').
If I apply gcc-O2 to your example, it prints that b is false without
actually testing the value. If I get it to return the value of b, it
returns a hard-coded zero.
This example
is still pretty simple, so you may think that your rules are
superior.
They're certainly simpler. I can't predict what gcc will do. And
whatever it does, can differ depending on options.
It comes down to the user's intention: was the non-initialisation an oversight? Did they know that only one of those conditionals can be true?
My compilers don't try and double-guess the user: they will simply do
what is requested.
So I'm still interested in what possible reason the compiler might have
for generating code that is any different in the absence of
initialisation. Warn about it, sure, but why do anything else?
As explained, under C rules compiler can generate more efficient
code.
A lot of it seems to be for dodgy-looking code. I tend to rely on, and
assume, sensibly written programs. That seems to go a long way!
It may do a good job of eliminating duplicate or redundant code. But
maybe you are measuring how well it copes with a certain quantity of
code, which when synthesised may well be duplicate or redundant.
I very much want my compiler backend to eliminate duplicate or
redundant code inserted by front end.
Then it is not helpful that it discards most of it.
If you use a C compiler as a backend it is quite helpful.
My current transpiled C is full of redundant intermediates like your
example, and such optimising is necessary to get reasonable size and speed.
On 20/04/2026 18:50, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
So why should I listen to you, and why should I care?
I don't know, why should you?
You obviously care a great deal, or you wouldn't spend so much
time arguing.
I first posted this to show how casts are extensively used in my
generated C:
i64 a;
i64 b;
i64 c;
asi64(R1) = b;
asi64(R2) = c;
asi64(R1) += asi64(R2);
a = asi64(R1);
This was generated from this fragment HLL code: "a := b + c". There is
no initialisation because that is rarely done when testing compiler code-generation. Examples are kept as simple as possible, and
initialisation would have absolutely no bearing on the matter.
But somebody said this was UB. Now even though uninitialised variables
are not used in my production programs (AFAIK), I disagreed about this matter.
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
So, for gcc and Fibonacci, I now use -fno-inline and another to turn
off TCO.)
If I write this program:
#include <stdio.h>
int fib(int n) {
if (n <= 1) {
return 1;
}
else {
return fib(n-2) + fib(n-1);
}
}
int main(void) {
printf("%d\n", fib(10));
}
the implementation's job is to generate code that prints "89".
In that case, why bother using very slow recursive Fibonacci?
Presumably the expectation is that it would actually be using recursion.
I already posed this question:
"(Why would recursive Fibonacci even ever be used as a benchmark when
the iterative method would be much faster in every case?)"
I will give you the answer: it is to compare how implementations cope
with very large numbers of recursive function calls. So if one finds a
way to avoid doing such calls, then it is not a fair comparison.
If you want to write a benchmark that avoids certain optimizations,
you need to write it carefully so you get the code you want.
It's not possible to do that with Fibonacci without making it
unrecognisable and so a poor comparison for other reasons.
If testing with gcc now, I'd use these two options:
-fno-inline
-fno-optimize-sibling-calls
On my PC, gcc-O2 code then manages some 560M calls/second running
Fibonacci, rather than a misleading 1270M calls/second.
See also:
https://github.com/drujensen/fib/issues/119
Referenced from: https://github.com/drujensen/fib
On 20/04/2026 19:34, Bart wrote:
And obviously, it is inadvisable to dereference a unknown pointer value.
Okay, so you think it is "obvious" that you should avoid doing some
things that are explicitly UB, and yet you think it is "obvious" that
you should be able to do other types of UB. Who makes up those
"obvious" rules? Why do you think such inconsistency is a good idea?
No, your rules are far from simple - you have internal ideas about what
kinds of UB you think should produce certain results, and which should
not, and how compilers should interpret things that have no meaning in
C. That's not simple.
I can predict what gcc will do,
The compiler can quite reasonably generate all sorts of different code
here. A different version, or a different compiler, or on a different
day, you could get different results. That's life when you use UB.
What else could the non-initialisation have been other than an oversight
- a bug in their code due to ignorance, or just making a mistake as we
all do occasionally? Do you think it is likely that someone
intentionally and knowingly wrote incorrect code?
My compilers don't try and double-guess the user: they will simply do
what is requested.
No, guessing the user's intentions is /exactly/ what your compiler is
trying to do. It is trying to guess what the programmer meant even
though the programmer made a mistake and wrote something that does not
make sense.
A good
compiler will work with sensibly written programs, yet you insist on
writing C that is not sensibly written.
Oh, so you want gcc to optimise away redundant code when your transpiler generates redundant code, but it is "cheating" if it optimises away redundant code in bizarre tests because your C compiler can't do that?
Actually _EVERYBODY_ is interested in the actual output, and NOBODY is interested in how long it took.
The 5 people in the world who think in terms of random irrelevant
benchmarks are the only people who would even think to care.
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:50, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
So why should I listen to you, and why should I care?
I don't know, why should you?
You obviously care a great deal, or you wouldn't spend so much
time arguing.
I first posted this to show how casts are extensively used in my
generated C:
i64 a;
i64 b;
i64 c;
asi64(R1) = b;
asi64(R2) = c;
asi64(R1) += asi64(R2);
a = asi64(R1);
This was generated from this fragment HLL code: "a := b + c". There is
no initialisation because that is rarely done when testing compiler
code-generation. Examples are kept as simple as possible, and
initialisation would have absolutely no bearing on the matter.
But somebody said this was UB. Now even though uninitialised variables
are not used in my production programs (AFAIK), I disagreed about this
matter.
The point is not that "somebody said" that this was UB.
I'm aware of your opinions about this, but will you acknowledge that
the standard actually says what it says? I'm not asking whether
you think the behavior should be undefined. I'm asking whether
you'll acknowledge that the ISO C standard says it's undefined.
Yes or no.
On Thu, 2026-04-16 at 22:14 +0800, wij wrote:
On Thu, 2026-04-16 at 18:42 +0800, wij wrote:
On Wed, 2026-04-15 at 19:04 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
On Wed, 2026-04-15 at 17:14 -0700, Tim Rentsch wrote:
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
Gentlemen,
I understand the natural reaction to want to respond to the kind of
statements being made in this thread. I hope y'all can resist this
natural reaction and not respond to people who persist in making
arguments that are basically isomorphic to saying 1 equals 0.
Thank you for your assistance in this matter.
Maybe you are right. I say A is-a B; one persists in reading it as A is
(exactly) B. I provide help in using assembly; one persists in reading
it as me persuading people to use assembly and give up HLLs. What is
going on here?
You say that C is an assembly language. Nobody here thinks that
you're *equating* C and assembly language. It's obvious that
there are plenty of assembly languages that are not C, and nobody
has said otherwise. I have no idea why you think anyone has that
particular confusion.
At least one person has apparently interpreted your defense of
assembly language (that it isn't as scary as some think it is)
as a claim that we should program in assembly language rather
than in HLLs. You're right, that was a misinterpretation of what
you wrote. I considered mentioning that, but didn't bother.
The issue I've been discussing is your claim that C is an assembly
language. It is not.
If I said C is assembly, it is in the sense that I have at least shown
in the last post (s_tut2.cpp), where even an 'instruction' can be any
function (e.g. change directory, copy files, launch an editor, ...).
And also, what 'computation' is is demonstrated, which includes a
suggestion of what C is (essentially, any program) and in this sense
what an HLL is. Finally, it could demonstrate the meaning of, and
testify to, the Church-Turing thesis (my words: no computation
language, including various kinds of math formulas, can exceed the
expressive power of a TM).
It seems you insist C and assembly have to be exactly what your bible
says. If so, I would say that what the C standard (I cannot read it)
says is the meaning of the terminology in it, not intended to be
anything used in any other situation.
I do not intend to post again in this thread until and unless you
post something substantive on that issue.
(continue)
IMO, the C standard is like a book of legal terms. Like many symbols in
the header file, it defines one symbol in terms of another symbol. The
real meaning is not fixed. The result is that you cannot 'prove' the
correctness of the source program; even consistency is a problem.
'Instruction' is low-level? Yes, by definition, but not as one might
think. An instruction could refer to a processing unit (perhaps like
the x87 math co-processor, which may even be higher level, processing
expressions, ...). A good chance for C is to find a good function that
can be hardwired.
So, the basic feature of an HLL is 'structured' (or 'nested') text,
which removes labels. Semantics is the inventor's imagination. So,
avoid bizarre complexity; it won't add expressive power to the
language, it is just a matter of short or lengthy expression of a
programming idea.
(Continue)
Thus, C is-a language for controlling hardware. Therefore, the term
'portable assembly' seems to fit this meaning. But on the other side, C
needs to be user friendly. Skipping the friendly part, I think there
should be more: C could be the foundation of a formal system
(particularly for academic uses). For example:
Case 1: "Σ(n=1,m) f(n)" should be defined as:
    sum=0;
    for(int n=1; n<=m; ++n) {
      sum+=f(n);
    }
    By doing so, it is easier to deduce things from nested series.
Case 2: What if m=∞ (infinity)?
    for(int n=1; ; ++n) {
      sum+=f(n);
    }
    The infinity case has no consensus. At least, it demonstrates that
    0.999... simply refers to an infinite loop. This leads to the long
    debate of 0.999...=? (0.999... will not terminate BY DEFINITION; no
    finite proof can prove it equals anything unless you define it)...
    And what INF, INFINITY should be in C?
Case 3: Proposition ∀x,P(x) ::= P(x1)∧P(x2)∧..∧P(xn) (x∈{x1,x2,..})
    bool f() {  // f() = "∀x,P(x)"
      for(int x=1; x<=S.size(); ++x) {
        if(P(x)==false) {
          return false;
        }
      }
      return true;
    };
    The universal quantifier itself is also a proposition; therefore,
    from the definition, its negation exists:
    ~Prop(∀x,P(x)) = ~(P(x1)∧P(x2)∧..∧P(xn))
                   = ~P(x1)∨~P(x2)∨..∨~P(xn)
                   = Prop(∃x,~P(x))
    Math/logic has no such clear definition. Multiple quantifiers
    (∀x∃y∀z) and their negations are thus easier to understand and use
    in 'reasoning'.
    Note: This leads to a case: if(a&&b) { /*...*/ }
          I tend to think the omission of the evaluation of b in the
          case a==false is not really optimization. It is a problem of
          the definition of traditional logic.
So, don't make C too bizarre.
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Presumably the expectation is that it would actually be using recursion.
That expectation was not expressed in the code.
"(Why would recursive Fibonacci even ever be used as a benchmark when
the iterative method be much faster in every case?)"
Because you want to measure the speed of function calls, of course.
I will give you the answer: it is to compare how implementations cope
with very large numbers of recursive function calls. So if one finds a
way to avoid doing such calls, then it is not a fair comparison.
Then write the code so the compiler can't eliminate the calls.
You want the compiler to work with one hand tied behind its
metaphorical back for the sake of "fairness". Not gonna happen.
If you ask me to go from point A to point B, if it's a few kilometers
away, I'll probably drive my car. If you intended it to be a
three-legged race, I'm not cheating *if you didn't tell me that*.
If testing with gcc now, I'd use these two options:
-fno-inline
-fno-optimize-sibling-calls
On my PC, gcc-O2 code then manages some 560M calls/second running
Fibonacci, rather than a misleading 1270M calls/second.
It's misleading *to you*, because you (deliberately?) misinterpret
the results.
See also:
https://github.com/drujensen/fib/issues/119
Referenced from: https://github.com/drujensen/fib
Let me ask you a simple question. Given my fibonacci example,
if a compiler compiled it to the equivalent of `puts("89")`, would
that compiler fail to conform to the ISO C standard? If so, why?
Are you able to distinguish between "I dislike this requirement
in the standard" and "I deny that this requirement exists"?
On 20/04/2026 22:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Presumably the expectation is that it would actually be using recursion.
That expectation was not expressed in the code.
Other than clearly using recursion?
"(Why would recursive Fibonacci even ever be used as a benchmark whenBecause you want to measure the speed of function calls, of course.
the iterative method be much faster in every case?)"
That's ... a surprising response.
I assumed both you and David had absolutely no interest in such
matters and were not sympathetic to those who did.
The naive fib() benchmark tells me it can achieve 1.27 billion fib()
calls per second on my PC. Great!
In that case, I should also get 1.27 billion calls/second when I run
the fib1/fib2/fib3 version.
But, it doesn't; I get less than half that throughput. What's gone wrong?
According to you, gcc code should be able to have that throughput; why doesn't it?
Let me ask you a simple question. Given my fibonacci example,
if a compiler compiled it to the equivalent of `puts("89")`, would
that compiler fail to conform to the ISO C standard? If so, why?
Are you able to distinguish between "I dislike this requirement
in the standard" and "I deny that this requirement exists"?
I don't understand. I assume you know the answer, that a C compiler
can do whatever it likes (including emailing your source to a human
accomplice and having them mail back a cheat like this).
My problem is doing fair comparisons between implementations doing the
same task using the same algorithm. And in the case of recursive
fibonacci, I showed above that the naive gcc results are unreliable.
Bart <bc@freeuk.com> wrote:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls? And
how many calls were actually made?
Implementation that uses 0 instructions to implement function call
on normal machine will get shorter runtime, so clearly is faster
at doing function calls. This does not differ much from what
some modern processors do, namely a move instruction may effectively
take 0 cycles. People used to the old ways, when confronted with
movq %rax, %rdx
expect that there will be actual movement of data, that the instruction
must travel the whole CPU pipeline. But modern processors do
register renaming, and after looking at this instruction may
simply note that to get the value of %rdx one uses the place storing
%rax (I am using AT&T convention, so the direction is from %rax to
%rdx) and otherwise drop the instruction. Is the processor
cheating? A naive benchmark where moves are overrepresented may
execute unexpectedly fast, but moves are frequent in real
programs, so this gives a valuable speedup for all programs.
Coming back to function calls, consider a programmer who cares
very much about speed. He knows that his program would be
simpler and easier to write if he used a lot of small
functions. In the old days he would worry about the cost of
function calls and he probably would write much bigger and more
complicated functions to get good speed. But if the cost of a
function call is 0 he can freely use small functions, without
worrying about the cost of calls.
I will give you the answer: it is to compare how implementations cope
with very large numbers of recursive function calls. So if one finds a
way to avoid doing such calls, then it is not a fair comparison.
Well, Fibonacci and similar functions have limited use.
So the
real question is what the cost of function calls is in actual
programs. For calls to a small non-recursive function the cost is
close to 0. Recursion makes optimization more tricky,
so it increases the cost. But still, in practice the cost is lower than
one could naively expect.
Concerning fairness, AFAIK gcc optimizations were developed to
speed up real programs. They speed up Fibonacci basically as
a side effect.
So IMO it is fair: a compiler that cannot speed
up calls in Fibonacci probably will have trouble speeding up
calls at least in some real programs.
scott@slp53.sl.home (Scott Lurndal) writes:
[...]
Actually _EVERYBODY_ is interested in the actual output, and NOBODY is
interested in how long it took.
The 5 people in the world that think in terms of random irrelevant
benchmarks are the only people who would even think to care.
I think that's an extreme exaggeration. Plenty of people are
interested in benchmarks. The TOP500 project, for example, ranks supercomputers on the basis of their performance on specific
benchmarks.
Of course those benchmarks are carefully written to avoid optimizing
away the code that's being measured.
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls? And
how many calls were actually made?
So, for gcc and Fibonacci, I now use -fno-inline and another to turn
off TCO.)
If I write this program:
#include <stdio.h>
int fib(int n) {
    if (n <= 1) {
        return 1;
    }
    else {
        return fib(n-2) + fib(n-1);
    }
}
int main(void) {
    printf("%d\n", fib(10));
}
the implementation's job is to generate code that prints "89".
In that case, why bother using very slow recursive Fibonacci?
Presumably the expectation is that it would actually be using recursion.
I already posed this question:
"(Why would recursive Fibonacci even ever be used as a benchmark when
the iterative method be much faster in every case?)"
I will give you the answer: it is to compare how implementations cope
with very large numbers of recursive function calls. So if one finds a
way to avoid doing such calls, then it is not a fair comparison.
Nobody is interested in the actual output, other than checking that it
worked correctly, but in how long it took.
If it's able to do so by replacing the whole thing with `puts("89");`
*that's a good thing*. That's not cheating. That's good code
generation.
What gcc-O2/-O3 actually does is to take the 5 lines of the Fibonacci function in C, which normally generates 25 lines of assembly, and turn
it into 270 lines of assembly.
Imagine such a ten-fold explosion in code size across a whole program,
for some tiny function which might not even ever be called as far as it knows. It's a little suspect; why these 5 lines over a 100Kloc program
for example?
If you want to write a benchmark that avoids certain optimizations,
you need to write it carefully so you get the code you want.
It's not possible to do that with Fibonacci without making it
unrecognisable and so a poor comparison for other reasons.
If testing with gcc now, I'd use these two options:
 -fno-inline
 -fno-optimize-sibling-calls
On my PC, gcc-O2 code then manages some 560M calls/second running
Fibonacci, rather than a misleading 1270M calls/second.
See also:
https://github.com/drujensen/fib/issues/119
Referenced from: https://github.com/drujensen/fib
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
On 20/04/2026 22:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Presumably the expectation is that it would actually be using recursion.
That expectation was not expressed in the code.
Other than clearly using recursion?
"(Why would recursive Fibonacci even ever be used as a benchmark when
the iterative method be much faster in every case?)"
Because you want to measure the speed of function calls, of course.
That's ... a surprising response.
I assumed both you and David had absolutely no interest in such matters
and were not sympathetic to those who did.
On 20/04/2026 21:57, David Brown wrote:
On 20/04/2026 19:34, Bart wrote:
And obviously, it is inadvisable to dereference a unknown pointer value.
Okay, so you think it is "obvious" that you should avoid doing some
things that are explicitly UB, and yet you think it is "obvious" that
you should be able to do other types of UB. Who makes up those
"obvious" rules? Why do you think such inconsistency is a good idea?
Common sense? Reading the contents of a variable /within/ your program
is harmless. Now try reading from a random memory location that may
be outside your program, or try writing somewhere within it.
You really think they are comparable?
No, your rules are far from simple - you have internal ideas about
what kinds of UB you think should produce certain results, and which
should not, and how compilers should interpret things that have no
meaning in C. That's not simple.
I don't have any ideas about UB at all. So long as a program is valid, I
will translate it. I perform very few transformations, and I rarely
elide code, or only on a small scale.
My language works how a lot of people think C works. Maybe how they
wished it worked.
I can predict what gcc will do,
And yet you say:
The compiler can quite reasonably generate all sorts of different code
here. A different version, or a different compiler, or on a different
day, you could get different results. That's life when you use UB.
What else could the non-initialisation have been other than an
oversight - a bug in their code due to ignorance, or just making a
mistake as we all do occasionally? Do you think it is likely that
someone intentionally and knowingly wrote incorrect code?
I write such code hundreds of times a day. It is rarely run.
I might also write such code in my dynamic language. But there,
executing this program:
   a := b + c
generates a runtime error: '+' is not defined between 'void' types.
Here, variables are automatically initialised to 'void' when they come
into existence.
My systems language is lower level. I had thought about zeroing the
stack frame on function entry (and did once try this with C), but I
decided not to do that.
A correctly written program shouldn't need that (although it would be convenient if guaranteed and could be useful to get repeatable results
if debugging).
My compilers don't try and double-guess the user: they will simply do
what is requested.
No, guessing the user's intentions is /exactly/ what your compiler is
trying to do. It is trying to guess what the programmer wrote even
though the programmer made a mistake and wrote something that does not
make sense.
It is not making any guesses at all. It is faithfully translating the
user's code without making any judgements so long as it is valid.
It is GCC which takes an entire function body or even an entire static
function and vanishes it out of existence. And where the output depends
on the combined effects of dozens of options.
A good compiler will work with sensibly written programs, yet you
insist on writing C that is not sensibly written.
IMO it is fine.
If a problem were to come up, then I would adjust the
code generator to fix it.
Oh, so you want gcc to optimise away redundant code when your
transpiler generates redundant code, but it is "cheating" if it
optimises away redundant code in bizarre tests because your C compiler
can't do that?
Yes, sometimes the redundant code is the very thing you are trying to measure.
Who'd have thought that?
But sometimes what a compiler thinks is redundant can be surprising.
You seem to think that just writing "int a;" somehow creates an int[...]
object called "a" along with a slot on the stack or a dedicated
register. That does not happen in many compilers. And it does not
happen in the C semantics. "a" is an lvalue that /potentially/
designates an object - for an uninitialised local variable, it does
not designate an object until a value is assigned.
Bart <bc@freeuk.com> writes:
On 20/04/2026 22:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Presumably the expectation is that it would actually be using recursion.
That expectation was not expressed in the code.
Other than clearly using recursion?
The C source code specifies the behavior of the program. Recursion
is not behavior. A recursive C program does not require a compiler
to generate a recursive executable, any more than a function call
requires it to generate a "call" instruction.
"(Why would recursive Fibonacci even ever be used as a benchmark whenBecause you want to measure the speed of function calls, of course.
the iterative method be much faster in every case?)"
That's ... a surprising response.
I assumed both you and David had absolutely no interest in such
matters and were not sympathetic to those who did.
I have no idea how you reached that conclusion.
[...]
The naive fib() benchmark tells me it can achieve 1.27 billion fib()
calls per second on my PC. Great!
In that case, I should also get 1.27 billion calls/second when I run
the fib1/fib2/fib3 version.
But, it doesn't; I get less than half that throughput. What's gone wrong?
According to you, gcc code should be able to have that throughput; why
doesn't it?
Where did I say that? To be clear, when I said that a compiler
could transform my Fibonacci program into just puts("89"), I did
not suggest that gcc actually does so.
[...]
Let me ask you a simple question. Given my fibonacci example,
if a compiler compiled it to the equivalent of `puts("89")`, would
that compiler fail to conform to the ISO C standard? If so, why?
Are you able to distinguish between "I dislike this requirement
in the standard" and "I deny that this requirement exists"?
I don't understand. I assume you know the answer, that a C compiler
can do whatever it likes (including emailing your source to a human
accomplice and having them mail back a cheat like this).
So you agree that optimizing the program to just puts("89") is valid.
My problem is doing fair comparisons between implementations doing the
same task using the same algorithm. And in the case of recursive
fibonacci, I showed above that the naive gcc results are unreliable.
Your problem, apparently, is that you make some bad assumptions about
how to compare and measure performance. You assume that the mapping
from source code to machine code is, or should be, straightforward
enough that you can know how many CALL instructions will be executed.
You get performance numbers that are obviously absurd because some
calls are legitimately optimized away. And even though you now
acknowledge that the optimizations that break your measurements
are legitimate, you still blame the compiler rather than your own
benchmark code.
On 21/04/2026 00:03, Bart wrote:
It is GCC which takes an entire function body or even an entire static
function and vanishes it out of existence. And where the output
depends on the combined effects of dozens of options.
Code that does nothing can be eliminated - that's a /good/ thing.
Static functions that are not used within a file can never be used in
the program - eliminating them is a /good/ thing. Different users have
a wide variety of different needs from their tools, so GCC has a large
number of options. That's a /good/ thing.
On 20/04/2026 22:13, Bart wrote:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
Why do you care?
It's not possible to do that with Fibonacci without making it
unrecognisable and so a poor comparison for other reasons.
Sure it is.
__attribute__((noinline))
If testing with gcc now, I'd use these two options:
 -fno-inline
 -fno-optimize-sibling-calls
On 21/04/2026 02:22, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 22:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Presumably the expectation is that it would actually be using recursion.
That expectation was not expressed in the code.
Other than clearly using recursion?
The C source code specifies the behavior of the program. Recursion
is not behavior. A recursive C program does not require a compiler
to generate a recursive executable, any more than a function call
requires it to generate a "call" instruction.
"(Why would recursive Fibonacci even ever be used as a benchmark when >>>>> the iterative method be much faster in every case?)"Because you want to measure the speed of function calls, of course.
That's ... a surprising response.
I assumed both you and David had absolutely no interest in such
matters and were not sympathetic to those who did.
I have no idea how you reached that conclusion.
[...]
The naive fib() benchmark tells me it can achieve 1.27 billion fib()
calls per second on my PC. Great!
In that case, I should also get 1.27 billion calls/second when I run
the fib1/fib2/fib3 version.
But, it doesn't; I get less than half that throughput. What's gone
wrong?
According to you, gcc code should be able to have that throughput; why
doesn't it?
Where did I say that? To be clear, when I said that a compiler
could transform my Fibonacci program into just puts("89"), I did
not suggest that gcc actually does so.
I questioned the apparent throughput of gcc's 1.27B call/second, and you said:
"It's misleading *to you*, because you (deliberately?) misinterpret
the results."
So what does that mean, that you agree with that result (1.27B) or that
you don't?
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of calls
when displayed at the end (some 500M for fib(42)). But it's wrong!
Your schtick seems to be whether a program is conforming or not, and
you seem to care nothing about practicalities.
Mine, in this case, is comparing language implementations' abilities to achieve a certain throughput of function calls.
I claim that gcc-O3 is giving a misleading result because it is not
executing the task I expect, which is to calculate fib(N) by explicitly
doing every one of the necessary 2*fib(N)-1 calls.
On 21/04/2026 09:01, David Brown wrote:
On 20/04/2026 22:13, Bart wrote:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
Why do you care?
I care because, as you know, I sometimes work on compilers, specifically ones that do not have a formal optimiser.
And I'm interested in how far such compilers can be pushed.
Normally I don't pay much heed to micro-benchmarks, because it is so
easy for a big compiler to optimise a tiny program of a few dozen lines
to nothing.
What counts is what happens with real applications or real libraries.
There I compete favourably.
It's not possible to do that with Fibonacci without making it
unrecognisable and so a poor comparison for other reasons.
Sure it is.
__attribute__((noinline))
I use these to get a more valid result:
If testing with gcc now, I'd use these two options:
ÿÿ-fno-inline
ÿÿ-fno-optimize-sibling-calls
This is now used in my survey, which still includes the cheating
versions. People can make up their own minds as to how valid they are.
David Brown <david.brown@hesbynett.no> writes:
[...]
You seem to think that just writing "int a;" somehow creates an int[...]
object called "a" along with a slot on the stack or a dedicated
register. That does not happen in many compilers. And it does not
happen in the C semantics. "a" is an lvalue that /potentially/
designates an object - for an uninitialised local variable, it does
not designate an object until a value is assigned.
I don't think that's correct. Within the scope of a declaration `int
a;` the expression `a` is an lvalue that *does* designate an object. If
that expression undergoes "lvalue conversion", the operation that
fetches the stored value, the behavior is undefined if a is
uninitialized.
The word "potentially (which was my idea, BTW) means that given:
int *p = NULL;
the expression *p is an lvalue, even if it doesn't currently designate
an object. C90 and C99 both messed up the definition of "lvalue" in different ways.
On 21/04/2026 12:01, Bart wrote:
So what does that mean, that you agree with that result (1.27B) or
that you don't?
Keith will have to answer for himself. And I have not tried to
replicate your tests. But if you have got that number by timing gcc's
calculations and dividing it by 2 * fib(n) - 1, and you think it shows
the number of assembly "call" instructions, then that sounds a great
deal like misinterpreting the results.
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of calls
when displayed at the end (some 500M for fib(42)). But it's wrong!
Why would you think that adding a counter would tell you the number of
actual assembly "call" instructions? It can tell you the number of
logical C function calls made, but not the number of assembly calls.
It would be very helpful here if you made the distinction between those
two meanings of "function call".
Your schtick seems to be whether a program is conforming or not, and
you seem to care nothing about practicalities.
(Again, I am speaking for myself, not for Keith.) I'm fine with code
that is not conforming to standard C, as long as it is conforming to the
implementation you are using - and as long as you are clear that it is
not standard C. So if you were using uninitialised variables and
compiling with "gcc -ftrivial-auto-var-init=zero", that's okay. I
consider correct code practically useful, and incorrect code practically
useless. But "correct code" does not necessarily mean fully portable
code relying purely on the semantics of C in the standard.
Mine, in this case, is comparing language implementations' abilities
to achieve a certain throughput of function calls.
You keep claiming that. People keep telling you that you are failing to
do so - and you know yourself that you are failing to do so.
Your expectations are unreasonable and are at odds with what you are
doing. The way out of this hole you have dug for yourself is either to
change your expectations, or change your benchmarking methodology.
Note 2: I believe these figures are suspect because the requisite
number of calls are not done.
On 21/04/2026 12:44, David Brown wrote:
On 21/04/2026 12:01, Bart wrote:
So what does that mean, that you agree with that result (1.27B) or
that you don't?
Keith will have to answer for himself. And I have not tried to
replicate your tests. But if you have got that number by timing gcc's
calculations and dividing it by 2 * fib(n) - 1, and you think it shows
the number of assembly "call" instructions, then that sounds a great
deal like misinterpreting the results.
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of
calls when displayed at the end (some 500M for fib(42)). But it's wrong!
Why would you think that adding a counter would tell you the number of
actual assembly "call" function calls?
Did you miss the bit where I said it's wrong?
It can tell you the number of logical C function calls made, but not
the number of assembly calls.
I've measured the number of assembly calls too. By injecting, within the assembly, an increment of the count at every place where it calls 'fib'. That's how I discovered that with -O1 it only does 50% of the calls, and with -O3 only 5%.
I gave this link to someone doing a similar analysis:
https://github.com/drujensen/fib/issues/119
which everyone has conveniently ignored.
It would be very helpful here if you made the distinction between
those two meanings of "function call".
In the case of native x64 code, it is counting the number of times
'CALL' is executed.
Your schtick seems to whether a program is conforming or not and seem
to care nothing about practicalities.
(Again, I am speaking for myself, not for Keith.) I'm fine with code
that is not conforming to standard C, as long as it is conforming to
the implementation you are using - and as long as you are clear that
it is not standard C. So if you were using uninitialised variables
and compiling with "gcc -ftrivial-auto-var-init=zero", that's okay. I
consider correct code practically useful, and incorrect code
practically useless. But "correct code" does not necessarily mean
fully portable code relying purely on the semantics of C in the standard.
Mine, in this case, is comparing language implementations' abilities
to achieve a certain throughput of function calls.
You keep claiming that. People keep telling you that you are failing
to do so - and you know yourself that you are failing to do so.
You snipped my chart. I'm pretty sure that all of those interpreted
timings do the right number of calls, and likely the JIT ones too.
As well as the C timings for Pico C, Tiny C, bcc, and gcc-O0.
It is gcc-O1 and above that are the outliers.
All I'm interested in here is comparing a range of languages with an
attainable upper limit, not a fantasy one.
Your expectations are unreasonable and are at odds with what you are
doing. The way out of this hole you have dug for yourself is either
to change your expectations, or change your benchmarking methodology.
The methodology is fine. I just need to exclude the outliers. That
would be those C ones, and also the slowest.
On Tue, 21 Apr 2026 12:12:28 +0100
Bart <bc@freeuk.com> wrote:
Note 2: I believe these figures are suspect because the requisite
number of calls are not done.
I don't see anything suspect in the -O1 code generated by gcc 14.2.0
Source:
unsigned long long fib(unsigned long long n)
{
if (n < 2)
return 1;
return fib(n-1)+fib(n-2);
}
$ gcc -S -O1 -Wall -Wextra -fno-asynchronous-unwind-tables ref0_fib.c
$ cat ref0_fib.s
.file "ref0_fib.c"
.text
.globl fib
.def fib; .scl 2; .type 32; .endef
fib:
movl $1, %eax
cmpq $1, %rcx
jbe .L5
pushq %rsi
pushq %rbx
subq $40, %rsp
movq %rcx, %rbx
leaq -1(%rcx), %rcx
call fib
movq %rax, %rsi
leaq -2(%rbx), %rcx
call fib
addq %rsi, %rax
addq $40, %rsp
popq %rbx
popq %rsi
ret
.L5:
ret
.ident "GCC: (Rev2, Built by MSYS2 project) 14.2.0"
Measured with n=43 on my very old home desktop it gave:
1402817465/2.646 s = 530165330.7 calls/sec
On 21/04/2026 09:53, David Brown wrote:
On 21/04/2026 00:03, Bart wrote:
I took a program ll.c (which is Lua source code in one file, so the
compiler can see the whole program), and replaced the body of main()
with 'exit(0)'. So none of the functions are called. I got these results:
c:\cx>gcc -s -Os ll.c # optimise for size
c:\cx>dir a.exe
21/04/2026 11:10 241,152 a.exe
c:\cx>bcc ll
Compiling ll.c to ll.exe
c:\cx>dir ll.exe
21/04/2026 11:11 237,056 ll.exe
Somehow my bcc-compiled version generated a smaller executable!
Bart <bc@freeuk.com> writes:
On 21/04/2026 09:53, David Brown wrote:
On 21/04/2026 00:03, Bart wrote:
I took a program ll.c (which is Lua source code in one file, so the
compiler can see the whole program), and replaced the body of main()
with 'exit(0)'. So none of the functions are called. I got these results:
c:\cx>gcc -s -Os ll.c # optimise for size
c:\cx>dir a.exe
21/04/2026 11:10 241,152 a.exe
c:\cx>bcc ll
Compiling ll.c to ll.exe
c:\cx>dir ll.exe
21/04/2026 11:11 237,056 ll.exe
Somehow my bcc-compiled version generated a smaller executable!
Ah, back to the on-disk executable size. An irrelevant
metric. One might expect the 'in-memory' size is what's interesting,
and that's what the -Os option is designed to minimize, not the
number of disk sectors consumed by the executable file.
You might wish to compare the text section sizes,
On 21/04/2026 14:48, Bart wrote:
I gave this link to someone doing a similar analysis:
https://github.com/drujensen/fib/issues/119
which everyone has conveniently ignored.
I can't answer for "everyone", but I rarely follow links posted on
Usenet. I am interested in your opinions and answers - a third party's
opinions on a fourth party's project is not typically helpful. But
since you insist, I have looked at that page. Have you? The project
author and other posters agree that optimisations are not "cheating",
and question the realism of fibonacci as a benchmark.
It would be very helpful here if you made the distinction between
those two meanings of "function call".
In the case of native x64 code, it is counting the number of times
'CALL' is executed.
Okay. That is, of course, not a meaningful number as far as C (or any
other language, other than assembly) is concerned. It can be a somewhat
meaningful value for comparing implementations - where a smaller number
of "CALL" instructions in the compilation indicates a better
implementation.
You snipped my chart. I'm pretty sure that all of those interpreted
timings do the right number of calls, and likely the JIT ones too.
There is no "right" number of calls.
As well as the C timings for Pico C, Tiny C, bcc, and gcc-O0.
It is gcc-O1 and above that are the outliers.
So gcc, when optimising, does a better job of optimising than these
small C compilers? Is that a surprise to anyone?
On 21/04/2026 15:27, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/04/2026 09:53, David Brown wrote:
On 21/04/2026 00:03, Bart wrote:
I took a program ll.c (which is Lua source code in one file, so the
compiler can see the whole program), and replaced the body of main()
with 'exit(0)'. So none of the functions are called. I got these results:
c:\cx>gcc -s -Os ll.c # optimise for size
c:\cx>dir a.exe
21/04/2026 11:10 241,152 a.exe
c:\cx>bcc ll
Compiling ll.c to ll.exe
c:\cx>dir ll.exe
21/04/2026 11:11 237,056 ll.exe
Somehow my bcc-compiled version generated a smaller executable!
Ah, back to the on-disk executable size. An irrelevant
metric. One might expect the 'in-memory' size is what's interesting,
and that's what the -Os option is designed to minimize, not the
number of disk sectors consumed by the executable file.
You don't think there's a correlation between the size of code and
initialised data, and the size of the executable?
You might wish to compare the text section sizes,
Both text and initialised data will take up valuable memory.
On 21/04/2026 13:55, Michael S wrote:
On Tue, 21 Apr 2026 12:12:28 +0100
Bart <bc@freeuk.com> wrote:
Note 2: I believe these figures are suspect because the requisite
number of calls are not done.
I don't see anything suspect in the -O1 code generated by gcc 14.2.0
Source:
unsigned long long fib(unsigned long long n)
{
if (n < 2)
return 1;
return fib(n-1)+fib(n-2);
}
$ gcc -S -O1 -Wall -Wextra -fno-asynchronous-unwind-tables
ref0_fib.c $ cat ref0_fib.s
.file "ref0_fib.c"
.text
.globl fib
.def fib; .scl 2; .type 32; .endef
fib:
movl $1, %eax
cmpq $1, %rcx
jbe .L5
pushq %rsi
pushq %rbx
subq $40, %rsp
movq %rcx, %rbx
leaq -1(%rcx), %rcx
call fib
movq %rax, %rsi
leaq -2(%rbx), %rcx
call fib
addq %rsi, %rax
addq $40, %rsp
popq %rbx
popq %rsi
ret
.L5:
ret
.ident "GCC: (Rev2, Built by MSYS2 project) 14.2.0"
Measured with n=43 on my very old home desktop it gave:
1402817465/2.646 s = 530165330.7 calls/sec
You're right. It was either a different version or I was mistaken.
But it seems that Clang -O1 will generate a version with only a
single fib call. This is the godbolt code for the Fib() version using
"if (n < 3) return 1":
fib:
pushq %r14
pushq %rbx
pushq %rax
movl %edi, %r14d
xorl %ebx, %ebx
cmpl $3, %r14d
jl .LBB0_3
.LBB0_2:
leal -1(%r14), %edi
callq fib
addl %eax, %ebx
addl $-2, %r14d
cmpl $3, %r14d
jge .LBB0_2
.LBB0_3:
incl %ebx
movl %ebx, %eax
addq $8, %rsp
popq %rbx
popq %r14
retq
If I inject an increment to a global counter just after that callq
fib, then it shows only half the expected value.
(This fib version is one-based, so that fib(10) is 55, while yours I
think has it as 89. Google tells me that Fibonacci(10) is 55.)
On 21/04/2026 02:22, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 22:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Presumably the expectation is that it would actually be using recursion.
That expectation was not expressed in the code.
Other than clearly using recursion?
The C source code specifies the behavior of the program. Recursion
is not behavior. A recursive C program does not require a compiler
to generate a recursive executable, any more than a function call
requires it to generate a "call" instruction.
"(Why would recursive Fibonacci even ever be used as a benchmark when
the iterative method would be much faster in every case?)"
Because you want to measure the speed of function calls, of course.
That's ... a surprising response.
I assumed both you and David had absolutely no interest in such
matters and were not sympathetic to those who did.
I have no idea how you reached that conclusion.
[...]
The naive fib() benchmark tells me it can achieve 1.27 billion fib()
calls per second on my PC. Great!
In that case, I should also get 1.27 billion calls/second when I run
the fib1/fib2/fib3 version.
But, it doesn't; I get less than half that throughput. What's gone wrong?
According to you, gcc code should be able to have that throughput; why
doesn't it?
Where did I say that? To be clear, when I said that a compiler
could transform my Fibonacci program into just puts("89"), I did
not suggest that gcc actually does so.
I questioned the apparent throughput of gcc's 1.27B call/second, and
you said:
"It's misleading *to you*, because you (deliberately?) misinterpret
the results."
So what does that mean, that you agree with that result (1.27B) or
that you don't?
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of calls
when displayed at the end (some 500M for fib(42)). But it's wrong!
[...]
Let me ask you a simple question. Given my fibonacci example,
if a compiler compiled it to the equivalent of `puts("89")`, would
that compiler fail to conform to the ISO C standard? If so, why?
Are you able to distinguish between "I dislike this requirement
in the standard" and "I deny that this requirement exists"?
I don't understand. I assume you know the answer, that a C compiler
can do whatever it likes (including emailing your source to a human
accomplice and having them mail back a cheat like this).
So you agree that optimizing the program to just puts("89") is
valid.
My problem is doing fair comparisons between implementations doing the
same task using the same algorithm. And in the case of recursive
fibonacci, I showed above that the naive gcc results are unreliable.
Your problem, apparently, is that you make some bad assumptions about
how to compare and measure performance. You assume that the mapping
from source code to machine code is, or should be, straightforward
enough that you can know how many CALL instructions will be executed.
You get performance numbers that are obviously absurd because some
calls are legitimately optimized away. And even though you now
acknowledge that the optimizations that break your measurements
are legitimate, you still blame the compiler rather than your own
benchmark code.
I like how you ignore every single one of my points.
Your schtick seems to be whether a program is conforming or not, and
you seem to care nothing about practicalities.
Mine, in this case, is comparing language implementations' abilities
to achieve a certain throughput of function calls.
I claim that gcc-O3 is giving a misleading result because it is not
executing the task I expect, which is to calculate fib(N) by
explicitly doing every one of the necessary 2*fib(N)-1 calls.
On 21/04/2026 12:01, Bart wrote:
[...]
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of
calls when displayed at the end (some 500M for fib(42)). But it's
wrong!
Why would you think that adding a counter would tell you the number of
actual assembly "call" function calls? It can tell you the number of
logical C function calls made, but not the number of assembly calls.
On 21/04/2026 14:43, David Brown wrote:
On 21/04/2026 14:48, Bart wrote:
I gave this link to someone doing a similar analysis:
https://github.com/drujensen/fib/issues/119
which everyone has conveniently ignored.
I can't answer for "everyone", but I rarely follow links posted on
Usenet. I am interested in your opinions and answers - a third
party's opinions on a fourth party's project is not typically
helpful. But since you insist, I have looked at that page. Have
you? The project author and other posters agree that optimisations
are not "cheating", and question the realism of fibonacci as a benchmark.
Literally the title of the page contains the word "cheating". And the
person maintaining the benchmarks says:
"I am open to suggestions on how to improve the fairness of the benchmark."
So the question of cheating and fairness has been raised. Some suggest
a separate category for optimised code. Some suggest using flags as I
have done. Some agree with you that optimisation should not be
restricted.
I think Fibonacci is a good benchmark for languages that don't cheat by
avoiding doing the full quota of 2*fib(N)-1 calls.
I'm not going to dump a useful tool that works fine in dozens of
implementations just because you say so.
According to you, obviously L3 is the winner because of its superior
optimiser! No red flags at all.
You snipped my chart. I'm pretty sure that all of those interpreted
timings do the right number of calls, and likely the JIT ones too.
There is no "right" number of calls.
As well as the C timings for Pico C, Tiny C, bcc, and gcc-O0.
It is gcc-O1 and above that are the outliers.
So gcc, when optimising, does a better job of optimising than these
small C compilers? Is that a surprise to anyone?
Below I've posted two programs that both evaluate fib(42).
The first uses the regular function, the second splits it into three functions across three sources, that all call each other.
They are built like this, using gcc 14.1.0 on Windows:
   gcc -O3 fib.c -o fib
   gcc -O3 fib1.c fib2.c fib3.c -o fib123
Timings are:
   fib:      0.418 seconds
   fib123:   0.857
Both do the same thing, do the same number of calls (right?) but one is
twice as fast. Why is that?
If I try it with bcc:
   fib:      1.20 seconds
   fib123:   1.16
It's consistent. I did one more test, which was to combine fib1/2/3 into
one file fib4.c. This was the result:
   fib4:     0.149 seconds
WTF is going on?! This is nearly 3 times as fast as that 0.42 seconds,
which was already cheating IMV. And it is 6 times as fast as having the
code in separate files.
On Tue, 21 Apr 2026 14:49:58 +0100
Bart <bc@freeuk.com> wrote:
On 21/04/2026 13:55, Michael S wrote:
On Tue, 21 Apr 2026 12:12:28 +0100
Bart <bc@freeuk.com> wrote:
Note 2: I believe these figures are suspect because the requisite
number of calls are not done.
I don't see anything suspect in the -O1 code generated by gcc 14.2.0
Source:
unsigned long long fib(unsigned long long n)
{
if (n < 2)
return 1;
return fib(n-1)+fib(n-2);
}
$ gcc -S -O1 -Wall -Wextra -fno-asynchronous-unwind-tables
ref0_fib.c $ cat ref0_fib.s
.file "ref0_fib.c"
.text
.globl fib
.def fib; .scl 2; .type 32; .endef
fib:
movl $1, %eax
cmpq $1, %rcx
jbe .L5
pushq %rsi
pushq %rbx
subq $40, %rsp
movq %rcx, %rbx
leaq -1(%rcx), %rcx
call fib
movq %rax, %rsi
leaq -2(%rbx), %rcx
call fib
addq %rsi, %rax
addq $40, %rsp
popq %rbx
popq %rsi
ret
.L5:
ret
.ident "GCC: (Rev2, Built by MSYS2 project) 14.2.0"
Measured with n=43 on my very old home desktop it gave:
1402817465/2.646 s = 530165330.7 calls/sec
You're right. It was either a different version or I was mistaken.
But it seems that Clang -O1 will generate a version with only a
single fib call. This is the godbolt code for the Fib() version using
"if (n < 3) return 1":
fib:
pushq %r14
pushq %rbx
pushq %rax
movl %edi, %r14d
xorl %ebx, %ebx
cmpl $3, %r14d
jl .LBB0_3
.LBB0_2:
leal -1(%r14), %edi
callq fib
addl %eax, %ebx
addl $-2, %r14d
cmpl $3, %r14d
jge .LBB0_2
.LBB0_3:
incl %ebx
movl %ebx, %eax
addq $8, %rsp
popq %rbx
popq %r14
retq
If I inject an increment to a global counter just after that callq
fib, then it shows only half the expected value.
(This fib version is one-based, so that fib(10) is 55, while yours I
think has it as 89. Google tells me that Fibonacci(10) is 55.)
That looks like tail call elimination. I.e. compiler turned the code
into:
unsigned long long fib(unsigned long long n)
{
unsigned long long res = 0;
while (n >= 3) {
res += fib(n-1);
n -= 2;
}
return res + 1;
}
gcc generates similar code with -O -foptimize-sibling-calls
For certain styles of coding, e.g. one often preferred by Tim Rentsch,
this optimization is extremely important.
Bart <bc@freeuk.com> writes:
On 21/04/2026 09:53, David Brown wrote:
On 21/04/2026 00:03, Bart wrote:
I took a program ll.c (which is Lua source code in one file, so the
compiler can see the whole program), and replaced the body of main()
with 'exit(0)'. So none of the functions are called. I got these
results:
c:\cx>gcc -s -Os ll.c # optimise for size
c:\cx>dir a.exe
21/04/2026 11:10 241,152 a.exe
c:\cx>bcc ll
Compiling ll.c to ll.exe
c:\cx>dir ll.exe
21/04/2026 11:11 237,056 ll.exe
Somehow my bcc-compiled version generated a smaller executable!
Ah, back to the on-disk executable size. An irrelevant
metric. One might expect the 'in-memory' size is what's interesting,
and that's what the -Os option is designed to minimize, not the
number of disk sectors consumed by the executable file.
You might wish to compare the text section sizes, but I suspect
that's like telling an apprentice to go fetch a left-handed pipe
wrench, as the size of the text doesn't necessarily correlate
with performance.
On 21/04/2026 14:43, David Brown wrote:
On 21/04/2026 14:48, Bart wrote:
I gave this link to someone doing a similar analysis:
https://github.com/drujensen/fib/issues/119
which everyone has conveniently ignored.
I can't answer for "everyone", but I rarely follow links posted on
Usenet. I am interested in your opinions and answers - a third
party's opinions on a fourth party's project is not typically
helpful. But since you insist, I have looked at that page. Have
you? The project author and other posters agree that optimisations
are not "cheating", and question the realism of fibonacci as a
benchmark.
Literally the title of the page contains the word "cheating". And the
person maintaining the benchmarks says:
"I am open to suggestions on how to improve the fairness of the
benchmark."
So the question of cheating and fairness has been raised. Some suggest
a separate category for optimised code. Some suggest using flags as I
have done. Some agree with you that optimisation should not be
restricted.
I think Fibonacci is a good benchmark for languages that don't cheat
by avoiding doing the full quota of 2*fib(N)-1 calls.
I'm not going to dump a useful tool that works fine in dozens of
implementations just because you say so.
It would be very helpful here if you made the distinction between
those two meanings of "function call".
In the case of native x64 code, it is counting the number of times
'CALL' is executed.
Okay. That is, of course, not a meaningful number as far as C (or
any other language, other than assembly) is concerned. It can be a
somewhat meaningful value for comparing implementations - where a
smaller number of "CALL" instructions in the compilation indicates a
better implementation.
Suppose I had a function implementing an algorithm and I wanted
to use that as a benchmark to compare languages.
I might measure performance by invoking it N times. Suppose I get
these results across 4 languages:
L1: 3.5 seconds
L2: 4.2
L3: 0.1
L4: 2.9
According to you, obviously L3 is the winner because of its superior
optimiser! No red flags at all.
Bart <bc@freeuk.com> writes:
On 21/04/2026 15:27, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/04/2026 09:53, David Brown wrote:
On 21/04/2026 00:03, Bart wrote:
I took a program ll.c (which is Lua source code in one file, so the
compiler can see the whole program), and replaced the body of main()
with 'exit(0)'. So none of the functions are called. I got these results:
c:\cx>gcc -s -Os ll.c # optimise for size
c:\cx>dir a.exe
21/04/2026 11:10 241,152 a.exe
c:\cx>bcc ll
Compiling ll.c to ll.exe
c:\cx>dir ll.exe
21/04/2026 11:11 237,056 ll.exe
Somehow my bcc-compiled version generated a smaller executable!
Ah, back to the on-disk executable size. An irrelevant
metric. One might expect the 'in-memory' size is what's interesting,
and that's what the -Os option is designed to minimize, not the
number of disk sectors consumed by the executable file.
You don't think there's a correlation between the size of code and
initialised data, and the size of the executable?
A portion of the executable is metadata that never
gets loaded into memory (symbol tables, rtld data
and relocation information, etc.)
You might wish to compare the text section sizes,
Both text and initialised data will take up valuable memory.
$ size bin/test1
text data bss dec hex filename
6783060 85872 1861744 8730676 853834 bin/test1
The text only takes up memory -on demand-. If a code
page is never referenced, it is never loaded into
memory.
The working set size is interesting, but completely unrelated
to the size of the on-disk executable file.
On 21/04/2026 16:38, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/04/2026 15:27, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/04/2026 09:53, David Brown wrote:
On 21/04/2026 00:03, Bart wrote:
I took a program ll.c (which is Lua source code in one file, so the
compiler can see the whole program), and replaced the body of main()
with 'exit(0)'. So none of the functions are called. I got these results:
c:\cx>gcc -s -Os ll.c # optimise for size
c:\cx>dir a.exe
21/04/2026 11:10 241,152 a.exe
c:\cx>bcc ll
Compiling ll.c to ll.exe
c:\cx>dir ll.exe
21/04/2026 11:11 237,056 ll.exe
Somehow my bcc-compiled version generated a smaller executable!
Ah, back to the on-disk executable size. An irrelevant
metric. One might expect the 'in-memory' size is what's interesting,
and that's what the -Os option is designed to minimize, not the
number of disk sectors consumed by the executable file.
You don't think there's a correlation between the size of code and
initialised data, and the size of the executable?
A portion of the executable is metadata that never
gets loaded into memory (symbol tables, rtld data
and relocation information, etc.)
You might wish to compare the text section sizes,
Both text and initialised data will take up valuable memory.
$ size bin/test1
text data bss dec hex filename
6783060 85872 1861744 8730676 853834 bin/test1
The text only takes up memory -on demand-. If a code
page is never referenced, it is never loaded into
memory.
The working set size is interesting, but completely unrelated
to the size of the on-disk executable file.
So it's just a coincidence that specifying -Os tends to get you a
smaller executable?
Bart <bc@freeuk.com> writes:
On 21/04/2026 14:43, David Brown wrote:
On 21/04/2026 14:48, Bart wrote:
I gave this link to someone doing a similar analysis:
https://github.com/drujensen/fib/issues/119
which everyone has conveniently ignored.
I can't answer for "everyone", but I rarely follow links posted on
Usenet. I am interested in your opinions and answers - a third
party's opinions on a fourth party's project is not typically
helpful. But since you insist, I have looked at that page. Have
you? The project author and other posters agree that optimisations
are not "cheating", and question the realism of fibonacci as a
benchmark.
Literally the title of the page contains the word "cheating". And the
person maintaining the benchmarks says:
"I am open to suggestions on how to improve the fairness of the
benchmark."
I'm not sure I would have used the word "cheating", though I might
have used it facetiously. None of the compilers we're discussing
"cheat" in the sense of violating the rules of the language.
The author is talking about *improving the benchmark* so that
it prevents the optimizations that make it difficult to measure
performance.
So the question of cheating and fairness has been raised. Some suggest
a separate category for optimised code. Some suggest using flags as I
have done. Some agree with you that optimisation should not be
restricted.
I think Fibonacci is a good benchmark for languages that don't cheat
by avoiding doing the full quota of 2*fib(N)-1 calls.
THAT'S NOT CHEATING. It's called optimization. If you refuse to
do the work of writing your benchmark so it avoids optimization,
then you'll end up with a bad benchmark.
I'm not going to dump a useful tool that works fine in dozens of
implementations just because you say so.
Apparently you intend to continue to use a tool that does not measure
what you want it to measure, that works with some implementations but
not with others. You say it works perfectly well while posting data
that shows that it does not.
I might measure performance by invoking it N times. Suppose I get
these results across 4 languages:
L1: 3.5 seconds
L2: 4.2
L3: 0.1
L4: 2.9
According to you, obviously L3 is the winner because of its superior
optimiser! No red flags at all.
Does the code generated by L3 produce the correct output? If so, the
only problem is that your benchmark is affected by the L3 compiler's
optimization. If your goal is to compare the performance of "call"
instructions, fix the benchmark.
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when you
want to stick to standard C is use of "volatile". Use a volatile read
at the start of your code, then calculations that depend on each other
and that first read, then a volatile write of the result. That gives
minimal intrusion in the code while making sure the calculations have to
be generated, and have to be done at run time.
If you are testing on a particular compiler (like gcc or clang), then
there are other options. The "noinline" function attribute is very
handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1 memory
cache (for things that are not always in registers), then this flushes
that cache:
    asm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at this
point in time (so that its value can be passed to the assembly):
    asm volatile ("" :: "g" (x));
This tells the compiler that "x" might be changed by the assembly, so
it must forget any additional knowledge it had of it:
    asm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or test
code. They can be helpful in some kinds of interactions between low
level code and hardware.
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of calls
when displayed at the end (some 500M for fib(42)). But it's wrong!
[...]
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when you
want to stick to standard C is use of "volatile". Use a volatile read
at the start of your code, then calculations that depend on each other
and that first read, then a volatile write of the result. That gives
minimal intrusion in the code while making sure the calculations have
to be generated, and have to be done at run time.
If you are testing on a particular compiler (like gcc or clang), then
there are other options. The "noinline" function attribute is very
handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1 memory
cache (for things that are not always in registers), then this flushes
that cache:
    asm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at this
point in time (so that its value can be passed to the assembly):
    asm volatile ("" :: "g" (x));
This tells the compiler that "x" might be changed by the assembly, so
it must forget any additional knowledge it had of it:
    asm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or test
code. They can be helpful in some kinds of interactions between low
level code and hardware.
Well, we have to distinguish between a compiler barrier and a memory
barrier. All memory barriers should be compiler barriers, but compiler
barriers do not have to be memory barriers... Fair enough?
On 21/04/2026 17:19, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 21/04/2026 14:43, David Brown wrote:
On 21/04/2026 14:48, Bart wrote:
I gave this link to someone doing a similar analysis:
https://github.com/drujensen/fib/issues/119
which everyone has conveniently ignored.
I can't answer for "everyone", but I rarely follow links posted on
Usenet. I am interested in your opinions and answers - a third
party's opinions on a fourth party's project is not typically
helpful. But since you insist, I have looked at that page. Have
you? The project author and other posters agree that optimisations
are not "cheating", and question the realism of fibonacci as a
benchmark.
Literally the title of the page contains the word "cheating". And the
person maintaining the benchmarks says:
"I am open to suggestions on how to improve the fairness of the
benchmark."
I'm not sure I would have used the word "cheating", though I might
have used it facetiously. None of the compilers we're discussing
"cheat" in the sense of violating the rules of the language.
The author is talking about *improving the benchmark* so that
it prevents the optimizations that make it difficult to measure
performance.
The author segregates different categories of language. I want to
compare across categories, but also want to compare unoptimised and
optimised native code.
Optimised timings tend to be fragile (see my example with fib1/2/3);
unoptimised is far more reliable.
So the question of cheating and fairness has been raised. Some suggest
a separate category for optimised code. Some suggest using flags as I
have done. Some agree with you that optimisation should not be
restricted.
I think Fibonacci is a good benchmark for languages that don't cheat
by avoiding doing the full quota of 2*fib(N)-1 calls.
THAT'S NOT CHEATING. It's called optimization. If you refuse to
do the work of writing your benchmark so it avoids optimization,
then you'll end up with a bad benchmark.
For a fair comparison of language implementatons, you HAVE to be
running the same algorithm with the same steps.
If one doesn't bother executing some or most of those steps, then the comparison is meaningless. Unless we are comparing optimisation
ability, but that is not what this is about.
From that github home page: "Any language faster than Assembly is
performing unrolling type optimizations."
In fact, I am now using Assembly as the reference
implementation. Using optimised C is far too unreliable and it can be inconsistent.
I'm not going to dump a useful tool that works fine in dozens of
implementations just because you say so.
Apparently you intend to continue to use a tool that does not measure
what you want it to measure, that works with some implementations but
not with others. You say it works perfectly well while posting data
that shows that it does not.
It only fails to do so because of those erroneous and erratic gcc
timings.
I might measure performance by invoking it N times. Suppose I get
these results across 4 languages:
L1: 3.5 seconds
L2: 4.2
L3: 0.1
L4: 2.9
According to you, obviously L3 is the winner because of its superior
optimiser! No red flags at all.
Does the code generated by L3 produce the correct output? If so, the
only problem is that your benchmark is affected by the L3 compiler's
optimization. If your goal is to compare the performance of "call"
instructions, fix the benchmark.
So you would ignore such a giant red flag? That's good to know.
(You wouldn't even be curious about such an outlier?)
Bart <bc@freeuk.com> writes:
[...]
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of calls
when displayed at the end (some 500M for fib(42)). But it's wrong!
[...]
What exactly do you mean by "injected"? Do you modify the C program
to add a counter, or do you modify the compiler-generated assembly
or machine code? If the former, then of course you have a different
program, probably with different behavior.
And what exactly do you mean by "wrong"?
I can't be sure without seeing your code, but I'd expect the counter
to reflect the number of times the function is called in the abstract
machine. Whether those function calls are implemented by "call"
(or "bl") instructions is an implementation detail about which the
C standard says nothing. If you happen to care about that, that's
fine, but you'll need to go beyond what the language guarantees if
you want to measure it.
On 21/04/2026 20:40, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of calls
when displayed at the end (some 500M for fib(42)). But it's wrong!
[...]
What exactly do you mean by "injected"? Do you modify the C program
to add a counter, or do you modify the compiler-generated assembly
or machine code? If the former, then of course you have a different
program, probably with different behavior.
And what exactly do you mean by "wrong"?
I can't be sure without seeing your code, but I'd expect the counter
to reflect the number of times the function is called in the abstract
machine. Whether those function calls are implemented by "call"
(or "bl") instructions is an implementation detail about which the
C standard says nothing. If you happen to care about that, that's
fine, but you'll need to go beyond what the language guarantees if
you want to measure it.
Here I mean within the C, but I've done both.
Injecting into the ASM gives more accurate results. In the C, it just
displays what you would expect, so for fib(N), it would be 2*fib(N)-1,
even if it just uses a lookup table or hardcodes the value. It's a
sham.
Bart <bc@freeuk.com> writes:
In fact, I am now using Assembly as the reference
implementation. Using optimised C is far too unreliable and it can be
inconsistent.
Comparing the performance of unoptimized code is not, in my humble
opinion, particularly useful. Benchmarks are for people who care
about performance. People who care about performance typically do
not ship unoptimized code, simply because optimized code is faster.
If you want to perform measurements that are relevant to the
performance of real-world code, it's best to (a) write the benchmark
in a way that forces the compiler *not* to perform optimizations
that destroy the thing you're trying to measure, and (b) compile
the benchmark with optimization *enabled*.
I'm not an expert on benchmarks. The people who write them presumably
know all about this stuff. (There's a comp.benchmarks newsgroup,
but it's inactive.)
They are neither "erroneous" nor "erratic" as far as the behavior
required by the C standard or by gcc is concerned. They merely
violate your faulty assumptions.
Sure, I'd be curious about why L3 performs so much better. If L3 is
compiled and the others are interpreted, that's probably the answer.
If they're all compiled, it's likely that the L3 compiler performs optimizations that the others don't. (Of course this is assuming the
output is correct; fast wrong answers are not useful or interesting.)
I would not assume that it's a *problem*.
There is at least one kind of optimization that I'd call "cheating".
If a compiler computes the sha256 checksum of a C source file and
finds that it matches the checksum of a known benchmark, and then
generates an executable that just prints the expected output with impressive-looking numbers, I'd call that cheating, though it's
still conforming behavior as far as the C standard is concerned.
If a compiler is able to optimize code without changing its behavior
in impermissible ways, that's just good optimization.
Let me ask you a question. You've been complaining a lot about
how gcc behaves.
Would you insist that a program that computes fib(10) executes
exactly 177 "call" instructions
quick experiment)? Would you insist that `2+2` must generate an
"add" instruction? If your answers differ, why?
My language allows you to do this:
   int a, b
   a := b
It is well-defined in the language, and I know it is well defined on all
my likely targets.
On 20/04/2026 13:45, Bart wrote:
[...]
[...] But it is
genuinely absurd to say that you have been writing languages,
translators, compilers, transpilers and other tools for decades, and
don't understand such simple things. This really is the first step for
a compiler that transforms language A to language B - you have to
generate code in language B that implements the semantics in language A.
If I were in your shoes, I would translate "int a;" from your language
into "int a = 0;" in C.ÿ Simple and clear.ÿ The semantics on the C side
are stronger than those in the source language (since there is a
definite value of 0, rather than an unspecified int value as you have AFAIUI), but that's fine.ÿ If your translator generates something with weaker semantics - like plain "int a;" in C - then your translator is
broken by design.
[...]
On 21/04/2026 21:16, Keith Thompson wrote:
[...]
Would you insist that a program that computes fib(10) executes
exactly 177 "call" instructions
For comparing function calls across languages and implementations via
that benchmark, then yes.
I am not interested in finding out how clever optimising compilers can
be. They have ample opportunity to do that within real applications,
namely the compilers and interpreters themselves.
The bytecode compilers of interpreted languages tend not to do
aggressive optimisations (it's harder for dynamic code, and would take
too long).
[...] Would you insist that `2+2` must generate an
"add" instruction? If your answers differ, why?
If I was testing integer arithmetic then I would have to use
variables. Constant expression reduction is too commonly done, and in
some languages (eg. C) it is mandated.
On 20/04/2026 14:02, David Brown wrote:
On 20/04/2026 13:45, Bart wrote:
I implement it in a common sense manner.
"Common sense" is another way of saying "I don't know the actual rules".
It means doing the obvious thing with no unexpected surprises. If the resulting program runs with exactly the behaviour the user expects, then what is the problem?
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
   int a, b
   a := b
It is well-defined in the language, and I know it is well defined on
all my likely targets.
That's perfectly fine if (for example) your language implies a default
initialization semantics. (Simula, e.g., has such a semantic defined;
declared (instantiated) integer variables have the value 0.) - But "C"
does not! - Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design decisions"
onto "C" I really suggest to stop that nonsense since it doesn't lead anywhere.)
David Brown <david.brown@hesbynett.no> writes:
[...]
You seem to think that just writing "int a;" somehow creates an int
object called "a" along with a slot on the stack or a dedicated
register. That does not happen in many compilers. And it does not
happen in the C semantics. "a" is an lvalue that /potentially/
designates an object - for an uninitialised local variable, it does
not designate an object until a value is assigned.
[...]
I don't think that's correct.
Within the scope of a declaration `int a;` the expression `a` is
an lvalue that *does* designate an object.
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
When you use "b" in an expression, you are /not/ asking C to read
the bits and bytes stored at the address of the object "b". You
are asking for the /value/ of the object "b". How the compiler
gets that value is up to the compiler - it can read the memory, or
use a stored copy in a register, or use program analysis to know
what the value is in some other way. And if the object "b" does
not have a value, you are asking the impossible.
Try asking a human "You have two numbers, b and c. Add them.
What is the answer?".
You have two slates A and B which someone should have wiped clean
then written a new number on each.
But that part hasn't been done; they each still have an old number
from their last use.
You can still add them together, nothing bad will happen. It just
may be the wrong answer if the purpose of the exercise was to find
the sum of two specific new numbers.
But the purpose may also be to see how good they are at adding, or at
following instructions.
whatever it happens to be, add the value of 'c' scaled by 8, and
store the result into 'a'. The only things to consider are
that some intermediate results may lose the top bits.
Is 'a = b' equally undefined? If so, then C is even crazier than
I'd thought.
If "a" or "b" are indeterminate, then using them is undefined. I
have two things - are they the same colour? How is that supposed
to make sense?
You keep thinking of objects like "b" as a section of memory with
a bit pattern in it. Objects are not that simple in C - C is not
assembly.
Why ISN'T it that simple? What ghastly thing would happen if it
was?
"b" will be some location in memory or it might be some register,
and it WILL have a value. That value happens to be unknown until
it is initialised.
So accessing it will return garbage (unless you know exactly what
you are doing then it may be something useful).
My original example was something like 'a = b + c' (I think in my
language), converted to my IL, then expressed in very low-level C.
You were concerned that in that C, the values weren't initialised.
How would that have affected the code that C compiler generated
from that?
You look at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
On Tue, 21 Apr 2026 12:12:28 +0100
Bart <bc@freeuk.com> wrote:
Note 2: I believe these figures are suspect because the requisite
number of calls are not done.
I don't see anything suspect in the -O1 code generated by gcc 14.2.0
Source:
unsigned long long fib(unsigned long long n)
{
if (n < 2)
return 1;
return fib(n-1)+fib(n-2);
}
On Tue, 21 Apr 2026 14:49:58 +0100
Bart <bc@freeuk.com> wrote:
On 21/04/2026 13:55, Michael S wrote:
On Tue, 21 Apr 2026 12:12:28 +0100
Bart <bc@freeuk.com> wrote:
Note 2: I believe these figures are suspect because the requisite
number of calls are not done.
I don't see anything suspect in the -O1 code generated by gcc 14.2.0
Source:
unsigned long long fib(unsigned long long n)
{
if (n < 2)
return 1;
return fib(n-1)+fib(n-2);
}
$ gcc -S -O1 -Wall -Wextra -fno-asynchronous-unwind-tables ref0_fib.c
$ cat ref0_fib.s
.file "ref0_fib.c"
.text
.globl fib
.def fib; .scl 2; .type 32; .endef
fib:
movl $1, %eax
cmpq $1, %rcx
jbe .L5
pushq %rsi
pushq %rbx
subq $40, %rsp
movq %rcx, %rbx
leaq -1(%rcx), %rcx
call fib
movq %rax, %rsi
leaq -2(%rbx), %rcx
call fib
addq %rsi, %rax
addq $40, %rsp
popq %rbx
popq %rsi
ret
.L5:
ret
.ident "GCC: (Rev2, Built by MSYS2 project) 14.2.0"
Measured with n=43 on my very old home desktop it gave:
1402817465/2.646 s = 530165330.7 calls/sec
You're right. It was either a different version or I was mistaken.
But it seems that Clang -O1 will generate a version with only a
single fib call. This is the godbolt code for the Fib() version using
"if (n < 3) return 1":
fib:
pushq %r14
pushq %rbx
pushq %rax
movl %edi, %r14d
xorl %ebx, %ebx
cmpl $3, %r14d
jl .LBB0_3
.LBB0_2:
leal -1(%r14), %edi
callq fib
addl %eax, %ebx
addl $-2, %r14d
cmpl $3, %r14d
jge .LBB0_2
.LBB0_3:
incl %ebx
movl %ebx, %eax
addq $8, %rsp
popq %rbx
popq %r14
retq
If I inject an increment to a global counter just after that callq
fib, then it shows only half the expected value.
(This fib version is one-based, so that fib(10) is 55, while yours I
think has it as 89. Google tells me that Fibonacci(10) is 55.)
That looks like tail call elimination. I.e. the compiler turned the code
into:
unsigned long long fib(unsigned long long n)
{
unsigned long long res = 0;
while (n >= 3) {
res += fib(n-1);
n -= 2;
}
return res + 1;
}
gcc generates similar code with -O -foptimize-sibling-calls
For certain styles of coding, e.g. one often preferred by Tim Rentsch,
this optimization is extremely important.
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
David Brown <david.brown@hesbynett.no> writes:
[...]
You seem to think that just writing "int a;" somehow creates an int
object called "a" along with a slot on the stack or a dedicated
register. That does not happen in many compilers. And it does not
happen in the C semantics. "a" is an lvalue that /potentially/
designates an object - for an uninitialised local variable, it does
not designate an object until a value is assigned.
[...]
I don't think that's correct.
In fact, it's wrong. Encountering a declaration 'int a;' certainly
means that in the abstract machine there is an object corresponding
to the identifier 'a'.
Within the scope of a declaration `int a;` the expression `a` is
an lvalue that *does* designate an object.
And more than that: in the abstract machine an object corresponding
to the identifier 'a' comes into existence as soon as the block
containing 'int a;' is entered, regardless of whether 'a' is
initialized or referenced in any way.
Bart <bc@freeuk.com> writes:
On 21/04/2026 21:16, Keith Thompson wrote:
[...]
If I was testing integer arithmetic then I would have to use
variables. Constant expression reduction is too commonly done, and in
some languages (eg. C) it is mandated.
You're *so* close to getting it.
If you want to measure the performance of addition, you have to write
your benchmark code so the addition operator can't be optimized away.
If you don't do that, the results will be meaningless.
If you want to measure the performance of function calls, ...
Bart <bc@freeuk.com> writes:
On 21/04/2026 15:27, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/04/2026 09:53, David Brown wrote:
On 21/04/2026 00:03, Bart wrote:
I took a program ll.c (which is Lua source code in one file, so the
compiler can see the whole program), and replaced the body of main()
with 'exit(0)'. So none of the functions are called. I got these results:
c:\cx>gcc -s -Os ll.c # optimise for size
c:\cx>dir a.exe
21/04/2026 11:10 241,152 a.exe
c:\cx>bcc ll
Compiling ll.c to ll.exe
c:\cx>dir ll.exe
21/04/2026 11:11 237,056 ll.exe
Somehow my bcc-compiled version generated a smaller executable!
Ah, back to the on-disk executable size. An irrelevant
metric. One might expect the 'in-memory' size is interesting
and that's what the -Os option is designed to minimize, not the
number of disk sectors consumed by the executable file.
You don't think there's a correlation between the size of code and
initialised data, and the size of the executable?
A portion of the executable is metadata that never
gets loaded into memory (symbol tables, rtld data
and relocation information, etc.)
You might wish to compare the text section sizes,
Both text and initialised data will take up valuable memory.
$ size bin/test1
text data bss dec hex filename
6783060 85872 1861744 8730676 853834 bin/test1
The text only takes up memory -on demand-. If a code
page is never referenced, it is never loaded into
memory.
The working set size is interesting, but completely unrelated
to the size of the on-disk executable file.
Michael S <already5chosen@yahoo.com> writes:
On Tue, 21 Apr 2026 12:12:28 +0100
Bart <bc@freeuk.com> wrote:
Note 2: I believe these figures are suspect because the requisite
number of calls are not done.
I don't see anything suspect in the -O1 code generated by gcc 14.2.0
Source:
unsigned long long fib(unsigned long long n)
{
if (n < 2)
return 1;
return fib(n-1)+fib(n-2);
}
Tsk, tsk. fibonacci(0) is 0.
Michael S <already5chosen@yahoo.com> writes:
On Tue, 21 Apr 2026 14:49:58 +0100
Bart <bc@freeuk.com> wrote:
On 21/04/2026 13:55, Michael S wrote:
On Tue, 21 Apr 2026 12:12:28 +0100
Bart <bc@freeuk.com> wrote:
Note 2: I believe these figures are suspect because the
requisite number of calls are not done.
I don't see anything suspect in the -O1 code generated by gcc
14.2.0
Source:
unsigned long long fib(unsigned long long n)
{
if (n < 2)
return 1;
return fib(n-1)+fib(n-2);
}
$ gcc -S -O1 -Wall -Wextra -fno-asynchronous-unwind-tables ref0_fib.c
$ cat ref0_fib.s
.file "ref0_fib.c"
.text
.globl fib
.def fib; .scl 2; .type 32; .endef
fib:
movl $1, %eax
cmpq $1, %rcx
jbe .L5
pushq %rsi
pushq %rbx
subq $40, %rsp
movq %rcx, %rbx
leaq -1(%rcx), %rcx
call fib
movq %rax, %rsi
leaq -2(%rbx), %rcx
call fib
addq %rsi, %rax
addq $40, %rsp
popq %rbx
popq %rsi
ret
.L5:
ret
.ident "GCC: (Rev2, Built by MSYS2 project) 14.2.0"
Measured with n=43 on my very old home desktop it gave:
1402817465/2.646 s = 530165330.7 calls/sec
You're right. It was either a different version or I was mistaken.
But it seems that Clang -O1 will generate a version with only a
single fib call. This is the godbolt code for the Fib() version
using "if (n < 3) return 1":
fib:
pushq %r14
pushq %rbx
pushq %rax
movl %edi, %r14d
xorl %ebx, %ebx
cmpl $3, %r14d
jl .LBB0_3
.LBB0_2:
leal -1(%r14), %edi
callq fib
addl %eax, %ebx
addl $-2, %r14d
cmpl $3, %r14d
jge .LBB0_2
.LBB0_3:
incl %ebx
movl %ebx, %eax
addq $8, %rsp
popq %rbx
popq %r14
retq
If I inject an increment to a global counter just after that callq
fib, then it shows only half the expected value.
(This fib version is one-based, so that fib(10) is 55, while yours
I think has it as 89. Google tells me that Fibonacci(10) is 55.)
That looks like tail call elimination. I.e. the compiler turned the
code into:
unsigned long long fib(unsigned long long n)
{
unsigned long long res = 0;
while (n >= 3) {
res += fib(n-1);
n -= 2;
}
return res + 1;
}
gcc generates similar code with -O -foptimize-sibling-calls
For certain styles of coding, e.g. one often preferred by Tim
Rentsch, this optimization is extremely important.
Please don't misrepresent me. The code transformation shown above
is not important to the functional recursive style that I often
employ. Neither of the two recursive calls to fib() in the C
function shown at the top is a tail call.
Bart <bc@freeuk.com> wrote:
On 21/04/2026 01:39, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
An implementation that uses 0 instructions to implement a function call
on a normal machine will get a shorter runtime, so clearly it is faster
at doing function calls. This does not differ much from what
some modern processors do, namely a move instruction may effectively
take 0 cycles. People used to the old ways, when confronted with
movq %rax, %rdx
expect that there will be actual movement of data, that the instruction
must travel the whole CPU pipeline. But modern processors do
register renaming and after looking at this instruction may
simply note that to get the value of %rdx one uses the place storing
%rax (I am using AT&T convention so direction is from %rax to
%rdx) and otherwise drop the instruction. Is the processor
cheating? A naive benchmark where moves are overrepresented may
execute unexpectedly fast, but moves are frequent in real
programs so this gives a valuable speedup for all programs.
Coming back to function calls, consider a programmer who cares
very much about speed. He knows that his program would be
simpler and easier to write if he used a lot of small
functions. In the old days he would worry about the cost of
function calls and he probably would write much bigger and more
complicated functions to get good speed. But if the cost of a
function call is 0 he can freely use small functions, without
worrying about the cost of calls.
If the cost was zero then function inlining wouldn't be a thing.
Inlining is a way to get 0 cost.
I will give you the answer: it is to compare how implementations cope
with very large numbers of recursive function calls. So if one finds a
way to avoid doing such calls, then it is not a fair comparison.
Well, Fibonacci and similar functions have limited use.
They are commonly used as benchmarks. I use them a lot to compare
interpreted and JITed languages, but also need some native code tests
as a reference.
It is the latter that are flawed when using gcc.
I decided to write a reference version in x64 assembly, a
straightforward version that does the requisite number of calls.
To evaluate fib(42), it took 0.85 seconds on my PC, or about 680M
calls/second. gcc-O3 does it, miraculously, at 1270M calls/second.
However that is misleading and unsustainable. I showed in my last post
how, if the calls are split across modules using three fib() functions
that call each other, it can only manage 570M calls/second.
Meanwhile the versions that don't cheat can maintain the same throughput.
For curiosity I tried several different variants of a Fibonacci-like
benchmark. The first version is what you apparently expect, that is a
single function that contains the machine instructions that you expect.
Then a version which corresponds to C code like:
long
fib45(long n) {
return fib44(n - 1) + fib43(n - 2);
}
....
long
fib0(long n) {
return 0;
}
This version makes all control transfers perfectly predictable
and avoids conditionals. Then a version which does not bother
actually passing parameters, but still does the stack adjustments
and computes the value. Then a version which performs all required
calls but does not bother with computing the value and the stack
adjustments. Finally a version which replaces the final call by a jump.
All versions using calls had similar speed, nominally about 2.4
clocks per call with about 10% variation (I did not investigate
what caused the variation). The version using jumps was significantly
faster, needing about 1.4 clocks per call. The following
version allows tail calls without inlining:
long
fibk(long n, long acc) {
if (n < 2) {
return n + acc;
}
return fibk(n - 1, fibk(n - 2, acc));
}
(start with acc equal to 0).
This one needs about 1.6 clocks per call. Note that this one
executes instructions essentially in the same sequence as the naive
version, only approximately half of the calls are replaced by
jumps and consequently approximately half of the returns are gone.
AFAICS a reasonable interpretation of the results above is that
jumps (even conditional ones) are cheaper than calls and
returns. And it seems that calls and returns are cheap
as long as you do not have too many of them. That is,
you can execute a lot of instructions in parallel with
a call or return, but each call and return introduces
latency (probably 1 clock in the good case, more if the predictors
do not manage to speed it up).
I've anyway counted the calls that gcc-O3 does make and it is a lot
fewer than needed (95% less IIRC). It is achieved via complex inlining
and use of TCO from what I can see.
OK, tried that one too. It needs 0.93 clocks per nominal "call".
I tried also
long
fibk(long n, long acc) {
if (n == 2) {
return 1 + acc;
}
if (n < 2) {
return n + acc;
}
return fibk(n - 1, fibk(n - 2, acc));
}
It needs 1.02 clocks per "call". Of course it needs fewer calls
than the version without the special case for 'n == 2', which is why
I put "call" in quotes. This version could be produced by first
introducing a special case for 'n == 2', that is
long
fibk(long n, long acc) {
if (n == 2) {
return fibk(n - 1, fibk(n - 2, acc));
}
if (n < 2) {
return n + acc;
}
return fibk(n - 1, fibk(n - 2, acc));
}
then replacing n by its known value and replacing calls to fibk
with a known first argument by their value. gcc is doing a different
thing, but clearly "not optimizable" does not hold for the
Fibonacci function.
So the
real question is what is the cost of function calls in actual
programs. For calls to small non-recursive functions the cost is
close to 0. Recursion makes optimization more tricky,
so it increases the cost. But still, in practice the cost is lower
than one could naively expect.
Concerning fairness, AFAIK gcc optimizations were developed to
speed up real programs. They speed up Fibonacci basically as
a side effect.
I suspect some time was spent on Fibonacci too!
Possible. But the main motivation were C++ methods and related
coding style. Minor motivation was to allow functional style
of programming. Consider silly (completely untested) code below:
struct node {struct node * next; int data;};
long
length0(struct node * n, long acc) {
if (!n) {
return acc;
} else {
return length0(n->next, acc + 1);
}
}
long
length(struct node * n) {
return length0(n, 0);
}
Let me say that there are people who prefer code like this. Now
it is very easy to turn this into a loop, completely avoiding
recursion. Functional programming folks call code like this
"iterative" and demand that the compiler use fixed stack space
regardless of the number of calls. If you use C as an intermediate
language for functional languages, then it is tricky to satisfy
this property except when the C compiler implements this
optimization.
So IMO it is fair: a compiler that cannot speed
up calls in Fibonacci probably will have trouble speeding up
calls at least in some real programs.
Speeding up calls = avoiding making those calls?
Avoiding emiting call instructions.
Bart <bc@freeuk.com> writes:
On 21/04/2026 21:16, Keith Thompson wrote:
[...]
Would you insist that a program that computes fib(10) executes
exactly 177 "call" instructions
For comparing function calls across languages and implementations via
that benchmark, then yes.
I didn't say the program was a benchmark. I didn't say what the
purpose of the program was. And guess what, the compiler *doesn't
know* it's a benchmark, so it doesn't disable optimizations for the
sake of measuring something.
If I write a C program with a fib() function (never mind that
the naive recursive algorithm is horrible), and it calls fib(10)
because it needs to know the value of fib(10), would you insist that
it executes exactly 177 "call" instructions?
If I was testing integer arithmetic then I would have to use
variables. Constant expression reduction is too commonly done, and in
some languages (eg. C) it is mandated.
You're *so* close to getting it.
If you want to measure the performance of addition, you have to write
your benchmark code so the addition operator can't be optimized away.
If you don't do that, the results will be meaningless.
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined on
all my likely targets.
That's perfectly fine if (for example) your language implies a default
initialization semantics. (Simula, e.g., has such a semantic defined;
declared (instantiated) integer variables have the value 0.) - But "C"
does not!
(If you are again trying to project your language's "design decisions"
onto "C" I really suggest to stop that nonsense since it doesn't lead anywhere.)
antispam@fricas.org (Waldek Hebisch) writes:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
When you use "b" in an expression, you are /not/ asking C to read
the bits and bytes stored at the address of the object "b". You
are asking for the /value/ of the object "b". How the compiler
gets that value is up to the compiler - it can read the memory, or
use a stored copy in a register, or use program analysis to know
what the value is in some other way. And if the object "b" does
not have a value, you are asking the impossible.
Try asking a human "You have two numbers, b and c. Add them.
What is the answer?".
You have two slates A and B which someone should have wiped clean
then written a new number on each.
But that part hasn't been done; they each still have an old number
from their last use.
You can still add them together, nothing bad will happen. It just
may be the wrong answer if the purpose of the exercise was to find
the sum of two specific new numbers.
But the purpose may also be to see how good they are at adding, or at
following instructions.
whatever it happens to be, add the value of 'c' scaled by 8, and
store the result into 'a'. The only things to consider are
that some intermediate results may lose the top bits.
Is 'a = b' equally undefined? If so, then C is even crazier than
I'd thought.
If "a" or "b" are indeterminate, then using them is undefined. I
have two things - are they the same colour? How is that supposed
to make sense?
You keep thinking of objects like "b" as a section of memory with
a bit pattern in it. Objects are not that simple in C - C is not
assembly.
Why ISN'T it that simple? What ghastly thing would happen if it
was?
"b" will be some location in memory or it might be some register,
and it WILL have a value. That value happens to be unknown until
it is initialised.
So accessing it will return garbage (unless you know exactly what
you are doing then it may be something useful).
My original example was something like 'a = b + c' (I think in my
language), converted to my IL, then expressed in very low-level C.
You were concerned that in that C, the values weren't initialised.
How would that have affected the code that C compiler generated
from that?
You look at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined
on all my likely targets.
That's perfectly fine if (for example) your language implies a
default initialization semantics. (Simula, e.g., has such a
semantic defined; declared (instantiated) integer variables have
the value 0.) - But "C" does not!
I'm sure that was foremost in the designers' minds when they created
C in 1972. It wasn't retrofitted into the spec years later at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design
decisions" onto "C" I really suggest to stop that nonsense since it
doesn't lead anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
c:\cx>gcc -c t.c
c:\cx>gcc -c -O3 t.c
c:\cx>gcc -c -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra -Wno-uninitialized t.c
It either ignores it or it warns about it. Or you can optimise the
program and tell it to ignore it - apparently you call the shots.
(I notice it says nothing about 'a' not being used.)
So, what does language say about it again? Remind me! Or better, tell
the compiler.
What about this version:
void F() {
int a, b;
// b = G();
a = b;
}
This is part of a larger program, but G hasn't been written yet, and
this function will not be run in any tests until it has. Should gcc
swamp you with pointless warnings?
On Wed, 22 Apr 2026 15:14:39 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined
on all my likely targets.
That's perfectly fine if (for example) your language implies a
default initialization semantics. (Simula, e.g., has such a
semantic defined; declared (instantiated) integer variables have
the value 0.) - But "C" does not!
I'm sure that was foremost in the designers' minds when they created
C in 1972. It wasn't retrofitted into the spec years later at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design
decisions" onto "C" I really suggest to stop that nonsense since it
doesn't lead anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
c:\cx>gcc -c t.c
c:\cx>gcc -c -O3 t.c
c:\cx>gcc -c -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra -Wno-uninitialized t.c
It either ignores it or it warns about it. Or you can optimise the
program and tell it to ignore it - apparently you call the shots.
(I notice it says nothing about 'a' not being used.)
So, what does language say about it again? Remind me! Or better, tell
the compiler.
What about this version:
void F() {
int a, b;
// b = G();
a = b;
}
This is part of a larger program, but G hasn't been written yet, and
this function will not be run in any tests until it has. Should gcc
swamp you with pointless warnings?
C warns you, but it does not stop. So, you can ignore a warning or even disable it, if you happen to know a suitable spell.
Rust is not as tolerant, even in cases of less serious nonobservance.
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
On 22/04/2026 16:56, Michael S wrote:
On Wed, 22 Apr 2026 15:14:39 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined
on all my likely targets.
That's perfectly fine if (for example) your language implies a
default initialization semantics. (Simula, e.g., has such a
semantic defined; declared (instantiated) integer variables have
the value 0.) - But "C" does not!
I'm sure that was foremost in the designers' minds when they
created C in 1972. It wasn't retrofitted into the spec years later
at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design
decisions" onto "C" I really suggest to stop that nonsense since
it doesn't lead anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
c:\cx>gcc -c t.c
c:\cx>gcc -c -O3 t.c
c:\cx>gcc -c -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra -Wno-uninitialized t.c
It either ignores it or it warns about it. Or you can optimise the
program and tell it to ignore it - apparently you call the shots.
(I notice it says nothing about 'a' not being used.)
So, what does language say about it again? Remind me! Or better,
tell the compiler.
What about this version:
void F() {
int a, b;
// b = G();
a = b;
}
This is part of a larger program, but G hasn't been written yet,
and this function will not be run in any tests until it has.
Should gcc swamp you with pointless warnings?
C warns you, but it does not stop. So, you can ignore a warning or
even disable it, if you happen to know a suitable spell.
Rust is not as tolerant, even in cases of less serious
nonobservance.
"C" does not warn you about anything here - that's up to C compilers.
Some mistakes in your C code require a diagnostic (for conforming compilers), but not this one.
gcc will, of course, treat the use of the uninitialised variable as
an error halting compilation if you know how to use it properly. It
will also give a warning that "a" is unused, if you know how to use
it properly. Bart knows how to use gcc with flags to set standards conformance, warnings, etc., and he knows the difference between C
the language and particular compiler implementations, but he thinks
it is fun to pretend he does not.
Rust is not fundamentally better than C here - it is simply that Rust
is a relatively new language without the historical baggage of
existing questionable quality Rust code. So Rust tools were able to
have better checking for this kind of thing out of the gate. gcc
/could/ make "-Werror=uninitialized" the default setting, but that
would cause problems with some existing code. gcc has gradually
expanded on the selection of warnings it enables by default, but it's
a slow process to avoid upsetting people with large code bases that
trigger such warnings.
Bart <bc@freeuk.com> writes:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
Implementation specific. Depending on how the linker
and run-time loader handled uninitialized data regions
in the a.out file and when loading.
Some may have initialized to zero, others may have initialized
to some other data pattern (e.g. 0xdeadbeef) to catch
uninitialized pointer dereferences (particularly since early
unix systems often would return zero on a load from a NULL
pointer rather than trapping the access).
The key point is that portable C code could make no
assumptions about uninitialized data accesses as
the existing implementations differed. Hence, UB.
IMO, most "undefined behavior" in the C specification was
due to implementation differences between the C compilers/linkers
that existed at the time.
On 22/04/2026 05:09, Tim Rentsch wrote:
<snip>
antispam@fricas.org (Waldek Hebisch) writes:
You are looking at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
You are looking at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
On Wed, 22 Apr 2026 17:12:00 +0200
David Brown <david.brown@hesbynett.no> wrote:
On 22/04/2026 16:56, Michael S wrote:
On Wed, 22 Apr 2026 15:14:39 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined
on all my likely targets.
That's perfectly fine if (for example) your language implies a
default initialization semantics. (Simula, e.g., has such a
semantic defined; declared (instantiated) integer variables have
the value 0.) - But "C" does not!
I'm sure that was foremost in the designers' minds when they
created C in 1972. It wasn't retrofitted into the spec years later
at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design
decisions" onto "C" I really suggest to stop that nonsense since
it doesn't lead anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
c:\cx>gcc -c t.c
c:\cx>gcc -c -O3 t.c
c:\cx>gcc -c -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra -Wno-uninitialized t.c
It either ignores it or it warns about it. Or you can optimise the
program and tell it to ignore it - apparently you call the shots.
(I notice it says nothing about 'a' not being used.)
So, what does language say about it again? Remind me! Or better,
tell the compiler.
What about this version:
void F() {
int a, b;
// b = G();
a = b;
}
This is part of a larger program, but G hasn't been written yet,
and this function will not be run in any tests until it has.
Should gcc swamp you with pointless warnings?
C warns you, but it does not stop. So, you can ignore a warning or
even disable it, if you happen to know a suitable spell.
Rust is not as tolerant, even in cases of less serious
nonobservance.
"C" does not warn you about anything here - that's up to C compilers.
Some mistakes in your C code require a diagnostic (for conforming
compilers), but not this one.
gcc will, of course, treat the use of the uninitialised variable as
an error halting compilation if you know how to use it properly. It
will also give a warning that "a" is unused, if you know how to use
it properly. Bart knows how to use gcc with flags to set standards
conformance, warnings, etc., and he knows the difference between C
the language and particular compiler implementations, but he thinks
it is fun to pretend he does not.
Rust is not fundamentally better than C here - it is simply that Rust
is a relatively new language without the historical baggage of
existing questionable quality Rust code. So Rust tools were able to
have better checking for this kind of thing out of the gate. gcc
/could/ make "-Werror=uninitialized" the default setting, but that
would cause problems with some existing code. gcc has gradually
expanded on the selection of warnings it enables by default, but it's
a slow process to avoid upsetting people with large code bases that
trigger such warnings.
The intention of my post was to claim that C is better than Rust.
Unless you didn't pay attention yet, I am rustophobic.
Bart <bc@freeuk.com> writes:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
Implementation specific. Depending on how the linker
and run-time loader handled uninitialized data regions
in the a.out file and when loading.
On 22/04/2026 17:21, Michael S wrote:
On Wed, 22 Apr 2026 17:12:00 +0200
David Brown <david.brown@hesbynett.no> wrote:
On 22/04/2026 16:56, Michael S wrote:
On Wed, 22 Apr 2026 15:14:39 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well
defined on all my likely targets.
That's perfectly fine if (for example) your language implies a
default initialization semantics. (Simula, e.g., has such a
semantic defined; declared (instantiated) integer variables have
the value 0.) - But "C" does not!
I'm sure that was foremost in the designers' minds when they
created C in 1972. It wasn't retrofitted into the spec years
later at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design
decisions" onto "C" I really suggest to stop that nonsense since
it doesn't lead anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
c:\cx>gcc -c t.c
c:\cx>gcc -c -O3 t.c
c:\cx>gcc -c -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized
[-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized
[-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra -Wno-uninitialized t.c
It either ignores it or it warns about it. Or you can optimise
the program and tell it to ignore it - apparently you call the
shots.
(I notice it says nothing about 'a' not being used.)
So, what does language say about it again? Remind me! Or better,
tell the compiler.
What about this version:
void F() {
int a, b;
// b = G();
a = b;
}
This is part of a larger program, but G hasn't been written yet,
and this function will not be run in any tests until it has.
Should gcc swamp you with pointless warnings?
C warns you, but it does not stop. So, you can ignore a warning or
even disable it, if you happen to know a suitable spell.
Rust is not as tolerant, even in cases of less serious
nonobservance.
"C" does not warn you about anything here - that's up to C
compilers. Some mistakes in your C code require a diagnostic (for
conforming compilers), but not this one.
gcc will, of course, treat the use of the uninitialised variable as
an error halting compilation if you know how to use it properly.
It will also give a warning that "a" is unused, if you know how to
use it properly. Bart knows how to use gcc with flags to set
standards conformance, warnings, etc., and he knows the difference
between C the language and particular compiler implementations,
but he thinks it is fun to pretend he does not.
Rust is not fundamentally better than C here - it is simply that
Rust is a relatively new language without the historical baggage of
existing questionable quality Rust code. So Rust tools were able
to have better checking for this kind of thing out of the gate.
gcc /could/ make "-Werror=uninitialized" the default setting, but
that would cause problems with some existing code. gcc has
gradually expanded on the selection of warnings it enables by
default, but it's a slow process to avoid upsetting people with
large code bases that trigger such warnings.
The intention of my post was to claim that C is better than Rust.
Unless you didn't pay attention yet, I am rustophobic.
OK - that did not come across to me from your post. I think it is a
good thing if a language or tool is intolerant to mistakes like
these. But sometimes stricter rules and checking mean less
flexibility, so there can be trade-offs. My knowledge of Rust is far
too limited to know what it does there, and if defaults can be
overridden.
On Wed, 22 Apr 2026 17:57:11 +0200
David Brown <david.brown@hesbynett.no> wrote:
On 22/04/2026 17:21, Michael S wrote:
On Wed, 22 Apr 2026 17:12:00 +0200
David Brown <david.brown@hesbynett.no> wrote:
On 22/04/2026 16:56, Michael S wrote:
On Wed, 22 Apr 2026 15:14:39 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well
defined on all my likely targets.
That's perfectly fine if (for example) your language implies a
default initialization semantics. (Simula, e.g., has such a
semantic defined; declared (instantiated) integer variables have
the value 0.) - But "C" does not!
I'm sure that was foremost in the designers' minds when they
created C in 1972. It wasn't retrofitted into the spec years
later at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design
decisions" onto "C" I really suggest to stop that nonsense since >>>>>>> it doesn't lead anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
c:\cx>gcc -c t.c
c:\cx>gcc -c -O3 t.c
c:\cx>gcc -c -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized
[-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized
[-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra -Wno-uninitialized t.c
It either ignores it or it warns about it. Or you can optimise
the program and tell it to ignore it - apparently you call the
shots.
(I notice it says nothing about 'a' not being used.)
So, what does language say about it again? Remind me! Or better,
tell the compiler.
What about this version:
void F() {
int a, b;
// b = G();
a = b;
}
This is part of a larger program, but G hasn't been written yet,
and this function will not be run in any tests until it has.
Should gcc swamp you with pointless warnings?
C warns you, but it does not stop. So, you can ignore a warning or
even disable it, if you happen to know a suitable spell.
Rust is not as tolerant, even in cases of less serious
nonobservance.
"C" does not warn you about anything here - that's up to C
compilers. Some mistakes in your C code require a diagnostic (for
conforming compilers), but not this one.
gcc will, of course, treat the use of the uninitialised variable as
an error halting compilation if you know how to use it properly.
It will also give a warning that "a" is unused, if you know how to
use it properly. Bart knows how to use gcc with flags to set
standards conformance, warnings, etc., and he knows the difference
between C the language and particular compiler implementations,
but he thinks it is fun to pretend he does not.
Rust is not fundamentally better than C here - it is simply that
Rust is a relatively new language without the historical baggage of
existing questionable quality Rust code. So Rust tools were able
to have better checking for this kind of thing out of the gate.
gcc /could/ make "-Werror=uninitialized" the default setting, but
that would cause problems with some existing code. gcc has
gradually expanded on the selection of warnings it enables by
default, but it's a slow process to avoid upsetting people with
large code bases that trigger such warnings.
The intention of my post was to claim that C is better than Rust.
Unless you didn't pay attention yet, I am rustophobic.
OK - that did not come across to me from your post. I think it is a
good thing if a language or tool is intolerant to mistakes like
these. But sometimes stricter rules and checking mean less
flexibility, so there can be trade-offs. My knowledge of Rust is far
too limited to know what it does there, and if defaults can be
overridden.
My comment was referring to "less serious nonobservance", mostly
I meant unused local variables.
On 22/04/2026 03:53, Keith Thompson wrote:
[...]
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
Even though both give the same result.
On 21/04/2026 20:51, Chris M. Thomasson wrote:
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when you
want to stick to standard C is use of "volatile". Use a volatile
read at the start of your code, then calculations that depend on each
other and that first read, then a volatile write of the result. That
gives minimal intrusion in the code while making sure the
calculations have to be generated, and have to be done at run time.
If you are testing on a particular compiler (like gcc or clang), then
there are other options. The "noinline" function attribute is very
handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1 memory
cache (for things that are not always in registers), then this
flushes that cache:
ÿÿÿÿÿasm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at this
point in time (so that its value can be passed to the assembly) :
ÿÿÿÿÿasm volatile ("" :: "" (x));
This tells the compiler that "x" might be changed by the assembly, so
it must forget any additional knowledge it had of it :
ÿÿÿÿÿasm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or
test code. They can be helpful in some kinds of interactions between
low level code and hardware.
Well, we have to make a difference between a compiler barrier and a
memory barrier. All memory barriers should be compiler barriers, but
compiler barriers do not have to be memory barriers... Fair enough?
Of course there is a difference between memory barriers and compiler
barriers. We are talking about compiler barriers here, because they
have an effect on the semantics of the language (in this case, the
language is "C with gcc extensions") without the cost of real memory
barriers. C11 atomic fences are compiler and memory barriers, but they
can have a huge effect on code speed - these empty assembly statements
are aimed at having minimal impact outside of the intended effects.
On 4/21/2026 1:13 PM, David Brown wrote:
On 21/04/2026 20:51, Chris M. Thomasson wrote:
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when
you want to stick to standard C is use of "volatile". Use a
volatile read at the start of your code, then calculations that
depend on each other and that first read, then a volatile write of
the result. That gives minimal intrusion in the code while making
sure the calculations have to be generated, and have to be done at
run time.
If you are testing on a particular compiler (like gcc or clang),
then there are other options. The "noinline" function attribute is
very handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1 memory
cache (for things that are not always in registers), then this
flushes that cache:
ÿÿÿÿÿasm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at this
point in time (so that its value can be passed to the assembly) :
ÿÿÿÿÿasm volatile ("" :: "" (x));
This tells the compiler that "x" might be changed by the assembly,
so it must forget any additional knowledge it had of it :
ÿÿÿÿÿasm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or
test code. They can be helpful in some kinds of interactions
between low level code and hardware.
Well, we have to make a difference between a compiler barrier and a
memory barrier. All memory barriers should be compiler barriers, but
compiler barriers do not have to be memory barriers... Fair enough?
Of course there is a difference between memory barriers and compiler
barriers. We are talking about compiler barriers here, because they
have an effect on the semantics of the language (in this case, the
language is "C with gcc extensions") without the cost of real memory
barriers. C11 atomic fences are compiler and memory barriers, but
they can have a huge effect on code speed - these empty assembly
statements are aimed at having minimal impact outside of the intended
effects.
I think a relaxed memory barrier can be used as a compiler barrier and
be compatible with atomic, volatile does not have to be used here?
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined
on all my likely targets.
That's perfectly fine if (for example) your language implies a default
initialization semantics. (Simula, e.g., has such a semantic defined;
declared (instantiated) integer variables have the value 0.) - But "C"
does not!
I'm sure that was foremost in the designers' minds when they created C
in 1972. It wasn't retrofitted into the spec years later at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design decisions"
onto "C" I really suggest to stop that nonsense since it doesn't lead
anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
So, what does language say about it again? Remind me! Or better, tell
the compiler.
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined
on all my likely targets.
That's perfectly fine if (for example) your language implies a default
initialization semantics. (Simula, e.g., has such a semantic defined;
declared (instantiated) integer variables have the value 0.) - But "C"
does not!
I'm sure that was foremost in the designers' minds when they created C
in 1972. It wasn't retrofitted into the spec years later at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design decisions"
onto "C" I really suggest to stop that nonsense since it doesn't lead
anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
Come on, Bart, you already know this stuff.
The behavior of `a = b;` is undefined. You know what "undefined
behavior" means. You know that C implementations are not required
to diagnose undefined behavior.
You know that, since a and b are local to the function and their
values are never used, a compiler could generate machine code for F()
as an empty function. (I do not claim that any particular compiler
does or does not perform this optimization.)
[...]
So, what does language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it. I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined. N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this. Why do you pretend?
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote:
[...]
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
OK, what am I getting close to?
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
Even though both give the same result.
Of course they can be reliably compared. One is much faster than
the other. That's a reliable comparison.
[...]
Bart <bc@freeuk.com> writes:
So, what does language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it. I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined. N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this. Why do you pretend?
Nobody seems to have a problem with gcc being lax about this (or with
it allowing its users to let it be lax).
Everybody seems to have a problem with /me/ being lax about it.
Does anyone have any actual examples of very bad things happening with
a program like the above?
From what I can see, with -O0 it just moves 32 bits from one part of
the allocated stack frame to another. And with -O1 and above, the code
is elided anyway.
Not exactly the end of the world.
On 22/04/2026 22:23, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote: [...]
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
OK, what am I getting close to?
That optimisation renders some results meaningless, but then ...
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
... here you say the opposite of 'If you don't do that, the results
will be meaningless'.
Even though both give the same result.
Of course they can be reliably compared. One is much faster than
the other. That's a reliable comparison.
Ha, ha, ha!
Remind me never to take any benchmark of yours seriously.
You seem to be more interested in pedantry than anything else.
So, taking one like this:
long long int sum=0;
for (int j=0; j<10; ++j)
for (int i=0; i<2000000000; ++i) sum+=i;
printf("%lld\n", sum);
With gcc-O0, this takes 50 seconds. With gcc-O3, it takes 0.005 seconds.
According to you, gcc managed to make this program 10,000 times faster?
Do 1000 repeats of the inner loop instead, and gcc-O3 would amazingly
speed it up by a million times.
From your previous remarks, you'd consider that a fair and accurate assessment (remember that 0.1s outlier figure).
Obviously, gcc has elided the entire program here so that the timing
is essentially zero (the 5ms is process overhead).
It's not like you can now apply that 1000000x speed-up to real
programs (who needs quantum computers!).
(If you are interested, which I doubt, it takes 6-7 seconds to run
that program using optimised code that actually does the task, which
is 20 billion iterations of 'sum+=i'.)
Bart <bc@freeuk.com> writes:
[...]
So, what does the language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it. I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined. N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this. Why do you pretend?
Nobody seems to have a problem with gcc being lax about this (or with
it allowing its users to let it be lax).
gcc is not being lax. gcc is behaving in a manner that is consistent
with the requirements of the C standard. The code in question has
undefined behavior.
You know and understand all of that.
Everybody seems to have a problem with /me/ being lax about it.
Not everybody, but I certainly do.
Does anyone have any actual examples of very bad things happening with
a program like the above?
From what I can see, with -O0 it just moves 32 bits from one part of
the allocated stack frame to another. And with -O1 and above, the code
is elided anyway.
Not exactly the end of the world.
The behavior is undefined. You know exactly what that means, but you
pretend not to.
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
When you use "b" in an expression, you are /not/ asking C to read
the bits and bytes stored at the address of the object "b". You
are asking for the /value/ of the object "b". How the compiler
gets that value is up to the compiler - it can read the memory, or
use a stored copy in a register, or use program analysis to know
what the value is in some other way. And if the object "b" does
not have a value, you are asking the impossible.
Try asking a human "You have two numbers, b and c. Add them.
What is the answer?".
You have two slates A and B which someone should have wiped clean
then written a new number on each.
But that part hasn't been done; they each still have an old number
from their last use.
You can still add them together, nothing bad will happen. It just
may be the wrong answer if the purpose of the exercise was to find
the sum of two specific new numbers.
But the purpose may also be see how good they are adding. Or in
following instructions.
whatever it happens to be, add the value of 'c' scaled by 8, and
store the result into 'a'. The only things to consider are
that some intermediate results may lose the top bits.
Is 'a = b' equally undefined? If so, then C is even crazier than
I'd thought.
If "a" or "b" are indeterminate, then using them is undefined. I
have two things - are they the same colour? How is that supposed
to make sense?
You keep thinking of objects like "b" as a section of memory with
a bit pattern in it. Objects are not that simple in C - C is not
assembly.
Why ISN'T it that simple? What ghastly thing would happen if it
was?
"b" will be some location in memory or it might be some register,
and it WILL have a value. That value happens to be unknown until
it is initialised.
So accessing it will return garbage (unless you know exactly what
you are doing then it may be something useful).
My original example was something like 'a = b + c' (I think in my
language), converted to my IL, then expressed in very low-level C.
You were concerned that in that C, the values weren't initialised.
How would that have affected the code that C compiler generated
from that?
You look at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
On Wed, 22 Apr 2026 15:16:56 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
You look at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
My record at guessing exact meaning of Tim's statements is not
particularly good, but I'll try nevertheless.
Tim seems to suggest that function foo() below had defined behavior
(most likely of returning 1) in C90 and C99, then it became undefined in
C11 and C17 then again became defined in C23.
For years 1972 to 1989 Tim probably thinks that there is no sufficient
data to answer your question.
Bart <bc@freeuk.com> writes:
[...]
Bart <bc@freeuk.com> writes:
So, what does the language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it. I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined. N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this. Why do you pretend?
Nobody seems to have a problem with gcc being lax about this (or with
it allowing its users to let it be lax).
gcc is not being lax. gcc is behaving in a manner that is consistent
with the requirements of the C standard. The code in question has
undefined behavior.
You know and understand all of that.
Everybody seems to have a problem with /me/ being lax about it.
Not everybody, but I certainly do.
Does anyone have any actual examples of very bad things happening with
a program like the above?
From what I can see, with -O0 it just moves 32 bits from one part of
the allocated stack frame to another. And with -O1 and above, the code
is elided anyway.
Not exactly the end of the world.
The behavior is undefined. You know exactly what that means, but you
pretend not to.
On Wed, 22 Apr 2026 15:13:56 +0000, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
A
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
Implementation specific. Depending on how the linker
and run-time loader handled uninitialized data regions
in the a.out file and when loading.
K&R is very specific about the initial value of automatic
variables:
1.10 Scope; External Variables
...
"Because automatic variables come and go with
function invocation, they do not retain their
values from one call to the next, and must be
explicitly set upon each entry. If they are
not set, they will contain garbage."
...
2.4 Declarations
...
"Automatic variables for which there is no
explicit initializer have undefined (i.e.
garbage) values."
...
4.9 Initialization
...
"In the absence of explicit initialization,
external and static variables are guaranteed
to be initialized to zero; automatic and
register variables have undefined (i.e. garbage)
values."
...
8.6 Initialization
...
"Static and external variables which are not
initialized are guaranteed to start off
as 0, automatic and register variables which
are not initialized are guaranteed to start
off as garbage."
...
So, for automatic and register variables at least,
even K&R defined that, before initialization, their
values were undefined.
On 22/04/2026 22:23, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote: [...]
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
OK, what am I getting close to?
That optimisation renders some results meaningless, but then ...
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
... here you say the opposite of 'If you don't do that, the results will
be meaningless'.
Even though both give the same result.
Of course they can be reliably compared. One is much faster than
the other. That's a reliable comparison.
Ha, ha, ha!
Remind me never to take any benchmark of yours seriously.
You seem to be more interested in pedantry than anything else.
So, taking one like this:
    long long int sum=0;
    for (int j=0; j<10; ++j)
        for (int i=0; i<2000000000; ++i) sum+=i;
    printf("%lld\n", sum);
With gcc-O0, this takes 50 seconds. With gcc-O3, it takes 0.005 seconds.
According to you, gcc managed to make this program 10,000 times faster?
On 4/22/2026 2:28 PM, Chris M. Thomasson wrote:
On 4/21/2026 1:13 PM, David Brown wrote:
On 21/04/2026 20:51, Chris M. Thomasson wrote:
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when
you want to stick to standard C is use of "volatile". Use a
volatile read at the start of your code, then calculations that
depend on each other and that first read, then a volatile write of
the result. That gives minimal intrusion in the code while making
sure the calculations have to be generated, and have to be done at
run time.
If you are testing on a particular compiler (like gcc or clang),
then there are other options. The "noinline" function attribute is
very handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1
memory cache (for things that are not always in registers), then
this flushes that cache:
     asm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at
this point in time (so that its value can be passed to the assembly):
     asm volatile ("" :: "" (x));
This tells the compiler that "x" might be changed by the assembly,
so it must forget any additional knowledge it had of it:
     asm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or
test code. They can be helpful in some kinds of interactions
between low level code and hardware.
Well, we have to make a difference between a compiler barrier and a
memory barrier. All memory barriers should be compiler barriers, but
compiler barriers do not have to be memory barriers... Fair enough?
Of course there is a difference between memory barriers and compiler
barriers. We are talking about compiler barriers here, because they
have an effect on the semantics of the language (in this case, the
language is "C with gcc extensions") without the cost of real memory
barriers. C11 atomic fences are compiler and memory barriers, but
they can have a huge effect on code speed - these empty assembly
statements are aimed at having minimal impact outside of the intended
effects.
I think a relaxed memory barrier can be used as a compiler barrier and
be compatible with atomic, volatile does not have to be used here?
load/store with relaxed should act like compiler barriers?
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined
on all my likely targets.
That's perfectly fine if (for example) your language implies a default
initialization semantics. (Simula, e.g., has such a semantic defined;
declared (instantiated) integer variables have the value 0.) - But "C"
does not!
I'm sure that was foremost in the designers' minds when they created C
in 1972. It wasn't retrofitted into the spec years later at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design decisions"
onto "C" I really suggest to stop that nonsense since it doesn't lead
anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
Come on, Bart, you already know this stuff.
The behavior of `a = b;` is undefined. You know what "undefined
behavior" means. You know that C implementations are not required
to diagnose undefined behavior.
You know that, since a and b are local to the function and their
values are never used, a compiler could generate machine code for F()
as an empty function. (I do not claim that any particular compiler
does or does not perform this optimization.)
On Wed, 2026-04-22 at 18:59 -0700, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
Bart <bc@freeuk.com> writes:
So, what does the language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it. I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined. N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this. Why do you pretend?
Nobody seems to have a problem with gcc being lax about this (or with
it allowing its users to let it be lax).
gcc is not being lax. gcc is behaving in a manner that is consistent
with the requirements of the C standard. The code in question has
undefined behavior.
You know and understand all of that.
Everybody seems to have a problem with /me/ being lax about it.
Not everybody, but I certainly do.
Does anyone have any actual examples of very bad things happening with
a program like the above?
From what I can see, with -O0 it just moves 32 bits from one part of
the allocated stack frame to another. And with -O1 and above, the code
is elided anyway.
Not exactly the end of the world.
The behavior is undefined. You know exactly what that means, but you
pretend not to.
1. 'The language' must see the 'C program' as it is, i.e. every component in
this case must map to some assembly code (or 'portable assembly').
2. 'Optimization' is a higher-level concept, nothing to do with 'the language'.
3. If the code is defined as undefined, why can it be justified to optimize?
So, 'undefined' is just the C standard's concept, maybe about the compiler spec...
Because the real thing is that the development of C has always been bottom-up.
On 22/04/2026 18:09, Lew Pitcher wrote:
On Wed, 22 Apr 2026 15:13:56 +0000, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
A
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
Implementation specific. Depending on how the linker
and run-time loader handled uninitialized data regions
in the a.out file and when loading.
K&R is very specific about the initial value of automatic
variables:
   1.10 Scope; External Variables
        ...
        "Because automatic variables come and go with
         function invocation, they do not retain their
         values from one call to the next, and must be
         explicitly set upon each entry. If they are
         not set, they will contain garbage."
         ...
   2.4  Declarations
        ...
        "Automatic variables for which there is no
         explicit initializer have undefined (i.e.
         garbage) values."
         ...
   4.9 Initialization
         ...
         "In the absence of explicit initialization,
          external and static variables are guaranteed
          to be initialized to zero; automatic and
          register variables have undefined (i.e. garbage)
          values."
         ...
   8.6 Initialization
         ...
         "Static and external variables which are not
          initialized are guaranteed to start off
          as 0, automatic and register variables which
          are not initialized are guaranteed to start
          off as garbage."
         ...
So, for automatic and register variables at least,
even K&R defined that, before initialization, their
values were undefined.
I don't see the use of uninitialised variables being undefined here.
It just says their values are garbage (thus unspecified values, or
possibly trap values). Indeed, it says they are /guaranteed/ to be
garbage, which is a strange turn of phrase - it could be interpreted
to mean an implementation is not allowed to zero-initialise them even
if it wanted to.
There's no doubt that use of the values of uninitialised local variables
has been a bad idea - incorrect code - since early C. But UB is not
just a case of "nothing good will happen".
On 22/04/2026 23:29, Chris M. Thomasson wrote:
On 4/22/2026 2:28 PM, Chris M. Thomasson wrote:
On 4/21/2026 1:13 PM, David Brown wrote:
On 21/04/2026 20:51, Chris M. Thomasson wrote:
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when
you want to stick to standard C is use of "volatile". Use a
volatile read at the start of your code, then calculations that
depend on each other and that first read, then a volatile write of
the result. That gives minimal intrusion in the code while making
sure the calculations have to be generated, and have to be done at
run time.
If you are testing on a particular compiler (like gcc or clang),
then there are other options. The "noinline" function attribute
is very handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1
memory cache (for things that are not always in registers), then
this flushes that cache:
     asm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at
this point in time (so that its value can be passed to the
assembly):
     asm volatile ("" :: "" (x));
This tells the compiler that "x" might be changed by the assembly,
so it must forget any additional knowledge it had of it:
     asm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or
test code. They can be helpful in some kinds of interactions
between low level code and hardware.
Well, we have to make a difference between a compiler barrier and a
memory barrier. All memory barriers should be compiler barriers,
but compiler barriers do not have to be memory barriers... Fair
enough?
Of course there is a difference between memory barriers and compiler
barriers. We are talking about compiler barriers here, because they
have an effect on the semantics of the language (in this case, the
language is "C with gcc extensions") without the cost of real memory
barriers. C11 atomic fences are compiler and memory barriers, but
they can have a huge effect on code speed - these empty assembly
statements are aimed at having minimal impact outside of the
intended effects.
I think a relaxed memory barrier can be used as a compiler barrier
and be compatible with atomic, volatile does not have to be used here?
load/store with relaxed should act like compiler barriers?
To be honest, I have never been at all sure how C11 atomic accesses
and fences relate to "memory barriers" of any sort, or how they
enforce order in respect to volatile accesses or non-volatile accesses.
The C standards at times use "volatile atomic" qualifications, which
implies that non-volatile atomic uses are not volatile. Volatile
accesses do two things - enforce an order (in the generated code, but
not necessarily at execution on the cpu) of volatile accesses, and make
the access "observable behaviour". My understanding is then that C11
atomics are missing one or both of these aspects, but I don't know which.
gcc has a "memory clobber" facility in inline assembly - and this is
commonly used as a compiler (but not cpu) memory barrier. I know what
it does in practical terms for the way I use it, but I am not sure how
precisely it can be specified in relation to the standard C semantics.
It seems reasonable to suppose that a relaxed atomic fence could act
like a gcc compiler memory barrier, but the standard says that
"atomic_thread_fence(memory_order_relaxed)" has no effects.
The main reason I have not bothered looking at the semantics and effects
of C11 atomics is that the libatomic implementation that is distributed
with gcc is (or at least /was/ when I looked a number of years ago)
fundamentally and irreparably broken for single-core microcontrollers.
Using spinlocks to enforce atomic actions is fine on a multi-core Linux
system, but a guaranteed hang on a single-core RTOS or when using
atomics from interrupts. So I use RTOS-specific features, or my own
critical section code (disabling interrupts is the way to do it on
these kinds of devices), along with gcc inline assembly - it's as far
from portable standard C code as you can get and still have it mixed
with C, but I don't need portability there.
But I have no objection at all if someone wants to give an explanation
of some of the C11 atomic semantics, though it might be better in a new
thread.
On 4/23/2026 12:22 AM, David Brown wrote:
On 22/04/2026 23:29, Chris M. Thomasson wrote:
On 4/22/2026 2:28 PM, Chris M. Thomasson wrote:
On 4/21/2026 1:13 PM, David Brown wrote:
On 21/04/2026 20:51, Chris M. Thomasson wrote:
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when
you want to stick to standard C is use of "volatile". Use a
volatile read at the start of your code, then calculations that
depend on each other and that first read, then a volatile write
of the result. That gives minimal intrusion in the code while
making sure the calculations have to be generated, and have to be
done at run time.
If you are testing on a particular compiler (like gcc or clang),
then there are other options. The "noinline" function attribute
is very handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1
memory cache (for things that are not always in registers), then
this flushes that cache:
     asm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at
this point in time (so that its value can be passed to the
assembly):
     asm volatile ("" :: "" (x));
This tells the compiler that "x" might be changed by the
assembly, so it must forget any additional knowledge it had of it:
     asm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or
test code. They can be helpful in some kinds of interactions
between low level code and hardware.
Well, we have to make a difference between a compiler barrier and
a memory barrier. All memory barriers should be compiler barriers,
but compiler barriers do not have to be memory barriers... Fair
enough?
Of course there is a difference between memory barriers and
compiler barriers. We are talking about compiler barriers here,
because they have an effect on the semantics of the language (in
this case, the language is "C with gcc extensions") without the
cost of real memory barriers. C11 atomic fences are compiler and
memory barriers, but they can have a huge effect on code speed -
these empty assembly statements are aimed at having minimal impact
outside of the intended effects.
I think a relaxed memory barrier can be used as a compiler barrier
and be compatible with atomic, volatile does not have to be used here?
load/store with relaxed should act like compiler barriers?
To be honest, I have never been at all sure how C11 atomic accesses
and fences relate to "memory barriers" of any sort, or how they
enforce order in respect to volatile accesses or non-volatile accesses.
The C standards at times use "volatile atomic" qualifications, which
implies that non-volatile atomic uses are not volatile. Volatile
accesses do two things - enforce an order (in the generated code, but
not necessarily at execution on the cpu) of volatile accesses, and
make the access "observable behaviour". My understanding is then that
C11 atomics are missing one or both of these aspects, but I don't know
which.
gcc has a "memory clobber" facility in inline assembly - and this is
commonly used as a compiler (but not cpu) memory barrier. I know what
it does in practical terms for the way I use it, but I am not sure how
precisely it can be specified in relation to the standard C semantics.
It seems reasonable to suppose that a relaxed atomic fence could act
like a gcc compiler memory barrier, but the standard says that
"atomic_thread_fence(memory_order_relaxed)" has no effects.
The main reason I have not bothered looking at the semantics and
effects of C11 atomics is that the libatomic implementation that is
distributed with gcc is (or at least /was/ when I looked a number of
years ago) fundamentally and irreparably broken for single-core
microcontrollers. Using spinlocks to enforce atomic actions is fine on
a multi-core Linux system, but a guaranteed hang on a single-core RTOS
or when using atomics from interrupts.ÿ So I use RTOS-specific
features, or my own critical section code (disabling interrupts is the
way to do it on these kinds of devices), along with gcc inline
assembly - it's as far from portable standard C code as you can get
and still have it mixed with C, but I don't need portability there.
But I have no objection at all if someone wants to give an explanation
of some of the C11 atomic semantics, though it might be better in a
new thread.
Yeah. Well, damn. I would hope that in the _compiled_ code, memory
ordering aside:
std::atomic<int> a = 0;
a.store(123);
a.store(666);
Better damn well issue two stores in that order. The memory order side
be damned for this moment, but I think std::atomic in impls are laced
with the volatile keyword anyway, but shit can happen. Humm...
On 4/23/2026 12:22 AM, David Brown wrote:
On 22/04/2026 23:29, Chris M. Thomasson wrote:
On 4/22/2026 2:28 PM, Chris M. Thomasson wrote:
On 4/21/2026 1:13 PM, David Brown wrote:
On 21/04/2026 20:51, Chris M. Thomasson wrote:
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when
you want to stick to standard C is use of "volatile". Use a
volatile read at the start of your code, then calculations that
depend on each other and that first read, then a volatile write
of the result. That gives minimal intrusion in the code while
making sure the calculations have to be generated, and have to be
done at run time.
If you are testing on a particular compiler (like gcc or clang),
then there are other options. The "noinline" function attribute
is very handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1
memory cache (for things that are not always in registers), then
this flushes that cache:
    asm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at
this point in time (so that its value can be passed to the
assembly):
    asm volatile ("" :: "g" (x));
This tells the compiler that "x" might be changed by the
assembly, so it must forget any additional knowledge it had of it:
    asm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or
test code. They can be helpful in some kinds of interactions
between low level code and hardware.
Well, we have to make a difference between a compiler barrier and
a memory barrier. All memory barriers should be compiler barriers,
but compiler barriers do not have to be memory barriers... Fair
enough?
Of course there is a difference between memory barriers and
compiler barriers.ÿ We are talking about compiler barriers here,
because they have an effect on the semantics of the language (in
this case, the language is "C with gcc extensions") without the
cost of real memory barriers.ÿ C11 atomic fences are compiler and
memory barriers, but they can have a huge effect on code speed -
these empty assembly statements are aimed at having minimal impact
outside of the intended effects.
I think a relaxed memory barrier can be used as a compiler barrier
and be compatible with atomics; volatile does not have to be used here?
Should a load/store with relaxed ordering act like a compiler barrier?
To be honest, I have never been at all sure how C11 atomic accesses
and fences relate to "memory barriers" of any sort, or how they
enforce order in respect to volatile accesses or non-volatile accesses.
The C standards at times use "volatile atomic" qualifications, which
implies that non-volatile atomic uses are not volatile. Volatile
accesses do two things - enforce an order (in the generated code, but
not necessarily at execution on the cpu) of volatile accesses, and
make the access "observable behaviour". My understanding is then that
C11 atomics are missing one or both of these aspects, but I don't know
which.
gcc has a "memory clobber" facility in inline assembly - and this is
commonly used as a compiler (but not cpu) memory barrier. I know what
it does in practical terms for the way I use it, but I am not sure how
precisely it can be specified in relation to the standard C semantics.
It seems reasonable to suppose that a relaxed atomic fence could act
like a gcc compiler memory barrier, but the standard says that
"atomic_thread_fence(memory_order_relaxed)" has no effects.
The main reason I have not bothered looking at the semantics and
effects of C11 atomics is that the libatomic implementation that is
distributed with gcc is (or at least /was/ when I looked a number of
years ago) fundamentally and irreparably broken for single-core
microcontrollers. Using spinlocks to enforce atomic actions is fine on
a multi-core Linux system, but a guaranteed hang on a single-core RTOS
or when using atomics from interrupts. So I use RTOS-specific
features, or my own critical section code (disabling interrupts is the
way to do it on these kinds of devices), along with gcc inline
assembly - it's as far from portable standard C code as you can get
and still have it mixed with C, but I don't need portability there.
But I have no objection at all if someone wants to give an explanation
of some of the C11 atomic semantics, though it might be better in a
new thread.
Yeah. Well, damn. I would hope that in the _compiled_ code, memory
ordering aside:
std::atomic<int> a = 0;
a.store(123);
a.store(666);
Better damn well issue two stores in that order. The memory order side
be damned for this moment, but I think std::atomic in impls are laced
with the volatile keyword anyway, but shit can happen. Humm...
Between 1972 and 1978 Unix was not available to the general public,
and I think for all practical purposes neither was C. Also AFAIAA
there was no recognized defining document for C during that time.
IIRC there were some papers written about C before 1978, but nothing
like a real language manual. So the answer seems to be either that
the question doesn't make sense or that everything is "undefined
behavior" because there is no language manual that defines it.
Bart <bc@freeuk.com> writes:
[...]
So, what does language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it. I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined. N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this. Why do you pretend?
Nobody seems to have a problem with gcc being lax about this (or with
it allowing its users to let it be lax).
gcc is not being lax. gcc is behaving in a manner that is consistent
with the requirements of the C standard. The code in question has
undefined behavior.
You know and understand all of that.
The behavior is undefined. You know exactly what that means, but you
pretend not to.
Michael S <already5chosen@yahoo.com> writes:
On Wed, 22 Apr 2026 15:16:56 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
You look at trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about has been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
My record at guessing exact meaning of Tim's statements is not
particularly good, but I'll try nevertheless.
Tim seems to suggest that function foo() below had defined behavior
(most likely of returning 1) in C90 and C99, then it became
undefined in C11 and C17 then again became defined in C23.
For years 1972 to 1989 Tim probably thinks that there is no
sufficient data to answer your question.
I'm curious to know what you think of my answer now that I
have written one. :)
Bart <bc@freeuk.com> writes:
On 22/04/2026 22:23, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote:
[...]
OK, what am I getting close to?
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
That optimisation renders some results meaningless, but then ...
Optimization can make the meaninglessness of some results visible.
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
... here you say the opposite of 'If you don't do that, the results
will be meaningless'.
Even though both give the same result.
Of course they can be reliably compared. One is much faster than
the other. That's a reliable comparison.
Ha, ha, ha!
It wasn't a joke. I answered your question. Perhaps you meant
something by "reliably compared" other than what I assumed.
Can you rephrase the question and be more specific?
Remind me never to take any benchmark of yours seriously.
I rarely write benchmarks. If I did, they would be much more
sophisticated than your code fragment above.
So, taking one like this:
    long long int sum=0;
    for (int j=0; j<10; ++j)
        for (int i=0; i<2000000000; ++i) sum+=i;
    printf("%lld\n", sum);
Obviously, gcc has elided the entire program here so that the timing
is essentially zero (the 5ms is process overhead).
Almost. It elided everything except code to print
1553255916290448384.
And the program still behaved as required.
Why do you have a problem with that?
If you wanted "sum" to be updated 20 billion times during program
execution, why didn't you define it as volatile? That's the exact
feature that C provides to do what you say you want.
Why, why, why do you expect the compiler to assume that you want
to measure CPU instructions rather than get correct output?
On 23/04/2026 02:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
So, what does language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it. I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined. N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this. Why do you pretend?
Nobody seems to have a problem with gcc being lax about this (or with
it allowing its users to let it be lax).
gcc is not being lax. gcc is behaving in a manner that is consistent
with the requirements of the C standard. The code in question has
undefined behavior.
You know and understand all of that.
No, I don't.
So what is the concrete effect of all that on the behaviour of gcc and
the behaviour of the code it generates?
If something bad happens (what would that be, exactly), whose fault would
that be: mine or the compiler's?
Are you suggesting that because something is tagged as UB, that it
literally gives a compiler a licence to do anything?
If so, how is that not being lax by either language, compiler, or both?
I'm starting to suspect that either nobody knows the answer, or they do,
but are chary of either blaming the compiler or criticising the language spec, and are trying to shift the blame to the user.
The behavior is undefined. You know exactly what that means, but you
pretend not to.
And yet, the behaviour I have observed is nothing remarkable: some
undefined bit patterns get used; zero is assumed; or code is just elided.
Again, do you have any real-life, practical examples of bad or unusual things happening?
If you had to put money on whether some outcome is either one of those
three I listed, or something else, which would you go for?
On Wed, 22 Apr 2026 20:39:39 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Michael S <already5chosen@yahoo.com> writes:
On Wed, 22 Apr 2026 15:16:56 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
You look at trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about has been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
My record at guessing exact meaning of Tim's statements is not
particularly good, but I'll try nevertheless.
Tim seems to suggest that function foo() below had defined behavior
(most likely of returning 1) in C90 and C99, then it became
undefined in C11 and C17 then again became defined in C23.
For years 1972 to 1989 Tim probably thinks that there is no
sufficient data to answer your question.
I'm curious to know what you think of my answer now that I
have written one. :)
I'd like to read an explanation of what exactly was changed or
clarified in 2011 and again in 2024.
Between 1989 and 2011 the behavior was either always undefined or potentially undefined, depending on when, on what data types are
involved, on some implementation-specific choices, and on how one
reads some passages in the C standard that unfortunately were not
written as clearly as they might have been.
On 23/04/2026 03:26, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 22:23, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote:
[...]
OK, what am I getting close to?
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
That optimisation renders some results meaningless, but then ...
Optimization can make the meaninglessness of some results visible.
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
... here you say the opposite of 'If you don't do that, the results
will be meaningless'.
Even though both give the same result.
Of course they can be reliably compared. One is much faster than
the other. That's a reliable comparison.
Ha, ha, ha!
It wasn't a joke. I answered your question. Perhaps you meant
something by "reliably compared" other than what I assumed.
Can you rephrase the question and be more specific?
Remind me never to take any benchmark of yours seriously.
I rarely write benchmarks. If I did, they would be much more
sophisticated than your code fragment above.
Would they make much use of 'volatile'?
So, taking one like this:
    long long int sum=0;
    for (int j=0; j<10; ++j)
        for (int i=0; i<2000000000; ++i) sum+=i;
    printf("%lld\n", sum);
Obviously, gcc has elided the entire program here so that the timing
is essentially zero (the 5ms is process overhead).
Almost. It elided everything except code to print
1553255916290448384.
(I don't notice that it overflowed.)
And the program still behaved as required.
Why do you have a problem with that?
Yes, because I expected that line to be executed 20 billion times and to take an appreciable amount of time.
If you wanted "sum" to be updated 20 billion times during program
execution, why didn't you define it as volatile? That's the exact
feature that C provides to do what you say you want.
Because I want to know how long reasonably efficient code takes to
execute it 20 billion times. Using 'volatile' would keep 'sum' memory-bound.
Why, why, why do you expect the compiler to assume that you want
to measure CPU instructions rather than get correct output?
Because I told it I wanted a loop.
(I also tested it in the two compilers for my language. The old one did
it in 6.2 seconds, but the new one in 12 seconds. Something needs
looking at!
Other than that, it is wonderful to use a language that does exactly
what you tell it, without a mind of its own, and strives to do it as efficiently as it can given a simple compiler.)
On 23/04/2026 03:26, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 22:23, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote:
[...]
OK, what am I getting close to?
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
That optimisation renders some results meaningless, but then ...
Optimization can make the meaninglessness of some results visible.
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
... here you say the opposite of 'If you don't do that, the results
will be meaningless'.
Even though both give the same result.
Of course they can be reliably compared. One is much faster than
the other. That's a reliable comparison.
Ha, ha, ha!
It wasn't a joke. I answered your question. Perhaps you meant
something by "reliably compared" other than what I assumed.
Can you rephrase the question and be more specific?
Remind me never to take any benchmark of yours seriously.
I rarely write benchmarks. If I did, they would be much more
sophisticated than your code fragment above.
Would they make much use of 'volatile'?
So, taking one like this:
long long int sum=0;
for (int j=0; j<10; ++j)
for (int i=0; i<2000000000; ++i) sum+=i;
printf("%lld\n", sum);
Obviously, gcc has elided the entire program here so that the timing
is essentially zero (the 5ms is process overhead).
Almost. It elided everything except code to print
1553255916290448384.
(I don't notice that it overflowed.)
And the program still behaved as required.
Why do you have a problem with that?
Yes, because I expected that line to be executed 20 billion times and
to take an appreciable amount of time.
If you wanted "sum" to be updated 20 billion times during program
execution, why didn't you define it as volatile? That's the exact
feature that C provides to do what you say you want.
Because I want to know how long reasonably efficient code takes to
execute it 20 billion times. Using 'volatile' would keep 'sum'
memory-bound.
I expect sensible code to keep it in a register, but to also do the
task. If I use 'volatile', then I get these results:
bcc 6.3 seconds ('volatile' is ignored)
gcc -O3 49.3 seconds
That would be nice, but it's quite misleading. The nearest I get is to
use gcc -O1 without 'volatile', then it takes 6.2 seconds.
Why, why, why do you expect the compiler to assume that you want
to measure CPU instructions rather than get correct output?
Because I told it I wanted a loop. If I just wanted the correct
output, I would have given it the formula for summing the sequence 0
to N-1, or hardcoded it myself.
Other than that, it is wonderful to use a language that does exactly
what you tell it, without a mind of its own, and strives to do it as efficiently as it can given a simple compiler.
Bart <bc@freeuk.com> writes:
Earlier, I mentioned, and you failed to acknowledge, that optimizing
away function calls is exactly as valid as optimizing 2+2 to 4.
Will you address that?
Bart <bc@freeuk.com> writes:
Because I want to know how long reasonably efficient code takes to
execute it 20 billion times. Using 'volatile' would keep 'sum'
memory-bound.
Oh? I wouldn't expect "volatile" to prevent a variable from being
stored in a register. (I don't know whether it might do so with
a given compiler, or why.)
On 23/04/2026 12:30, Bart wrote:
On 23/04/2026 03:26, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 22:23, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote:
[...]
OK, what am I getting close to?
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
That optimisation renders some results meaningless, but then ...
Optimization can make the meaninglessness of some results visible.
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
... here you say the opposite of 'If you don't do that, the results
will be meaningless'.
Even though both give the same result.
Of course they can be reliably compared. One is much faster than
the other. That's a reliable comparison.
Ha, ha, ha!
It wasn't a joke. I answered your question. Perhaps you meant
something by "reliably compared" other than what I assumed.
Can you rephrase the question and be more specific?
Remind me never to take any benchmark of yours seriously.
I rarely write benchmarks. If I did, they would be much more
sophisticated than your code fragment above.
Would they make much use of 'volatile'?
When I write benchmarks (I don't do so much, but I quite often look at generated code with godbolt.org, and the same applies there) I make use
of "volatile" as appropriate to force observable behaviour.
So, taking one like this:
    long long int sum=0;
    for (int j=0; j<10; ++j)
        for (int i=0; i<2000000000; ++i) sum+=i;
    printf("%lld\n", sum);
Obviously, gcc has elided the entire program here so that the timing
is essentially zero (the 5ms is process overhead).
Almost. It elided everything except code to print
1553255916290448384.
(I don't notice that it overflowed.)
And the program still behaved as required.
Why do you have a problem with that?
Yes, because I expected that line to be executed 20 billion times and
to take an appreciable amount of time.
Have you still not understood that your expectations are wrong?
If I ask you "what is the sum of all the integers from 1 to 100 ?", do
you think I am expecting you to do all these sums on paper? Or in your head? Or individually on a calculator? I think it is more likely that
you would write a program, or ask google, or use n(n+1)/2, or simply
know off-hand that it is 5050. If I wanted to be more specific about
how I wanted you to handle the task - not just give me the answer - I'd specify that, such as saying "you may not use a computer".
Programming is just the same. It really is that simple.
If you wanted "sum" to be updated 20 billion times during program
execution, why didn't you define it as volatile? That's the exact
feature that C provides to do what you say you want.
Because I want to know how long reasonably efficient code takes to
execute it 20 billion times. Using 'volatile' would keep 'sum'
memory-bound.
No, that is only the case if you use "volatile" blindly.
First, make the function more general :
long long int summation(long long int start, long long int n) {
    long long int sum = start;
    for (long long int i = 0; i < n; i++) {
        sum += i;
    }
    return sum;
}
Then use volatile in your driver function :
int main(void) {
    volatile long long int start = 0;
    volatile long long int n = 20000000000;
    volatile long long int result = summation(start, n);
    printf("%lld\n", result);
}
Put the volatile access at the beginning and end of the benchmark, not
in the loop, and they will have minimal overhead - but they will force
the calculation to be done at runtime.
Alternatively, I might use one of the other "do nothing" inline assembly fragments I mentioned in another post, such as changing the loop to :
    for (long long int i = 0; i < n; i++) {
        sum += i;
        __asm__("" : "+g" (sum));
    }
(I've already told you this. Do you not bother reading posts trying to
help you, because that would give you less to whine about? Or do you
have some other reason for ignoring them?)
Why, why, why do you expect the compiler to assume that you want
to measure CPU instructions rather than get correct output?
Because I told it I wanted a loop.
You told it you want the results as if there were a loop - you haven't
told it to generate a loop in the assembly. C is not assembly. Why do
you keep insisting that you expect C compilers to act like assemblers?
(I also tested it in the two compilers for my language. The old one
did it in 6.2 seconds, but the new one in 12 seconds. Something needs
looking at!
Your tests here are fine for comparing different versions of your own
language or your own tools. But if you want to benchmark aspects of
implementations for different languages, learn how to write benchmarks
to measure and test the things you are interested in. Or learn that the
things you are trying to measure are perhaps not particularly important,
and learn to measure other things.
Other than that, it is wonderful to use a language that does exactly
what you tell it, without a mind of its own, and strives to do it as
efficiently as it can given a simple compiler.)
That's why a lot of people like C. It does what they ask. Of course,
that only applies when you know the language, and know what you are
asking for.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]
Between 1972 and 1978 Unix was not available to the general public,
and I think for all practical purposes neither was C. Also AFAIAA
there was no recognized defining document for C during that time.
IIRC there were some papers written about C before 1978, but nothing
like a real language manual. So the answer seems to be either that
the question doesn't make sense or that everything is "undefined
behavior" because there is no language manual that defines it.
[...]
For the C history buffs, here are a few early papers on C:
C Reference Manual, Jan 15 1974, Dennis Ritchie https://www.nokia.com/bell-labs/about/dennis-m-ritchie/cman74.pdf
C Reference Manual, 1975, Dennis Ritchie https://www.nokia.com/bell-labs/about/dennis-m-ritchie/cman.pdf
Programming in C - A Tutorial, 1975(?), Brian Kernighan https://www.nokia.com/bell-labs/about/dennis-m-ritchie/ctut.pdf
The Development of the C Language, 1994, Dennis Ritchie https://www.nokia.com/bell-labs/about/dennis-m-ritchie/chist.pdf
Dennis Ritchie's home page https://www.nokia.com/bell-labs/about/dennis-m-ritchie/
has a number of other papers on early Unix, BCPL, B, and C.
Bart <bc@freeuk.com> writes:
[...]
Assessment of what? What exactly is it about your code fragment
that implies it's meant to be used as an assessment? Do you
expect the compiler to understand that what you want is a program
that performs 20 billion run-time operations rather than one that
produces correct output?
[...]
On 23/04/2026 03:26, Keith Thompson wrote:
And the program still behaved as required.
Why do you have a problem with that?
Yes, because I expected that line to be executed 20 billion times and to take an appreciable amount of time.
On Thu, 23 Apr 2026 13:12:16 +0200
David Brown <david.brown@hesbynett.no> wrote:
Would they make much use of 'volatile'?
When I write benchmarks (I don't do so much, but I quite often look
at generated code with godbolt.org, and the same applies there) I
make use of "volatile" as appropriate to force observable behaviour.
I never do.
I always try my best to give to execution of the "item under test" a
real meaning.
On 23/04/2026 12:43, Keith Thompson wrote:
I mean, it is not as though you can choose a different ADD instruction
to make it faster! You have to look at the bigger picture; it can't be realistically isolated.
On Thu, 23 Apr 2026 13:12:16 +0200...
David Brown <david.brown@hesbynett.no> wrote:
When I write benchmarks (I don't do so much, but I quite often look
at generated code with godbolt.org, and the same applies there) I
make use of "volatile" as appropriate to force observable behaviour.
I never do.
I always try my best to give to execution of the "item under test" a
real meaning.
Michael S <already5chosen@yahoo.com> writes:
On Thu, 23 Apr 2026 13:12:16 +0200
David Brown <david.brown@hesbynett.no> wrote:
Would they make much use of 'volatile'?
When I write benchmarks (I don't do so much, but I quite often look
at generated code with godbolt.org, and the same applies there) I
make use of "volatile" as appropriate to force observable
behaviour.
I never do.
I always try my best to give to execution of the "item under test" a
real meaning.
What is the item under test? The application, the compiler or
the processor implementation?
Bart <bc@freeuk.com> writes:
On 23/04/2026 12:43, Keith Thompson wrote:
I mean, it is not as though you can choose a different ADD
instruction to make it faster! You have to look at the bigger
picture; it can't be realistically isolated.
Actually, there are many flavors of ADD instruction on
x86/x86_64 processors. Some execute faster than others
(e.g. all three operands may be in registers or all three
may require fills to three different cache lines).
On 23/04/2026 12:12, David Brown wrote:
Your tests here are fine for comparing different versions of your own
language or your own tools.  But if you want to benchmark aspects of
implementations for different languages, learn how to write benchmarks
to measure and test the things you are interested in.  Or learn that
the things you are trying to measure are perhaps not particularly
important, and learn to measure other things.
Even such a simple benchmark generally works well across many languages.
Are you suggesting that because something is tagged as UB, that it
literally gives a compiler a licence to do anything?
On 2026-04-23 08:12, Michael S wrote:
On Thu, 23 Apr 2026 13:12:16 +0200...
David Brown <david.brown@hesbynett.no> wrote:
When I write benchmarks (I don't do so much, but I quite often look
at generated code with godbolt.org, and the same applies there) I
make use of "volatile" as appropriate to force observable
behaviour.
I never do.
I always try my best to give to execution of the "item under test" a
real meaning.
If by "a real meaning" you mean something connected to the C
standard's definition of "observable behavior", that's a perfectly
valid approach. Using volatile "as appropriate" is another perfectly
valid approach to getting observable behavior. It's also a simpler,
less intrusive approach, because the only other ways to have
observable behavior require doing I/O; often-times you don't want to
be testing I/O speed as part of your benchmark.
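The volatile approach described above can be sketched in a few lines. This is a minimal illustration, not code from the thread; the names bench_source, bench_sink and run_additions are mine:

```c
#include <stdint.h>

/* Hypothetical volatile source and sink for a benchmark kernel. */
volatile uint64_t bench_source = 1;
volatile uint64_t bench_sink;

/* The work under test: a chain of additions seeded by a volatile
   read, so the compiler cannot fold the loop to a constant, ending
   in a volatile write, which makes the result observable behaviour. */
uint64_t run_additions(int n) {
    uint64_t x = bench_source;   /* value unknown to the optimiser */
    for (int i = 0; i < n; i++)
        x = x + (uint64_t)i;     /* must be computed at run time */
    bench_sink = x;              /* observable: cannot be elided */
    return x;
}
```

The volatile read and write bracket the computation without doing any I/O, which is exactly the "simpler, less intrusive" property mentioned above.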
Indeed, transformation applied by compilers in this case is
more complex than mere [tail call elimination].
In theory, it can be a result of two successive transformations.
First transforming original to fib2:
unsigned long long fib2(unsigned long long n, unsigned long long acc)
{
if (n < 3)
return acc + 1;
return fib2(n-2, fib2(n-1, acc));
}
And then applying TCE.
But more likely the compiler arrived at the same outcome by different
logical steps.
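The original function was not quoted in this part of the thread; a plausible doubly recursive form consistent with fib2 (such that fib2(n, acc) == acc + fib(n)) would be:

```c
/* Assumed original (my reconstruction, not quoted upthread). */
unsigned long long fib(unsigned long long n) {
    if (n < 3)
        return 1;
    return fib(n - 1) + fib(n - 2);
}

/* The accumulator version from the thread: fib2(n, acc) == acc + fib(n),
   which an optimiser can then tail-call-eliminate. */
unsigned long long fib2(unsigned long long n, unsigned long long acc)
{
    if (n < 3)
        return acc + 1;
    return fib2(n - 2, fib2(n - 1, acc));
}
```

The equivalence follows by induction: for n < 3 both sides are acc + 1, and otherwise fib2(n-2, fib2(n-1, acc)) = acc + fib(n-1) + fib(n-2).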
On 23/04/2026 11:58, Bart wrote:
...
Are you suggesting that because something is tagged as UB, that it
literally gives a compiler a licence to do anything?
"behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this document imposes no requirements."
(3.5.3p1).
What exactly do you think "no requirements" means? What could it
possibly mean other than "license to do anything"?
On Wed, 22 Apr 2026 20:39:39 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Michael S <already5chosen@yahoo.com> writes:
On Wed, 22 Apr 2026 15:16:56 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
You look at trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about has been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
My record at guessing exact meaning of Tim's statements is not
particularly good, but I'll try nevertheless.
Tim seems to suggest that function foo() below had defined behavior
(most likely of returning 1) in C90 and C99, then it became
undefined in C11 and C17 then again became defined in C23.
For years 1972 to 1989 Tim probably thinks that there is no
sufficient data to answer your question.
I'm curious to know what you think of my answer now that I
have written one. :)
I'd like to read an explanation of what exactly was changed or
clarified in 2011 and again in 2024.
IMO, most "undefined behavior" in the C specification was due to implementation differences between the C compilers/linkers that
existed at the time.
On 23/04/2026 15:42, James Kuyper wrote:
On 23/04/2026 11:58, Bart wrote:
...
Are you suggesting that because something is tagged as UB, that it
literally gives a compiler a licence to do anything?
"behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this document imposes no requirements."
(3.5.3p1).
What exactly do you think "no requirements" means? What could it
possibly mean other than "license to do anything"?
So the effect is that the compiler can be 'lax' in being able to do
what it likes, including not reporting it and not refusing to fail the
program.
KT said: "the compiler is not being lax". I was responding to that.
If it is not being lax, then I'd like to know what 'being lax' would look
like for this compiler.
Bart <bc@freeuk.com> writes:
On 23/04/2026 15:42, James Kuyper wrote:
On 23/04/2026 11:58, Bart wrote:
...
Are you suggesting that because something is tagged as UB, that it
literally gives a compiler a licence to do anything?
"behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this document imposes no requirements."
(3.5.3p1).
What exactly do you think "no requirements" means? What could it
possibly mean other than "license to do anything"?
So the effect is that the compiler can be 'lax' in being able to do
what it likes, including not reporting it and not refusing to fail the
program.
KT said: "the compiler is not being lax". I was responding to that.
If it is not being lax, then I'd like to know what 'being lax' would look
like for this compiler.
What "being lax" means, for any compiler and not just this one,
is not being faithful to what the C standard requires of a
conforming implementation.
On 23/04/2026 14:43, Bart wrote:
On 23/04/2026 12:12, David Brown wrote:
Your tests here are fine for comparing different versions of your own
language or your own tools.  But if you want to benchmark aspects of
implementations for different languages, learn how to write
benchmarks to measure and test the things you are interested in.  Or
learn that the things you are trying to measure are perhaps not
particularly important, and learn to measure other things.
Even such a simple benchmark generally works well across many languages.
So we can conclude that C is apparently a much better language than
these others for real programming,
if run-time efficiency is important
to the task, because it allows much better optimisations.
(I don't think this is necessarily true - there are other languages and tools
that can generate efficient object code - but it is the conclusion I
draw from your testing.)
And we can conclude that you are unable to write benchmarks that measure what you want to measure.
And we can conclude that you have no interest
in improving that situation by learning anything.
Bart <bc@freeuk.com> wrote:
On 21/04/2026 14:43, David Brown wrote:
On 21/04/2026 14:48, Bart wrote:
I might measure performance by invoking it N times. Suppose I get these
results across 4 languages:
L1: 3.5 seconds
L2: 4.2
L3: 0.1
L4: 2.9
According to you, obviously L3 is the winner because of its superior
optimiser! No red flags at all.
I see a big red flag above "by invoking it N times".
Concerning
numbers, of course I would be interested how L3 managed to
get much better results.
As a little challenge I invite you to predict the performance
of gcc on the following 2 functions:
void
f1(unsigned char * a, int t, int n) {
int i;
for(i=0; i < n; i++) {
if (a[i] > t) {
a[i] = t;
}
}
}
void
f2(unsigned char * a, int t, int n) {
int i;
for(i=0; i < n; i++) {
a[i] = (a[i] > t)?t:a[i];
}
}
and on 2 sets of data, one where a is filled with constant
value, the second one with pseudo-random one. n should be
100000, t should be 128, a should be freshly filled with values.
I do not ask about exact values, but just a qualitative comparison.
Note: simply calling f1 or f2 multiple times will give wrong
times (you need to fill it with data before each call!). Similarly,
increasing n does not give the intended time (the size is reasonably
natural for the problem and will fit in the L2 cache on a typical
modern machine).
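Corrected, compilable versions of the two functions above (the posted loop headers have a stray '}'), plus the fresh-fill helpers the challenge calls for. The timing itself is omitted, and the LCG constants in fill_random are my own choice:

```c
#include <string.h>

/* Clamp via an explicit branch: predictable on constant data,
   mispredicted roughly half the time on random data. */
void f1(unsigned char *a, int t, int n) {
    for (int i = 0; i < n; i++) {
        if (a[i] > t)
            a[i] = (unsigned char)t;
    }
}

/* Clamp via a conditional expression, which gcc can typically turn
   into branch-free (and vectorised) code. */
void f2(unsigned char *a, int t, int n) {
    for (int i = 0; i < n; i++)
        a[i] = (a[i] > t) ? (unsigned char)t : a[i];
}

/* Fresh fills before each timed call, as the note above insists. */
void fill_const(unsigned char *a, int n, unsigned char v) {
    memset(a, v, (size_t)n);
}

void fill_random(unsigned char *a, int n, unsigned seed) {
    for (int i = 0; i < n; i++) {
        seed = seed * 1103515245u + 12345u;  /* simple LCG, my choice */
        a[i] = (unsigned char)(seed >> 16);
    }
}
```

The interesting comparison is then f1 vs f2 on constant vs pseudo-random data: the two functions compute the same result, but their run times need not track each other across the two data sets.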
On 23/04/2026 12:30, Bart wrote:
On 23/04/2026 03:26, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 22:23, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote:
[...]
OK, what am I getting close to?
You're *so* close to getting it.  If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away.  If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
That optimisation renders some results meaningless, but then ...
Optimization can make the meaninglessness of some results visible.
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
... here you say the opposite of 'If you don't do that, the results
will be meaningless'.
Even though both give the same result.
Of course they can be reliably compared.  One is much faster than
the other.  That's a reliable comparison.
Ha, ha, ha!
It wasn't a joke.  I answered your question.  Perhaps you meant
something by "reliably compared" other than what I assumed.
Can you rephrase the question and be more specific?
Remind me never to take any benchmark of yours seriously.
I rarely write benchmarks.  If I did, they would be much more
sophisticated than your code fragment above.
Would they make much use of 'volatile'?
When I write benchmarks (I don't do so much, but I quite often look at generated code with godbolt.org, and the same applies there) I make use
of "volatile" as appropriate to force observable behaviour.
Bart <bc@freeuk.com> wrote:
Other than that, it is wonderful to use a language that does exactly
what you tell it, without a mind of its own, and strives to do it as
efficiently as it can given a simple compiler.)
Consider the following function:
void
f(int k, int n, int a[n][n], int b[n][n], int c[n][n]) {
int i;
for(i = 0; i < n; i++) {
a[k][i] = b[k][i] + c[k][i];
}
}
How many instructions should it execute at runtime?
Note, this uses C VMT-s because this is what is needed in real use.
C got VMT-s rather late,
but for example it would be trivial to
translate this function to Fortran 66. In a "C" compiler that does
not support VMT-s you can do
#define aref(a, k, i) (*(a + n*k + i))
and replace the assignment inside the loop by
aref(a, k, i) = aref(b, k, i) + aref(c, k, i);
Is a compiler which generates code doing 3*n multiplications efficient?
Does a compiler which does not need 3*n multiplications have
"a mind of its own"?
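The point can be made concrete by writing the hoisted form by hand. A naive expansion of aref, *(a + n*k + i), implies a multiplication per access; an optimising compiler instead computes each row base once. A sketch over flat arrays (add_rows is my name, not from the thread):

```c
#include <stddef.h>

/* Hand-hoisted form of a[k][i] = b[k][i] + c[k][i] over flat arrays:
   each row base n*k is computed once, outside the loop, instead of
   3*n times inside it - which is the transformation a compiler that
   "does not need 3*n multiplications" performs. */
void add_rows(int k, int n, int *a, const int *b, const int *c) {
    int *ap = a + (size_t)n * (size_t)k;       /* one multiply */
    const int *bp = b + (size_t)n * (size_t)k; /* one multiply */
    const int *cp = c + (size_t)n * (size_t)k; /* one multiply */
    for (int i = 0; i < n; i++)
        ap[i] = bp[i] + cp[i];                 /* no multiplies here */
}
```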
On 23/04/2026 11:03, Chris M. Thomasson wrote:
On 4/23/2026 12:22 AM, David Brown wrote:
On 22/04/2026 23:29, Chris M. Thomasson wrote:
On 4/22/2026 2:28 PM, Chris M. Thomasson wrote:
On 4/21/2026 1:13 PM, David Brown wrote:
load/store with relaxed should act like compiler barriers?
On 21/04/2026 20:51, Chris M. Thomasson wrote:
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution.  For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary).  You can
do computations that the compiler can't unwrap.  You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that.  (That's off the top of my head; I don't
know what the best techniques are in practice.)  And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking
when you want to stick to standard C is use of "volatile".  Use
a volatile read at the start of your code, then calculations
that depend on each other and that first read, then a volatile
write of the result.  That gives minimal intrusion in the code
while making sure the calculations have to be generated, and
have to be done at run time.
If you are testing on a particular compiler (like gcc or clang),
then there are other options.  The "noinline" function attribute
is very handy.  Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1
memory cache (for things that are not always in registers), then
this flushes that cache:
     asm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at
this point in time (so that its value can be passed to the
assembly):
     asm volatile ("" :: "g" (x));
This tells the compiler that "x" might be changed by the
assembly, so it must forget any additional knowledge it had of it:
     asm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks
or test code.  They can be helpful in some kinds of interactions
between low level code and hardware.
Well, we have to make a difference between a compiler barrier and
a memory barrier. All memory barriers should be compiler
barriers, but compiler barriers do not have to be memory
barriers... Fair enough?
Of course there is a difference between memory barriers and
compiler barriers.  We are talking about compiler barriers here,
because they have an effect on the semantics of the language (in
this case, the language is "C with gcc extensions") without the
cost of real memory barriers.  C11 atomic fences are compiler and
memory barriers, but they can have a huge effect on code speed -
these empty assembly statements are aimed at having minimal impact
outside of the intended effects.
I think a relaxed memory barrier can be used as a compiler barrier
and be compatible with atomic, volatile does not have to be used here?
To be honest, I have never been at all sure how C11 atomic accesses
and fences relate to "memory barriers" of any sort, or how they
enforce order in respect to volatile accesses or non-volatile accesses.
The C standards at times use "volatile atomic" qualifications, which
implies that non-volatile atomic uses are not volatile.  Volatile
accesses do two things - enforce an order (in the generated code, but
not necessarily at execution on the cpu) of volatile accesses, and
make the access "observable behaviour".  My understanding is then
that C11 atomics are missing one or both of these aspects, but I
don't know which.
gcc has a "memory clobber" facility in inline assembly - and this is
commonly used as a compiler (but not cpu) memory barrier.  I know
what it does in practical terms for the way I use it, but I am not
sure how precisely it can be specified in relation to the standard C
semantics. It seems reasonable to suppose that a relaxed atomic fence
could act like a gcc compiler memory barrier, but the standard says
that "atomic_thread_fence(memory_order_relaxed)" has no effects.
The main reason I have not bothered looking at the semantics and
effects of C11 atomics is that the libatomic implementation that is
distributed with gcc is (or at least /was/ when I looked a number of
years ago) fundamentally and irreparably broken for single-core
microcontrollers. Using spinlocks to enforce atomic actions is fine
on a multi-core Linux system, but a guaranteed hang on a single-core
RTOS or when using atomics from interrupts.  So I use RTOS-specific
features, or my own critical section code (disabling interrupts is
the way to do it on these kinds of devices), along with gcc inline
assembly - it's as far from portable standard C code as you can get
and still have it mixed with C, but I don't need portability there.
But I have no objection at all if someone wants to give an
explanation of some of the C11 atomic semantics, though it might be
better in a new thread.
Yeah. Well, damn. I would hope that in the _compiled_ code, memory
ordering aside:
std::atomic<int> a = 0;
a.store(123);
a.store(666);
Better damn well issue two stores in that order. The memory order side
be damned for this moment, but I think std::atomic in impls are laced
with the volatile keyword anyway, but shit can happen. Humm...
This is c.l.c., not c.l.c++, but they use the same memory model here.
A brief test shows that gcc seems to do both stores regardless of the
memory order (for atomic_store_explicit).  With memory_order_seq_cst,
gcc appears to act as though there were a compiler memory barrier along
with the store - with memory_order_relaxed, there is no such barrier.
That is, non-volatile accesses can be moved around.  So this:
_Atomic int a1;
int i1;
void foo(int x) {
    i1 = 100;
    atomic_store_explicit(&a1, x, memory_order_relaxed);
    atomic_store_explicit(&a1, x + 1, memory_order_relaxed);
    i1 = i1 + 1;
}
gets optimised as though it were:
void foo(int x) {
    atomic_store_explicit(&a1, x, memory_order_relaxed);
    atomic_store_explicit(&a1, x + 1, memory_order_relaxed);
    i1 = 101;
}
It is difficult to test, by trial and error, if volatile accesses get
re-ordered around relaxed atomic accesses.  Regardless of semantics,
the compiler is not going to re-order them unless there are clear
efficiency benefits, and since relaxed atomic operations apparently
can't be combined (or at least, gcc does not combine them), I haven't
got any examples where the compiler would be likely to re-arrange
things if it is allowed to do so.  But my failure to find a
counter-example here does not mean that I am sure relaxed atomic
accesses cannot be re-ordered with respect to non-atomic volatile
accesses.
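The foo example quoted above compiles as-is once the header is added. A single-threaded check can only confirm the final values; whether both relaxed stores are actually emitted, and where the barrier falls under seq_cst, has to be verified by inspecting the generated code (e.g. on godbolt.org):

```c
#include <stdatomic.h>

_Atomic int a1;
int i1;

/* Same function as quoted in the thread: gcc emits both relaxed
   stores to a1, while the two plain stores to i1 may be folded
   into a single store of 101. */
void foo(int x) {
    i1 = 100;
    atomic_store_explicit(&a1, x, memory_order_relaxed);
    atomic_store_explicit(&a1, x + 1, memory_order_relaxed);
    i1 = i1 + 1;
}
```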
On 23/04/2026 02:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
So, what does language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it.  I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined.  N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this.  Why do you pretend?
Nobody seems to have a problem with gcc being lax about this (or with
it allowing its users to let it be lax).
gcc is not being lax. gcc is behaving in a manner that is consistent
with the requirements of the C standard.  The code in question has
undefined behavior.
You know and understand all of that.
No, I don't.
So what is the concrete effect of all that on the behaviour of gcc and
the behaviour of the code it generates?
If something bad happens (what would that be exactly), whose fault would
that be, mine or the compiler's?
Are you suggesting that because something is tagged as UB, that it
literally gives a compiler a licence to do anything?
If so, how is that not being lax by either language, compiler, or both?
I'm starting to suspect that either nobody knows the answer, or they do,
but are chary of either blaming the compiler or criticising the language spec, and are trying to shift the blame to the user.
The behavior is undefined.ÿ You know exactly what that means, but you
pretend not to.
And yet, the behaviour I have observed is nothing remarkable: some
undefined bit patterns get used; zero is assumed; or code is just elided.
Again, do you have any real-life, practical examples of bad or unusual things happening?
If you had to put money on whether some outcome is either one of the
three I listed, or something else, which would you go for?