In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
a=b; // equ. to "mov a,b"
The 2nd difference: Assembly contains too many burdensome labels. In C, we use structure, for example:
while(a<b) { // 'while', '(', ')' may be the place for implicit labels
a+=1;
} // '}' is an implicit label
if(a<b) {
} else { // '{', '}' are implicit labels
} // ditto
The 3rd difference: Function calling convention in C is reentrant (mostly).
The 4th difference: Local variable.
(Assembly can theoretically do the same, but I can't recall which assemblers support this feature.)
On 14/04/2026 15:47, wij wrote:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
  a=b;  // equ. to "mov a,b"
What C's 'a=b' equates to in assembly could be anything, depending on
target machine, the types of 'a' and 'b', their scopes and linkage, the
compiler used, and the optimisation levels employed.
The 2nd difference: Assembly contains too many burdensome labels. In C, we use
structure, for example:
  while(a<b) {   // 'while', '(', ')' may be the place for implicit labels
    a+=1;
  }              // '}' is an implicit label
  if(a<b) {
  } else {       // '{', '}' are implicit labels
  }              // ditto
The 3rd difference: Function calling convention in C is reentrant (mostly).
The 4th difference: Local variable.
(Assembly can theoretically do the same, but I can't recall which assemblers
support this feature.)
So basically, C and Assembly are NOT essentially the same. C has far
more abstractions: it is a HLL.
And actually, there are at least a couple of language levels I've used
that sit between Assembly and C.
On Tue, 2026-04-14 at 18:45 +0100, Bart wrote:
On 14/04/2026 15:47, wij wrote:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
  a=b;  // equ. to "mov a,b"
What C's 'a=b' equates to in assembly could be anything, depending on
target machine, the types of 'a' and 'b', their scopes and linkage, the
compiler used, and the optimisation levels employed.
The 2nd difference: Assembly contains too many burdensome labels. In C, we use
structure, for example:
  while(a<b) {   // 'while', '(', ')' may be the place for implicit labels
    a+=1;
  }              // '}' is an implicit label
  if(a<b) {
  } else {       // '{', '}' are implicit labels
  }              // ditto
The 3rd difference: Function calling convention in C is reentrant (mostly).
The 4th difference: Local variable.
(Assembly can theoretically do the same, but I can't recall which assemblers
support this feature.)
So basically, C and Assembly are NOT essentially the same. C has far
more abstractions: it is a HLL.
Anyway, IMO, 'portable assembly' is more descriptive.
'High-Level Language' is anyone's interpretation (prone to mis-interpretation and
misunderstanding).
'Assembly' can also be like C:
// This is 'assembly'
def int=32bit; // Choose right bits for your platform, or leave it for
def char= 8bit; // compiler to decide.
int a;
char b;
a=b; // allow auto promotion
while(a<b) {
a+=1;
}
You also can call the above example 'C'. If so, you still have to know how wide
int/char is (not rare: programmers often struggle over which size to use) while
writing "a=b", eventually. What does the 'abstraction' really mean? Maybe, eventually,
back to int32_t and int8_t after long theoretical/philosophical pondering?
HLL is just 'style' in favor of a specific purpose over another, for me. I am not
saying it is wrong; instead it is very helpful (measured by actual effort and
gain).
And actually, there are at least a couple of language levels I've used
that sit between Assembly and C.
On Tue, 2026-04-14 at 18:45 +0100, Bart wrote:
On 14/04/2026 15:47, wij wrote:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
  a=b;  // equ. to "mov a,b"
What C's 'a=b' equates to in assembly could be anything, depending on
target machine, the types of 'a' and 'b', their scopes and linkage, the
compiler used, and the optimisation levels employed.
The 2nd difference: Assembly contains too many burdensome labels. In C, we use
structure, for example:
  while(a<b) {   // 'while', '(', ')' may be the place for implicit labels
    a+=1;
  }              // '}' is an implicit label
  if(a<b) {
  } else {       // '{', '}' are implicit labels
  }              // ditto
The 3rd difference: Function calling convention in C is reentrant (mostly).
The 4th difference: Local variable.
(Assembly can theoretically do the same, but I can't recall which assemblers
support this feature.)
So basically, C and Assembly are NOT essentially the same. C has far
more abstractions: it is a HLL.
Anyway, IMO, 'portable assembly' is more descriptive.
'High-Level Language' is anyone's interpretation (prone to mis-interpretation and
misunderstanding).
'Assembly' can also be like C:
// This is 'assembly'
def int=32bit; // Choose right bits for your platform, or leave it for
def char= 8bit; // compiler to decide.
int a;
char b;
a=b; // allow auto promotion
while(a<b) {
a+=1;
}
You also can call the above example 'C'.
If so, you still have to know how wide
int/char is (not rare: programmers often struggle over which size to use) while
writing "a=b", eventually. What does the 'abstraction' really mean? Maybe, eventually,
back to int32_t and int8_t after long theoretical/philosophical pondering?
HLL is just 'style' in favor of a specific purpose over another, for me. I am not
saying it is wrong; instead it is very helpful (measured by actual effort and
gain).
But if you want to call C some kind of assembler, even though it is
several levels above actual assembly, then that's up to you.
On 2026-04-14 23:41, Bart wrote:
But if you want to call C some kind of assembler, even though it is
several levels above actual assembly, then that's up to you.
Can you name and describe a couple of these "several levels above
actual assembly"?  (Assembler macros might qualify as one level.)
Beyond the inherent subjective aspects of that or the OP's initial
statement I certainly see "C" closer to the machine than many HLLs.
It certainly depends on where one is coming from; from an abstract
or user-application level or from the machine level.
There was often mentioned here - very much to the displeasure of the
audience - that there's a lot of effort necessary to implement simple
concepts. To jump on that bandwagon: how would, say, Awk's array
construct  map[key] = value  have to be modeled in (native) "C"?
(Note that this simple statement represents an associative array.)
"C" is abstracting from the machine. And the OP's initial statement
"C and assembly are essentially the same" may be nonsense
wij <wyniijj5@gmail.com> writes:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
No, C is not any kind of assembly.  Assembly language and C are
fundamentally different.
An assembly language program specifies a sequence of CPU instructions.
A C program specifies run-time behavior.  (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
  a=b;  // equ. to "mov a,b"
(Or "mov b,a" depending on the assembly syntax.)
Nope.  `a=b` could translate to a lot of different instruction
sequences.  Either or both of the operands could be registers.  There
might or might not be different "mov" instructions for integers, pointers,
floating-point values.  a and b could be large structs, and the
assignment might be translated to a call to memcpy(), or to equivalent
inline code.
Or the assignment might not result in any code at all, if the compiler
can prove that it has no side effects and the value of a is not used.
[...]
On 14/04/2026 19:41, wij wrote:
On Tue, 2026-04-14 at 18:45 +0100, Bart wrote:
On 14/04/2026 15:47, wij wrote:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
  a=b;  // equ. to "mov a,b"
What C's 'a=b' equates to in assembly could be anything, depending on
target machine, the types of 'a' and 'b', their scopes and linkage, the
compiler used, and the optimisation levels employed.
The 2nd difference: Assembly contains too many burdensome labels. In C, we use
structure, for example:
  while(a<b) {   // 'while', '(', ')' may be the place for implicit labels
    a+=1;
  }              // '}' is an implicit label
  if(a<b) {
  } else {       // '{', '}' are implicit labels
  }              // ditto
The 3rd difference: Function calling convention in C is reentrant (mostly).
The 4th difference: Local variable.
(Assembly can theoretically do the same, but I can't recall which assemblers
support this feature.)
So basically, C and Assembly are NOT essentially the same. C has far
more abstractions: it is a HLL.
Anyway, IMO, 'portable assembly' is more descriptive.
'High-Level Language' is anyone's interpretation (prone to misinterpretation and
misunderstanding).
'Assembly' can also be like C:
  // This is 'assembly'
  def int=32bit;   // Choose right bits for your platform, or leave it for
  def char= 8bit;  // compiler to decide.
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You also can call the above example 'C'.
That's because it is pretty much C. It's not like any assembly I've ever
seen!
If so, you still have to know how wide
int/char is (not rare: programmers often struggle over which size to use) while
writing "a=b", eventually. What does the 'abstraction' really mean? Maybe, eventually,
back to int32_t and int8_t after long theoretical/philosophical pondering?
Wrap the above into a viable C function. Paste it into godbolt.org, then
look at the actual assembly that is generated for combinations of
target, compiler and options. All will be different. Some may not even
generate any code for that loop. (You might also try Clang with -emit-llvm.)
Then change the types of a and b, say to floats or pointers, and do it
again. The assembly will change yet again, even though you've modified
nothing else. That is a characteristic of a HLL: you change one small
part, and the generated code changes across the program.
But if you want to call C some kind of assembler, even though it is
several levels above actual assembly, then that's up to you.
HLL is just 'style' in favor of a specific purpose over another, for me. I am not
saying it is wrong; instead it is very helpful (measured by actual effort and
gain).
Below C there are HLAs or high-level assemblers, which at one time were
also called machine-oriented languages, intended for humans to use. And
a little below that might be intermediate languages (ILs), usually machine-generated, intended for compiler backends.
ILs will be target-independent and so portable to some extent. I'd say
that 'portable assembly' fits those better.
(I've implemented, or devised and implemented, all the four levels
discussed here. There are also other languages in this space such as
PL/M, or Forth.)
On Tue, 2026-04-14 at 15:31 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
No, C is not any kind of assembly.  Assembly language and C are
fundamentally different.
An assembly language program specifies a sequence of CPU instructions.
[Repeat] 'Assembly' can also be like C:
// This is 'assembly'
def int=32bit; // Choose right bits for your platform, or leave it for
def char= 8bit; // compiler to decide.
int a;
char b;
a=b; // allow auto promotion
while(a<b) {
a+=1;
}
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have functions, macros).
A C program specifies run-time behavior.  (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
  a=b;  // equ. to "mov a,b"
(Or "mov b,a" depending on the assembly syntax.)
Nope.  `a=b` could translate to a lot of different instruction
sequences.  Either or both of the operands could be registers.  There
might or might not be different "mov" instructions for integers, pointers,
floating-point values.  a and b could be large structs, and the
assignment might be translated to a call to memcpy(), or to equivalent
inline code.
Or the assignment might not result in any code at all, if the compiler
can prove that it has no side effects and the value of a is not used.
[...]
All mentioned could also be implemented in assembly.
Note that I am not saying C is assembly.
C and assembly are essentially the same
wij <wyniijj5@gmail.com> writes:
On Tue, 2026-04-14 at 15:31 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
No, C is not any kind of assembly.  Assembly language and C are
fundamentally different.
An assembly language program specifies a sequence of CPU instructions.
[Repeat] 'Assembly' can also be like C:
  // This is 'assembly'
  def int=32bit;   // Choose right bits for your platform, or leave it for
  def char= 8bit;  // compiler to decide.
Compiler?  You said this was assembly.
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language.  What assembler?
For what CPU?
Is it even for a real assembler?
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have functions, macros).
A C program specifies run-time behavior.  (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
Yes, that's what I said.  And that's the fundamental difference between
assembly and C.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
  a=b;  // equ. to "mov a,b"
(Or "mov b,a" depending on the assembly syntax.)
Nope.  `a=b` could translate to a lot of different instruction
sequences.  Either or both of the operands could be registers.  There
might or might not be different "mov" instructions for integers, pointers,
floating-point values.  a and b could be large structs, and the
assignment might be translated to a call to memcpy(), or to equivalent
inline code.
Or the assignment might not result in any code at all, if the compiler
can prove that it has no side effects and the value of a is not used.
[...]
All mentioned could also be implemented in assembly.
Sure, many C compilers can generate assembly code.  But I question your
claim that an assembler can plausibly generate a call to memcpy() for
something that looks like a simple assignment.
Many assemblers support macros, but the assembly language still
specifies the sequence of CPU instructions.
If you can cite a real-world "assembler" that behaves that way,
there might be something to discuss.
Note that I am not saying C is assembly.
You said that "C and assembly are essentially the same, maybe better
call it 'portable assembly'."  I disagree.
I had a similar discussion here some time ago.  As I recall, the
other participant repeatedly claimed that sophisticated assemblers
that don't generate specified sequences of CPU instructions are
common, but never provided an example.  (I haven't been able to
track down the discussion.)
On 14/04/2026 23:20, Janis Papanagnou wrote:
On 2026-04-14 23:41, Bart wrote:
But if you want to call C some kind of assembler, even though it is
several levels above actual assembly, then that's up to you.
Can you name and describe a couple of these "several levels above
actual assembly"?  (Assembler macros might qualify as one level.)
I said C is several levels above, and mentioned 2 categories and 2
specific ones that can be considered to be in-between.
Namely:
* HLAs (high-level assemblers) of various kinds, as this is a broad
category (see note)
* Intermediate languages (IRs/ILs) such as LLVM IR
* Forth
* PL/M (an old one; there was also C--, now dead)
(Note: the one I implemented was called 'Babbage', devised for the GEC
4000 machines. My task was to port it to DEC PDP10. There's something
about it 2/3 down this page: https://en.wikipedia.org/wiki/GEC_4000_series)
Beyond the inherent subjective aspects of that or the OP's initial
statement I certainly see "C" closer to the machine than many HLLs.
I see it as striving to distance itself from the machine as much as possible!
Certainly until C99 when stdint.h came along.
For example:
* Not committing to actual machine types, widths or representations,
such as a 'byte', or 'twos complement'.
* Being vague about the relations between the different integer types
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
It certainly depends on where one is coming from; from an abstract
or user-application level or from the machine level.
There was often mentioned here - very much to the displeasure of the
audience - that there's a lot of effort necessary to implement simple
concepts. To jump on that bandwagon: how would, say, Awk's array
construct  map[key] = value  have to be modeled in (native) "C"?
(Note that this simple statement represents an associative array.)
"C" is abstracting from the machine. And the OP's initial statement
"C and assembly are essentially the same" may be nonsense.
Actually, describing C as 'portable assembly' annoys me, which is why I
went into some detail.
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
On Tue, 2026-04-14 at 15:31 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
In attempting to write a simple language, I had a thought about language to share,
because I saw many people stuck in thinking C/C++ (or other high level languages)
can be so abstract, unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
No, C is not any kind of assembly.  Assembly language and C are
fundamentally different.
An assembly language program specifies a sequence of CPU instructions.
[Repeat] 'Assembly' can also be like C:
  // This is 'assembly'
  def int=32bit;   // Choose right bits for your platform, or leave it for
  def char= 8bit;  // compiler to decide.
Compiler?  You said this was assembly.
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language.  What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have functions, macros).
A C program specifies run-time behavior.  (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
Yes, that's what I said.  And that's the fundamental difference between
assembly and C.
How/with what do you specify 'run-time behavior'? Not based on the CPU?
E.g. in C, int types are fixed-size and have range, wrap-around, alignment
and 'atomic'/'overlapping' properties; you cannot really hide them and still
program C/C++ correctly from the high-level concept of 'integer'.
The point is that C has NO WAY to get rid of these (hardware) features, no matter
how high-level one thinks C is or expects C to be.
On Tue, 2026-04-14 at 22:41 +0100, Bart wrote:
HLL is just 'style' in favor of a specific purpose over another, for me. I am not
saying it is wrong; instead it is very helpful (measured by actual effort and
gain).
Below C there are HLAs or high-level assemblers, which at one time were
also called machine-oriented languages, intended for humans to use. And
a little below that might be intermediate languages (ILs), usually
machine-generated, intended for compiler backends.
ILs will be target-independent and so portable to some extent. I'd say
that 'portable assembly' fits those better.
Do you program (read/write) IL directly?
I am talking about the language that humans use directly.
(I've implemented, or devised and implemented, all the four levels
discussed here. There are also other languages in this space such as
PL/M, or Forth.)
I am not talking about compiler technology.
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language.  What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have functions, macros).
A C program specifies run-time behavior.  (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
Yes, that's what I said.  And that's the fundamental difference between
assembly and C.
How/with what do you specify 'run-time behavior'? Not based on the CPU?
E.g. in C, int types are fixed-size and have range, wrap-around, alignment
and 'atomic'/'overlapping' properties; you cannot really hide them and still
program C/C++ correctly from the high-level concept of 'integer'.
The point is that C has NO WAY to get rid of these (hardware) features, no matter
how high-level one thinks C is or expects C to be.
I had a similar discussion here some time ago.  As I recall, the
other participant repeatedly claimed that sophisticated assemblers
that don't generate specified sequences of CPU instructions are
common, but never provided an example.  (I haven't been able to
track down the discussion.)
When I heard 'sophisticated assemblers', I would think of something like my idea of
'portable' assembly, but maybe different.
One of my points should be clear, as stated in the int example above: "... C has NO WAY
to get rid of these (hardware) features, no matter how high-level one thinks C is or
expects C to be."
On 15/04/2026 05:20, wij wrote:
On Tue, 2026-04-14 at 22:41 +0100, Bart wrote:
HLL is just 'style' in favor of a specific purpose over another, for me. I am not
saying it is wrong; instead it is very helpful (measured by actual effort and
gain).
Below C there are HLAs or high-level assemblers, which at one time were
also called machine-oriented languages, intended for humans to use. And
a little below that might be intermediate languages (ILs), usually
machine-generated, intended for compiler backends.
ILs will be target-independent and so portable to some extent. I'd say
that 'portable assembly' fits those better.
Do you program (read/write) IL directly?
I am talking about the language that humans use directly.
It is possible to write IL directly, when a textual form of it exists.
Not many do that, but then not many write assembly these days either,
/because more convenient higher level languages exist/, one of them being C.
Why do /you/ think that people prefer to use C to write programs rather
than assembly, if they are 'essentially the same'?
(I've implemented, or devised and implemented, all the four levels discussed here. There are also other languages in this space such as PL/M, or Forth.)
I am not talking about compiler technology.
You claimed that C and assembler are at pretty much the same level. I'm
saying that they are not only at different levels, but other levels
exist, and I know because I've used them!
A compiler can choose to translate a language to any of those levels, including C (from a higher level language than C usually).
On 15/04/2026 07:05, wij wrote:
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language.  What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
So you've invented an 'assembly' syntax that looks exactly like C, in
order to support your notion that C and assembly are really the same thing!
Real assembly generally uses explicit instructions and labels rather
than the implicit ones used here. It would also have limits on the
complexity of expressions. If your pseudo-assembler supports:
    a = b+c*f(x,y);
then you've invented a HLL.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have functions, macros).
A C program specifies run-time behavior.  (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
Yes, that's what I said.  And that's the fundamental difference between
assembly and C.
How/with what do you specify 'run-time behavior'? Not based on the CPU?
E.g. in C, int types are fixed-size and have range, wrap-around, alignment
and 'atomic'/'overlapping' properties; you cannot really hide them and still
program C/C++ correctly from the high-level concept of 'integer'.
The point is that C has NO WAY to get rid of these (hardware) features, no matter
how high-level one thinks C is or expects C to be.
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges, wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I had a similar discussion here some time ago.  As I recall, the
other participant repeatedly claimed that sophisticated assemblers
that don't generate specified sequences of CPU instructions are
common, but never provided an example.  (I haven't been able to
track down the discussion.)
When I heard 'sophisticated assemblers', I would think of something like my idea of
'portable' assembly, but maybe different.
One of my points should be clear, as stated in the int example above: "... C has NO WAY
to get rid of these (hardware) features, no matter how high-level one thinks C is or
expects C to be."
Starting with C23, C has _BitInt, where you can define a 1000000-bit
integer type if you want. (There may be limits as to how big.)
Or a 37-bit type.
While I don't agree with such a feature for this language (partly
/because/ it is a big departure from machine types), it is a
counter-example to your point.
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
On 15/04/2026 07:05, wij wrote:
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language.  What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
So you've invented an 'assembly' syntax that looks exactly like C, in
order to support your notion that C and assembly are really the same thing!
Exactly. But not really 'invented'. I figured that if anyone wanted to implement
a 'portable assembly', he would find it not much different from C (from the
example shown, 'structured C'). So, in a sense, it is not worth implementing.
Real assembly generally uses explicit instructions and labels rather
than the implicit ones used here. It would also have limits on the
complexity of expressions. If your pseudo-assembler supports:
    a = b+c*f(x,y);
then you've invented a HLL.
You may say that.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have functions, macros).
A C program specifies run-time behavior. (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
Yes, that's what I said. And that's the fundamental difference between
assembly and C.
How/what do you specify 'run-time behavior'? Not based on CPU?
E.g. in C, int types are fixed-size, have range, wrap-around, alignment,
and 'atomic'/'overlapping' properties; you cannot really understand or hide them and
program C/C++ correctly from the high-level concept of 'integer'.
The point is that C has NO WAY to get rid of these (hardware) features, no matter
how high-level one thinks C is or expects C to be.
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges, wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I claim C is (maybe I should use 'may be'; sometimes I feel the conversation
is difficult) 'portable assembly' because C (a subset) could map to 'assembly',
and in a sense has to. E.g.
  int p2;  // p2 is connected to external hardware
  p2=0;
  p2=0;  // significant (the hardware knows the second 'touch' triggers a different
         // action, or it is for delay purposes)
And, with a union, I don't see how 'high-level' can officially explain the way one
reads/writes part of a float object:
  union {
    char carr[sizeof(uint64_t)];  // C++ guarantees sizeof(char)==1
    float f;
  };
I had a similar discussion here some time ago. As I recall, the
other participant repeatedly claimed that sophisticated assemblers
that don't generate specified sequences of CPU instructions are
common, but never provided an example. (I haven't been able to
track down the discussion.)
When I heard 'sophisticated assemblers', I would think something like
my idea of 'portable' assembly, but maybe different.
One of my points should be clear, as stated in the int example above: "... C has NO WAY to
get rid of these (hardware) features, no matter how high-level one
thinks C is or expects C to be."
Starting with C23, C has _BitInt, where you can define a 1000000-bit
integer type if you want. (There may be limits as to how big.)
Or a 37-bit type.
While I don't agree with such a feature for this language (partly /because/ it is a big departure from machine types), it is a counter-example to your point.
Thanks for the example. I did not stress 'C is assembly'; maybe it is
because I saw too many Bonita-type programming concepts that I stress 'portable
assembly' (also I think it may be helpful to others).
My understanding of C is that the development of C is simply from practical needs
(i.e. rarely is C from 'theoretical imagination'). Maybe _BitInt is the same, but
I don't know.
On Wed, 2026-04-15 at 11:21 +0100, Bart wrote:
On 15/04/2026 05:20, wij wrote:
On Tue, 2026-04-14 at 22:41 +0100, Bart wrote:
HLL is just 'style' in favor of one specific purpose over another, for me. I am not
saying it is wrong; instead it is very helpful (measured by actual effort and
gain).
Below C there are HLAs or high-level assemblers, which at one time were
also called machine-oriented languages, intended for humans to use. And
a little below that might be intermediate languages (ILs), usually
machine-generated, intended for compiler backends.
ILs will be target-independent and so portable to some extent. I'd say
that 'portable assembly' fits those better.
Do you program (read/write) IL directly?
I am talking about the language that human uses directly.
It is possible to write IL directly, when a textual form of it exists.
Not many do that, but then not many write assembly these days either,
/because more convenient higher level languages exist/, one of them being C.
Why do /you/ think that people prefer to use C to write programs rather
than assembly, if they are 'essentially the same'?
There are many reasons why people have chosen C. I agree the dominant one
is support and convenience.
(I've implemented, or devised and implemented, all the four levels
discussed here. There are also other languages in this space such as
PL/M, or Forth.)
I am not talking about compiler technology.
You claimed that C and assembler are at pretty much the same level. I'm
saying that they are not only at different levels, but other levels
exist, and I know because I've used them!
A compiler can choose to translate a language to any of those levels,
including C (from a higher level language than C usually).
This argument about 'levels' seems based on the engineering of compilers for multiple languages. My point of view is based on the theory of computation and maybe psychological recognition.
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges,
wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I claim C is (maybe I should use 'may be'; sometimes I feel the conversation
is difficult) 'portable assembly' because C (a subset) could map to 'assembly',
and in a sense has to. E.g.
  int p2;  // p2 is connected to external hardware
  p2=0;
  p2=0;  // significant (the hardware knows the second 'touch' triggers a different
         // action, or it is for delay purposes)
The 4th difference: Local variables.
(Assembly could theoretically do the same, but I am not aware of an assembler that
supports this feature.)
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
On 15/04/2026 07:05, wij wrote:
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language. What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
So you've invented an 'assembly' syntax that looks exactly like C, in
order to support your notion that C and assembly are really the same thing!
Exactly. But not really 'invented'. I figured if anyone wants to implement
a 'portable assembly', he would find it not much different from C (from the
example shown, 'structured C'). So, in a sense, not worth implementing.
Real assembly generally uses explicit instructions and labels rather
than the implicit ones used here. It would also have limits on the
complexity of expressions. If your pseudo-assembler supports:
ÿÿÿ a = b+c*f(x,y);
then you've invented a HLL.
You may say that.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have function, macro)
A C program specifies run-time behavior. (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', no exact instructions.
Yes, that's what I said. And that's the fundamental difference between
assembly and C.
How/what do you specify 'run-time behavior'? Not based on CPU?
E.g. in C, int types are fixed-size, have range, wrap-around, alignment,
and 'atomic'/'overlapping' properties; you cannot really understand or hide them and
program C/C++ correctly from the high-level concept of 'integer'.
The point is that C has NO WAY to get rid of these (hardware) features, no matter
how high-level one thinks C is or expects C to be.
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges,
wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I claim C is (maybe I should use 'may be'; sometimes I feel the conversation
is difficult) 'portable assembly' because C (a subset) could map to 'assembly',
and in a sense has to. E.g.
  int p2;  // p2 is connected to external hardware
  p2=0;
  p2=0;  // significant (the hardware knows the second 'touch' triggers a different
         // action, or it is for delay purposes)
And, with a union, I don't see how 'high-level' can officially explain the way one
reads/writes part of a float object:
  union {
    char carr[sizeof(float)];  // C++ guarantees sizeof(char)==1
    float f;
  };
The 4th difference: Local variables.
(Assembly could theoretically do the same, but I am not aware of an assembler that
supports this feature.)
If you are talking about function-local data, there are multiple ways
to store it in an easy-to-clean-up fashion:
- Volatile registers, for the shortest lived data. Calling other
functions causes them to be overwritten with the function's
return value or irrelevant data.
- Non-volatile registers, for data that need to persist across
function calls. You save the contents of them before using them,
as your caller expects the contents of these registers to be
intact once you return.
- The stack, for long-lived function-local data when you are out of
non-volatile registers. You manipulate a dedicated stack pointer
register to allocate and deallocate space for your data.
- Immediates, and the .rodata (ELF) / .rdata (PE) section, for
constants and tables of constants.
The notion of local variable allows you to ignore all of these in C,
though. Assembly having multiple ways to store local data instead of
one can make things fairly complicated to read, write and debug.
(forwarding to alt.lang.asm because you are comparing C with it)
On 15/04/2026 12:52, wij wrote:
On Wed, 2026-04-15 at 11:21 +0100, Bart wrote:
On 15/04/2026 05:20, wij wrote:
On Tue, 2026-04-14 at 22:41 +0100, Bart wrote:
HLL is just 'style' in favor of one specific purpose over another, for me. I am not
saying it is wrong; instead it is very helpful (measured by actual effort and
gain).
Below C there are HLAs or high-level assemblers, which at one time were
also called machine-oriented languages, intended for humans to use. And
a little below that might be intermediate languages (ILs), usually
machine-generated, intended for compiler backends.
ILs will be target-independent and so portable to some extent. I'd say
that 'portable assembly' fits those better.
Do you program (read/write) IL directly?
I am talking about the language that human uses directly.
It is possible to write IL directly, when a textual form of it exists.
Not many do that, but then not many write assembly these days either,
/because more convenient higher level languages exist/, one of them being C.
Why do /you/ think that people prefer to use C to write programs rather
than assembly, if they are 'essentially the same'?
There are many reasons why people have chosen C. I agree the dominant one
is support and convenience.
(I've implemented, or devised and implemented, all the four levels
discussed here. There are also other languages in this space such as
PL/M, or Forth.)
I am not talking about compiler technology.
You claimed that C and assembler are at pretty much the same level. I'm
saying that they are not only at different levels, but other levels
exist, and I know because I've used them!
A compiler can choose to translate a language to any of those levels, including C (from a higher level language than C usually).
This argument about 'levels' seems based on the engineering of compilers for multiple
languages. My point of view is based on the theory of computation and maybe
psychological recognition.
If you take syntax out of the equation, and then 'lower' what's left
(ie. flatten the various abstractions), then probably you can compare
the behaviour of a lot of languages with assembly.
However, those things are exactly what HLLs are about, while that
removing of syntax and lowering is exactly what compilers do.
That's why we use HLLs and not ASM unless we need to.
On 15/04/2026 14:21, wij wrote:
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges,
wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I claim C is (maybe I should use 'may be'; sometimes I feel the conversation
is difficult) 'portable assembly' because C (a subset) could map to 'assembly',
and in a sense has to. E.g.
  int p2;  // p2 is connected to external hardware
  p2=0;
  p2=0;  // significant (the hardware knows the second 'touch' triggers a different
         // action, or it is for delay purposes)
You are not making any sense. I don't think you understand what C is,
how the language is defined, or how typical C implementations work.
In C, when you write the code above there is /nothing/ to suggest that
there should be two actions.
C compilers can - and many will - combine
the two "p2 = 0;" statements. This is critical to understanding why C
is not in any sense an "assembler".
In assembly languages, if you write
the equivalent of "p2 = 0;" twice, you get the appropriate opcode twice.
In C, the language does not require an operation for the statement "p2 =
0;". It requires that after that statement, any observable behaviour
produced by the program will be as if the value 0 had been assigned to
the object "p2".
Repeating that same requirement does not change it -
the compiler does not have to implement "p2 = 0;" twice. (It is
free to do so twice - or two hundred times if it likes. And if the
value of p2 is not used, it can be completely eliminated.)
Have you actually done any C programming at all?
The 4th difference: Local variables.
(Assembly could theoretically do the same, but I am not aware of an assembler that
supports this feature.)
If you are talking about function-local data, there are multiple ways
to store it in an easy-to-clean-up fashion:
  - Volatile registers, for the shortest-lived data. Calling other
    functions causes them to be overwritten with the function's
    return value or irrelevant data.
  - Non-volatile registers, for data that need to persist across
    function calls. You save the contents of them before using them,
    as your caller expects the contents of these registers to be
    intact once you return.
  - The stack, for long-lived function-local data when you are out of
    non-volatile registers. You manipulate a dedicated stack pointer
    register to allocate and deallocate space for your data.
  - Immediates, and the .rodata (ELF) / .rdata (PE) section, for
    constants and tables of constants.
The notion of local variable allows you to ignore all of these in C,
though. Assembly having multiple ways to store local data instead of
one can make things fairly complicated to read, write and debug.
(forwarding to alt.lang.asm because you are comparing C with it)
On 15.04.2026 at 15:40, makendo wrote:
(forwarding to alt.lang.asm because you are comparing C with it)
Great, but what is wrong with comp.lang.asm? I subscribe to it instead of any other asm-related groups. Is this the wrong approach?
--
Jacek Marcin Jaworski, Pruszcz Gd., woj. Pomorskie, Polska, EU;
tel.: +48-609-170-742, najlepiej w godz.: 5:00-5:55 lub 16:00-17:25; <jmj@energokod.gda.pl>, gpg: 4A541AA7A6E872318B85D7F6A651CC39244B0BFA;
Domowa s. WWW: <https://energokod.gda.pl>;
Mini Netykieta: <https://energokod.gda.pl/MiniNetykieta.html>; Mailowa Samoobrona: <https://emailselfdefense.fsf.org/pl>.
NOTE:
DO NOT TAKE ON "HIDDEN DEBT"! PAY FOR FOSS SOFTWARE AND ONLINE INFORMATION!
READ FOR FREE: "17. Raport Totaliztyczny - Patroni Kontra Bankierzy": <https://energokod.gda.pl/raporty-totaliztyczne/17.%20Patroni%20Kontra%20Bankierzy.pdf>
On Wed, 2026-04-15 at 15:38 +0200, David Brown wrote:
On 15/04/2026 14:21, wij wrote:
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges,
wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I claim C is (maybe I should use 'may be'; sometimes I feel the conversation
is difficult) 'portable assembly' because C (a subset) could map to 'assembly',
and in a sense has to. E.g.
  int p2;  // p2 is connected to external hardware
  p2=0;
  p2=0;  // significant (the hardware knows the second 'touch' triggers a different
         // action, or it is for delay purposes)
You are not making any sense. I don't think you understand what C is,
how the language is defined, or how typical C implementations work.
I switched from C to C++ 30 years ago. But that is 'theoretical'; I see things
from the real-world side. I think you approach 'C' from standard documents; that is
not the way of understanding. You cannot understand the world by reading
the bible.
In C, when you write the code above there is /nothing/ to suggest that
there should be two actions.
As far as I know, 'old-time' C had no optimization.
C compilers can - and many will - combine
the two "p2 = 0;" statements. This is critical to understanding why C
is not in any sense an "assembler".
Not a valid reason.
In assembly languages, if you write
the equivalent of "p2 = 0;" twice, you get the appropriate opcode twice.
An assembler (or the language) could also do the same optimization.
In C, the language does not require an operation for the statement "p2 =
0;". It requires that after that statement, any observable behaviour
produced by the program will be as if the value 0 had been assigned to
the object "p2".
You need a model now by saying so.
Repeating that same requirement does not change it -
the compiler does not have to implement "p2 = 0;" twice. (It is
free to do so twice - or two hundred times if it likes. And if the
value of p2 is not used, it can be completely eliminated.)
Have you actually done any C programming at all?
Nope, I quit C (but I keep watching C, since part of C++ is C)
On Wed, 15 Apr 2026 20:23:52 +0200
"Jacek Marcin Jaworski" <jmj@energokod.gda.pl> wrote:
On 15.04.2026 at 15:40, makendo wrote:
(forwarding to alt.lang.asm because you are comparing C with it)
Great, but what is wrong with comp.lang.asm? I subscribe to it instead of any
other asm-related groups. Is this the wrong approach?
DYM comp.lang.asm.x86?
comp.lang.asm is an empty header for me on eternal september's feed.
The newsgroup comp.lang.asm is generally considered an unmoderated Usenet group. Unlike comp.lang.asm.x86, which is known to be moderated, comp.lang.asm does not have an official moderation process and typically allows posts to appear without prior review.
sig is overlong, and crowded, IMHO.
On 15/04/2026 13:21, wij wrote:
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
On 15/04/2026 07:05, wij wrote:
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language. What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
So you've invented an 'assembly' syntax that looks exactly like C, in
order to support your notion that C and assembly are really the same thing!
Exactly. But not really 'invented'. I figured if anyone wants to implement
a 'portable assembly', he would find it not much different from C (from the
example shown, 'structured C'). So, in a sense, not worth implementing.
Real assembly generally uses explicit instructions and labels rather
than the implicit ones used here. It would also have limits on the complexity of expressions. If your pseudo-assembler supports:
ÿÿÿÿ a = b+c*f(x,y);
then you've invented a HLL.
You may say that.
It sounds like you don't understand the difference between a low-level
language and a high-level one.
These days C might be considered mid-level (I call it a lower-level HLL,
because so many HLLs are much higher level and more abstract).
Compiling a HLL involves lowering it to a different representation, say
from language A to language B.
But just because that translation happens to be routine doesn't mean
that A is essentially B.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have functions, macros).
A C program specifies run-time behavior. (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
Yes, that's what I said. And that's the fundamental difference between
assembly and C.
How/what do you specify 'run-time behavior'? Not based on CPU?
E.g. in C, int types are fixed-size, have range, wrap-around, alignment,
and 'atomic'/'overlapping' properties; you cannot really understand or hide them and
program C/C++ correctly from the high-level concept of 'integer'.
The point is that C has NO WAY to get rid of these (hardware) features, no matter
how high-level one thinks C is or expects C to be.
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges,
wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I claim C is (maybe I should use 'may be'; sometimes I feel the conversation
is difficult) 'portable assembly' because C (a subset) could map to 'assembly',
and in a sense has to. E.g.
  int p2;  // p2 is connected to external hardware
  p2=0;
  p2=0;  // significant (the hardware knows the second 'touch' triggers a different
         // action, or it is for delay purposes)
That won't work in C. 'p2' is likely to be in a register; that extra
write may be elided.
You'd have to use 'volatile' to guard against that. But you still can't
control where p2 is put into memory. C /is/ used for this stuff, but all
sorts of special extensions, or compiler specifics, may be employed.
In assembly it's much easier.
And, with a union, I don't see how 'high-level' can officially explain the way one
reads/writes part of a float object:
  union {
    char carr[sizeof(float)];  // C++ guarantees sizeof(char)==1
    float f;
  };
(Fixed that sizeof.)
I normally use my own systems language. That one is aligned much more
directly to hardware than C is, even though it is marginally higher level.
This is because C is intended to work on any possible hardware, while mine
was created to work with one target at a time.
Also, when I started on mine (c. 1982 rather than 1972), hardware was already standardising on 8-bit bytes, byte-addressed, power-of-two word
sizes, and twos-complement integers.
I don't however consider my language to be a form of assembly, for lots
of reasons already mentioned.
Its compilers use 3 internal representations before it gets to native code:
  HLL source -> AST -> IL -> MCL -> Native
'MCL' is the internal representation of the native code. If I need ASM
output, then MCL can be dumped into a suitable syntax (I support 4
different ASM syntaxes for x64).
This MCL/ASM itself has abstractions, so the same 'MOV' mnemonic is used
for dozens of different move instructions that each have different
binary opcodes.
30 instructions are defined for convenience for common usage, see man page
On 15/04/2026 18:58, wij wrote:
On Wed, 2026-04-15 at 15:38 +0200, David Brown wrote:
On 15/04/2026 14:21, wij wrote:
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
There are a dozen or more HLLs that have exactly such a set of integer
types. Actually, those have fixed-width integers with fixed ranges,
wrap-around behaviour, twos complement format and so on, even more so
than C.
So those HLLs (that is, C++, C#, D, Rust, Java, Zig, Go, ...) are even
more closely tied to the machine than C is. (In C, built-in types are
not sized, but have minimum widths, and until C23, integer
representation was not specified.)
Would you claim that those are also essentially assembly? If not, why not?
I claim C is (maybe I should use 'may be'; sometimes I feel the conversation
is difficult) 'portable assembly' because C (a subset) could map to 'assembly',
and in a sense has to. E.g.
  int p2;  // p2 is connected to external hardware
  p2=0;
  p2=0;  // significant (the hardware knows the second 'touch' triggers a different
         // action, or it is for delay purposes)
You are not making any sense. I don't think you understand what C is,
how the language is defined, or how typical C implementations work.
I switched from C to C++ 30 years ago.
I don't think you understand C++ either. In the context of this
discussion, it is not different from C.
But that is 'theoretical'; I see things
from the real-world side. I think you approach 'C' from standard documents; that is
not the way of understanding. You cannot understand the world by reading
the bible.
No, I understand C and C++ from using them in real-world code - as well
as knowing what the code means and what is guaranteed by the language.
Practical experience tells you what works well in practice - but
theoretical knowledge tells you what you can expect so that you are not
just programming by luck and "it worked for me when I tried it".
In C, when you write the code above there is /nothing/ to suggest that
there should be two actions.
As far as I know, 'old-time' C had no optimization.
Nonsense.
Modern C compilers often do more optimisation than older ones, but there
was never a "pre-optimisation" world. Things like eliminating dead
code, or optimising based on knowing that signed integer overflow never
occurs in a correct program, have been around from early tools. I have
used heavily optimising compilers for 30 years.
C compilers can - and many will - combine
the two "p2 = 0;" statements. This is critical to understanding why C
is not in any sense an "assembler".
Not a valid reason.
What do you mean by that? It's a fact, not a "reason".
In assembly languages, if you write
the equivalent of "p2 = 0;" twice, you get the appropriate opcode twice.
An assembler (or the language) could also do the same optimization.
No, assemblers cannot do that - if they did, they would not be
assemblers. An assembler directly translates your instructions from
mnemonic codes (assembly instructions) to binary opcodes. Some
assemblers might have pseudo-instructions that translate to more than
one binary opcode, but always in a specific defined pattern.
In C, the language does not require an operation for the statement "p2 =
0;". It requires that after that statement, any observable behaviour
produced by the program will be as if the value 0 had been assigned to
the object "p2".
You need a model now by saying so.
Again, I don't know what you are trying to say.
Repeating that same requirement does not change it -
the compiler does not have to implement "p2 = 0;" twice. (It is
free to do so twice - or two hundred times if it likes. And if the
value of p2 is not used, it can be completely eliminated.)
Have you actually done any C programming at all?
Nope, I quit C (but I keep watching C, since part of C++ is C)
Okay, have you ever actually done any C++ programming? The languages
share the same philosophy here.
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
On Tue, 2026-04-14 at 15:31 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
In attempting to write a simple language, I had a thought about what a language is
to share. Because I saw many people are stuck in thinking C/C++ (or other
high-level languages) can be so abstract, so unlimitedly 'high level', as to mysteriously
solve various human descriptions of ideas.
C and assembly are essentially the same; maybe better to call it 'portable assembly'.
No, C is not any kind of assembly. Assembly language and C are
fundamentally different.
An assembly language program specifies a sequence of CPU instructions.
[Repeat] 'Assembly' can also be like C:
  // This is 'assembly'
  def int=32bit;   // Choose the right bits for your platform, or leave it for
  def char= 8bit;  // the compiler to decide.
Compiler? You said this was assembly.
  int a;
  char b;
  a=b;   // allow auto promotion
  while(a<b) {
    a+=1;
  }
You've claimed that that's assembly language. What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have function, macro)
A C program specifies run-time behavior. (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', no exact instructions.
Yes, that's what I said. And that's the fundamental difference between
assembly and C.
How/what do you specify 'run-time behavior'? Not based on CPU?
E.g. in C, int types are fixed-size, have range, wrap-around, alignment,
and 'atomic'/'overlapping' properties; you cannot really understand or hide them and
program C/C++ correctly from the high-level concept of 'integer'.
The point is that C has NO WAY to get rid of these (hardware) features, no matter
how high-level one thinks C is or expects C to be.
When I heard 'sophisticated assemblers', I would think something like
my idea of 'portable' assembly, but maybe different. One of my points
should be clear, as stated in the int example above: "... C has NO WAY to
get rid of these (hardware) features, no matter how high-level one
thinks C is or expects C to be."
On Wed, 2026-04-15 at 15:38 +0200, David Brown wrote:[...]
In C, when you write the code above there is /nothing/ to suggest that
there should be two actions.
As far as I know, 'old-time' C had no optimization.
In assembly languages, if you write
the equivalent of "p2 = 0;" twice, you get the appropriate opcode twice.
Assembly compiler (or language) can also do the same optimization.
On 15/04/2026 01:33, Bart wrote:[...]
Certainly until C99 when stdint.h came along.
I would not draw that distinction - indeed, I see the opposite. Prior
to <stdint.h>, your integer type sizes were directly from the target
machine - with <stdint.h> explicitly sized integer types, they are now
independent of the target hardware.
On Wed, 2026-04-15 at 22:11 +0200, David Brown wrote:[...]
Okay, have you ever actually done any C++ programming?ÿ The languages
share the same philosophy here.
You are really a sick person. A loser in the real world. You just don't know yourself.
I have a gold medal, an aluminum medal and a bronze commemorative plaque (for
solving a riddle of Northrop Corp.). What do you have? Well... a paper (paid for),
and you are still making false memories every day for yourself.
I retired at 37; can you?
Ah, recently you also failed to verify a simple program that proves
the 3x+1 problem. Facts are not made by mouth (like DJT?), loser.
On Wed, 2026-04-15 at 15:06 +0100, Bart wrote:
The boundary of assembly and HLL is not clear to me.
I once wrote a killer-grade commercial assembly program; it may still be running
today after >30 years. My experience is that assembly is not as scary as commonly
thought, just don't think in low level.
wij <wyniijj5@gmail.com> writes:
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
On Tue, 2026-04-14 at 15:31 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
In attempting to write a simple language, I had a thought of what language is
to share. Because I saw many people are stuck in thinking C/C++ (or other
high level language) can be so abstract, unlimited 'high level' to mysteriously
solve various human description of idea.
C and assembly are essentially the same, maybe better call it 'portable assembly'.
No, C is not any kind of assembly. Assembly language and C are
fundamentally different.
An assembly language program specifies a sequence of CPU instructions.
[Repeat] 'Assembly' can also be like C:
 // This is 'assembly'
 def int=32bit;   // Choose right bits for your platform, or leave it for
 def char=8bit;   // compiler to decide.
Compiler? You said this was assembly.
 int a;
 char b;
 a=b;   // allow auto promotion
 while(a<b) {
   a+=1;
 }
You've claimed that that's assembly language. What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my idea.
I hadn't. I realize it now that you've admitted it.
In other words, you made it up.
I don't believe there is any real-world assembler that accepts
that syntax. Your example is meaningless.
For every assembler I've used, the assembly language input
unambiguously specifies the sequence of CPU instructions in the
generated object file. Support for macros does not change that;
it just means the mapping is slightly more complicated.
Cite an example of an existing real-world assembler that does not
behave that way, and we might have something interesting to discuss.
Yes, the C-like example above specifies exactly a sequence of CPU instructions
(well, small deviation is allowed, and assembly can also have function, macro).
A C program specifies run-time behavior. (A compiler generates CPU
instructions behind the scenes to implement that behavior.)
Being 'portable', it should specify 'run-time behavior', not exact instructions.
Yes, that's what I said. And that's the fundamental difference between
assembly and C.
How/what do you specify 'run-time behavior'? Not based on CPU?
The C standard defines "behavior" as "external appearance or action",
which is admittedly vague. Run-time behavior is what happens when the
program is running on the target system. It includes things like input
and output, either to a console or to files.
The C standard specifies the behavior of this program:
    #include <stdio.h>
    int main(void) { puts("hello, world"); }
It does so without reference to any CPU. (Of course some CPU will be
used to implement that behavior.)
E.g. in C, int types are fixed-size, have a range, wrap-around, alignment,
and 'atomic'/'overlapping' properties. You cannot really hide these and
program C/C++ correctly from the high-level concept of 'integer' alone.
The point is that C has NO WAY to get rid of these (hardware) features, no
matter how high-level one thinks C is or expects C to be.
Right, C doesn't directly support abstract mathematical integers.
Of course I agree that C is a lower level language than many others.
Python, for example, has reasonably transparent support for integers
of arbitrary width. Python is a higher level language than C.
(Notably, the Python interpreter is written in C).
That doesn't make C an assembly language.
[...]
When I hear 'sophisticated assemblers', I think of something like
my idea of 'portable' assembly, but maybe different. One of my points
should be clear from the int example above: "... C has NO WAY to
get rid of these (hardware) features, no matter how high-level one
thinks C is or expects C to be."
Again, yes, C is a relatively low-level language. And again,
C is not an assembly language.
And again, if you can cite a real-world example of the kind of
"sophisticated assembler" you're talking about, that would be an
interesting data point.
--ÿ
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
On 15/04/2026 22:12, wij wrote:
On Wed, 2026-04-15 at 15:06 +0100, Bart wrote:
The boundary of assembly and HLL is not clear to me.
That seems to be obvious.
I once wrote a killer-grade commercial assembly program; it may still be running
today after >30 years. My experience is that assembly is not as scary as commonly
thought, just don't think in low level.
It's not that scary. Just unergonomic to code in, taking longer,
being more error prone, much harder to understand, harder to maintain,
much less portable ...
On Wed, 2026-04-15 at 23:52 +0100, Bart wrote:
On 15/04/2026 22:12, wij wrote:
On Wed, 2026-04-15 at 15:06 +0100, Bart wrote:
The boundary of assembly and HLL is not clear to me.
That seems to be obvious.
I once wrote a killer-grade commercial assembly program; it may still be running
today after >30 years. My experience is that assembly is not as scary as commonly
thought, just don't think in low level.
It's not that scary. Just unergonomic to code in, taking longer,
being more error prone, much harder to understand, harder to maintain,
much less portable ...
Skill. Treat assembly as a chunk. Document it well.
Again, yes, C is a relatively low-level language. And again,
C is not an assembly language.
And again, if you can cite a real-world example of the kind of
"sophisticated assembler" you're talking about, that would be an
interesting data point.
I had thought questions like yours might have been due to a language problem.
I did not mean C is (equal to) assembly, but C is-a assembly (logic course 101).
And I hope the following code could clear up some confusion.
The 'assembly' could be 'structured assembly', but then I felt the
result should not be much different from C...
[... comparing C and assembly language ...]
On 16/04/2026 00:30, wij wrote:
On Wed, 2026-04-15 at 23:52 +0100, Bart wrote:
On 15/04/2026 22:12, wij wrote:
On Wed, 2026-04-15 at 15:06 +0100, Bart wrote:
The boundary of assembly and HLL is not clear to me.
That seems to be obvious.
I once wrote a killer-grade commercial assembly program; it may still be running
today after >30 years. My experience is that assembly is not as scary as commonly
thought, just don't think in low level.
It's not that scary. Just unergonomic to code in, taking longer,
being more error prone, much harder to understand, harder to maintain,
much less portable ...
Skill. Treat assembly as a chunk. Document it well.
You're not making sense. It's like saying I should walk everywhere
instead of using my car.
But I don't want to spend two extra hours a day walking and carrying
shopping etc.
What exactly is the benefit of using assembly over a HLL when both can
tackle the task?
When I first started with microprocessors, I first had to build the
hardware, which was programmed in binary. I wrote a hex editor so I
could use a keyboard. Then used that to write an assembler. Then used
the assembler to write a compiler for a simple HLL.
The HLL allowed me to be far more productive than otherwise. Everybody
seems to understand that, except you.
But I have a counter-proposal: why don't you also program in binary
machine code (I'll let you use hex!) instead of assembly? After all
it's just a skill.
wij <wyniijj5@gmail.com> writes:
[...]
[signature snipped]
When you post a followup, please trim quoted text that's not relevant to
your reply. And in particular, don't quote signatures.
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
Gentlemen,
I understand the natural reaction to want to respond to the kind of
statements being made in this thread. I hope y'all can resist this
natural reaction and not respond to people who persist in making
arguments that are basically isomorphic to saying 1 equals 0.
Thank you for your assistance in this matter.
On Wed, 2026-04-15 at 17:14 -0700, Tim Rentsch wrote:
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
Gentlemen,
I understand the natural reaction to want to respond to the kind of
statements being made in this thread. I hope y'all can resist this
natural reaction and not respond to people who persist in making
arguments that are basically isomorphic to saying 1 equals 0.
Thank you for your assistance in this matter.
Maybe you are right. I say A is-a B; some persist in reading it as A is (exactly) B.
I offer help with using assembly; some persist in reading it as urging people to use assembly and give up HLLs. What is going on here?
On Wed, 2026-04-15 at 17:14 -0700, Tim Rentsch wrote:
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
[...]
Maybe you are right. I say A is-a B; some persist in reading it as A is (exactly) B.
I offer help with using assembly; some persist in reading it as urging people to use assembly and give up HLLs. What is going on here?
So what you're saying is that assembly can do anything that any other arbitrary language (that has to eventually compile down to the same
machine code) can do? This should not be surprising to anyone.
C has never been, and was never intended to be, a "portable assembly".
It was designed to reduce the need to write assembly code. There is a
huge difference in these concepts.
On 15/04/2026 01:33, Bart wrote:
On 14/04/2026 23:20, Janis Papanagnou wrote:
On 2026-04-14 23:41, Bart wrote:
But if you want to call C some kind of assembler, even though it is
several levels above actual assembly, then that's up to you.
Can you name and describe a couple of these "several levels above
actual assembly"? (Assembler macros might qualify as one level.)
I said C is several levels above, and mentioned 2 categories and 2
specific ones that can be considered to be in-between.
I agree with a great deal you have written in this thread (at least what
I have read so far). My points below are mainly additional comments
rather than arguments or disagreements. Like you, my disagreement is
primarily with wij.
Namely:
* HLAs (high-level assemblers) of various kinds, as this is a broad
category (see note)
When I used to do significant amounts of assembly programming (often on
"brain-dead" 8-bit CISC microcontrollers), I would make heavy use of
assembler macros as a way of getting slightly "higher level" assembly.
Even with common assembler tools you can write something that is a kind
of HLA. And then for some targets, there are more sophisticated tools
(or you can write them yourself) for additional higher level constructs.
* Intermediate languages (IRs/ILs) such as LLVM IR
LLVM is probably the best candidate for something that could be called a
"portable assembler". It is quite likely that other such "languages"
have been developed and used (perhaps internally in multi-target
compilers), but LLVM's is the biggest and with the widest support.
* Forth
Forth is always a bit difficult to categorise. Many Forth
implementations are done with virtual machines or byte-code
interpreters, raising them above assembly. Others are for stack machine
processors (very common in the 4-bit world) where the assembly /is/ a
small Forth language. A lot of Forth tools compile very directly
(giving you the "you get what your code looks like" aspect of assembly),
others do more optimisation (for the "you get what your code means"
aspect of high level languages).
* PL/M (an old one; there was also C--, now dead)
I never used PL/M - I'm too young for that! C-- was conceived as a
portable intermediary language that compilers could generate to get
cross-target compilation without needing individual target backends. In
practice, ordinary C does a good enough job for many transpilers during
development, then they can move to LLVM for more control and efficiency
if they see it as worth the effort.
(Note: the one I implemented was called 'Babbage', devised for the GEC
4000 machines. My task was to port it to DEC PDP10. There's something
about it 2/3 down this page: https://en.wikipedia.org/wiki/
GEC_4000_series)
Beyond the inherent subjective aspects of that or the OP's initial
statement I certainly see "C" closer to the machine than many HLLs.
I see it as striving to distance itself from the machine as much as
possible!
Yes - as much as possible while retaining efficiency.
Certainly until C99 when stdint.h came along.
I would not draw that distinction - indeed, I see the opposite. Prior
to <stdint.h>, your integer type sizes were directly from the target
machine - with <stdint.h> explicitly sized integer types, they are now
independent of the target hardware.
C has always intended to be as independent from the machine as
practically possible without compromising efficiency. That's why it has
implementation-defined behaviour where it makes a significant difference
(such as the size of integer types), while giving full definitions of
things that can reasonably be widely portable while still being
efficient (and sometimes leaving things undefined to encourage
portability).
For example:
* Not committing to actual machine types, widths or representations,
such as a 'byte', or 'twos complement'.
(With C23, two's complement is the only allowed signed integer
representation. There comes a point where something is so dominant that
even C commits it to the standards.)
* Being vague about the relations between the different integer types
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
That one is more that no one had bothered standardising binary literals.
The people that wanted them, for the most part, are low-level embedded
programmers, and their tools already supported them. (And even then,
they are not much used in practice.) Printing in binary is not
something people often want - it is far too cumbersome for numbers, and
if you want to dump a view of some flag register then a custom function
with letters is vastly more useful.
C is standardised on binary - unsigned integer types would not work well
on a non-binary target.
* Not being allowed to do a dozen things that you KNOW are well-
defined on your target machine, but C says are UB.
That is certainly part of it. Things like signed integer arithmetic
overflow are UB at least partly because C models mathematical integer
arithmetic. It does not attempt to mimic the underlying hardware. This
is clearly "high level language" territory - C defines the behaviour of
an abstract machine in terms of mathematics. It is not an "assembler"
that defines operations in terms of hardware instructions.
It certainly depends on where one is coming from; from an abstract
or user-application level or from the machine level.
There was often mentioned here - very much to the displeasure of the
audience - that there's a lot of effort necessary to implement simple
concepts. To jump on that bandwagon: how would, say, Awk's array
construct  map[key] = value  have to be modeled in (native) "C"?
(Note that this simple statement represents an associative array.)
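To illustrate how much machinery hides behind Awk's one-liner, here is a minimal chained hash table mapping string keys to int values. All names (map_set, map_get, NBUCKETS) are invented for this sketch; a real table would also grow, free entries, and handle allocation failure.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64   /* fixed size for the sketch only */

struct entry { char *key; int value; struct entry *next; };
struct map   { struct entry *buckets[NBUCKETS]; };

static unsigned hash(const char *s) {           /* djb2, reduced */
    unsigned h = 5381;
    while (*s) h = h * 33u + (unsigned char)*s++;
    return h % NBUCKETS;
}

static char *dup_str(const char *s) {           /* strict-ISO strdup */
    char *p = malloc(strlen(s) + 1);
    if (p) strcpy(p, s);
    return p;
}

/* The Awk statement  map[key] = value  becomes: */
static void map_set(struct map *m, const char *key, int value) {
    unsigned h = hash(key);
    for (struct entry *e = m->buckets[h]; e; e = e->next)
        if (strcmp(e->key, key) == 0) { e->value = value; return; }
    struct entry *e = malloc(sizeof *e);
    e->key = dup_str(key);
    e->value = value;
    e->next = m->buckets[h];
    m->buckets[h] = e;
}

static int map_get(const struct map *m, const char *key, int missing) {
    for (struct entry *e = m->buckets[hash(key)]; e; e = e->next)
        if (strcmp(e->key, key) == 0) return e->value;
    return missing;
}
```

Usage: `struct map m = {0}; map_set(&m, "x", 1);` and then `map_get(&m, "x", -1)` returns 1 - roughly forty lines of C for what Awk spells in one.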
"C" is abstracting from the machine. And the OP's initial statement
"C and assembly are essentially the same" may be nonsense
Actually, describing C as 'portable assembly' annoys me which is why I
went into some detail.
Indeed.
C is defined in terms of an abstract machine, not hardware. And the C
source code running on this abstract machine only needs to match up with
the actual binary code on the real target machine in very specific and
limited ways - the "observable behaviour" of the program. That's
basically start, stop, volatile accesses and IO. Everything else
follows the "as if" rule - the compiler needs to generate target code
that works (for observable behaviour) "as if" it had done a direct,
naïve translation of the source.
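A small C sketch of that distinction: plain stores fall under the "as if" rule and may be merged or dropped, while volatile accesses are observable behaviour and must all be emitted. The names here are invented for the example.

```c
/* Under the "as if" rule only observable behaviour is preserved.
   The two stores to 'plain' may be merged into one (or removed if the
   value is never read); the two stores to 'mmio' are volatile
   accesses, hence observable behaviour, and must both be emitted. */
int plain;
volatile int mmio;   /* stand-in for a memory-mapped device register */

void poke_twice(void) {
    plain = 0;
    plain = 0;   /* a compiler may emit a single store here, or none */
    mmio = 0;
    mmio = 0;    /* must compile to two separate stores */
}
```

Comparing the generated code at -O2 for the two variables (e.g. in a disassembly) is an easy way to see the rule in action.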
As I understand the history - and certainly the practice - of the C
language, it is a language with two goals. One is that it should be
possible to write highly portable C code that can be used on a very wide
range of target systems while remaining efficient. The other is that it
should be useable for a lot of target-specific system code.
C has never been, and was never intended to be, a "portable assembly".
It was designed to reduce the need to write assembly code. There is a
huge difference in these concepts.
On 4/15/2026 12:57 AM, David Brown wrote:
[...]
Use C to create the ASM, then GAS it... ;^)
Nope, I quit C (but I keep watching C, since part of C++ is C)
wij <wyniijj5@gmail.com> writes:
[...]
Keep the personal abuse to yourself.
On 15/04/2026 13:21, wij wrote:
On Wed, 2026-04-15 at 11:46 +0100, Bart wrote:
On 15/04/2026 07:05, wij wrote:
On Tue, 2026-04-14 at 21:46 -0700, Keith Thompson wrote:
   int a;
   char b;
   a=b;   // allow auto promotion
   while(a<b) {
     a+=1;
   }
You've claimed that that's assembly language. What assembler?
For what CPU?
Is it even for a real assembler?
I think you realize the example above is just an example to demo my
idea.
So you've invented an 'assembly' syntax that looks exactly like C, in
order to support your notion that C and assembly are really the same
thing!
Exactly. But not really 'invented'. I figured if anyone wants to
implement a 'portable assembly', he would find it not much different
from C (from the example shown, 'structured C'). So, in a sense, not
worth implementing.
Real assembly generally uses explicit instructions and labels rather
than the implicit ones used here. It would also have limits on the
complexity of expressions. If your pseudo-assembler supports:
    a = b+c*f(x,y);
then you've invented a HLL.
You may say that. A high level language can dump code for a lower
level one and vice versa.
It sounds like you don't understand the difference between a low-level
language and a high-level one.
On 4/15/2026 4:30 PM, wij wrote:
[...]
Skill. Treat assembly as a chunk. Document it well.
Well crafted asm is not bad. Only used when needed! simple... :^)
I found some of my old asm on the Wayback Machine:
https://web.archive.org/web/20060214112345/http://appcore.home.comcast.net/appcore/src/cpu/i686/ac_i686_gcc_asm.html
On 4/15/2026 11:37 PM, Chris M. Thomasson wrote:
[...]
I found some of my old asm on the Wayback Machine:
https://web.archive.org/web/20060214112345/http://appcore.home.comcast.net/appcore/src/cpu/i686/ac_i686_gcc_asm.html
2005, damn time goes on bye, bye... ;^o
David Brown <david.brown@hesbynett.no> writes:
On 15/04/2026 01:33, Bart wrote:[...]
Certainly until C99 when stdint.h came along.
I would not draw that distinction - indeed, I see the opposite. Prior
to <stdint.h>, your integer type sizes were directly from the target
machine - with <stdint.h> explicitly sized integer types, they are now
independent of the target hardware.
A minor quibble: The sizes of the predefined integer types have
always been determined by the compiler, often mandated by an ABI
for the target platform. The choice is *influenced* by the target
hardware, but not controlled by it. For example, the width of
`long` on x86_64 is likely to be 32 bits on Windows, 64 bits on
other platforms.
David Brown <david.brown@hesbynett.no> wrote:
On 15/04/2026 18:58, wij wrote:
Assembly compiler (or language) can also do the same optimization.
No, assemblers cannot do that - if they did, they would not be
assemblers. An assembler directly translates your instructions from
mnemonic codes (assembly instructions) to binary opcodes. Some
assemblers might have pseudo-instructions that translate to more than
one binary opcode, but always in a specific defined pattern.
Well, an assembler as a program is not a compiler. But people talk
about "assembly language", and you can have a compiler that
takes assembly language as input. This was done by DEC
for VAX assembly. A guy created compilers for 360 assembly,
one targeting 386, another one targeting Java. Such compilers,
to be useful, should do the same optimizations.
Bart <bc@freeuk.com> writes:
[...]
Assembly is a great thing to know. It makes it easier to know what's
going on under the hood of higher level languages, and can even help in
troubleshooting and reasoning about how to make your code more efficient.
Do I think that learning assembly is an asset? Absolutely.
Do I think it's something that a project should be written in directly?
In most cases, absolutely not.
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
On 15.04.2026 at 22:01, Kerr-Mudd, John wrote:
On Wed, 15 Apr 2026 20:23:52 +0200
Jacek Marcin Jaworski <jmj@energokod.gda.pl> wrote:
On 15.04.2026 at 15:40, makendo wrote:
DYM comp.lang.asm.x86?
(forwarding to alt.lang.asm because you are comparing C with it)
Great, but what is wrong with comp.lang.asm? I subscribe to it instead of any
other asm-related groups. Is this the wrong approach?
No!
comp.lang.asm is an empty header for me on eternal september's feed.
After question "is comp.lang.asm moderated?" ecosia.org AI answer today, quote:
The newsgroup comp.lang.asm is generally considered an unmoderated Usenet group. Unlike comp.lang.asm.x86, which is known to be moderated, comp.lang.asm does not have an official moderation process and typically allows posts to appear without prior review.
I see old posts published on comp.lang.asm, and last is yours: "Kenny
Code for DOS", from 2023-04-24, mon. (without any answers).
sig is overlong, and crowded, IMHO.
I have so many things to communicate to Poles - this is the reason for the big
sig. But I try to be laconic.
--
Jacek Marcin Jaworski, Pruszcz Gd., Pomorskie voivodeship, Poland, EU;
tel.: +48-609-170-742, best between 5:00-5:55 or 16:00-17:25; <jmj@energokod.gda.pl>, gpg: 4A541AA7A6E872318B85D7F6A651CC39244B0BFA;
Home WWW page: <https://energokod.gda.pl>;
Mini Netiquette: <https://energokod.gda.pl/MiniNetykieta.html>; Email Self-Defence: <https://emailselfdefense.fsf.org/pl>.
NOTE:
DO NOT TAKE ON "HIDDEN DEBT"! PAY FOR FOSS SOFTWARE AND ONLINE INFORMATION!
READ FOR FREE: "17. Raport Totaliztyczny - Patroni Kontra Bankierzy": <https://energokod.gda.pl/raporty-totaliztyczne/17.%20Patroni%20Kontra%20Bankierzy.pdf>
wij <wyniijj5@gmail.com> writes:
On Wed, 2026-04-15 at 17:14 -0700, Tim Rentsch wrote:
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
Gentlemen,
I understand the natural reaction to want to respond to the kind of
statements being made in this thread. I hope y'all can resist this
natural reaction and not respond to people who persist in making
arguments that are basically isomorphic to saying 1 equals 0.
Thank you for your assistance in this matter.
Maybe you are right. I say A is-a B; one persists in reading A is (exactly) B.
I provide help with using assembly. One persists in reading it as persuading
people to use assembly and give up HLLs. What is going on here?
You say that C is an assembly language. Nobody here thinks that
you're *equating* C and assembly language. It's obvious that
there are plenty of assembly languages that are not C, and nobody
has said otherwise. I have no idea why you think anyone has that
particular confusion.
At least one person has apparently interpreted your defense of
assembly language (that it isn't as scary as some think it is)
as a claim that we should program in assembly language rather
than in HLLs. You're right, that was a misinterpretation of what
you wrote. I considered mentioning that, but didn't bother.
The issue I've been discussing is your claim that C is an assembly
language. It is not.
I do not intend to post again in this thread until and unless you
post something substantive on that issue.
Well crafted asm is not bad. Only used when needed! simple... :^)
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
Gentlemen,
I understand the natural reaction to want to respond to the kind of
statements being made in this thread. I hope y'all can resist this
natural reaction and not respond to people who persist in making
arguments that are basically isomorphic to saying 1 equals 0.
Thank you for your assistance in this matter.
It seems you insist C and assembly have to be exactly what your bible says.
If so, I would say that what the C standard (I cannot read it) says is the
meaning of the terminology of terms in it, not intended to be anything used
in any other situation.
I do not intend to post again in this thread until and unless you
post something substantive on that issue.
On Wed, 2026-04-15 at 19:04 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
On Wed, 2026-04-15 at 17:14 -0700, Tim Rentsch wrote:
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
Gentlemen,
I understand the natural reaction to want to respond to the kind of
statements being made in this thread. I hope y'all can resist this
natural reaction and not respond to people who persist in making
arguments that are basically isomorphic to saying 1 equals 0.
Thank you for your assistance in this matter.
Maybe you are right. I say A is-a B; one persists in reading A is (exactly) B.
I provide help with using assembly. One persists in reading it as persuading
people to use assembly and give up HLLs. What is going on here?
You say that C is an assembly language. Nobody here thinks that
you're *equating* C and assembly language. It's obvious that
there are plenty of assembly languages that are not C, and nobody
has said otherwise. I have no idea why you think anyone has that
particular confusion.
At least one person has apparently interpreted your defense of
assembly language (that it isn't as scary as some think it is)
as a claim that we should program in assembly language rather
than in HLLs. You're right, that was a misinterpretation of what
you wrote. I considered mentioning that, but didn't bother.
The issue I've been discussing is your claim that C is an assembly
language. It is not.
If I said C is assembly, it is in the sense that I have at least shown in the
last post (s_tut2.cpp), where even an 'instruction' can be any function (e.g.
change directory, copy files, launch an editor, ...). And also, what
'computation' is is demonstrated, which includes a suggestion of what C is,
essentially any program, and in this sense what an HLL is. Finally, it could
demonstrate the meaning of and testify to the Church-Turing thesis (my words:
no computation language, including various kinds of math formulas, can exceed
the expressive power of a TM).
It seems you insist C and assembly have to be exactly what your bible says. If
so, I would say that what the C standard (I cannot read it) says is the meaning
of the terminology of terms in it, not intended to be anything used in any
other situation.
I do not intend to post again in this thread until and unless you
post something substantive on that issue.
In article <1e4ef965d5ee27013e0abfd3c5dc18831400ad5f.camel@gmail.com>,
wij <wyniijj5@gmail.com> wrote:
...
It seems you insist C and assembly have to be exactly what your bible says.
If so, I would say what the C standard (I cannot read it) says is the meaning
of the terminology of terms in it, not intended to be anything used in any
other situation.
Keith is the king of this newsgroup. What he says, goes.
The way he defines words is the law, and all must fall in line with that.
You're new around here, so you are probably not familiar with these rules,
but you will be soon (if you choose to stick around).
Kind Keith then stated:
I do not intend to post again in this thread until and unless you
post something substantive on that issue.
For which we are all grateful.
David Brown <david.brown@hesbynett.no> wrote:
On 15/04/2026 18:58, wij wrote:
Assembly compiler (or language) can also do the same optimization.
No, assemblers cannot do that - if they did, they would not be
assemblers. An assembler directly translates your instructions from
mnemonic codes (assembly instructions) to binary opcodes. Some
assemblers might have pseudo-instructions that translate to more than
one binary opcode, but always in a specific defined pattern.
Well, as a program assembler is not a compiler. But people talk
about "assembly language" and you can have a compiler that
takes assembly language as an input. This was done by DEC
for VAX assembly. A guy created compilers for 360 assembly,
one targeting 386, another one targeting Java. Such compilers
to be useful should do the same optimizations.
On Thu, 2026-04-16 at 13:10 +0000, Kenny McCormack wrote:
In article <1e4ef965d5ee27013e0abfd3c5dc18831400ad5f.camel@gmail.com>,
wijÿ <wyniijj5@gmail.com> wrote:
...
It seems you insist C and assembly have to be exactly what your bible says.
If so, I would say what the C standard (I cannot read it) says is the meaning
of the terminology of terms in it, not intended to be anything used in any
other situation.
Keith is the king of this newsgroup. What he says, goes.
The way he defines words is the law, and all must fall in line with that.
Forget about that; facts first. There are LLMs. This is not a court.
As I know, comp.lang.c should be a forum for more general topics than
lang.c.mod, comp.lang.c.std. And, refrain from telling others what they
should do; you are just another participant.
antispam@fricas.org (Waldek Hebisch) writes:
David Brown <david.brown@hesbynett.no> wrote:
On 15/04/2026 18:58, wij wrote:
Assembly compiler (or language) can also do the same optimization.
No, assemblers cannot do that - if they did, they would not be
assemblers. An assembler directly translates your instructions from
mnemonic codes (assembly instructions) to binary opcodes. Some
assemblers might have pseudo-instructions that translate to more than
one binary opcode, but always in a specific defined pattern.
Well, as a program assembler is not a compiler. But people talk
about "assembly language" and you can have a compiler that
takes assembly language as an input. This was done by DEC
for VAX assembly. A guy created compilers for 360 assembly,
one targeting 386, another one targeting Java. Such compilers
to be useful should do the same optimizations.
The C compiler in the GNU Compiler Collection provides
a mechanism to 'take assembly language as an input'
in the form of in-line assembler fragments. It's
useful in some limited cases (machine-level software like
kernels, boot loaders and the like).
antispam@fricas.org (Waldek Hebisch) writes:
David Brown <david.brown@hesbynett.no> wrote:
On 15/04/2026 18:58, wij wrote:
Assembly compiler (or language) can also do the same optimization.
No, assemblers cannot do that - if they did, they would not be
assemblers. An assembler directly translates your instructions from
mnemonic codes (assembly instructions) to binary opcodes. Some
assemblers might have pseudo-instructions that translate to more than
one binary opcode, but always in a specific defined pattern.
Well, as a program assembler is not a compiler. But people talk
about "assembly language" and you can have a compiler that
takes assembly language as an input. This was done by DEC
for VAX assembly. A guy created compilers for 360 assembly,
one targeting 386, another one targeting Java. Such compilers
to be useful should do the same optimizations.
The C compiler in the GNU Compiler Collection provides
a mechanism to 'take assembly language as an input'
in the form of in-line assembler fragments. It's
useful in some limited cases (machine-level software like
kernels, boot loaders and the like).
The Burroughs Large systems (B5500 and descendants) have
never had an assembler; all code is written in a flavor
of Algol (with special syntax extensions required for
the MCP and other privileged applications).
The Burroughs Medium systems COBOL68 compiler supported
the 'ENTER SYMBOLIC' statement, which was followed by
in-line assembler until the LEAVE SYMBOLIC statement.
[snip]
FWIW, I believe that the origins of C had much the same
philosophy: write parts in suitable languages, and link
them together prior to execution.
K&R C had no reason
to support inline assembly and, as far as I have read,
the authors studiously avoided that capability.
On 2026-04-16 17:11, Lew Pitcher wrote:
[snip]
FWIW, I believe that the origins of C had much the same
philosophy: write parts in suitable languages, and link
them together prior to execution.
But was that an outcome of the C-language design, or of
the UNIX operating system concepts with its languages,
toolbox, and linking-editor?
There also seems to have been an asymmetry here with "C",
at least evolving later...
From what I observed, "C" had reached a status to not be
"inter pares". As a comparably low-level language it had
been often used for other languages as the compile-output
to be then handled by any C-compiler. Also HLLs supported
interfaces to access (primarily) "C" modules because of
their (much better) performance and the typically easier
access to system resources.
K&R C had no reason
to support inline assembly and, as far as I have read,
the authors studiously avoided that capability.
Nonetheless it supported the reserved word 'asm' (as I can
read in my old translation of K&R). (Not exactly what I'd
call "studiously avoided".)
Janis
On Thu, 16 Apr 2026 14:38:06 +0000, Scott Lurndal wrote:
antispam@fricas.org (Waldek Hebisch) writes:
David Brown <david.brown@hesbynett.no> wrote:
On 15/04/2026 18:58, wij wrote:
Assembly compiler (or language) can also do the same optimization.
No, assemblers cannot do that - if they did, they would not be
assemblers. An assembler directly translates your instructions from
mnemonic codes (assembly instructions) to binary opcodes. Some
assemblers might have pseudo-instructions that translate to more than
one binary opcode, but always in a specific defined pattern.
Well, as a program assembler is not a compiler. But people talk
about "assembly language" and you can have a compiler that
takes assembly language as an input. This was done by DEC
for VAX assembly. A guy created compilers for 360 assembly,
one targeting 386, another one targeting Java. Such compilers
to be useful should do the same optimizations.
The C compiler in the GNU Compiler Collection provides
a mechanism to 'take assembly language as an input'
in the form of in-line assembler fragments. It's
useful in some limited cases (machine-level software like
kernels, boot loaders and the like).
I believe that the authors of GNU C latched on to an (at the
time) useful extension of the C language, originally implemented
in Ron Cain's "Small C Compiler for the 8080's" (Dr. Dobbs
Journal # 45, 1980) as the #asm/#endasm preprocessor directives.
Ron's K&R C subset compiler didn't compile to machine code;
instead, it compiled to CP/M 8080 assembler (CP/M came with
an 8080 assembler as its only language tool), and so a
source-code assembly "passthrough" was easily implemented.
The Burroughs Large systems (B5500 and descendants) have
never had an assembler; all code is written in a flavor
of Algol (with special syntax extensions required for
the MCP and other privileged applications).
The Burroughs Medium systems COBOL68 compiler supported
the 'ENTER SYMBOLIC' statement, which was followed by
in-line assembler until the LEAVE SYMBOLIC statement.
The IBM language environments that I worked in all
supported static (and later, dynamic) linkage, and my
employer could afford a suite of IBM language tools.
IBMs language tools shared a common object interface,
so it was (relatively) easy to write the Assembly
parts in Assembler, and the HLL parts in the appropriate
HLL (usually, for us, COBOL), and link them together
for execution.
Consequently, none of the high-level languages supported
an "assembly" escape (although COBOL provided extensions
for IBM DB2 relational database interaction).
On 2026-04-16 17:11, Lew Pitcher wrote:
[snip]
FWIW, I believe that the origins of C had much the same
philosophy: write parts in suitable languages, and link
them together prior to execution.
But was that an outcome of the C-language design, or of
the UNIX operating system concepts with its languages,
toolbox, and linking-editor?
On Thu, 16 Apr 2026 17:43:19 +0200, Janis Papanagnou wrote:
On 2026-04-16 17:11, Lew Pitcher wrote:
[snip]
FWIW, I believe that the origins of C had much the same
philosophy: write parts in suitable languages, and link
them together prior to execution.
But was that an outcome of the C-language design, or of
the UNIX operating system concepts with its languages,
toolbox, and linking-editor?
All of the above.
Linkage editors were (and still are) common technology,
as was separation of languages (assembler vs high level
language). Originally, Unix was written in assembler, and
(according to the histories) C was designed (with the existent
language tools in mind) to allow the Unix developers to use
a high-level language in their development. Remember, Bell
Labs wrote more than just Unix in C; C became the lingua-franca
for all the tools and applications, including the text management
tools (TROFF, EQN, SED, AWK, etc) and games (CHESS/CHECKERS/
BACKGAMMON)
I recall reading (but cannot find the reference now) that
Unix (V7 perhaps?) consisted of thousands of lines of C code,
and a few hundred lines of assembly for device drivers.
There also seems to have been an asymmetry here with "C",
at least evolving later...
From what I observed, "C" had reached a status to not be
"inter pares". As a comparably low-level language it had
been often used for other languages as the compile-output
to be then handled by any C-compiler. Also HLLs supported
interfaces to access (primarily) "C" modules because of
their (much better) performance and the typically easier
access to system resources.
K&R C had no reason
to support inline assembly and, as far as I have read,
the authors studiously avoided that capability.
Nonetheless it supported the reserved word 'asm' (as I can
read in my old translation of K&R). (Not exactly what I'd
call "studiously avoided".)
To quote K&R ("The C Programming Language" 1978)
from Appendix A ("C Reference Manual") section 2.3 ("Keywords")
"The 'entry' keyword is not currently implemented by
any compiler, but is reserved for future use. Some
implementations also reserve the words 'fortran' and 'asm'."
I note that, according to that appendix, C had been ported to
PDP 11, Honeywell 6000, IBM 360/370, and Interdata 8/32 systems
at that time, none of them running Unix, to my knowledge.
As
such, the language (at that time in a bit of a plastic state,
being supplied as source code to AT&T customers and educators
alike) may have been altered on a site-by-site basis to suit
the needs of each particular client. As the context of these
keywords was never explained, I find it easier to believe that
the intent for these keywords was as a storage modifier, and
not an inline language change. Something like
extern fortran int F1(); /* use fortran calling convention */
extern asm char *F2(); /* use assembly calling convention */
On 15/04/2026 01:33, Bart wrote:
...
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
The latest draft standard supports %b and %B formats.
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason why C
says the behavior is undefined, rather than requiring that such code be
rejected. Implementations are intended to take advantage of that fact
for code that does not need to be portable.
On 2026-04-16 08:37, Chris M. Thomasson wrote:
Well crafted asm is not bad. Only used when needed! simple... :^)
And in practice a throwaway-product once you change platform.
(I'm shuddering thinking about porting my decades old DSP asm
code to some other platform/CPU architecture.) But I've ported
or re-used old "C" code without much effort. This is a crucial
difference, especially in the light of the thread's theses.
On 16/04/2026 11:28, James Kuyper wrote:
On 15/04/2026 01:33, Bart wrote:
...
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
The latest draft standard supports %b and %B formats.
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason why C
says the behavior is undefined, rather than requiring that such code be
rejected. Implementations are intended to take advantage of that fact
for code that does not need to be portable.
Taking advantage in what way? Doing something entirely unexpected or
unintuitive?
/That's/ the problem!
Bart <bc@freeuk.com> writes:
On 16/04/2026 11:28, James Kuyper wrote:
On 15/04/2026 01:33, Bart wrote:
...
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
The latest draft standard supports %b and %B formats.
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason why C
says the behavior is undefined, rather than requiring that such code be
rejected. Implementations are intended to take advantage of that fact
for code that does not need to be portable.
Taking advantage in what way? Doing something entirely unexpected or
unintuitive?
How ridiculous! If you can figure out a way to take advantage of
unexpected behavior, I'd appreciate knowing what it is.
I was talking
about defining the behavior that the C standard itself leaves undefined,
in ways that make things convenient for the developer.
On 17/04/2026 01:26, James Kuyper wrote:
Bart <bc@freeuk.com> writes:
On 16/04/2026 11:28, James Kuyper wrote:
On 15/04/2026 01:33, Bart wrote:
...
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
The latest draft standard supports %b and %B formats.
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason why C
says the behavior is undefined, rather than requiring that such code be
rejected. Implementations are intended to take advantage of that fact
for code that does not need to be portable.
Taking advantage in what way? Doing something entirely unexpected or
unintuitive?
How ridiculous! If you can figure out a way to take advantage of
unexpected behavior, I'd appreciate knowing what it is.
It was you who mentioned taking advantage.
And by taking advantage, I assume you meant all the unpredictable things
that optimising compilers like to do, because they assume that UB cannot
happen.
Signed integer overflow is the one that everyone knows (though oddly it
is not listed in Appendix J.2, or if it is, it doesn't use the word
'overflow'!).
I think there are other obscure ones to do with the order you read and
write members of unions, or apply type-punning, or what you can do with
pointers.
A common scenario is where someone is implementing a language where such
things are well-defined, and they want to run it on a target machine
where they are also well-defined, but decide to use C as an intermediate
language.
Unfortunately C has other ideas! So this means somehow getting around
the UB in the C that is generated, or stipulating specific compilers or
compiler options.
Or just crossing your fingers and hoping the compiler will not be so crass.
Another scenario is where you're just writing C code and want that same
behaviour.
I was talking
about defining the behavior that the C standard itself leaves undefined,
in ways that make things convenient for the developer.
The developer of the C implementation, or the C application?
I don't often use intermediate C code now, but that code is no longer
portable among C compilers. It is for gcc only, and requires:
    -fno-strict-aliasing
I can't remember exactly why it's needed, but some programs won't work
without it.
(It's used with -O2, also necessary due to much redundancy in the C
code. Without the aliasing option, gcc will warn with: "dereferencing
type-punned pointer will break strict-aliasing rules")
Whatever it is, I don't need anything like that when bypassing C and
going straight to native code.
And you won't need it if writing real assembly.
On 17/04/2026 13:27, Bart wrote:
On 17/04/2026 01:26, James Kuyper wrote:
Bart <bc@freeuk.com> writes:
On 16/04/2026 11:28, James Kuyper wrote:
On 15/04/2026 01:33, Bart wrote:
...
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
The latest draft standard supports %b and %B formats.
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason
why C says the behavior is undefined, rather than requiring that
such code be rejected. Implementations are intended to take
advantage of that fact for code that does not need to be
portable.
Taking advantage in what way? Doing something entirely unexpected
or unintuitive?
How ridiculous! If you can figure out a way to take advantage of
unexpected behavior, I'd appreciate knowing what it is.
It was you who mentioned taking advantage.
And by taking advantage, I assume you meant all the unpredictable
things that optimising compilers like to do, because they assume
that UB cannot happen.
Signed integer overflow is the one that everyone knows (though
oddly it is not listed in Appendix J.2, or if it is, it doesn't use
the word 'overflow'!).
"An exceptional condition occurs during the evaluation of an
expression (6.5.1)"
You are correct that it does not use the word "overflow" - it's a bit
more generic than that.
I think there are other obscure ones to do with the order you read
and write members of unions, or apply type-punning, or what you can
do with pointers.
A common scenario is where someone is implementing a language where
such things are well-defined, and they want to run it on a target
machine where they are also well-defined, but decide to use C as an intermediate language.
That is an extraordinarily /uncommon/ scenario. I know it applies to
you, but you are not a typical C user in this respect.
People who want to use C as an intermediate language need to generate
code that is correct according to C semantics. It does not matter
how well the source language matches the target processor in its
behaviour if the C code in the middle has different ideas. (Indeed,
it does not matter what the target processor semantics are, except
for knowing the efficiency you can hope to achieve.) Thus if you
want wrapping signed integer arithmetic in your source language, you
must generate C code that emulates those semantics - such as by
having casts back and forth to unsigned types,
or using bigger types and then masking,
or writing non-portable code such as adding
"#pragma GCC optimize ("wrapv")" to the generated code.
Unfortunately C has other ideas! So this means somehow getting
around the UB in the C that is generated, or stipulating specific
compilers or compiler options.
Should C semantics be designed to suit millions of general C
developers over several generations, or should they be optimised to
suit a single developer of non-C languages who can't be bothered
adding some casts to his code generator? Hm, that's a difficult
trade-off question...
Or just crossing your fingers and hoping the compiler will not be
so crass.
Another scenerio is where you just writing C code and want that
same behaviour.
That's a great deal more common than the transpiler situation. But
it is still far rarer than many people think. In general, people
don't want their integer arithmetic to overflow - doing so is a bug,
no matter what the results.
I was talking
about defining the behavior that the C standard itself leaves
undefined, in ways that make things convenient for the developer.
The developer of the C implementation, or the C application?
I don't often use intermediate C code now, but that code is no
longer portable among C compilers. It is for gcc only, and requires:
    -fno-strict-aliasing
I recommend adding that as a pragma, not expecting people (yourself)
to remember it as a command-line option.
I can't remember exactly why it's needed, but some programs won't
work without it.
It is needed if you faff around with converting pointer types - lying
to your compiler by saying "this is a pointer to type A" when you are
setting it to the address of an object of type B. Such "tricks" can
be convenient sometimes, more convenient than semantically correct
methods (like unions or using memmove) so I can understand the
appeal. But you should understand clearly that your C code here is
non-portable and has undefined behaviour according to the C standard
- "gcc -fno-strict-aliasing" provides additional semantics that you
can rely on as long as you use that flag.
(It's used with -O2, also necessary due to much redundancy in the C
code. Without the aliasing option, gcc will warn with:
"dereferencing type-punned pointer will break strict-aliasing
rules")
gcc's warning here is slightly inaccurately worded, but very useful.
Whatever it is, I don't need anything like that when bypassing C
and going straight to native code.
And you won't need it if writing real assembly.
Sure. If you don't use C, you don't have to care about C semantics.
On 17/04/2026 13:27, Bart wrote:
Signed integer overflow is the one that everyone knows (though oddly
it is not listed in Appendix J.2, or if it is, it doesn't use the word
'overflow'!).
"An exceptional condition occurs during the evaluation of an expression (6.5.1)"
You are correct that it does not use the word "overflow" - it's a bit
more generic than that.
I think there are other obscure ones to do with the order you read and
write members of unions, or apply type-punning, or what you can do
with pointers.
A common scenario is where someone is implementing a language where
such things are well-defined, and they want to run it on a target
machine where they are also well-defined, but decide to use C as an
intermediate language.
That is an extraordinarily /uncommon/ scenario.
People who want to use C as an intermediate language need to generate
code that is correct according to C semantics.ÿ It does not matter how
well the source language matches the target processor in its behaviour
if the C code in the middle has different ideas.
Unfortunately C has other ideas! So this means somehow getting around
the UB in the C that is generated, or stipulating specific compilers
or compiler options.
Should C semantics be designed to suit millions of general C developers
over several generations, or should they be optimised to suit a single
developer of non-C languages who can't be bothered adding some casts to
his code generator? Hm, that's a difficult trade-off question...
Another scenario is where you are just writing C code and want that same
behaviour.
That's a great deal more common than the transpiler situation. But it
is still far rarer than many people think. In general, people don't
want their integer arithmetic to overflow - doing so is a bug, no matter
what the results.
I was talking
about defining the behavior that the C standard itself leaves undefined,
in ways that make things convenient for the developer.
The developer of the C implementation, or the C application?
I don't often use intermediate C code now, but that code is no longer
portable among C compilers. It is for gcc only, and requires:
    -fno-strict-aliasing
I recommend adding that as a pragma, not expecting people (yourself) to remember it as a command-line option.
It is needed if you faff around with converting pointer types - lying to your compiler by saying "this is a pointer to type A" when you are
setting it to the address of an object of type B.
On Fri, 17 Apr 2026 14:37:47 +0200
David Brown <david.brown@hesbynett.no> wrote:
On 17/04/2026 13:27, Bart wrote:
On 17/04/2026 01:26, James Kuyper wrote:
Bart <bc@freeuk.com> writes:
On 16/04/2026 11:28, James Kuyper wrote:
On 15/04/2026 01:33, Bart wrote:
...
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
The latest draft standard supports %b and %B formats.
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason
why C says the behavior is undefined, rather than requiring that
such code be rejected. Implementations are intended to take
advantage of that fact for code that does not need to be
portable.
Taking advantage in what way? Doing something entirely unexpected
or unintuitive?
How ridiculous! If you can figure out a way to take advantage of
unexpected behavior, I'd appreciate knowing what it is.
It was you who mentioned taking advantage.
And by taking advantage, I assume you meant all the unpredictable
things that optimising compilers like to do, because they assume
that UB cannot happen.
Signed integer overflow is the one that everyone knows (though
oddly it is not listed in Appendix J.2, or if it is, it doesn't use
the word 'overflow'!).
"An exceptional condition occurs during the evaluation of an
expression (6.5.1)"
You are correct that it does not use the word "overflow" - it's a bit
more generic than that.
I think there are other obscure ones to do with the order you read
and write members of unions, or apply type-punning, or what you can
do with pointers.
A common scenario is where someone is implementing a language where
such things are well-defined, and they want to run it on a target
machine where they are also well-defined, but decide to use C as an
intermediate language.
That is an extraordinarily /uncommon/ scenario. I know it applies to
you, but you are not a typical C user in this respect.
People who want to use C as an intermediate language need to generate
code that is correct according to C semantics. It does not matter
how well the source language matches the target processor in its
behaviour if the C code in the middle has different ideas. (Indeed,
it does not matter what the target processor semantics are, except
for knowing the efficiency you can hope to achieve.) Thus if you
want wrapping signed integer arithmetic in your source language, you
must generate C code that emulates those semantics - such as by
having casts back and forth to unsigned types,
That would, indeed, avoid undefined behavior, but it leaves you in the
realm of implementation-defined behavior (6.3.1.3.3).
6.3.1.3 Signed and unsigned integers
1 When a value with integer type is converted to another integer type
other than _Bool, if the value can be represented by the new type, it
is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
3 Otherwise, the new type is signed and the value cannot be represented
in it; either the result is implementation-defined or an
implementation-defined signal is raised.
or using bigger types and then masking,
Which can be problematic when dealing with the widest integer type.
Besides, it's still implementation-defined (the same 6.3.1.3 p3 applies),
unless the generated code is *very* elaborate.
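For what it's worth, the implementation-defined step in the "casts back and forth" approach can itself be avoided with an explicit range adjustment. A sketch (the function name is mine, not from the thread):

```c
#include <stdint.h>

/* Wrapping 32-bit signed addition with no UB and no reliance on the
   implementation-defined unsigned->signed conversion of 6.3.1.3 p3:
   do the arithmetic in unsigned (defined modulo 2^32), then map the
   result into the signed range explicitly. */
static int32_t add_wrap_i32(int32_t a, int32_t b)
{
    uint32_t us = (uint32_t)a + (uint32_t)b;  /* wraps, well-defined */
    if (us <= (uint32_t)INT32_MAX)
        return (int32_t)us;                   /* in range: unchanged */
    /* us is in [2^31, 2^32-1]: subtract 2^31 (now representable),
       then re-add it as a signed offset; no step overflows. */
    return (int32_t)(us - 0x80000000u) + INT32_MIN;
}
```

Every conversion here falls under 6.3.1.3 p1 or p2, both fully defined, so the generated code would be portable to any conforming compiler at the cost of one extra branch (which optimisers typically fold away on two's-complement targets).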
On 17/04/2026 13:37, David Brown wrote:
On 17/04/2026 13:27, Bart wrote:
Signed integer overflow is the one that everyone knows (though oddly
it is not listed in Appendix J.2, or if it is, it doesn't use the
word 'overflow'!).
"An exceptional condition occurs during the evaluation of an
expression (6.5.1)"
You are correct that it does not use the word "overflow" - it's a bit
more generic than that.
I think there are other obscure ones to do with the order you read
and write members of unions, or apply type-punning, or what you can
do with pointers.
A common scenario is where someone is implementing a language where
such things are well-defined, and they want to run it on a target
machine where they are also well-defined, but decide to use C as an
intermediate language.
That is an extraordinarily /uncommon/ scenario.
Lots of languages do this. A few that you may have heard of include
Haxe, Seed7, Nim, FreeBasic and Haskell. Although with some it will be
an option.
Even early C++ did so, but there it had mainly C semantics anyway.
People who want to use C as an intermediate language need to generate
code that is correct according to C semantics. It does not matter how
well the source language matches the target processor in its behaviour
if the C code in the middle has different ideas.
Well, this is the problem.
But the thread is about C being equated to assembly, and this is one of
the differences.
Some UBs are reasonable, others are not because the
behaviour is poorly defined on some rare or obsolete hardware, but would
be fine on virtually anything someone is likely to use.
Unfortunately C has other ideas! So this means somehow getting around
the UB in the C that is generated, or stipulating specific compilers
or compiler options.
Should C semantics be designed to suit millions of general C
developers over several generations, or should they be optimised to
suit a single developer of non-C languages who can't be bothered
adding some casts to his code generator? Hm, that's a difficult
trade-off question...
My generated C now is full of casts. It doesn't help much, partly
because the casts are designed to match the C to the semantics of the
source language (here it is typed IL code transpiled to C), rather than fixing the problems of C.
Example (extract from a larger output; module name is 'h'):
 #define asi64(x) *(i64*)&x
 #define tou64(x) (u64)x
 extern i32 printf(u64 $1, ...);
 extern void exit(i32);
 void h_main();
 int main(int nargs, char** args) {
     h_main();
 }
 void h_main() {
     u64 R1, R2;
     i64 a;
     i64 b;
     i64 c;
     asi64(R1) = b;
     asi64(R2) = c;
     asi64(R1) += asi64(R2);
     a = asi64(R1);
     asi64(R1) = a;
     R2 = tou64("hello %lld\n");
     printf(asu64(R2), asi64(R1));
     R1 = 0;
     exit(R1);
     return;
 }
Another scenario is where you are just writing C code and want that same
behaviour.
That's a great deal more common than the transpiler situation. But it
is still far rarer than many people think. In general, people don't
want their integer arithmetic to overflow - doing so is a bug, no
matter what the results.
They want to do arbitrary conversions and type-punning. They want to use unions in whatever way they like without worrying that it may or may not
be UB.
I was talking
about defining the behavior that the C standard itself leaves
undefined,
in ways that make things convenient for the developer.
The developer of the C implementation, or the C application?
I don't often use intermediate C code now, but that code is no longer
portable among C compilers. It is for gcc only, and requires:
    -fno-strict-aliasing
I recommend adding that as a pragma, not expecting people (yourself)
to remember it as a command-line option.
I've tried pragmas before, but they only worked on gcc/Windows; they
seemed to be ignored on gcc/Linux. For example for '-fno-builtin'. But
maybe I'll try it again.
Still, the entire build process for any of my programs, when expressed
as C, is still one command line involving one source file.
It is needed if you faff around with converting pointer types - lying
to your compiler by saying "this is a pointer to type A" when you are
setting it to the address of an object of type B.
Why would I be lying if I clearly use a cast to change T* (or some
integer X) to U*?
On 17/04/2026 15:49, Bart wrote:
On 17/04/2026 13:37, David Brown wrote:
On 17/04/2026 13:27, Bart wrote:
Signed integer overflow is the one that everyone knows (though oddly
it is not listed in Appendix J.2, or if it is, it doesn't use the
word 'overflow'!).
"An exceptional condition occurs during the evaluation of an
expression (6.5.1)"
You are correct that it does not use the word "overflow" - it's a bit
more generic than that.
I think there are other obscure ones to do with the order you read
and write members of unions, or apply type-punning, or what you can
do with pointers.
A common scenario is where someone is implementing a language where
such things are well-defined, and they want to run it on a target
machine where they are also well-defined, but decide to use C as an
intermediate language.
That is an extraordinarily /uncommon/ scenario.
Lots of languages do this. A few that you may have heard of include
Haxe, Seed7, Nim, FreeBasic and Haskell. Although with some it will be
an option.
Do all these languages support type-punning, unions, and signed integer
arithmetic overflow defined in the way you think? I know Haskell does
not, I don't imagine FreeBasic does, but I can't answer for the others.
Well, this is the problem.
If you feel it is a problem for /you/, then I can't argue against that -
but it is /your/ problem.
  #define asi64(x) *(i64*)&x
This is an extremely bad way to do conversions. It is possibly the
reason you need the "-fno-strict-aliasing" flag. Prefer to use value
casts, not pointer casts, as you do with "tou64".
  extern i32 printf(u64 $1, ...);
  extern void exit(i32);
Why would you declare these standard library functions like that? Using
"printf" will be UB, as the declaration does not match the definition.
It might happen to work on x86, but some platform ABIs pass pointers and
integers in different registers.
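The mismatch can be avoided while keeping the u64-slot convention: declare printf with its real prototype and convert the slots back to the declared parameter types at the call site. A sketch in the style of Bart's example (the function and slot names are mine, and snprintf stands in for printf so the output can be checked):

```c
#include <stdint.h>
#include <stdio.h>

typedef uint64_t u64;
typedef int64_t  i64;

/* Generated-style call: values live in generic u64 slots, but are
   converted back to the real parameter types at the call, so the call
   matches the library prototype on any ABI (pointer and integer
   arguments end up in the right registers). */
static int emit_line(char *out, size_t n)
{
    u64 R1 = (u64)(i64)42;                    /* integer in a slot */
    u64 R2 = (u64)(uintptr_t)"hello %lld\n";  /* pointer stored as u64 */
    return snprintf(out, n, (const char *)(uintptr_t)R2,
                    (long long)(i64)R1);
}
```

The round-trip through uintptr_t for the pointer is the conversion C actually defines for storing addresses in integers; the call itself is then ordinary and well-defined.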
And you are doing all this with uninitialised variables, which is UB.
It's not C that's the problem here. If you see problems, it is because
you are pretending that C is something that it is not, and that you can
write all sorts of risky nonsense.
Why would I be lying if I clearly use a cast to change T* (or some
integer X) to U*?
C requires you to access objects using lvalues of the appropriate type.
But it also allows conversions between various pointer types - that is
how you can have generic and flexible code (such as using malloc returns
for different types). So if you have a pointer "p" of type "T*", and
you write "(U*) p", you are telling the compiler "I know I said p was a
pointer to objects of type T, but in this particular case it is actually
pointing to an object of type U - the value contained in p started off
as the address of a U, before it was converted to a T*". If the thing
your pointer "p" points to is /not/ an object of type U (or other
suitable type, following the compatibility and qualifier rules), then
you are lying to the compiler.
I've no idea. You just said it was uncommon to use C in this way. But
every other amateur compiler project on Reddit forums likes to use a C
target.
On 17/04/2026 01:26, James Kuyper wrote:
Bart <bc@freeuk.com> writes:
On 16/04/2026 11:28, James Kuyper wrote:
On 15/04/2026 01:33, Bart wrote:
...
* Not allowing (until standardised after half a century) binary
literals, and still not allowing those to be printed
The latest draft standard supports %b and %B formats.
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason why C
says the behavior is undefined, rather than requiring that such code be
rejected. Implementations are intended to take advantage of that fact
for code that does not need to be portable.
Taking advantage in what way? Doing something entirely unexpected or
unintuitive?
How ridiculous! If you can figure out a way to take advantage of
unexpected behavior, I'd appreciate knowing what it is.
It was you who mentioned taking advantage.
And by taking advantage, I assume you meant all the unpredictable
things that optimising compilers like to do, because they assume that
UB cannot happen.
On 17/04/2026 15:45, David Brown wrote:
On 17/04/2026 15:49, Bart wrote:
On 17/04/2026 13:37, David Brown wrote:
On 17/04/2026 13:27, Bart wrote:
Signed integer overflow is the one that everyone knows (though oddly
it is not listed in Appendix J.2, or if it is, it doesn't use the
word 'overflow'!).
"An exceptional condition occurs during the evaluation of an
expression (6.5.1)"
You are correct that it does not use the word "overflow" - it's a bit
more generic than that.
I think there are other obscure ones to do with the order you read
and write members of unions, or apply type-punning, or what you can
do with pointers.
A common scenario is where someone is implementing a language where
such things are well-defined, and they want to run it on a target
machine where they are also well-defined, but decide to use C as an
intermediate language.
That is an extraordinarily /uncommon/ scenario.
Lots of languages do this. A few that you may have heard of include
Haxe, Seed7, Nim, FreeBasic and Haskell. Although with some it will be
an option.
Do all these languages support type-punning, unions, and signed integer
arithmetic overflow defined in the way you think? I know Haskell does
not, I don't imagine FreeBasic does, but I can't answer for the others.
I've no idea. You just said it was uncommon to use C in this way. But
every other amateur compiler project on Reddit forums likes to use a C
target.
Well, this is the problem.
If you feel it is a problem for /you/, then I can't argue against that -
but it is /your/ problem.
It is a problem when using C for this purpose, which wouldn't arise
using a language designed to be used as an intermediate target.
  #define asi64(x) *(i64*)&x
This is an extremely bad way to do conversions. It is possibly the
reason you need the "-fno-strict-aliasing" flag. Prefer to use value
casts, not pointer casts, as you do with "tou64".
For this purpose, the C has to emulate a stack machine, with the stack
slots being a fixed type (u64) which have to contain signed and unsigned
integers, floats or doubles, or any kinds of pointer, or even any
arbitrary struct or array, by value.
One option was to use a union type for each stack element, but I decided
my choice would give cleaner code.
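The union option David alludes to could look like this (a sketch; the type and function names are mine, not from the generated code):

```c
#include <stdint.h>

/* One stack slot that can hold any 64-bit scalar kind by value.
   Reading the member last written is always defined; reading a
   different member reinterprets the stored bytes, which C99 TC3 and
   later explicitly permit for unions (6.5.2.3 and its type-punning
   footnote), so no -fno-strict-aliasing is needed. */
typedef union {
    int64_t  i;
    uint64_t u;
    double   d;
    void    *p;
} Slot;

/* Interpret both slots as signed and add them.  (Note the addition
   itself can still overflow; that is a separate issue.) */
static int64_t add_slots(Slot a, Slot b)
{
    return a.i + b.i;
}
```

Compared with the `*(i64*)&x` macro, a compiler is obliged to track that all members of a Slot share storage, so the aliasing question never arises.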
  extern i32 printf(u64 $1, ...);
  extern void exit(i32);
Why would you declare these standard library functions like that? Using
"printf" will be UB, as the declaration does not match the definition.
On most 64-bit machines these days you have float and non-float register
banks. Pointers are non-floats so can be handled like ints.
In the IL that this C comes from, there are no pointer types. The
convention is to use 'u64' to represent addresses.
It might happen to work on x86, but some platform ABIs pass pointers and
integers in different registers.
The only one I know off-hand is 68K, which has separate data and address
registers, and that might happen, so I'll keep it in mind!
(There is a slim chance I can target 68K from my IL, via an emulator
that I would make, but likely I wouldn't be able to use a C library
anyway. It's funny I remember thinking around 1984 that those dual
register files would make it tricky to compile for.)
And you are doing all this with uninitialised variables, which is UB.
So this is something else. There should be no problem with using
uninitialised data here, other than not being meaningful or useful.
They're uninitialised because my test program didn't bother to do so.
But running the program shouldn't be a problem. Why, what do you think C
might do that is so bad?
Here is the original HLL fragment:
    int a, b, c
    a := b + c
This is the portable IL generated from that:
    i64 x
    i64 y
    i64 z
    load      y
    load      z
    add
    store     x
This uses two stack slots. In the C above, those slots are called R1 and
R2.
The same IL can be turned directly into x64 code:
    # D0-D15 are 64-bit regs
    R.a = D3
    R.b = D4
    R.c = D5
    mov  D0,  R.b
    add  D0,  R.c
    mov  R.a, D0
(This could be reduced to one instruction, but it's no faster.)
AFAIK no hardware exceptions are caused by adding whatever bit patterns
happen to be in those 'b' and 'c' registers.
It's not C that's the problem here. If you see problems, it is because
you are pretending that C is something that it is not, and that you can
write all sorts of risky nonsense.
I use generated C for three things:
* To share my non-C programs with others, who can't/won't use my
compiler binary
* To optimise my non-C programs
* To run my non-C programs on platforms I don't directly support.
When I do that, it seems to work. Eg. Paul Edwards is using my C
compiler, and I distribute it as a 66Kloc file full of code like my
example. But I can also distribute it as NASM, AT&T or MASM assembly.
Why would I be lying if I clearly use a cast to change T* (or some
integer X) to U*?
C requires you to access objects using lvalues of the appropriate type.
But it also allows conversions between various pointer types - that is
how you can have generic and flexible code (such as using malloc returns
for different types). So if you have a pointer "p" of type "T*", and
you write "(U*) p", you are telling the compiler "I know I said p was a
pointer to objects of type T, but in this particular case it is actually
pointing to an object of type U - the value contained in p started off
as the address of a U, before it was converted to a T*". If the thing
your pointer "p" points to is /not/ an object of type U (or other
suitable type, following the compatibility and qualifier rules), then
you are lying to the compiler.
I don't care about any of this. If I take a byte* pointer and cast it to
int* and then write via that version, I expect it to do exactly that,
and not question my choice!
This is exactly how it works in assembly, in my HLLs, and in my ILs.
It's possible that I may have done that erroneously, but that is another
matter. This is not about detecting coding bugs in the source language.
On 17/04/2026 01:26, James Kuyper wrote:
Bart <bc@freeuk.com> writes:
On 16/04/2026 11:28, James Kuyper wrote:
On 15/04/2026 01:33, Bart wrote:
...
* Not being allowed to do a dozen things that you KNOW are well-defined
on your target machine, but C says are UB.
If you know they are well-defined on your only target platform, there's
nothing wrong with writing such code. That's part of the reason why C
says the behavior is undefined, rather than requiring that such code be
rejected. Implementations are intended to take advantage of that fact
for code that does not need to be portable.
Taking advantage in what way? Doing something entirely unexpected or
unintuitive?
How ridiculous! If you can figure out a way to take advantage of
unexpected behavior, I'd appreciate knowing what it is.
It was you who mentioned taking advantage.
And by taking advantage, I assume you meant all the unpredictable things
that optimising compilers like to do, because they assume that UB cannot
happen.
I think there are other obscure ones to do with the order you read and
write members of unions, or apply type-punning, or what you can do with pointers.
A common scenario is where someone is implementing a language where such things are well-defined, and they want to run it on a target machine
where they are also well-defined, but decide to use C as an intermediate language.
Unfortunately C has other ideas! So this means somehow getting around
the UB in the C that is generated, or stipulating specific compilers or
compiler options.
Or just crossing your fingers and hoping the compiler will not be so crass.
Another scenario is where you are just writing C code and want that same
behaviour.
In virtually every case where the C behavior is undefined, you can
I was talking
about defining the behavior that the C standard itself leaves undefined,
in ways that make things convenient for the developer.
The developer of the C implementation, or the C application?
I don't often use intermediate C code now, but that code is no longer portable among C compilers. It is for gcc only, and requires:
   -fno-strict-aliasing
I can't remember exactly why it's needed, but some programs won't work without it.
In attempting writing a simple language, I had a thought of what language is
to share. Because I saw many people are stuck in thinking C/C++ (or other
high level language) can be so abstract, unlimited 'high level' to mysteriously
solve various human description of idea.
C and assembly are essentially the same, maybe better call it 'portable
assembly'.
In C, we don't explicitly specify how wide the register/memory unit is, we use
char/int (short/long, signed/unsigned) to denote the basic unit. I.e.
a=b; // equ. to "mov a,b"
On Thu, 2026-04-16 at 18:42 +0800, wij wrote:
On Wed, 2026-04-15 at 19:04 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
On Wed, 2026-04-15 at 17:14 -0700, Tim Rentsch wrote:
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
Gentlemen,
I understand the natural reaction to want to respond to the kind of
statements being made in this thread. I hope y'all can resist this
natural reaction and not respond to people who persist in making
arguments that are basically isomorphic to saying 1 equals 0.
Thank you for your assistance in this matter.
Maybe you are right. I say A is-a B, one persists to read A is (exactly) B.
I provide help to using assembly. One persists to read I persuade using
assembly and give up HLL. What is going on here?
You say that C is an assembly language. Nobody here thinks that
you're *equating* C and assembly language. It's obvious that
there are plenty of assembly languages that are not C, and nobody
has said otherwise. I have no idea why you think anyone has that
particular confusion.
At least one person has apparently interpreted your defense of
assembly language (that it isn't as scary as some think it is)
as a claim that we should program in assembly language rather
than in HLLs. You're right, that was a misinterpretation of what
you wrote. I considered mentioning that, but didn't bother.
The issue I've been discussing is your claim that C is an assembly
language. It is not.
If I said C is assembly, it is in the sense that I have at least shown in the
post (s_tut2.cpp), where even 'instruction' can be any function (e.g. change
directory, copy files, launch an editor,...). And also, what is 'computation'
is demonstrated, which includes a suggestion of what C is, essentially any
program, and in this sense what HLL is. Finally, it could demonstrate the
meaning and testify the Church-Turing thesis (my words: no computation
language, including various kinds of math formula, can exceed the expressive
power of TM).
It seems you insist C and assembly have to be exactly what your bible says. If
so, I would say what the C standard (I cannot read it) says is the meaning of
the terminology of terms in it, not intended to be anything used in any other
situation.
I do not intend to post again in this thread until and unless you
post something substantive on that issue.
(continue)
IMO, C standard is like a book of legal terms. Like many symbols in the header
file, it defines one symbol in another symbol. The real meaning is not fixed.
The result is you cannot 'prove' correctness of the source program; even
consistency is a problem.
'Instruction' is low-level? Yes, by definition, but not as one might think.
Instruction could refer to a processing unit (might be like the x87 math
co-processor, which may even be higher level to process expressions,...).
A good chance of C is to find a good function that can be hardwired.
So, the basic feature of HLL is 'structured' (or 'nested') text which removes
labels. Semantics is inventor's imagination. So, avoid bizarre complexity; it
won't add expressive power to the language, just a matter of short or lengthy
expression of programming idea.
On 17/04/2026 18:42, Bart wrote:
I've no idea. You just said it was uncommon to use C in this way. But
every other amateur compiler project on Reddit forums likes to use a C
target.
You didn't simply claim that people were using C as an intermediary
language - you claimed they were doing so specifically for languages
that defined things like type punning, wrapping signed integer
arithmetic, and messing about with pointers.
It is a problem when using C for this purpose, which wouldn't arise
using a language designed to be used as an intermediate target.
C is not designed for that purpose, nor are C compilers.
So if you
don't like C here, don't use it. It is not the fault of C, its language
designers, or toolchain implementers. And if this really were the
problem you seem to think, people would use something else.
As it turns out, people /do/ use something else. There are countless
virtual machines with their own byte-codes, specialised for different
types of source languages. And there is a common intermediary language
used by a lot of tools - LLVM "assembly". This /was/ designed for that
purpose, and does a pretty good job at it.
And if you don't like it (of course you don't like it - you didn't
invent it), find or make something else.
One option was to use a union type for each stack element, but I
decided my choice would give cleaner code.
Oh, right - you knew of a correct solution, but decided instead that something broken would be cleaner.
So you think UB is better than doing things correctly, and then you
complain when C doesn't have the semantics you want?
So this is something else. There should be no problem with using
uninitialised data here, other than not being meaningful or useful.
Again - you are pretending that C means what you think it should mean.
Using uninitialised local data leads, in most cases, to UB in C. If
your language treats uninitialised data as unspecified values, or
default initialised (typically to 0), or has some other determined
behaviour, then you need to implement that behaviour in the generated C
code. You don't get to generate C and pretend it means something
different.
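Concretely, implementing the source language's behaviour here could just mean the code generator emitting explicit initialisers. A sketch in the style of Bart's earlier fragment (the names are mine, not from his generator):

```c
typedef long long          i64;
typedef unsigned long long u64;

/* If the source language defines uninitialised locals as zero, the
   generated C must say so explicitly: reading an uninitialised
   automatic variable is (in most cases) UB in C, so the translator
   cannot leave the zeroing implicit. */
static i64 h_sum(void)
{
    u64 R1 = 0, R2 = 0;   /* scratch slots start defined */
    i64 b = 0, c = 0;     /* source-language default of zero */
    R1 = (u64)b;
    R2 = (u64)c;
    R1 = (u64)((i64)R1 + (i64)R2);
    return (i64)R1;
}
```

The cost is a handful of stores that the C optimiser removes whenever it can prove a definite assignment happens first, so well-written source programs pay nothing for the guarantee.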
On 18/04/2026 14:37, David Brown wrote:
On 17/04/2026 18:42, Bart wrote:
I've no idea. You just said it was uncommon to use C in this
way. But every other amateur compiler project on Reddit forums
likes to use a C target.
You didn't simply claim that people were using C as an intermediary
language - you claimed they were doing so specifically for languages
that defined things like type punning, wrapping signed integer
arithmetic, and messing about with pointers.
The broader picture is being forgotten. The thread is partly about C
being a 'portable assembler', and this is a common notion.
C is famous for being low level; being close to the hardware; for a
1:1 correspondence between types that people work with, and the
operations on those, with the equivalent assembly.
Whether that is correct or not, that is what people think or say, and
what many assume.
It is also what very many want, including me.
Yes, I know. There should have been one that is much better - a HLL,
not the monstrosity that is LLVM. But it doesn't exist.
Oh, right - you knew of a correct solution, but decided instead that
something broken would be cleaner.
Well, it shouldn't BE broken! That's the problem with C.
So you think UB is better than doing things correctly, and then you
complain when C doesn't have the semantics you want?
I'm saying that a lot of things shouldn't be UB. Some people just want
want to write assembly - for a specific machine - but also want HLL conveniences.
So this is something else. There should be no problem with usingAgain - you are pretending that C means what you think it should
unitialised data here, other than not being meaningful or useful.
mean. Using uninitialised local data leads, in most cases, to UB in
C.ÿ If your language treats uninitialised data as unspecified
values, or default initialised (typically to 0), or has some other
determined behaviour, then you need to implement that behaviour in
the generated C code.ÿ You don't get to generate C and pretend it
means something different.
In a real application then using unitialised data, outside of .bss,
would be uncommon, and likely be a bug.
But outside of a real application, such as in fragments of test code
that I work on every day, then variables can be uninitialised,
especially if I'm interested more in the code that is being generated
and will not actually run it.
It seems that a C compiler cannot make that distinction and must
always assume that every program, even in development, is
mission-critical.
In my original example, they weren't initialised in order to keep the
posted examples short.
I still wouldn't call A + B undefined behaviour when A/B are not
initialised; the result is the sum of whatever A and B happen to
contain, and is little different from:
A = rand();
B = rand();
A + B;
On 18/04/2026 14:37, David Brown wrote:
On 17/04/2026 18:42, Bart wrote:
I've no idea. You just said it was uncommon to use C in thisYou didn't simply claim that people were using C as an intermediary
way. But every other amateur compiler project on Reddit forums
likes to use a C target.
language - you claimed they were doing so specifically for languages
that defined things like type punning, wrapping signed integer
arithmetic, and messing about with pointers.
The broader picture is being forgotten. The thread is partly about C
being a 'portable assembler', and this is a common notion.
C is famous for being low level; being close to the hardware; for a
1:1 correspondence between types that people work with, and the
operations on those, with the equivalent assembly.
Whether that is correct or not, that is what people think or say, and
what many assume.
It is also what very many want, including me.
Yes, I know. There should have been one that is much better - a HLL,
not the monstrosity that is LLVM. But it doesn't exist.
Oh, right - you knew of a correct solution, but decided instead that
something broken would be cleaner.
Well, it shouldn't BE broken! That's the problem with C.
So you think UB is better than doing things correctly, and then you
complain when C doesn't have the semantics you want?
I'm saying that a lot of things shouldn't be UB. Some people just want
want to write assembly - for a specific machine - but also want HLL conveniences.
So this is something else. There should be no problem with usingAgain - you are pretending that C means what you think it should
unitialised data here, other than not being meaningful or useful.
mean. Using uninitialised local data leads, in most cases, to UB in
C.ÿ If your language treats uninitialised data as unspecified
values, or default initialised (typically to 0), or has some other
determined behaviour, then you need to implement that behaviour in
the generated C code.ÿ You don't get to generate C and pretend it
means something different.
In a real application then using unitialised data, outside of .bss,
would be uncommon, and likely be a bug.
But outside of a real application, such as in fragments of test code
that I work on every day, then variables can be uninitialised,
especially if I'm interested more in the code that is being generated
and will not actually run it.
It seems that a C compiler cannot make that distinction and must
always assume that every program, even in development, is
mission-critical.
In my original example, they weren't initialised in order to keep the
posted examples short.
I still wouldn't call A + B undefined behaviour when A/B are not
initialised; the result is the sum of whatever A and B happen to
contain, and is little different from:
A = rand();
B = rand();
A + B;
It's a common wrong notion.[...]
If you are talking about function-local data, there are multiple ways to store it in an easy-to-clean-up fashion:
On 18/04/2026 14:37, David Brown wrote:
On 17/04/2026 18:42, Bart wrote:
I've no idea. You just said it was uncommon to use C in this way. But
every other amateur compiler project on Reddit forums likes to use a
C target.
You didn't simply claim that people were using C as an intermediary
language - you claimed they were doing so specifically for languages
that defined things like type punning, wrapping signed integer
arithmetic, and messing about with pointers.
The broader picture is being forgotten. The thread is partly about C
being a 'portable assembler', and this is a common notion.
C is famous for being low level; being close to the hardware; for a 1:1 correspondence between types that people work with, and the operations
on those, with the equivalent assembly.
Whether that is correct or not, that is what people think or say, and
what many assume.
It is also what very many want, including me.
This particular use-case for C as an intermediate language is one
example, a good one as it highlights the issues. But I also want all
those assumptions to be true.
(In my systems language, it is a lot truer than in C. But my language supports a small number of targets, and usually one at a time.)
It is a problem when using C for this purpose, which wouldn't arise
using a language designed to be used as an intermediate target.
C is not designed for that purpose, nor are C compilers.
Yes, I know. There should have been one that is much better - a HLL, not
the monstrosity that is LLVM. But it doesn't exist.
If it did, then it could have served another purpose for which C is currently used and is not ideal either, which is to express APIs of libraries. Currently that is too C-centric and it is a big task to translate into bindings for other languages.
(For example, the headers for GTK2 include about 4000 C macro definitions.)
So if you don't like C here, don't use it. It is not the fault of C, its language designers, or toolchain implementers. And if this really were the problem you seem to think, people would use something else.
There /is/ nothing else. C is the best of a bad bunch.
One option was to use a union type for each stack element, but I
decided my choice would give cleaner code.
Oh, right - you knew of a correct solution, but decided instead that
something broken would be cleaner.
Well, it shouldn't BE broken! That's the problem with C.
So you think UB is better than doing things correctly, and then you
complain when C doesn't have the semantics you want?
I'm saying that a lot of things shouldn't be UB. Some people just want to write assembly - for a specific machine - but also want HLL conveniences.
So this is something else. There should be no problem with using uninitialised data here, other than not being meaningful or useful.
Again - you are pretending that C means what you think it should mean.
Using uninitialised local data leads, in most cases, to UB in C. If your language treats uninitialised data as unspecified values, or default initialised (typically to 0), or has some other determined behaviour, then you need to implement that behaviour in the generated C code. You don't get to generate C and pretend it means something different.
In a real application, using uninitialised data, outside of .bss, would be uncommon, and likely be a bug.
But outside of a real application, such as in fragments of test code that I work on every day, variables can be uninitialised, especially if I'm interested more in the code that is being generated and will not actually run it.
It seems that a C compiler cannot make that distinction and must always assume that every program, even in development, is mission-critical.
In my original example, they weren't initialised in order to keep the
posted examples short.
I still wouldn't call A + B undefined behaviour when A/B are not initialised; the result is the sum of whatever A and B happen to
contain, and is little different from:
    A = rand();
    B = rand();
    A + B;
On 18/04/2026 17:08, Bart wrote:
(Yes, LLVM and the tools around it are big. It takes a lot of effort to make use of them, but you get a lot in return. A "little language" has to grow to a certain size in numbers of toolchain developers and numbers of toolchain users before it can make sense to move to LLVM. But doing so is still a fraction of the work compared to making a serious optimising back-end for multiple targets.)
If it did, then it could have served another purpose for which C is
currently used and is not ideal either, which is to express APIs of
libraries. Currently that is too C-centric and it is a big task to translate into bindings for other languages.
(For example, the headers for GTK2 include about 4000 C macro
definitions.)
And yet in practice C is good enough for almost all cases.
A C compiler expects code written in valid C. Compilers expect code to be run - I don't think that is unreasonable. And when I use a compiler to look at generated assembly for some C code (and I do that quite often), I am using C code that has a meaning if it were to be run.
On 19/04/2026 11:17, David Brown wrote:
On 18/04/2026 17:08, Bart wrote:
(Yes, LLVM and the tools around it are big. It takes a lot of effort to make use of them, but you get a lot in return. A "little language" has to grow to a certain size in numbers of toolchain developers and numbers of toolchain users before it can make sense to move to LLVM.
Actually lots of small projects use LLVM.
But probably people don't realise it is like installing the engine from
a container ship into your small family car.
A C compiler expects code written in valid C. Compilers expect code to be run - I don't think that is unreasonable.
What's not valid about 'a = b + c'?
And when I use a compiler to look at generated assembly for some C code (and I do that quite often), I am using C code that has a meaning if it were to be run.
I'm interested too, but if I compile this in godbolt:
 void F() {
     int a, b, c;
     a = b + c * 8;
 }
then all the C compilers I tried generated code at -O0 which kept those variables in memory.
What does the code look like when a/b/c are kept in registers? I've no idea, because as soon as you try -O1 and above, the whole expression is elided.
If you stick 'static' in front, then the whole function disappears. This
is not very useful when trying to compare code generation across
compilers and languages!
If I do something meaningful with 'a' to keep the expression alive, and initialise b and c, then the whole expression is reduced to a constant.
What do you have to do to see if the expression would be compiled to, for example, 'lea ra, [rb + rc*8]'?
Bart <bc@freeuk.com> writes:
On 18/04/2026 14:37, David Brown wrote:
On 17/04/2026 18:42, Bart wrote:
I've no idea. You just said it was uncommon to use C in this way. But every other amateur compiler project on Reddit forums likes to use a C target.
You didn't simply claim that people were using C as an intermediary language - you claimed they were doing so specifically for languages that defined things like type punning, wrapping signed integer arithmetic, and messing about with pointers.
The broader picture is being forgotten. The thread is partly about C
being a 'portable assembler', and this is a common notion.
It's a common wrong notion.
One person here recently claimed that C is a kind of assembly
language.
Yes, I know. There should have been one that is much better - a HLL,
not the monstrosity that is LLVM. But it doesn't exist.
Given your habit of inventing your own languages and writing your own compilers, I'm surprised you haven't defined your own intermediate
language, something like LLVM IR but suiting your purposes better.
You're complaining about a problem that *you* might be in a position
to address.
On 18/04/2026 17:08, Bart wrote:
The broader picture is being forgotten. The thread is partly about C
being a 'portable assembler', and this is a common notion.
It is a common misconception - and I believe we agree it is a
misconception.
C is famous for being low level; [...]
I describe C as being a relatively low level high-level language. [...]
On 19/04/2026 11:17, David Brown wrote:
On 18/04/2026 17:08, Bart wrote:
(Yes, LLVM and the tools around it are big. It takes a lot of effort to make use of them, but you get a lot in return. A "little language" has to grow to a certain size in numbers of toolchain developers and numbers of toolchain users before it can make sense to move to LLVM.
Actually lots of small projects use LLVM.
But probably people don't realise it is like installing the engine
from a container ship into your small family car.
But doing
so is still a fraction of the work compared to making a serious
optimising back-end for multiple targets.)
If it did, then it could have served another purpose for which C
is currently used and is not ideal either, which is to express
APIs of libraries. Currently that is too C-centric and it is a big
task to translate into bindings for other languages.
(For example, the headers for GTK2 include about 4000 C macro
definitions.)
And yet in practice C is good enough for almost all cases.
It is not even good enough. To get back to GTK2 (which I looked at
in detail some years back), compiling this program:
#include <gtk2.h>
involved processing over 1000 #includes, some 550 discrete headers,
330K lines of declarations, with a bunch of -I options to tell it the
dozen different folders it needs to go and look for those headers.
I was looking at reducing the whole thing to one file - a set of
bindings in my language for the functions, types etc that are exposed.
This file would have been 25Kloc in my language (including those 4000 macros; most would have been simple #defines, but many will have needed manual translation: macros can contain actual C code, not just declarations).
HOWEVER... if such an exercise works for my language, why can't it
work for C too? That is, reduce those 100s of header files and dozens
of folders into a single 25Kloc file, specific to your platform.
Think how much easier it would be to install, or employ, and how
much faster to /compile/!
So why doesn't this happen? The equivalent exercise for SDL2 would
reduce 50Kloc across 80 header files (at least these are in the same
folder) to one 3Kloc file.
A C compiler expects code written in valid C. Compilers expect code to be run - I don't think that is unreasonable.
What's not valid about 'a = b + c'?
And when I use a compiler to look at generated assembly for some C code (and I do that quite often), I am using C code that has a meaning if it were to be run.
I'm interested too, but if I compile this in godbolt:
void F() {
int a, b, c;
a = b + c * 8;
}
then all the C compilers I tried generated code at -O0 which kept
those variables in memory.
What does the code look like when a/b/c are kept in registers? I've no idea, because as soon as you try -O1 and above, the whole expression is elided.
If you stick 'static' in front, then the whole function disappears.
This is not very useful when trying to compare code generation across compilers and languages!
If I do something meaningful with 'a' to keep the expression alive,
and initialise b and c, then the whole expression is reduced to a
constant.
What do you have to do to see if the expression would be compiled to, for example, 'lea ra, [rb + rc*8]'?
On 19/04/2026 13:50, Bart wrote:
On 19/04/2026 11:17, David Brown wrote:
On 18/04/2026 17:08, Bart wrote:
(Yes, LLVM and the tools around it are big. It takes a lot of effort to make use of them, but you get a lot in return. A "little language" has to grow to a certain size in numbers of toolchain developers and numbers of toolchain users before it can make sense to move to LLVM.
Actually lots of small projects use LLVM.
But probably people don't realise it is like installing the engine
from a container ship into your small family car.
The strange thing about the software world is that it does not matter.
I do appreciate liking things to be small, simple and efficient.
Sometimes that is important - in my own work, it is very often important. But often it doesn't matter at all. There are other things more worthy of our time and effort.
Would it be better if the gcc toolchain installation for the cross
compiler I use were 1 MB of installation over 20 files, rather than
whatever it is now?
  void F() {
      int a, b, c;
      a = b + c * 8;
  }
So you want to know how the compiler deals with meaningless code. Why? Do you not know how to write meaningful code?
then all the C compilers I tried generated code at -O0 which kept
those variables in memory.
They are on the stack in memory, yes. You've asked for close to a direct and naïve translation, which gives no insight into what kind of code the compiler can generate and is harder to follow (because it's mostly moving things onto and off from the stack).
What does the code look like when a/b/c are kept in registers? I've no idea, because as soon as you try -O1 and above, the whole expression is elided.
If you stick 'static' in front, then the whole function disappears.
This is not very useful when trying to compare code generation across
compilers and languages!
If I do something meaningful with 'a' to keep the expression alive,
and initialise b and c, then the whole expression is reduced to a
constant.
What do you have to do to see if the expression would be compiled to, for example, 'lea ra, [rb + rc*8]'?
int f(int b, int c)
{
    int a;
    a = b + c * 8;
    return a;
}
If you don't want to use parameters and return values, I recommend declaring externally linked volatile variables and using them as the source and destination of your calculations:
volatile int xa;
volatile int xb;
volatile int xc;
void foo(void) {
    int a, b, c;
    b = xb;
    c = xc;
    a = b + c * 8;
    xa = a;
}
When you ask the compiler "give me an efficient implementation of this code" and the compiler can see that the code does nothing, it generates no code (or just a "ret"). This should not be a surprise. So you might need "tricks" to make the code mean something - access to volatile objects is one of them.
On Sun, 19 Apr 2026 12:50:04 +0100
Bart <bc@freeuk.com> wrote:
It is not even good enough. To get back to GTK2 (which I looked at
in detail some years back), compiling this program:
#include <gtk2.h>
involved processing over 1000 #includes, some 550 discrete headers,
330K lines of declarations, with a bunch of -I options to tell it the
dozen different folders it needs to go and look for those headers.
I was looking at reducing the whole thing to one file - a set of
bindings in my language for the functions, types etc that are exposed.
This file would have been 25Kloc in my language (including those 4000 macros; most would have been simple #defines, but many will have needed manual translation: macros can contain actual C code, not just declarations).
HOWEVER... if such an exercise works for my language, why can't it
work for C too? That is, reduce those 100s of header files and dozens
of folders into a single 25Kloc file, specific to your platform.
Think how much easier it would be to install, or employ, and how
much faster to /compile/!
It would be faster to compile. Probably, meaningfully faster for
compiling large GUI project from scratch with very slow compiler like
gcc. Probably, not meaningfully faster in other situations.
It would not be easier to install or employ unless one happens to be as stubborn as you are.
If I ever want to write code using GTK2 for hobby purpose, which is
extremely unlikely, then all I'd need to do is to type 'pacman -S mingw-w64-ucrt-x86_64-gtk2' at msys2 command prompt. That's all.
For somebody on Debian/Ubuntu it likely would be 'apt-get install
gtk2'. RHEL/Fedora, MSVC command prompt or Mac it would be some other
magic incantation. Except that for the latter two it's probably not
available at all, so even easier.
The point is - it's already so easy that you can't really make it any
easier, at best the same.
On 19/04/2026 13:17, David Brown wrote:
On 19/04/2026 13:50, Bart wrote:
On 19/04/2026 11:17, David Brown wrote:
On 18/04/2026 17:08, Bart wrote:
  void F() {
      int a, b, c;
      a = b + c * 8;
  }
So you want to know how the compiler deals with meaningless code.
Why? Do you not know how to write meaningful code?
I don't want the compiler deciding what's a meaningful program. The
intent here is clear:
* Allocate 3 local slots for int
* Add the contents of two of those, and store into the third.
That is the task. In terms of observable effects, there are at least two: the code that is generated, and the time it might take to execute.
There is also the code size, and the compilation time.
One of my favourite compilation benchmarks is this:
    void F() {
        int a, b=2, c=3, d=4;
        a = b + c * d;
        ....                    // repeat N times
        printf("%d\n", a);
    }
Here initialisation is used because otherwise it causes problems with interpreted languages, for example.
It is amazing how many language implementations have trouble with this, especially with bigger N such as 1000000. The bigger ones usually fare worse.
This program is not meaningful; it is simply a stress test. Two more observable effects are at what N it fails, and whether it crashes or fails gracefully.
then all the C compilers I tried generated code at -O0 which kept
those variables in memory.
They are on the stack in memory, yes. You've asked for close to a direct and naïve translation, which gives no insight into what kind of code the compiler can generate and is harder to follow (because it's mostly moving things onto and off from the stack).
It's easier to follow. Or it would be if the compiler were to generate decent assembly. gcc -O0 produces:
F:
        pushq   %rbp
        movq    %rsp, %rbp
        movl    -4(%rbp), %eax
        leal    0(,%rax,8), %edx
        movl    -8(%rbp), %eax
        addl    %edx, %eax
        movl    %eax, -12(%rbp)
        nop
        popq    %rbp
        ret
What does the code look like when a/b/c are kept in registers? I've no idea, because as soon as you try -O1 and above, the whole expression is elided.
If you stick 'static' in front, then the whole function disappears.
This is not very useful when trying to compare code generation across
compilers and languages!
If I do something meaningful with 'a' to keep the expression alive,
and initialise b and c, then the whole expression is reduced to a
constant.
What do you have to do to see if the expression would be compiled to, for example, 'lea ra, [rb + rc*8]'?
int f(int b, int c)
{
    int a;
    a = b + c * 8;
    return a;
}
If you don't want to use parameters and return values, I recommend declaring externally linked volatile variables and using them as the source and destination of your calculations:
volatile int xa;
volatile int xb;
volatile int xc;
void foo(void) {
    int a, b, c;
    b = xb;
    c = xc;
    a = b + c * 8;
    xa = a;
}
When you ask the compiler "give me an efficient implementation of this code" and the compiler can see that the code does nothing, it generates no code (or just a "ret"). This should not be a surprise. So you might need "tricks" to make the code mean something - access to volatile objects is one of them.
So, you have to spend time fooling the compiler. And then you are never quite sure if it has left something out so that you're not comparing
like with like.
However, this is a perfect example of how even a language and especially
its compilers differ from assembly and assemblers.
It can happen with my compilers too, but on a much smaller scale. For example 'a = 2 + 2' is reduced to 'a = 4'. But it is easier to get around.
On 19/04/2026 16:28, Bart wrote:
On 19/04/2026 13:17, David Brown wrote:
On 19/04/2026 13:50, Bart wrote:
On 19/04/2026 11:17, David Brown wrote:
On 18/04/2026 17:08, Bart wrote:
  void F() {
      int a, b, c;
      a = b + c * 8;
  }
So you want to know how the compiler deals with meaningless code.
Why? Do you not know how to write meaningful code?
I don't want the compiler deciding what's a meaningful program. The
intent here is clear:
No, the intent is not clear. If you are writing in C, and you intend the code to have a definite meaning, you have to write that meaning in C. Break C's rules, and the code does not have meaning as a whole - and compilers cannot be expected to guess what you meant, especially when you ask them to analyse your code carefully to generate optimised output.
* Allocate 3 local slots for int
* Add the contents of two of those, and store into the third.
That is not what you wrote - because that's not what the C means.
As you pointed out yourself, C is not assembly. It does not have a direct meaning like this.
Stress tests of tools can be useful. I would not say something like this is useful as a compilation benchmark - I want my tools to be fast enough for practical use on the real code I write, and don't care how slow they are for totally meaningless and unrealistic code. But if I were writing a tool, I'd like to know how well it handled extreme cases.
(Sometimes generated C code has functions with huge numbers of simple lines, totally unlike code that anyone would write by hand. It would not have pointless repetition of lines, however.)
So, you have to spend time fooling the compiler. And then you are
never quite sure if it has left something out so that you're not
comparing like with like.
Sorry, I thought I was being helpful so that you would understand how to get the results you are asking for from compilers. I am not "fooling" the compiler, I am showing you how to ask the right questions.
On 19/04/2026 16:47, David Brown wrote:
On 19/04/2026 16:28, Bart wrote:
On 19/04/2026 13:17, David Brown wrote:
On 19/04/2026 13:50, Bart wrote:
On 19/04/2026 11:17, David Brown wrote:
On 18/04/2026 17:08, Bart wrote:
  void F() {
      int a, b, c;
      a = b + c * 8;
  }
So you want to know how the compiler deals with meaningless code.
Why? Do you not know how to write meaningful code?
I don't want the compiler deciding what's a meaningful program. The
intent here is clear:
No, the intent is not clear. If you are writing in C, and you intend the code to have a definite meaning, you have to write that meaning in C. Break C's rules, and the code does not have meaning as a whole - and compilers cannot be expected to guess what you meant, especially when you ask them to analyse your code carefully to generate optimised output.
* Allocate 3 local slots for int
* Add the contents of two of those, and store into the third.
That is not what you wrote - because that's not what the C means.
I forgot the scaling of 'c'.
As you pointed out yourself, C is not assembly. It does not have a direct meaning like this.
I don't understand what else it can possibly mean. Get the value of 'b', whatever it happens to be, add the value of 'c' scaled by 8, and store the result into 'a'. The only things to consider are that some intermediate results may lose the top bits.
Is 'a = b' equally undefined? If so then C is even crazier than I'd thought.
Stress tests of tools can be useful. I would not say something like this is useful as a compilation benchmark - I want my tools to be fast enough for practical use on the real code I write, and don't care how slow they are for totally meaningless and unrealistic code.
Meaningless and unrealistic are what stress tests and benchmarks are!
But they can also give useful insights, highlight shortcomings, and can
be used to compare implementations.
I think if I used a real program such as sqlite3.c, you still wouldn't
care about my results.
But if I were writing a tool, I'd like to know how well it handled extreme cases. (Sometimes generated C code has functions with huge numbers of simple lines, totally unlike code that anyone would write by hand. It would not have pointless repetition of lines, however.)
Then it becomes much, much harder to have a simple test that can be used for practically any language.
As a matter of interest, I tried 1 million lines of 'a=b+c*d' now. These
are some results:
  gcc -O0      560 seconds
  Tiny C       1.7 seconds
  bcc          2.0 seconds
  mm           1.9 seconds (non-C); both these run unoptimised code
gcc likely uses some sort of SSA representation, meaning a new variable
for each intermediate result. Here it probably needs 5 million intermediates.
(From memory, it is faster when using optimise flags, because it can eliminate 99.9999% of those assignments, so there is less code to
process later. You know when that happens as the resulting EXE is too
small to be feasible.)
Here's an interesting one:
  void F(){
  L1:
  L2:
  ...
  L1000000:
  ;
  }
Both bcc and tcc crash instantly, because in C, labels have a recursive definition in the grammar, and this causes a stack overflow.
But I can't tell you what gcc does, as I aborted it after 5 minutes.
(In my language, labels are just another statement, and this compiled in
1.5 seconds. However I had to increase the hashtable size as it doesn't
grow as needed.)
So, you have to spend time fooling the compiler. And then you are
never quite sure if it has left something out so that you're not
comparing like with like.
Sorry, I thought I was being helpful so that you would understand how to get the results you are asking for from compilers. I am not "fooling" the compiler, I am showing you how to ask the right questions.
The question was already posed as I wanted in my original fragment.
On 19/04/2026 01:35, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 18/04/2026 14:37, David Brown wrote:
On 17/04/2026 18:42, Bart wrote:
I've no idea. You just said it was uncommon to use C in this way. But every other amateur compiler project on Reddit forums likes to use a C target.
You didn't simply claim that people were using C as an intermediary language - you claimed they were doing so specifically for languages that defined things like type punning, wrapping signed integer arithmetic, and messing about with pointers.
The broader picture is being forgotten. The thread is partly about C being a 'portable assembler', and this is a common notion.
It's a common wrong notion.
One person here recently claimed that C is a kind of assembly language.
'C being portable assembly' keeps coming up, not just here.
Yes, I know. There should have been one that is much better - a HLL, not the monstrosity that is LLVM. But it doesn't exist.
Given your habit of inventing your own languages and writing your own compilers, I'm surprised you haven't defined your own intermediate language, something like LLVM IR but suiting your purposes better. You're complaining about a problem that *you* might be in a position to address.
If you read my post again, you'll see that I did exactly that.
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and using
its value is UB - the code has no meaning right out of the gate.
When you use "b" in an expression, you are /not/ asking C to read the
bits and bytes stored at the address of the object "b". You are asking
for the /value/ of the object "b". How the compiler gets that value is
up to the compiler - it can read the memory, or use a stored copy in a
register, or use program analysis to know what the value is in some
other way. And if the object "b" does not have a value, you are asking
the impossible.
Try asking a human "You have two numbers, b and c. Add them. What is
the answer?".
whatever it happens to be, add the value of 'c' scaled by 8, and store
the result into 'a'. The only things to consider are that some
intermediate results may lose the top bits.
Is 'a = b' equally undefined? If so, then C is even crazier than I'd
thought.
If "a" or "b" are indeterminate, then using them is undefined. I have
two things - are they the same colour? How is that supposed to make sense?
You keep thinking of objects like "b" as a section of memory with a bit
pattern in it. Objects are not that simple in C - C is not assembly.
You mean when the object code is small because the compiler did a good job?
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and using
its value is UB - the code has no meaning right out of the gate.
When you use "b" in an expression, you are /not/ asking C to read the
bits and bytes stored at the address of the object "b". You are
asking for the /value/ of the object "b". How the compiler gets that
value is up to the compiler - it can read the memory, or use a stored
copy in a register, or use program analysis to know what the value is
in some other way. And if the object "b" does not have a value, you
are asking the impossible.
Try asking a human "You have two numbers, b and c. Add them. What is
the answer?".
You have two slates A and B which someone should have wiped clean then written a new number on each.
But that part hasn't been done; they each still have an old number from their last use.
You can still add them together, nothing bad will happen. It just may be
the wrong answer if the purpose of the exercise was to find the sum of
two specific new numbers.
But the purpose may also be to see how good they are at adding. Or at
following instructions.
whatever it happens to be, add the value of 'c' scaled by 8, and
store the result into 'a'. The only things to consider are that
some intermediate results may lose the top bits.
Is 'a = b' equally undefined? If so, then C is even crazier than I'd
thought.
If "a" or "b" are indeterminate, then using them is undefined. I have
two things - are they the same colour? How is that supposed to make
sense?
You keep thinking of objects like "b" as a section of memory with a
bit pattern in it. Objects are not that simple in C - C is not assembly.
Why ISN'T it that simple? What ghastly thing would happen if it was?
"b" will be some location in memory or it might be some register, and it WILL have a value. That value happens to be unknown until it is
initialised.
So accessing it will return garbage (unless you know exactly what you
are doing then it may be something useful).
My original example was something like 'a = b + c' (I think in my
language), converted to my IL, then expressed in very low-level C.
You were concerned that in that C, the values weren't initialised. How
would that have affected the code that C compiler generated from that?
It's starting to appear that the compiler is more of the problem!
Because mine would certainly not be bothered by it and nobody would be
scratching their heads wondering what surprises the compiler might have
in store.
Would the compiler have been happier with this:
    int a, b = F(), c = F();
    a = b + c;
If so, then suppose F was this:
ÿÿÿ int F() {int x; return x;}
When the body of F is not visible, then that cannot possibly affect what
is generated for 'a = b + c'.
So I'm still interested in what possible reason the compiler might have
for generating code that is any different in the absence of
initialisation. Warn about it, sure, but why do anything else?
THIS is why I try to stay away from using C intermediate code.
You mean when the object code is small because the compiler did a good
job?
It may do a good job of eliminating duplicate or redundant code. But
maybe you are measuring how well it copes with a certain quantity of
code, which when synthesised may well be duplicate or redundant.
Then it is not helpful that it discards most of it. How is that supposed
to give an accurate measure of how well it does when it really does need
to do it all?
It's like comparing car A and car B over a course (we've been here
before), but A's driver is using clever shortcuts. Or maybe he doesn't
even bother going anywhere if the course is circular.
That will give A an unfair advantage, and a misleading result. It could
be that B is actually faster, so somebody deciding to buy A based on
this test is going to be disappointed!
On 20/04/2026 01:36, Bart wrote:
In C, "b" is not any specific place. In optimising compilers, the
implementation is unlikely to exist at all until it has a real value,
C programmers don't need to scratch their heads. They simply have to
write meaningful code. It's not rocket science.
(As a C implementer, you should have a better understanding of these
details than C programmers usually need.)
You are arguing that C is difficult to use as an intermediate language
because you don't know what happens when you generate shite sort-of C
code? Just generate valid C code that has meaning, and stop worrying.
So gcc is a more powerful compiler than yours, and that's not fair?
On 20/04/2026 07:25, David Brown wrote:
On 20/04/2026 01:36, Bart wrote:
In C, "b" is not any specific place. In optimising compilers, the
implementation is unlikely to exist at all until it has a real value,
It should come into existence when it is referenced. Then it will have a value.
Here for example:
    int b;
    if (rand()&1) b=0;
    printf("%d", b);
'b' may or may not be initialised. But I expect the 'b' used in that
last line to exist somewhere and for the generated code to access that location. I'd also expect the same if the assignment was commented out.
Someone could write some actual code like my example, with an
unconditional assignment, but for various reasons has to temporarily
comment out that assignment.
It might be a function that is not called. Or it might be in a program
that is not run at all, because the developer is sorting out some build issue.
But according to you, that part of the code is UB, whether the program
is ever run or not, and so the whole thing is undefined.
That would be ludicrous.
C programmers don't need to scratch their heads. They simply have to
write meaningful code. It's not rocket science.
(As a C implementer, you should have a better understanding of these
details than C programmers usually need.)
I implement it in a common sense manner.
I don't say, Ah, 'x' might not
be initialised at this point, so it is UB, therefore I don't need to
bother compiling these remaining 100 lines, then the program will be
smaller and faster!
You are arguing that C is difficult to use as an intermediate language
because you don't know what happens when you generate shite sort-of C
code? Just generate valid C code that has meaning, and stop worrying.
My language allows you to do this:
   int a, b
   a := b
It is well-defined in the language, and I know it is well defined on all
my likely targets. (I think we're back where we started!)
However, we are generating C via an IL. The IL will be something like this:
   local a
   local b
   ...
   load b
   store a
Again, it is perfectly well-defined. Whatever bit-pattern in b is
transfered to a. In assembly, the same thing: b will be in memory or register.
All well and good. UNTIL we decided to involve C! Let's say everything
has u64 type:
   u64 R1;                  # represents the one stack slot used
   u64 a, b;                # our local variables
Now we need to translate that load and store:
   R1 = b;
   a = R1;
This looks really easy, but no, C just has to make it UB.
So, how do you suggest this is fixed? Do I now have to do an in-depth analysis of that IL to figure out whether 'b' was initialised at this
point (there might be 100 lines of IL code in-between, including
conditional code and loops). Even if I find out it wasn't, what do I do about it?
Maybe a simpler solution: zero all locals whether necessary or not:
  u64 a = 0, b = 0;
However, the point of using C may be to get a faster program. I don't
want unnecessary assignments which the C compiler may or may not be able
to elide.
Especially when declaring entire arrays or structs:
  struct $B1 dd = {0};
  struct $B1 ee = {0};
In any case, there will be a million other things that are probably UB
as well. How far do you go in trying to appease C?
So, C as an intermediate is undesirable, but the real reason is the compilers. A simple, dumb compiler like bcc or tcc is preferable (but
mine doesn't optimise and is for Windows, and tcc has its own problems).
This is why we needed C--.
So gcc is a more powerful compiler than yours, and that's not fair?
If it is effectively cheating at benchmarks, then no it isn't.
If you take recursive Fibonacci, then fib(N) is expected to execute
about 2*fib(N) function calls (for versions that start 'if (N<3) return
1').
However, using -Os or -O1, gcc's code only does half of those calls. And
with -O2/-O3, only 5%, via aggressive inlining and TCO (tail-call
optimisation).
So, no, that's not a fair comparison.
Fibonacci is supposed to be a
measure of how many calls/second a language implementation can make, and
the figure you'd get with gcc can be misleading.
(It might as well use memoisation, or be clever enough to convert it to
iterative form, then it can report an infinite number of calls per
second. That's really useful!
So, for gcc and Fibonacci, I now use -fno-inline and another to turn off TCO.)
On 20/04/2026 13:45, Bart wrote:
I implement it in a common sense manner.
"Common sense" is another way of saying "I don't know the actual rules".
It is a shame if we are back where we started - because you started out
wrong. You started out treating C like assembly, and you haven't shown
you understand the difference.
The semantics of your language are important to you - but not to C. The
semantics of whatever targets you use are important to the back-end of
the C compiler, but not to the C language or its semantics.
If it is effectively cheating at benchmarks, then no it isn't.
Again - you are asking the wrong questions in your benchmarks.
You /think/ you are asking the car to drive round a loop. But what you
are writing is asking the car to go from A to B. And then you complain
when gcc figures out it can drive directly from A to B without going
through the loop.
If you want to benchmark a compiler going through the whole path, write
that in the code. Force observable behaviour at the start (with a
volatile access), use code lines that depend on that input and previous
lines, and observe the behaviour at the end (with a volatile write, or a
printf, or something else /real/).
int fibonacci(int n)
{
    if (n <= 2) return 1;
    return fibonacci(n - 1) + fibonacci(n - 2);
}
No, I don't expect the generated code to have 2 * fib(n) recursive
calls. I expect the code to give the same results as if it had made
those calls.
If a compiler can optimise in such a way as to reduce the number of
calls, that's great.
So, for gcc and Fibonacci, I now use -fno-inline and another to turn
off TCO.)
And does that give you any kind of information that is useful for any
purpose? I suspect not.
On 20/04/2026 14:02, David Brown wrote:
On 20/04/2026 13:45, Bart wrote:
I implement it in a common sense manner.
"Common sense" is another way of saying "I don't know the actual rules".
It means doing the obvious thing with no unexpected surprises.
It is a shame if we are back where we started - because you started
out wrong. You started out treating C like assembly, and you haven't
shown you understand the difference.
So why should I listen to you, and why should I care?
Bart <bc@freeuk.com> wrote:
On 19/04/2026 11:17, David Brown wrote:
On 18/04/2026 17:08, Bart wrote:
(Yes, LLVM and the tools around it are big. It takes a lot of effort to
make use of them, but you get a lot in return. A "little language" has
to grow to a certain size in numbers of toolchain developers and numbers
of toolchain users before it can make sense to move to LLVM.
Actually lots of small projects use LLVM.
But probably people don't realise it is like installing the engine from
a container ship into your small family car.
AFAICS people are proud of using a powerful engine and tend to ignore the disadvantages.
In a non-C context, a co-worker on a project uses a
"standard" documentation tool to generate tens of megabytes of
HTML documentation. This needs something like 2 min 30 seconds,
and tens of megabytes of extra packages. I wrote a few hundred lines
of code to do almost the same thing but directly. The amount of
specialized code is similar to what is needed to interface with the
external package. My code does the job in about 1.5 sec.
His reaction was essentially: "Why are you wasting time when the
code works". Actually, there were differing assumptions: he
assumed that code will be run once a few months, so performance
would not matter at all. I would run the code as part of build
and test cycle and 2 min 30 seconds per cycle matters a lot.
The external package has a lot of features, for example it
supports tens (maybe hundreds) of color schemes. But we need only
one color scheme.
Anyway, people believe that by using a major "standard" package
they will somehow get superior features.
What's not valid about 'a = b + c'?
It is incomplete. Why do you not use the equivalent function:
int
add(int b, int c) {
return b + c;
}
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and using
its value is UB - the code has no meaning right out of the gate.
When you use "b" in an expression, you are /not/ asking C to read the
bits and bytes stored at the address of the object "b". You are asking
for the /value/ of the object "b". How the compiler gets that value is
up to the compiler - it can read the memory, or use a stored copy in a
register, or use program analysis to know what the value is in some
other way. And if the object "b" does not have a value, you are asking
the impossible.
Try asking a human "You have two numbers, b and c. Add them. What is
the answer?".
You have two slates A and B which someone should have wiped clean then
written a new number on each.
But that part hasn't been done; they each still have an old number from
their last use.
You can still add them together, nothing bad will happen. It just may be
the wrong answer if the purpose of the exercise was to find the sum of
two specific new numbers.
But the purpose may also be to see how good they are at adding. Or at
following instructions.
whatever it happens to be, add the value of 'c' scaled by 8, and store
the result into 'a'. The only things to consider are that some
intermediate results may lose the top bits.
Is 'a = b' equally undefined? If so, then C is even crazier than I'd
thought.
If "a" or "b" are indeterminate, then using them is undefined. I have
two things - are they the same colour? How is that supposed to make sense?
You keep thinking of objects like "b" as a section of memory with a bit
pattern in it. Objects are not that simple in C - C is not assembly.
Why ISN'T it that simple? What ghastly thing would happen if it was?
"b" will be some location in memory or it might be some register, and it
WILL have a value. That value happens to be unknown until it is initialised.
So accessing it will return garbage (unless you know exactly what you
are doing then it may be something useful).
My original example was something like 'a = b + c' (I think in my
language), converted to my IL, then expressed in very low-level C.
You were concerned that in that C, the values weren't initialised. How
would that have affected the code that C compiler generated from that?
You look at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make an exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
Namely, consider a machine where one bit pattern is illegal
and causes an exception at runtime when read from memory by an
integer load. The compiler could "initialize" all otherwise
uninitialized variables with this bit pattern. So accessing an
uninitialised integer variable would cause a runtime exception.
If you look at more complex examples you may see why the rule
allows more efficient code on ordinary machines. Namely,
look at:
void
f() {
bool b;
printf("b is ");
if (b) {
printf("true\n");
}
if (!b) {
printf("false\n");
}
}
A compiler could contain a function called 'known_false' and omit
code for a conditional statement if the condition (in our case 'b')
is known to be false. How could the compiler know this? The simplest
case is when the condition is a constant. But that is the trivial case.
More interesting cases are when some earlier statement assigns a
constant value to 'b'. But a function may contain "interesting"
control flow, so determining which assignments are executed is tricky.
Instead, the compiler probably would use some kind of approximation,
tracking possible values at different program points. Now,
according to your point of view, an uninitialized variable would
mean "any value is possible". According to C rules an uninitialized
variable can not occur in a correct program, which means that
there must be an assignment later, and when analyzing possible values
at the current statement the answer is "no value". In the function
above, consistently propagating information according to your
rules means that in the conditional 'b' can take any value, so the
compiler must emit the code. Using C rules, 'b' has no
value, so it can not be true and the compiler can delete the conditional
(and the same for the conditional involving '!b').
This example
is still pretty simple, so you may think that your rules are
superior.
But imagine that between the declaration of 'b' and the
conditional there is some hairy code. This code initializes
'b' to false, but only if some conditions are satisfied.
Now, consider a situation where in fact 'b' is always initialized,
but the compiler is too limited to see this. Under C rules the
compiler will assume that 'b' is initialized and conclude
that it is false, allowing it to delete the conditional.
Under your rules the compiler would have to consider the possibility
that 'b' is uninitialized and keep the conditional.
So I'm still interested in what possible reason the compiler might have
for generating code that is any different in the absence of
initialisation. Warn about it, sure, but why do anything else?
As explained, under C rules the compiler can generate more efficient
code.
It may do a good job of eliminating duplicate or redundant code. But
maybe you are measuring how well it copes with a certain quantity of
code, which when synthesised may well be duplicate or redundant.
I very much want my compiler backend to eliminate duplicate or
redundant code inserted by the front end.
Then it is not helpful that it discards most of it.
If you use a C compiler as a backend it is quite helpful.
On 20/04/2026 07:25, David Brown wrote:
On 20/04/2026 01:36, Bart wrote:
In C, "b" is not any specific place. In optimising compilers, the
implementation is unlikely to exist at all until it has a real
value,
It should come into existence when it is referenced. Then it will have
a value.
Here for example:
int b;
if (rand()&1) b=0;
printf("%d", b);
'b' may or may not be initialised. But I expect the 'b' used in that
last line to exist somewhere and for the generated code to access that location. I'd also expect the same if the assignment was commented
out.
So gcc is a more powerful compiler than yours, and that's not fair?
If it is effectively cheating at benchmarks, then no it isn't.
If you take recursive Fibonacci, then fib(N) is expected to execute
about 2*fib(N) function calls (for versions that start 'if (N<3)
return 1').
However, using -Os or -O1, gcc's code only does half of those
calls. And with -O2/-O3, only 5%, via aggressive inlining and TCO
(tail-call optimisation).
So, no, that's not a fair comparison. Fibonacci is supposed to be a
measure of how many calls/second a language implementation can make,
and the figure you'd get with gcc can be misleading.
(It might as well use memoisation, or be clever enough to convert it to
iterative form, then it can report an infinite number of calls per
second. That's really useful!
So, for gcc and Fibonacci, I now use -fno-inline and another to turn
off TCO.)
So why should I listen to you, and why should I care?
Bart <bc@freeuk.com> writes:
[...]
So why should I listen to you, and why should I care?
I don't know, why should you?
You obviously care a great deal, or you wouldn't spend so much
time arguing.
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So, for gcc and Fibonacci, I now use -fno-inline and another to turn
off TCO.)
If I write this program:
#include <stdio.h>
int fib(int n) {
if (n <= 1) {
return 1;
}
else {
return fib(n-2) + fib(n-1);
}
}
int main(void) {
printf("%d\n", fib(10));
}
the implementation's job is to generate code that prints "89".
If it's able to do so by replacing the whole thing with `puts("89");`
*that's a good thing*. That's not cheating. That's good code
generation.
If you want to write a benchmark that avoids certain optimizations,
you need to write it carefully so you get the code you want.
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls? And
how many calls were actually made?
the implementation's job is to generate code that prints "89".
In that case, why bother using very slow recursive Fibonacci?
"(Why would recursive Fibonacci even ever be used as a benchmark when
the iterative method would be much faster in every case?)"
I will give you the answer: it is to compare how implementations cope
with very large numbers of recursive function calls. So if one finds a
way to avoid doing such calls, then it is not a fair comparison.
Nobody is interested in the actual output, other than to check it worked
correctly; what matters is how long it took.
On 20/04/2026 14:49, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
You look at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make an exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
I don't care about exotic hardware. I don't see why its needs should
impact the 99.99% (if not 100%) of actual hardware that people use.
It ought to have made more things implementation defined.
Namely, consider a machine where one bit pattern is illegal
and causes an exception at runtime when read from memory by an
integer load. The compiler could "initialize" all otherwise
uninitialized variables with this bit pattern. So accessing an
uninitialised integer variable would cause a runtime exception.
I acknowledge this somewhere, for the case of floating point numbers. A
poor implementation may have problems. But in the case of XMM registers
on x64, they seem to tolerate arbitrary bit patterns used in floating
point operations.
At worst you end up with a NaN result or something.
And obviously, it is inadvisable to dereference an unknown pointer value.
But you can give all this advice, issue warnings etc, and still not
seize upon such UB as an excuse to invalidate the rest of the program or
for a compiler to choose to do whatever it likes.
If you look at more complex examples you may see why the rule
allows more efficient code on ordinary machines. Namely,
look at:
void
f() {
    bool b;
    printf("b is ");
    if (b) {
        printf("true\n");
    }
    if (!b) {
        printf("false\n");
    }
}
A compiler could contain a function called 'known_false' and omit
code for a conditional statement if the condition (in our case 'b')
is known to be false. How could the compiler know this? The simplest
case is when the condition is a constant. But that is the trivial case.
More interesting cases are when some earlier statement assigns a
constant value to 'b'. But a function may contain "interesting"
control flow, so determining which assignments are executed is tricky.
Instead, the compiler probably would use some kind of approximation,
tracking possible values at different program points. Now,
according to your point of view, an uninitialized variable would
mean "any value is possible". According to C rules an uninitialized
variable can not occur in a correct program, which means that
there must be an assignment later, and when analyzing possible values
at the current statement the answer is "no value". In the function
above, consistently propagating information according to your
rules means that in the conditional 'b' can take any value, so the
compiler must emit the code. Using C rules, 'b' has no
value, so it can not be true and the compiler can delete the conditional
(and the same for the conditional involving '!b').
If I apply gcc-O2 to your example, it prints that b is false without
actually testing the value. If I get it to return the value of b, it
returns a hard-coded zero.
This example
is still pretty simple, so you may think that your rules are
superior.
They're certainly simpler. I can't predict what gcc will do. And
whatever it does, can differ depending on options.
It comes down to the user's intention: was the non-initialisation an oversight? Did they know that only one of those conditionals can be true?
My compilers don't try and double-guess the user: they will simply do
what is requested.
So I'm still interested in what possible reason the compiler might have
for generating code that is any different in the absence of
initialisation. Warn about it, sure, but why do anything else?
As explained, under C rules compiler can generate more efficient
code.
A lot of it seems to be for dodgy-looking code. I tend to rely on, and
assume, sensibly written programs. That seems to go a long way!
It may do a good job of eliminating duplicate or redundant code. But
maybe you are measuring how well it copes with a certain quantity of
code, which when synthesised may well be duplicate or redundant.
I very much want my compiler backend to eliminate duplicate or
redundant code inserted by front end.
Then it is not helpful that it discards most of it.
If you use a C compiler as a backend it is quite helpful.
My current transpiled C is full of redundant intermediates like your
example, and such optimising is necessary to get reasonable size and speed.
On 20/04/2026 18:50, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
So why should I listen to you, and why should I care?
I don't know, why should you?
You obviously care a great deal, or you wouldn't spend so much
time arguing.
I first posted this to show how casts are extensively used in my
generated C:
i64 a;
i64 b;
i64 c;
asi64(R1) = b;
asi64(R2) = c;
asi64(R1) += asi64(R2);
a = asi64(R1);
This was generated from this fragment HLL code: "a := b + c". There is
no initialisation because that is rarely done when testing compiler code-generation. Examples are kept as simple as possible, and
initialisation would have absolutely no bearing on the matter.
But somebody said this was UB. Now even though uninitialised variables
are not used in my production programs (AFAIK), I disagreed about this matter.
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
So, for gcc and Fibonacci, I now use -fno-inline and another to turn
off TCO.)
If I write this program:
#include <stdio.h>
int fib(int n) {
if (n <= 1) {
return 1;
}
else {
return fib(n-2) + fib(n-1);
}
}
int main(void) {
printf("%d\n", fib(10));
}
the implementation's job is to generate code that prints "89".
In that case, why bother using very slow recursive Fibonacci?
Presumably the expectation is that it would actually be using recursion.
I already posed this question:
"(Why would recursive Fibonacci even ever be used as a benchmark when
the iterative method would be much faster in every case?)"
I will give you the answer: it is to compare how implementations cope
with very large numbers of recursive function calls. So if one finds a
way to avoid doing such calls, then it is not a fair comparison.
If you want to write a benchmark that avoids certain optimizations,
you need to write it carefully so you get the code you want.
It's not possible to do that with Fibonacci without making it
unrecognisable and so a poor comparison for other reasons.
If testing with gcc now, I'd use these two options:
-fno-inline
-fno-optimize-sibling-calls
On my PC, gcc-O2 code then manages some 560M calls/second running
Fibonacci, rather than a misleading 1270M calls/second.
See also:
https://github.com/drujensen/fib/issues/119
Referenced from: https://github.com/drujensen/fib
On 20/04/2026 19:34, Bart wrote:
And obviously, it is inadvisable to dereference a unknown pointer value.
Okay, so you think it is "obvious" that you should avoid doing some
things that are explicitly UB, and yet you think it is "obvious" that
you should be able to do other types of UB. Who makes up those
"obvious" rules? Why do you think such inconsistency is a good idea?
No, your rules are far from simple - you have internal ideas about what
kinds of UB you think should produce certain results, and which should
not, and how compilers should interpret things that have no meaning in
C. That's not simple.
I can predict what gcc will do,
The compiler can quite reasonably generate all sorts of different code
here. A different version, or a different compiler, or on a different
day, you could get different results. That's life when you use UB.
What else could the non-initialisation have been other than an oversight
- a bug in their code due to ignorance, or just making a mistake as we
all do occasionally? Do you think it is likely that someone
intentionally and knowingly wrote incorrect code?
My compilers don't try and double-guess the user: they will simply do
what is requested.
No, guessing the user's intentions is /exactly/ what your compiler is
trying to do. It is trying to guess what the programmer meant even
though the programmer made a mistake and wrote something that does not
make sense.
A good
compiler will work with sensibly written programs, yet you insist on
writing C that is not sensibly written.
Oh, so you want gcc to optimise away redundant code when your transpiler generates redundant code, but it is "cheating" if it optimises away redundant code in bizarre tests because your C compiler can't do that?
Actually _EVERYBODY_ is interested in the actual output, and NOBODY is interested in how long it took.
The 5 people in the world who think in terms of random irrelevant
benchmarks are the only people who would even think to care.
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:50, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
So why should I listen to you, and why should I care?
I don't know, why should you?
You obviously care a great deal, or you wouldn't spend so much
time arguing.
I first posted this to show how casts are extensively used in my
generated C:
i64 a;
i64 b;
i64 c;
asi64(R1) = b;
asi64(R2) = c;
asi64(R1) += asi64(R2);
a = asi64(R1);
This was generated from this fragment HLL code: "a := b + c". There is
no initialisation because that is rarely done when testing compiler
code-generation. Examples are kept as simple as possible, and
initialisation would have absolutely no bearing on the matter.
But somebody said this was UB. Now even though uninitialised variables
are not used in my production programs (AFAIK), I disagreed about this
matter.
The point is not that "somebody said" that this was UB.
I'm aware of your opinions about this, but will you acknowledge that
the standard actually says what it says? I'm not asking whether
you think the behavior should be undefined. I'm asking whether
you'll acknowledge that the ISO C standard says it's undefined.
Yes or no.
On Thu, 2026-04-16 at 22:14 +0800, wij wrote:
On Thu, 2026-04-16 at 18:42 +0800, wij wrote:
On Wed, 2026-04-15 at 19:04 -0700, Keith Thompson wrote:
wij <wyniijj5@gmail.com> writes:
On Wed, 2026-04-15 at 17:14 -0700, Tim Rentsch wrote:
wij <wyniijj5@gmail.com> writes:
[... comparing C and assembly language ...]
Gentlemen,
I understand the natural reaction to want to respond to the kind of
statements being made in this thread. I hope y'all can resist this
natural reaction and not respond to people who persist in making
arguments that are basically isomorphic to saying 1 equals 0.
Thank you for your assistance in this matter.
Maybe you are right. I say A is-a B; one persists in reading it as A is
(exactly) B. I provide help in using assembly; one persists in reading
it as me persuading people to use assembly and give up HLLs. What is
going on here?
You say that C is an assembly language. Nobody here thinks that
you're *equating* C and assembly language. It's obvious that
there are plenty of assembly languages that are not C, and nobody
has said otherwise. I have no idea why you think anyone has that
particular confusion.
At least one person has apparently interpreted your defense of
assembly language (that it isn't as scary as some think it is)
as a claim that we should program in assembly language rather
than in HLLs. You're right, that was a misinterpretation of what
you wrote. I considered mentioning that, but didn't bother.
The issue I've been discussing is your claim that C is an assembly
language. It is not.
If I said C is assembly, it is in the sense that I have at least shown
in the last post (s_tut2.cpp), where even an 'instruction' can be any
function (e.g. change directory, copy files, launch an editor, ...).
And also, what 'computation' is is demonstrated, which includes a
suggestion of what C is (essentially, any program) and in this sense
what an HLL is. Finally, it could demonstrate the meaning of, and
testify to, the Church-Turing thesis (my words: no computation
language, including various kinds of math formulas, can exceed the
expressive power of a TM).
It seems you insist C and assembly have to be exactly what your bible
says. If so, I would say that what the C standard (I cannot read it)
says is the meaning of the terminology in it, not intended to be
anything used in any other situation.
I do not intend to post again in this thread until and unless you
post something substantive on that issue.
(continue)
IMO, the C standard is like a book of legal terms. Like many symbols in
the header file, it defines one symbol in terms of another symbol. The
real meaning is not fixed. The result is that you cannot 'prove' the
correctness of the source program; even consistency is a problem.
'Instruction' is low-level? Yes, by definition, but not as one might
think. An instruction could refer to a processing unit (perhaps like
the x87 math co-processor, which may even be higher level, processing
expressions, ...). A good chance for C is to find a good function that
can be hardwired.
So, the basic feature of an HLL is 'structured' (or 'nested') text,
which removes labels. Semantics is the inventor's imagination. So,
avoid bizarre complexity; it won't add expressive power to the
language, it is just a matter of short or lengthy expression of a
programming idea.
(Continue)
Thus, C is-a language for controlling hardware. Therefore, the term
'portable assembly' seems to fit this meaning. But on the other side, C
needs to be user friendly. Skipping the friendly part, I think there
should be more: C could be the foundation of a formal system
(particularly for academic uses). For example:
Case 1: "Σ(n=1,m) f(n)" should be defined as:
    sum=0;
    for(int n=1; n<=m; ++n) {
      sum+=f(n);
    }
    By doing so, it is easier to deduce things from nested series.
Case 2: What if m=∞ (infinity)?
    for(int n=1; ; ++n) {
      sum+=f(n);
    }
    The infinity case has no consensus. At least, it demonstrates that
    0.999... simply refers to an infinite loop. This leads to the long
    debate of 0.999...=? (0.999... will not terminate BY DEFINITION; no
    finite proof can prove it equals anything unless you define it)...
    And what INF, INFINITY should be in C?
Case 3: Proposition ∀x,P(x) ::= P(x1)∧P(x2)∧..∧P(xn) (x∈{x1,x2,..})
    bool f() {  // f() = "∀x,P(x)"
      for(int x=1; x<=S.size(); ++x) {
        if(P(x)==false) {
          return false;
        }
      }
      return true;
    };
    The universal quantifier itself is also a proposition; therefore,
    from the definition, its negation exists:
    ~Prop(∀x,P(x)) = ~(P(x1)∧P(x2)∧..∧P(xn))
                   = ~P(x1)∨~P(x2)∨..∨~P(xn)
                   = Prop(∃x,~P(x))
    Math/logic has no such clear definition. Multiple quantifiers
    (∀x∃y∀z) and their negations are thus easier to understand and use
    in 'reasoning'.
    Note: This leads to a case: if(a&&b) { /*...*/ }
          I tend to think the omission of the evaluation of b in the
          case a==false is not really optimization. It is a problem of
          the definition of traditional logic.
So, don't make C too bizarre.
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Presumably the expectation is that it would actually be using recursion.
That expectation was not expressed in the code.
"(Why would recursive Fibonacci even ever be used as a benchmark when
the iterative method be much faster in every case?)"
Because you want to measure the speed of function calls, of course.
I will give you the answer: it is to compare how implementations cope
with very large numbers of recursive function calls. So if one finds a
way to avoid doing such calls, then it is not a fair comparison.
Then write the code so the compiler can't eliminate the calls.
You want the compiler to work with one hand tied behind its
metaphorical back for the sake of "fairness". Not gonna happen.
If you ask me to go from point A to point B, if it's a few kilometers
away, I'll probably drive my car. If you intended it to be a
three-legged race, I'm not cheating *if you didn't tell me that*.
If testing with gcc now, I'd use these two options:
-fno-inline
-fno-optimize-sibling-calls
On my PC, gcc-O2 code then manages some 560M calls/second running
Fibonacci, rather than a misleading 1270M calls/second.
It's misleading *to you*, because you (deliberately?) misinterpret
the results.
See also:
https://github.com/drujensen/fib/issues/119
Referenced from: https://github.com/drujensen/fib
Let me ask you a simple question. Given my fibonacci example,
if a compiler compiled it to the equivalent of `puts("89")`, would
that compiler fail to conform to the ISO C standard? If so, why?
Are you able to distinguish between "I dislike this requirement
in the standard" and "I deny that this requirement exists"?
On 20/04/2026 22:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Presumably the expectation is that it would actually be using recursion.
That expectation was not expressed in the code.
Other than clearly using recursion?
"(Why would recursive Fibonacci even ever be used as a benchmark whenBecause you want to measure the speed of function calls, of course.
the iterative method be much faster in every case?)"
That's ... a surprising response.
I assumed both you and David had absolutely no interest in such
matters and were not sympathetic to those who did.
The naive fib() benchmark tells me it can achieve 1.27 billion fib()
calls per second on my PC. Great!
In that case, I should also get 1.27 billion calls/second when I run
the fib1/fib2/fib3 version.
But, it doesn't; I get less than half that throughput. What's gone wrong?
According to you, gcc code should be able to have that throughput; why doesn't it?
Let me ask you a simple question. Given my fibonacci example,
if a compiler compiled it to the equivalent of `puts("89")`, would
that compiler fail to conform to the ISO C standard? If so, why?
Are you able to distinguish between "I dislike this requirement
in the standard" and "I deny that this requirement exists"?
I don't understand. I assume you know the answer, that a C compiler
can do whatever it likes (including emailing your source to a human
accomplice and having them mail back a cheat like this).
My problem is doing fair comparisons between implementations doing the
same task using the same algorithm. And in the case of recursive
fibonacci, I showed above that the naive gcc results are unreliable.
Bart <bc@freeuk.com> wrote:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls? And
how many calls were actually made?
Implementation that uses 0 instructions to implement function call
on normal machine will get shorter runtime, so clearly is faster
at doing function calls. This does not differ much from what
some modern processors do, namely a move instruction may effectively
take 0 cycles. People used to the old ways, when confronted with
movq %rax, %rdx
expect that there will be actual movement of data, that the instruction
must travel the whole CPU pipeline. But modern processors do
register renaming, and after looking at this instruction may
simply note that to get the value of %rdx one uses the place storing
%rax (I am using AT&T convention, so the direction is from %rax to
%rdx) and otherwise drop the instruction. Is the processor
cheating? A naive benchmark where moves are overrepresented may
execute unexpectedly fast, but moves are frequent in real
programs, so this gives a valuable speedup for all programs.
Coming back to function calls, consider a programmer who cares
very much about speed. He knows that his program would be
simpler and easier to write if he used a lot of small
functions. In the old days he would worry about the cost of
function calls and he probably would write much bigger and more
complicated functions to get good speed. But if the cost of a
function call is 0 he can freely use small functions, without
worrying about the cost of calls.
I will give you the answer: it is to compare how implementations cope
with very large numbers of recursive function calls. So if one finds a
way to avoid doing such calls, then it is not a fair comparison.
Well, Fibonacci and similar functions have limited use.
So the
real question is what the cost of function calls is in actual
programs. For calls to a small non-recursive function the cost is
close to 0. Recursion makes optimization more tricky,
so it increases the cost. But still, in practice the cost is lower than
one could naively expect.
Concerning fairness, AFAIK gcc optimizations were developed to
speed up real programs. They speed up Fibonacci basically as
a side effect.
So IMO it is fair: a compiler that cannot speed
up calls in Fibonacci probably will have trouble speeding up
calls at least in some real programs.
scott@slp53.sl.home (Scott Lurndal) writes:
[...]
Actually _EVERYBODY_ is interested in the actual output, and NOBODY is
interested in how long it took.
The 5 people in the world that think in terms of random irrelevant
benchmarks are the only people who would even think to care.
I think that's an extreme exaggeration. Plenty of people are
interested in benchmarks. The TOP500 project, for example, ranks supercomputers on the basis of their performance on specific
benchmarks.
Of course those benchmarks are carefully written to avoid optimizing
away the code that's being measured.
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls? And
how many calls were actually made?
So, for gcc and Fibonacci, I now use -fno-inline and another to turn
off TCO.)
If I write this program:
#include <stdio.h>
int fib(int n) {
    if (n <= 1) {
        return 1;
    }
    else {
        return fib(n-2) + fib(n-1);
    }
}
int main(void) {
    printf("%d\n", fib(10));
}
the implementation's job is to generate code that prints "89".
In that case, why bother using very slow recursive Fibonacci?
Presumably the expectation is that it would actually be using recursion.
I already posed this question:
"(Why would recursive Fibonacci even ever be used as a benchmark when
the iterative method be much faster in every case?)"
I will give you the answer: it is to compare how implementations cope
with very large numbers of recursive function calls. So if one finds a
way to avoid doing such calls, then it is not a fair comparison.
Nobody is interested in the actual output, other than checking that it
worked correctly, but in how long it took.
If it's able to do so by replacing the whole thing with `puts("89");`
*that's a good thing*. That's not cheating. That's good code
generation.
What gcc-O2/-O3 actually does is to take the 5 lines of the Fibonacci function in C, which normally generates 25 lines of assembly, and turn
it into 270 lines of assembly.
Imagine such a ten-fold explosion in code size across a whole program,
for some tiny function which might not even ever be called as far as it knows. It's a little suspect; why these 5 lines over a 100Kloc program
for example?
If you want to write a benchmark that avoids certain optimizations,
you need to write it carefully so you get the code you want.
It's not possible to do that with Fibonacci without making it
unrecognisable and so a poor comparison for other reasons.
If testing with gcc now, I'd use these two options:
 -fno-inline
 -fno-optimize-sibling-calls
On my PC, gcc-O2 code then manages some 560M calls/second running
Fibonacci, rather than a misleading 1270M calls/second.
See also:
https://github.com/drujensen/fib/issues/119
Referenced from: https://github.com/drujensen/fib
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
On 20/04/2026 22:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Presumably the expectation is that it would actually be using recursion.
That expectation was not expressed in the code.
Other than clearly using recursion?
"(Why would recursive Fibonacci even ever be used as a benchmark when
the iterative method be much faster in every case?)"
Because you want to measure the speed of function calls, of course.
That's ... a surprising response.
I assumed both you and David had absolutely no interest in such matters
and were not sympathetic to those who did.
On 20/04/2026 21:57, David Brown wrote:
On 20/04/2026 19:34, Bart wrote:
And obviously, it is inadvisable to dereference a unknown pointer value.
Okay, so you think it is "obvious" that you should avoid doing some
things that are explicitly UB, and yet you think it is "obvious" that
you should be able to do other types of UB. Who makes up those
"obvious" rules? Why do you think such inconsistency is a good idea?
Common sense? Reading the contents of a variable /within/ your program
is harmless. Now try reading from a random memory location that may
be outside your program, or try writing somewhere within it.
You really think they are comparable?
No, your rules are far from simple - you have internal ideas about
what kinds of UB you think should produce certain results, and which
should not, and how compilers should interpret things that have no
meaning in C. That's not simple.
I don't have any ideas about UB at all. So long as a program is valid, I
will translate it. I perform very few transformations, and I rarely
elide code, or only on a small scale.
My language works how a lot of people think C works. Maybe how they
wished it worked.
I can predict what gcc will do,
And yet you say:
The compiler can quite reasonably generate all sorts of different code
here. A different version, or a different compiler, or on a different
day, you could get different results. That's life when you use UB.
What else could the non-initialisation have been other than an
oversight - a bug in their code due to ignorance, or just making a
mistake as we all do occasionally? Do you think it is likely that
someone intentionally and knowingly wrote incorrect code?
I write such code hundreds of times a day. It is rarely run.
I might also write such code in my dynamic language. But there,
executing this program:
   a := b + c
generates a runtime error: '+' is not defined between 'void' types.
Here, variables are automatically initialised to 'void' when they come
into existence.
My systems language is lower level. I had thought about zeroing the
stack frame on function entry (and did once try this with C), but I
decided not to do that.
A correctly written program shouldn't need that (although it would be convenient if guaranteed and could be useful to get repeatable results
if debugging).
My compilers don't try and double-guess the user: they will simply do
what is requested.
No, guessing the user's intentions is /exactly/ what your compiler is
trying to do. It is trying to guess what the programmer wrote even
though the programmer made a mistake and wrote something that does not
make sense.
It is not making any guesses at all. It is faithfully translating the
user's code without making any judgements so long as it is valid.
It is GCC which takes an entire function body or even an entire static
function and vanishes it out of existence. And where the output depends
on the combined effects of dozens of options.
A good compiler will work with sensibly written programs, yet you
insist on writing C that is not sensibly written.
IMO it is fine.
If a problem were to come up, then I would adjust the
code generator to fix it.
Oh, so you want gcc to optimise away redundant code when your
transpiler generates redundant code, but it is "cheating" if it
optimises away redundant code in bizarre tests because your C compiler
can't do that?
Yes, sometimes the redundant code is the very thing you are trying to measure.
Who'd have thought that?
But sometimes what a compiler thinks is redundant can be surprising.
You seem to think that just writing "int a;" somehow creates an int[...]
object called "a" along with a slot on the stack or a dedicated
register. That does not happen in many compilers. And it does not
happen in the C semantics. "a" is an lvalue that /potentially/
designates an object - for an uninitialised local variable, it does
not designate an object until a value is assigned.
Bart <bc@freeuk.com> writes:
On 20/04/2026 22:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Presumably the expectation is that it would actually be using recursion.
That expectation was not expressed in the code.
Other than clearly using recursion?
The C source code specifies the behavior of the program. Recursion
is not behavior. A recursive C program does not require a compiler
to generate a recursive executable, any more than a function call
requires it to generate a "call" instruction.
"(Why would recursive Fibonacci even ever be used as a benchmark whenBecause you want to measure the speed of function calls, of course.
the iterative method be much faster in every case?)"
That's ... a surprising response.
I assumed both you and David had absolutely no interest in such
matters and were not sympathetic to those who did.
I have no idea how you reached that conclusion.
[...]
The naive fib() benchmark tells me it can achieve 1.27 billion fib()
calls per second on my PC. Great!
In that case, I should also get 1.27 billion calls/second when I run
the fib1/fib2/fib3 version.
But, it doesn't; I get less than half that throughput. What's gone wrong?
According to you, gcc code should be able to have that throughput; why
doesn't it?
Where did I say that? To be clear, when I said that a compiler
could transform my Fibonacci program into just puts("89"), I did
not suggest that gcc actually does so.
[...]
Let me ask you a simple question. Given my fibonacci example,
if a compiler compiled it to the equivalent of `puts("89")`, would
that compiler fail to conform to the ISO C standard? If so, why?
Are you able to distinguish between "I dislike this requirement
in the standard" and "I deny that this requirement exists"?
I don't understand. I assume you know the answer, that a C compiler
can do whatever it likes (including emailing your source to a human
accomplice and having them mail back a cheat like this).
So you agree that optimizing the program to just puts("89") is valid.
My problem is doing fair comparisons between implementations doing the
same task using the same algorithm. And in the case of recursive
fibonacci, I showed above that the naive gcc results are unreliable.
Your problem, apparently, is that you make some bad assumptions about
how to compare and measure performance. You assume that the mapping
from source code to machine code is, or should be, straightforward
enough that you can know how many CALL instructions will be executed.
You get performance numbers that are obviously absurd because some
calls are legitimately optimized away. And even though you now
acknowledge that the optimizations that break your measurements
are legitimate, you still blame the compiler rather than your own
benchmark code.
On 21/04/2026 00:03, Bart wrote:
It is GCC which takes an entire function body or even an entire static
function and vanishes it out of existence. And where the output
depends on the combined effects of dozens of options.
Code that does nothing can be eliminated - that's a /good/ thing.
Static functions that are not used within a file can never be used in
the program - eliminating them is a /good/ thing. Different users have
a wide variety of different needs from their tools, so GCC has a large
number of options. That's a /good/ thing.
On 20/04/2026 22:13, Bart wrote:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
Why do you care?
It's not possible to do that with Fibonacci without making it
unrecognisable and so a poor comparison for other reasons.
Sure it is.
__attribute__((noinline))
If testing with gcc now, I'd use these two options:
 -fno-inline
 -fno-optimize-sibling-calls
On 21/04/2026 02:22, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 22:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Presumably the expectation is that it would actually be using recursion.
That expectation was not expressed in the code.
Other than clearly using recursion?
The C source code specifies the behavior of the program. Recursion
is not behavior. A recursive C program does not require a compiler
to generate a recursive executable, any more than a function call
requires it to generate a "call" instruction.
"(Why would recursive Fibonacci even ever be used as a benchmark when >>>>> the iterative method be much faster in every case?)"Because you want to measure the speed of function calls, of course.
That's ... a surprising response.
I assumed both you and David had absolutely no interest in such
matters and were not sympathetic to those who did.
I have no idea how you reached that conclusion.
[...]
The naive fib() benchmark tells me it can achieve 1.27 billion fib()
calls per second on my PC. Great!
In that case, I should also get 1.27 billion calls/second when I run
the fib1/fib2/fib3 version.
But, it doesn't; I get less than half that throughput. What's gone
wrong?
According to you, gcc code should be able to have that throughput; why
doesn't it?
Where did I say that? To be clear, when I said that a compiler
could transform my Fibonacci program into just puts("89"), I did
not suggest that gcc actually does so.
I questioned the apparent throughput of gcc's 1.27B call/second, and you said:
"It's misleading *to you*, because you (deliberately?) misinterpret
the results."
So what does that mean, that you agree with that result (1.27B) or that
you don't?
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of calls
when displayed at the end (some 500M for fib(42)). But it's wrong!
Your schtick seems to be whether a program is conforming or not, and
you seem to care nothing about practicalities.
Mine, in this case, is comparing language implementations' abilities to achieve a certain throughput of function calls.
I claim that gcc-O3 is giving a misleading result because it is not
executing the task I expect, which is to calculate fib(N) by explicitly
doing every one of the necessary 2*fib(N)-1 calls.
On 21/04/2026 09:01, David Brown wrote:
On 20/04/2026 22:13, Bart wrote:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
Why do you care?
I care because, as you know, I sometimes work on compilers, specifically ones that do not have a formal optimiser.
And I'm interested in how far such compilers can be pushed.
Normally I don't pay much heed to micro-benchmarks, because it is so
easy for a big compiler to optimise a tiny program of a few dozen lines
to nothing.
What counts is what happens with real applications or real libraries.
There I compete favourably.
It's not possible to do that with Fibonacci without making it
unrecognisable and so a poor comparison for other reasons.
Sure it is.
__attribute__((noinline))
I use these to get a more valid result:
If testing with gcc now, I'd use these two options:
ÿÿ-fno-inline
ÿÿ-fno-optimize-sibling-calls
This is now used in my survey, which still includes the cheating
versions. People can make up their own minds as to how valid they are.
David Brown <david.brown@hesbynett.no> writes:
[...]
You seem to think that just writing "int a;" somehow creates an int[...]
object called "a" along with a slot on the stack or a dedicated
register. That does not happen in many compilers. And it does not
happen in the C semantics. "a" is an lvalue that /potentially/
designates an object - for an uninitialised local variable, it does
not designate an object until a value is assigned.
I don't think that's correct. Within the scope of a declaration `int
a;` the expression `a` is an lvalue that *does* designate an object. If
that expression undergoes "lvalue conversion", the operation that
fetches the stored value, the behavior is undefined if a is
uninitialized.
The word "potentially (which was my idea, BTW) means that given:
int *p = NULL;
the expression *p is an lvalue, even if it doesn't currently designate
an object. C90 and C99 both messed up the definition of "lvalue" in different ways.
On 21/04/2026 12:01, Bart wrote:
So what does that mean, that you agree with that result (1.27B) or
that you don't?
Keith will have to answer for himself. And I have not tried to
replicate your tests. But if you have got that number by timing gcc's
calculations and dividing it by 2 * fib(n) - 1, and you think it shows
the number of assembly "call" instructions, then that sounds a great
deal like misinterpreting the results.
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of calls
when displayed at the end (some 500M for fib(42)). But it's wrong!
Why would you think that adding a counter would tell you the number of
actual assembly "call" instructions? It can tell you the number of
logical C function calls made, but not the number of assembly calls.
It would be very helpful here if you made the distinction between those
two meanings of "function call".
Your schtick seems to be whether a program is conforming or not, and
you seem to care nothing about practicalities.
(Again, I am speaking for myself, not for Keith.) I'm fine with code
that is not conforming to standard C, as long as it is conforming to the
implementation you are using - and as long as you are clear that it is
not standard C. So if you were using uninitialised variables and
compiling with "gcc -ftrivial-auto-var-init=zero", that's okay. I
consider correct code practically useful, and incorrect code practically
useless. But "correct code" does not necessarily mean fully portable
code relying purely on the semantics of C in the standard.
Mine, in this case, is comparing language implementations' abilities
to achieve a certain throughput of function calls.
You keep claiming that. People keep telling you that you are failing to
do so - and you know yourself that you are failing to do so.
Your expectations are unreasonable and are at odds with what you are
doing. The way out of this hole you have dug for yourself is either to
change your expectations, or change your benchmarking methodology.
Note 2: I believe these figures are suspect because the requisite
number of calls are not done.
On 21/04/2026 12:44, David Brown wrote:
On 21/04/2026 12:01, Bart wrote:
So what does that mean, that you agree with that result (1.27B) or
that you don't?
Keith will have to answer for himself. And I have not tried to
replicate your tests. But if you have got that number by timing gcc's
calculations and dividing it by 2 * fib(n) - 1, and you think it shows
the number of assembly "call" instructions, then that sounds a great
deal like misinterpreting the results.
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of
calls when displayed at the end (some 500M for fib(42)). But it's wrong!
Why would you think that adding a counter would tell you the number of
actual assembly "call" function calls?
Did you miss the bit where I said it's wrong?
It can tell you the number of logical C function calls made, but not
the number of assembly calls.
I've measured the number of assembly calls too. By injecting, within the assembly, an increment of the count at every place where it calls 'fib'. That's how I discovered that with -O1 it only does 50% of the calls, and with -O3 only 5%.
I gave this link to someone doing a similar analysis:
https://github.com/drujensen/fib/issues/119
which everyone has conveniently ignored.
It would be very helpful here if you made the distinction between
those two meanings of "function call".
In the case of native x64 code, it is counting the number of times
'CALL' is executed.
Your schtick seems to whether a program is conforming or not and seem
to care nothing about practicalities.
(Again, I am speaking for myself, not for Keith.) I'm fine with code
that is not conforming to standard C, as long as it is conforming to
the implementation you are using - and as long as you are clear that
it is not standard C. So if you were using uninitialised variables
and compiling with "gcc -ftrivial-auto-var-init=zero", that's okay. I
consider correct code practically useful, and incorrect code
practically useless. But "correct code" does not necessarily mean
fully portable code relying purely on the semantics of C in the standard.
Mine, in this case, is comparing language implementations' abilities
to achieve a certain throughput of function calls.
You keep claiming that. People keep telling you that you are failing
to do so - and you know yourself that you are failing to do so.
You snipped my chart. I'm pretty sure that all of those interpreted
timings do the right number of calls, and likely the JIT ones too.
As well as the C timings for Pico C, Tiny C, bcc, and gcc-O0.
It is gcc-O1 and above that are the outliers.
All I'm interested in here is comparing a range of languages with an
attainable upper limit, not a fantasy one.
Your expectations are unreasonable and are at odds with what you are
doing. The way out of this hole you have dug for yourself is either
to change your expectations, or change your benchmarking methodology.
The methodology is fine. I just need to exclude the outliers. That
would be those C ones, and also the slowest.
On Tue, 21 Apr 2026 12:12:28 +0100
Bart <bc@freeuk.com> wrote:
Note 2: I believe these figures are suspect because the requisite
number of calls are not done.
I don't see anything suspect in the -O1 code generated by gcc 14.2.0
Source:
unsigned long long fib(unsigned long long n)
{
if (n < 2)
return 1;
return fib(n-1)+fib(n-2);
}
$ gcc -S -O1 -Wall -Wextra -fno-asynchronous-unwind-tables ref0_fib.c
$ cat ref0_fib.s
.file "ref0_fib.c"
.text
.globl fib
.def fib; .scl 2; .type 32; .endef
fib:
movl $1, %eax
cmpq $1, %rcx
jbe .L5
pushq %rsi
pushq %rbx
subq $40, %rsp
movq %rcx, %rbx
leaq -1(%rcx), %rcx
call fib
movq %rax, %rsi
leaq -2(%rbx), %rcx
call fib
addq %rsi, %rax
addq $40, %rsp
popq %rbx
popq %rsi
ret
.L5:
ret
.ident "GCC: (Rev2, Built by MSYS2 project) 14.2.0"
Measured with n=43 on my very old home desktop it gave:
1402817465/2.646 s = 530165330.7 calls/sec
On 21/04/2026 09:53, David Brown wrote:
On 21/04/2026 00:03, Bart wrote:
I took a program ll.c (which is Lua source code in one file, so the
compiler can see the whole program), and replaced the body of main()
with 'exit(0)'. So none of the functions are called. I got these results:
c:\cx>gcc -s -Os ll.c # optimise for size
c:\cx>dir a.exe
21/04/2026 11:10 241,152 a.exe
c:\cx>bcc ll
Compiling ll.c to ll.exe
c:\cx>dir ll.exe
21/04/2026 11:11 237,056 ll.exe
Somehow my bcc-compiled version generated a smaller executable!
Bart <bc@freeuk.com> writes:
On 21/04/2026 09:53, David Brown wrote:
On 21/04/2026 00:03, Bart wrote:
I took a program ll.c (which is Lua source code in one file, so the
compiler can see the whole program), and replaced the body of main()
with 'exit(0)'. So none of the functions are called. I got these results:
c:\cx>gcc -s -Os ll.c # optimise for size
c:\cx>dir a.exe
21/04/2026 11:10 241,152 a.exe
c:\cx>bcc ll
Compiling ll.c to ll.exe
c:\cx>dir ll.exe
21/04/2026 11:11 237,056 ll.exe
Somehow my bcc-compiled version generated a smaller executable!
Ah, back to the on-disk executable size. An irrelevant
metric. One might expect the 'in-memory' size is what's interesting,
and that's what the -Os option is designed to minimize, not the
number of disk sectors consumed by the executable file.
You might wish to compare the text section sizes,
On 21/04/2026 14:48, Bart wrote:
I gave this link to someone doing a similar analysis:
https://github.com/drujensen/fib/issues/119
which everyone has conveniently ignored.
I can't answer for "everyone", but I rarely follow links posted on
Usenet. I am interested in your opinions and answers - a third party's
opinions on a fourth party's project is not typically helpful. But
since you insist, I have looked at that page. Have you? The project
author and other posters agree that optimisations are not "cheating",
and question the realism of fibonacci as a benchmark.
It would be very helpful here if you made the distinction between
those two meanings of "function call".
In the case of native x64 code, it is counting the number of times
'CALL' is executed.
Okay. That is, of course, not a meaningful number as far as C (or any
other language, other than assembly) is concerned. It can be a somewhat
meaningful value for comparing implementations - where a smaller number
of "CALL" instructions in the compilation indicates a better
implementation.
You snipped my chart. I'm pretty sure that all of those interpreted
timings do the right number of calls, and likely the JIT ones too.
There is no "right" number of calls.
As well as the C timings for Pico C, Tiny C, bcc, and gcc-O0.
It is gcc-O1 and above that are the outliers.
So gcc, when optimising, does a better job of optimising than these
small C compilers? Is that a surprise to anyone?
On 21/04/2026 15:27, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/04/2026 09:53, David Brown wrote:
On 21/04/2026 00:03, Bart wrote:
I took a program ll.c (which is Lua source code in one file, so the
compiler can see the whole program), and replaced the body of main()
with 'exit(0)'. So none of the functions are called. I got these results:
c:\cx>gcc -s -Os ll.c # optimise for size
c:\cx>dir a.exe
21/04/2026 11:10 241,152 a.exe
c:\cx>bcc ll
Compiling ll.c to ll.exe
c:\cx>dir ll.exe
21/04/2026 11:11 237,056 ll.exe
Somehow my bcc-compiled version generated a smaller executable!
Ah, back to the on-disk executable size. An irrelevant
metric. One might expect the 'in-memory' size is what's interesting,
and that's what the -Os option is designed to minimize, not the
number of disk sectors consumed by the executable file.
You don't think there's a correlation between the size of code and
initialised data, and the size of the executable?
You might wish to compare the text section sizes,
Both text and initialised data will take up valuable memory.
On 21/04/2026 13:55, Michael S wrote:
On Tue, 21 Apr 2026 12:12:28 +0100
Bart <bc@freeuk.com> wrote:
Note 2: I believe these figures are suspect because the requisite
number of calls are not done.
I don't see anything suspect in the -O1 code generated by gcc 14.2.0
Source:
unsigned long long fib(unsigned long long n)
{
if (n < 2)
return 1;
return fib(n-1)+fib(n-2);
}
$ gcc -S -O1 -Wall -Wextra -fno-asynchronous-unwind-tables
ref0_fib.c $ cat ref0_fib.s
.file "ref0_fib.c"
.text
.globl fib
.def fib; .scl 2; .type 32; .endef
fib:
movl $1, %eax
cmpq $1, %rcx
jbe .L5
pushq %rsi
pushq %rbx
subq $40, %rsp
movq %rcx, %rbx
leaq -1(%rcx), %rcx
call fib
movq %rax, %rsi
leaq -2(%rbx), %rcx
call fib
addq %rsi, %rax
addq $40, %rsp
popq %rbx
popq %rsi
ret
.L5:
ret
.ident "GCC: (Rev2, Built by MSYS2 project) 14.2.0"
Measured with n=43 on my very old home desktop it gave:
1402817465/2.646 s = 530165330.7 calls/sec
You're right. It was either a different version or I was mistaken.
But it seems that Clang -O1 will generate a version with only a
single fib call. This is the godbolt code for the Fib() version using
"if (n < 3) return 1":
fib:
pushq %r14
pushq %rbx
pushq %rax
movl %edi, %r14d
xorl %ebx, %ebx
cmpl $3, %r14d
jl .LBB0_3
.LBB0_2:
leal -1(%r14), %edi
callq fib
addl %eax, %ebx
addl $-2, %r14d
cmpl $3, %r14d
jge .LBB0_2
.LBB0_3:
incl %ebx
movl %ebx, %eax
addq $8, %rsp
popq %rbx
popq %r14
retq
If I inject an increment to a global counter just after that callq
fib, then it shows only half the expected value.
(This fib version is one-based, so that fib(10) is 55, while yours I
think has it as 89. Google tells me that Fibonacci(10) is 55.)
On 21/04/2026 02:22, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 22:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Presumably the expectation is that it would actually be using recursion.
That expectation was not expressed in the code.
Other than clearly using recursion?
The C source code specifies the behavior of the program. Recursion
is not behavior. A recursive C program does not require a compiler
to generate a recursive executable, any more than a function call
requires it to generate a "call" instruction.
"(Why would recursive Fibonacci even ever be used as a benchmark when
the iterative method would be much faster in every case?)"
Because you want to measure the speed of function calls, of course.
That's ... a surprising response.
I assumed both you and David had absolutely no interest in such
matters and were not sympathetic to those who did.
I have no idea how you reached that conclusion.
[...]
The naive fib() benchmark tells me it can achieve 1.27 billion fib()
calls per second on my PC. Great!
In that case, I should also get 1.27 billion calls/second when I run
the fib1/fib2/fib3 version.
But, it doesn't; I get less than half that throughput. What's gone wrong?
According to you, gcc code should be able to have that throughput; why
doesn't it?
Where did I say that? To be clear, when I said that a compiler
could transform my Fibonacci program into just puts("89"), I did
not suggest that gcc actually does so.
I questioned the apparent throughput of gcc's 1.27B call/second, and
you said:
"It's misleading *to you*, because you (deliberately?) misinterpret
the results."
So what does that mean, that you agree with that result (1.27B) or
that you don't?
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of calls
when displayed at the end (some 500M for fib(42)). But it's wrong!
[...]
Let me ask you a simple question. Given my fibonacci example,
if a compiler compiled it to the equivalent of `puts("89")`, would
that compiler fail to conform to the ISO C standard? If so, why?
Are you able to distinguish between "I dislike this requirement
in the standard" and "I deny that this requirement exists"?
I don't understand. I assume you know the answer, that a C compiler
can do whatever it likes (including emailing your source to a human
accomplice and having them mail back a cheat like this).
So you agree that optimizing the program to just puts("89") is
valid.
My problem is doing fair comparisons between implementations doing the
same task using the same algorithm. And in the case of recursive
fibonacci, I showed above that the naive gcc results are unreliable.
Your problem, apparently, is that you make some bad assumptions about
how to compare and measure performance. You assume that the mapping
from source code to machine code is, or should be, straightforward
enough that you can know how many CALL instructions will be executed.
You get performance numbers that are obviously absurd because some
calls are legitimately optimized away. And even though you now
acknowledge that the optimizations that break your measurements
are legitimate, you still blame the compiler rather than your own
benchmark code.
I like how you ignore every single one of my points.
Your schtick seems to be whether a program is conforming or not, and
you seem to care nothing about practicalities.
Mine, in this case, is comparing language implementations' abilities
to achieve a certain throughput of function calls.
I claim that gcc-O3 is giving a misleading result because it is not
executing the task I expect, which is to calculate fib(N) by
explicitly doing every one of the necessary 2*fib(N)-1 calls.
On 21/04/2026 12:01, Bart wrote:
[...]
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of
calls when displayed at the end (some 500M for fib(42)). But it's
wrong!
Why would you think that adding a counter would tell you the number of
actual assembly "call" function calls? It can tell you the number of
logical C function calls made, but not the number of assembly calls.
On 21/04/2026 14:43, David Brown wrote:
On 21/04/2026 14:48, Bart wrote:
I gave this link to someone doing a similar analysis:
https://github.com/drujensen/fib/issues/119
which everyone has conveniently ignored.
I can't answer for "everyone", but I rarely follow links posted on
Usenet. I am interested in your opinions and answers - a third
party's opinions on a fourth party's project is not typically
helpful. But since you insist, I have looked at that page. Have
you? The project author and other posters agree that optimisations
are not "cheating", and question the realism of fibonacci as a benchmark.
Literally the title of the page contains the word "cheating". And the
person maintaining the benchmarks says:
"I am open to suggestions on how to improve the fairness of the benchmark."
So the question of cheating and fairness has been raised. Some suggest
a separate category for optimised code. Some suggest using flags as I
have done. Some agree with you that optimisation should not be
restricted.
I think Fibonacci is a good benchmark for languages that don't cheat by
avoiding doing the full quota of 2*fib(N)-1 calls.
I'm not going to dump a useful tool that works fine in dozens of
implementations just because you say so.
According to you, obviously L3 is the winner because of its superior
optimiser! No red flags at all.
You snipped my chart. I'm pretty sure that all of those interpreted
timings do the right number of calls, and likely the JIT ones too.
There is no "right" number of calls.
As well as the C timings for Pico C, Tiny C, bcc, and gcc-O0.
It is gcc-O1 and above that are the outliers.
So gcc, when optimising, does a better job of optimising than these
small C compilers? Is that a surprise to anyone?
Below I've posted two programs that both evaluate fib(42).
The first uses the regular function, the second splits it into three functions across three sources, that all call each other.
They are built like this, using gcc 14.1.0 on Windows:
   gcc -O3 fib.c -o fib
   gcc -O3 fib1.c fib2.c fib3.c -o fib123
Timings are:
   fib:      0.418 seconds
   fib123:   0.857
Both do the same thing, do the same number of calls (right?) but one is
twice as fast. Why is that?
If I try it with bcc:
   fib:      1.20 seconds
   fib123:   1.16
It's consistent. I did one more test, which was to combine fib1/2/3 into
one file fib4.c. This was the result:
   fib4:     0.149 seconds
WTF is going on?! This is nearly 3 times as fast as that 0.42 seconds,
which was already cheating IMV. And it is 6 times as fast as having the
code in separate files.
On Tue, 21 Apr 2026 14:49:58 +0100
Bart <bc@freeuk.com> wrote:
On 21/04/2026 13:55, Michael S wrote:
On Tue, 21 Apr 2026 12:12:28 +0100
Bart <bc@freeuk.com> wrote:
Note 2: I believe these figures are suspect because the requisite
number of calls are not done.
I don't see anything suspect in the -O1 code generated by gcc 14.2.0
Source:
unsigned long long fib(unsigned long long n)
{
if (n < 2)
return 1;
return fib(n-1)+fib(n-2);
}
$ gcc -S -O1 -Wall -Wextra -fno-asynchronous-unwind-tables
ref0_fib.c $ cat ref0_fib.s
.file "ref0_fib.c"
.text
.globl fib
.def fib; .scl 2; .type 32; .endef
fib:
movl $1, %eax
cmpq $1, %rcx
jbe .L5
pushq %rsi
pushq %rbx
subq $40, %rsp
movq %rcx, %rbx
leaq -1(%rcx), %rcx
call fib
movq %rax, %rsi
leaq -2(%rbx), %rcx
call fib
addq %rsi, %rax
addq $40, %rsp
popq %rbx
popq %rsi
ret
.L5:
ret
.ident "GCC: (Rev2, Built by MSYS2 project) 14.2.0"
Measured with n=43 on my very old home desktop it gave:
1402817465/2.646 s = 530165330.7 calls/sec
You're right. It was either a different version or I was mistaken.
But it seems that Clang -O1 will generate a version with only a
single fib call. This is the godbolt code for the Fib() version using
"if (n < 3) return 1":
fib:
pushq %r14
pushq %rbx
pushq %rax
movl %edi, %r14d
xorl %ebx, %ebx
cmpl $3, %r14d
jl .LBB0_3
.LBB0_2:
leal -1(%r14), %edi
callq fib
addl %eax, %ebx
addl $-2, %r14d
cmpl $3, %r14d
jge .LBB0_2
.LBB0_3:
incl %ebx
movl %ebx, %eax
addq $8, %rsp
popq %rbx
popq %r14
retq
If I inject an increment to a global counter just after that callq
fib, then it shows only half the expected value.
(This fib version is one-based, so that fib(10) is 55, while yours I
think has it as 89. Google tells me that Fibonacci(10) is 55.)
That looks like tail call elimination. I.e. compiler turned the code
into:
unsigned long long fib(unsigned long long n)
{
unsigned long long res = 0;
while (n >= 3) {
res += fib(n-1);
n -= 2;
}
return res + 1;
}
gcc generates similar code with -O -foptimize-sibling-calls
For certain styles of coding, e.g. one often preferred by Tim Rentsch,
this optimization is extremely important.
Bart <bc@freeuk.com> writes:
On 21/04/2026 09:53, David Brown wrote:
On 21/04/2026 00:03, Bart wrote:
I took a program ll.c (which is Lua source code in one file, so the
compiler can see the whole program), and replaced the body of main()
with 'exit(0)'. So none of the functions are called. I got these
results:
c:\cx>gcc -s -Os ll.c # optimise for size
c:\cx>dir a.exe
21/04/2026 11:10 241,152 a.exe
c:\cx>bcc ll
Compiling ll.c to ll.exe
c:\cx>dir ll.exe
21/04/2026 11:11 237,056 ll.exe
Somehow my bcc-compiled version generated a smaller executable!
Ah, back to the on-disk executable size. An irrelevant
metric. One might expect the 'in-memory' size is what's interesting,
and that's what the -Os option is designed to minimize, not the
number of disk sectors consumed by the executable file.
You might wish to compare the text section sizes, but I suspect
that's like telling an apprentice to go fetch a left-handed pipe
wrench, as the size of the text doesn't necessarily correlate
with performance.
On 21/04/2026 14:43, David Brown wrote:
On 21/04/2026 14:48, Bart wrote:
I gave this link to someone doing a similar analysis:
https://github.com/drujensen/fib/issues/119
which everyone has conveniently ignored.
I can't answer for "everyone", but I rarely follow links posted on
Usenet. I am interested in your opinions and answers - a third
party's opinions on a fourth party's project is not typically
helpful. But since you insist, I have looked at that page. Have
you? The project author and other posters agree that optimisations
are not "cheating", and question the realism of fibonacci as a
benchmark.
Literally the title of the page contains the word "cheating". And the
person maintaining the benchmarks says:
"I am open to suggestions on how to improve the fairness of the
benchmark."
So the question of cheating and fairness has been raised. Some suggest
a separate category for optimised code. Some suggest using flags as I
have done. Some agree with you that optimisation should not be
restricted.
I think Fibonacci is a good benchmark for languages that don't cheat
by avoiding doing the full quota of 2*fib(N)-1 calls.
I'm not going to dump a useful tool that works fine in dozens of
implementations just because you say so.
It would be very helpful here if you made the distinction between
those two meanings of "function call".
In the case of native x64 code, it is counting the number of times
'CALL' is executed.
Okay. That is, of course, not a meaningful number as far as C (or
any other language, other than assembly) is concerned. It can be a
somewhat meaningful value for comparing implementations - where a
smaller number of "CALL" instructions in the compilation indicates a
better implementation.
Suppose I had a function implementing an algorithm and I wanted
to use that as a benchmark to compare languages.
I might measure performance by invoking it N times. Suppose I get
these results across 4 languages:
L1: 3.5 seconds
L2: 4.2
L3: 0.1
L4: 2.9
According to you, obviously L3 is the winner because of its superior
optimiser! No red flags at all.
Bart <bc@freeuk.com> writes:
On 21/04/2026 15:27, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/04/2026 09:53, David Brown wrote:
On 21/04/2026 00:03, Bart wrote:
I took a program ll.c (which is Lua source code in one file, so the
compiler can see the whole program), and replaced the body of main()
with 'exit(0)'. So none of the functions are called. I got these results:
c:\cx>gcc -s -Os ll.c # optimise for size
c:\cx>dir a.exe
21/04/2026 11:10 241,152 a.exe
c:\cx>bcc ll
Compiling ll.c to ll.exe
c:\cx>dir ll.exe
21/04/2026 11:11 237,056 ll.exe
Somehow my bcc-compiled version generated a smaller executable!
Ah, back to the on-disk executable size. An irrelevant
metric. One might expect the 'in-memory' size is what's interesting,
and that's what the -Os option is designed to minimize, not the
number of disk sectors consumed by the executable file.
You don't think there's a correlation between the size of code and
initialised data, and the size of the executable?
A portion of the executable is metadata that never
gets loaded into memory (symbol tables, rtld data
and relocation information, etc.)
You might wish to compare the text section sizes,
Both text and initialised data will take up valuable memory.
$ size bin/test1
text data bss dec hex filename
6783060 85872 1861744 8730676 853834 bin/test1
The text only takes up memory -on demand-. If a code
page is never referenced, it is never loaded into
memory.
The working set size is interesting, but completely unrelated
to the size of the on-disk executable file.
On 21/04/2026 16:38, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/04/2026 15:27, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/04/2026 09:53, David Brown wrote:
On 21/04/2026 00:03, Bart wrote:
I took a program ll.c (which is Lua source code in one file, so the
compiler can see the whole program), and replaced the body of main()
with 'exit(0)'. So none of the functions are called. I got these results:
c:\cx>gcc -s -Os ll.c # optimise for size
c:\cx>dir a.exe
21/04/2026 11:10 241,152 a.exe
c:\cx>bcc ll
Compiling ll.c to ll.exe
c:\cx>dir ll.exe
21/04/2026 11:11 237,056 ll.exe
Somehow my bcc-compiled version generated a smaller executable!
Ah, back to the on-disk executable size. An irrelevant
metric. One might expect the 'in-memory' size is what's interesting,
and that's what the -Os option is designed to minimize, not the
number of disk sectors consumed by the executable file.
You don't think there's a correlation between the size of code and
initialised data, and the size of the executable?
A portion of the executable is metadata that never
gets loaded into memory (symbol tables, rtld data
and relocation information, etc.)
You might wish to compare the text section sizes,
Both text and initialised data will take up valuable memory.
$ size bin/test1
text data bss dec hex filename
6783060 85872 1861744 8730676 853834 bin/test1
The text only takes up memory -on demand-. If a code
page is never referenced, it is never loaded into
memory.
The working set size is interesting, but completely unrelated
to the size of the on-disk executable file.
So it's just a coincidence that specifying -Os tends to get you a
smaller executable?
Bart <bc@freeuk.com> writes:
On 21/04/2026 14:43, David Brown wrote:
On 21/04/2026 14:48, Bart wrote:
I gave this link to someone doing a similar analysis:
https://github.com/drujensen/fib/issues/119
which everyone has conveniently ignored.
I can't answer for "everyone", but I rarely follow links posted on
Usenet. I am interested in your opinions and answers - a third
party's opinions on a fourth party's project is not typically
helpful. But since you insist, I have looked at that page. Have
you? The project author and other posters agree that optimisations
are not "cheating", and question the realism of fibonacci as a
benchmark.
Literally the title of the page contains the word "cheating". And the
person maintaining the benchmarks says:
"I am open to suggestions on how to improve the fairness of the
benchmark."
I'm not sure I would have used the word "cheating", though I might
have used it facetiously. None of the compilers we're discussing
"cheat" in the sense of violating the rules of the language.
The author is talking about *improving the benchmark* so that
it prevents the optimizations that make it difficult to measure
performance.
So the question of cheating and fairness has been raised. Some suggest
a separate category for optimised code. Some suggest using flags as I
have done. Some agree with you that optimisation should not be
restricted.
I think Fibonacci is a good benchmark for languages that don't cheat
by avoiding doing the full quota of 2*fib(N)-1 calls.
THAT'S NOT CHEATING. It's called optimization. If you refuse to
do the work of writing your benchmark so it avoids optimization,
then you'll end up with a bad benchmark.
I'm not going to dump a useful tool that works fine in dozens of
implementations just because you say so.
Apparently you intend to continue to use a tool that does not measure
what you want it to measure, that works with some implementations but
not with others. You say it works perfectly well while posting data
that shows that it does not.
I might measure performance by invoking it N times. Suppose I get
these results across 4 languages:
L1: 3.5 seconds
L2: 4.2
L3: 0.1
L4: 2.9
According to you, obviously L3 is the winner because of its superior
optimiser! No red flags at all.
Does the code generated by L3 produce the correct output? If so, the
only problem is that your benchmark is affected by the L3 compiler's
optimization. If your goal is to compare the performance of "call"
instructions, fix the benchmark.
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when you
want to stick to standard C is use of "volatile". Use a volatile read
at the start of your code, then calculations that depend on each other
and that first read, then a volatile write of the result. That gives
minimal intrusion in the code while making sure the calculations have to
be generated, and have to be done at run time.
If you are testing on a particular compiler (like gcc or clang), then
there are other options. The "noinline" function attribute is very
handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1 memory
cache (for things that are not always in registers), then this flushes
that cache:
    asm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at this
point in time (so that its value can be passed to the assembly):
    asm volatile ("" :: "g" (x));
This tells the compiler that "x" might be changed by the assembly, so
it must forget any additional knowledge it had of it:
    asm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or test
code. They can be helpful in some kinds of interactions between low
level code and hardware.
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of calls
when displayed at the end (some 500M for fib(42)). But it's wrong!
[...]
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when you
want to stick to standard C is use of "volatile". Use a volatile read
at the start of your code, then calculations that depend on each other
and that first read, then a volatile write of the result. That gives
minimal intrusion in the code while making sure the calculations have
to be generated, and have to be done at run time.
If you are testing on a particular compiler (like gcc or clang), then
there are other options. The "noinline" function attribute is very
handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1 memory
cache (for things that are not always in registers), then this flushes
that cache:
    asm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at this
point in time (so that its value can be passed to the assembly):
    asm volatile ("" :: "g" (x));
This tells the compiler that "x" might be changed by the assembly, so
it must forget any additional knowledge it had of it:
    asm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or test
code. They can be helpful in some kinds of interactions between low
level code and hardware.
Well, we have to distinguish between a compiler barrier and a memory
barrier. All memory barriers should be compiler barriers, but compiler
barriers do not have to be memory barriers... Fair enough?
On 21/04/2026 17:19, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 21/04/2026 14:43, David Brown wrote:
On 21/04/2026 14:48, Bart wrote:
I gave this link to someone doing a similar analysis:
https://github.com/drujensen/fib/issues/119
which everyone has conveniently ignored.
I can't answer for "everyone", but I rarely follow links posted on
Usenet. I am interested in your opinions and answers - a third
party's opinions on a fourth party's project is not typically
helpful. But since you insist, I have looked at that page. Have
you? The project author and other posters agree that optimisations
are not "cheating", and question the realism of fibonacci as a
benchmark.
Literally the title of the page contains the word "cheating". And the
person maintaining the benchmarks says:
"I am open to suggestions on how to improve the fairness of the
benchmark."
I'm not sure I would have used the word "cheating", though I might
have used it facetiously. None of the compilers we're discussing
"cheat" in the sense of violating the rules of the language.
The author is talking about *improving the benchmark* so that
it prevents the optimizations that make it difficult to measure
performance.
The author segregates different categories of language. I want to
compare across categories, but also want to compare unoptimised and
optimised native code.
Optimised timings tend to be fragile (see my example with fib1/2/3);
unoptimised is far more reliable.
So the question of cheating and fairness has been raised. Some suggest
a separate category for optimised code. Some suggest using flags as I
have done. Some agree with you that optimisation should not be
restricted.
I think Fibonacci is a good benchmark for languages that don't cheat
by avoiding doing the full quota of 2*fib(N)-1 calls.
THAT'S NOT CHEATING. It's called optimization. If you refuse to
do the work of writing your benchmark so it avoids optimization,
then you'll end up with a bad benchmark.
For a fair comparison of language implementatons, you HAVE to be
running the same algorithm with the same steps.
If one doesn't bother executing some or most of those steps, then the comparison is meaningless. Unless we are comparing optimisation
ability, but that is not what this is about.
From that github home page: "Any language faster than Assembly is
performing unrolling type optimizations."
In fact, I am now using Assembly as the reference
implementation. Using optimised C is far too unreliable and it can be inconsistent.
I'm not going to dump a useful tool that works fine in dozens of
implementations just because you say so.
Apparently you intend to continue to use a tool that does not measure
what you want it to measure, that works with some implementations but
not with others. You say it works perfectly well while posting data
that shows that it does not.
It only fails to do so because of those erroneous and erratic gcc
timings.
I might measure performance by invoking it N times. Suppose I get
these results across 4 languages:
L1: 3.5 seconds
L2: 4.2
L3: 0.1
L4: 2.9
According to you, obviously L3 is the winner because of its superior
optimiser! No red flags at all.
Does the code generated by L3 produce the correct output? If so, the
only problem is that your benchmark is affected by the L3 compiler's
optimization. If your goal is to compare the performance of "call"
instructions, fix the benchmark.
So you would ignore such a giant red flag? That's good to know.
(You wouldn't even be curious about such an outlier?)
Bart <bc@freeuk.com> writes:
[...]
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of calls
when displayed at the end (some 500M for fib(42)). But it's wrong!
[...]
What exactly do you mean by "injected"? Do you modify the C program
to add a counter, or do you modify the compiler-generated assembly
or machine code? If the former, then of course you have a different
program, probably with different behavior.
And what exactly do you mean by "wrong"?
I can't be sure without seeing your code, but I'd expect the counter
to reflect the number of times the function is called in the abstract
machine. Whether those function calls are implemented by "call"
(or "bl") instructions is an implementation detail about which the
C standard says nothing. If you happen to care about that, that's
fine, but you'll need to go beyond what the language guarantees if
you want to measure it.
On 21/04/2026 20:40, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
Note that a global counter can be injected into the benchmark at the
entry to fib(), and sure enough, it shows the expected number of calls
when displayed at the end (some 500M for fib(42)). But it's wrong!
[...]
What exactly do you mean by "injected"? Do you modify the C program
to add a counter, or do you modify the compiler-generated assembly
or machine code? If the former, then of course you have a different
program, probably with different behavior.
And what exactly do you mean by "wrong"?
I can't be sure without seeing your code, but I'd expect the counter
to reflect the number of times the function is called in the abstract
machine. Whether those function calls are implemented by "call"
(or "bl") instructions is an implementation detail about which the
C standard says nothing. If you happen to care about that, that's
fine, but you'll need to go beyond what the language guarantees if
you want to measure it.
Here I mean within the C, but I've done both.
Injecting into the ASM gives more accurate results. In the C, it just
displays what you would expect, so for fib(N), it would be 2*fib(N)-1,
even if it just uses a lookup table or hardcodes the value. It's a
sham.
Bart <bc@freeuk.com> writes:
In fact, I am now using Assembly as the reference
implementation. Using optimised C is far too unreliable and it can be
inconsistent.
Comparing the performance of unoptimized code is not, in my humble
opinion, particularly useful. Benchmarks are for people who care
about performance. People who care about performance typically do
not ship unoptimized code, simply because optimized code is faster.
If you want to perform measurements that are relevant to the
performance of real-world code, it's best to (a) write the benchmark
in a way that forces the compiler *not* to perform optimizations
that destroy the thing you're trying to measure, and (b) compile
the benchmark with optimization *enabled*.
I'm not an expert on benchmarks. The people who write them presumably
know all about this stuff. (There's a comp.benchmarks newsgroup,
but it's inactive.)
They are neither "erroneous" nor "erratic" as far as the behavior
required by the C standard or by gcc is concerned. They merely
violate your faulty assumptions.
Sure, I'd be curious about why L3 performs so much better. If L3 is
compiled and the others are interpreted, that's probably the answer.
If they're all compiled, it's likely that the L3 compiler performs optimizations that the others don't. (Of course this is assuming the
output is correct; fast wrong answers are not useful or interesting.)
I would not assume that it's a *problem*.
There is at least one kind of optimization that I'd call "cheating".
If a compiler computes the sha256 checksum of a C source file and
finds that it matches the checksum of a known benchmark, and then
generates an executable that just prints the expected output with impressive-looking numbers, I'd call that cheating, though it's
still conforming behavior as far as the C standard is concerned.
If a compiler is able to optimize code without changing its behavior
in impermissible ways, that's just good optimization.
Let me ask you a question. You've been complaining a lot about
how gcc behaves.
Would you insist that a program that computes fib(10) executes
exactly 177 "call" instructions
quick experiment)? Would you insist that `2+2` must generate an
"add" instruction? If your answers differ, why?
My language allows you to do this:
   int a, b
   a := b
It is well-defined in the language, and I know it is well defined on all
my likely targets.
On 20/04/2026 13:45, Bart wrote:
[...]
[...] But it is
genuinely absurd to say that you have been writing languages,
translators, compilers, transpilers and other tools for decades, and
don't understand such simple things. This really is the first step for
a compiler that transforms language A to language B - you have to
generate code in language B that implements the semantics in language A.
If I were in your shoes, I would translate "int a;" from your language
into "int a = 0;" in C.ÿ Simple and clear.ÿ The semantics on the C side
are stronger than those in the source language (since there is a
definite value of 0, rather than an unspecified int value as you have AFAIUI), but that's fine.ÿ If your translator generates something with weaker semantics - like plain "int a;" in C - then your translator is
broken by design.
[...]
On 21/04/2026 21:16, Keith Thompson wrote:
[...]
Would you insist that a program that computes fib(10) executes
exactly 177 "call" instructions
For comparing function calls across languages and implementations via
that benchmark, then yes.
I am not interested in finding out how clever optimising compilers can
be. They have ample opportunity to do that within real applications,
namely the compilers and interpreters themselves.
The bytecode compilers of interpreted languages tend not to do
aggressive optimisations (it's harder for dynamic code, and would take
too long).
[...] Would you insist that `2+2` must generate an
"add" instruction? If your answers differ, why?
If I was testing integer arithmetic then I would have to use
variables. Constant expression reduction is too commonly done, and in
some languages (eg. C) it is mandated.
On 20/04/2026 14:02, David Brown wrote:
On 20/04/2026 13:45, Bart wrote:
I implement it in a common sense manner.
"Common sense" is another way of saying "I don't know the actual rules".
It means doing the obvious thing with no unexpected surprises. If the resulting program runs with exactly the behaviour the user expects, then what is the problem?
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
   int a, b
   a := b
It is well-defined in the language, and I know it is well defined on
all my likely targets.
That's perfectly fine if (for example) your language implies a default
initialization semantics. (Simula, e.g., has such a semantic defined;
declared (instantiated) integer variables have the value 0.) - But "C"
does not! - Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design decisions"
onto "C" I really suggest to stop that nonsense since it doesn't lead anywhere.)
David Brown <david.brown@hesbynett.no> writes:
[...]
You seem to think that just writing "int a;" somehow creates an int
object called "a" along with a slot on the stack or a dedicated
register. That does not happen in many compilers. And it does not
happen in the C semantics. "a" is an lvalue that /potentially/
designates an object - for an uninitialised local variable, it does
not designate an object until a value is assigned.
[...]
I don't think that's correct.
Within the scope of a declaration `int a;` the expression `a` is
an lvalue that *does* designate an object.
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
When you use "b" in an expression, you are /not/ asking C to read
the bits and bytes stored at the address of the object "b". You
are asking for the /value/ of the object "b". How the compiler
gets that value is up to the compiler - it can read the memory, or
use a stored copy in a register, or use program analysis to know
what the value is in some other way. And if the object "b" does
not have a value, you are asking the impossible.
Try asking a human "You have two numbers, b and c. Add them.
What is the answer?".
You have two slates A and B which someone should have wiped clean
then written a new number on each.
But that part hasn't been done; they each still have an old number
from their last use.
You can still add them together, nothing bad will happen. It just
may be the wrong answer if the purpose of the exercise was to find
the sum of two specific new numbers.
But the purpose may also be to see how good they are at adding, or at
following instructions.
whatever it happens to be, add the value of 'c' scaled by 8, and
store the result into 'a'. The only things to consider are
that some intermediate results may lose the top bits.
Is 'a = b' equally undefined? If so, then C is even crazier than
I'd thought.
If "a" or "b" are indeterminate, then using them is undefined. I
have two things - are they the same colour? How is that supposed
to make sense?
You keep thinking of objects like "b" as a section of memory with
a bit pattern in it. Objects are not that simple in C - C is not
assembly.
Why ISN'T it that simple? What ghastly thing would happen if it
was?
"b" will be some location in memory or it might be some register,
and it WILL have a value. That value happens to be unknown until
it is initialised.
So accessing it will return garbage (unless you know exactly what
you are doing then it may be something useful).
My original example was something like 'a = b + c' (I think in my
language), converted to my IL, then expressed in very low-level C.
You were concerned that in that C, the values weren't initialised.
How would that have affected the code that C compiler generated
from that?
You look at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
On Tue, 21 Apr 2026 12:12:28 +0100
Bart <bc@freeuk.com> wrote:
Note 2: I believe these figures are suspect because the requisite
number of calls are not done.
I don't see anything suspect in the -O1 code generated by gcc 14.2.0
Source:
unsigned long long fib(unsigned long long n)
{
if (n < 2)
return 1;
return fib(n-1)+fib(n-2);
}
On Tue, 21 Apr 2026 14:49:58 +0100
Bart <bc@freeuk.com> wrote:
On 21/04/2026 13:55, Michael S wrote:
On Tue, 21 Apr 2026 12:12:28 +0100
Bart <bc@freeuk.com> wrote:
Note 2: I believe these figures are suspect because the requisite
number of calls are not done.
I don't see anything suspect in the -O1 code generated by gcc 14.2.0
Source:
unsigned long long fib(unsigned long long n)
{
if (n < 2)
return 1;
return fib(n-1)+fib(n-2);
}
$ gcc -S -O1 -Wall -Wextra -fno-asynchronous-unwind-tables ref0_fib.c
$ cat ref0_fib.s
.file "ref0_fib.c"
.text
.globl fib
.def fib; .scl 2; .type 32; .endef
fib:
movl $1, %eax
cmpq $1, %rcx
jbe .L5
pushq %rsi
pushq %rbx
subq $40, %rsp
movq %rcx, %rbx
leaq -1(%rcx), %rcx
call fib
movq %rax, %rsi
leaq -2(%rbx), %rcx
call fib
addq %rsi, %rax
addq $40, %rsp
popq %rbx
popq %rsi
ret
.L5:
ret
.ident "GCC: (Rev2, Built by MSYS2 project) 14.2.0"
Measured with n=43 on my very old home desktop it gave:
1402817465/2.646 s = 530165330.7 calls/sec
You're right. It was either a different version or I was mistaken.
But it seems that Clang -O1 will generate a version with only a
single fib call. This is the godbolt code for the Fib() version using
"if (n < 3) return 1":
fib:
pushq %r14
pushq %rbx
pushq %rax
movl %edi, %r14d
xorl %ebx, %ebx
cmpl $3, %r14d
jl .LBB0_3
.LBB0_2:
leal -1(%r14), %edi
callq fib
addl %eax, %ebx
addl $-2, %r14d
cmpl $3, %r14d
jge .LBB0_2
.LBB0_3:
incl %ebx
movl %ebx, %eax
addq $8, %rsp
popq %rbx
popq %r14
retq
If I inject an increment to a global counter just after that callq
fib, then it shows only half the expected value.
(This fib version is one-based, so that fib(10) is 55, while yours I
think has it as 89. Google tells me that Fibonacci(10) is 55.)
That looks like tail call elimination. I.e. the compiler turned the code
into:
unsigned long long fib(unsigned long long n)
{
unsigned long long res = 0;
while (n >= 3) {
res += fib(n-1);
n -= 2;
}
return res + 1;
}
gcc generates similar code with -O -foptimize-sibling-calls
For certain styles of coding, e.g. one often preferred by Tim Rentsch,
this optimization is extremely important.
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
David Brown <david.brown@hesbynett.no> writes:
[...]
You seem to think that just writing "int a;" somehow creates an int
object called "a" along with a slot on the stack or a dedicated
register. That does not happen in many compilers. And it does not
happen in the C semantics. "a" is an lvalue that /potentially/
designates an object - for an uninitialised local variable, it does
not designate an object until a value is assigned.
[...]
I don't think that's correct.
In fact, it's wrong. Encountering a declaration 'int a;' certainly
means that in the abstract machine there is an object corresponding
to the identifier 'a'.
Within the scope of a declaration `int a;` the expression `a` is
an lvalue that *does* designate an object.
And more than that: in the abstract machine an object corresponding
to the identifier 'a' comes into existence as soon as the block
containing 'int a;' is entered, regardless of whether 'a' is
initialized or referenced in any way.
Bart <bc@freeuk.com> writes:
On 21/04/2026 21:16, Keith Thompson wrote:
[...]
If I was testing integer arithmetic then I would have to use
variables. Constant expression reduction is too commonly done, and in
some languages (eg. C) it is mandated.
You're *so* close to getting it.
If you want to measure the performance of addition, you have to write
your benchmark code so the addition operator can't be optimized away.
If you don't do that, the results will be meaningless.
If you want to measure the performance of function calls, ...
Bart <bc@freeuk.com> writes:
On 21/04/2026 15:27, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/04/2026 09:53, David Brown wrote:
On 21/04/2026 00:03, Bart wrote:
I took a program ll.c (which is Lua source code in one file, so the
compiler can see the whole program), and replaced the body of main()
with 'exit(0)'. So none of the functions are called. I got these results:
c:\cx>gcc -s -Os ll.c # optimise for size
c:\cx>dir a.exe
21/04/2026 11:10 241,152 a.exe
c:\cx>bcc ll
Compiling ll.c to ll.exe
c:\cx>dir ll.exe
21/04/2026 11:11 237,056 ll.exe
Somehow my bcc-compiled version generated a smaller executable!
Ah, back to the on-disk executable size. An irrelevant
metric. One might expect the 'in-memory' size is interesting
and that's what the -Os option is designed to minimize, not the
number of disk sectors consumed by the executable file.
You don't think there's a correlation between the size of code and
initialised data, and the size of the executable?
A portion of the executable is metadata that never
gets loaded into memory (symbol tables, rtld data
and relocation information, etc.)
You might wish to compare the text section sizes,
Both text and initialised data will take up valuable memory.
$ size bin/test1
text data bss dec hex filename
6783060 85872 1861744 8730676 853834 bin/test1
The text only takes up memory -on demand-. If a code
page is never referenced, it is never loaded into
memory.
The working set size is interesting, but completely unrelated
to the size of the on-disk executable file.
Michael S <already5chosen@yahoo.com> writes:
On Tue, 21 Apr 2026 12:12:28 +0100
Bart <bc@freeuk.com> wrote:
Note 2: I believe these figures are suspect because the requisite
number of calls are not done.
I don't see anything suspect in the -O1 code generated by gcc 14.2.0
Source:
unsigned long long fib(unsigned long long n)
{
if (n < 2)
return 1;
return fib(n-1)+fib(n-2);
}
Tsk, tsk. fibonacci(0) is 0.
Michael S <already5chosen@yahoo.com> writes:
On Tue, 21 Apr 2026 14:49:58 +0100
Bart <bc@freeuk.com> wrote:
On 21/04/2026 13:55, Michael S wrote:
On Tue, 21 Apr 2026 12:12:28 +0100
Bart <bc@freeuk.com> wrote:
Note 2: I believe these figures are suspect because the
requisite number of calls are not done.
I don't see anything suspect in the -O1 code generated by gcc
14.2.0
Source:
unsigned long long fib(unsigned long long n)
{
if (n < 2)
return 1;
return fib(n-1)+fib(n-2);
}
$ gcc -S -O1 -Wall -Wextra -fno-asynchronous-unwind-tables ref0_fib.c
$ cat ref0_fib.s
.file "ref0_fib.c"
.text
.globl fib
.def fib; .scl 2; .type 32; .endef
fib:
movl $1, %eax
cmpq $1, %rcx
jbe .L5
pushq %rsi
pushq %rbx
subq $40, %rsp
movq %rcx, %rbx
leaq -1(%rcx), %rcx
call fib
movq %rax, %rsi
leaq -2(%rbx), %rcx
call fib
addq %rsi, %rax
addq $40, %rsp
popq %rbx
popq %rsi
ret
.L5:
ret
.ident "GCC: (Rev2, Built by MSYS2 project) 14.2.0"
Measured with n=43 on my very old home desktop it gave:
1402817465/2.646 s = 530165330.7 calls/sec
You're right. It was either a different version or I was mistaken.
But it seems that Clang -O1 will generate a version with only a
single fib call. This is the godbolt code for the Fib() version
using "if (n < 3) return 1":
fib:
pushq %r14
pushq %rbx
pushq %rax
movl %edi, %r14d
xorl %ebx, %ebx
cmpl $3, %r14d
jl .LBB0_3
.LBB0_2:
leal -1(%r14), %edi
callq fib
addl %eax, %ebx
addl $-2, %r14d
cmpl $3, %r14d
jge .LBB0_2
.LBB0_3:
incl %ebx
movl %ebx, %eax
addq $8, %rsp
popq %rbx
popq %r14
retq
If I inject an increment to a global counter just after that callq
fib, then it shows only half the expected value.
(This fib version is one-based, so that fib(10) is 55, while yours
I think has it as 89. Google tells me that Fibonacci(10) is 55.)
That looks like tail call elimination. I.e. the compiler turned the
code into:
unsigned long long fib(unsigned long long n)
{
unsigned long long res = 0;
while (n >= 3) {
res += fib(n-1);
n -= 2;
}
return res + 1;
}
gcc generates similar code with -O -foptimize-sibling-calls
For certain styles of coding, e.g. one often preferred by Tim
Rentsch, this optimization is extremely important.
Please don't misrepresent me. The code transformation shown above
is not important to the functional recursive style that I often
employ. Neither of the two recursive calls to fib() in the C
function shown at the top is a tail call.
Bart <bc@freeuk.com> wrote:
On 21/04/2026 01:39, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
An implementation that uses 0 instructions to implement a function call
on a normal machine will get a shorter runtime, so clearly it is faster
at doing function calls. This does not differ much from what
some modern processors do, namely a move instruction may effectively
take 0 cycles. People used to the old ways, when confronted with
movq %rax, %rdx
expect that there will be actual movement of data, that the instruction
must travel the whole CPU pipeline. But modern processors do
register renaming and after looking at this instruction may
simply note that to get the value of %rdx one uses the place storing
%rax (I am using AT&T convention so direction is from %rax to
%rdx) and otherwise drop the instruction. Is the processor
cheating? A naive benchmark where moves are overrepresented may
execute unexpectedly fast, but moves are frequent in real
programs so this gives a valuable speedup for all programs.
Coming back to function calls, consider a programmer who cares
very much about speed. He knows that his program would be
simpler and easier to write if he used a lot of small
functions. In the old days he would worry about the cost of
function calls and he probably would write much bigger and more
complicated functions to get good speed. But if the cost of a
function call is 0 he can freely use small functions, without
worrying about the cost of calls.
If the cost was zero then function inlining wouldn't be a thing.
Inlining is a way to get 0 cost.
I will give you the answer: it is to compare how implementations cope
with very large numbers of recursive function calls. So if one finds a
way to avoid doing such calls, then it is not a fair comparison.
Well, Fibonacci and similar functions have limited use.
They are commonly used as benchmarks. I use them a lot to compare
interpreted and JITed languages, but also need some native code tests
as a reference.
It is the latter that are flawed when using gcc.
I decided to write a reference version in x64 assembly, a
straightforward version that does the requisite number of calls.
To evaluate fib(42), it took 0.85 seconds on my PC, or about 680M
calls/second. gcc-O3 does it, miraculously, at 1270M calls/second.
However that is misleading and unsustainable. I showed in my last post
how, if the calls are split across modules using three fib() functions
that call each other, it can only manage 570M calls/second.
Meanwhile the versions that don't cheat can maintain the same throughput.
For curiosity I tried several different variants of a Fibonacci-like
benchmark. The first version is what you apparently expect, that is a
single function that contains the machine instructions that you expect.
Then a version which corresponds to C code like:
long
fib45(long n) {
return fib44(n - 1) + fib43(n - 2);
}
....
long
fib0(long n) {
return 0;
}
This version makes all control transfers perfectly predictable
and avoids conditionals. Then a version which does not bother
actually passing parameters, but still does the stack adjustments
and computes the value. Then a version which performs all required
calls but does not bother with computing the value and the stack
adjustments. Finally a version which replaces the final call by a jump.
All versions using calls had similar speed, nominally about 2.4
clocks per call with about 10% variation (I did not investigate
what caused the variation). The version using jumps was significantly
faster, needing about 1.4 clocks per call. The following
version allows tail calls without inlining:
long
fibk(long n, long acc) {
if (n < 2) {
return n + acc;
}
return fibk(n - 1, fibk(n - 2, acc));
}
(start with acc equal to 0).
This one needs about 1.6 clocks per call. Note that this one
executes instructions essentially in the same sequence as the naive
version, only approximately half of the calls are replaced by
jumps and consequently approximately half of the returns are gone.
AFAICS a reasonable interpretation of the results above is that
jumps (even conditional ones) are cheaper than calls and
returns. And it seems that calls and returns are cheap
as long as you do not have too many of them. That is,
you can execute a lot of instructions in parallel with
a call or return, but each call and return introduces
latency (probably 1 clock in the good case, more if the predictors
do not manage to speed it up).
I've anyway counted the calls that gcc-O3 does make and it is a lot
fewer than needed (95% less IIRC). It is achieved via complex inlining
and use of TCO from what I can see.
OK, tried that one too. It needs 0.93 clocks per nominal "call".
I tried also
long
fibk(long n, long acc) {
if (n == 2) {
return 1 + acc;
}
if (n < 2) {
return n + acc;
}
return fibk(n - 1, fibk(n - 2, acc));
}
It needs 1.02 clocks per "call". Of course it needs fewer calls
than the version without the special case for 'n == 2', which is why
I put "call" in quotes. This version could be produced by first
introducing a special case for 'n == 2', that is
long
fibk(long n, long acc) {
if (n == 2) {
return fibk(n - 1, fibk(n - 2, acc));
}
if (n < 2) {
return n + acc;
}
return fibk(n - 1, fibk(n - 2, acc));
}
then replacing n by its known value and replacing calls to fibk
with a known first argument by their value. gcc is doing a different
thing, but clearly "not optimizable" does not hold for the
Fibonacci function.
So the
real question is what is the cost of function calls in actual
programs. For calls to small non-recursive functions the cost is
close to 0. Recursion makes optimization more tricky,
so it increases the cost. But still, in practice the cost is lower
than one could naively expect.
Concerning fairness, AFAIK gcc optimizations were developed to
speed up real programs. They speed up Fibonacci basically as
a side effect.
I suspect some time was spent on Fibonacci too!
Possible. But the main motivation were C++ methods and related
coding style. Minor motivation was to allow functional style
of programming. Consider silly (completely untested) code below:
struct node {struct node * next; int data;};
long
length0(struct node * n, long acc) {
if (!n) {
return acc;
} else {
return length0(n->next, acc + 1);
}
}
long
length(struct node * n) {
return length0(n, 0);
}
Let me say that there are people who prefer code like this. Now
it is very easy to turn this into a loop, completely avoiding
recursion. Functional programming folks call code like this
"iterative" and demand that the compiler use fixed stack space
regardless of the number of calls. If you use C as an intermediate
language for functional languages, then it is tricky to satisfy
this property except when the C compiler implements this
optimization.
So IMO it is fair: a compiler that cannot speed
up calls in Fibonacci probably will have trouble speeding up
calls at least in some real programs.
Speeding up calls = avoiding making those calls?
Avoiding emiting call instructions.
Bart <bc@freeuk.com> writes:
On 21/04/2026 21:16, Keith Thompson wrote:
[...]
Would you insist that a program that computes fib(10) executes
exactly 177 "call" instructions
For comparing function calls across languages and implementations via
that benchmark, then yes.
I didn't say the program was a benchmark. I didn't say what the
purpose of the program was. And guess what, the compiler *doesn't
know* it's a benchmark, so it doesn't disable optimizations for the
sake of measuring something.
If I write a C program with a fib() function (never mind that
the naive recursive algorithm is horrible), and it calls fib(10)
because it needs to know the value of fib(10), would you insist that
it executes exactly 177 "call" instructions?
If I was testing integer arithmetic then I would have to use
variables. Constant expression reduction is too commonly done, and in
some languages (eg. C) it is mandated.
You're *so* close to getting it.
If you want to measure the performance of addition, you have to write
your benchmark code so the addition operator can't be optimized away.
If you don't do that, the results will be meaningless.
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined on
all my likely targets.
That's perfectly fine if (for example) your language implies a default
initialization semantics. (Simula, e.g., has such a semantic defined;
declared (instantiated) integer variables have the value 0.) - But "C"
does not!
(If you are again trying to project your language's "design decisions"
onto "C" I really suggest to stop that nonsense since it doesn't lead anywhere.)
antispam@fricas.org (Waldek Hebisch) writes:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
When you use "b" in an expression, you are /not/ asking C to read
the bits and bytes stored at the address of the object "b". You
are asking for the /value/ of the object "b". How the compiler
gets that value is up to the compiler - it can read the memory, or
use a stored copy in a register, or use program analysis to know
what the value is in some other way. And if the object "b" does
not have a value, you are asking the impossible.
Try asking a human "You have two numbers, b and c. Add them.
What is the answer?".
You have two slates A and B which someone should have wiped clean
then written a new number on each.
But that part hasn't been done; they each still have an old number
from their last use.
You can still add them together, nothing bad will happen. It just
may be the wrong answer if the purpose of the exercise was to find
the sum of two specific new numbers.
But the purpose may also be to see how good they are at adding, or at
following instructions.
whatever it happens to be, add the value of 'c' scaled by 8, and
store the result into 'a'. The only things to consider are
that some intermediate results may lose the top bits.
Is 'a = b' equally undefined? If so, then C is even crazier than
I'd thought.
If "a" or "b" are indeterminate, then using them is undefined. I
have two things - are they the same colour? How is that supposed
to make sense?
You keep thinking of objects like "b" as a section of memory with
a bit pattern in it. Objects are not that simple in C - C is not
assembly.
Why ISN'T it that simple? What ghastly thing would happen if it
was?
"b" will be some location in memory or it might be some register,
and it WILL have a value. That value happens to be unknown until
it is initialised.
So accessing it will return garbage (unless you know exactly what
you are doing then it may be something useful).
My original example was something like 'a = b + c' (I think in my
language), converted to my IL, then expressed in very low-level C.
You were concerned that in that C, the values weren't initialised.
How would that have affected the code that C compiler generated
from that?
You look at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined
on all my likely targets.
That's perfectly fine if (for example) your language implies a
default initialization semantics. (Simula, e.g., has such a
semantic defined; declared (instantiated) integer variables have
the value 0.) - But "C" does not!
I'm sure that was foremost in the designers' minds when they created
C in 1972. It wasn't retrofitted into the spec years later at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design
decisions" onto "C" I really suggest to stop that nonsense since it
doesn't lead anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
c:\cx>gcc -c t.c
c:\cx>gcc -c -O3 t.c
c:\cx>gcc -c -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra -Wno-uninitialized t.c
It either ignores it or it warns about it. Or you can optimise the
program and tell it to ignore it - apparently you call the shots.
(I notice it says nothing about 'a' not being used.)
So, what does language say about it again? Remind me! Or better, tell
the compiler.
What about this version:
void F() {
int a, b;
// b = G();
a = b;
}
This is part of a larger program, but G hasn't been written yet, and
this function will not be run in any tests until it has. Should gcc
swamp you with pointless warnings?
On Wed, 22 Apr 2026 15:14:39 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined
on all my likely targets.
That's perfectly fine if (for example) your language implies a
default initialization semantics. (Simula, e.g., has such a
semantic defined; declared (instantiated) integer variables have
the value 0.) - But "C" does not!
I'm sure that was foremost in the designers' minds when they created
C in 1972. It wasn't retrofitted into the spec years later at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design
decisions" onto "C" I really suggest to stop that nonsense since it
doesn't lead anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
c:\cx>gcc -c t.c
c:\cx>gcc -c -O3 t.c
c:\cx>gcc -c -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra -Wno-uninitialized t.c
It either ignores it or it warns about it. Or you can optimise the
program and tell it to ignore it - apparently you call the shots.
(I notice it says nothing about 'a' not being used.)
So, what does language say about it again? Remind me! Or better, tell
the compiler.
What about this version:
void F() {
int a, b;
// b = G();
a = b;
}
This is part of a larger program, but G hasn't been written yet, and
this function will not be run in any tests until it has. Should gcc
swamp you with pointless warnings?
C warns you, but it does not stop. So, you can ignore a warning or even disable it, if you happen to know a suitable spell.
Rust is not as tolerant, even in cases of less serious nonobservance.
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
On 22/04/2026 16:56, Michael S wrote:
On Wed, 22 Apr 2026 15:14:39 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined
on all my likely targets.
That's perfectly fine if (for example) your language implies a
default initialization semantics. (Simula, e.g., has such a
semantic defined; declared (instantiated) integer variables have
the value 0.) - But "C" does not!
I'm sure that was foremost in the designers' minds when they
created C in 1972. It wasn't retrofitted into the spec years later
at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design
decisions" onto "C" I really suggest to stop that nonsense since
it doesn't lead anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
c:\cx>gcc -c t.c
c:\cx>gcc -c -O3 t.c
c:\cx>gcc -c -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra -Wno-uninitialized t.c
It either ignores it or it warns about it. Or you can optimise the
program and tell it to ignore it - apparently you call the shots.
(I notice it says nothing about 'a' not being used.)
So, what does language say about it again? Remind me! Or better,
tell the compiler.
What about this version:
void F() {
int a, b;
// b = G();
a = b;
}
This is part of a larger program, but G hasn't been written yet,
and this function will not be run in any tests until it has.
Should gcc swamp you with pointless warnings?
C warns you, but it does not stop. So, you can ignore a warning or
even disable it, if you happen to know a suitable spell.
Rust is not as tolerant, even in cases of less serious
nonobservance.
"C" does not warn you about anything here - that's up to C compilers.
Some mistakes in your C code require a diagnostic (for conforming compilers), but not this one.
gcc will, of course, treat the use of the uninitialised variable as
an error halting compilation if you know how to use it properly. It
will also give a warning that "a" is unused, if you know how to use
it properly. Bart knows how to use gcc with flags to set standards conformance, warnings, etc., and he knows the difference between C
the language and particular compiler implementations, but he thinks
it is fun to pretend he does not.
Rust is not fundamentally better than C here - it is simply that Rust
is a relatively new language without the historical baggage of
existing questionable quality Rust code. So Rust tools were able to
have better checking for this kind of thing out of the gate. gcc
/could/ make "-Werror=uninitialized" the default setting, but that
would cause problems with some existing code. gcc has gradually
expanded on the selection of warnings it enables by default, but it's
a slow process to avoid upsetting people with large code bases that
trigger such warnings.
Bart <bc@freeuk.com> writes:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
Implementation specific. Depending on how the linker
and run-time loader handled uninitialized data regions
in the a.out file and when loading.
Some may have initialized to zero, others may have initialized
to some other data pattern (e.g. 0xdeadbeef) to catch
uninitialized pointer dereferences (particularly since early
unix systems often would return zero on a load from a NULL
pointer rather than trapping the access).
The key point is that portable C code could make no
assumptions about uninitialized data accesses as
the existing implementations differed. Hence, UB.
IMO, most "undefined behavior" in the C specification was
due to implementation differences between the C compilers/linkers
that existed at the time.
On 22/04/2026 05:09, Tim Rentsch wrote:
<snip>
antispam@fricas.org (Waldek Hebisch) writes:
You are looking at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
You are looking at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
On Wed, 22 Apr 2026 17:12:00 +0200
David Brown <david.brown@hesbynett.no> wrote:
On 22/04/2026 16:56, Michael S wrote:
On Wed, 22 Apr 2026 15:14:39 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined
on all my likely targets.
That's perfectly fine if (for example) your language implies a
default initialization semantics. (Simula, e.g., has such a
semantic defined; declared (instantiated) integer variables have
the value 0.) - But "C" does not!
I'm sure that was foremost in the designers' minds when they
created C in 1972. It wasn't retrofitted into the spec years later
at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design
decisions" onto "C" I really suggest to stop that nonsense since
it doesn't lead anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
c:\cx>gcc -c t.c
c:\cx>gcc -c -O3 t.c
c:\cx>gcc -c -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized [-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra -Wno-uninitialized t.c
It either ignores it or it warns about it. Or you can optimise the
program and tell it to ignore it - apparently you call the shots.
(I notice it says nothing about 'a' not being used.)
So, what does language say about it again? Remind me! Or better,
tell the compiler.
What about this version:
void F() {
int a, b;
// b = G();
a = b;
}
This is part of a larger program, but G hasn't been written yet,
and this function will not be run in any tests until it has.
Should gcc swamp you with pointless warnings?
C warns you, but it does not stop. So, you can ignore a warning or
even disable it, if you happen to know a suitable spell.
Rust is not as tolerant, even in cases of less serious
nonobservance.
"C" does not warn you about anything here - that's up to C compilers.
Some mistakes in your C code require a diagnostic (for conforming
compilers), but not this one.
gcc will, of course, treat the use of the uninitialised variable as
an error halting compilation if you know how to use it properly. It
will also give a warning that "a" is unused, if you know how to use
it properly. Bart knows how to use gcc with flags to set standards
conformance, warnings, etc., and he knows the difference between C
the language and particular compiler implementations, but he thinks
it is fun to pretend he does not.
Rust is not fundamentally better than C here - it is simply that Rust
is a relatively new language without the historical baggage of
existing questionable quality Rust code. So Rust tools were able to
have better checking for this kind of thing out of the gate. gcc
/could/ make "-Werror=uninitialized" the default setting, but that
would cause problems with some existing code. gcc has gradually
expanded on the selection of warnings it enables by default, but it's
a slow process to avoid upsetting people with large code bases that
trigger such warnings.
The intention of my post was to claim that C is better than Rust.
Unless you didn't pay attention yet, I am rustophobic.
Bart <bc@freeuk.com> writes:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
Implementation specific. Depending on how the linker
and run-time loader handled uninitialized data regions
in the a.out file and when loading.
On 22/04/2026 17:21, Michael S wrote:
On Wed, 22 Apr 2026 17:12:00 +0200
David Brown <david.brown@hesbynett.no> wrote:
On 22/04/2026 16:56, Michael S wrote:
On Wed, 22 Apr 2026 15:14:39 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well
defined on all my likely targets.
That's perfectly fine if (for example) your language implies a
default initialization semantics. (Simula, e.g., has such a
semantic defined; declared (instantiated) integer variables have
the value 0.) - But "C" does not!
I'm sure that was foremost in the designers' minds when they
created C in 1972. It wasn't retrofitted into the spec years
later at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design
decisions" onto "C" I really suggest to stop that nonsense since
it doesn't lead anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
c:\cx>gcc -c t.c
c:\cx>gcc -c -O3 t.c
c:\cx>gcc -c -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized
[-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized
[-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra -Wno-uninitialized t.c
It either ignores it or it warns about it. Or you can optimise
the program and tell it to ignore it - apparently you call the
shots.
(I notice it says nothing about 'a' not being used.)
So, what does language say about it again? Remind me! Or better,
tell the compiler.
What about this version:
void F() {
int a, b;
// b = G();
a = b;
}
This is part of a larger program, but G hasn't been written yet,
and this function will not be run in any tests until it has.
Should gcc swamp you with pointless warnings?
C warns you, but it does not stop. So, you can ignore a warning or
even disable it, if you happen to know a suitable spell.
Rust is not as tolerant, even in cases of less serious
nonobservance.
"C" does not warn you about anything here - that's up to C
compilers. Some mistakes in your C code require a diagnostic (for
conforming compilers), but not this one.
gcc will, of course, treat the use of the uninitialised variable as
an error halting compilation if you know how to use it properly.
It will also give a warning that "a" is unused, if you know how to
use it properly. Bart knows how to use gcc with flags to set
standards conformance, warnings, etc., and he knows the difference
between C the language and particular compiler implementations,
but he thinks it is fun to pretend he does not.
Rust is not fundamentally better than C here - it is simply that
Rust is a relatively new language without the historical baggage of
existing questionable quality Rust code. So Rust tools were able
to have better checking for this kind of thing out of the gate.
gcc /could/ make "-Werror=uninitialized" the default setting, but
that would cause problems with some existing code. gcc has
gradually expanded on the selection of warnings it enables by
default, but it's a slow process to avoid upsetting people with
large code bases that trigger such warnings.
The intention of my post was to claim that C is better than Rust.
Unless you didn't pay attention yet, I am rustophobic.
OK - that did not come across to me from your post. I think it is a
good thing if a language or tool is intolerant to mistakes like
these. But sometimes stricter rules and checking mean less
flexibility, so there can be trade-offs. My knowledge of Rust is far
too limited to know what it does there, and if defaults can be
overridden.
On Wed, 22 Apr 2026 17:57:11 +0200
David Brown <david.brown@hesbynett.no> wrote:
On 22/04/2026 17:21, Michael S wrote:
On Wed, 22 Apr 2026 17:12:00 +0200
David Brown <david.brown@hesbynett.no> wrote:
On 22/04/2026 16:56, Michael S wrote:
On Wed, 22 Apr 2026 15:14:39 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well
defined on all my likely targets.
That's perfectly fine if (for example) your language implies a
default initialization semantics. (Simula, e.g., has such a
semantic defined; declared (instantiated) integer variables have
the value 0.) - But "C" does not!
I'm sure that was foremost in the designers' minds when they
created C in 1972. It wasn't retrofitted into the spec years
later at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design
decisions" onto "C" I really suggest to stop that nonsense since >>>>>>> it doesn't lead anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
c:\cx>gcc -c t.c
c:\cx>gcc -c -O3 t.c
c:\cx>gcc -c -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized
[-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra t.c
t.c:3:5: warning: 'b' is used uninitialized
[-Wuninitialized]
c:\cx>gcc -c -O3 -Wextra -Wno-uninitialized t.c
It either ignores it or it warns about it. Or you can optimise
the program and tell it to ignore it - apparently you call the
shots.
(I notice it says nothing about 'a' not being used.)
So, what does language say about it again? Remind me! Or better,
tell the compiler.
What about this version:
void F() {
int a, b;
// b = G();
a = b;
}
This is part of a larger program, but G hasn't been written yet,
and this function will not be run in any tests until it has.
Should gcc swamp you with pointless warnings?
C warns you, but it does not stop. So, you can ignore a warning or
even disable it, if you happen to know a suitable spell.
Rust is not as tolerant, even in cases of less serious
nonobservance.
"C" does not warn you about anything here - that's up to C
compilers. Some mistakes in your C code require a diagnostic (for
conforming compilers), but not this one.
gcc will, of course, treat the use of the uninitialised variable as
an error halting compilation if you know how to use it properly.
It will also give a warning that "a" is unused, if you know how to
use it properly. Bart knows how to use gcc with flags to set
standards conformance, warnings, etc., and he knows the difference
between C the language and particular compiler implementations,
but he thinks it is fun to pretend he does not.
Rust is not fundamentally better than C here - it is simply that
Rust is a relatively new language without the historical baggage of
existing questionable quality Rust code. So Rust tools were able
to have better checking for this kind of thing out of the gate.
gcc /could/ make "-Werror=uninitialized" the default setting, but
that would cause problems with some existing code. gcc has
gradually expanded on the selection of warnings it enables by
default, but it's a slow process to avoid upsetting people with
large code bases that trigger such warnings.
The intention of my post was to claim that C is better than Rust.
Unless you didn't pay attention yet, I am rustophobic.
OK - that did not come across to me from your post. I think it is a
good thing if a language or tool is intolerant to mistakes like
these. But sometimes stricter rules and checking mean less
flexibility, so there can be trade-offs. My knowledge of Rust is far
too limited to know what it does there, and if defaults can be
overridden.
My comment was referring to "less serious nonobservance", mostly
I meant unused local variables.
On 22/04/2026 03:53, Keith Thompson wrote:
[...]
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
Even though both give the same result.
On 21/04/2026 20:51, Chris M. Thomasson wrote:
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when you
want to stick to standard C is use of "volatile". Use a volatile
read at the start of your code, then calculations that depend on each
other and that first read, then a volatile write of the result. That
gives minimal intrusion in the code while making sure the
calculations have to be generated, and have to be done at run time.
If you are testing on a particular compiler (like gcc or clang), then
there are other options. The "noinline" function attribute is very
handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1 memory
cache (for things that are not always in registers), then this
flushes that cache:
ÿÿÿÿÿasm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at this
point in time (so that its value can be passed to the assembly) :
ÿÿÿÿÿasm volatile ("" :: "" (x));
This tells the compiler that "x" might be changed by the assembly, so
it must forget any additional knowledge it had of it :
ÿÿÿÿÿasm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or
test code. They can be helpful in some kinds of interactions between
low level code and hardware.
Well, we have to make a difference between a compiler barrier and a
memory barrier. All memory barriers should be compiler barriers, but
compiler barriers do not have to be memory barriers... Fair enough?
Of course there is a difference between memory barriers and compiler
barriers. We are talking about compiler barriers here, because they
have an effect on the semantics of the language (in this case, the
language is "C with gcc extensions") without the cost of real memory
barriers. C11 atomic fences are compiler and memory barriers, but they
can have a huge effect on code speed - these empty assembly statements
are aimed at having minimal impact outside of the intended effects.
On 4/21/2026 1:13 PM, David Brown wrote:
On 21/04/2026 20:51, Chris M. Thomasson wrote:
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when
you want to stick to standard C is use of "volatile". Use a
volatile read at the start of your code, then calculations that
depend on each other and that first read, then a volatile write of
the result. That gives minimal intrusion in the code while making
sure the calculations have to be generated, and have to be done at
run time.
If you are testing on a particular compiler (like gcc or clang),
then there are other options. The "noinline" function attribute is
very handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1 memory
cache (for things that are not always in registers), then this
flushes that cache:
ÿÿÿÿÿasm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at this
point in time (so that its value can be passed to the assembly) :
ÿÿÿÿÿasm volatile ("" :: "" (x));
This tells the compiler that "x" might be changed by the assembly,
so it must forget any additional knowledge it had of it :
ÿÿÿÿÿasm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or
test code. They can be helpful in some kinds of interactions
between low level code and hardware.
Well, we have to make a difference between a compiler barrier and a
memory barrier. All memory barriers should be compiler barriers, but
compiler barriers do not have to be memory barriers... Fair enough?
Of course there is a difference between memory barriers and compiler
barriers. We are talking about compiler barriers here, because they
have an effect on the semantics of the language (in this case, the
language is "C with gcc extensions") without the cost of real memory
barriers. C11 atomic fences are compiler and memory barriers, but
they can have a huge effect on code speed - these empty assembly
statements are aimed at having minimal impact outside of the intended
effects.
I think a relaxed memory barrier can be used as a compiler barrier and
be compatible with atomic, volatile does not have to be used here?
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined
on all my likely targets.
That's perfectly fine if (for example) your language implies a default
initialization semantics. (Simula, e.g., has such a semantic defined;
declared (instantiated) integer variables have the value 0.) - But "C"
does not!
I'm sure that was foremost in the designers' minds when they created C
in 1972. It wasn't retrofitted into the spec years later at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design decisions"
onto "C" I really suggest to stop that nonsense since it doesn't lead
anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
So, what does language say about it again? Remind me! Or better, tell
the compiler.
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined
on all my likely targets.
That's perfectly fine if (for example) your language implies a default
initialization semantics. (Simula, e.g., has such a semantic defined;
declared (instantiated) integer variables have the value 0.) - But "C"
does not!
I'm sure that was foremost in the designers' minds when they created C
in 1972. It wasn't retrofitted into the spec years later at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design decisions"
onto "C" I really suggest to stop that nonsense since it doesn't lead
anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
Come on, Bart, you already know this stuff.
The behavior of `a = b;` is undefined. You know what "undefined
behavior" means. You know that C implementations are not required
to diagnose undefined behavior.
You know that, since a and b are local to the function and their
values are never used, a compiler could generate machine code for F()
as an empty function. (I do not claim that any particular compiler
does or does not perform this optimization.)
[...]
So, what does language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it. I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined. N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this. Why do you pretend?
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote:
[...]
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
OK, what am I getting close to?
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
Even though both give the same result.
Of course they can be reliably compared. One is much faster than
the other. That's a reliable comparison.
[...]
Bart <bc@freeuk.com> writes:
So, what does language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it. I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined. N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this. Why do you pretend?
Nobody seems to have a problem with gcc being lax about this (or with
it allowing its users to let it be lax).
Everybody seems to have a problem with /me/ being lax about it.
Does anyone have any actual examples of very bad things happening with
a program like the above?
From what I can see, with -O0 it just moves 32 bits from one part of
the allocated stack frame to another. And with -O1 and above, the code
is elided anyway.
Not exactly the end of the world.
On 22/04/2026 22:23, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote: [...]
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
OK, what am I getting close to?
That optimisation renders some results meaningless, but then ...
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
... here you say the opposite of 'If you don't do that, the results
will be meaningless'.
Even though both give the same result.
Of course they can be reliably compared. One is much faster than
the other. That's a reliable comparison.
Ha, ha, ha!
Remind me never to take any benchmark of yours seriously.
You seem to be more interested in pedantry than anything else.
So, taking one like this:
long long int sum=0;
for (int j=0; j<10; ++j)
for (int i=0; i<2000000000; ++i) sum+=i;
printf("%lld\n", sum);
With gcc-O0, this takes 50 seconds. With gcc-O3, it takes 0.005 seconds.
According to you, gcc managed to make this program 10,000 times faster?
Do 1000 repeats of the inner loop instead, and gcc-O3 would amazingly
speed it up by a million times.
From your previous remarks, you'd consider that a fair and accurate assessment (remember that 0.1s outlier figure).
Obviously, gcc has elided the entire program here so that the timing
is essentially zero (the 5ms is process overhead).
It's not like you can now apply that 1000000x speed-up to real
programs (who needs quantum computers!).
(If you are interested, which I doubt, it takes 6-7 seconds to run
that program using optimised code that actually does the task, which
is 20 billion iterations of 'sum+=i'.)
Bart <bc@freeuk.com> writes:
[...]
So, what does the language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it. I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined. N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this. Why do you pretend?
Nobody seems to have a problem with gcc being lax about this (or with
it allowing its users to let it be lax).
gcc is not being lax. gcc is behaving in a manner that is consistent
with the requirements of the C standard. The code in question has
undefined behavior.
You know and understand all of that.
Everybody seems to have a problem with /me/ being lax about it.
Not everybody, but I certainly do.
Does anyone have any actual examples of very bad things happening with
a program like the above?
From what I can see, with -O0 it just moves 32 bits from one part of
the allocated stack frame to another. And with -O1 and above, the code
is elided anyway.
Not exactly the end of the world.
The behavior is undefined. You know exactly what that means, but you
pretend not to.
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
When you use "b" in an expression, you are /not/ asking C to read
the bits and bytes stored at the address of the object "b". You
are asking for the /value/ of the object "b". How the compiler
gets that value is up to the compiler - it can read the memory, or
use a stored copy in a register, or use program analysis to know
what the value is in some other way. And if the object "b" does
not have a value, you are asking the impossible.
Try asking a human "You have two numbers, b and c. Add them.
What is the answer?".
You have two slates A and B which someone should have wiped clean
then written a new number on each.
But that part hasn't been done; they each still have an old number
from their last use.
You can still add them together, nothing bad will happen. It just
may be the wrong answer if the purpose of the exercise was to find
the sum of two specific new numbers.
But the purpose may also be see how good they are adding. Or in
following instructions.
whatever it happens to be, add the value of 'c' scaled by 8, and
store the result into 'a'. The only things to consider are
that some intermediate results may lose the top bits.
Is 'a = b' equally undefined? If so, then C is even crazier than
I'd thought.
If "a" or "b" are indeterminate, then using them is undefined. I
have two things - are they the same colour? How is that supposed
to make sense?
You keep thinking of objects like "b" as a section of memory with
a bit pattern in it. Objects are not that simple in C - C is not
assembly.
Why ISN'T it that simple? What ghastly thing would happen if it
was?
"b" will be some location in memory or it might be some register,
and it WILL have a value. That value happens to be unknown until
it is initialised.
So accessing it will return garbage (unless you know exactly what
you are doing then it may be something useful).
My original example was something like 'a = b + c' (I think in my
language), converted to my IL, then expressed in very low-level C.
You were concerned that in that C, the values weren't initialised.
How would that have affected the code that C compiler generated
from that?
You look at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
On Wed, 22 Apr 2026 15:16:56 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
You look at a trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
My record at guessing exact meaning of Tim's statements is not
particularly good, but I'll try nevertheless.
Tim seems to suggest that function foo() below had defined behavior
(most likely of returning 1) in C90 and C99, then it became undefined in
C11 and C17 then again became defined in C23.
For years 1972 to 1989 Tim probably thinks that there is no sufficient
data to answer your question.
Bart <bc@freeuk.com> writes:
[...]
Bart <bc@freeuk.com> writes:
So, what does the language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it. I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined. N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this. Why do you pretend?
Nobody seems to have a problem with gcc being lax about this (or with
it allowing its users to let it be lax).
gcc is not being lax. gcc is behaving in a manner that is consistent
with the requirements of the C standard. The code in question has
undefined behavior.
You know and understand all of that.
Everybody seems to have a problem with /me/ being lax about it.
Not everybody, but I certainly do.
Does anyone have any actual examples of very bad things happening with
a program like the above?
From what I can see, with -O0 it just moves 32 bits from one part of
the allocated stack frame to another. And with -O1 and above, the code
is elided anyway.
Not exactly the end of the world.
The behavior is undefined. You know exactly what that means, but you
pretend not to.
On Wed, 22 Apr 2026 15:13:56 +0000, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
A
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
Implementation specific. Depending on how the linker
and run-time loader handled uninitialized data regions
in the a.out file and when loading.
K&R is very specific about the initial value of automatic
variables:
1.10 Scope; External Variables
...
"Because automatic variables come and go with
function invocation, they do not retain their
values from one call to the next, and must be
explicitly set upon each entry. If they are
not set, they will contain garbage."
...
2.4 Declarations
...
"Automatic variables for which there is no
explicit initializer have undefined (i.e.
garbage) values."
...
4.9 Initialization
...
"In the absence of explicit initialization,
external and static variables are guaranteed
to be initialized to zero; automatic and
register variables have undefined (i.e. garbage)
values."
...
8.6 Initialization
...
"Static and external variables which are not
initialized are guaranteed to start off
as 0, automatic and register variables which
are not initialized are guaranteed to start
off as garbage."
...
So, for automatic and register variables at least,
even K&R defined that, before initialization, their
values were undefined.
On 22/04/2026 22:23, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote: [...]
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
OK, what am I getting close to?
That optimisation renders some results meaningless, but then ...
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
... here you say the opposite of 'If you don't do that, the results will
be meaningless'.
Even though both give the same result.
Of course they can be reliably compared. One is much faster than
the other. That's a reliable comparison.
Ha, ha, ha!
Remind me never to take any benchmark of yours seriously.
You seem to be more interested in pedantry than anything else.
So, taking one like this:
    long long int sum=0;
    for (int j=0; j<10; ++j)
        for (int i=0; i<2000000000; ++i) sum+=i;
    printf("%lld\n", sum);
With gcc-O0, this takes 50 seconds. With gcc-O3, it takes 0.005 seconds.
According to you, gcc managed to make this program 10,000 times faster?
On 4/22/2026 2:28 PM, Chris M. Thomasson wrote:
On 4/21/2026 1:13 PM, David Brown wrote:
On 21/04/2026 20:51, Chris M. Thomasson wrote:
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when
you want to stick to standard C is use of "volatile". Use a
volatile read at the start of your code, then calculations that
depend on each other and that first read, then a volatile write of
the result. That gives minimal intrusion in the code while making
sure the calculations have to be generated, and have to be done at
run time.
If you are testing on a particular compiler (like gcc or clang),
then there are other options. The "noinline" function attribute is
very handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1
memory cache (for things that are not always in registers), then
this flushes that cache:
     asm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at
this point in time (so that its value can be passed to the assembly):
     asm volatile ("" :: "" (x));
This tells the compiler that "x" might be changed by the assembly,
so it must forget any additional knowledge it had of it:
     asm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or
test code. They can be helpful in some kinds of interactions
between low level code and hardware.
Well, we have to make a difference between a compiler barrier and a
memory barrier. All memory barriers should be compiler barriers, but
compiler barriers do not have to be memory barriers... Fair enough?
Of course there is a difference between memory barriers and compiler
barriers. We are talking about compiler barriers here, because they
have an effect on the semantics of the language (in this case, the
language is "C with gcc extensions") without the cost of real memory
barriers. C11 atomic fences are compiler and memory barriers, but
they can have a huge effect on code speed - these empty assembly
statements are aimed at having minimal impact outside of the intended
effects.
I think a relaxed memory barrier can be used as a compiler barrier and
be compatible with atomic, volatile does not have to be used here?
load/store with relaxed should act like compiler barriers?
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:36, Janis Papanagnou wrote:
On 2026-04-20 13:45, Bart wrote:
My language allows you to do this:
    int a, b
    a := b
It is well-defined in the language, and I know it is well defined
on all my likely targets.
That's perfectly fine if (for example) your language implies a default
initialization semantics. (Simula, e.g., has such a semantic defined;
declared (instantiated) integer variables have the value 0.) - But "C"
does not!
I'm sure that was foremost in the designers' minds when they created C
in 1972. It wasn't retrofitted into the spec years later at all.
- Haven't you two been talking about "C" all the time?
(If you are again trying to project your language's "design decisions"
onto "C" I really suggest to stop that nonsense since it doesn't lead
anywhere.)
It seems to be fine in C too according to observation:
c:\cx>type t.c
void F() {
int a, b;
a = b;
}
Come on, Bart, you already know this stuff.
The behavior of `a = b;` is undefined. You know what "undefined
behavior" means. You know that C implementations are not required
to diagnose undefined behavior.
You know that, since a and b are local to the function and their
values are never used, a compiler could generate machine code for F()
as an empty function. (I do not claim that any particular compiler
does or does not perform this optimization.)
On Wed, 2026-04-22 at 18:59 -0700, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
Bart <bc@freeuk.com> writes:
So, what does the language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it. I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined. N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this. Why do you pretend?
Nobody seems to have a problem with gcc being lax about this (or with
it allowing its users to let it be lax).
gcc is not being lax. gcc is behaving in a manner that is consistent
with the requirements of the C standard. The code in question has
undefined behavior.
You know and understand all of that.
Everybody seems to have a problem with /me/ being lax about it.
Not everybody, but I certainly do.
Does anyone have any actual examples of very bad things happening with
a program like the above?
From what I can see, with -O0 it just moves 32 bits from one part of
the allocated stack frame to another. And with -O1 and above, the code
is elided anyway.
Not exactly the end of the world.
The behavior is undefined. You know exactly what that means, but you
pretend not to.
1. 'The language' must see the 'C program' as it is, i.e. every component in
this case must map to some assembly code (or 'portable assembly').
2. 'Optimization' is a higher-level concept, nothing to do with 'the language'.
3. If the code is defined as undefined, why can it be justified to optimize?
So, 'undefined' is just the C standard's concept, maybe about the compiler spec...
Because the real thing is that the development of C has always been bottom-up.
On 22/04/2026 18:09, Lew Pitcher wrote:
On Wed, 22 Apr 2026 15:13:56 +0000, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
Bart <bc@freeuk.com> wrote:
On 19/04/2026 20:32, David Brown wrote:
A
On 19/04/2026 19:47, Bart wrote:
Get the value of 'b',
You can't do that. "b" has no value. "b" is indeterminate, and
using its value is UB - the code has no meaning right out of the
gate.
The kinds of behavior Bart is asking about have been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
Implementation specific. Depending on how the linker
and run-time loader handled uninitialized data regions
in the a.out file and when loading.
K&R is very specific about the initial value of automatic
variables:
   1.10 Scope; External Variables
        ...
        "Because automatic variables come and go with
         function invocation, they do not retain their
         values from one call to the next, and must be
         explicitly set upon each entry. If they are
         not set, they will contain garbage."
         ...
   2.4  Declarations
        ...
        "Automatic variables for which there is no
         explicit initializer have undefined (i.e.
         garbage) values."
         ...
   4.9 Initialization
         ...
         "In the absence of explicit initialization,
          external and static variables are guaranteed
          to be initialized to zero; automatic and
          register variables have undefined (i.e. garbage)
          values."
         ...
   8.6 Initialization
         ...
         "Static and external variables which are not
          initialized are guaranteed to start off
          as 0, automatic and register variables which
          are not initialized are guaranteed to start
          off as garbage."
         ...
So, for automatic and register variables at least,
even K&R defined that, before initialization, their
values were undefined.
I don't see the use of uninitialised variables being undefined here.
It just says their values are garbage (thus unspecified values, or
possibly trap values). Indeed, it says they are /guaranteed/ to be
garbage, which is a strange turn of phrase - it could be interpreted
to mean an implementation is not allowed to zero-initialise them even
if it wanted to.
There's no doubt that use of the values of uninitialised local variables
has been a bad idea - incorrect code - since early C. But UB is not
just a case of "nothing good will happen".
On 22/04/2026 23:29, Chris M. Thomasson wrote:
On 4/22/2026 2:28 PM, Chris M. Thomasson wrote:
On 4/21/2026 1:13 PM, David Brown wrote:
On 21/04/2026 20:51, Chris M. Thomasson wrote:
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when
you want to stick to standard C is use of "volatile". Use a
volatile read at the start of your code, then calculations that
depend on each other and that first read, then a volatile write of
the result. That gives minimal intrusion in the code while making
sure the calculations have to be generated, and have to be done at
run time.
If you are testing on a particular compiler (like gcc or clang),
then there are other options. The "noinline" function attribute
is very handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1
memory cache (for things that are not always in registers), then
this flushes that cache:
     asm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at
this point in time (so that its value can be passed to the
assembly):
     asm volatile ("" :: "" (x));
This tells the compiler that "x" might be changed by the assembly,
so it must forget any additional knowledge it had of it:
     asm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or
test code. They can be helpful in some kinds of interactions
between low level code and hardware.
Well, we have to make a difference between a compiler barrier and a
memory barrier. All memory barriers should be compiler barriers,
but compiler barriers do not have to be memory barriers... Fair
enough?
Of course there is a difference between memory barriers and compiler
barriers. We are talking about compiler barriers here, because they
have an effect on the semantics of the language (in this case, the
language is "C with gcc extensions") without the cost of real memory
barriers. C11 atomic fences are compiler and memory barriers, but
they can have a huge effect on code speed - these empty assembly
statements are aimed at having minimal impact outside of the
intended effects.
I think a relaxed memory barrier can be used as a compiler barrier
and be compatible with atomic, volatile does not have to be used here?
load/store with relaxed should act like compiler barriers?
To be honest, I have never been at all sure how C11 atomic accesses
and fences relate to "memory barriers" of any sort, or how they
enforce order in respect to volatile accesses or non-volatile accesses.
The C standards at times use "volatile atomic" qualifications, which
implies that non-volatile atomic uses are not volatile. Volatile
accesses do two things - enforce an order (in the generated code, but
not necessarily at execution on the cpu) of volatile accesses, and make
the access "observable behaviour". My understanding is then that C11
atomics are missing one or both of these aspects, but I don't know which.
gcc has a "memory clobber" facility in inline assembly - and this is
commonly used as a compiler (but not cpu) memory barrier. I know what
it does in practical terms for the way I use it, but I am not sure how
precisely it can be specified in relation to the standard C semantics.
It seems reasonable to suppose that a relaxed atomic fence could act
like a gcc compiler memory barrier, but the standard says that
"atomic_thread_fence(memory_order_relaxed)" has no effects.
The main reason I have not bothered looking at the semantics and effects
of C11 atomics is that the libatomic implementation that is distributed
with gcc is (or at least /was/ when I looked a number of years ago)
fundamentally and irreparably broken for single-core microcontrollers.
Using spinlocks to enforce atomic actions is fine on a multi-core Linux
system, but a guaranteed hang on a single-core RTOS or when using
atomics from interrupts. So I use RTOS-specific features, or my own
critical section code (disabling interrupts is the way to do it on
these kinds of devices), along with gcc inline assembly - it's as far
from portable standard C code as you can get and still have it mixed
with C, but I don't need portability there.
But I have no objection at all if someone wants to give an explanation
of some of the C11 atomic semantics, though it might be better in a new
thread.
On 4/23/2026 12:22 AM, David Brown wrote:
On 22/04/2026 23:29, Chris M. Thomasson wrote:
On 4/22/2026 2:28 PM, Chris M. Thomasson wrote:
On 4/21/2026 1:13 PM, David Brown wrote:
On 21/04/2026 20:51, Chris M. Thomasson wrote:
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when
you want to stick to standard C is use of "volatile". Use a
volatile read at the start of your code, then calculations that
depend on each other and that first read, then a volatile write
of the result. That gives minimal intrusion in the code while
making sure the calculations have to be generated, and have to be
done at run time.
If you are testing on a particular compiler (like gcc or clang),
then there are other options. The "noinline" function attribute
is very handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1
memory cache (for things that are not always in registers), then
this flushes that cache:
     asm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at
this point in time (so that its value can be passed to the
assembly):
     asm volatile ("" :: "" (x));
This tells the compiler that "x" might be changed by the
assembly, so it must forget any additional knowledge it had of it:
     asm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or
test code. They can be helpful in some kinds of interactions
between low level code and hardware.
Well, we have to make a difference between a compiler barrier and
a memory barrier. All memory barriers should be compiler barriers,
but compiler barriers do not have to be memory barriers... Fair
enough?
Of course there is a difference between memory barriers and
compiler barriers. We are talking about compiler barriers here,
because they have an effect on the semantics of the language (in
this case, the language is "C with gcc extensions") without the
cost of real memory barriers. C11 atomic fences are compiler and
memory barriers, but they can have a huge effect on code speed -
these empty assembly statements are aimed at having minimal impact
outside of the intended effects.
I think a relaxed memory barrier can be used as a compiler barrier
and be compatible with atomic, volatile does not have to be used here?
load/store with relaxed should act like compiler barriers?
To be honest, I have never been at all sure how C11 atomic accesses
and fences relate to "memory barriers" of any sort, or how they
enforce order in respect to volatile accesses or non-volatile accesses.
The C standards at times use "volatile atomic" qualifications, which
implies that non-volatile atomic uses are not volatile. Volatile
accesses do two things - enforce an order (in the generated code, but
not necessarily at execution on the cpu) of volatile accesses, and
make the access "observable behaviour". My understanding is then that
C11 atomics are missing one or both of these aspects, but I don't know
which.
gcc has a "memory clobber" facility in inline assembly - and this is
commonly used as a compiler (but not cpu) memory barrier. I know what
it does in practical terms for the way I use it, but I am not sure how
precisely it can be specified in relation to the standard C semantics.
It seems reasonable to suppose that a relaxed atomic fence could act
like a gcc compiler memory barrier, but the standard says that
"atomic_thread_fence(memory_order_relaxed)" has no effects.
The main reason I have not bothered looking at the semantics and
effects of C11 atomics is that the libatomic implementation that is
distributed with gcc is (or at least /was/ when I looked a number of
years ago) fundamentally and irreparably broken for single-core
microcontrollers. Using spinlocks to enforce atomic actions is fine on
a multi-core Linux system, but a guaranteed hang on a single-core RTOS
or when using atomics from interrupts.ÿ So I use RTOS-specific
features, or my own critical section code (disabling interrupts is the
way to do it on these kinds of devices), along with gcc inline
assembly - it's as far from portable standard C code as you can get
and still have it mixed with C, but I don't need portability there.
But I have no objection at all if someone wants to give an explanation
of some of the C11 atomic semantics, though it might be better in a
new thread.
Yeah. Well, damn. I would hope that in the _compiled_ code, memory
ordering aside:
std::atomic<int> a = 0;
a.store(123);
a.store(666);
Better damn well issue two stores in that order. The memory order side
be damned for this moment, but I think std::atomic in impls are laced
with the volatile keyword anyway, but shit can happen. Humm...
On 4/23/2026 12:22 AM, David Brown wrote:
On 22/04/2026 23:29, Chris M. Thomasson wrote:
On 4/22/2026 2:28 PM, Chris M. Thomasson wrote:
On 4/21/2026 1:13 PM, David Brown wrote:
On 21/04/2026 20:51, Chris M. Thomasson wrote:
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution. For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary). You can
do computations that the compiler can't unwrap. You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that. (That's off the top of my head; I don't
know what the best techniques are in practice.) And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking when
you want to stick to standard C is use of "volatile". Use a
volatile read at the start of your code, then calculations that
depend on each other and that first read, then a volatile write
of the result. That gives minimal intrusion in the code while
making sure the calculations have to be generated, and have to be
done at run time.
If you are testing on a particular compiler (like gcc or clang),
then there are other options. The "noinline" function attribute
is very handy. Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1
memory cache (for things that are not always in registers), then
this flushes that cache:
    asm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at
this point in time (so that its value can be passed to the
assembly):
    asm volatile ("" :: "g" (x));
This tells the compiler that "x" might be changed by the
assembly, so it must forget any additional knowledge it had of it:
    asm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks or
test code. They can be helpful in some kinds of interactions
between low level code and hardware.
Well, we have to make a difference between a compiler barrier and
a memory barrier. All memory barriers should be compiler barriers,
but compiler barriers do not have to be memory barriers... Fair
enough?
Of course there is a difference between memory barriers and
compiler barriers.ÿ We are talking about compiler barriers here,
because they have an effect on the semantics of the language (in
this case, the language is "C with gcc extensions") without the
cost of real memory barriers.ÿ C11 atomic fences are compiler and
memory barriers, but they can have a huge effect on code speed -
these empty assembly statements are aimed at having minimal impact
outside of the intended effects.
I think a relaxed memory barrier can be used as a compiler barrier
and be compatible with atomics; volatile does not have to be used here?
Should a load/store with relaxed ordering act like a compiler barrier?
To be honest, I have never been at all sure how C11 atomic accesses
and fences relate to "memory barriers" of any sort, or how they
enforce order in respect to volatile accesses or non-volatile accesses.
The C standards at times use "volatile atomic" qualifications, which
implies that non-volatile atomic uses are not volatile. Volatile
accesses do two things - enforce an order (in the generated code, but
not necessarily at execution on the cpu) of volatile accesses, and
make the access "observable behaviour". My understanding is then that
C11 atomics are missing one or both of these aspects, but I don't know
which.
gcc has a "memory clobber" facility in inline assembly - and this is
commonly used as a compiler (but not cpu) memory barrier. I know what
it does in practical terms for the way I use it, but I am not sure how
precisely it can be specified in relation to the standard C semantics.
It seems reasonable to suppose that a relaxed atomic fence could act
like a gcc compiler memory barrier, but the standard says that
"atomic_thread_fence(memory_order_relaxed)" has no effects.
The main reason I have not bothered looking at the semantics and
effects of C11 atomics is that the libatomic implementation that is
distributed with gcc is (or at least /was/ when I looked a number of
years ago) fundamentally and irreparably broken for single-core
microcontrollers. Using spinlocks to enforce atomic actions is fine on
a multi-core Linux system, but a guaranteed hang on a single-core RTOS
or when using atomics from interrupts. So I use RTOS-specific
features, or my own critical section code (disabling interrupts is the
way to do it on these kinds of devices), along with gcc inline
assembly - it's as far from portable standard C code as you can get
and still have it mixed with C, but I don't need portability there.
But I have no objection at all if someone wants to give an explanation
of some of the C11 atomic semantics, though it might be better in a
new thread.
Yeah. Well, damn. I would hope that in the _compiled_ code, memory
ordering aside:
std::atomic<int> a = 0;
a.store(123);
a.store(666);
Better damn well issue two stores in that order. The memory order side
be damned for this moment, but I think std::atomic in impls are laced
with the volatile keyword anyway, but shit can happen. Humm...
Between 1972 and 1978 Unix was not available to the general public,
and I think for all practical purposes neither was C. Also AFAIAA
there was no recognized defining document for C during that time.
IIRC there were some papers written about C before 1978, but nothing
like a real language manual. So the answer seems to be either that
the question doesn't make sense or that everything is "undefined
behavior" because there is no language manual that defines it.
Bart <bc@freeuk.com> writes:
[...]
So, what does language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it. I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined. N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this. Why do you pretend?
Nobody seems to have a problem with gcc being lax about this (or with
it allowing its users to let it be lax).
gcc is not being lax. gcc is behaving in a manner that is consistent
with the requirements of the C standard. The code in question has
undefined behavior.
You know and understand all of that.
The behavior is undefined. You know exactly what that means, but you
pretend not to.
Michael S <already5chosen@yahoo.com> writes:
On Wed, 22 Apr 2026 15:16:56 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
You look at trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about has been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
My record at guessing exact meaning of Tim's statements is not
particularly good, but I'll try nevertheless.
Tim seems to suggest that function foo() below had defined behavior
(most likely of returning 1) in C90 and C99, then it became
undefined in C11 and C17 then again became defined in C23.
For years 1972 to 1989 Tim probably thinks that there is no
sufficient data to answer your question.
I'm curious to know what you think of my answer now that I
have written one. :)
Bart <bc@freeuk.com> writes:
On 22/04/2026 22:23, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote:
[...]
OK, what am I getting close to?
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
That optimisation renders some results meaningless, but then ...
Optimization can make the meaninglessness of some results visible.
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
... here you say the opposite of 'If you don't do that, the results
will be meaningless'.
Even though both give the same result.
Of course they can be reliably compared. One is much faster than
the other. That's a reliable comparison.
Ha, ha, ha!
It wasn't a joke. I answered your question. Perhaps you meant
something by "reliably compared" other than what I assumed.
Can you rephrase the question and be more specific?
Remind me never to take any benchmark of yours seriously.
I rarely write benchmarks. If I did, they would be much more
sophisticated than your code fragment above.
So, taking one like this:
    long long int sum=0;
    for (int j=0; j<10; ++j)
        for (int i=0; i<2000000000; ++i) sum+=i;
    printf("%lld\n", sum);
Obviously, gcc has elided the entire program here so that the timing
is essentially zero (the 5ms is process overhead).
Almost. It elided everything except code to print
1553255916290448384.
And the program still behaved as required.
Why do you have a problem with that?
If you wanted "sum" to be updated 20 billion times during program
execution, why didn't you define it as volatile? That's the exact
feature that C provides to do what you say you want.
Why, why, why do you expect the compiler to assume that you want
to measure CPU instructions rather than get correct output?
On 23/04/2026 02:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
So, what does language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it. I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined. N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this. Why do you pretend?
Nobody seems to have a problem with gcc being lax about this (or with
it allowing its users to let it be lax).
gcc is not being lax. gcc is behaving in a manner that is consistent
with the requirements of the C standard. The code in question has
undefined behavior.
You know and understand all of that.
No, I don't.
So what is the concrete effect of all that on the behaviour of gcc and
the behaviour of the code it generates?
If something bad happens (what would that be, exactly), whose fault would
that be: mine or the compiler's?
Are you suggesting that because something is tagged as UB, that it
literally gives a compiler a licence to do anything?
If so, how is that not being lax by either language, compiler, or both?
I'm starting to suspect that either nobody knows the answer, or they do,
but are chary of either blaming the compiler or criticising the language spec, and are trying to shift the blame to the user.
The behavior is undefined. You know exactly what that means, but you
pretend not to.
And yet, the behaviour I have observed is nothing remarkable: some
undefined bit patterns get used; zero is assumed; or code is just elided.
Again, do you have any real-life, practical examples of bad or unusual things happening?
If you had to put money on whether some outcome is either one of those
three I listed, or something else, which would you go for?
On Wed, 22 Apr 2026 20:39:39 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Michael S <already5chosen@yahoo.com> writes:
On Wed, 22 Apr 2026 15:16:56 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
You look at trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about has been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
My record at guessing exact meaning of Tim's statements is not
particularly good, but I'll try nevertheless.
Tim seems to suggest that function foo() below had defined behavior
(most likely of returning 1) in C90 and C99, then it became
undefined in C11 and C17 then again became defined in C23.
For years 1972 to 1989 Tim probably thinks that there is no
sufficient data to answer your question.
I'm curious to know what you think of my answer now that I
have written one. :)
I'd like to read an explanation of what exactly was changed or
clarified in 2011 and again in 2024.
Between 1989 and 2011 the behavior was either always undefined or potentially undefined, depending on when, on what data types are
involved, on some implementation-specific choices, and on how one
reads some passages in the C standard that unfortunately were not
written as clearly as they might have been.
On 23/04/2026 03:26, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 22:23, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote:
[...]
OK, what am I getting close to?
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
That optimisation renders some results meaningless, but then ...
Optimization can make the meaninglessness of some results visible.
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
... here you say the opposite of 'If you don't do that, the results
will be meaningless'.
Even though both give the same result.
Of course they can be reliably compared. One is much faster than
the other. That's a reliable comparison.
Ha, ha, ha!
It wasn't a joke. I answered your question. Perhaps you meant
something by "reliably compared" other than what I assumed.
Can you rephrase the question and be more specific?
Remind me never to take any benchmark of yours seriously.
I rarely write benchmarks. If I did, they would be much more
sophisticated than your code fragment above.
Would they make much use of 'volatile'?
So, taking one like this:
    long long int sum=0;
    for (int j=0; j<10; ++j)
        for (int i=0; i<2000000000; ++i) sum+=i;
    printf("%lld\n", sum);
Obviously, gcc has elided the entire program here so that the timing
is essentially zero (the 5ms is process overhead).
Almost. It elided everything except code to print
1553255916290448384.
(I don't notice that it overflowed.)
And the program still behaved as required.
Why do you have a problem with that?
Yes, because I expected that line to be executed 20 billion times and to take an appreciable amount of time.
If you wanted "sum" to be updated 20 billion times during program
execution, why didn't you define it as volatile? That's the exact
feature that C provides to do what you say you want.
Because I want to know how long reasonably efficient code takes to
execute it 20 billion times. Using 'volatile' would keep 'sum' memory-bound.
Why, why, why do you expect the compiler to assume that you want
to measure CPU instructions rather than get correct output?
Because I told it I wanted a loop.
(I also tested it in the two compilers for my language. The old one did
it in 6.2 seconds, but the new one in 12 seconds. Something needs
looking at!
Other than that, it is wonderful to use a language that does exactly
what you tell it, without a mind of its own, and strives to do it as efficiently as it can given a simple compiler.)
On 23/04/2026 03:26, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 22:23, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote:
[...]
OK, what am I getting close to?
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
That optimisation renders some results meaningless, but then ...
Optimization can make the meaninglessness of some results visible.
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
... here you say the opposite of 'If you don't do that, the results
will be meaningless'.
Even though both give the same result.
Of course they can be reliably compared. One is much faster than
the other. That's a reliable comparison.
Ha, ha, ha!
It wasn't a joke. I answered your question. Perhaps you meant
something by "reliably compared" other than what I assumed.
Can you rephrase the question and be more specific?
Remind me never to take any benchmark of yours seriously.
I rarely write benchmarks. If I did, they would be much more
sophisticated than your code fragment above.
Would they make much use of 'volatile'?
So, taking one like this:
long long int sum=0;
for (int j=0; j<10; ++j)
for (int i=0; i<2000000000; ++i) sum+=i;
printf("%lld\n", sum);
Obviously, gcc has elided the entire program here so that the timing
is essentially zero (the 5ms is process overhead).
Almost. It elided everything except code to print
1553255916290448384.
(I don't notice that it overflowed.)
And the program still behaved as required.
Why do you have a problem with that?
Yes, because I expected that line to be executed 20 billion times and
to take an appreciable amount of time.
If you wanted "sum" to be updated 20 billion times during program
execution, why didn't you define it as volatile? That's the exact
feature that C provides to do what you say you want.
Because I want to know how long reasonably efficient code takes to
execute it 20 billion times. Using 'volatile' would keep 'sum'
memory-bound.
I expect sensible code to keep it in a register, but to also do the
task. If I use 'volatile', then I get these results:
bcc 6.3 seconds ('volatile' is ignored)
gcc -O3 49.3 seconds
That would be nice, but it's quite misleading. The nearest I get is to
use gcc -O1 without 'volatile', then it takes 6.2 seconds.
Why, why, why do you expect the compiler to assume that you want
to measure CPU instructions rather than get correct output?
Because I told it I wanted a loop. If I just wanted the correct
output, I would have given it the formula for summing the sequence 0
to N-1, or hardcoded it myself.
Other than that, it is wonderful to use a language that does exactly
what you tell it, without a mind of its own, and strives to do it as efficiently as it can given a simple compiler.
Bart <bc@freeuk.com> writes:
Earlier, I mentioned, and you failed to acknowledge, that optimizing
away function calls is exactly as valid as optimizing 2+2 to 4.
Will you address that?
Bart <bc@freeuk.com> writes:
Because I want to know how long reasonably efficient code takes to
execute it 20 billion times. Using 'volatile' would keep 'sum'
memory-bound.
Oh? I wouldn't expect "volatile" to prevent a variable from being
stored in a register. (I don't know whether it might do so with
a given compiler, or why.)
On 23/04/2026 12:30, Bart wrote:
On 23/04/2026 03:26, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 22:23, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote:
[...]
OK, what am I getting close to?
You're *so* close to getting it. If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away. If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
That optimisation renders some results meaningless, but then ...
Optimization can make the meaninglessness of some results visible.
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
... here you say the opposite of 'If you don't do that, the results
will be meaningless'.
Even though both give the same result.
Of course they can be reliably compared. One is much faster than
the other. That's a reliable comparison.
Ha, ha, ha!
It wasn't a joke. I answered your question. Perhaps you meant
something by "reliably compared" other than what I assumed.
Can you rephrase the question and be more specific?
Remind me never to take any benchmark of yours seriously.
I rarely write benchmarks. If I did, they would be much more
sophisticated than your code fragment above.
Would they make much use of 'volatile'?
When I write benchmarks (I don't do so much, but I quite often look at generated code with godbolt.org, and the same applies there) I make use
of "volatile" as appropriate to force observable behaviour.
So, taking one like this:
    long long int sum=0;
    for (int j=0; j<10; ++j)
        for (int i=0; i<2000000000; ++i) sum+=i;
    printf("%lld\n", sum);
Obviously, gcc has elided the entire program here so that the timing
is essentially zero (the 5ms is process overhead).
Almost. It elided everything except code to print
1553255916290448384.
(I don't notice that it overflowed.)
And the program still behaved as required.
Why do you have a problem with that?
Yes, because I expected that line to be executed 20 billion times and
to take an appreciable amount of time.
Have you still not understood that your expectations are wrong?
If I ask you "what is the sum of all the integers from 1 to 100 ?", do
you think I am expecting you to do all these sums on paper? Or in your head? Or individually on a calculator? I think it is more likely that
you would write a program, or ask google, or use n(n+1)/2, or simply
know off-hand that it is 5050. If I wanted to be more specific about
how I wanted you to handle the task - not just give me the answer - I'd specify that, such as saying "you may not use a computer".
Programming is just the same. It really is that simple.
If you wanted "sum" to be updated 20 billion times during program
execution, why didn't you define it as volatile? That's the exact
feature that C provides to do what you say you want.
Because I want to know how long reasonably efficient code takes to
execute it 20 billion times. Using 'volatile' would keep 'sum'
memory-bound.
No, that is only the case if you use "volatile" blindly.
First, make the function more general :
long long int summation(long long int start, long long int n) {
    long long int sum = start;
    for (long long int i = 0; i < n; i++) {
        sum += i;
    }
    return sum;
}
Then use volatile in your driver function :
int main(void) {
    volatile long long int start = 0;
    volatile long long int n = 20000000000;
    volatile long long int result = summation(start, n);
    printf("%lld\n", result);
}
Put the volatile access at the beginning and end of the benchmark, not
in the loop, and they will have minimal overhead - but they will force
the calculation to be done at runtime.
Alternatively, I might use one of the other "do nothing" inline assembly fragments I mentioned in another post, such as changing the loop to :
    for (long long int i = 0; i < n; i++) {
        sum += i;
        __asm__("" : "+g" (sum));
    }
(I've already told you this. Do you not bother reading posts trying to
help you, because that would give you less to whine about? Or do you
have some other reason for ignoring them?)
Why, why, why do you expect the compiler to assume that you want
to measure CPU instructions rather than get correct output?
Because I told it I wanted a loop.
You told it you want the results as if there were a loop - you haven't
told it to generate a loop in the assembly. C is not assembly. Why do
you keep insisting that you expect C compilers to act like assemblers?
(I also tested it in the two compilers for my language. The old one
did it in 6.2 seconds, but the new one in 12 seconds. Something needs
looking at!
Your tests here are fine for comparing different versions of your own
language or your own tools. But if you want to benchmark aspects of
implementations for different languages, learn how to write benchmarks
to measure and test the things you are interested in. Or learn that the
things you are trying to measure are perhaps not particularly important,
and learn to measure other things.
Other than that, it is wonderful to use a language that does exactly
what you tell it, without a mind of its own, and strives to do it as
efficiently as it can given a simple compiler.)
That's why a lot of people like C. It does what they ask. Of course,
that only applies when you know the language, and know what you are
asking for.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]
Between 1972 and 1978 Unix was not available to the general public,
and I think for all practical purposes neither was C. Also AFAIAA
there was no recognized defining document for C during that time.
IIRC there were some papers written about C before 1978, but nothing
like a real language manual. So the answer seems to be either that
the question doesn't make sense or that everything is "undefined
behavior" because there is no language manual that defines it.
[...]
For the C history buffs, here are a few early papers on C:
C Reference Manual, Jan 15 1974, Dennis Ritchie https://www.nokia.com/bell-labs/about/dennis-m-ritchie/cman74.pdf
C Reference Manual, 1975, Dennis Ritchie https://www.nokia.com/bell-labs/about/dennis-m-ritchie/cman.pdf
Programming in C - A Tutorial, 1975(?), Brian Kernighan https://www.nokia.com/bell-labs/about/dennis-m-ritchie/ctut.pdf
The Development of the C Language, 1994, Dennis Ritchie https://www.nokia.com/bell-labs/about/dennis-m-ritchie/chist.pdf
Dennis Ritchie's home page https://www.nokia.com/bell-labs/about/dennis-m-ritchie/
has a number of other papers on early Unix, BCPL, B, and C.
Bart <bc@freeuk.com> writes:
[...]
Assessment of what? What exactly is it about your code fragment
that implies it's meant to be used as an assessment? Do you
expect the compiler to understand that what you want is a program
that performs 20 billion run-time operations rather than one that
produces correct output?
[...]
On 23/04/2026 03:26, Keith Thompson wrote:
And the program still behaved as required.
Why do you have a problem with that?
Yes, because I expected that line to be executed 20 billion times and to take an appreciable amount of time.
On Thu, 23 Apr 2026 13:12:16 +0200
David Brown <david.brown@hesbynett.no> wrote:
Would they make much use of 'volatile'?
When I write benchmarks (I don't do so much, but I quite often look
at generated code with godbolt.org, and the same applies there) I
make use of "volatile" as appropriate to force observable behaviour.
I never do.
I always try my best to give to execution of the "item under test" a
real meaning.
On 23/04/2026 12:43, Keith Thompson wrote:
I mean, it is not as though you can choose a different ADD instruction
to make it faster! You have to look at the bigger picture; it can't be realistically isolated.
On Thu, 23 Apr 2026 13:12:16 +0200...
David Brown <david.brown@hesbynett.no> wrote:
When I write benchmarks (I don't do so much, but I quite often look
at generated code with godbolt.org, and the same applies there) I
make use of "volatile" as appropriate to force observable behaviour.
I never do.
I always try my best to give to execution of the "item under test" a
real meaning.
Michael S <already5chosen@yahoo.com> writes:
On Thu, 23 Apr 2026 13:12:16 +0200
David Brown <david.brown@hesbynett.no> wrote:
Would they make much use of 'volatile'?
When I write benchmarks (I don't do so much, but I quite often look
at generated code with godbolt.org, and the same applies there) I
make use of "volatile" as appropriate to force observable
behaviour.
I never do.
I always try my best to give to execution of the "item under test" a
real meaning.
What is the item under test? The application, the compiler or
the processor implementation?
Bart <bc@freeuk.com> writes:
On 23/04/2026 12:43, Keith Thompson wrote:
I mean, it is not as though you can choose a different ADD
instruction to make it faster! You have to look at the bigger
picture; it can't be realistically isolated.
Actually, there are many flavors of ADD instruction on
x86/x86_64 processors. Some execute faster than others
(e.g. all three operands may be in registers or all three
may require fills to three different cache lines).
On 23/04/2026 12:12, David Brown wrote:
Your tests here are fine for comparing different versions of your own
language or your own tools.  But if you want to benchmark aspects of
implementations for different languages, learn how to write benchmarks
to measure and test the things you are interested in.  Or learn that
the things you are trying to measure are perhaps not particularly
important, and learn to measure other things.
Even such a simple benchmark generally works well across many languages.
Are you suggesting that because something is tagged as UB, that it
literally gives a compiler a licence to do anything?
On 2026-04-23 08:12, Michael S wrote:
On Thu, 23 Apr 2026 13:12:16 +0200...
David Brown <david.brown@hesbynett.no> wrote:
When I write benchmarks (I don't do so much, but I quite often look
at generated code with godbolt.org, and the same applies there) I
make use of "volatile" as appropriate to force observable
behaviour.
I never do.
I always try my best to give to execution of the "item under test" a
real meaning.
If by "a real meaning" you mean something connected to the C
standard's definition of "observable behavior", that's a perfectly
valid approach. Using volatile "as appropriate" is another perfectly
valid approach to getting observable behavior. It's also a simpler,
less intrusive approach, because the only other ways to have
observable behavior require doing I/O; often-times you don't want to
be testing I/O speed as part of your benchmark.
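The volatile approach described above can be sketched in a few lines. This is a minimal illustration, not code from the thread; the names bench_source, bench_sink and run_additions are mine:

```c
#include <stdint.h>

/* Hypothetical volatile source and sink for a benchmark kernel. */
volatile uint64_t bench_source = 1;
volatile uint64_t bench_sink;

/* The work under test: a chain of additions seeded by a volatile
   read, so the compiler cannot fold the loop to a constant, ending
   in a volatile write, which makes the result observable behaviour. */
uint64_t run_additions(int n) {
    uint64_t x = bench_source;   /* value unknown to the optimiser */
    for (int i = 0; i < n; i++)
        x = x + (uint64_t)i;     /* must be computed at run time */
    bench_sink = x;              /* observable: cannot be elided */
    return x;
}
```

The volatile read and write bracket the computation without doing any I/O, which is exactly the "simpler, less intrusive" property mentioned above.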
Indeed, transformation applied by compilers in this case is
more complex than mere [tail call elimination].
In theory, it can be a result of two successive transformations.
First transforming original to fib2:
unsigned long long fib2(unsigned long long n, unsigned long long acc)
{
if (n < 3)
return acc + 1;
return fib2(n-2, fib2(n-1, acc));
}
And then applying TCE.
But more likely the compiler arrived at the same outcome by different
logical steps.
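The original function was not quoted in this part of the thread; a plausible doubly recursive form consistent with fib2 (such that fib2(n, acc) == acc + fib(n)) would be:

```c
/* Assumed original (my reconstruction, not quoted upthread). */
unsigned long long fib(unsigned long long n) {
    if (n < 3)
        return 1;
    return fib(n - 1) + fib(n - 2);
}

/* The accumulator version from the thread: fib2(n, acc) == acc + fib(n),
   which an optimiser can then tail-call-eliminate. */
unsigned long long fib2(unsigned long long n, unsigned long long acc)
{
    if (n < 3)
        return acc + 1;
    return fib2(n - 2, fib2(n - 1, acc));
}
```

The equivalence follows by induction: for n < 3 both sides are acc + 1, and otherwise fib2(n-2, fib2(n-1, acc)) = acc + fib(n-1) + fib(n-2).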
On 23/04/2026 11:58, Bart wrote:
...
Are you suggesting that because something is tagged as UB, that it
literally gives a compiler a licence to do anything?
"behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this document imposes no requirements."
(3.5.3p1).
What exactly do you think "no requirements" means? What could it
possibly mean other than "license to do anything"?
On Wed, 22 Apr 2026 20:39:39 -0700
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Michael S <already5chosen@yahoo.com> writes:
On Wed, 22 Apr 2026 15:16:56 +0100
Bart <bc@freeuk.com> wrote:
On 22/04/2026 05:09, Tim Rentsch wrote:
antispam@fricas.org (Waldek Hebisch) writes:
You look at trivial example, where AFAICS the best answer is:
"Compiler follows general rules, why should it make exception for
this case?". Note that in this trivial case "interesting"
behaviour could happen on exotic hardware (probably disallowed
by C23 rules, but AFAICS legal for earlier C versions).
The kinds of behavior Bart is asking about has been undefined
behavior for just over 15 years, since 2011 ISO C.
So what was it between 1972 and 2011?
My record at guessing exact meaning of Tim's statements is not
particularly good, but I'll try nevertheless.
Tim seems to suggest that function foo() below had defined behavior
(most likely of returning 1) in C90 and C99, then it became
undefined in C11 and C17 then again became defined in C23.
For years 1972 to 1989 Tim probably thinks that there is no
sufficient data to answer your question.
I'm curious to know what you think of my answer now that I
have written one. :)
I'd like to read an explanation of what exactly was changed or
clarified in 2011 and again in 2024.
IMO, most "undefined behavior" in the C specification was due to implementation differences between the C compilers/linkers that
existed at the time.
On 23/04/2026 15:42, James Kuyper wrote:
On 23/04/2026 11:58, Bart wrote:
...
Are you suggesting that because something is tagged as UB, that it
literally gives a compiler a licence to do anything?
"behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this document imposes no requirements."
(3.5.3p1).
What exactly do you think "no requirements" means? What could it
possibly mean other than "license to do anything"?
So the effect is that the compiler can be 'lax' in being able to do
what it likes, including not reporting it and not refusing to fail the
program.
KT said: "the compiler is not being lax". I was responding to that.
If it is not being lax, then I'd like to know what 'being lax' would look
like for this compiler.
Bart <bc@freeuk.com> writes:
On 23/04/2026 15:42, James Kuyper wrote:
On 23/04/2026 11:58, Bart wrote:
...
Are you suggesting that because something is tagged as UB, that it
literally gives a compiler a licence to do anything?
"behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this document imposes no requirements."
(3.5.3p1).
What exactly do you think "no requirements" means? What could it
possibly mean other than "license to do anything"?
So the effect is that the compiler can be 'lax' in being able to do
what it likes, including not reporting it and not refusing to fail the
program.
KT said: "the compiler is not being lax". I was responding to that.
If it is not being lax, then I'd like to know what 'being lax' would look
like for this compiler.
What "being lax" means, for any compiler and not just this one,
is not being faithful to what the C standard requires of a
conforming implementation.
On 23/04/2026 14:43, Bart wrote:
On 23/04/2026 12:12, David Brown wrote:
Your tests here are fine for comparing different versions of your own
language or your own tools.  But if you want to benchmark aspects of
implementations for different languages, learn how to write
benchmarks to measure and test the things you are interested in.  Or
learn that the things you are trying to measure are perhaps not
particularly important, and learn to measure other things.
Even such a simple benchmark generally works well across many languages.
So we can conclude that C is apparently a much better language than
these others for real programming,
if run-time efficiency is important
to the task, because it allows much better optimisations.
(I don't think this is necessarily true - there are other languages and tools
that can generate efficient object code - but it is the conclusion I
draw from your testing.)
And we can conclude that you are unable to write benchmarks that measure what you want to measure.
And we can conclude that you have no interest
in improving that situation by learning anything.
Bart <bc@freeuk.com> wrote:
On 21/04/2026 14:43, David Brown wrote:
On 21/04/2026 14:48, Bart wrote:
I might measure performance by invoking it N times. Suppose I get these
results across 4 languages:
L1: 3.5 seconds
L2: 4.2
L3: 0.1
L4: 2.9
According to you, obviously L3 is the winner because of its superior
optimiser! No red flags at all.
I see a big red flag above "by invoking it N times".
Concerning
numbers, of course I would be interested how L3 managed to
get much better results.
As a little challenge I invite you to predict the performance
of gcc on the following 2 functions:
void
f1(unsigned char * a, int t, int n) {
int i;
for(i=0; i < n; i++) {
if (a[i] > t) {
a[i] = t;
}
}
}
void
f2(unsigned char * a, int t, int n) {
int i;
for(i=0; i < n; i++) {
a[i] = (a[i] > t)?t:a[i];
}
}
and on 2 sets of data, one where a is filled with constant
value, the second one with pseudo-random one. n should be
100000, t should be 128, a should be freshly filled with values.
I do not ask about exact values, but just a qualitative comparison.
Note: simply calling f1 or f2 multiple times will give wrong
times (you need to fill it with data before each call!). Similarly,
increasing n does not give the intended time (the size is reasonably
natural for the problem and will fit in the L2 cache on a typical
modern machine).
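Corrected, compilable versions of the two functions above (the posted loop headers have a stray '}'), plus the fresh-fill helpers the challenge calls for. The timing itself is omitted, and the LCG constants in fill_random are my own choice:

```c
#include <string.h>

/* Clamp via an explicit branch: predictable on constant data,
   mispredicted roughly half the time on random data. */
void f1(unsigned char *a, int t, int n) {
    for (int i = 0; i < n; i++) {
        if (a[i] > t)
            a[i] = (unsigned char)t;
    }
}

/* Clamp via a conditional expression, which gcc can typically turn
   into branch-free (and vectorised) code. */
void f2(unsigned char *a, int t, int n) {
    for (int i = 0; i < n; i++)
        a[i] = (a[i] > t) ? (unsigned char)t : a[i];
}

/* Fresh fills before each timed call, as the note above insists. */
void fill_const(unsigned char *a, int n, unsigned char v) {
    memset(a, v, (size_t)n);
}

void fill_random(unsigned char *a, int n, unsigned seed) {
    for (int i = 0; i < n; i++) {
        seed = seed * 1103515245u + 12345u;  /* simple LCG, my choice */
        a[i] = (unsigned char)(seed >> 16);
    }
}
```

The interesting comparison is then f1 vs f2 on constant vs pseudo-random data: the two functions compute the same result, but their run times need not track each other across the two data sets.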
On 23/04/2026 12:30, Bart wrote:
On 23/04/2026 03:26, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 22:23, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 22/04/2026 03:53, Keith Thompson wrote:
[...]
OK, what am I getting close to?
You're *so* close to getting it.  If you want to measure the
performance of addition, you have to write your benchmark code so the
addition operator can't be optimized away.  If you don't do that, the
results will be meaningless.
It seems like you're close to getting it too.
That optimisation renders some results meaningless, but then ...
Optimization can make the meaninglessness of some results visible.
Would you agree that a result that involved executing ADD a billion
times, can't be reliably compared with one that does it zero times?
No.
... here you say the opposite of 'If you don't do that, the results
will be meaningless'.
Even though both give the same result.
Of course they can be reliably compared.  One is much faster than
the other.  That's a reliable comparison.
Ha, ha, ha!
It wasn't a joke.  I answered your question.  Perhaps you meant
something by "reliably compared" other than what I assumed.
Can you rephrase the question and be more specific?
Remind me never to take any benchmark of yours seriously.
I rarely write benchmarks.  If I did, they would be much more
sophisticated than your code fragment above.
Would they make much use of 'volatile'?
When I write benchmarks (I don't do so much, but I quite often look at generated code with godbolt.org, and the same applies there) I make use
of "volatile" as appropriate to force observable behaviour.
Bart <bc@freeuk.com> wrote:
Other than that, it is wonderful to use a language that does exactly
what you tell it, without a mind of its own, and strives to do it as
efficiently as it can given a simple compiler.)
Consider the following function:
void
f(int k, int n, int a[n][n], int b[n][n], int c[n][n]) {
int i;
for(i = 0; i < n; i++) {
a[k][i] = b[k][i] + c[k][i];
}
}
How many instructions should it execute at runtime?
Note, this uses C VMT-s because this is what is needed in real use.
C got VMT-s rather late,
but for example it would be trivial to
translate this function to Fortran 66. In a "C" compiler that does
not support VMT-s you can do
#define aref(a, k, i) (*(a + n*k + i))
and replace the assignment inside the loop by
aref(a, k, i) = aref(b, k, i) + aref(c, k, i);
Is a compiler which generates code doing 3*n multiplications efficient?
Does a compiler which does not need 3*n multiplications have
"a mind of its own"?
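The point can be made concrete by writing the hoisted form by hand. A naive expansion of aref, *(a + n*k + i), implies a multiplication per access; an optimising compiler instead computes each row base once. A sketch over flat arrays (add_rows is my name, not from the thread):

```c
#include <stddef.h>

/* Hand-hoisted form of a[k][i] = b[k][i] + c[k][i] over flat arrays:
   each row base n*k is computed once, outside the loop, instead of
   3*n times inside it - which is the transformation a compiler that
   "does not need 3*n multiplications" performs. */
void add_rows(int k, int n, int *a, const int *b, const int *c) {
    int *ap = a + (size_t)n * (size_t)k;       /* one multiply */
    const int *bp = b + (size_t)n * (size_t)k; /* one multiply */
    const int *cp = c + (size_t)n * (size_t)k; /* one multiply */
    for (int i = 0; i < n; i++)
        ap[i] = bp[i] + cp[i];                 /* no multiplies here */
}
```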
On 23/04/2026 11:03, Chris M. Thomasson wrote:
On 4/23/2026 12:22 AM, David Brown wrote:
On 22/04/2026 23:29, Chris M. Thomasson wrote:
On 4/22/2026 2:28 PM, Chris M. Thomasson wrote:
On 4/21/2026 1:13 PM, David Brown wrote:
load/store with relaxed should act like compiler barriers?
On 21/04/2026 20:51, Chris M. Thomasson wrote:
On 4/21/2026 1:13 AM, David Brown wrote:
On 20/04/2026 23:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 20/04/2026 18:48, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Yes, that's really useful!
So which implementation is faster at actually doing function calls?
And how many calls were actually made?
I don't know or care.
Once again, *there are ways* to write C benchmarks that guarantee
that all the function calls you want to time actually occur during
execution.  For example, you can use calls to separately compiled
functions (and disable link-time optimization if necessary).  You can
do computations that the compiler can't unwrap.  You might multiply
a value by (time(NULL) > 0); that always yields 1, but the compiler
probably doesn't know that.  (That's off the top of my head; I don't
know what the best techniques are in practice.)  And then you can
examine the generated code to make sure that it's what you want.
To add more suggestions here, I find the key to benchmarking
when you want to stick to standard C is use of "volatile".  Use
a volatile read at the start of your code, then calculations
that depend on each other and that first read, then a volatile
write of the result.  That gives minimal intrusion in the code
while making sure the calculations have to be generated, and
have to be done at run time.
If you are testing on a particular compiler (like gcc or clang),
then there are other options.  The "noinline" function attribute
is very handy.  Then there are empty inline assembly statements:
If you think of processor registers as acting like a level -1
memory cache (for things that are not always in registers), then
this flushes that cache:
     asm volatile ("" ::: "memory");
This tells the compiler that it needs to have calculated "x" at
this point in time (so that its value can be passed to the
assembly):
     asm volatile ("" :: "g" (x));
This tells the compiler that "x" might be changed by the
assembly, so it must forget any additional knowledge it had of it:
     asm volatile ("" : "+g" (x));
I've had use of all of these in real code, not just benchmarks
or test code.  They can be helpful in some kinds of interactions
between low level code and hardware.
Well, we have to make a difference between a compiler barrier and
a memory barrier. All memory barriers should be compiler
barriers, but compiler barriers do not have to be memory
barriers... Fair enough?
Of course there is a difference between memory barriers and
compiler barriers.  We are talking about compiler barriers here,
because they have an effect on the semantics of the language (in
this case, the language is "C with gcc extensions") without the
cost of real memory barriers.  C11 atomic fences are compiler and
memory barriers, but they can have a huge effect on code speed -
these empty assembly statements are aimed at having minimal impact
outside of the intended effects.
I think a relaxed memory barrier can be used as a compiler barrier
and be compatible with atomic, volatile does not have to be used here?
To be honest, I have never been at all sure how C11 atomic accesses
and fences relate to "memory barriers" of any sort, or how they
enforce order in respect to volatile accesses or non-volatile accesses.
The C standards at times use "volatile atomic" qualifications, which
implies that non-volatile atomic uses are not volatile.  Volatile
accesses do two things - enforce an order (in the generated code, but
not necessarily at execution on the cpu) of volatile accesses, and
make the access "observable behaviour".  My understanding is then
that C11 atomics are missing one or both of these aspects, but I
don't know which.
gcc has a "memory clobber" facility in inline assembly - and this is
commonly used as a compiler (but not cpu) memory barrier.  I know
what it does in practical terms for the way I use it, but I am not
sure how precisely it can be specified in relation to the standard C
semantics. It seems reasonable to suppose that a relaxed atomic fence
could act like a gcc compiler memory barrier, but the standard says
that "atomic_thread_fence(memory_order_relaxed)" has no effects.
The main reason I have not bothered looking at the semantics and
effects of C11 atomics is that the libatomic implementation that is
distributed with gcc is (or at least /was/ when I looked a number of
years ago) fundamentally and irreparably broken for single-core
microcontrollers. Using spinlocks to enforce atomic actions is fine
on a multi-core Linux system, but a guaranteed hang on a single-core
RTOS or when using atomics from interrupts.  So I use RTOS-specific
features, or my own critical section code (disabling interrupts is
the way to do it on these kinds of devices), along with gcc inline
assembly - it's as far from portable standard C code as you can get
and still have it mixed with C, but I don't need portability there.
But I have no objection at all if someone wants to give an
explanation of some of the C11 atomic semantics, though it might be
better in a new thread.
Yeah. Well, damn. I would hope that in the _compiled_ code, memory
ordering aside:
std::atomic<int> a = 0;
a.store(123);
a.store(666);
Better damn well issue two stores in that order. The memory order side
be damned for this moment, but I think std::atomic in impls are laced
with the volatile keyword anyway, but shit can happen. Humm...
This is c.l.c., not c.l.c++, but they use the same memory model here.
A brief test shows that gcc seems to do both stores regardless of the
memory order (for atomic_store_explicit).  With memory_order_seq_cst,
gcc appears to act as though there were a compiler memory barrier along
with the store - with memory_order_relaxed, there is no such barrier.
That is, non-volatile accesses can be moved around.  So this:
_Atomic int a1;
int i1;
void foo(int x) {
    i1 = 100;
    atomic_store_explicit(&a1, x, memory_order_relaxed);
    atomic_store_explicit(&a1, x + 1, memory_order_relaxed);
    i1 = i1 + 1;
}
gets optimised as though it were:
void foo(int x) {
    atomic_store_explicit(&a1, x, memory_order_relaxed);
    atomic_store_explicit(&a1, x + 1, memory_order_relaxed);
    i1 = 101;
}
It is difficult to test, by trial and error, if volatile accesses get
re-ordered around relaxed atomic accesses.  Regardless of semantics,
the compiler is not going to re-order them unless there are clear
efficiency benefits, and since relaxed atomic operations apparently
can't be combined (or at least, gcc does not combine them), I haven't
got any examples where the compiler would be likely to re-arrange
things if it is allowed to do so.  But my failure to find a
counter-example here does not mean that I am sure relaxed atomic
accesses cannot be re-ordered with respect to non-atomic volatile
accesses.
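The foo example quoted above compiles as-is once the header is added. A single-threaded check can only confirm the final values; whether both relaxed stores are actually emitted, and where the barrier falls under seq_cst, has to be verified by inspecting the generated code (e.g. on godbolt.org):

```c
#include <stdatomic.h>

_Atomic int a1;
int i1;

/* Same function as quoted in the thread: gcc emits both relaxed
   stores to a1, while the two plain stores to i1 may be folded
   into a single store of 101. */
void foo(int x) {
    i1 = 100;
    atomic_store_explicit(&a1, x, memory_order_relaxed);
    atomic_store_explicit(&a1, x + 1, memory_order_relaxed);
    i1 = i1 + 1;
}
```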
On 23/04/2026 02:59, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
So, what does language say about it again? Remind me! Or better, tell
the compiler.
I've already told you what the language says about it.  I quoted
the section of the ISO C standard that says explicitly that the
behavior is undefined.  N3220 6.3.2.1p2, last sentence.
The compiler's behavior is consistent with that requirement.
You cannot possibly have forgotten this.  Why do you pretend?
Nobody seems to have a problem with gcc being lax about this (or with
it allowing its users to let it be lax).
gcc is not being lax. gcc is behaving in a manner that is consistent
with the requirements of the C standard.  The code in question has
undefined behavior.
You know and understand all of that.
No, I don't.
So what is the concrete effect of all that on the behaviour of gcc and
the behaviour of the code it generates?
If something bad happens (what would that be exactly), whose fault would
that be, mine or the compiler's?
Are you suggesting that because something is tagged as UB, that it
literally gives a compiler a licence to do anything?
If so, how is that not being lax by either language, compiler, or both?
I'm starting to suspect that either nobody knows the answer, or they do,
but are chary of either blaming the compiler or criticising the language spec, and are trying to shift the blame to the user.
The behavior is undefined.ÿ You know exactly what that means, but you
pretend not to.
And yet, the behaviour I have observed is nothing remarkable: some
undefined bit patterns get used; zero is assumed; or code is just elided.
Again, do you have any real-life, practical examples of bad or unusual things happening?
If you had to put money on whether some outcome is either one of the
three I listed, or something else, which would you go for?