On 2024-03-21, David Brown <david.brown@hesbynett.no> wrote:
On 20/03/2024 19:54, Kaz Kylheku wrote:
On 2024-03-20, Stefan Ram <ram@zedat.fu-berlin.de> wrote:
A "famous security bug":
void f( void )
{ char buffer[ MAX ];
/* . . . */
memset( buffer, 0, sizeof( buffer )); }
. Can you see what the bug is?
I don't know about "the bug", but conditions can be identified under
which that would have a problem executing, like MAX being in excess
of available automatic storage.
If the /*...*/ comment represents the elision of some security sensitive code, where the memset is intended to obliterate secret information,
of course, that obliteration is not required to work.
After the memset, the buffer has no next use, so all the assignments performed by memset to the bytes of buffer are dead assignments that can be elided.
To securely clear memory, you have to use a function for that purpose
that is not susceptible to optimization.
If you're not doing anything stupid, like link time optimization, an
external function in another translation unit (a function that the
compiler doesn't recognize as being an alias or wrapper for memset)
ought to suffice.
Using LTO is not "stupid". Relying on people /not/ using LTO, or not
using other valid optimisations, is "stupid".
LTO is a nonconforming optimization. It destroys the concept that
when a translation unit is translated, the semantic analysis is
complete, such that the only remaining activity is resolution of
external references (linkage), and that the semantic analysis of one translation unit does not use information about another translation
unit.
[...]
Kaz Kylheku to Stefan Ram:
A "famous security bug":
void f( void )
{ char buffer[ MAX ];
/* . . . */
memset( buffer, 0, sizeof( buffer )); }
. Can you see what the bug is?
I don't know about "the bug", but conditions can be
identified under which that would have a problem
executing, like MAX being in excess of available automatic
storage.
If the /*...*/ comment represents the elision of some
security sensitive code, where the memset is intended to
obliterate secret information, of course, that
obliteration is not required to work.
After the memset, the buffer has no next use, so all
the assignments performed by memset to the bytes of buffer
are dead assignments that can be elided.
To securely clear memory, you have to use a function for
that purpose that is not susceptible to optimization.
I think this behavior (of a C compiler) rather stupid. In a
low-level imperative language, the compiled program shall
do whatever the programmer commands it to do. If he
commands it to clear the buffer, it shall clear the buffer.
This optimisation is too high-level, too counter-intuitive,
even deceitful. The optimiser is free to perform the task
in the fastest manner possible, but it shall not ignore the
programmer's order to zero-fill the buffer, especially
without emitting a warning about (potentially!) redundant
code, which it is the programmer's responsibility to confirm
and remove.
Redundant code shall be dealt with in the source, rather than
in the executable.
On 2024-03-21, David Brown <david.brown@hesbynett.no> wrote:[...]
On 20/03/2024 19:54, Kaz Kylheku wrote:
On 2024-03-20, Stefan Ram <ram@zedat.fu-berlin.de> wrote:
A "famous security bug":
void f( void )
{ char buffer[ MAX ];
/* . . . */
memset( buffer, 0, sizeof( buffer )); }
. Can you see what the bug is?
I don't know about "the bug", but conditions can be identified under
which that would have a problem executing, like MAX being in excess
of available automatic storage.
If the /*...*/ comment represents the elision of some security sensitive code, where the memset is intended to obliterate secret information,
of course, that obliteration is not required to work.
After the memset, the buffer has no next use, so all the assignments performed by memset to the bytes of buffer are dead assignments that can be elided.
To securely clear memory, you have to use a function for that purpose
that is not susceptible to optimization.
If you're not doing anything stupid, like link time optimization, an
external function in another translation unit (a function that the
compiler doesn't recognize as being an alias or wrapper for memset)
ought to suffice.
Using LTO is not "stupid". Relying on people /not/ using LTO, or not
using other valid optimisations, is "stupid".
LTO is a nonconforming optimization. It destroys the concept that
when a translation unit is translated, the semantic analysis is
complete, such that the only remaining activity is resolution of
external references (linkage), and that the semantic analysis of one translation unit does not use information about another translation
unit.
This has not yet changed in last April's N3096 draft, where
translation phases 7 and 8 are:
7. White-space characters separating tokens are no longer significant.
Each preprocessing token is converted into a token. The resulting
tokens are syntactically and semantically analyzed and translated
as a translation unit.
8. All external object and function references are resolved. Library
components are linked to satisfy external references to functions
and objects not defined in the current translation. All such
translator output is collected into a program image which contains
information needed for execution in its execution environment.
and before that, the Program Structure section says:
The separate translation units of a program communicate by (for
example) calls to functions whose identifiers have external linkage,
manipulation of objects whose identifiers have external linkage, or
manipulation of data files. Translation units may be separately
translated and then later linked to produce an executable program.
LTO deviates from the model that translation units are separate,
and the conceptual steps of phases 7 and 8.
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
"All of its “critical-sequences” are contained in externally assembled functions ( read all ) in order to prevent a rouge C compiler from
As opposed to a viridian C compiler?
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
On 3/21/2024 1:21 PM, Scott Lurndal wrote:
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
"All of its “critical-sequences” are contained in externally assembled functions ( read all ) in order to prevent a rouge C compiler from
As opposed to a viridian C compiler?
I was worried about "overly aggressive" LTO messing around with my ASM.
And you missed the oblique reference to the misspelling of 'rogue' as 'rouge'.
On 2024-03-21, David Brown <david.brown@hesbynett.no> wrote:
On 20/03/2024 19:54, Kaz Kylheku wrote:
On 2024-03-20, Stefan Ram <ram@zedat.fu-berlin.de> wrote:
A "famous security bug":
void f( void )
{ char buffer[ MAX ];
/* . . . */
memset( buffer, 0, sizeof( buffer )); }
. Can you see what the bug is?
I don't know about "the bug", but conditions can be identified under
which that would have a problem executing, like MAX being in excess
of available automatic storage.
If the /*...*/ comment represents the elision of some security sensitive code, where the memset is intended to obliterate secret information,
of course, that obliteration is not required to work.
After the memset, the buffer has no next use, so all the assignments performed by memset to the bytes of buffer are dead assignments that can be elided.
To securely clear memory, you have to use a function for that purpose
that is not susceptible to optimization.
If you're not doing anything stupid, like link time optimization, an
external function in another translation unit (a function that the
compiler doesn't recognize as being an alias or wrapper for memset)
ought to suffice.
Using LTO is not "stupid". Relying on people /not/ using LTO, or not
using other valid optimisations, is "stupid".
LTO is a nonconforming optimization.
It destroys the concept that
when a translation unit is translated, the semantic analysis is
complete, such that the only remaining activity is resolution of
external references (linkage), and that the semantic analysis of one translation unit does not use information about another translation
unit.
This has not yet changed in last April's N3096 draft, where
translation phases 7 and 8 are:
7. White-space characters separating tokens are no longer significant.
Each preprocessing token is converted into a token. The resulting
tokens are syntactically and semantically analyzed and translated
as a translation unit.
8. All external object and function references are resolved. Library
components are linked to satisfy external references to functions
and objects not defined in the current translation. All such
translator output is collected into a program image which contains
information needed for execution in its execution environment.
and before that, the Program Structure section says:
The separate translation units of a program communicate by (for
example) calls to functions whose identifiers have external linkage,
manipulation of objects whose identifiers have external linkage, or
manipulation of data files. Translation units may be separately
translated and then later linked to produce an executable program.
LTO deviates from the model that translation units are separate,
and the conceptual steps of phases 7 and 8.
The translation unit that is prepared for LTO is not fully cooked. You
have no idea what its code will turn into when the interrupted
compilation is resumed during linkage, under the influence of other translation units it is combined with.
So in fact, the language allows us to take it for granted that, given
my_memset(array, 0, sizeof(array)); }
at the end of a function, and my_memset is an external definition
provided by another translation unit, the call may not be elided.
The one who may be acting recklessly is he who turns on nonconforming optimizations that are not documented as supported by the code base.
Another example would be something like gcc's -ffast-math.
You wouldn't unleash that on numerical code written by experts,
and expect the same correct results.
Eliminating dead stores is a very basic dataflow-driven optimization.
Because memset is part of the C language, the compiler knows
exactly what effect it has (that it's equivalent to setting
all the bytes to zero, like a sequence of assignments).
If you don't want a call to be optimized away, call your
own function in another translation unit.
(And don't turn
on nonconforming cross-translation-unit optimizations.)
On 21/03/2024 21:21, Kaz Kylheku wrote:
Eliminating dead stores is a very basic dataflow-driven optimization.
Because memset is part of the C language, the compiler knows
exactly what effect it has (that it's equivalent to setting
all the bytes to zero, like a sequence of assignments).
Yes.
If you don't want a call to be optimized away, call your
own function in another translation unit.
No.
There are several ways that guarantee your code will carry out the
writes here (though none that guarantee the secret data is not also
stored elsewhere). Using a function in a different TU is not one of
these techniques. You do people a disfavour by recommending it.
(And don't turn
on nonconforming cross-translation-unit optimizations.)
If I knew of any non-conforming cross-translation-unit optimisations in
a compiler, I would avoid using them until the compiler vendor had fixed
the bug in question.
Kaz Kylheku <433-929-6894@kylheku.com> writes:
On 2024-03-21, David Brown <david.brown@hesbynett.no> wrote:[...]
On 20/03/2024 19:54, Kaz Kylheku wrote:
On 2024-03-20, Stefan Ram <ram@zedat.fu-berlin.de> wrote:
A "famous security bug":
void f( void )
{ char buffer[ MAX ];
/* . . . */
memset( buffer, 0, sizeof( buffer )); }
. Can you see what the bug is?
I don't know about "the bug", but conditions can be identified under
which that would have a problem executing, like MAX being in excess
of available automatic storage.
If the /*...*/ comment represents the elision of some security sensitive code, where the memset is intended to obliterate secret information,
of course, that obliteration is not required to work.
After the memset, the buffer has no next use, so all the assignments performed by memset to the bytes of buffer are dead assignments that can be elided.
To securely clear memory, you have to use a function for that purpose
that is not susceptible to optimization.
If you're not doing anything stupid, like link time optimization, an
external function in another translation unit (a function that the
compiler doesn't recognize as being an alias or wrapper for memset)
ought to suffice.
Using LTO is not "stupid". Relying on people /not/ using LTO, or not
using other valid optimisations, is "stupid".
LTO is a nonconforming optimization. It destroys the concept that
when a translation unit is translated, the semantic analysis is
complete, such that the only remaining activity is resolution of
external references (linkage), and that the semantic analysis of one
translation unit does not use information about another translation
unit.
This has not yet changed in last April's N3096 draft, where
translation phases 7 and 8 are:
7. White-space characters separating tokens are no longer significant.
Each preprocessing token is converted into a token. The resulting
tokens are syntactically and semantically analyzed and translated
as a translation unit.
8. All external object and function references are resolved. Library
components are linked to satisfy external references to functions
and objects not defined in the current translation. All such
translator output is collected into a program image which contains
information needed for execution in its execution environment.
and before that, the Program Structure section says:
The separate translation units of a program communicate by (for
example) calls to functions whose identifiers have external linkage,
manipulation of objects whose identifiers have external linkage, or
manipulation of data files. Translation units may be separately
translated and then later linked to produce an executable program.
LTO deviates from the model that translation units are separate,
and the conceptual steps of phases 7 and 8.
Link time optimization is as valid as cross-function optimization *as
long as* it doesn't change the defined behavior of the program.
Say I have a call to foo in main, and the definition of foo is in
another translation unit. In the absence of LTO, the compiler will have
to generate a call to foo. If LTO is able to determine that foo doesn't
do anything, it can remove the code for the function call, and the
resulting behavior of the linked program is unchanged.
On 2024-03-21, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Kaz Kylheku <433-929-6894@kylheku.com> writes:
On 2024-03-21, David Brown <david.brown@hesbynett.no> wrote:[...]
On 20/03/2024 19:54, Kaz Kylheku wrote:
On 2024-03-20, Stefan Ram <ram@zedat.fu-berlin.de> wrote:
A "famous security bug":
void f( void )
{ char buffer[ MAX ];
/* . . . */
memset( buffer, 0, sizeof( buffer )); }
. Can you see what the bug is?
I don't know about "the bug", but conditions can be identified under
which that would have a problem executing, like MAX being in excess
of available automatic storage.
If the /*...*/ comment represents the elision of some security sensitive
code, where the memset is intended to obliterate secret information,
of course, that obliteration is not required to work.
After the memset, the buffer has no next use, so all the assignments
performed by memset to the bytes of buffer are dead assignments that can
be elided.
To securely clear memory, you have to use a function for that purpose
that is not susceptible to optimization.
If you're not doing anything stupid, like link time optimization, an
external function in another translation unit (a function that the
compiler doesn't recognize as being an alias or wrapper for memset)
ought to suffice.
Using LTO is not "stupid". Relying on people /not/ using LTO, or not
using other valid optimisations, is "stupid".
LTO is a nonconforming optimization. It destroys the concept that
when a translation unit is translated, the semantic analysis is
complete, such that the only remaining activity is resolution of
external references (linkage), and that the semantic analysis of one
translation unit does not use information about another translation
unit.
This has not yet changed in last April's N3096 draft, where
translation phases 7 and 8 are:
7. White-space characters separating tokens are no longer significant.
Each preprocessing token is converted into a token. The resulting
tokens are syntactically and semantically analyzed and translated
as a translation unit.
8. All external object and function references are resolved. Library
components are linked to satisfy external references to functions
and objects not defined in the current translation. All such
translator output is collected into a program image which contains
information needed for execution in its execution environment.
and before that, the Program Structure section says:
The separate translation units of a program communicate by (for
example) calls to functions whose identifiers have external linkage,
manipulation of objects whose identifiers have external linkage, or
manipulation of data files. Translation units may be separately
translated and then later linked to produce an executable program.
LTO deviates from the model that translation units are separate,
and the conceptual steps of phases 7 and 8.
Link time optimization is as valid as cross-function optimization *as
long as* it doesn't change the defined behavior of the program.
It always does; the interaction of a translation unit with another
is an externally visible aspect of the C program. (That can be inferred
from the rules which forbid semantic analysis across translation
units, only linkage.)
That's why we can have a real world security issue caused by zeroing
being optimized away.
The rules spelled out in ISO C allow us to unit test a translation
unit by linking it to some harness, and be sure it has exactly the
same behaviors when linked to the production program.
If I have some translation unit in which there is a function foo, such
that when I call foo, it then calls an external function bar, that's observable. I can link that unit to a program which supplies bar,
containing a printf call, then call foo and verify that the printf call
is executed.
Since ISO C says that the semantic analysis has been done (that
unit having gone through phase 7), we can take it for granted as a done-and-dusted property of that translation unit that it calls bar
whenever its foo is invoked.
Say I have a call to foo in main, and the definition of foo is in
another translation unit. In the absence of LTO, the compiler will have
to generate a call to foo. If LTO is able to determine that foo doesn't
do anything, it can remove the code for the function call, and the
resulting behavior of the linked program is unchanged.
There are always situations in which optimizations that have been forbidden
don't cause a problem, and are even desirable.
If you have LTO turned on, you might be programming in GNU C or Clang C
or whatever, not standard C.
Sometimes programs have the same interpretation in GNU C and standard
C, or the same interpretation to someone who doesn't care about certain differences.
On 2024-03-21, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:....
Link time optimization is as valid as cross-function optimization *as
long as* it doesn't change the defined behavior of the program.
It always does; the interaction of a translation unit with another
is an externally visible aspect of the C program.
... (That can be inferred
from the rules which forbid semantic analysis across translation
units, only linkage.)
If I have some translation unit in which there is a function foo, such
that when I call foo, it then calls an external function bar, that's observable.
Since ISO C says that the semantic analysis has been done (that
unit having gone through phase 7),
If you have LTO turned on, you might be programming in GNU C or Clang C
or whatever, not standard C.
I think this behavior (of a C compiler) rather stupid. In a
low-level imperative language, the compiled program shall
do whatever the programmer commands it to do.
They are not fixable. Translation units are separate, subject
to separate semantic analysis, which is settled prior to linkage.
The semantic analysis of one translation unit must be carried out in the absence of any information about what is in another translation unit.
Kaz Kylheku <433-929-6894@kylheku.com> writes:
Since ISO C says that the semantic analysis has been done (that
unit having gone through phase 7), we can take it for granted as a
done-and-dusted property of that translation unit that it calls bar
whenever its foo is invoked.
We can take it for granted that the output performed by the printf call
will be performed, because output is observable behavior. If the
external function bar is modified, the LTO step has to be redone.
Say I have a call to foo in main, and the definition of foo is in
another translation unit. In the absence of LTO, the compiler will have
to generate a call to foo. If LTO is able to determine that foo doesn't
do anything, it can remove the code for the function call, and the
resulting behavior of the linked program is unchanged.
There are always situations in which optimizations that have been forbidden
don't cause a problem, and are even desirable.
If you have LTO turned on, you might be programming in GNU C or Clang C
or whatever, not standard C.
Sometimes programs have the same interpretation in GNU C and standard
C, or the same interpretation to someone who doesn't care about certain
differences.
Are you claiming that a function call is observable behavior?
Consider:
main.c:
#include "foo.h"
int main(void) {
    foo();
}
foo.h:
#ifndef FOO_H
#define FOO_H
void foo(void);
#endif
foo.c:
void foo(void) {
    // do nothing
}
Are you saying that the "call" instruction generated for the function
call is *observable behavior*?
If an implementation doesn't generate
that "call" instruction because it's able to determine at link time that
the call does nothing, that optimization is forbidden?
I presume you'd agree that omitting the "call" instruction is allowed if
the call and the function definition are in the same translation unit.
What wording in the standard requires a "call" instruction to be
generated if they're in different translation units?
That's a trivial example, but other link time optimizations that don't
change a program's observable behavior (insert weasel words about
unspecified behavior) are also allowed.
In phase 8:
All external object and function references are resolved. Library
components are linked to satisfy external references to functions
and objects not defined in the current translation. All such
translator output is collected into a program image which contains
information needed for execution in its execution environment.
I don't see anything about required CPU instructions.
On 3/21/24 16:46, Keith Thompson wrote:
...
Link time optimization is as valid as cross-function optimization *as
long as* it doesn't change the defined behavior of the program.
Minor adjustment: due to unspecified behavior, some code can have
multiple permitted behaviors. LTO could be conforming even if it changed
the behavior, as long as it changes it to one of the other permitted behaviors. For implementation-defined behavior, the fact that the change could happen would have to be documented.
On 2024-03-22, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:....
Are you claiming that a function call is observable behavior?
Yes. It is the observable behavior of an unlinked translation unit.
Are you saying that the "call" instruction generated for the function
call is *observable behavior*?
Of course; it can be observed externally, without doing any reverse engineering on the translated unit.
If an implementation doesn't generate
that "call" instruction because it's able to determine at link time that
the call does nothing, that optimization is forbidden?
The text says so. Translation units are separate; semantic analysis is finished in translation phase 7; linking in 8.
Out of translation phases 1-7 we get a concrete artifact: the translated unit. That has externally visible features, like what symbols it
requires. Its behavior with regard to those symbols can be empirically observed, validated by tests and expected to hold thereafter.
If the unspecified behavior of a translation unit is changed to another in
a way that obviously requires semantic analysis (such that a change
occurs in the translated unit that amounts to it having been
re-translated) then that appears to violate the requirements in ISO C
about semantic analysis being done in phase 7, and not any later.
On 2024-03-21, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Kaz Kylheku <433-929-6894@kylheku.com> writes:
On 2024-03-21, David Brown <david.brown@hesbynett.no> wrote:[...]
On 20/03/2024 19:54, Kaz Kylheku wrote:
On 2024-03-20, Stefan Ram <ram@zedat.fu-berlin.de> wrote:
A "famous security bug":
void f( void )
{ char buffer[ MAX ];
/* . . . */
memset( buffer, 0, sizeof( buffer )); }
. Can you see what the bug is?
I don't know about "the bug", but conditions can be identified under >>>>> which that would have a problem executing, like MAX being in excess
of available automatic storage.
If the /*...*/ comment represents the elision of some security sensitive >>>>> code, where the memset is intended to obliterate secret information, >>>>> of course, that obliteration is not required to work.
After the memset, the buffer has no next use, so the all the assignments >>>>> performed by memset to the bytes of buffer are dead assignments that can >>>>> be elided.
To securely clear memory, you have to use a function for that purpose >>>>> that is not susceptible to optimization.
If you're not doing anything stupid, like link time optimization, an >>>>> external function in another translation unit (a function that the
compiler doesn't recognize as being an alias or wrapper for memset)
ought to suffice.
Using LTO is not "stupid". Relying on people /not/ using LTO, or not
using other valid optimisations, is "stupid".
LTO is a nonconforming optimization. It destroys the concept that
when a translation unit is translated, the semantic analysis is
complete, such that the only remaining activity is resolution of
external references (linkage), and that the semantic analysis of one
translation unit deos not use information about another translation
unit.
This has not yet changed in last April's N3096 draft, where
translation phases 7 and 8 are:
7. White-space characters separating tokens are no longer significant. >>> Each preprocessing token is converted into a token. The resulting
tokens are syntactically and semantically analyzed and translated
as a translation unit.
8. All external object and function references are resolved. Library
components are linked to satisfy external references to functions
and objects not defined in the current translation. All such
translator output is collected into a program image which contains >>> information needed for execution in its execution environment.
and before that, the Program Structure section says:
The separate translation units of a program communicate by (for
example) calls to functions whose identifiers have external linkage,
manipulation of objects whose identifiers have external linkage, or
manipulation of data files. Translation units may be separately
translated and then later linked to produce an executable program.
LTO deviates from the the model that translation units are separate,
and the conceptual steps of phases 7 and 8.
Link time optimization is as valid as cross-function optimization *as
long as* it doesn't change the defined behavior of the program.
It always does; the interaction of a translation unit with another
is an externally visible aspect of the C program.
(That can be inferred
from the rules which forbid semantic analysis across translation
units, only linkage.)
That's why we can have a real world security issue caused by zeroing
being optimized away.
The rules spelled out in ISO C allow us to unit test a translation
unit by linking it to some harness, and be sure it has exactly the
same behaviors when linked to the production program.
If I have some translation unit in which there is a function foo, such
that when I call foo, it then calls an external function bar, that's observable.
I can link that unit to a program which supplies bar,
containing a printf call, then call foo and verify that the printf call
is executed.
Since ISO C says that the semantic analysis has been done (that
unit having gone through phase 7), we can take it for granted as a done-and-dusted property of that translation unit that it calls bar
whenever its foo is invoked.
Say I have a call to foo in main, and the definition of foo is in
another translation unit. In the absence of LTO, the compiler will have
to generate a call to foo. If LTO is able to determine that foo doesn't
do anything, it can remove the code for the function call, and the
resulting behavior of the linked program is unchanged.
There are always situations in which optimizations that have been forbidden
don't cause a problem, and are even desirable.
If you have LTO turned on, you might be programming in GNU C or Clang C
or whatever, not standard C.
Sometimes programs have the same interpretation in GNU C and standard
C, or the same interpretation to someone who doesn't care about certain differences.
On 2024-03-22, David Brown <david.brown@hesbynett.no> wrote:
On 21/03/2024 21:21, Kaz Kylheku wrote:
Eliminating dead stores is a very basic dataflow-driven optimization.
Because memset is part of the C language, the compiler knows
exactly what effect it has (that it's equivalent to setting
all the bytes to zero, like a sequence of assignments).
Yes.
If you don't want a call to be optimized away, call your
own function in another translation unit.
No.
There are several ways that guarantee your code will carry out the
writes here (though none that guarantee the secret data is not also
stored elsewhere). Using a function in a different TU is not one of
these techniques. You do people a disfavour by recommending it.
It demonstrably is.
(And don't turn
on nonconforming cross-translation-unit optimizations.)
If I knew of any non-conforming cross-translation-unit optimisations in
a compiler, I would avoid using them until the compiler vendor had fixed
the bug in question.
They are not fixable. Translation units are separate, subject
to separate semantic analysis, which is settled prior to linkage.
The semantic analysis of one translation unit must be carried out in the absence of any information about what is in another translation unit.
On 2024-03-22, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Kaz Kylheku <433-929-6894@kylheku.com> writes:
Are you claiming that a function call is observable behavior?
Yes. It is the observable behavior of an unlinked translation unit.
It can be observed by linking a harness to it, with a main() function
and all else that is required to make it a complete program.
That harness becomes an instrument for observation.
Are you saying that the "call" instruction generated for the function
call is *observable behavior*?
Of course; it can be observed externally, without doing any reverse engineering on the translated unit.
If an implementation doesn't generate
that "call" instruction because it's able to determine at link time that
the call does nothing, that optimization is forbidden?
The text says so. Translation units are separate; semantic analysis is finished in translation phase 7; linking in 8.
What wording in the standard requires a "call" instruction to be
generated if they're in different translation units?
That's a trivial example, but other link time optimizations that don't
change a program's observable behavior (insert weasel words about
unspecified behavior) are also allowed.
An example would be the removal of material that is not referenced,
like functions not called anywhere, or entire translation units
whose external names are not referenced. That can cause issues too,
and I've run into them, but I can't call that nonconforming.
Nothing is semantically analyzed across translation units, only the
linkage graph itself, which may be found to be disconnected.
In phase 8:
All external object and function references are resolved. Library
components are linked to satisfy external references to functions
and objects not defined in the current translation. All such
translator output is collected into a program image which contains
information needed for execution in its execution environment.
I don't see anything about required CPU instructions.
I don't see anything about /removing/ instructions that have to be
there according to the semantic analysis performed in order to
translate those units from phases 1 - 7, and that can be confirmed
to be present with a test harness.
On 2024-03-22, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Kaz Kylheku <433-929-6894@kylheku.com> writes:
Since ISO C says that the semantic analysis has been done (that
unit having gone through phase 7), we can take it for granted as a
done-and-dusted property of that translation unit that it calls bar
whenever its foo is invoked.
We can take it for granted that the output performed by the printf call
will be performed, because output is observable behavior. If the
external function bar is modified, the LTO step has to be redone.
That's what undeniably has to be done in the LTO world. Nothing that
is done brings that world into conformance, though.
Say I have a call to foo in main, and the definition of foo is in
another translation unit. In the absence of LTO, the compiler will have to generate a call to foo. If LTO is able to determine that foo doesn't do anything, it can remove the code for the function call, and the
resulting behavior of the linked program is unchanged.
There are always situations in which optimizations that have been forbidden
don't cause a problem, and are even desirable.
If you have LTO turned on, you might be programming in GNU C or Clang C
or whatever, not standard C.
Sometimes programs have the same interpretation in GNU C and standard
C, or the same interpretation to someone who doesn't care about certain
differences.
Are you claiming that a function call is observable behavior?
Yes. It is the observable behavior of an unlinked translation unit.
It can be observed by linking a harness to it, with a main() function
and all else that is required to make it a complete program.
That harness becomes an instrument for observation.
Are you saying that the "call" instruction generated for the function
call is *observable behavior*?
Of course; it can be observed externally, without doing any reverse engineering on the translated unit.
In phase 8:
All external object and function references are resolved. Library
components are linked to satisfy external references to functions
and objects not defined in the current translation. All such
translator output is collected into a program image which contains
information needed for execution in its execution environment.
I don't see anything about required CPU instructions.
I don't see anything about /removing/ instructions that have to be
there according to the semantic analysis performed in order to
translate those units from phases 1 - 7, and that can be confirmed
to be present with a test harness.
You should read the footnotes to 5.1.1.2 "Translation phases".
Footnotes are not normative, but they are helpful in explaining the
meaning of the text. They note that compilers don't have to follow the details of the translation phases, and that source files, translation
units, and translated translation units don't have to have one-to-one correspondences.
The standard also does not say what the output of "translation" is - it
does not have to be assembly or machine code. It can happily be an
internal format, as used by gcc and clang/llvm. It does not define what "linking" is, or how the translated translation units are "collected
into a program image" - combining the partially compiled units,
optimising, and then generating a program image is well within that definition.
(That can be inferred
from the rules which forbid semantic analysis across translation
units, only linkage.)
The rules do not forbid semantic analysis across translation units -
they merely do not /require/ it. You are making an inference without
any justification that I can see.
That's why we can have a real world security issue caused by zeroing
being optimized away.
No, it is not. We have real-world security issues for all sorts of
reasons, including people mistakenly thinking they can force particular types of code generation by calling functions in different source files.
The rules spelled out in ISO C allow us to unit test a translation
unit by linking it to some harness, and be sure it has exactly the
same behaviors when linked to the production program.
No, they don't.
If the unit you are testing calls something outside that unit, you may
get different behaviours when testing and when used in production.
The only thing you can be sure of from testing is that if you find a bug
during testing, you have a bug in the code. You can never use testing
to be sure that the code works (with the exception of exhaustive testing
of all possible inputs, which is rarely practical).
If I have some translation unit in which there is a function foo, such
that when I call foo, it then calls an external function bar, that's
observable.
5.1.2.2.1p6 lists the three things that C defines as "observable
behaviour". Function calls - internal or external - are not amongst these.
I can link that unit to a program which supplies bar,
containing a printf call, then call foo and verify that the printf call
is executed.
Yes, you can. The printf call - or, more exactly, the "input and output dynamics" - are observable behaviour. The call to "bar", however, is not.
The compiler, when compiling the source of "foo", will include a call to "bar" when it does not have the source code (or other detailed semantic information) for "bar" available at the time.
But you are mistaken to
think it does so because the call is "observable" or required by the C standard.
It does so because it cannot prove that /running/ the
function "bar" contains no observable behaviour, or otherwise affects
the observable behaviour of the program. The compiler cannot skip the
call unless it can be sure it is safe to do so - and if it knows nothing about the implementation of "bar", it must assume the worst.
Sometimes the compiler may have additional information - such as if it
is declared with the gcc "const" or "pure" attributes (or the standardised "unsequenced" and "reproducible" attributes in the draft for the next C version after C23).
Since ISO C says that the semantic analysis has been done (that
unit having gone through phase 7), we can take it for granted as a
done-and-dusted property of that translation unit that it calls bar
whenever its foo is invoked.
No, we can't - see above. Nothing in the C standards forbids any
additional analysis, or using other information in code generation.
Say I have a call to foo in main, and the definition of foo is in
another translation unit. In the absence of LTO, the compiler will have to generate a call to foo. If LTO is able to determine that foo doesn't do anything, it can remove the code for the function call, and the
resulting behavior of the linked program is unchanged.
There are always situations in which optimizations that have been forbidden
don't cause a problem, and are even desirable.
Can you give examples?
You already mentioned "-fast-math" (and by implication, its various
subflags in gcc, clang and icc). These are clearly documented as
allowing some violations of the C standards (and not least, the IEEE floating point standards, which are stricter than those of C).
(While I don't much like an "appeal to authority" argument, I think it's worth noting that the major C / C++ compilers, gcc, clang/llvm and MSVC,
all support link-time optimisation. They also all work together with
both the C and C++ standards committees. It would be quite the scandal
if there were any truth in your claims and these compiler vendors were
all breaking the rules of the languages they help to specify!)
And the C standard imposes no requirement that such behavior occur as described by the abstract semantics. Only actual observable behavior, as
that term is defined by the C standard, must occur as if those semantics
were followed - whether or not they actually were.
On 3/21/2024 4:19 PM, Scott Lurndal wrote:
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
On 3/21/2024 1:21 PM, Scott Lurndal wrote:
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
"All of its “critical-sequences” are contained in externally assembled
functions ( read all ) in order to prevent a rouge C compiler from
As opposed to a viridian C compiler?
I was worried about "overly aggressive" LTO messing around with my ASM.
And you missed the oblique reference to the misspelling of 'rogue' as
'rouge'.
Yup! I sure did. I have red on my face!
Is the "call" instruction *observable behavior* as defined in 5.1.2.3?
[...]
In phase 8:
All external object and function references are resolved. Library
components are linked to satisfy external references to functions
and objects not defined in the current translation. All such
translator output is collected into a program image which contains
information needed for execution in its execution environment.
I don't see anything about required CPU instructions.
I don't see anything about /removing/ instructions that have to be
there according to the semantic analysis performed in order to
translate those units from phases 1 - 7, and that can be confirmed
to be present with a test harness.
The standard doesn't mention either adding or removing instructions.
Running a program under a test harness is effectively running a
different program. Of course it can yield information about the
original program, but in effect you're linking the program with a
different set of libraries.
I can use a test harness to observe whether a program uses an add or inc instruction to evaluate `i++` (assuming the CPU has both instructions).
The standard doesn't care how the increment happens, as long as the
result is correct. It doesn't care *whether* the increment happens
unless the result affects the programs *observable behavior*.
What in the description of translation phases 7 and 8 makes behavior-preserving optimizations valid in phase 7 and forbidden in
phase 8? (Again, insert weasel words about unspecified behavior.)
On 2024-03-22, David Brown <david.brown@hesbynett.no> wrote:
You should read the footnotes to 5.1.1.2 "Translation phases".
Footnotes are not normative, but they are helpful in explaining the
meaning of the text. They note that compilers don't have to follow the
details of the translation phases, and that source files, translation
units, and translated translation units don't have to have one-to-one
correspondences.
Yes, I'm aware of that. For instance preprocessing can all be jumbled
into one process. But it has to produce that result.
Even if translation phases 7 and 8 are combined, the semantic analysis
of the individual translation unit has to appear to be settled before linkage. So for instance a translation unit could incrementally emerge
from the semantic analysis steps, and those parts of it already analyzed (phase 7) could start to be linked to other translation units (phase 8).
I'm just saying that certain information leakage is clearly permitted, regardless of how the phases are integrated.
The standard also does not say what the output of "translation" is - it
does not have to be assembly or machine code. It can happily be an
internal format, as used by gcc and clang/llvm. It does not define what
"linking" is, or how the translated translation units are "collected
into a program image" - combining the partially compiled units,
optimising, and then generating a program image is well within that
definition.
(That can be inferred
from the rules which forbid semantic analysis across translation
units, only linkage.)
The rules do not forbid semantic analysis across translation units -
they merely do not /require/ it. You are making an inference without
any justification that I can see.
Translation phase 7 is clearly about a single translation unit in
isolation:
"The resulting tokens are syntactically and semantically analyzed
and translated as a translation unit."
Not: "as a combination of multiple translation units".
5.1.1.1 clearly refers to "[t]he separate translation units of a
program".
LTO pretends that the program is still divided into the same translation
units, while mingling them together in ways contrary to all those
chapter 5 descriptions.
The conforming way to obtain LTO is to actually combine multiple preprocessing translation units into one.
That's why we can have a real world security issue caused by zeroing
being optimized away.
No, it is not. We have real-world security issues for all sorts of
reasons, including people mistakenly thinking they can force particular
types of code generation by calling functions in different source files.
In fact, that code generation is forced, when people do not use LTO,
which is not enabled by default.
The rules spelled out in ISO C allow us to unit test a translation
unit by linking it to some harness, and be sure it has exactly the
same behaviors when linked to the production program.
No, they don't.
If the unit you are testing calls something outside that unit, you may
get different behaviours when testing and when used in production.
Yes; if you do nonconforming things.
The only thing you can be sure of from testing is that if you find a bug
during testing, you have a bug in the code. You can never use testing
to be sure that the code works (with the exception of exhaustive testing
of all possible inputs, which is rarely practical).
LTO will break translation units that are simple enough to be trivially proven to have a certain behavior.
If I have some translation unit in which there is a function foo, such
that when I call foo, it then calls an external function bar, that's
observable.
5.1.2.2.1p6 lists the three things that C defines as "observable
behaviour". Function calls - internal or external - are not amongst these.
External calls are de facto observable, because we take it for granted
that when we have a translation unit that calls a certain function, we
can supply another translation unit which defines that function. In
that function we can communicate with the host environment to confirm
that it was called.
I can link that unit to a program which supplies bar,
containing a printf call, then call foo and verify that the printf call
is executed.
Yes, you can. The printf call - or, more exactly, the "input and output
dynamics" - are observable behaviour. The call to "bar", however, is not.
If foo does not call bar, then the observable behavior of the printf
doesn't occur either; they are linked by logic / cause and effect.
A behavior that is not itself formally classified as observable can be discovered by logical linkage to be necessary for the production of observable behavior. It can be an "if, and only if" linkage.
If an observable behavior B occurs if, and only if, some behavior A
occurs, then the fact of whether A occurs or not is de facto observable.
The compiler, when compiling the source of "foo", will include a call to
"bar" when it does not have the source code (or other detailed semantic
information) for "bar" available at the time.
Translation phases 1 to 7 forbid processing material from another
translation unit.
Conforming semantic analysis of a translation unit has nothing to work
with but that translation unit.
But you are mistaken to
think it does so because the call is "observable" or required by the C
standard.
Sure; let's say that the call can be tied to observable behavior
elsewhere such that the call occurs if and only if the observable
behavior occurs.
It does so because it cannot prove that /running/ the
function "bar" contains no observable behaviour, or otherwise affects
the observable behaviour of the program. The compiler cannot skip the
call unless it can be sure it is safe to do so - and if it knows nothing
about the implementation of "bar", it must assume the worst.
The compiler cannot do any of this if it is in a conforming mode.
But sure, it can in the nonconforming LTO paradigm, which does have to
adhere to sane rules: ones that more or less follow what would have to
happen if multiple preprocessing translation units were merged at the
token level and thus analyzed together.
Sometimes the compiler may have additional information - such as if it
is declared with the gcc "const" or "pure" attributes (or the standardised
"unsequenced" and "reproducible" attributes in the draft for the next C
version after C23).
If the declarations are available only in another translation unit,
they cannot be taken into account when analyzing this translation unit.
Since ISO C says that the semantic analysis has been done (that
unit having gone through phase 7), we can take it for granted as a
done-and-dusted property of that translation unit that it calls bar
whenever its foo is invoked.
No, we can't - see above. Nothing in the C standards forbids any
additional analysis, or using other information in code generation.
Any semantic analysis performed must be that which is stated in translation
phase 7, which happens for one translation unit, before considering
linkage to other translation units.
What forbids it is that no semantic analysis activity is described as
taking place in translation phase 8, other than linkage.
Say I have a call to foo in main, and the definition of foo is in
another translation unit. In the absence of LTO, the compiler will have to generate a call to foo. If LTO is able to determine that foo doesn't do anything, it can remove the code for the function call, and the
resulting behavior of the linked program is unchanged.
There are always situations in which optimizations that have been forbidden
don't cause a problem, and are even desirable.
Can you give examples?
You already mentioned "-fast-math" (and by implication, its various
subflags in gcc, clang and icc). These are clearly documented as
allowing some violations of the C standards (and not least, the IEEE
floating point standards, which are stricter than those of C).
Yes, and some people want that, learn how it works, and get their
programs working with it, all the while knowing that it's
nonconforming to IEEE and ISO C.
Another tool in the box.
(While I don't much like an "appeal to authority" argument, I think it's
worth noting that the major C / C++ compilers, gcc, clang/llvm and MSVC,
all support link-time optimisation. They also all work together with
both the C and C++ standards committees. It would be quite the scandal
if there were any truth in your claims and these compiler vendors were
all breaking the rules of the languages they help to specify!)
Why would it be?
In the first place, all the implementations you mention have to be
explicitly put into a nondefault configuration in order to resemble conforming ISO C implementations.
LTO is not even enabled by default (for good reasons).
A few goofballs who maintain GNU/Linux distros are turning on LTO for compiling upstream packages whose development they know nothing about
beyond ./configure && make. (Luckily, the projects themselves can take countermeasures to defend against this.)
I think the fact that LTO is almost certainly nonconforming deserves
more attention, but not panic or anything like that.
LTO should be made into a conforming feature that is optional.
Translation phase 8 can be split into 8 and 9. In 8, translation units
would be optionally partitioned into subsets. Each subset containing
two or more translation units would be subjected to further semantic analysis, as a group, and turned into a subset translation unit.
Phase 9 would be same as former 8.
Whether an implementation supports subsetting and the manner in which
units are indicated for subsetting would be implementation-defined, but
it would be clear that there is a semantic difference, and that each implementation must support a translation mode in which the subsetting
isn't performed.
On 3/21/24 14:13, Anton Shepelev wrote:
...
I think this behavior (of a C compiler) rather stupid. In a
low-level imperative language, the compiled program shall
do whatever the programmer commands it to do.
C is NOT that low a level of language. The standard explicitly allows implementations to use any method they find convenient to produce
observable behavior which is consistent with the requirements of the standard. Despite describing how that behavior might be produced by the abstract machine, it explicitly allows an implementation to achieve that behavior by other means.
If you want to tell a system not only what a program must do, but also
how it must do it, you need to use a lower-level language than C.
On 22/03/2024 17:14, James Kuyper wrote:[...]
If you want to tell a system not only what a program must do, but
also how it must do it, you need to use a lower-level language than
C.
Which one?
I don't think anyone seriously wants to switch to assembly for the
sort of tasks they want to use C for.
bart <bc@freeuk.com> writes:
On 22/03/2024 17:14, James Kuyper wrote:[...]
If you want to tell a system not only what a program must do, but
also how it must do it, you need to use a lower-level language than
C.
Which one?
Good question.
I don't think anyone seriously wants to switch to assembly for the
sort of tasks they want to use C for.
Agreed. What some people seem to be looking for is a language that's
about as portable as C, but where every language construct is required
to result in generated code that performs the specified operation.
There's a lot of handwaving in that description. "C without
optimization", maybe?
I'm not aware that any such language exists, at least in the mainstream
(and I've looked at a *lot* of programming languages). I conclude that
there just isn't enough demand for that kind of thing.
On 22/03/2024 17:14, James Kuyper wrote:[...]
If you want to tell a system not only what a program must do, but
also how it must do it, you need to use a lower-level language than
C.
Which one?
I don't think anyone seriously wants to switch to assembly for the
sort of tasks they want to use C for.
I have tried to explain the reality of what the C standards say in a
couple of posts (including one that I had not posted before you wrote
this one). I have tried to make things as clear as possible, and
hopefully you will see the point.
If not, then you must accept that you interpret the C standards in a different manner from the main compiler vendors, as well as some "big
names" in this group. That is, of course, not proof in itself - but
you must realise that for practical purposes you need to be aware of
how others interpret the standard, both for your own coding and for
the advice or recommendations you give to others.
bart <bc@freeuk.com> writes:
On 22/03/2024 17:14, James Kuyper wrote:[...]
If you want to tell a system not only what a program must do, but
also how it must do it, you need to use a lower-level language than
C.
Which one?
That's up to you. The point is, C is NOT that language.
I don't think anyone seriously wants to switch to assembly for the
sort of tasks they want to use C for.
Why not? Assembly provides the kind of control you're looking for; C
does not. If that kind of control is important to you, you have to find
a language which provides it. If not assembler or C, what would you use?
On 2024-03-22, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Is the "call" instruction *observable behavior* as defined in 5.1.2.3?
Running a program under a test harness is effectively running a
different program. Of course it can yield information about the
original program, but in effect you're linking the program with a
different set of libraries.
It's a different program, but the retained translation unit must be the
same, except that the external references it makes are resolved to
different entities.
If in one program we have an observable behavior which implies that a
call took place (that itself not being directly observable, by
definition, I again acknowledge) then under the same conditions in
another program, that call also has to take place, by the fact that the translation unit has not changed.
David Brown <david.brown@hesbynett.no> writes:
I have tried to explain the reality of what the C standards say in a
couple of posts (including one that I had not posted before you wrote
this one). I have tried to make things as clear as possible, and
hopefully you will see the point.
If not, then you must accept that you interpret the C standards in a
different manner from the main compiler vendors, as well as some "big
names" in this group. That is, of course, not proof in itself - but
you must realise that for practical purposes you need to be aware of
how others interpret the standard, both for your own coding and for
the advice or recommendations you give to others.
Agreed that the ship has sailed on whether LTO is a valid optimization.
But it’s understandable why someone might reach a different conclusion.
- Phase 7 says the tokens are “semantically analyzed and translated as a
translation unit”.
- Phase 8 does not use either verb, “analyzed” or “translated”.
This would be very easy to address, by replacing “collected” with a word or phrase that makes clear that further analysis and translation can
happen outside the “as a translation unit” context.
On 22/03/2024 20:43, Kaz Kylheku wrote:
On 2024-03-22, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Is the "call" instruction *observable behavior* as defined in 5.1.2.3?
Running a program under a test harness is effectively running a
different program. Of course it can yield information about the
original program, but in effect you're linking the program with a
different set of libraries.
It's a different program, but the retained translation unit must be the
same, except that the external references it makes are resolved to
different entities.
That is true - /if/ you make the restriction that the translation unit
is compiled completely to linkable machine code or assembly, and that it
is not changed in any way when it is combined into the new program.
Such a setup is common in practice, but it is in no way required by the
C standards and does not apply for more advanced compilation and build scenarios.
David Brown <david.brown@hesbynett.no> writes:
I have tried to explain the reality of what the C standards say in a
couple of posts (including one that I had not posted before you wrote
this one). I have tried to make things as clear as possible, and
hopefully you will see the point.
If not, then you must accept that you interpret the C standards in a
different manner from the main compiler vendors, as well as some "big
names" in this group. That is, of course, not proof in itself - but
you must realise that for practical purposes you need to be aware of
how others interpret the standard, both for your own coding and for
the advice or recommendations you give to others.
Agreed that the ship has sailed on whether LTO is a valid optimization.
But it’s understandable why someone might reach a different conclusion.
- Phase 7 says the tokens are “semantically analyzed and translated as a
translation unit”.
- Phase 8 does not use either verb, “analyzed” or “translated”.
- At least two steps (in the abstract, as-if model) are explicitly
happening in the “as a translation unit” level but not in any wider
context.
- The result of those two steps (“translator output”) is then
“collected”.
- Unless you somehow understand that “collected” implicitly includes
further analysis and translation, it does not seem unnatural to
conclude that many of the whole-program optimizations done by LTO
implementations would be outside the spec.
This would be very easy to address, by replacing “collected” with a word or phrase that makes clear that further analysis and translation can
happen outside the “as a translation unit” context.
Obviously this would violate the principle from the rationale that
existing code (that uses TU boundaries to get memset to “work”) is important and existing implementations (LTO) are not, but C
standardization has never actually behaved as if that is true anyway.
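As an aside, the standard workaround does not rely on TU boundaries at all: call memset through a volatile function pointer (a sketch; C23 also adds memset_explicit for exactly this purpose):

```c
#include <stddef.h>
#include <string.h>

/* The compiler cannot prove that a volatile function pointer still
 * points at memset, so it cannot treat the stores made through it as
 * dead and elide them - even under LTO. */
static void *(*const volatile memset_v)(void *, int, size_t) = memset;

void secure_clear(void *p, size_t n)
{
    memset_v(p, 0, n);
}
```

Unlike hiding the call in another translation unit, this does not depend on how the program is built.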
On 2024-03-22, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
bart <bc@freeuk.com> writes:
On 22/03/2024 17:14, James Kuyper wrote:[...]
If you want to tell a system not only what a program must do, but
also how it must do it, you need to use a lower-level language than
C.
Which one?
Good question.
I don't think anyone seriously wants to switch to assembly for the
sort of tasks they want to use C for.
Agreed. What some people seem to be looking for is a language that's
about as portable as C, but where every language construct is required
to result in generated code that performs the specified operation.
There's a lot of handwaving in that description. "C without
optimization", maybe?
I'm not aware that any such language exists, at least in the mainstream
(and I've looked at a *lot* of programming languages). I conclude that
there just isn't enough demand for that kind of thing.
I think you can more or less get something like that with the following strategy:
- all memory accesses through pointers are performed as written.
- local variables are aggressively optimized into registers.
- basic optimizations:
- constant folding, dead code elimination.
- basic control flow ones: jump threading and the like.
- basic data flow optimizations.
- peephole, good instruction selection.
In that environment, the way the programmer writes the code is the rest
of the optimization. Want loop unrolling? Write it yourself.
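The "write it yourself" approach would look the same as hand-unrolling does in C today (an illustrative sketch, not from the thread):

```c
#include <stddef.h>

/* Sum an array four elements at a time; the programmer, not the
 * compiler, chose the unroll factor. */
long sum4(const long *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        s += a[i] + a[i + 1] + a[i + 2] + a[i + 3];
    for (; i < n; i++)          /* leftover elements */
        s += a[i];
    return s;
}
```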
On 23/03/2024 07:26, James Kuyper wrote:
bart <bc@freeuk.com> writes:
On 22/03/2024 17:14, James Kuyper wrote:[...]
If you want to tell a system not only what a program must do, but
also how it must do it, you need to use a lower-level language than
C.
Which one?
That's up to you. The point is, C is NOT that language.
I'm asking which /mainstream/ HLL is lower level than C. So specifically ruling out assembly.
If there is no such choice, then this is the problem: it has to be C or nothing.
I don't think anyone seriously wants to switch to assembly for the
sort of tasks they want to use C for.
Why not? Assembly provides the kind of control you're looking for; C
does not. If that kind of control is important to you, you have to find
a language which provides it. If not assembler or C, what would you use?
Among non-mainstream ones, my own would fit the bill. Since I write the implementations, I can ensure the compiler doesn't have a mind of its own.
However if somebody else tried to implement it, then I can't guarantee
the same behaviour. This would need to somehow be enforced with a
precise language spec, or mine would need to be a reference
implementation with a lot of test cases.
-----------------
Take this program:
#include <stdio.h>
int main(void) {
    goto L;
    0x12345678;
L:
    printf("Hello, World!\n");
}
If I use my compiler, then that 12345678 pattern gets compiled into the binary (because it is loaded into a register then discarded). That means
I can use that value as a marker or sentinel which can be searched for.
However no other compiler I tried will do that. If I instead change that line to:
int a = 0x12345678;
then a tcc-compiled binary will contain that value. So will lccwin32-compiled (with a warning). But not DMC or gcc.
If I get rid of the 'goto', then gcc -O0 will work, but still not DMC or gcc -O3.
Here I can use `volatile` to ensure that value stays in, but not if I
put the 'goto' back in!
It's all too unpredictable.
On 23/03/2024 10:20, Richard Kettlewell wrote:
David Brown <david.brown@hesbynett.no> writes:
I have tried to explain the reality of what the C standards say in a
couple of posts (including one that I had not posted before you wrote
this one). I have tried to make things as clear as possible, and
hopefully you will see the point.
If not, then you must accept that you interpret the C standards in a
different manner from the main compiler vendors, as well as some "big
names" in this group. That is, of course, not proof in itself - but
you must realise that for practical purposes you need to be aware of
how others interpret the standard, both for your own coding and for
the advice or recommendations you give to others.
Agreed that the ship has sailed on whether LTO is a valid optimization.
But it’s understandable why someone might reach a different conclusion.
I /do/ understand why Kaz thinks the way he does. I am just trying to
show that his interpretation is wrong, so that he can better understand
what is going on, and how to get the behaviour he wants.
I would be entirely happy to see clearer wording in the standards here,
or at least some footnotes saying what is allowed or not allowed.
It would be unreasonable to expect them to guarantee the behaviour of
code under new standards when the code did not have guaranteed behaviour under the old standards. Using TU boundaries to "get memset to work"
has never been guaranteed.
On 2024-03-23, David Brown <david.brown@hesbynett.no> wrote:
On 22/03/2024 20:43, Kaz Kylheku wrote:
On 2024-03-22, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Is the "call" instruction *observable behavior* as defined in 5.1.2.3?
Running a program under a test harness is effectively running a
different program. Of course it can yield information about the
original program, but in effect you're linking the program with a
different set of libraries.
It's a different program, but the retained translation unit must be the
same, except that the external references it makes are resolved to
different entities.
That is true - /if/ you make the restriction that the translation unit
is compiled completely to linkable machine code or assembly, and that it
is not changed in any way when it is combined into the new program.
Such a setup is common in practice, but it is in no way required by the
C standards and does not apply for more advanced compilation and build
scenarios.
Well, it's only not required if you hand-wave away the sentences in
section 5.
You can't just do that!
On 23/03/2024 07:26, James Kuyper wrote:
bart <bc@freeuk.com> writes:
On 22/03/2024 17:14, James Kuyper wrote:
[...]
If you want to tell a system not only what a program must do, but
also how it must do it, you need to use a lower-level language than
C.
Which one?
That's up to you. The point is, C is NOT that language.
I'm asking which /mainstream/ HLL is lower level than C. So
specifically ruling out assembly.
If there is no such choice, then this is the problem: it has to be C
or nothing.
On 23/03/2024 01:09, Kaz Kylheku wrote:
On 2024-03-22, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
I'm not aware that any such language exists, at least in the mainstream
(and I've looked at a *lot* of programming languages). I conclude that
there just isn't enough demand for that kind of thing.
I think lack of demand combines with it actually being an extremely difficult task.
Consider something as simple as "x++;" in C. How could that be
implemented? Perhaps the cpu has an "increment" instruction. Perhaps
it has an "add immediate" instruction. Perhaps it needs to load 1 into
a register, then use an "add" instruction. Perhaps "x" is in memory.
Some cpus can execute an increment directly on the memory address as an atomic instruction. Some can do so, but only using specific (and more expensive) instructions. Some can't do it at all without locking
mechanisms and synchronisation loops.
So what does this user of this mythical LLL expect when he/she writes
"x++;" ?
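Notably, modern C does let the programmer pin down the one distinction that matters most here: plain `x++` may become any of those instruction sequences, while an atomic read-modify-write must be requested explicitly (a C11 sketch):

```c
#include <stdatomic.h>

int plain;            /* plain++ may compile to inc, add-immediate, or a
                         load/add/store sequence - the compiler decides */
_Atomic int shared;   /* this increment must be a single atomic
                         read-modify-write, whatever that costs on the
                         target */

void bump(void)
{
    plain++;
    atomic_fetch_add(&shared, 1);
}
```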
On 23/03/2024 12:26, bart wrote:
On 23/03/2024 07:26, James Kuyper wrote:
bart <bc@freeuk.com> writes:
On 22/03/2024 17:14, James Kuyper wrote:[...]
If you want to tell a system not only what a program must do, but
also how it must do it, you need to use a lower-level language than
C.
Which one?
That's up to you. The point is, C is NOT that language.
I'm asking which /mainstream/ HLL is lower level than C. So
specifically ruling out assembly.
If there is no such choice, then this is the problem: it has to be C
or nothing.
How much of a problem is it, really?
My field is probably the place where low level programming is most ubiquitous. There are plenty of people who use assembly - for good
reasons or for bad (or for reasons that some people think are good,
other people think are bad). C is the most common choice.
Other languages used for small systems embedded programming include C++, Ada, Forth, BASIC, Pascal, Lua, and Micropython. Forth is the only one
that could be argued as lower-level or more "directly translated" than C.
On 2024-03-23, David Brown <david.brown@hesbynett.no> wrote:....
That is true - /if/ you make the restriction that the translation unit
is compiled completely to linkable machine code or assembly, and that it
is not changed in any way when it is combined into the new program.
Such a setup is common in practice, but it is in no way required by the
C standards and does not apply for more advanced compilation and build
scenarios.
Well, it's only not required if you hand-wave away the sentences in
section 5.
I believe we are at an impasse here, unless someone can think of a new
point to make.
One thing I would ask before leaving this - could you take a look at the latest draft for the next C standard after C23?
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf>
Look at the definitions of the "reproducible" and "unsequenced" function type attributes in 6.7.13.8. In particular, look at the leeway
explicitly given to the compiler for re-arranging code in 6.7.13.8.3p6
and similar examples. Consider how that fits (or fails to fit) with
your interpretation of the translation phases in section 5.
On 3/23/24 12:07, Kaz Kylheku wrote:
On 2024-03-23, David Brown <david.brown@hesbynett.no> wrote:...
That is true - /if/ you make the restriction that the translation unit
is complied completely to linkable machine code or assembly, and that it >>> is not changed in any way when it is combined into the new program.
Such a setup is common in practice, but it is in no way required by the >>> C standards and does not apply for more advanced compilation and build
scenarios.
Well, it's only not required if you hand-wave away the sentences in
section 5.
Or, you could read the whole of section 5. 5.1.2.3p6 makes it clear that
all of the other requirements of the standard apply only insofar as the
observable behavior of the program is concerned.
Any method of achieving observable behavior that matches the behavior
that would be permitted if the abstract semantics were followed, is permitted, even if the actual semantics producing that behavior are
quite different from those specified.
On 2024-03-23, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
On 3/23/24 12:07, Kaz Kylheku wrote:
On 2024-03-23, David Brown <david.brown@hesbynett.no> wrote:...
That is true - /if/ you make the restriction that the translation unit >>>> is complied completely to linkable machine code or assembly, and that it >>>> is not changed in any way when it is combined into the new program.
Such a setup is common in practice, but it is in no way required by the >>>> C standards and does not apply for more advanced compilation and build >>>> scenarios.
Well, it's only not required if you hand-wave away the sentences in
section 5.
Or, you could read the whole of section 5. 5.1.2.3p6 makes it clear that
all of the other requirements of the standard apply only insofar as the
Aha, so you agree there are requirements, just that the behavior they
imply can be achieved without them being followed in every detail.
observable behavior of the program is concerned.
I believe what you're referring to is now in 5.1.2.4¶6 in N3220.
Yes, you make an excellent point.
If we make any claim about conformance, it has to be rooted in
observable behavior, which is the determiner of conformance.
But we will not find that problem in LTO. If any empirical test of an LTO
implementation shows a difference in the ISO C observable behavior of a
strictly conforming program, that LTO implementation obviously has a bug,
not LTO itself.
(So why bother looking.) I mean, the absolute baseline requirement any
LTO implementor strives toward is no change in observable behavior in a
strictly conforming program; any such change would be a showstopper.
On 23/03/2024 16:25, David Brown wrote:
On 23/03/2024 01:09, Kaz Kylheku wrote:
On 2024-03-22, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
I'm not aware that any such language exists, at least in the mainstream >>>> (and I've looked at a *lot* of programming languages). I conclude that >>>> there just isn't enough demand for that kind of thing.
I think lack of demand combines with it actually being an extremely
difficult task.
Consider something as simple as "x++;" in C. How could that be
implemented? Perhaps the cpu has an "increment" instruction. Perhaps
it has an "add immediate" instruction. Perhaps it needs to load 1
into a register, then use an "add" instruction. Perhaps "x" is in
memory. Some cpus can execute an increment directly on the memory
address as an atomic instruction. Some can do so, but only using
specific (and more expensive) instructions. Some can't do it at all
without locking mechanisms and synchronisation loops.
So what does this user of this mythical LLL expect when he/she writes
"x++;" ?
This is not the issue that comes up in the OP (or the issue that was
assumed, as I don't think the OP has clarified). There it is not about
micro-managing the implementation of x++, but about the compiler deciding
it isn't needed at all.
On 23/03/2024 07:26, James Kuyper wrote:
bart <bc@freeuk.com> writes:
On 22/03/2024 17:14, James Kuyper wrote:[...]
If you want to tell a system not only what a program must do, but
also how it must do it, you need to use a lower-level language
than C.
Which one?
That's up to you. The point is, C is NOT that language.
I'm asking which /mainstream/ HLL is lower level than C. So
specifically ruling out assembly.
On 23/03/2024 07:26, James Kuyper wrote:
bart <bc@freeuk.com> writes:
On 22/03/2024 17:14, James Kuyper wrote:[...]
If you want to tell a system not only what a program must do, but
also how it must do it, you need to use a lower-level language
than C.
Which one?
That's up to you. The point is, C is NOT that language.
I'm asking which /mainstream/ HLL is lower level than C. So
specifically ruling out assembly.
If there is no such choice, then this is the problem: it has to be C
or nothing.
I don't think anyone seriously wants to switch to assembly for the
sort of tasks they want to use C for.
Why not? Assembly provides the kind of control you're looking for; C
does not. If that kind of control is important to you, you have to
find a language which provides it. If not assembler or C, what
would you use?
Among non-mainstream ones, my own would fit the bill. Since I write
the implementations, I can ensure the compiler doesn't have a mind of
its own.
However if somebody else tried to implement it, then I can't
guarantee the same behaviour. This would need to somehow be enforced
with a precise language spec, or mine would need to be a reference implementation with a lot of test cases.
-----------------
Take this program:
#include <stdio.h>
int main(void) {
    goto L;
    0x12345678;
L:
    printf("Hello, World!\n");
}
If I use my compiler, then that 12345678 pattern gets compiled into
the binary (because it is loaded into a register then discarded).
That means I can use that value as a marker or sentinel which can be
searched for.
On 23/03/2024 16:51, David Brown wrote:
On 23/03/2024 12:26, bart wrote:
On 23/03/2024 07:26, James Kuyper wrote:
bart <bc@freeuk.com> writes:
On 22/03/2024 17:14, James Kuyper wrote:[...]
If you want to tell a system not only what a program must do, but
also how it must do it, you need to use a lower-level language than >>>>>> C.
Which one?
That's up to you. The point is, C is NOT that language.
I'm asking which /mainstream/ HLL is lower level than C. So
specifically ruling out assembly.
If there is no such choice, then this is the problem: it has to be C
or nothing.
How much of a problem is it, really?
My field is probably the place where low level programming is most
ubiquitous. There are plenty of people who use assembly - for good
reasons or for bad (or for reasons that some people think are good,
other people think are bad). C is the most common choice.
Other languages used for small systems embedded programming include
C++, Ada, Forth, BASIC, Pascal, Lua, and Micropython. Forth is the
only one that could be argued as lower-level or more "directly
translated" than C.
Well, Forth is certainly cruder than C (it's barely a language IMO). But
I don't remember seeing anything in it resembling a type system that corresponds to the 'i8-i64 u8-u64 f32-f64' types typical in current hardware. (Imagine trying to create a precisely laid out struct.)
It is just too weird. I think I'd rather take my chances with C.
BASIC, ..., Lua, and Micropython.
Hmm, I think my own scripting language is better at low level than any
of these.
It supports those low-level types for a start. And I can do
stuff like this:
println peek(0x40'0000, u16):"m"
fun peek(addr, t=byte) = makeref(addr, t)^
This displays 'MZ', the signature of the (low-)loaded EXE image on Windows.
Possibly it is even better than C; is this little program valid (no UB)
C, even when it is known that the program is low-loaded:
#include <stdio.h>
typedef unsigned char byte;
int main(void) {
printf("%c%c\n", *(byte*)0x400000, *(byte*)0x400001);
}
This works on DMC, tcc, mcc, lccwin, but not gcc because that loads
programs at high addresses. The problem being that the address involved, while belonging to the program, is outside of any C data objects.
On 24/03/2024 06:50, Kaz Kylheku wrote:
(So why bother looking.) I mean,
the absolute baseline requirement any LTO implementor strives toward is
no change in observable behavior in a strictly conforming program, which
would be a showstopper.
Yes.
I don't believe anyone - except you - has said anything otherwise. A C implementation is conforming if and only if it takes any correct C
source code and generates a program image that always has correct
observable behaviour when no undefined behaviour is executed. There are
no extra imaginary requirements to be conforming, such as not being
allowed to use extra information while compiling translation units.
On 2024-03-24, David Brown <david.brown@hesbynett.no> wrote:
On 24/03/2024 06:50, Kaz Kylheku wrote:
(So why bother looking.) I mean,
the absolute baseline requirement any LTO implementor strives toward is
no change in observable behavior in a strictly conforming program, which >>> would be a showstopper.
Yes.
I don't believe anyone - except you - has said anything otherwise. A C
implementation is conforming if and only if it takes any correct C
source code and generates a program image that always has correct
observable behaviour when no undefined behaviour is executed. There are
no extra imaginary requirements to be conforming, such as not being
allowed to use extra information while compiling translation units.
But the requirement isn't imaginary. The "least requirements"
paragraph doesn't mean that all other requirements are imaginary;
most of them are necessary to describe the language so that we know
how to find the observable behavior.
It takes a modicum of inference to deduce that a certain explicitly
stated requirement has no bearing on observability or conformance.
We are clearly not imagining the sentences which describe a classic translation and linkage model. The argument that they don't matter
for conformance is different from the argument that we imagined
something between the lines. It is the inference based on 5.1.2.4 that
is between the lines; potentially between any pair of lines anywhere!
Furthermore, the requirements may matter to other kinds of observability.
In C programming, we don't always just care about ISO C observability.
In safety critical coding, we might want to conduct a code review of
the disassembly of an object file (does it correctly implement the
intent we believe to be expressed in the source), and then retain that
exact file until it needs to be recompiled.
If the code is actually an intermediate code that is further translated
during linking, that's
not good; we face the prospect of reviewing potentially the entire image
each time. Thus we might want an implementation which has a way of conforming to the classic linkage model (that happens to be conveniently described).
We just must not confuse that conformance (a private contract between implementor and user) with ISO C conformance, as I have.
Sorry about that!
What is significant is that the concept has support in ISO C wording.
Such a contract can just refer to that: "our project requires the
classic translation and linkage model that arises from the translation
phases descriptions 7 and 8 being closely followed".
As long as you have a way to disable LTO (or not enable it), you have
that.
David Brown <david.brown@hesbynett.no> writes:
I have tried to explain the reality of what the C standards say in a
couple of posts (including one that I had not posted before you wrote
this one). I have tried to make things as clear as possible, and
hopefully you will see the point.
If not, then you must accept that you interpret the C standards in a
different manner from the main compiler vendors, as well as some "big
names" in this group. That is, of course, not proof in itself - but
you must realise that for practical purposes you need to be aware of
how others interpret the standard, both for your own coding and for
the advice or recommendations you give to others.
Agreed that the ship has sailed on whether LTO is a valid optimization.
But it's understandable why someone might reach a different conclusion.
[...]
The C standard means what the ISO C group thinks it means.
They are the ultimate and sole authority. Any discussion about what
the C standard requires that ignores that or pretends otherwise is
a meaningless exercise.
On Sat, 23 Mar 2024 21:21:58 +0000
bart <bc@freeuk.com> wrote:
On 23/03/2024 16:51, David Brown wrote:
On 23/03/2024 12:26, bart wrote:
On 23/03/2024 07:26, James Kuyper wrote:
bart <bc@freeuk.com> writes:
On 22/03/2024 17:14, James Kuyper wrote:[...]
If you want to tell a system not only what a program must do,
but also how it must do it, you need to use a lower-level
language than C.
Which one?
That's up to you. The point is, C is NOT that language.
I'm asking which /mainstream/ HLL is lower level than C. So
specifically ruling out assembly.
If there is no such choice, then this is the problem: it has to be
C or nothing.
How much of a problem is it, really?
My field is probably the place where low level programming is most
ubiquitous. There are plenty of people who use assembly - for good
reasons or for bad (or for reasons that some people think are good,
other people think are bad). C is the most common choice.
Other languages used for small systems embedded programming include
C++, Ada, Forth, BASIC, Pascal, Lua, and Micropython. Forth is the
only one that could be argued as lower-level or more "directly
translated" than C.
Well, Forth is certainly cruder than C (it's barely a language IMO).
But I don't remember seeing anything in it resembling a type system
that corresponds to the 'i8-i64 u8-u64 f32-f64' types typical in
current hardware. (Imagine trying to create a precisely laid out
struct.)
It is just too weird. I think I'd rather take my chances with C.
> BASIC, ..., Lua, and Micropython.
Hmm, I think my own scripting language is better at low level than
any of these. It supports those low-level types for a start. And I
can do stuff like this:
println peek(0x40'0000, u16):"m"
fun peek(addr, t=byte) = makeref(addr, t)^
This displays 'MZ', the signature of the (low-)loaded EXE image on
Windows
Possibly it is even better than C; is this little program valid (no
UB) C, even when it is known that the program is low-loaded:
#include <stdio.h>
typedef unsigned char byte;
int main(void) {
printf("%c%c\n", *(byte*)0x400000, *(byte*)0x400001);
}
This works on DMC, tcc, mcc, lccwin, but not gcc because that loads
programs at high addresses. The problem being that the address
involved, while belonging to the program, is outside of any C data
objects.
#include <stdio.h>
#include <stddef.h>
int main(void)
{
    char* p0 = (char*)((size_t)main & -(size_t)0x10000);
    printf("%c%c\n", p0[0], p0[1]);
    return 0;
}
That would work for small programs. Not necessarily for bigger
programs.
On Sat, 23 Mar 2024 11:26:03 +0000
bart <bc@freeuk.com> wrote:
On 23/03/2024 07:26, James Kuyper wrote:
bart <bc@freeuk.com> writes:
On 22/03/2024 17:14, James Kuyper wrote:[...]
If you want to tell a system not only what a program must do, but
also how it must do it, you need to use a lower-level language
than C.
Which one?
That's up to you. The point is, C is NOT that language.
I'm asking which /mainstream/ HLL is lower level than C. So
specifically ruling out assembly.
If there is no such choice, then this is the problem: it has to be C
or nothing.
I don't think anyone seriously wants to switch to assembly for the
sort of tasks they want to use C for.
Why not? Assembly provides the kind of control you're looking for; C
does not. If that kind of control is important to you, you have to
find a language which provides it. If not assembler or C, what
would you use?
Among non-mainstream ones, my own would fit the bill. Since I write
the implementations, I can ensure the compiler doesn't have a mind of
its own.
However if somebody else tried to implement it, then I can't
guarantee the same behaviour. This would need to somehow be enforced
with a precise language spec, or mine would need to be a reference
implementation with a lot of test cases.
-----------------
Take this program:
#include <stdio.h>
int main(void) {
    goto L;
    0x12345678;
L:
    printf("Hello, World!\n");
}
If I use my compiler, then that 12345678 pattern gets compiled into
the binary (because it is loaded into a register then discarded).
That means I can use that value as a marker or sentinel which can be
searched for.
Does it apply to your aarch64 compiler as well?
On 24/03/2024 14:26, Michael S wrote:
On Sat, 23 Mar 2024 11:26:03 +0000
bart <bc@freeuk.com> wrote:
On 23/03/2024 07:26, James Kuyper wrote:
bart <bc@freeuk.com> writes:
On 22/03/2024 17:14, James Kuyper wrote:[...]
If you want to tell a system not only what a program must do,
but also how it must do it, you need to use a lower-level
language than C.
Which one?
That's up to you. The point is, C is NOT that language.
I'm asking which /mainstream/ HLL is lower level than C. So
specifically ruling out assembly.
If there is no such choice, then this is the problem: it has to be
C or nothing.
I don't think anyone seriously wants to switch to assembly for
the sort of tasks they want to use C for.
Why not? Assembly provides the kind of control you're looking
for; C does not. If that kind of control is important to you, you
have to find a language which provides it. If not assembler or C,
what would you use?
Among non-mainstream ones, my own would fit the bill. Since I write
the implementations, I can ensure the compiler doesn't have a mind
of its own.
However if somebody else tried to implement it, then I can't
guarantee the same behaviour. This would need to somehow be
enforced with a precise language spec, or mine would need to be a
reference implementation with a lot of test cases.
-----------------
Take this program:
#include <stdio.h>
int main(void) {
    goto L;
    0x12345678;
L:
    printf("Hello, World!\n");
}
If I use my compiler, then that 12345678 pattern gets compiled into
the binary (because it is loaded into a register then discarded).
That means I can use that value as a marker or sentinel which can
be searched for.
Does it apply to your aarch64 compiler as well?
I don't support arm64 as a native target (only via intermediate C). Why,
is there something peculiar about that architecture?
I would expect that 0x12345678 pattern to still be in memory but
probably not in an immediate instruction field.
So if I wanted to mark
a location in the code, I might need a different approach.
If I ever do directly target that processor, I'll be able to tell you
more.
On 2024-03-24, David Brown <david.brown@hesbynett.no> wrote:[...]
On 24/03/2024 06:50, Kaz Kylheku wrote:
In safety critical coding, we might want to conduct a code review of
the disassembly of an object file (does it correctly implement the
intent we believe to be expressed in the source), and then retain that
exact file until it needs to be recompiled.
On 23/03/2024 22:21, bart wrote:
Well, Forth is certainly cruder than C (it's barely a language IMO).
But I don't remember seeing anything in it resembling a type system
that corresponds to the 'i8-i64 u8-u64 f32-f64' types typical in
current hardware. (Imagine trying to create a precisely laid out struct.)
Forth can be considered a typeless language - you deal with cells (or
double cells, etc.), which have contents but not types. And you can
define structs with specific layouts quite easily. (Note that I've
never tried this myself - my Forth experience is /very/ limited, and you will get much more accurate information in comp.lang.forth or another
place Forth experts hang out.)
A key thing you miss, in comparison to C, is the type checking and the structured identifier syntax.
In C, if you have :
struct foo {
int32_t x;
int8_t y;
uint16_t z;
};
struct foo obj;
obj.x = obj.y + obj.z;
then you access the fields as "obj.x", etc. Your struct may or may not
have padding, depending on the target and compiler (or
compiler-specific extensions). If "obj2" is an object of a different
type, then "obj2.x" might be a different field, or a compile-time error
if that type has no field "x".
In Forth, you write (again, I could be inaccurate here) :
struct
4 field >x
1 field >y
2 field >z
constant /foo
And note that although Forth is often byte-compiled very directly to
give you exactly the actions you specify in the source code, it is also sometimes compiled to machine code - using optimisations.
It is just too weird. I think I'd rather take my chances with C.
Forth does take some getting used to!
BASIC, ..., Lua, and Micropython.
Hmm, I think my own scripting language is better at low level than any
of these.
These all have one key advantage over your language - they are real languages, available for use by /other/ programmers for development of products.
This works on DMC, tcc, mcc, lccwin, but not gcc because that loads
programs at high addresses. The problem being that the address
involved, while belonging to the program, is outside of any C data
objects.
I think you are being quite unreasonable in blaming gcc - or C - for generating code that cannot access that particular arbitrary address!
But what people want are the conveniences and familiarity of a HLL,
without the bloody-mindedness of an optimising C compiler.
bart <bc@freeuk.com> writes:
[...]
But what people want are the conveniences and familiarity of a HLL,
without the bloody-mindedness of an optimising C compiler.
Exactly which people want that?
The evidence suggests that, while some people undoubtedly want that (and
it's a perfectly legitimate desire), there isn't enough demand to induce anyone to actually produce such a thing and for it to catch on.
Developers have had decades to define and implement the kind of language you're talking about. Why haven't they?
On 24/03/2024 14:52, David Brown wrote:
On 23/03/2024 22:21, bart wrote:
This works on DMC, tcc, mcc, lccwin, but not gcc because that loads
programs at high addresses. The problem being that the address
involved, while belonging to the program, is outside of any C data
objects.
I think you are being quite unreasonable in blaming gcc - or C - for
generating code that cannot access that particular arbitrary address!
There were two separate points here. One is that a gcc-compiled version won't work because exe images are not loaded at 0x40'0000.
The other was
me speculating whether the access to 0x40'0000, even when valid memory
for this process, was UB in C.
On 24/03/2024 20:49, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
[...]
But what people want are the conveniences and familiarity of a HLL,
without the bloody-mindedness of an optimising C compiler.
Exactly which people want that?
The evidence suggests that, while some people undoubtedly want that
(and it's a perfectly legitimate desire), there isn't enough demand
to induce anyone to actually produce such a thing and for it to
catch on. Developers have had decades to define and implement the
kind of language you're talking about. Why haven't they?
Perhaps many settle for using C but using a lesser C compiler, or one
with optimisation turned off.
On Sun, 24 Mar 2024 23:07:44 +0000
bart <bc@freeuk.com> wrote:
On 24/03/2024 20:49, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
[...]
But what people want are the conveniences and familiarity of a HLL,
without the bloody-mindedness of an optimising C compiler.
Exactly which people want that?
The evidence suggests that, while some people undoubtedly want that
(and it's a perfectly legitimate desire), there isn't enough demand
to induce anyone to actually produce such a thing and for it to
catch on. Developers have had decades to define and implement the
kind of language you're talking about. Why haven't they?
Perhaps many settle for using C but using a lesser C compiler, or one
with optimisation turned off.
What is a "lesser C compiler"?
Something like IAR? Yes, people use it.
Something like TI? People use it when they have no other choice.
20 years ago there were Diab Data, Keil and a few others. I haven't
heard about them lately.
Microchip, I'd guess, still has its own compilers for many of their
families, but that's because they have to. "Bigger" compilers don't
want to support these chips.
At the opposite end of the scale, IBM has compilers for their
mainframes and for POWER/AIX. The former are used widely. The latter
are quickly losing to "bigger" compilers running on the same platform.
As for tcc, mcc, lccwin etc., those are only used by hobbyists, never
by pros.
The only "lesser" PC-hosted, PC-targeting C compilers that are used
by a significant number of pro developers are Intel and
Borland/Embarcadero, the latter strictly for historical reasons.
Embarcadero switched their dev suites to a "bigger" compiler quite a
few years ago, but some people like their old stuff. Well, maybe the
National Instruments compiler is still used? I really don't know.
On Sun, 24 Mar 2024 13:49:43 -0700
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
bart <bc@freeuk.com> writes:
[...]
But what people want are the conveniences and familiarity of a HLL,
without the bloody-mindedness of an optimising C compiler.
Exactly which people want that?
The evidence suggests that, while some people undoubtedly want that
(and it's a perfectly legitimate desire), there isn't enough demand
to induce anyone to actually produce such a thing and for it to catch
on.
Such things are produced all the time. And yes, they fail to catch on.
The most recent [half-hearted] attempt that hasn't yet realized it
has no chance is called Zig.
Developers have had decades to define and implement the kind of
language you're talking about. Why haven't they?
Because C is a juggernaut?
On 24/03/2024 23:39, Michael S wrote:
On Sun, 24 Mar 2024 23:07:44 +0000
bart <bc@freeuk.com> wrote:
On 24/03/2024 20:49, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
[...]
But what people want are the conveniences and familiarity of a HLL,
without the bloody-mindedness of an optimising C compiler.
Exactly which people want that?
The evidence suggests that, while some people undoubtedly want that
(and it's a perfectly legitimate desire), there isn't enough demand
to induce anyone to actually produce such a thing and for it to
catch on. Developers have had decades to define and implement the
kind of language you're talking about. Why haven't they?
Perhaps many settle for using C but using a lesser C compiler, or one
with optimisation turned off.
What is a "lesser C compiler"?
Something like IAR? Yes, people use it.
Something like TI? People use it when they have no other choice.
20 years ago there were Diab Data, Keil and a few others. I haven't
heard about them lately.
Microchip, I'd guess, still has its own compilers for many of their
families, but that's because they have to. "Bigger" compilers don't
want to support these chips.
At the opposite end of the scale, IBM has compilers for their
mainframes and for POWER/AIX. The former are used widely. The latter
are quickly losing to "bigger" compilers running on the same platform.
As for tcc, mcc, lccwin etc., those are only used by hobbyists.
AFAIK lccwin can be used commercially.
I guess you mean companies using big tools and big ecosystems that need
equally big compilers to go with them.
I mainly use, and develop, small, nippy tools and would rate them above
any of the big, glossy ones.
On 24/03/2024 15:53, Michael S wrote:
On Sat, 23 Mar 2024 21:21:58 +0000
bart <bc@freeuk.com> wrote:
On 23/03/2024 16:51, David Brown wrote:
On 23/03/2024 12:26, bart wrote:
On 23/03/2024 07:26, James Kuyper wrote:
bart <bc@freeuk.com> writes:
On 22/03/2024 17:14, James Kuyper wrote: [...]
If you want to tell a system not only what a program must do,
but also how it must do it, you need to use a lower-level
language than C.
Which one?
That's up to you. The point is, C is NOT that language.
I'm asking which /mainstream/ HLL is lower level than C. So
specifically ruling out assembly.
If there is no such choice, then this is the problem: it has to
be C or nothing.
How much of a problem is it, really?
My field is probably the place where low level programming is most
ubiquitous. There are plenty of people who use assembly - for
good reasons or for bad (or for reasons that some people think
are good, other people think are bad). C is the most common
choice.
Other languages used for small systems embedded programming
include C++, Ada, Forth, BASIC, Pascal, Lua, and Micropython.
Forth is the only one that could be argued as lower-level or more
"directly translated" than C.
Well, Forth is certainly cruder than C (it's barely a language
IMO). But I don't remember seeing anything in it resembling a type
system that corresponds to the 'i8-i64 u8-u64 f32-f64' types
typical in current hardware. (Imagine trying to create a precisely
laid out struct.)
It is just too weird. I think I'd rather take my chances with C.
> BASIC, ..., Lua, and Micropython.
Hmm, I think my own scripting language is better at low level than
any of these. It supports those low-level types for a start. And I
can do stuff like this:
println peek(0x40'0000, u16):"m"
fun peek(addr, t=byte) = makeref(addr, t)^
This displays 'MZ', the signature of the (low-)loaded EXE image on
Windows.
Possibly it is even better than C; is this little program valid (no
UB) C, even when it is known that the program is low-loaded:
#include <stdio.h>
typedef unsigned char byte;
int main(void) {
printf("%c%c\n", *(byte*)0x400000, *(byte*)0x400001);
}
This works on DMC, tcc, mcc, lccwin, but not gcc, because gcc loads
programs at high addresses. The problem being that the address
involved, while belonging to the program, is outside of any C data
objects.
#include <stdio.h>
#include <stddef.h>
int main(void)
{
char* p0 = (char*)((size_t)main & -(size_t)0x10000);
printf("%c%c\n", p0[0], p0[1]);
return 0;
}
That would work for small programs. Not necessarily for bigger
programs.
I'm not sure how that works. Are EXE images always loaded at a
multiple of 64KB? I suppose on larger programs it could search
backwards 64KB at a time (although it could also hit a rogue 'MZ' in
program data).
My point however was whether C considers that p0[0] access UB
because it doesn't point into any C data object.
If so, it would make access to memory-mapped devices or
frame-buffers, or implementing things like garbage collectors,
problematical.
I could be wrong here, of course.
extern char __image_base__[];
On Sun, 24 Mar 2024 23:43:32 +0100
David Brown <david.brown@hesbynett.no> wrote:
I could be wrong here, of course.
It seems, you are.
On 25/03/2024 12:16, Michael S wrote:
On Sun, 24 Mar 2024 23:43:32 +0100
David Brown <david.brown@hesbynett.no> wrote:
I could be wrong here, of course.
It seems, you are.
It happens - and it was not unexpected here, as I said. I don't have
all these compilers installed to test.
But it would be helpful if you had a /little/ more information. If
you don't know why some compilers generate binaries that have memory
mapped at 0x400000, and others do not, fair enough. I am curious,
but it's not at all important.
On 25/03/2024 03:12, bart wrote:
On 24/03/2024 23:39, Michael S wrote:
As to tcc, mcc, lccwin etc... those only used by hobbyists.
AFAIK lccwin can be used commercially.
"/Can/ be used commercially" does not imply "/is/ used professionally".
I'm sure there are some people who use it in their work, but I would
expect that in any statistics about compiler usage, it would be in the "Others < 0.1%" category.
On 25/03/2024 08:58, David Brown wrote:
On 25/03/2024 03:12, bart wrote:
On 24/03/2024 23:39, Michael S wrote:
As to tcc, mcc, lccwin etc... those only used by hobbyists.
AFAIK lccwin can be used commercially.
"/Can/ be used commercially" does not imply "/is/ used
professionally". I'm sure there are some people who use it in their
work, but I would expect that in any statistics about compiler
usage, it would be in the "Others < 0.1%" category.
lccwin is used to compile C functions with an interface which makes
them callable from Matlab. Whilst I haven't written Matlab code
commercially, and it would be rare to do so, I have written Matlab
code professionally, and that is quite common. I probably also made
rather heavier use of the C interfaces than was really justified.
My Matlab File Exchange submissions are still going strong. But the
gem, the faded bar chart, hasn't been valued, and hasn't attracted
any stars. Matlab users, download and give some love.
On Mon, 25 Mar 2024 13:26:01 +0100
David Brown <david.brown@hesbynett.no> wrote:
On 25/03/2024 12:16, Michael S wrote:
On Sun, 24 Mar 2024 23:43:32 +0100
David Brown <david.brown@hesbynett.no> wrote:
I could be wrong here, of course.
It seems, you are.
It happens - and it was not unexpected here, as I said. I don't have
all these compilers installed to test.
But it would be helpful if you had a /little/ more information. If
you don't know why some compilers generate binaries that have memory
mapped at 0x400000, and others do not, fair enough. I am curious,
but it's not at all important.
I am not an expert, but it does not look like the problem is directly
related to the compiler or linker. All 32-bit Windows compilers/linkers,
including gcc, clang and MSVC, by default put the symbol ___ImageBase at
address 4 MB. However, the loader relocates it to wherever it wants,
typically much higher.
I don't know for sure why the loader does this to images generated by
gcc, clang and MSVC and does not do it to images generated by lccwin and
others, but I have an educated guess: most likely, these other compilers
link by default with an option similar to Microsoft's /FIXED:
https://learn.microsoft.com/en-us/cpp/build/reference/fixed-fixed-base-address?view=msvc-170
The option disables ASLR and thus can shorten app load time and make
performance just a little snappier. Still, I wouldn't make it the
default.
To get similar behavior with [32-bit] MSVC user can specify '/linker
/fixed' on the command line. I don't know how to do it with gcc variant supplied with msys2. But, I'd guess, if you google for long enough, you
can find it.
On 25/03/2024 00:39, Michael S wrote:
I tried out Diab Data for the 68k some 25 years ago. It was /way/
better than anything else around, but outside our budget at the time.
On 25/03/2024 12:26, David Brown wrote:
On 25/03/2024 12:16, Michael S wrote:
On Sun, 24 Mar 2024 23:43:32 +0100
David Brown <david.brown@hesbynett.no> wrote:
I could be wrong here, of course.
It seems, you are.
It happens - and it was not unexpected here, as I said. I don't have
all these compilers installed to test.
But it would be helpful if you had a /little/ more information. If
you don't know why some compilers generate binaries that have memory
mapped at 0x400000, and others do not, fair enough. I am curious, but
it's not at all important.
In the PE EXE format, the default image load base is specified in a
special header in the file:
Magic: 20B
Link version: 1.0
Code size: 512 200
Idata size: 1024 400
Zdata size: 512
Entry point: 4096 1000 in data:0
Code base: 4096
Image base: 4194304 400000
Section align: 4096
By convention it is at 0x40'0000 (I've no idea why).
More recently, dynamic loading, regardless of what it says in the PE
header, has become popular with linkers. So, while there is still a
fixed value in the Image Base field, which might be 0x140000000, it gets
loaded at some random address, usually in high memory above 2GB.
I don't know what's responsible for that, but presumably the OS must be
in on the act.
To make this possible, both for loading above 2GB, and for loading at an address not known by the linker, the code inside the EXE must be position-independent, and have relocation info for any absolute 64-bit static addresses. 32-bit static addresses won't work.
If I take this C program:
#include <stdio.h>
int main(void) {
printf("%p\n", main);
}
This shows 0000000000401000 when compiled with mcc or tcc, or
0000000000401020 with lccwin32 (the exact address of 'main' relative to
the image base will vary). With DMC (32 bits) it's 0040210. All load at
0x400000.
With gcc, it shows: 00007ff6e63a1591.
Dynamic loading can be disabled by passing --disable-dynamicbase to ld;
then it might show something like 0000000140001000, which corresponds
to the default Image Base field in the EXE header.
Not dynamic, but still high.
(My compilers, both for C and M, did not generate code suitable for
high-loading until a few months ago. That didn't matter since the EXEs
loaded at the fixed 0x400000 address. But it can matter for DLL files
and will do for OBJ files, since the latter would need to use an
external linker.
So if I do this with a mix of mcc and gcc:
C:\c>mcc test -c
Compiling test.c to test.obj
C:\c>gcc test.obj
C:\c>a
00007FF613311540
I get the same high-loaded address. I don't think that Tiny C has that support yet for high-loading code.)
To summarise: high-loading is not directly to do with the compiler, but
with the program that generates the EXE. But the compiler does need to
generate code that can be loaded high if needed.
On Mon, 25 Mar 2024 16:06:24 +0000
bart <bc@freeuk.com> wrote:
On 25/03/2024 12:26, David Brown wrote:
On 25/03/2024 12:16, Michael S wrote:
On Sun, 24 Mar 2024 23:43:32 +0100
David Brown <david.brown@hesbynett.no> wrote:
I could be wrong here, of course.
It seems, you are.
It happens - and it was not unexpected here, as I said. I don't
have all these compilers installed to test.
But it would be helpful if you had a /little/ more information. If
you don't know why some compilers generate binaries that have
memory mapped at 0x400000, and others do not, fair enough. I am
curious, but it's not at all important.
In the PE EXE format, the default image load base is specified in a
special header in the file:
Magic: 20B
Link version: 1.0
Code size: 512 200
Idata size: 1024 400
Zdata size: 512
Entry point: 4096 1000 in data:0
Code base: 4096
Image base: 4194304 400000
Section align: 4096
By convention it is at 0x40'0000 (I've no idea why).
More recently, dynamic loading, regardless of what it says in the PE
header, has become popular with linkers. So, while there is still a
fixed value in the Image Base file, which might be 0x140000000, it
gets loaded at some random address, usually in high memory above 2GB.
I don't know what's responsible for that, but presumably the OS must
be in on the act.
To make this possible, both for loading above 2GB, and for loading at
an address not known by the linker, the code inside the EXE must be
position-independent, and have relocation info for any absolute
64-bit static addresses. 32-bit static addresses won't work.
I don't understand why you say that EXE must be position-independent.
I never learned PE format in depth (and learned only absolute minimum of
elf, just enough to be able to load images in simple embedded
scenario), but my impression always was that PE EXE contains plenty of relocation info for a loader, so it (loader) can modify (I think
professional argot uses the word 'fix') non-PIC at load time to run at
any chosen position.
Am I wrong about it?
On 25/03/2024 16:51, Michael S wrote:
On Mon, 25 Mar 2024 16:06:24 +0000
bart <bc@freeuk.com> wrote:
On 25/03/2024 12:26, David Brown wrote:
On 25/03/2024 12:16, Michael S wrote:
On Sun, 24 Mar 2024 23:43:32 +0100
David Brown <david.brown@hesbynett.no> wrote:
I could be wrong here, of course.
It seems, you are.
It happens - and it was not unexpected here, as I said. I don't
have all these compilers installed to test.
But it would be helpful if you had a /little/ more information. If
you don't know why some compilers generate binaries that have
memory mapped at 0x400000, and others do not, fair enough. I am
curious, but it's not at all important.
In the PE EXE format, the default image load base is specified in a
special header in the file:
Magic: 20B
Link version: 1.0
Code size: 512 200
Idata size: 1024 400
Zdata size: 512
Entry point: 4096 1000 in data:0
Code base: 4096
Image base: 4194304 400000
Section align: 4096
By convention it is at 0x40'0000 (I've no idea why).
More recently, dynamic loading, regardless of what it says in the PE
header, has become popular with linkers. So, while there is still a
fixed value in the Image Base file, which might be 0x140000000, it
gets loaded at some random address, usually in high memory above 2GB.
I don't know what's responsible for that, but presumably the OS must
be in on the act.
To make this possible, both for loading above 2GB, and for loading at
an address not known by the linker, the code inside the EXE must be
position-independent, and have relocation info for any absolute
64-bit static addresses. 32-bit static addresses won't work.
I don't understand why you say that EXE must be position-independent.
I never learned PE format in depth (and learned only absolute minimum of
elf, just enough to be able to load images in simple embedded
scenario), but my impression always was that PE EXE contains plenty of
relocation info for a loader, so it (loader) can modify (I think
professional argot uses the word 'fix') non-PIC at load time to run at
any chosen position.
Am I wrong about it?
A PE EXE designed to run only at the given image base won't be
position-independent, so it can't be moved anywhere else.
There isn't enough info to make that possible, especially before
position-independent addressing modes for x64 came along (that is,
using offsets from the RIP instruction pointer instead of 32-bit
absolute addresses).
Take this C program:
int abc;
int* ptr = &abc;
int main(void) {
int x;
x = abc;
}
Some of the assembly generated is this:
abc: resb 4
ptr: dq abc
...
mov eax, [abc]
That last reference is an absolute 32-bit address, for example it might
have address 0x00403000 when loaded at 0x400000.
If the program is instead loaded at 0x78230000, there is no reloc info
to tell it that that particular 32-bit value, plus the 64-bit field initialising ptr, must be adjusted.
RIP-relative addressing (I think sometimes called PIC) can fix that
second reference:
mov eax, [rip:abc]
But it only works for code, not data; that initialisation is still
absolute.
When a DLL is generated instead, those will need to be moved (to avoid
multiple DLLs all being based at the same address). In that case,
base-relocation tables are needed: a list of addresses that contain a
field that needs relocating, and what type and size of reloc is needed.
The same info is needed for an EXE if it contains flags saying that the
EXE could be loaded at an arbitrary address.
On Mon, 25 Mar 2024 18:10:23 +0000
bart <bc@freeuk.com> wrote:
On 25/03/2024 16:51, Michael S wrote:
On Mon, 25 Mar 2024 16:06:24 +0000
bart <bc@freeuk.com> wrote:
On 25/03/2024 12:26, David Brown wrote:
On 25/03/2024 12:16, Michael S wrote:
On Sun, 24 Mar 2024 23:43:32 +0100
David Brown <david.brown@hesbynett.no> wrote:
I could be wrong here, of course.
It seems, you are.
It happens - and it was not unexpected here, as I said. I don't
have all these compilers installed to test.
But it would be helpful if you had a /little/ more information.
If you don't know why some compilers generate binaries that have
memory mapped at 0x400000, and others do not, fair enough. I am
curious, but it's not at all important.
In the PE EXE format, the default image load base is specified in a
special header in the file:
Magic: 20B
Link version: 1.0
Code size: 512 200
Idata size: 1024 400
Zdata size: 512
Entry point: 4096 1000 in data:0
Code base: 4096
Image base: 4194304 400000
Section align: 4096
By convention it is at 0x40'0000 (I've no idea why).
More recently, dynamic loading, regardless of what it says in the
PE header, has become popular with linkers. So, while there is
still a fixed value in the Image Base file, which might be
0x140000000, it gets loaded at some random address, usually in
high memory above 2GB.
I don't know what's responsible for that, but presumably the OS
must be in on the act.
To make this possible, both for loading above 2GB, and for loading
at an address not known by the linker, the code inside the EXE
must be position-independent, and have relocation info for any
absolute 64-bit static addresses. 32-bit static addresses won't
work.
I don't understand why you say that EXE must be
position-independent. I never learned PE format in depth (and
learned only absolute minimum of elf, just enough to be able to
load images in simple embedded scenario), but my impression always
was that PE EXE contains plenty of relocation info for a loader, so
it (loader) can modify (I think professional argot uses the word
'fix') non-PIC at load time to run at any chosen position.
Am I wrong about it?
A PE EXE designed to run only at the image base given won't be
position-independent, so it can't be moved anywhere else.
There isn't enough info to make it possible, especially before
position-independent addressing modes for x64 came along (that is,
using offsets from the RIP instruction pointer instead of 32-bit absolute
addresses).
Take this C program:
int abc;
int* ptr = &abc;
int main(void) {
int x;
x = abc;
}
Some of the assembly generated is this:
abc: resb 4
ptr: dq abc
...
mov eax, [abc]
That last reference is an absolute 32-bit address, for example it
might have address 0x00403000 when loaded at 0x400000.
If the program is instead loaded at 0x78230000, there is no reloc
info to tell it that that particular 32-bit value, plus the 64-bit
field initialising ptr, must be adjusted.
RIP-relative addressing (I think sometimes called PIC) can fix that
second reference:
mov eax, [rip:abc]
But it only works for code, not data; that initialisation is still
absolute.
When a DLL is generated instead, those will need to be moved (to
avoid multiple DLLs all based at the same address). In that case,
base-relocation tables are needed: a list of addresses that contain a
field that needs relocating, and what type and size of reloc is
needed.
The same info is needed for EXE if it contains flags saying that the
EXE could be loaded at an arbitrary address.
Your explanation exactly matches what I was imagining.
The technology for relocating non-PIC code is already here, in the file
format definitions and in OS loader code. The linker, or the part of the
compiler that serves the role of linker, can decide not to generate the
required tables. Operating in that mode gives small benefits in EXE
size and load time, but IMHO nowadays it should be used rarely, only in
special situations, rather than serve as the tool's default.
On 25/03/2024 21:05, Michael S wrote:
Your explanation exactly matches what I was imagining.
The technology for relocating non-PIC code is already here, in the file
format definitions and in OS loader code. The linker, or the part of the
compiler that serves the role of linker, can decide not to generate the
required tables. Operating in that mode gives small benefits in EXE
size and load time, but IMHO nowadays it should be used rarely, only in
special situations, rather than serve as the tool's default.
There are two aspects to be considered:
* Relocating a program to a different address below 2GB
* Relocating a program to any address, including above 2GB
The first can be accommodated with tables derived from the reloc info
of object files.
But the second requires compiler cooperation in generating code that
will work above 2GB.
Part of that can be done with RIP-relative addressing modes, as I
touched on. But not all; RIP-relative won't work here:
movsx rax, dword [i]
mov rax, [rbx*8 + abc]
where the address works with registers. This requires something like:
lea rcx, [rip:abc] # or mov rcx, abc (64-bit abs addr)
mov rax, [rbx*8 + rcx]
This is specific to x64, but other processors will have their own
issues, like ARM64, which doesn't even have the 32-bit displacement
used with RIP here.
On Mon, 25 Mar 2024 21:25:27 +0000
bart <bc@freeuk.com> wrote:
Part of that can be done with RIP-relative address modes as I touched
on. But not all; RIP-relative won't work here:
movsx rax, dword [i]
mov rax, [rbx*8 + abc]
where the address works with registers. This requires something like:
lea rcx, [rip:abc] # or mov rcx, abc (64-bit abs addr)
mov rax, [rbx*8 + rcx]
This is specific to x64, but other processors will have their own
issues, like ARM64, which doesn't even have the 32-bit displacement
used with RIP here.
You mean that when the compiler knows the program is loaded at a low
address, and the combined data segments are relatively small, it can
use zero-extended or sign-extended 32-bit literals to form 64-bit
addresses of static/global objects?
I see how relocation of such a program is a problem in 64-bit mode, but
I still fail to see how a similar problem could happen in 32-bit mode.
A "famous security bug":
void f( void )
{ char buffer[ MAX ];
/* . . . */
memset( buffer, 0, sizeof( buffer )); }
. Can you see what the bug is?
(I have already read the answer; I post it as a pastime.)
void f()
{ char buffer[MAX];
/* . . . */
memset( buffer, 0, sizeof( buffer ));
Ensures( buffer[ 0 ]== 0 ); }
i = mylib_random( sizeof( buffer ));
Ensures( buffer[ i ]== 0 );
. How could one implement "Ensures" in C? The first thing that
comes to mind is a call to "assert" of course.
But I also have to think of an "escape" that Chandler Carruth mentioned
in one talk. IIRC, it was something along the lines of
static void escape( volatile void * p )
{ asm volatile( "" : : "g"(p) : "memory" ); }
(which might not be standard C). Now, if you call "escape( buffer )"
at the end of the definition of the function "f" above, the compiler
knows that the contents of buffer have become visible to the outside
world, so the effects of the "memset" operation become visible
externally, which means that the "memset" call cannot be elided.
On 24/03/2024 17:02, Kaz Kylheku wrote:
On 2024-03-24, David Brown <david.brown@hesbynett.no> wrote:
On 24/03/2024 06:50, Kaz Kylheku wrote:
(So why bother looking.) I mean,
the absolute baseline requirement any LTO implementor strives toward is no change in observable behavior in a strictly conforming program, which would be a showstopper.
Yes.
I don't believe anyone - except you - has said anything otherwise. A C
implementation is conforming if and only if it takes any correct C
source code and generates a program image that always has correct
observable behaviour when no undefined behaviour is executed. There are no extra imaginary requirements to be conforming, such as not being
allowed to use extra information while compiling translation units.
But the requirement isn't imaginary. The "least requirements"
paragraph doesn't mean that all other requirements are imaginary;
most of them are necessary to describe the language so that we know
how to find the observable behavior.
The text is not imaginary - your reading between the lines /is/. There
is no rule in the C standards stopping the compiler from using
additional information or knowledge about other parts of the program.
In safety critical coding, we might want to conduct a code review of
the disassembly of an object file (does it correctly implement the
intent we believe to be expressed in the source), and then retain that
exact file until it needs to be recompiled.
Sure. And for that reason, some developers in that field will not use
LTO. I personally don't make much use of LTO because it makes software
a pain to debug.
We just may not confuse that conformance (private contract between
implementor and user) with ISO C conformance, as I have.
Sorry about that!
Are you saying that after dozens of posts back and forth where you made claims about non-conformity of C compilers' handling of C code in comp.lang.c, with heavy references to the C standards which define the
term "conformity", you are now saying that you were not talking about C standard conformity?
If C compilers warned about every piece of dead code that
is eliminated, you'd be up to your ears in diagnostics all
day.
If you do want the code deleted, that doesn't always mean
you can do it yourself. What gets eliminated can be target
dependent:
switch (sizeof (long)) {
case 4: ...
case 8: ...
}
Because memset is part of the C language, the compiler
knows exactly what effect it has (that it's equivalent to
setting all the bytes to zero, like a sequence of
assignments).
On 23/03/2024 07:26, James Kuyper wrote:
bart <bc@freeuk.com> writes:
On 22/03/2024 17:14, James Kuyper wrote: [...]
If you want to tell a system not only what a program must do, but
also how it must do it, you need to use a lower-level language
than C.
Which one?
That's up to you. The point is, C is NOT that language.
I'm asking which /mainstream/ HLL is lower level than C. So
specifically ruling out assembly.
On 24/03/2024 15:53, Michael S wrote:
#include <stdio.h>
#include <stddef.h>
int main(void)
{
char* p0 = (char*)((size_t)main & -(size_t)0x10000);
printf("%c%c\n", p0[0], p0[1]);
return 0;
}
That would work for small programs. Not necessarily for bigger
programs.
I'm not sure how that works. Are EXE images always loaded at a multiple
of 64KB? I suppose for larger programs it could search backwards 64KB
at a time (although it could also hit a rogue 'MZ' in program data).
My point, however, was whether C considers that p0[0] access UB because
it doesn't point into any C data object.
On 2024-03-24, David Brown <david.brown@hesbynett.no> wrote:
On 24/03/2024 17:02, Kaz Kylheku wrote:
On 2024-03-24, David Brown <david.brown@hesbynett.no> wrote:
On 24/03/2024 06:50, Kaz Kylheku wrote:
(So why bother looking.) I mean,
the absolute baseline requirement any LTO implementor strives toward is no change in observable behavior in a strictly conforming program, which would be a showstopper.
Yes.
I don't believe anyone - except you - has said anything otherwise. A C implementation is conforming if and only if it takes any correct C
source code and generates a program image that always has correct
observable behaviour when no undefined behaviour is executed. There are no extra imaginary requirements to be conforming, such as not being
allowed to use extra information while compiling translation units.
But the requirement isn't imaginary. The "least requirements"
paragraph doesn't mean that all other requirements are imaginary;
most of them are necessary to describe the language so that we know
how to find the observable behavior.
The text is not imaginary - your reading between the lines /is/. There
is no rule in the C standards stopping the compiler from using
additional information or knowledge about other parts of the program.
Sure there is; just not in a way that speaks to the formal notion of conformance. The text is there, and a user and implementor can use
that as a touchstone for agreeing on something outside of conformance.
In safety critical coding, we might want to conduct a code review of
the disassembly of an object file (does it correctly implement the
intent we believe to be expressed in the source), and then retain that
exact file until it needs to be recompiled.
Sure. And for that reason, some developers in that field will not use
LTO. I personally don't make much use of LTO because it makes software
a pain to debug.
So, in that situation, your requirement can be articulated in a way that refers to the descriptions in ISO C.
You're having your translation
units semantically analyzed according to the abstract separation between phases 7 and 8 (which is not required to be followed for conformance).
We can identify the LTO switch in the compiler as hinging around
whether the abstract semantics is followed or not. (Just we can't tell
using observable behavior.)
This seems like a good thing.
We just may not confuse that conformance (private contract between
implementor and user) with ISO C conformance, as I have.
Sorry about that!
Are you saying that after dozens of posts back and forth where you made
claims about non-conformity of C compilers handling of C code in
comp.lang.c, with heavy references to the C standards which define the
term "conformity", you are now saying that you were not talking about C
standard conformity?
Certainly not! I was wrongly talking about that one and only
conformance.
Once again, sorry about that.
On 24/03/2024 16:45, Tim Rentsch wrote:
The C standard means what the ISO C group thinks it means.
They are the ultimate and sole authority. Any discussion about what
the C standard requires that ignores that or pretends otherwise is
a meaningless exercise.
An intentionalist.
But when a text has come about by a process of argument, negotiation,
compromise and votes, is that position so easy to defend as it
might appear to be for a simpler text?
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
On 24/03/2024 16:45, Tim Rentsch wrote:
The C standard means what the ISO C group thinks it means.
They are the ultimate and sole authority. Any discussion about what
the C standard requires that ignores that or pretends otherwise is
a meaningless exercise.
An intentionalist.
That is a misunderstanding of what I said.
But when a text has come about by a process of argument, negotiation,
compromise and votes, is that position so easy to defend as it
might appear to be for a simpler text?
It's not a position, it's an observation. The ISO C committee is
the recognized authority for judgment about the meaning of the C
standard. Whatever discussion may have gone into writing the
document is irrelevant; all that matters is that the ISO C
group went through the approved ISO process, and hence the world
at large defers to their view as being authoritative on the
question of how to read the text of the standard.