https://www.youtube.com/watch?v=I7fEsbksKRE
as far as i understood.. (because if someone talks english fast my mind
tends to skip more than half of the message)
overall this is quite curious...
she probably read those articles etc. by lisp ponies who say lisp is beautiful and c is not so much... but still it is a lot of
incompetence to call c ugly...
fir pisze:
https://www.youtube.com/watch?v=I7fEsbksKRE
as far as i understood..(becouse if someona talks english fast my mind
tend to skip more than half of the message)
overally this is quite curious...
she probably read this articles etc for lisp ponys who say lisp is
beautifull and c is not so much... but still this is much of
incompetence call c ugly...
well i see she called bjarne stroustrup smart (well i wouldnt be so bold here) - and said he took UGLY c and combined it with OO so this made
c++ a 'successful' language which
takes much hate
so i understand c is ugly and is to blame that c++ is hated... instead of coding
in lisp... now thats where comedy possibly gets too strong :3
(now im quite convinced my comment on this is also not too strong in a bad
way but i was kinda really stunned/surprised when someone
says such things on video) -,-
But, I am not personally as much of a fan of C++ as she is...
[...]
In general I like her videos, and she seems to know what she is talking about...
But, I am not personally as much of a fan of C++ as she is...
[...]
[ Cygwin ]
[...]
Similar reason to why one doesn't build complex patterns or do template-like stuff via function macros in the C preprocessor:
One can do this... But again, bloated binaries and terrible build times.
So, one is back to the core issue:
The part that is actually usable, mostly still amounts to syntactic
sugar over things you can already do in C.
There are "niceties", granted, but relatively little "actually new".
And, one of the rare few "actually new" features C++ offers: exceptions.
Also comes with its own drawbacks (code bloat, try/catch+throw is
usually slow, use with care else program explodes, ...). [...]
Many coding styles forbid them and mandate that people use error-code returns or similar instead (like in C), with exceptions being globally disabled at build time (along with RTTI and similar), which
isn't really a strong selling point...
[...]
On Wed, 27 May 2026 18:49:38 -0500, BGB wrote:
But, I am not personally as much of a fan of C++ as she is...
C++ syntax is so complex, the language spec has to add rules that say,
in case of ambiguity, that this interpretation is meant and not that.
Someone described this as "the principle of most surprise".
On 2026-05-28 01:49, BGB wrote:
[...]
In general I like her videos, and she seems to know what she is
talking about...
But, I am not personally as much of a fan of C++ as she is...
I'm a big fan of abstractions. - So many things beyond "C" are fine!
[...]
[ Cygwin ]
A sensible but imperfect workaround provided for an inferior platform.
[...]
Similar reason to why one doesn't build complex patterns or do
template- like stuff via function macros in the C preprocessor:
One can do this... But again, bloated binaries and terrible build times.
Code patterns that are bulky in "C" can be formulated tersely with
C++/STL (while still preserving an efficient implementation, even with complexities guaranteed); and the framework is flexible, orthogonally designed. Easy to reuse high-level concepts as opposed to re-implement
the same code for different types. Or weaken the code by extensive use
of casts. All sorts of C's problems with memory can be addressed. (The
list can be continued; but I wonder why such things aren't recognized.)
So, one is back to the core issue:
The part that is actually usable, mostly still amounts to syntactic
sugar over things you can already do in C.
Huh? It may depend on the developer/programmer. But it's certainly a
lot more than "syntactic sugar".
There are "niceties", granted, but relatively little "actually new".
Not all concepts are "new", of course; we saw them in other languages
years or (in some cases) decades ago. But C++' and STL features are a
lot more than just niceties; it's beyond me how one may come to such
a valuation. (And now let's compare that formulated demand or wish of
new things with "C"?)
And, one of the rare few "actually new" features C++ offers: exceptions.
We used them already in the 1990's.
Also comes with its own drawbacks (code bloat, try/catch+throw is
usually slow, use with care else program explodes, ...). [...]
I cannot confirm your statements, especially in that generality.
I recall we had bloat with templates on a specific platform in the
very early pre-standard era, when they were first supported. But we
didn't have any [noteworthy] speed degradation with exceptions (or templates).
Many people coding styles often forbid it and mandate that people use
error-code returns or similar instead (like in C), with exceptions
being globally disabled at build time (along with RTTI and similar),
which, isn't really a strong selling point...
Yes, stupid things are done. Mandating return codes and forbidding exceptions is particularly stupid as a general rule.
Probably mentally inconvenient as "new" concept for FORTRAN, BASIC,
or "C" programmers? I can certainly understand that, psychologically.
But not technically. Maybe the support on the commercial platforms
was just better than on Cygwin?
On 5/27/2026 1:15 PM, fir wrote:
fir pisze:
https://www.youtube.com/watch?v=I7fEsbksKRE
as far as i understood..(becouse if someona talks english fast my
mind tend to skip more than half of the message)
overally this is quite curious...
she probably read this articles etc for lisp ponys who say lisp is
beautifull and c is not so much... but still this is much of
incompetence call c ugly...
well i see she called bjarne stroustrup smart (well i wouldnt be so
bold here) - and said he took UGLY c and combined it with OO so this
made c++ a 'successful' language which
takes much hate
so i understand c is ugly and to blame c++ is hated... instead of
coding in lisp... now thats where comedy possibly gets too strong :3
(now im quite convinced my comment on this is also not too strong in
bad way of its sense but i was kinda really stunned/surprised ehen
someona talks such things on video ) -,-
In general I like her videos, and she seems to know what she is talking about...
But, I am not personally as much of a fan of C++ as she is...
BGB pisze:
On 5/27/2026 1:15 PM, fir wrote:
fir pisze:
https://www.youtube.com/watch?v=I7fEsbksKRE
as far as i understood..(becouse if someona talks english fast my
mind tend to skip more than half of the message)
overally this is quite curious...
she probably read this articles etc for lisp ponys who say lisp is
beautifull and c is not so much... but still this is much of
incompetence call c ugly...
well i see she called bjarne stroustrup smart (well i wouldnt be so
bold here) - and said he took UGLY c and combined it with OO so this
made c++ a 'successful' language which
takes much hate
so i understand c is ugly and to blame c++ is hated... instead of
coding in lisp... now thats where comedy possibly gets too strong :3
(now im quite convinced my comment on this is also not too strong in
bad way of its sense but i was kinda really stunned/surprised ehen
someona talks such things on video ) -,-
In general I like her videos, and she seems to know what she is
talking about...
good to watch for sure, but those statements are still preposterous
for me its kinda funny because i didnt think people who say c is ugly
are real
though my opinion on c++, from -10/10 or about, rose recently maybe to
-9/10 because of these so-called 'references' which, after some thinking, turned out
to make some sense (though in c++ they probably dont even know it makes
sense, they just add sh*t (and by chance 1 in 100 makes some sense))
On 2026-05-28 01:49, BGB wrote:
[...]
In general I like her videos, and she seems to know what she is
talking about...
But, I am not personally as much of a fan of C++ as she is...
I'm a big fan of abstractions. - So many things beyond "C" are fine!
[...]
[ Cygwin ]
A sensible but imperfect workaround provided for an inferior platform.
[...]
Similar reason to why one doesn't build complex patterns or do
template- like stuff via function macros in the C preprocessor:
One can do this... But again, bloated binaries and terrible build times.
Code patterns that are bulky in "C" can be formulated tersely with
C++/STL (while still preserving an efficient implementation, even with complexities guaranteed); and the framework is flexible, orthogonally designed. Easy to reuse high-level concepts as opposed to re-implement
the same code for different types. Or weaken the code by extensive use
of casts.
All sorts of C's problems with memory can be addressed. (The
list can be continued; but I wonder why such things aren't recognized.)
Someone could almost come up with a language that is "like C++ but less horrible".
It was the main thing I had when I was in high-school. Well, Cygwin
and MinGW.
Some like LISP syntax, others can't stand the excessive parentheses.
There have been attempts to eliminate parentheses ...
Infix notation and precedence rules are pros/cons.
Smalltalk was once popular, and while arguably some aspects of its
syntax are "aesthetic", I personally found trying to read anything
in the language to be almost incomprehensible (so, negative points
if I can't make any sense of what is going on).
Early on, I had liked JS and ActionScript, as (compared with LISP
and Scheme) they scaled a lot easier to "real programming work".
But, then one faces a tension:
Light duty scripting: Favors keeping the language dynamic and
minimizing structural concerns;
Implementation work: Favors strongly going in a direction more like
the C-like languages.
On Thu, 28 May 2026 02:35:37 -0500, BGB wrote:
Someone could almost come up with a language that is "like C++ but less
horrible".
Like this? <https://en.wikipedia.org/wiki/Carbon_(programming_language)>
Both C++ and BS2 had exceptions, but in both cases, enabling them has non-zero overhead.
And, then I lose incentive as I don't really use C++, and (unlike C
land) the C++ people tend to chase after the newest features, rather
than stick to an older and more conservative subset.
On 5/28/2026 2:18 AM, Janis Papanagnou wrote:[...]
On 2026-05-28 01:49, BGB wrote:
I'm a big fan of abstractions. - So many things beyond "C" are fine!
I am not saying that abstractions are bad, but I haven't usually found
them to be worth the costs IME.
[...]
Similar reason to why one doesn't build complex patterns or do
template- like stuff via function macros in the C preprocessor:
One can do this... But again, bloated binaries and terrible build times.
Code patterns that are bulky in "C" can be formulated tersely with
C++/STL (while still preserving an efficient implementation, even with
complexities guaranteed); and the framework is flexible, orthogonally
designed. Easy to reuse high-level concepts as opposed to re-implement
the same code for different types. Or weaken the code by extensive use
of casts. All sorts of C's problems with memory can be addressed. (The
list can be continued; but I wonder why such things aren't recognized.)
Both have a similar issue when used in a naive way though:
  Non-careful use of either results in code bloat.
But, not really an "easy" way to avoid bloat, other than to write code specifically for what cases are relevant; while also avoiding needless duplication and copy paste (where, overuse of copy/paste can also lead
to bloat; along with turning the code into an ugly mess).
But, OTOH, factoring things into too small of pieces can negatively
affect performance (and, for non-leaf functions, prolog/epilog costs for
too many tiny functions can also be a source of code bloat).
As can be noted, trying to mimic templates via creative use of C preprocessor macros can also easily result in excessive bloat...
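To make the macro point concrete, here is a minimal sketch of a preprocessor "template". The names DEFINE_MAX, max_int and max_double are made up for illustration and do not come from the thread. Each use of the macro emits a full copy of the function body, which is where the bloat tends to come from once the body is non-trivial and the type list grows.

```c
#include <stddef.h>

/* Hypothetical macro "template": every use stamps out a complete,
   separately compiled copy of the loop for one element type. */
#define DEFINE_MAX(T, NAME)                 \
    static T NAME(const T *v, size_t n)     \
    {                                       \
        T best = v[0];                      \
        for (size_t i = 1; i < n; i++)      \
            if (v[i] > best)                \
                best = v[i];                \
        return best;                        \
    }

DEFINE_MAX(int, max_int)        /* one copy of the loop for int */
DEFINE_MAX(double, max_double)  /* a second, independent copy for double */
```

Calling max_int(a, 3) on an array {3, 9, 4} gives 9. Note that the caller must pass n >= 1, since the sketch reads v[0] unconditionally.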
So, one is back to the core issue:
The part that is actually usable, mostly still amounts to syntactic
sugar over things you can already do in C.
Huh? It may depend on the developer/programmer. But it's certainly a
lot more than "syntactic sugar".
Well, for example:
  Operator overloading:
    Basically glorified function calls made to resemble operators;
  Classes:
    Can be done with structs, and implementing vtables manually.
Implementing class hierarchies via structs can be done, but gets ugly
(GTK's GObject system sorta went this way).
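For readers who have not seen the struct-plus-vtable approach, a minimal sketch follows. All names (Shape, ShapeVtbl, Rect, rect_init, shape_area) are hypothetical; this is not GObject's actual layout, only the general shape of the technique.

```c
/* Hypothetical "class" Shape with one virtual method, area(). */
struct Shape;

struct ShapeVtbl {
    double (*area)(const struct Shape *self);
};

struct Shape {
    const struct ShapeVtbl *vt;  /* the pointer a C++ compiler adds implicitly */
};

/* "Derived class": the base is the first member, so a pointer to
   Rect can be converted to a pointer to Shape and back. */
struct Rect {
    struct Shape base;
    double w, h;
};

static double rect_area(const struct Shape *self)
{
    const struct Rect *r = (const struct Rect *)self;
    return r->w * r->h;
}

static const struct ShapeVtbl rect_vtbl = { rect_area };

/* Manual "constructor": installs the vtable and sets the fields. */
static void rect_init(struct Rect *r, double w, double h)
{
    r->base.vt = &rect_vtbl;
    r->w = w;
    r->h = h;
}

/* Virtual dispatch written out by hand. */
static double shape_area(const struct Shape *s)
{
    return s->vt->area(s);
}
```

After rect_init(&r, 3.0, 4.0), the call shape_area(&r.base) dispatches through the table to rect_area. The "gets ugly" part shows up once there are several levels of inheritance, each needing its own casts and init chaining.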
There are "niceties", granted, but relatively little "actually new".
Not all concepts are "new", of course; we saw them in other languages
years or (in some cases) decades ago. But C++' and STL features are a
lot more than just niceties; it's beyond me how one may come to such
a valuation. (And now let's compare that formulated demand or wish of
new things with "C"?)
Well, it is a thing that can be done, but is a double-edged sword.
Saves code one might have to write out manually.
But, is very easy to result in things that negatively affect build times.
Usual strategy is to try to limit how much code is written, and also to avoid doing things in ways that result in too much code, or too much cruft.
Best to avoid both copy paste when reasonable, and sticking anything non-trivial in macros.
And, one of the rare few "actually new" features C++ offers: exceptions.
We used them already in the 1990's.
Here, "new" in the sense that it can't be mapped directly back to stuff
that can already be expressed natively in C.
Also comes with its own drawbacks (code bloat, try/catch+throw is
usually slow, use with care else program explodes, ...). [...]
I cannot confirm your statements, especially in that generality.
I recall we had bloat with templates on a specific platform in the
very early pre-standard era, when they were first supported. But we
didn't have any [noteworthy] speed degradation with exceptions (or
templates).
The relative impact of try/catch is more modest.
Typically, it results in every function having an unwind-handling stub
so that, in case an exception is thrown, it can call any destructors or
similar.
[...]
On 5/28/2026 12:18 AM, Janis Papanagnou wrote:
[...]
All sorts of C's problems with memory can be addressed. (The
list can be continued; but I wonder why such things aren't recognized.)
C's problems with memory? Don't you mean the programmers that make bugs?
On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:
Infix notation and precedence rules are pros/cons.
Python took over most of the C operator precedence rules, with one interesting wrinkle: they moved up the precedence of the bitwise
operators so that what has to be written like this in C:
    («val» & «mask») == «expected»
can have the parentheses omitted in Python:
    «val» & «mask» == «expected»
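A small C sketch of the trap described above. The function names are made up for illustration. In C, == binds tighter than &, so the unparenthesised form compares mask with expected first and then ANDs the 0-or-1 result with val.

```c
/* What C computes without parentheses: val & (mask == expected).
   Compilers typically warn about this line; it is written this way
   on purpose to show the parse. */
static int check_unparenthesised(unsigned val, unsigned mask, unsigned expected)
{
    return val & mask == expected;
}

/* What was almost certainly intended. */
static int check_parenthesised(unsigned val, unsigned mask, unsigned expected)
{
    return (val & mask) == expected;
}
```

With val = 0x12, mask = 0x0F and expected = 0x02, the parenthesised version returns 1, while the unparenthesised one returns 0, because mask == expected is 0 and val & 0 is 0.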
Am 28.05.2026 um 21:07 schrieb BGB:
Both C++ and BS2 had exceptions, but in both cases, enabling them has
non-zero overhead.
Table-driven exception handling isn't very old, but currently it applies
to every 64-bit platform; under Windows / x86, concatenated stack frames
are used. With table-driven EH the additional overhead is zero. And with
the older concatenated stack frames the overhead is very low.
And, then I lose incentive as I don't really use C++, and (unlike C
land) the C++ people tend to chase after the newest features, rather
than stick to an older and more conservative subset.
There's no language where the users are so detail-focused and open
to new features. But these new features raise productivity a lot,
and it was far beyond C even with C++98. With C you have to flip every
bit yourself over and over, and C++ replaces that with standard
components.
This has been emphasized through a lot of C++-channels on YouTube;
I personally prefer the CppCon vids or the vids of Jason Turner.
And there are a lot of good books, like those by Rainer Grimm and
Nicolai Josuttis.
On 2026-05-28 11:57, BGB wrote:
On 5/28/2026 2:18 AM, Janis Papanagnou wrote:[...]
On 2026-05-28 01:49, BGB wrote:
I'm a big fan of abstractions. - So many things beyond "C" are fine!
I am not saying that abstractions are bad, but I haven't usually found
them to be worth the costs IME.
Wow! - That's completely different from my experience and practice.
It's what makes usage simple, fast, reliable. Not wasting time for
details, or fixing technical bugs that should be prevented by the
language.
[...]
Similar reason to why one doesn't build complex patterns or do
template- like stuff via function macros in the C preprocessor:
One can do this... But again, bloated binaries and terrible build
times.
Code patterns that are bulky in "C" can be formulated tersely with
C++/STL (while still preserving an efficient implementation, even with
complexities guaranteed); and the framework is flexible, orthogonally
designed. Easy to reuse high-level concepts as opposed to re-implement
the same code for different types. Or weaken the code by extensive use
of casts. All sorts of C's problems with memory can be addressed. (The
list can be continued; but I wonder why such things aren't recognized.)
Both have a similar issue when used in a naive way though:
   Non-careful use of either results in code bloat.
Okay, "when used in a naive way". - Let's leave it at that,
then.
But, not really an "easy" way to avoid bloat, other than to write code
specifically for what cases are relevant; while also avoiding needless
duplication and copy paste (where, overuse of copy/paste can also lead
to bloat; along with turning the code into an ugly mess).
Hmm.. - as said, during the very early days there were issues; I
recall on one platform duplication of template code in more than
one source unit. And/or some environmental hacks (of the compiler)
to deposit template code for linking. In the later days I've not
seen such immature things anymore.
But, OTOH, factoring things into too small of pieces can negatively
affect performance (and, for non-leaf functions, prolog/epilog costs
for too many tiny functions can also be a source of code bloat).
As can be noted, trying to mimic templates via creative use of C
preprocessor macros can also easily result in excessive bloat...
So, one is back to the core issue:
The part that is actually usable, mostly still amounts to syntactic
sugar over things you can already do in C.
Huh? It may depend on the developer/programmer. But it's certainly a
lot more than "syntactic sugar".
Well, for example:
   Operator overloading:
     Basically glorified function calls made to resemble operators;
   Classes:
     Can be done with structs, and implementing vtables manually.
Implementing class hierarchies via structs can be done, but gets ugly
(GTK's GObject system sorta went this way).
We obviously disagree completely in what's "syntactic sugar".
(With that reasoning all ("C" or other languages') features are
"syntactic sugar" because you can do that also with assembly?)
There are "niceties", granted, but relatively little "actually new".
Not all concepts are "new", of course; we saw them in other languages
years or (in some cases) decades ago. But C++' and STL features are a
lot more than just niceties; it's beyond me how one may come to such
a valuation. (And now let's compare that formulated demand or wish of
new things with "C"?)
Well, it is a thing that can be done, but is a double-edged sword.
Saves code one might have to write out manually.
But, is very easy to result in things that negatively affect build times.
A simple, less-abstracted language can certainly be easier (thus
faster) translated to machine code.
I don't know about your working contexts. In our contexts slightly
larger built-times were no issue. For one, we built using makefiles,
and only full builds (to create QA test images, or public releases)
required much time; they typically ran over night; our systems were
typically very large!
Build times were also influenced by other more significant factors.
Mundane sounding things like ordering of functions in libraries and
some such. (Though nothing that wouldn't have been possible to be
addressed by the build-management group.)
Usual strategy is to try to limit how much code is written, and also
to avoid doing things in ways that result in too much code, or too
much cruft.
Best to avoid both copy paste when reasonable, and sticking anything
non-trivial in macros.
We avoided macros if possible.
And, one of the rare few "actually new" features C++ offers:
exceptions.
We used them already in the 1990's.
Here, "new" in the sense that it can't be mapped directly back to
stuff that can already be expressed natively in C.
Okay.
Also comes with its own drawbacks (code bloat, try/catch+throw is
usually slow, use with care else program explodes, ...). [...]
I cannot confirm your statements, especially in that generality.
I recall we had bloat with templates on a specific platform in the
very early pre-standard era, when they were first supported. But we
didn't have any [noteworthy] speed degradation with exceptions (or
templates).
The relative impact of try/catch is more modest.
Aha; I thought that this would have been the source of criticism.
Typically, it results in every function having an unwind-handling stub
for, in-case an exception is thrown, it can call any destructors or
similar.
I've seen and heard of many ways in which exceptions have been used,
ranging from a single "catch all" in the main() function, to each
function instrumented. I will not judge about these extreme cases.
All I say is that you, as a software designer, have the options to
sensibly structure and instrument your code with exceptions.
There's also the characteristic that you may define exception types
(or use just existing ones); build or add to a hierarchy to handle
them flexibly, provide context data with the exception objects, etc.
Handling all that manually and explicitly, without the support of an exception concept I'd certainly not prefer.
On 2026-05-29 01:54, Lawrence D'Oliveiro wrote:
On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:
Infix notation and precedence rules are pros/cons.
Python took over most of the C operator precedence rules, with one
interesting wrinkle: they moved up the precedence of the bitwise
operators so that what has to be written like this in C:
     («val» & «mask») == «expected»
can have the parentheses omitted in Python:
     «val» & «mask» == «expected»
Unsurprisingly; since exactly *that* was the obvious (and single)
issue with C's precedence definitions.
On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
On 2026-05-28 11:57, BGB wrote:
On 5/28/2026 2:18 AM, Janis Papanagnou wrote:[...]
On 2026-05-28 01:49, BGB wrote:
But, not really an "easy" way to avoid bloat, other than to write
code specifically for what cases are relevant; while also avoiding
needless duplication and copy paste (where, overuse of copy/paste can
also lead to bloat; along with turning the code into an ugly mess).
Hmm.. - as said, during the very early days there were issues; I
recall on one platform duplication of template code in more than
one source unit. And/or some environmental hacks (of the compiler)
to deposit template code for linking. In the later days I've not
seen such immature things anymore.
Possibly, a lot could depend on how one is counting things as well.
In a lot of cases when using GCC, I end up using:
  -ffunction-sections -fdata-sections -Wl,-gc-sections
Because otherwise it likes wasting code space by retaining unreachable functions.
Using "static inline" functions also carries a risk because they can end
up duplicated across multiple translation units, or in multiple places within the same translation unit, so they are best used sparingly.
As for assembler:
Main reasons not to use assembler for everything:
  Needlessly verbose;
  Non-portable.
However, often one can still end up writing C code that looks like
assembler sometimes, as this is often an effective way to optimize things.
Say, for example:
  v0=cs[0];
  v2=cs[2];
  v1=cs[1];
  v3=cs[3];
  ct[0]=v0;
  ct[2]=v2;
  ct[1]=v1;
  ct[3]=v3;
Vs:
  ct[0]=cs[0];
  ct[1]=cs[1];
  ct[2]=cs[2];
  ct[3]=cs[3];
Because the extra variables can help sidestep latency from the
load instructions and staggering stores can avoid penalties of two
adjacent stores to the same cache-line in some cache architectures.
Where, in the latter case, the compiler may fail to as effectively avoid
the load-latency or realize the need to stagger the stores for best performance, ...
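The two forms above can be wrapped into compilable functions for comparison. This is only a sketch: the element type uint32_t and the function names are assumptions, and whether the hand-scheduled form is actually faster depends on the compiler and the target's cache and pipeline, as the post itself says.

```c
#include <stdint.h>

/* Plain form: each store directly follows the load it depends on. */
static void copy4_plain(uint32_t *ct, const uint32_t *cs)
{
    ct[0] = cs[0];
    ct[1] = cs[1];
    ct[2] = cs[2];
    ct[3] = cs[3];
}

/* Hand-scheduled form: all loads are issued first, then the stores,
   with adjacent elements deliberately not stored back to back. */
static void copy4_staggered(uint32_t *ct, const uint32_t *cs)
{
    uint32_t v0, v1, v2, v3;

    v0 = cs[0];
    v2 = cs[2];
    v1 = cs[1];
    v3 = cs[3];
    ct[0] = v0;
    ct[2] = v2;
    ct[1] = v1;
    ct[3] = v3;
}
```

For non-overlapping source and destination both functions produce the same result. If ct and cs overlap, the two can differ, since the staggered version reads everything before writing anything.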
Usual strategy is to try to limit how much code is written, and also
to avoid doing things in ways that result in too much code, or too
much cruft.
Best to avoid both copy paste when reasonable, and sticking anything
non-trivial in macros.
We avoided macros if possible.
They are the de-facto approach for constants and similar, but are
better avoided for longer stuff.
But, things can be considered in relative terms:
Like, C++ may carry various penalties vs C.
On 29/05/2026 09:02, Janis Papanagnou wrote:
On 2026-05-29 01:54, Lawrence D'Oliveiro wrote:
On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:
Infix notation and precedence rules are pros/cons.
Python took over most of the C operator precedence rules, with one
interesting wrinkle: they moved up the precedence of the bitwise
operators so that what has to be written like this in C:
     («val» & «mask») == «expected»
can have the parentheses omitted in Python:
     «val» & «mask» == «expected»
Unsurprisingly; since exactly *that* was the obvious (and single)
issue with C's precedence definitions.
The only one?
How about:
[...]
Like, if one doesn't care that the compiler takes a long time to run
and the EXE is needlessly large, maybe OK, not great if one does care...
Having to spend minutes or more waiting for the compiler would seriously hurt momentum for many tasks.
Say, for example, if the Boot ROM requires keeping everything under 32K.
On 2026-05-29 13:19, Bart wrote:
On 29/05/2026 09:02, Janis Papanagnou wrote:
On 2026-05-29 01:54, Lawrence D'Oliveiro wrote:
On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:
Infix notation and precedence rules are pros/cons.
Python took over most of the C operator precedence rules, with one
interesting wrinkle: they moved up the precedence of the bitwise
operators so that what has to be written like this in C:
     («val» & «mask») == «expected»
can have the parentheses omitted in Python:
     «val» & «mask» == «expected»
Unsurprisingly; since exactly *that* was the obvious (and single)
issue with C's precedence definitions.
The only one?
Yes. - That group of operators was what I noticed immediately when
I learned "C" back then reading K&R. Much later I saw some
folks also mentioning that specific disorder. Still later I learned of a paper by some of the "C" authors admitting that
mistake. (I think I've also seen comments from some regulars here
that also noted that.) That all together is certainly a solid base
for a sensible valuation.
How about:
[...]
Your well known very specific views have never been a landmark for reconsidering my personal judgement. (And I'm positive that won't
ever change; never mind!)
The "confusions" you listed - not worth quoting - are your personal
problem. The precedence of assignments and related operations and
their evaluation order are clear, reasonable, and they can be found
that way in many existing languages. (Some of your listed "problems"
have been answered here already in the past - I wonder whether it's
worth replying to you if you don't learn from the answers. You seem
to have fun wasting everyone's time.)
You can continue to assume that all those people, language designers
and programmers, are wrong,
and I accept your astonishment that for
those folks it's not "confusing" as it is for you.
On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
On 2026-05-28 11:57, BGB wrote:
On 5/28/2026 2:18 AM, Janis Papanagnou wrote:[...]
On 2026-05-28 01:49, BGB wrote:
I'm a big fan of abstractions. - So many things beyond "C" are fine!
I am not saying that abstractions are bad, but I haven't usually
found them to be worth the costs IME.
Wow! - That's completely different from my experience and practice.
It's what makes usage simple, fast, reliable. Not wasting time for
details, or fixing technical bugs that should be prevented by the
language.
Possibly.
But, it is also possible I approach programming in a different way.
[...]
Possibly, a lot could depend on how one is counting things as well.
In a lot of cases when using GCC, I end up using:
  -ffunction-sections -fdata-sections -Wl,-gc-sections
[...]
[...]
To be excluded from being syntactic sugar, it needs to be something that
is not generally possible to express within the base language.
So, for example:
Things like operator overloading or classes are syntactic sugar IMO, as
what they do can be expressed in C, even if a lot less pretty (or far
from an idiomatic style).
I would not consider exceptions or RTTI as syntactic sugar, because
these involve things that do not map to native C.
Using longjmp, pointer-tagging, etc, could be considered as analogous,
but not functionally equivalent, to what C++ is doing in these cases.
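As an illustration of the longjmp analogy, here is a minimal sketch with made-up names (handler, checked_div, try_div). It mimics the control flow of try/throw/catch, but, as noted above, it is not equivalent: nothing is unwound, so any cleanup between the setjmp and the longjmp has to be done by hand.

```c
#include <setjmp.h>

static jmp_buf handler;           /* hypothetical single "catch" point */

static int checked_div(int a, int b)
{
    if (b == 0)
        longjmp(handler, 1);      /* roughly "throw"; runs no cleanup code */
    return a / b;
}

/* Returns a / b, or -1 if the "exception" was raised. */
static int try_div(int a, int b)
{
    if (setjmp(handler) == 0)     /* roughly "try" */
        return checked_div(a, b);
    return -1;                    /* roughly "catch" */
}
```

Here try_div(10, 2) returns 5 and try_div(1, 0) returns -1. A single global jmp_buf is used only to keep the sketch short; nested handlers would need a stack of them.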
[...]
We avoided macros if possible.
They are the de-facto approach for constants and similar, but are
better avoided for longer stuff.
[...]
On 29/05/2026 13:46, Janis Papanagnou wrote:[...]
On 2026-05-29 13:19, Bart wrote:
On 29/05/2026 09:02, Janis Papanagnou wrote:
On 2026-05-29 01:54, Lawrence D'Oliveiro wrote:
On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:
[...]
The "confusions" you listed - not worth quoting - are your personal
problem. The precedence of assignments and related operations and
their evaluation order are clear, reasonable, and they can be found
that way in many existing languages. (Some of your listed "problems"
have been answered here already in the past - I wonder whether it's
worth replying to you if you don't learn from the answers. You seem
to have fun wasting everyone's time.)
[...]
I noticed that you didn't answer my questions.
[...]
On 2026-05-29 15:22, Bart wrote:
On 29/05/2026 13:46, Janis Papanagnou wrote:[...]
On 2026-05-29 13:19, Bart wrote:
On 29/05/2026 09:02, Janis Papanagnou wrote:
On 2026-05-29 01:54, Lawrence D'Oliveiro wrote:
On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:
[...]
The "confusions" you listed - not worth quoting - are your personal
problem. The precedence of assignments and related operations and
their evaluation order are clear, reasonable, and they can be found
that way in many existing languages. (Some of your listed "problems"
have been answered here already in the past - I wonder whether it's
worth replying to you if you don't learn from the answers. You seem
to have fun wasting everyone's time.)
[...]
I noticed that you didn't answer my questions.
Yes, because, as experience shows, it's obviously a waste of time!
Okay, I'll bite. - I'll go waste my time again and comment on your
other post where you said you are confused about these cases...
On 2026-05-29 15:22, Bart wrote:
On 29/05/2026 13:46, Janis Papanagnou wrote:[...]
On 2026-05-29 13:19, Bart wrote:
On 29/05/2026 09:02, Janis Papanagnou wrote:
On 2026-05-29 01:54, Lawrence D'Oliveiro wrote:
On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:
[...]
The "confusions" you listed - not worth quoting - are your personal
problem. The precedence of assignments and related operations and
their evaluation order are clear, reasonable, and they can be found
that way in many existing languages. (Some of your listed "problems"
have been answered here already in the past - I wonder whether it's
worth replying to you if you don't learn from the answers. You seem
to have fun wasting everyone's time.)
[...]
I noticed that you didn't answer my questions.
Yes, because, as experience shows, it's obviously a waste of time!
How about:
* What is the order here: a ^ b | c
* Why do bitwise & | ^ need their own level anyway
* What is most intuitive precedence here: a << 3 + b, and what
  is it in C
* Why do << >> have their own level anyway
* Why do == != have a different precedence from < <= >= >
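For reference, a minimal C sketch of how the grammar groups three of the expressions asked about above. The function names are made up for illustration; each `*_plain` function is the unparenthesised form and the matching `*_grouped` function spells out the grouping C gives it.

```c
/* Compilers may warn about the plain forms (for example GCC's
   -Wparentheses); they are written this way on purpose. */

/* ^ binds tighter than |, so a ^ b | c is (a ^ b) | c. */
static unsigned xor_or_plain(unsigned a, unsigned b, unsigned c)
{
    return a ^ b | c;
}

static unsigned xor_or_grouped(unsigned a, unsigned b, unsigned c)
{
    return (a ^ b) | c;
}

/* + binds tighter than <<, so a << 3 + b is a << (3 + b). */
static unsigned shift_add_plain(unsigned a, unsigned b)
{
    return a << 3 + b;
}

static unsigned shift_add_grouped(unsigned a, unsigned b)
{
    return a << (3 + b);
}

/* < binds tighter than ==, so a < b == c < d is (a < b) == (c < d). */
static int rel_eq_plain(int a, int b, int c, int d)
{
    return a < b == c < d;
}

static int rel_eq_grouped(int a, int b, int c, int d)
{
    return (a < b) == (c < d);
}
```

With a = 1 and b = 1 the shift case gives 16 (1 << 4) rather than the 9 that (a << 3) + b would give.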
Further, here: 'a * b + c' the multiplication is done first, but here:
   a *= b += c
It is done second.
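A small C sketch of that case, assuming plain int operands (the struct and function names are hypothetical). All assignment operators share one right-associative level, so the expression groups as a *= (b += c).

```c
struct pair {
    int a;
    int b;
};

/* a *= b += c is parsed as a *= (b += c):
   first b becomes b + c, then a becomes a * (new b). */
static struct pair mul_add_assign(int a, int b, int c)
{
    struct pair result;

    a *= b += c;
    result.a = a;
    result.b = b;
    return result;
}
```

For a = 2, b = 3, c = 4 this leaves b at 7 and a at 14.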
The issue I have is whether augmented assignments should return a value
at all. It's just generally too confusing especially with mixed types.
It's confusing enough with assignments returning a value:
   a = b = x;
Here, assuming x has no side-effects, you might expect this to mean the
same as:
   b = x;
   a = x;
In fact it's more like: 'b = x; a = b;'. Example:
    double a;
    float b;
    a = b = 3.14159265358979323846;
Here, 'a' will be assigned the less precise 32-bit version of the RHS.
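The effect can be checked with a short C sketch (function names are invented for illustration). The value of an assignment expression is the value stored in the left operand after conversion, so the chained form passes through float on its way to the double.

```c
/* Chained form: b receives the constant converted to float,
   and a then receives that float value widened back to double. */
static double chained_assignment(void)
{
    double a;
    float b;

    a = b = 3.14159265358979323846;
    return a;
}

/* Separate form: a receives the constant at full double precision. */
static double separate_assignments(void)
{
    double a;
    float b;

    b = 3.14159265358979323846;
    a = 3.14159265358979323846;
    (void)b; /* b is only here to mirror the chained version */
    return a;
}
```

The two functions return different values: the first is the float approximation widened to double, the second is the double constant itself.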
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
Okay, I'll bite. - I'll go waste my time again and comment on your
other post where you said you are confused about these cases...
Why bother? C isn't ever going to change operator precedence
just to make Bart happy.
On 29/05/2026 16:15, Janis Papanagnou wrote:
Yes, because, as experience shows, it's obviously a waste of time!
It can go both ways.
You always exasperatingly insist that there is only one thing wrong with
C's precedence rules, and I think you said once that they are
otherwise perfect.
And yet there are endless examples on forums of people
saying they are confusing.
On 29/05/2026 16:59, Scott Lurndal wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 2026-05-29 15:22, Bart wrote:
I noticed that you didn't answer my questions.
Yes, because, as experience shows, it's obviously a waste of time!
Okay, I'll bite. - I'll go waste my time again and comment on your
other post where you said you are confused about these cases...
Why bother? C isn't ever going to change operator precedence
just to make Bart happy.
It would make me happy just for someone to admit there are problems.
JP always says they are perfect but for one little thing.
Something you might do when you have time (as I'm busy), is to analyse
the expressions in some C codebases, and isolate those where removal of
parentheses that group terms, would result in exactly the same shape of
expressions, and are therefore redundant.
On 5/29/26 15:22, Bart wrote:
Something you might do when you have time (as I'm busy), is to analyse
the expressions in some C codebases, and isolate those where removal
of parentheses that group terms, would result in exactly the same
shape of expressions, and are therefore redundant.
   This is a strange exercise. When I write a complex expression,
   I sometimes use redundant parentheses for the clarity of
   my intentions about this computation. I'm thinking that
   those extra (()) are a sort of in-line comment.
On 2026-05-28 21:47, Chris M. Thomasson wrote:
On 5/28/2026 12:18 AM, Janis Papanagnou wrote:
[...]
All sorts of C's problems with memory can be addressed. (The
list can be continued; but I wonder why such things aren't recognized.)
C's problems with memory? Don't you mean the programmers that make bugs?
I'm not sure you're serious here or just joking. - To clarify...
Yes, the programmers "implement the bugs", and the language makes it
all too easy, obligingly supporting the programmers in making such bugs.
On 2026-05-29 18:12, Bart wrote:
On 29/05/2026 16:59, Scott Lurndal wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 2026-05-29 15:22, Bart wrote:
I noticed that you didn't answer my questions.
Yes, because, as experience shows, it's obviously a waste of time!
Okay, I'll bite. - I'll go waste my time again and comment on your
other post where you said you are confused about these cases...
Why bother? C isn't ever going to change operator precedence
just to make Bart happy.
I think it's the "someone is wrong on the internet" syndrome.[*]
My apologies. :-)
It would make me happy just for someone to admit there are problems.
JP always says they are perfect but for one little thing.
You said in your previous post that I would have said that they
are perfect. And now you are even saying that I'd always say that.
Please stop that!
Just for the record...
What I would say is that operator precedences are in "C"
"sensibly and appropriately defined, modulo the bit-ops".
(The point is that - with the exception of & ^ | - the ranking
makes perfectly sense and should be easily usable without doubt
by a concept-knowing programmer.)
Unsurprisingly; since exactly *that* was the obvious (and single)
issue with C's precedence definitions.
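For readers unfamiliar with the bit-op exception, the usual example is a mask test written without parentheses. This is a sketch with hypothetical function names, and it assumes this is the case the post has in mind: == binds tighter than &, so the unparenthesised test does not check what it appears to.

```c
/* Parsed as flags & (mask == 0), which is almost never what is meant. */
static int bits_clear_unparenthesised(unsigned flags, unsigned mask)
{
    return flags & mask == 0;
}

/* The intended test: are all bits selected by mask clear in flags? */
static int bits_clear_intended(unsigned flags, unsigned mask)
{
    return (flags & mask) == 0;
}
```

With flags = 2 and mask = 1 the intended test yields 1, while the unparenthesised one yields 0.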
Programmers _should_ absolutely learn the rules. But in C,
there are many of them, and some of them are deceptively subtle.
We agreed.
On 2026-05-29 13:19, Bart wrote:
Further, here: 'a * b + c' the multiplication is done first, but here:
    a *= b += c
It is done second.
You understand that '=', '*=', and '*' are three different things,
don't you?
I hope you understand that '=' should have low precedence. And that
it makes sense to evaluate that from right to left. Do you follow?
"C" obviously decided to have them all, =, +=, *=, etc. in a single
group, and thus evaluated from right to left. - Easy rule, easy to
memorize. - And that is actually what you are demanding from many
other operators, to put them in a single group. - But here you are
complaining about it!
Of course the rules for those combined (sort of two-address) operators
could have been defined differently, in a group of their own with other
rules. (Algol 68 had done that, actually; the semantics are like "apply
these operations from left to right, indicating an incremental
modification of the underlying value.")
In fact it's more like: 'b = x; a = b;'. Example:
     double a;
     float b;
     a = b = 3.14159265358979323846;
Here, 'a' will be assigned the less precise 32-bit version of the RHS.
And why are you composing such stupid examples (if not only for sake
of an argument)? - An experienced programmer wouldn't write such an
expression with mixed types if he intends clear and non-dubious code.
On 29/05/2026 16:59, Scott Lurndal wrote:[...]
Why bother? C isn't ever going to change operator precedence
just to make Bart happy.
It would make me happy just for someone to admit there are
problems.
On 29/05/2026 18:29, tTh wrote:
On 5/29/26 15:22, Bart wrote:
Something you might do when you have time (as I'm busy), is to
analyse the expressions in some C codebases, and isolate those
where removal of parentheses that group terms, would result in
exactly the same shape of expressions, and are therefore redundant.
   This is a strange exercise. When I write a complex expression,
   I sometimes use redundant parentheses for the clarity of
   my intentions about this computation. I'm thinking that
   those extra (()) are a sort of in-line comment.
Sure, but some here like to say that such expressions, if they still
work without parentheses, are unambiguous anyway.
They forget that people aren't compilers.
And then the point becomes, if you always add the parentheses, what
was the point of having that particular precedence level?
Bart <bc@freeuk.com> writes:
or would you continue to post contrived examples that make it appear
as confusing as possible?
Bart, what do you want?
Unsurprisingly; since exactly *that* was the obvious (and single)
issue with C's precedence definitions.
On 2026-05-29 13:19, Bart wrote:...
* What is the order here: a ^ b | c
Personally I don't think that there's a prevalent definition
how these should be ordered.
You actually said this:
[...]Bart, you are incapable of understanding semantics and associating
Dan Cross:
Programmers _should_ absolutely learn the rules. But in C,
there are many of them, and some of them are deceptively subtle.
JP:
We agreed.
On 29/05/2026 20:28, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
or would you continue to post contrived examples that make it appear
as confusing as possible?
Examples are examples. Do you want me to post one that didn't illustrate
an issue? It necessarily has to be contrived.
Bart, what do you want?
Today I was just replying to this post that I found annoying:
JP:
Unsurprisingly; since exactly *that* was the obvious (and single)
issue with C's precedence definitions.
That is, the suggestion, made several times by JP, that there is only
one thing wrong.
On 29/05/2026 12:20, BGB wrote:
On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
On 2026-05-28 11:57, BGB wrote:
On 5/28/2026 2:18 AM, Janis Papanagnou wrote:[...]
On 2026-05-28 01:49, BGB wrote:
But, not really an "easy" way to avoid bloat, other than to write
code specifically for what cases are relevant; while also avoiding
needless duplication and copy paste (where, overuse of copy/paste
can also lead to bloat; along with turning the code into an ugly mess).
Hmm.. - as said, during the very early days there were issues; I
recall on one platform duplication of template code in more than
one source unit. And/or some environmental hacks (of the compiler)
to deposit template code for linking. In the later days I've not
seen such immature things anymore.
Possibly, a lot could depend on how one is counting things as well.
In a lot of cases when using GCC, I end up using:
   -ffunction-sections -fdata-sections -Wl,-gc-sections
On many targets, "-fdata-sections" can lead to noticeably larger and
slower code because it effectively eliminates section anchor
optimisations. It does not negatively affect x86 AFAICS, because x86
does not use section anchors.
<https://godbolt.org/z/zeoq41Y7d>
With -fsection-anchors (enabled with optimisation on targets that
support it - generally RISCy load/store architectures), program-lifetime
variables are kept together in a lump (as though they were in a struct)
and often addressed by a pointer to that pretend struct. Thus if a
function accesses two variables "a" and "b", instead of having to load
the addresses of each of "a" and "b" into separate registers, it loads
an "anchor" into one register and accesses the variables with reg+offset
addressing.
I've seen "-fdata-sections" used regularly in embedded systems - it is
almost always a bad idea.
("-ffunction-sections" is often very helpful to reduce code image size,
so keep that one.)
Because otherwise it likes wasting code space by retaining unreachable
functions.
Using "static inline" functions also carries a risk because they can
end up duplicated across multiple translation units, or in multiple
places within the same translation unit, so is best used sparingly.
Usually you would only use static inline functions for small functions
in headers, where they are a better choice than function-like macros. In
a C file, there is rarely much point in declaring a function "inline" -
optimising compilers will inline or not as they see fit, without regard
for "inline". "static" on its own is, of course, always a good idea for
functions or data that is not "exported" by the current translation
unit, and will often make generated code smaller.
How much or how little duplication of code there will be within one
translation unit will depend on compiler settings and the rest of the
code, and not on whether or not you use "inline".
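One concrete reason a small static inline function in a header is usually preferred over a function-like macro is argument evaluation. The sketch below uses made-up names (CLAMP_MACRO, clamp_int, next_value) and counts how often an argument with a side effect gets evaluated in each form.

```c
/* Function-like macro: each use of x in the body re-evaluates the argument. */
#define CLAMP_MACRO(x, lo, hi) \
    ((x) < (lo) ? (lo) : (x) > (hi) ? (hi) : (x))

/* Header-style helper: the argument is evaluated once, and it is type-checked. */
static inline int clamp_int(int x, int lo, int hi)
{
    return x < lo ? lo : x > hi ? hi : x;
}

static int calls; /* counts how many times next_value() has run */

static int next_value(void)
{
    return calls++;
}

static int evaluations_with_inline(void)
{
    calls = 0;
    (void)clamp_int(next_value(), 0, 10);
    return calls;
}

static int evaluations_with_macro(void)
{
    calls = 0;
    (void)CLAMP_MACRO(next_value(), 0, 10);
    return calls;
}
```

Here the inline version calls next_value() once, while the macro version ends up calling it three times for an in-range value.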
As for assembler:
Main reasons not to use assembler for everything:
   Needlessly verbose;
   Non-portable.
However, often one can still end up writing C code that looks like
assembler sometimes, as this is often an effective way to optimize
things.
Say, for example:
   v0=cs[0];
   v2=cs[2];
   v1=cs[1];
   v3=cs[3];
   ct[0]=v0;
   ct[2]=v2;
   ct[1]=v1;
   ct[3]=v3;
Vs:
   ct[0]=cs[0];
   ct[1]=cs[1];
   ct[2]=cs[2];
   ct[3]=cs[3];
Because the extra variables can help sidestep latency from the
load instructions and staggering stores can avoid penalties of two
adjacent stores to the same cache-line in some cache architectures.
Where, in the latter case, the compiler may fail to as effectively
avoid the load-latency or realize the need to stagger the stores for
best performance, ...
That might be the case for a very simplistic compiler. With an
optimising compiler, these extra variables will quickly be eliminated.
If the compiler has a good scheduling model of the device, it can do
whatever instruction scheduling works best for that processor. If the
model is not good enough, it will be suboptimal. I would not, however,
expect any difference in the generated code for the two code snippets.
Sometimes this kind of "manual optimisation" is helpful when you have to
try to get efficient results from a weak compiler, however.
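One caveat when comparing the two snippets, sketched here under the assumption that cs and ct are int arrays (the thread does not give their types): the version with temporaries performs all four loads before any store, so the two forms are only guaranteed to behave the same when the arrays do not overlap. Marking the pointers restrict states that assumption to the compiler. Function names are invented for illustration.

```c
#include <string.h>

/* Loads first, then staggered stores, as in the hand-scheduled version. */
static void copy4_staggered(int *restrict ct, const int *restrict cs)
{
    int v0 = cs[0];
    int v2 = cs[2];
    int v1 = cs[1];
    int v3 = cs[3];

    ct[0] = v0;
    ct[2] = v2;
    ct[1] = v1;
    ct[3] = v3;
}

/* The straightforward element-by-element copy. */
static void copy4_plain(int *restrict ct, const int *restrict cs)
{
    ct[0] = cs[0];
    ct[1] = cs[1];
    ct[2] = cs[2];
    ct[3] = cs[3];
}

/* Returns 1 when both versions produce the same destination contents. */
static int copies_agree(void)
{
    const int src[4] = { 10, 20, 30, 40 };
    int a[4] = { 0 };
    int b[4] = { 0 };

    copy4_staggered(a, src);
    copy4_plain(b, src);
    return memcmp(a, b, sizeof a) == 0;
}
```

Whether a given compiler emits identical machine code for both is best checked by comparing its output for the target in question.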
Usual strategy is to try to limit how much code is written, and also
to avoid doing things in ways that result in too much code, or too
much cruft.
Best to avoid both copy paste when reasonable, and sticking anything
non-trivial in macros.
We avoided macros if possible.
They are de-facto for constants and similar, but for longer stuff is
better avoided.
Macros are rarely the best way to define constants. They are needed if
you are using the constants for pre-processor stuff like conditional
compilation. But generally you get clearer code, better typing, and
potentially several other benefits from using alternative choices like
"enum" (even for stand-alone integer constants), "static const"
variables, and in C23, "constexpr" variables. There's no doubt that a
lot of code /does/ use macros for constants, but I view it as a relic of
the past rather than good coding practice.
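A brief sketch of the alternatives mentioned, with hypothetical names throughout. It sticks to enum and static const so that it builds with pre-C23 compilers; a C23 constexpr variable would be declared much like the static const one.

```c
/* An enum constant is an integer constant expression, so it can size arrays. */
enum { BUFFER_SIZE = 256 };

/* A static const variable has a type and obeys normal scoping rules. */
static const double circle_pi = 3.14159265358979;

/* A macro is still required when the value drives conditional compilation. */
#define FEATURE_LOGGING 1

static char buffer[BUFFER_SIZE];

static unsigned long buffer_bytes(void)
{
    return (unsigned long)sizeof buffer;
}

static double circle_area(double radius)
{
    return circle_pi * radius * radius;
}

static int logging_enabled(void)
{
#if FEATURE_LOGGING
    return 1;
#else
    return 0;
#endif
}
```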
But, things can be considered in relative terms:
Like, C++ may carry various penalties vs C.
I don't find C++ carries noticeable penalties compared to C, for my
embedded work. But I do disable exceptions and RTTI - exceptions may
have very little run-time overhead, but the unwind tables can be
significant when code size is important in small systems.
On 29/05/2026 17:10, Janis Papanagnou wrote:
On 2026-05-29 13:19, Bart wrote:
Further, here: 'a * b + c' the multiplication is done first, but here:
    a *= b += c
It is done second.
You understand that '=', '*=', and '*' are three different things,
don't you?
I hope you understand that '=' should have low precedence. And that
it makes sense to evaluate that from right to left. Do you follow?
"C" obviously decided to have them all, =, +=, *=, etc. in a single
group, and thus evaluated from right to left. - Easy rule, easy to
memorize. - And that is actually what you are demanding from many
other operators, to put them in a single group. - But here you are
complaining about it!
Of course the rules for those combined (sort of two-address) operators
could have been defined differently, in an own group with other rules.
(Algol 68 had done that, actually; the semantics are like "apply these
operations from left to right, indicating an incremental modification
of the underlying value.")
For those who don't know, the behaviour of this C code:
   a += b += c += d
is very different from the equivalent Algol68:
   a +:= b +:= c +:= d
This only modifies 'a'.
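The C side of that comparison can be sketched as below (struct and function names are hypothetical). The second function is not Algol 68; it only emulates in C the left-to-right behaviour described in the post, where 'a' alone is modified.

```c
struct quad {
    int a;
    int b;
    int c;
    int d;
};

/* C groups right to left: a += (b += (c += d)), so c, b and a all change. */
static struct quad c_chain(int a, int b, int c, int d)
{
    struct quad v;

    a += b += c += d;
    v.a = a;
    v.b = b;
    v.c = c;
    v.d = d;
    return v;
}

/* Left-to-right emulation: ((a += b) += c) += d, which only touches a. */
static struct quad left_to_right_chain(int a, int b, int c, int d)
{
    struct quad v;

    a += b;
    a += c;
    a += d;
    v.a = a;
    v.b = b;
    v.c = c;
    v.d = d;
    return v;
}
```

Starting from 1, 2, 3, 4, both forms leave a at 10, but the C form also changes b to 9 and c to 7, while the left-to-right form leaves b and c untouched.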
[...]
You seem to like making it 100% about me. How about stopping making it always so personal.
On 2026-05-29 12:10, Janis Papanagnou wrote:
On 2026-05-29 13:19, Bart wrote:...
* What is the order here: a ^ b | c
(a^b)|c
Personally I don't think that there's a prevalent definition
how these should be ordered.
I'm not sure what you mean by "prevalent definition". Ordinarily, I'd
expect the C standard to qualify - it definitely defines the order, and
the very purpose of a language standard is to prevail over non-standard
alternatives. However, I'm sure you're aware of the C standard, and made
that comment anyway, so I presume you mean something different by it.
On 2026-05-29 20:18, Bart wrote:
(Are you
so proud of having understood that that you want to repeat it? -
"Look Ma, no hands!")
What you expose here (about your personality) is nothing new, and
it's about your personality; you obviously aren't really interested
to know or understand or learn the facts.
you obviously weren't intellectually capable of understanding the
topic, and all you posted is this reply!
- Pathetic!
On 29/05/2026 20:28, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
or would you continue to post contrived examples that make it appear
as confusing as possible?
Examples are examples. Do you want me to post one that didn't
illustrate an issue? It necessarily has to be contrived.
Bart, what do you want?
Today I was just replying to this post that I found annoying:
JP:
Unsurprisingly; since exactly *that* was the obvious (and single)
issue with C's precedence definitions.
That is, the suggestion, made several times by JP, that there is only
one thing wrong.
On 2026-05-29 20:09, Bart wrote:
You actually said this:
You continue your trollish stance to cherry-pick words without
understanding or trying to understand what's been expressed.
The insight appears to me that you're taking communication in a
similar way as you "design" your languages; focusing on personal
*syntax* preferences instead of the more important *semantics*.
Although we're talking in your native language (and not mine) you
obviously completely miss or deliberately ignore that there's a
difference between "it makes perfectly sense" and "it's perfect".
(I said the former, you put the latter in my mouth.)
(The point is that - with the exception of & ^ | - the ranking
makes perfect[ly] sense and should be easily usable without doubt
by a concept-knowing programmer.)
What I would say is that operator precedences are in "C"
"sensibly and appropriately defined, modulo the bit-ops".
you're still playing your stupid game; you ignored that. I suggest
to try to map this statement to either of the above two statements,
the one I said and the one you (wrongly) attributed, and see which
one fits. (Hint: the former.)
Bart <bc@freeuk.com> writes:
On 29/05/2026 20:28, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
or would you continue to post contrived examples that make it appear
as confusing as possible?
Examples are examples. Do you want me to post one that didn't
illustrate an issue? It necessarily has to be contrived.
Bart, what do you want?
Today I was just replying to this post that I found annoying:
JP:
Unsurprisingly; since exactly *that* was the obvious (and single)
issue with C's precedence definitions.
That is, the suggestion, made several times by JP, that there is only
one thing wrong.
I note your refusal to address most of what I wrote.
Upthread, you asked a question:
And then the point becomes, if you always add the parentheses, what
was the point of having that particular precedence level?
You've made it clear that you were never interested in an answer.
Bart, what do you want?
On 29/05/2026 21:56, Keith Thompson wrote:[...]
I note your refusal to address most of what I wrote.
Upthread, you asked a question:
And then the point becomes, if you always add the
parentheses, what was the point of having that particular
precedence level?
You've made it clear that you were never interested in an answer.
You said this:
"You're asking why C is designed the way it is. We could waste a
great deal of time and effort answering that for you. There are
numerous documents about the design and history of C, and of
its ancestor languages. I could provide you with links."
Actually I'm not asking why C is like that. We're already there.
As for my question, what /is/ the point? I'm still waiting!
Of course, I want the answer to be that there isn't any point if
parentheses will be used anyway.
* Why do bitwise & | ^ need their own level anyway
* Why do << >> have their own level anyway
Further, here: 'a * b + c' the multiplication is done first, but
here:
a *= b += c
It is done second.
There's no language where the users are so detail focussed and open
to new features [than C++].
On Fri, 29 May 2026 12:19:04 +0100, Bart wrote:
* Why do bitwise & | ^ need their own level anyway
So that you can do shifting and masking with minimal parentheses.
* Why do << >> have their own level anyway
So that shift expressions can use common arithmetic operators with
minimal parentheses.
Further, here: 'a * b + c' the multiplication is done first, but
here:
a *= b += c
It is done second.
That kind of thing is disallowed in Python, for some reason.
Bart <bc@freeuk.com> writes:...
Of course, I want the answer to be that there isn't any point if
parentheses will be used anyway.
On 2026-05-29 18:52, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:...
Of course, I want the answer to be that there isn't any point if
parentheses will be used anyway.
The answer, of course, is that the condition of your "if" clause is not
true. In the overwhelming majority of the cases, people do not use
parentheses to clarify the order of evaluation that is guaranteed by C's
grammar rules. They only use them in the cases where they feel that
there's a significant chance of confusion.
On 30/05/2026 01:31, James Kuyper wrote:
On 2026-05-29 18:52, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:...
Of course, I want the answer to be that there isn't any point if
parentheses will be used anyway.
The answer, of course, is that the condition of your "if" clause is
not true. In the overwhelming majority of the cases, people do not
use parentheses to clarify the order of evaluation that is guaranteed
by C's grammar rules. They only use them in the cases where they feel
that there's a significant chance of confusion.
Those are the cases we're talking about! That is:
<< >> & | ^
Maybe add == != and < <= >= > if someone wants to take advantage of
their different levels, but I guess 99% wouldn't even know about that.
Most of the rest, there tends to be agreement across languages:
school arithmetic group - comparisons - logical and/or
I haven't included ?: as that's too weird.
To be excluded from being syntactic sugar, it needs to be something
that is not generally possible to express within the base language.
So, for example: Things like operator overloading or classes are
syntactic sugar IMO, as what they do can be expressed in C, even if
a lot less pretty (or far from an idiomatic style).
I would not consider exceptions or RTTI as syntactic sugar, because
these involve things that do not map to native C.
On 30/05/2026 00:18, Lawrence D'Oliveiro wrote:
On Fri, 29 May 2026 12:19:04 +0100, Bart wrote:
* Why do bitwise & | ^ need their own level anyway
So that you can do shifting and masking with minimal parentheses.
Can you give examples?
* Why do << >> have their own level anyway
So that shift expressions can use common arithmetic operators with
minimal parentheses.
Again, examples?
Am 29.05.2026 um 11:15 schrieb BGB:
Like, if one doesn't care that the compiler takes a long time to run
and the EXE is needlessly large, maybe OK, not great if one does care...
Binary size doesn't matter with Windows.
Having to spend minutes or more waiting for the compiler would
seriously hurt momentum for many tasks.
Use C++20 modules and parallel builds.
Say, for example, if the Boot ROM requires keeping everything under 32K.
C++ was designed for large scale program development.
With 32K-systems you can stick with C.
But they still don't have "try-finally".
On Sat, 30 May 2026 01:26:47 +0100, Bart wrote:
On 30/05/2026 00:18, Lawrence D'Oliveiro wrote:
On Fri, 29 May 2026 12:19:04 +0100, Bart wrote:
* Why do bitwise & | ^ need their own level anyway
So that you can do shifting and masking with minimal parentheses.
Can you give examples?
You haven't done much bit manipulation, have you?
Extracting RGB components from a pixel:
    const unsigned int
        r = pixel >> 16 & 255,
        g = pixel >> 8 & 255,
        b = pixel & 255;
Combining RGBA components into a pixel:
    colors[i] =
            channel[0] << 24
        |
            channel[1] << 16
        |
            channel[2] << 8
        |
            channel[3];
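The two fragments can be turned into a self-contained sketch as follows. The types and names (uint32_t pixels, uint8_t channels, struct rgb, the function names) are assumptions, since the post does not show its declarations. The casts are an addition: an 8-bit channel promotes to int, and shifting a value of 128 or more left by 24 would overflow a 32-bit int, so each channel is widened to uint32_t first. A cast binds tighter than <<, so no extra parentheses are needed.

```c
#include <stdint.h>

struct rgb {
    unsigned r;
    unsigned g;
    unsigned b;
};

/* >> binds tighter than &, so each line is (pixel >> n) & 255. */
static struct rgb unpack_rgb(uint32_t pixel)
{
    struct rgb out;

    out.r = pixel >> 16 & 255;
    out.g = pixel >> 8 & 255;
    out.b = pixel & 255;
    return out;
}

/* << binds tighter than |, so the shifted channels are OR-ed together. */
static uint32_t pack_rgba(uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3)
{
    return (uint32_t)c0 << 24
         | (uint32_t)c1 << 16
         | (uint32_t)c2 << 8
         | c3;
}
```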
* Why do << >> have their own level anyway
So that shift expressions can use common arithmetic operators with
minimal parentheses.
Again, examples?
From the same code module, putting together a subpicture image
consisting of 2 bits per pixel:
^ & with respect to each other. Port the fragment to a language with
slightly different rules and it would still work.
& | ^, and poor to rely on them for the meaning of your code.
Bart <bc@freeuk.com> writes:
On 30/05/2026 01:31, James Kuyper wrote:
On 2026-05-29 18:52, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:...
Of course, I want the answer to be that there isn't any point if
parentheses will be used anyway.
The answer, of course, is that the condition of your "if" clause is
not true. In the overwhelming majority of the cases, people do not
use parentheses to clarify the order of evaluation that is guaranteed
by C's grammar rules. They only use them in the cases where they feel
that there's a significant chance of confusion.
Those are the cases we're talking about! That is:
<< >> & | ^
Maybe add == != and < <= >= > if someone wants to take advantage of
their different levels, but I guess 99% wouldn't even know about that.
Most of the rest, there tends to be agreement across languages:
school arithmetic group - comparisons - logical and/or
I haven't included ?: as that's too weird.
So what is your question? I had thought that you meant to ask why
Ritchie defined the precedences that way, but apparently that's
not what you meant.
On 5/29/2026 6:22 AM, David Brown wrote:
On 29/05/2026 12:20, BGB wrote:
On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
On 2026-05-28 11:57, BGB wrote:
On 5/28/2026 2:18 AM, Janis Papanagnou wrote:[...]
On 2026-05-28 01:49, BGB wrote:
But, not really an "easy" way to avoid bloat, other than to write
code specifically for what cases are relevant; while also avoiding
needless duplication and copy paste (where, overuse of copy/paste
can also lead to bloat; along with turning the code into an ugly
mess).
Hmm.. - as said, the during very early days there were issues; I
recall on one platform duplication of template code in more that
one source unit. And/or some environmental hacks (of the compiler)
to deposit template code for linking. In the later days I've not
seen such immature things anymore.
Possibly, a lot could depend on how one is counting things as well.
In a lot of cases when using GCC, I end up using:
   -ffunction-sections -fdata-sections -Wl,-gc-sections
On many targets, "-fdata-sections" can lead to noticeably larger and
slower code because it effectively eliminates section anchor
optimisations. It does not negatively affect x86 AFAICS, because x86
does not use section anchors.
<https://godbolt.org/z/zeoq41Y7d>
With -fsection-anchors (enabled with optimisation on targets that
support it - generally RISCy load/store architectures),
program-lifetime variables are kept together in a lump (as though
they were in a struct) and often addressed by a pointer to that
pretend struct.
Thus if a function accesses two variables "a" and "b", instead of
having to load the addresses of each of "a" and "b" into separate
registers, it loads an "anchor" into one register and accesses the
variables with reg+offset addressing.
I've seen "-fdata-sections" used regularly in embedded systems - it is
almost always a bad idea.
("-ffunction-sections" is often very helpful to reduce code image
size, so keep that one.)
Both seem to help on x86, x86-64, and also on RISC-V, at making GCC's
output at least sorta space-comparable to my own compilers.
The merit of "-fdata-sections" is mostly that it eliminates unused
global variables; whereas "-ffunction-sections" eliminates unreachable functions.
Neither is needed with my own compiler, which compiles things in a way
such that it eliminates anything that is unreachable.
That might be the case for a very simplistic compiler. With an
optimising compiler, these extra variables will quickly be eliminated.
If the compiler has a good scheduling model of the device, it can do
whatever instruction scheduling works best for that processor. If the
model is not good enough, it will be suboptimal. I would not,
however, expect any difference in the generated code for the two code
snippets.
Sometimes this kind of "manual optimisation" is helpful when you have
to try to get efficient results from a weak compiler, however.
Possibly, but this sort of thing can help with both BGBCC and with MSVC
IME
Usual strategy is to try to limit how much code is written, and
also to avoid doing things in ways that result in too much code, or
too much cruft.
Best to avoid both copy paste when reasonable, and sticking
anything non-trivial in macros.
We avoided macros if possible.
They are de-facto for constants and similar, but for longer stuff is
better avoided.
Macros are rarely the best way to define constants. They are needed
if you are using the constants for pre-processor stuff like
conditional compilation. But generally you get clearer code, better
typing, and potentially several other benefits from using alternative
choices like "enum" (even for stand-alone integer constants), "static
const" variables, and in C23, "constexpr" variables. There's no doubt
that a lot of code /does/ use macros for constants, but I view it as a
relic of the past rather than good coding practice.
They are traditional...
Like:
  static const double M_PI = 3.14159265358979;
Could also make sense, but people don't usually do this, they usually
use macros...
But, things can be considered in relative terms:
Like, C++ may carry various penalties vs C.
I don't find C++ carries noticeable penalties compared to C, for my
embedded work. But I do disable exceptions and RTTI - exceptions may
have very little run-time overhead, but the unwind tables can be
significant when code size is important in small systems.
Yes, that is the main thing.
  They carry zero performance penalty in practice;
  But, have a non-zero penalty for image size.
Not enough to be a deal-breaker towards using them if they are used,
but enough that one wants them disabled if not used...
On 29/05/2026 18:29, tTh wrote:
On 5/29/26 15:22, Bart wrote:
Something you might do when you have time (as I'm busy), is to
analyse the expressions in some C codebases, and isolate those where
removal of parentheses that group terms, would result in exactly the
same shape of expressions, and are therefore redundant.
    This is a strange exercise. When I write a complex expression,
    I sometimes use redundant parentheses for the clarity of
    my intentions about this computation. I'm thinking that
    those extra (()) are a sort of in-line comment.
Sure, but some here like to say that such expressions, if they still
work without parentheses, are unambiguous anyway.
On 29/05/2026 21:56, Keith Thompson wrote:
[snip]
Upthread, you asked a question:
And then the point becomes, if you always add the parentheses, what
was the point of having that particular precedence level?
You've made it clear that you were never interested in an answer.
You said this:
"You're asking why C is designed the way it is. We could waste a
great deal of time and effort answering that for you. There are
numerous documents about the design and history of C, and of
its ancestor languages. I could provide you with links."
Actually I'm not asking why C is like that. We're already there.
I'm saying that there is no value in those extra levels, some people
think there is, and I'm arguing about that. I was replying to tTh.
As for my question, what /is/ the point? I'm still waiting!
Of course, I want the answer to be that there isn't any point if
parentheses will be used anyway.
On 29/05/2026 22:16, BGB wrote:
On 5/29/2026 6:22 AM, David Brown wrote:
On 29/05/2026 12:20, BGB wrote:
On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
We avoided macros if possible.
They are de-facto for constants and similar, but for longer stuff is
better avoided.
Macros are rarely the best way to define constants. They are needed
if you are using the constants for pre-processor stuff like
conditional compilation. But generally you get clearer code, better
typing, and potentially several other benefits from using alternative
choices like "enum" (even for stand-alone integer constants), "static
const" variables, and in C23, "constexpr" variables. There's no
doubt that a lot of code /does/ use macros for constants, but I view
it as a relic of the past rather than good coding practice.
They are traditional...
Like:
  static const double M_PI = 3.14159265358979;
Could also make sense, but people don't usually do this, they usually
use macros...
They should not do so (IMHO, of course). Yes, macros are traditional -
but there are no plus sides to using them for this kind of thing. (There
are no plus sides to using all-caps either, but people do that too.)
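A minimal sketch of the alternatives named above, for anyone who wants to see them side by side. All identifiers here (USE_FAST_PATH, BUFFER_SIZE, pi_approx, max_retries and the three functions) are made up for illustration. The macro is kept only where the preprocessor itself has to see the value.

```c
#include <stddef.h>

/* A macro is still needed when the preprocessor must see the value. */
#define USE_FAST_PATH 1

/* An enumeration constant is an integer constant expression,
   so it can be used as an array size. */
enum { BUFFER_SIZE = 64 };

/* "static const" gives a typed object. In C it is not an integer
   constant expression, but it works wherever a plain value is wanted. */
static const double pi_approx = 3.14159265358979;
static const int max_retries = 3;

/* C23 would also allow "constexpr int max_retries = 3;". It is left out
   here so that the sketch compiles with older compilers. */

static char buffer[BUFFER_SIZE];

size_t buffer_capacity(void)
{
    return sizeof buffer;
}

double circle_area(double radius)
{
    return pi_approx * radius * radius;
}

int retries_allowed(void)
{
#if USE_FAST_PATH
    return max_retries;
#else
    return 2 * max_retries;
#endif
}
```

With these definitions, buffer_capacity() gives 64 and retries_allowed() gives 3; only USE_FAST_PATH is visible to #if.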
[...]
In article <10vd1tu$ekvl$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 29/05/2026 21:56, Keith Thompson wrote:
[snip]
Upthread, you asked a question:
And then the point becomes, if you always add the parentheses, what
was the point of having that particular precedence level?
You've made it clear that you were never interested in an answer.
You said this:
"You're asking why C is designed the way it is. We could waste a
great deal of time and effort answering that for you. There are
numerous documents about the design and history of C, and of
its ancestor languages. I could provide you with links."
Actually I'm not asking why C is like that. We're already there.
I'm saying that there is no value in those extra levels, some people
think there is, and I'm arguing about that. I was replying to tTh.
As for my question, what /is/ the point? I'm still waiting!
To clarify: the question is, what is the point of those levels?
How is that different from asking "why C is like that"?
On 2026-05-30 13:52, David Brown wrote:
On 29/05/2026 22:16, BGB wrote:
On 5/29/2026 6:22 AM, David Brown wrote:
On 29/05/2026 12:20, BGB wrote:
On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
We avoided macros if possible.
They are de-facto for constants and similar, but for longer stuff
is better avoided.
Macros are rarely the best way to define constants. They are needed
if you are using the constants for pre-processor stuff like
conditional compilation. But generally you get clearer code, better
typing, and potentially several other benefits from using
alternative choices like "enum" (even for stand-alone integer
constants), "static const" variables, and in C23, "constexpr"
variables. There's no doubt that a lot of code /does/ use macros
for constants, but I view it as a relic of the past rather than good
coding practice.
They are traditional...
Like:
  static const double M_PI = 3.14159265358979;
Could also make sense, but people don't usually do this, they usually
use macros...
They should not do so (IMHO, of course). Yes, macros are traditional
- but there are no plus sides to using them for this kind of thing.
(There are no plus sides to using all-caps either, but people do that
too.)
Because in the early days Cpp constants were used and Cpp stuff was
often capitalized[*]. Our C++ coding rules back then had mandated lowercase
also for constants, but strangely some folks were so used to uppercase
Cpp literals that they disliked writing constants (as other objects)
in lowercase, and stated opinions were sometimes heated like religious
topics.
I wonder what lexical convention regular "C" (or C++) programmers here
use for constants nowadays.
Curiously I inspected my latest C-source to see what convention I've
actually followed recently. But I noticed that I had no hard constants
used at all; all parameters came from a configuration file and through
the command line interface. (That makes sense, I guess.)
Janis
[*] Strangely there were C-function-macros that were written lowercase, though.
On 29/05/2026 22:16, BGB wrote:
On 5/29/2026 6:22 AM, David Brown wrote:
On 29/05/2026 12:20, BGB wrote:
On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
On 2026-05-28 11:57, BGB wrote:
On 5/28/2026 2:18 AM, Janis Papanagnou wrote:[...]
On 2026-05-28 01:49, BGB wrote:
But, not really an "easy" way to avoid bloat, other than to write
code specifically for what cases are relevant; while also avoiding
needless duplication and copy paste (where, overuse of copy/paste
can also lead to bloat; along with turning the code into an ugly
mess).
Hmm.. - as said, during the very early days there were issues; I
recall on one platform duplication of template code in more than
one source unit. And/or some environmental hacks (of the compiler)
to deposit template code for linking. In the later days I've not
seen such immature things anymore.
Possibly, a lot could depend on how one is counting things as well.
In a lot of cases when using GCC, I end up using:
  -ffunction-sections -fdata-sections -Wl,-gc-sections
On many targets, "-fdata-sections" can lead to noticeably larger and
slower code because it effectively eliminates section anchor
optimisations. It does not negatively affect x86 AFAICS, because x86
does not use section anchors.
<https://godbolt.org/z/zeoq41Y7d>
With -fsection-anchors (enabled with optimisation on targets that
support it - generally RISCy load/store architectures), program-
lifetime variables are kept together in a lump (as though they were
in a struct) and often addressed by a pointer to that pretend struct.
Thus if a function accesses two variables "a" and "b", instead of
having to load the addresses of each of "a" and "b" into separate
registers, it loads an "anchor" into one register and accesses the
variables with reg+offset addressing.
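To make the "pretend struct" picture concrete, here is a rough C-level illustration. The second function is only a mental model of what the compiler may do at the machine level when section anchors are in use; it is not what anyone has to write by hand. The names counter_a, counter_b, anchor_model and the two functions are hypothetical, and real generated code depends on the target and compiler version.

```c
/* Two program-lifetime variables in one translation unit. */
static int counter_a = 1;
static int counter_b = 2;

/* As written in the source: conceptually, the address of each
   variable has to be materialised separately. */
int sum_separate(void)
{
    return counter_a + counter_b;
}

/* Mental model of section anchors: the variables sit together as if
   they were members of one struct, so a single base address plus small
   offsets reaches both of them. */
struct anchor_block { int a; int b; };
static struct anchor_block anchor_model = { 1, 2 };

int sum_anchored(void)
{
    const struct anchor_block *base = &anchor_model; /* one "anchor" */
    return base->a + base->b;                        /* base + offset */
}
```

Both functions compute the same value; the difference is only in how many addresses need to be loaded. Putting every variable in its own section removes the shared base that the second form relies on.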
I've seen "-fdata-sections" used regularly in embedded systems - it
is almost always a bad idea.
("-ffunction-sections" is often very helpful to reduce code image
size, so keep that one.)
Both seem to help on x86, x86-64, and also on RISC-V, at making GCC's
output at least sorta space-comparable to my own compilers.
The merit of "-fdata-sections" is mostly that it eliminates unused
global variables; whereas "-ffunction-sections" eliminates unreachable
functions.
That is the point of them, yes. "-ffunction-sections" can be useful at
removing unused code from more general code. For microcontrollers,
SDK's and manufacturers' driver code will normally contain a large
number of functions that can be eliminated in this way, saving a lot of
code space.
However, in practice, "-fdata-sections" rarely eliminates a significant
amount - most programs do not have large amounts of statically-allocated
data that is not used. Gcc, and I think most other compilers, put the
static lifetime data for each translation unit in its own section, so if
no data from a translation unit is used it will be eliminated at link
time even with -fno-data-sections. And of course it makes no difference
for heap data or stack data.
In my testing, "-ffunction-sections" is absolutely worth using (on
targets where code space is relevant - there's no need for PC software).
On some targets, it may mean a few lost opportunities for shorter
jump/call instructions between functions in the same translation unit,
but the cost is rarely anything more than a slightly longer link time.
But "-fdata-sections" typically gives almost no ram space savings, and
makes code bigger and slower.
As I noted, gcc on x86 does not support section anchors, so there is not
likely to be much code cost for -fdata-sections.
Where section anchors shine - and where -fdata-sections therefore has
cost - is when a function needs to access more than one piece of static
lifetime data defined in the same translation unit (or another
translation unit if you are using LTO). That happens a lot in embedded
ARM programming at least. I don't know about RISC-V. If the target
normally uses a "small data section" for ram (I know this is common on
PowerPC), then there is, in effect, a program-wide section anchor
already. So it is possible that relatively few targets have section
anchors - but the 32-bit ARM on gcc is a vastly popular choice in the
embedded world, so it is important to understand the cost of this
compiler flag for that target at least.
Neither is needed with my own compiler, which compiles things in a way
such that it eliminates anything that is unreachable.
[...]
That might be the case for a very simplistic compiler. With an
optimising compiler, these extra variables will quickly be
eliminated. If the compiler has a good scheduling model of the
device, it will do whatever instruction scheduling works best for that
processor. If the model is not good enough, it will be suboptimal.
I would not, however, expect any difference in the generated code for
the two code snippets.
Sometimes this kind of "manual optimisation" is helpful when you have
to try to get efficient results from a weak compiler, however.
Possibly, but this sort of thing can help with both BGBCC and with
MSVC IME
I don't tend to think of MSVC as a highly optimising compiler - but it
is not a tool I have much use for, as it does not handle the targets I
need. When I have sometimes looked at the generated code on godbolt, it
has not impressed me at all. So it could well fall into the "helpful
when using a weaker compiler" category.
Usual strategy is to try to limit how much code is written, and
also to avoid doing things in ways that result in too much code,
or too much cruft.
Best to avoid both copy paste when reasonable, and sticking
anything non-trivial in macros.
We avoided macros if possible.
They are de-facto for constants and similar, but for longer stuff is
better avoided.
Macros are rarely the best way to define constants. They are needed
if you are using the constants for pre-processor stuff like
conditional compilation. But generally you get clearer code, better
typing, and potentially several other benefits from using alternative
choices like "enum" (even for stand-alone integer constants), "static
const" variables, and in C23, "constexpr" variables. There's no
doubt that a lot of code /does/ use macros for constants, but I view
it as a relic of the past rather than good coding practice.
They are traditional...
Like:
  static const double M_PI = 3.14159265358979;
Could also make sense, but people don't usually do this, they usually
use macros...
They should not do so (IMHO, of course). Yes, macros are traditional -
but there are no plus sides to using them for this kind of thing. (There
are no plus sides to using all-caps either, but people do that too.)
(I'm snipping all the details of your own C compiler, because there is
very little I can comment on.)
But, things can be considered in relative terms:
Like, C++ may carry various penalties vs C.
I don't find C++ carries noticeable penalties compared to C, for my
embedded work. But I do disable exceptions and RTTI - exceptions may
have very little run-time overhead, but the unwind tables can be
significant when code size is important in small systems.
Yes, that is the main thing.
  They carry zero performance penalty in practice;
  But, have a non-zero penalty for image size.
Not enough to be a deal-breaker towards using them if they are used,
but enough that one wants them disabled if not used...
Agreed.
(I could also note that I make heavy use of templates in C++ code - it
often leads to smaller and faster results.)
On 30/05/2026 13:29, Dan Cross wrote:
In article <10vd1tu$ekvl$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 29/05/2026 21:56, Keith Thompson wrote:
[snip]
Upthread, you asked a question:
And then the point becomes, if you always add the parentheses, what
was the point of having that particular precedence level?
You've made it clear that you were never interested in an answer.
You said this:
"You're asking why C is designed the way it is. We could waste a
great deal of time and effort answering that for you. There are
numerous documents about the design and history of C, and of
its ancestor languages. I could provide you with links."
Actually I'm not asking why C is like that. We're already there.
I'm saying that there is no value in those extra levels, some people
think there is, and I'm arguing about that. I was replying to tTh.
As for my question, what /is/ the point? I'm still waiting!
To clarify: the question is, what is the point of those levels?
How is that different from asking "why C is like that"?
My question is actually independent of C or its history.
I accept those levels exist. I was asking do they currently serve a
useful purpose.
If not, people can choose to ignore them when writing C code,
for example like this where all () are technically superfluous:
crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
And they can choose to not adopt them when devising new languages,
however many still do faithfully recreate the same pattern, with a few
notable exceptions such as Go.
It doesn't need << >> to be in a distinct group from multiply or add
groups.
But it is also not clear because the part after >> is sprawling.
You'd want it like this:
Remove ambiguity in the mind of the reader? Lead to fewer
surprises when a new term needs to be added?
Bart <bc@freeuk.com> writes:
[...][...]
C's operator precedence rules are complicated and arguably flawed.
They could have been defined differently. A simpler set of rules,
with fewer levels, *might* have been better. I don't have any
concrete suggestions -- nor do I have any strong preferences.
I accept C's rules as they are. I would accept them if they had
been defined differently.
Nothing about the current rules particularly bothers me. There are
no objective criteria for deciding what the rules *should* be.
Even having multiplication bind more tightly than addition is
fundamentally an arbitrary choice (though one that's almost
universally recognized, even outside the context of programming
languages).
[...]
If not, people can choose to ignore them when writing C code,
for example like this where all () are technically superfluous:
crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
Yes, they can, and I personally tend to agree that they should.
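For anyone who wants to check the "technically superfluous" claim rather than take it on trust, here is a small sketch. The table contents are arbitrary placeholder values, not a real CRC table, and the function names are made up. It relies only on C's precedence order for these operators: shifts bind tighter than &, which binds tighter than ^.

```c
#include <stdint.h>

/* Placeholder table: the values are arbitrary and only give the
   expressions something to index. */
static const uint32_t s_crc32[16] = {
    0x00000000u, 0x11111111u, 0x22222222u, 0x33333333u,
    0x44444444u, 0x55555555u, 0x66666666u, 0x77777777u,
    0x88888888u, 0x99999999u, 0xAAAAAAAAu, 0xBBBBBBBBu,
    0xCCCCCCCCu, 0xDDDDDDDDu, 0xEEEEEEEEu, 0xFFFFFFFFu
};

/* The expression as quoted, with grouping parentheses. */
uint32_t step_with_parens(uint32_t crcu32, uint8_t b)
{
    return (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
}

/* The same expression with every grouping parenthesis removed.
   Precedence alone yields the same parse. */
uint32_t step_without_parens(uint32_t crcu32, uint8_t b)
{
    return crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];
}
```

The two functions return the same value for every input. Some compilers may still warn about the second form (GCC has a -Wparentheses warning for this kind of mix, as far as I know), which is itself a hint about how readers tend to see it.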
[...]
When designing a new language, there are real advantages in strictly
imitating C's rules, just because so many programmers are familiar
with them. (It would have been silly for C++ or Objective-C to
change the precedence rules, even to improve them.) But there
are also real advantages in using precedence rules that are better
(e.g., simpler) than C's.
It depends on the nature of the language.
It could be an interesting discussion for comp.lang.misc.
On 2026-05-31 01:43, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...][...]
C's operator precedence rules are complicated and arguably flawed.
I'd say that just the (known) flaw makes them (slightly) complicated;
so you need to remember that "flaw" (or "inconsistency") to be safe.
The rest is completely sensible. And even if one doesn't have a table
to look up the precedences they mostly can be derived (presuming one
has a feeling for the underlying logic of these things or experiences
from other related areas).
They could have been defined differently. A simpler set of rules,
with fewer levels, *might* have been better. I don't have any
concrete suggestions -- nor do I have any strong preferences.
I accept C's rules as they are. I would accept them if they had
been defined differently.
Nothing about the current rules particularly bothers me. There are
no objective criteria for deciding what the rules *should* be.
There are. (What I called above as "derived underlying logic".) Some
aspects have already been formulated in this and other threads here.
But maybe not obvious to recognize without background in mathematics,
logic, or CS.
Even having multiplication bind more tightly than addition is
fundamentally an arbitrary choice
(Now opinions are getting really strange; in the above stated sense.)
(though one that's almost
universally recognized, even outside the context of programming
languages).
[...]
If not, people can choose to ignore them when writing C code,
for example like this where all () are technically superfluous:
crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
Yes, they can, and I personally tend to agree that they should.
The more complex the expressions are the more structure they need.
IMO, the parentheses above make precedence clear (if unknown!), but
are not contributing to readability. It would have made more sense
to separate the sub-expression within the [...] into its own object to
enhance readability and to more easily understand what's going on.
To emphasize: it is not the precedences that are the problem above, but
the complexity of the expression in connexion with lack of structuring.
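One way to read that suggestion, as a sketch only: give the table index its own named object, so the outer expression stays short. The table values are arbitrary placeholders, and the names crc_step and index are hypothetical.

```c
#include <stdint.h>

/* Placeholder table: arbitrary values, not a real CRC table. */
static const uint32_t s_crc32[16] = {
    0x00000000u, 0x11111111u, 0x22222222u, 0x33333333u,
    0x44444444u, 0x55555555u, 0x66666666u, 0x77777777u,
    0x88888888u, 0x99999999u, 0xAAAAAAAAu, 0xBBBBBBBBu,
    0xCCCCCCCCu, 0xDDDDDDDDu, 0xEEEEEEEEu, 0xFFFFFFFFu
};

uint32_t crc_step(uint32_t crcu32, uint8_t b)
{
    /* The sub-expression from inside the brackets, given a name. */
    const uint32_t index = (crcu32 & 0xF) ^ (b & 0xF);

    /* What is left is one shift, one lookup and one XOR. */
    return (crcu32 >> 4) ^ s_crc32[index];
}
```

The result is unchanged from the one-line form; an optimising compiler would be expected to treat the extra object as free.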
[...]
When designing a new language, there are real advantages in strictly
imitating C's rules, just because so many programmers are familiar
with them.
Huh? - How that? - Are you saying here that practically only C-like
languages are in common use? - But even if so; there are quite a few
languages with differing precedence rules, not C-based, and without
such a flaw like the one being discussed. - When designing a *new*
language I'd certainly choose one of the sensible precedence rules,
and just without those obvious flaws. (And not use "C" as base, of
course.)
(It would have been silly for C++ or Objective-C to
change the precedence rules, even to improve them.) But there
are also real advantages in using precedence rules that are better
(e.g., simpler) than C's.
Or - with reference to that flaw - just more consistent.
Consistent systems are inherently simpler, in the sense of easier to understand and thus more straightforward to use. A precondition for
that is, as said, at least a basic understanding of such things.
It depends on the nature of the language.
It could be an interesting discussion for comp.lang.misc.
On 5/30/2026 6:52 AM, David Brown wrote:
On 29/05/2026 22:16, BGB wrote:
On 5/29/2026 6:22 AM, David Brown wrote:
On 29/05/2026 12:20, BGB wrote:
On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
On 2026-05-28 11:57, BGB wrote:
On 5/28/2026 2:18 AM, Janis Papanagnou wrote:[...]
On 2026-05-28 01:49, BGB wrote:
But, not really an "easy" way to avoid bloat, other than to write
code specifically for what cases are relevant; while also
avoiding needless duplication and copy paste (where, overuse of
copy/paste can also lead to bloat; along with turning the code
into an ugly mess).
Hmm.. - as said, during the very early days there were issues; I
recall on one platform duplication of template code in more than
one source unit. And/or some environmental hacks (of the compiler)
to deposit template code for linking. In the later days I've not
seen such immature things anymore.
Possibly, a lot could depend on how one is counting things as well.
In a lot of cases when using GCC, I end up using:
  -ffunction-sections -fdata-sections -Wl,-gc-sections
On many targets, "-fdata-sections" can lead to noticeably larger and
slower code because it effectively eliminates section anchor
optimisations. It does not negatively affect x86 AFAICS, because
x86 does not use section anchors.
<https://godbolt.org/z/zeoq41Y7d>
With -fsection-anchors (enabled with optimisation on targets that
support it - generally RISCy load/store architectures), program-
lifetime variables are kept together in a lump (as though they were
in a struct) and often addressed by a pointer to that pretend
struct. Thus if a function accesses two variables "a" and "b",
instead of having to load the addresses of each of "a" and "b" into
separate registers, it loads an "anchor" into one register and
accesses the variables with reg+offset addressing.
I've seen "-fdata-sections" used regularly in embedded systems - it
is almost always a bad idea.
("-ffunction-sections" is often very helpful to reduce code image
size, so keep that one.)
Both seem to help on x86, x86-64, and also on RISC-V, at making GCC's
output at least sorta space-comparable to my own compilers.
The merit of "-fdata-sections" is mostly that it eliminates unused
global variables; whereas "-ffunction-sections" eliminates
unreachable functions.
That is the point of them, yes. "-ffunction-sections" can be useful
at removing unused code from more general code. For microcontrollers,
SDK's and manufacturers' driver code will normally contain a large
number of functions that can be eliminated in this way, saving a lot
of code space.
However, in practice, "-fdata-sections" rarely eliminates a
significant amount - most programs do not have large amounts of
statically-allocated data that is not used. Gcc, and I think most
other compilers, put the static lifetime data for each translation
unit in its own section, so if no data from a translation unit is used
it will be eliminated at link time even with -fno-data-sections. And
of course it makes no difference for heap data or stack data.
The main place it makes a difference is global arrays from a translation unit that is included, but for functions that are not included.
Also functions with large static arrays.
void SomeFunc(void)
{
  static char buf[4096];
  ...
}
Where, say, eliminating SomeFunc does not necessarily eliminate buf.
In my testing, "-ffunction-sections" is absolutely worth using (on
targets where code space is relevant - there's no need for PC
software). On some targets, it may mean a few lost opportunities for
shorter jump/call instructions between functions in the same
translation unit, but the cost is rarely anything more than a slightly
longer link time. But "-fdata-sections" typically gives almost no ram
space savings, and makes code bigger and slower.
As I noted, gcc on x86 does not support section anchors, so there is
not likely to be much code cost for -fdata-sections.
Where section anchors shine - and where -fdata-sections therefore has
cost - is when a function needs to access more than one piece of
static lifetime data defined in the same translation unit (or another
translation unit if you are using LTO). That happens a lot in
embedded ARM programming at least. I don't know about RISC-V. If the
target normally uses a "small data section" for ram (I know this is
common on PowerPC), then there is, in effect, a program-wide section
anchor already. So it is possible that relatively few targets have
section anchors - but the 32-bit ARM on gcc is a vastly popular choice
in the embedded world, so it is important to understand the cost of
this compiler flag for that target at least.
It depends on the way it is built.
A lot of times though (for non-relocatable static-linked binaries) it
mostly tends to use AUIPC+LD or AUIPC+ST pairs to access global
variables. There is a Global Pointer that needs to be loaded when the
binary is started, unclear what it is used for exactly.
I don't tend to think of MSVC as a highly optimising compiler - but it
is not a tool I have much use for, as it does not handle the targets I
need. When I have sometimes looked at the generated code on godbolt,
it has not impressed me at all. So it could well fall into the
"helpful when using a weaker compiler" category.
Depends on what target I am building for:
  Windows Native: Typically MSVC
  WSL: Usually GCC or Clang
    Seems to have: GCC 13.2.0; Clang 18.1.3
    RISC-V GCC: Also 13.2.0 (also via WSL)
  Linux: Typically GCC
I rarely much use Cygwin anymore, as it was mostly rendered obsolete by
WSL (on Win10 or similar).
Though, Cygwin may still be relevant on Win7 or WinXP systems.
For BGBCC, it can build both on native Windows and on Linux/WSL (though recently noted that this build was broken, mostly by GCC and Clang being more pedantic about missing prototypes, and a few prototypes were being missed by my function-prototype mining tool). Went and fixed this, but haven't posted this yet.
As for optimizing in MSVC, yeah, it is in the area of not terrible, but
not super clever either.
If one expects the sort of high-level code-rewriting cleverness that GCC
or Clang often does, one will be disappointed.
But, sometimes, the main "heavy hitter" optimizations are things like constant-folding and register allocation, which it does do effectively.
Though, both MSVC and BGBCC seem to use one sort of strategy for
register allocation:
Static assign things to callee-save registers and use remaining
registers for dynamic allocation within basic-blocks. Variables with
finite non-overlapping lifetimes (that do not cross basic-block
boundaries) may potentially share a register (this more generally
applies to things like temporaries).
And, GCC and Clang use another: Assign dynamically but carry values
across basic-block boundaries along control-flow paths.
Both tend to give different patterns though, and seem to favor different types of code.
Curious...
(I could also note that I make heavy use of templates in C++ code - it
often leads to smaller and faster results.)
I had tended to use the "write everything one off for the task at hand" approach, but this is a higher-effort approach.
On 31/05/2026 00:43, Keith Thompson wrote:
C's operator precedence rules are complicated and arguably flawed.
They could have been defined differently. A simpler set of rules,
with fewer levels, *might* have been better. I don't have any
concrete suggestions -- nor do I have any strong preferences.
I accept C's rules as they are. I would accept them if they had
been defined differently.
Can't the compiler easily remove any parens that aren't necessary?
So - just write complex expressions in a way that a human can most
easily understand, it makes your intention clear and probably doesn't
increase the size of the executable.
On Sat, 30 May 2026 12:01:27 +0100, Bart wrote:
It doesn't need << >> to be in a distinct group from multiply or add
groups.
But it is also not clear because the part after >> is sprawling.
It's a counterexample to your claim that "<< >> [don't need] to be in
a distinct group", isn't it?
On 31/05/2026 10:12, Richard Harnden wrote:
On 31/05/2026 00:43, Keith Thompson wrote:
C's operator precedence rules are complicated and arguably flawed.
They could have been defined differently. A simpler set of rules,
with fewer levels, *might* have been better. I don't have any
concrete suggestions -- nor do I have any strong preferences.
I accept C's rules as they are. I would accept them if they had
been defined differently.
Can't the compiler easily remove any parens that aren't necessary?
So - just write complex expressions in a way that a human can most
easily understand, it makes your intention clear and probably doesn't
increase the size of the executable.
Of course. Parentheses do not affect the generated code unless they
affect the semantics of the expression. (Some people think parentheses
affect the order of evaluation, but that is not the case for most
compilers.)
Richard Harnden <richard.nospam@gmail.invalid> writes:
On 31/05/2026 00:43, Keith Thompson wrote:
C's operator precedence rules are complicated and arguably flawed.
They could have been defined differently. A simpler set of rules,
with fewer levels, *might* have been better. I don't have any
concrete suggestions -- nor do I have any strong preferences.
I accept C's rules as they are. I would accept them if they had
been defined differently.
Can't the compiler easily remove any parens that aren't necessary?
So - just write complex expressions in a way that a human can most
easily understand, it makes your intention clear and probably doesn't
increase the size of the executable.
Compilers generally remove *all* parens, necessary or not.
The output of a compiler is assembly or machine code. You almost
certainly can't tell from the generated code whether the input was,
for example, `a * b + c`, `(a * b) + c`, or `(((a) * (b)) + (c))`.
On 31/05/2026 10:49, David Brown wrote:
On 31/05/2026 10:12, Richard Harnden wrote:
On 31/05/2026 00:43, Keith Thompson wrote:
C's operator precedence rules are complicated and arguably flawed.
They could have been defined differently. A simpler set of rules,
with fewer levels, *might* have been better. I don't have any
concrete suggestions -- nor do I have any strong preferences.
I accept C's rules as they are. I would accept them if they had
been defined differently.
Can't the compiler easily remove any parens that aren't necessary?
So - just write complex expressions in a way that a human can most
easily understand, it makes your intention clear and probably doesn't
increase the size of the executable.
Of course. Parentheses do not affect the generated code unless they
affect the semantics of the expression. (Some people think
parentheses affect the order of evaluation,
They can do if they make an expression be parsed differently.
Do you have an example where they make no difference but people might
think they do?
On Sat, 30 May 2026 12:01:27 +0100, Bart wrote:
It doesn't beed << >> to be in a distinct group from multiply or add
groups.
But it is also not clear because the part after >> is sprawling.
It's a counterexample to your claim that "<< >> [don't need] to be in
a distinct group", isn't it?
Of course. Parentheses do not affect the generated code unless they
affect the semantics of the expression. (Some people think parentheses affect the order of evaluation, but that is not the case for most compilers.)
On 31/05/2026 12:10, Bart wrote:...
On 31/05/2026 10:49, David Brown wrote:
On 31/05/2026 10:12, Richard Harnden wrote:
On 31/05/2026 00:43, Keith Thompson wrote:
But you might consider "(a + b) + c" to be "parsed differently" than "a
+ (b + c)", because of how a particular compiler implements its parser.
It's possible that this results in different code for a particular
compiler, but there is no difference in the meaning for the C language.
Do you have
an example where they make no difference but people might think they do?
People might think they affect the order of evaluation, such as when you have function calls :
u = foo(x) + (foo(y) + foo(z));
Some people might think the use of parentheses means that "foo(y)" and "foo(z)" are called before "foo(x)", when the order of all these calls
(and the additions) is unspecified. (Again, a given compiler might be influenced by the parentheses, but the language does not require it.)
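A small sketch of the point about function calls. The logging helper and the names here are assumptions added for illustration; only the shape of the expression comes from the post. The parentheses fix which operands each + gets, but the language does not say which call runs first, so the contents of call_log may differ between compilers or optimisation levels.

```c
/* Hypothetical helper: foo() records the order in which it is called. */
int call_log[3];
int call_count = 0;

int foo(int v)
{
    call_log[call_count++] = v;
    return v;
}

int sum_with_parens(int x, int y, int z)
{
    call_count = 0;
    /* The grouping is fixed: foo(x) + (foo(y) + foo(z)).
       The order of the three calls is NOT fixed by the language. */
    return foo(x) + (foo(y) + foo(z));
}
```

The sum is the same whatever order is chosen; only the recorded order is allowed to vary, which is why code should not depend on it.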
On 2026-05-31 05:49, David Brown wrote:
...
Of course. Parentheses do not affect the generated code unless they
affect the semantics of the expression. (Some people think parentheses
affect the order of evaluation, but that is not the case for most
compilers.)
I assume that last sentence is meant to apply only to parentheses which
don't change the semantics? Otherwise it seems manifestly false.
On 2026-05-31 07:18, David Brown wrote:
On 31/05/2026 12:10, Bart wrote:...
On 31/05/2026 10:49, David Brown wrote:
On 31/05/2026 10:12, Richard Harnden wrote:
On 31/05/2026 00:43, Keith Thompson wrote:
But you might consider "(a + b) + c" to be "parsed differently" than "a
+ (b + c)", because of how a particular compiler implements its parser.
It's possible that this results in different code for a particular
compiler, but there is no difference in the meaning for the C language.
(a + b) + c mandates adding a to b, then adding the result to c. a + (b
+ c) mandates adding b to c then adding the result to a. As far as mathematics is concerned, that's the same thing, but in computer math it
can make a difference if one of the two results in overflow or
unnecessary loss of precision, and the other does not.
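A sketch of the loss-of-precision case, assuming IEEE 754 double with the default round-to-nearest mode (the function names are made up). The volatile intermediates are only there to force each partial sum to be rounded to double, so that wider registers cannot hide the effect.

```c
/* (a + b) + c: the parenthesised sum is formed first. */
double left_first(double a, double b, double c)
{
    volatile double t = a + b;
    volatile double r = t + c;
    return r;
}

/* a + (b + c): the other grouping of the same three operands. */
double right_first(double a, double b, double c)
{
    volatile double t = b + c;
    volatile double r = a + t;
    return r;
}
```

With a = 1e16, b = -1e16 and c = 1.0, the first grouping gives 1.0, while in the second the 1.0 is lost when it is added to -1e16, so the result is 0.0.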
...
Do you have
an example where they make no difference but people might think they do?
People might think they affect the order of evaluation, such as when you
have function calls :
u = foo(x) + (foo(y) + foo(z));
Some people might think the use of parentheses means that "foo(y)" and
"foo(z)" are called before "foo(x)", when the order of all these calls
(and the additions) is unspecified. (Again, a given compiler might be
influenced by the parentheses, but the language does not require it.)
You're correct with regard to the function calls, but the parenthesized addition must be performed first, and the other one second, which may
make a difference, for the same reasons given in my previous paragraph.
just write complex expressions in a way that a human can most
easily understand,
Usually, both sub-expressions of a binary operator will be evaluated
before the operator itself, simply because usually the results of the operator cannot be calculated until the sub-expression's values are
known. But this is not a requirement of the language - if the compiler
can get the same results without doing so, it is free to pick a
different order.
If an implementation provides additional semantics to signed integer arithmetic, such as saturating or trapping overflow, then signed integer arithmetic operations are no longer associative. But normal C undefined behaviour on overflow is fully associative (as is wrapping semantics,
for addition, subtraction and multiplication).
Richard Harnden <richard.nospam@gmail.invalid> writes:
just write complex expressions in a way that a human can most
easily understand,
Unfortunately, (1) different people have different ideas of what
writing is most easily understood, and (2) different readers have
different notions of which writings are easily understood, and
which writings are not so easily understood. To make things
worse "easily understood" is not a boolean condition, nor is it
necessarily well-ordered -- "most easily understood" isn't always
a well-defined quality, even for a given audience.
Sadly the idea of writing in a way that is "most easily understood"
has resulted in a race to the bottom, where writers are more and
more encouraged to take the view that (some) readers are pretty
much arbitrarily stupid, with the result that expressions become
littered with scads of unnecessary parentheses that actually
detract from ease of reading. Good writing is always a balance
between too much and too little.
On 30/05/2026 22:48, BGB wrote:
On 5/30/2026 6:52 AM, David Brown wrote:
On 29/05/2026 22:16, BGB wrote:
On 5/29/2026 6:22 AM, David Brown wrote:
On 29/05/2026 12:20, BGB wrote:
On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
On 2026-05-28 11:57, BGB wrote:
On 5/28/2026 2:18 AM, Janis Papanagnou wrote:[...]
On 2026-05-28 01:49, BGB wrote:
But, not really an "easy" way to avoid bloat, other than to
write code specifically for what cases are relevant; while also
avoiding needless duplication and copy paste (where, overuse of
copy/paste can also lead to bloat; along with turning the code
into an ugly mess).
Hmm.. - as said, during the very early days there were issues; I
recall on one platform duplication of template code in more than
one source unit. And/or some environmental hacks (of the compiler)
to deposit template code for linking. In the later days I've not
seen such immature things anymore.
Possibly, a lot could depend on how one is counting things as well.
In a lot of cases when using GCC, I end up using:
   -ffunction-sections -fdata-sections -Wl,-gc-sections
On many targets, "-fdata-sections" can lead to noticeably larger
and slower code because it effectively eliminates section anchor
optimisations. It does not negatively affect x86 AFAICS, because
x86 does not use section anchors.
<https://godbolt.org/z/zeoq41Y7d>
With -fsection-anchors (enabled with optimisation on targets that
support it - generally RISCy load/store architectures), program-
lifetime variables are kept together in a lump (as though they were
in a struct) and often addressed by a pointer to that pretend
struct. Thus if a function accesses two variables "a" and "b",
instead of having to load the addresses of each of "a" and "b" into
separate registers, it loads an "anchor" into one register and
accesses the variables with reg+offset addressing.
I've seen "-fdata-sections" used regularly in embedded systems - it
is almost always a bad idea.
("-ffunction-sections" is often very helpful to reduce code image
size, so keep that one.)
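The "pretend struct" description can be made concrete with a hand-written analogue. In this sketch all names are hypothetical. The first function reads three separate globals, which is the case where a compiler using section anchors can load one base address and use offsets. The second spells that layout out by hand with a real struct. Whether a given compiler and target actually emit anchor-based addressing is not something this code can show; it only illustrates the layout idea.

```c
/* Three separate program-lifetime variables in one translation unit. */
int a_val = 1;
int b_val = 2;
int c_val = 3;

int sum_separate(void)
{
    /* With section anchors, a compiler may address all three
       relative to a single base register. */
    return a_val + b_val + c_val;
}

/* The same data grouped by hand: one base address, fixed offsets. */
struct globals {
    int a, b, c;
};

struct globals g = { 1, 2, 3 };

int sum_grouped(void)
{
    const struct globals *base = &g;   /* plays the role of the anchor */
    return base->a + base->b + base->c;
}
```

Putting each variable in its own section, as -fdata-sections does, removes the guarantee that they sit next to each other, which is why the offset trick is no longer available.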
Both seem to help on x86, x86-64, and also on RISC-V, at making
GCC's output at least sorta space-comparable to my own compilers.
The merit of "-fdata-sections" is mostly that it eliminates unused
global variables; whereas "-ffunction-sections" eliminates
unreachable functions.
That is the point of them, yes. "-ffunction-sections" can be useful
at removing unused code from more general code. For
microcontrollers, SDK's and manufacturers' driver code will normally
contain a large number of functions that can be eliminated in this
way, saving a lot of code space.
However, in practice, "-fdata-sections" rarely eliminates a
significant amount - most programs do not have large amounts of
statically-allocated data that is not used. Gcc, and I think most
other compilers, put the static lifetime data for each translation
unit in its own section, so if no data from a translation unit is
used it will be eliminated at link time even with
-fno-data-sections. And of course it makes no difference for heap
data or stack data.
The main place it makes a difference is global arrays from a
translation unit that is included, but for functions that are not
included.
Also functions with large static arrays.
void SomeFunc()
{
    static char buf[4096];
    ...
}
Where, say, eliminating SomeFunc does not necessarily eliminate buf.
Yes, if you have such code but want to eliminate it, then
-fdata-sections would definitely benefit. I have not seen such code in
practice (at least not with very big static arrays, and that also was
not an essential part of the program). But of course I have only seen
a microscopic part of all C code written - if you come across this
sort of thing, then I appreciate your point.
(There are several ways to make this more "friendly" to builds that need
to be compact, such as putting the buffer and/or SomeFunc in a separate
file or giving it a specific section of its own.)
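One way the "specific section of its own" option might look, as a sketch only. It assumes GCC or Clang on an ELF target; the section name, the macro name, and the body of SomeFunc are invented for illustration. The buffer is moved to file scope and given its own section, so a link with section garbage collection can drop it together with SomeFunc even when the rest of the build does not use -fdata-sections.

```c
#include <stddef.h>
#include <string.h>

#if defined(__GNUC__) && defined(__ELF__)
/* GCC/Clang extension on ELF targets; the section name is made up. */
#define OWN_SECTION __attribute__((section(".bss.somefunc_buf")))
#else
#define OWN_SECTION /* no effect with other toolchains */
#endif

/* The large buffer, placed in a section that holds nothing else. */
static char somefunc_buf[4096] OWN_SECTION;

/* Illustrative body: copy a message into the buffer, truncating
   it if necessary, and return the number of bytes stored. */
size_t SomeFunc(const char *msg)
{
    size_t n = strlen(msg);
    if (n >= sizeof somefunc_buf)
        n = sizeof somefunc_buf - 1;
    memcpy(somefunc_buf, msg, n);
    somefunc_buf[n] = '\0';
    return n;
}
```

Moving the function and buffer into a separate source file achieves much the same thing without any compiler extension.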
In my testing, "-ffunction-sections" is absolutely worth using (on
targets where code space is relevant - there's no need for PC
software). On some targets, it may mean a few lost opportunities
for shorter jump/call instructions between functions in the same
translation unit, but the cost is rarely anything more than a
slightly longer link time. But "-fdata-sections" typically gives
almost no ram space savings, and makes code bigger and slower.
As I noted, gcc on x86 does not support section anchors, so there is
not likely to be much code cost for -fdata-sections.
Where section anchors shine - and where -fdata-sections therefore has
cost - is when a function needs to access more than one piece of
static lifetime data defined in the same translation unit (or another
translation unit if you are using LTO). That happens a lot in
embedded ARM programming at least. I don't know about RISC-V. If
the target normally uses a "small data section" for ram (I know this
is common on PowerPC), then there is, in effect, a program-wide
section anchor already. So it is possible that relatively few
targets have section anchors - but the 32-bit ARM on gcc is a vastly
popular choice in the embedded world, so it is important to
understand the cost of this compiler flag for that target at least.
It depends on the way it is built.
A lot of times though (for non-relocatable static-linked binaries) it
mostly tends to use AUIPC+LD or AUIPC+ST pairs to access global
variables. There is a Global Pointer that needs to be loaded when the
binary is started, unclear what it is used for exactly.
If you have a global pointer, then it will probably be used for
gp+offset access to global data, eliminating the need for section anchors.
I have not used RISC-V, and am not familiar with its details. I can see
from godbolt that when -fdata-sections is in action and you are loading
from static lifetime variables, the compiler generates instructions like
    lw a5, a_variable
    lw a4, b_variable
    lw a0, c_variable
When you do not have "-fdata-sections", it uses anchors :
    lla a4, .LANCHOR0
    lw a5, 0(a4)
    lw a3, 4(a4)
    lw a0, 8(a4)
From my (limited) understanding, RISC-V cannot use 32-bit absolute
addressing. So the "lw a5, a_variable" must be a pseudo-instruction -
using register + offset addressing. If there is a global pointer, then
presumably that is used here. Alternatively, the pseudo instruction
might assemble to two real instructions to support the 32-bit address.
I know both techniques are used in some targets, but don't know about
RISC-V.
Certainly it would surprise me if the "lw a5, a_variable" version were
more efficient than using anchors - otherwise why would gcc generate
code with anchors when given a free choice? (Perhaps gcc is not well
tuned for RISC-V code generation - I am wary of making too many
assumptions about the processor just from some simple compiler outputs.)
(clang does not, apparently, support section anchors as an optimisation
technique. Both with and without -fdata-sections, on RISC-V it first
uses two instructions to load ".L_MergedGlobals" into a register and
then uses that register plus offset to access data.)
I don't tend to think of MSVC as a highly optimising compiler - but
it is not a tool I have much use for, as it does not handle the
targets I need. When I have sometimes looked at the generated code
on godbolt, it has not impressed me at all. So it could well fall
into the "helpful when using a weaker compiler" category.
Depends on what target I am building for:
   Windows Native: Typically MSVC
   WSL: Usually GCC or Clang
     Seems to have: GCC 13.2.0; Clang 18.1.3
     RISC-V GCC: Also 13.2.0 (also via WSL)
   Linux: Typically GCC
I rarely much use Cygwin anymore, as it was mostly rendered obsolete
by WSL (on Win10 or similar).
Though, Cygwin may still be relevant on Win7 or WinXP systems.
Cygwin has its own wide range of complications. If you want to use gcc
targeting native Windows, msys2 and mingw-64 are probably your best bet,
either compiled natively under msys2 or as a cross-compile from Linux.
But don't place too much emphasis on my advice, as I very rarely compile
C or C++ code for Windows - most of my PC target (Linux or Windows)
coding is in Python.
For BGBCC, it can build both on native Windows and on Linux/WSL
(though recently noted that this build was broken, mostly by GCC and
Clang being more pedantic about missing prototypes, and a few
prototypes were being missed by my function-prototype mining tool).
Went and fixed this, but haven't posted this yet.
As for optimizing in MSVC, yeah, it is in the area of not terrible,
but not super clever either.
If one expects the sort of high-level code-rewriting cleverness that
GCC or Clang often does, one will be disappointed.
But, sometimes, the main "heavy hitter" optimizations are things like
constant-folding and register allocation, which it does do effectively.
Though, both MSVC and BGBCC seem to use one sort of strategy for
register allocation:
Static assign things to callee-save registers and use remaining
registers for dynamic allocation within basic-blocks. Variables with
finite non-overlapping lifetimes (that do not cross basic-block
boundaries) may potentially share a register (this more generally
applies to things like temporaries).
And, GCC and Clang use another: Assign dynamically but carry values
across basic-block boundaries along control-flow paths.
Both tend to give different patterns though, and seem to favor
different types of code.
[...]
Curious...
(I could also note that I make heavy use of templates in C++ code -
it often leads to smaller and faster results.)
I had tended to use the "write everything one off for the task at
hand" approach, but this is a higher-effort approach.
A lot of code tends to fall into the category of shuffling data around
or doing simple checks or conversions. It's also common to have wrapper
functions for libraries to get something nicer, safer and more
convenient than some API that belongs in the early 1990's. Good C++
templates (and sometimes even good macros in C) can make the use of
these things far nicer, and most of the code that the templates appear
to generate inline in the caller disappears in optimisation.
On 30/05/2026 13:29, Dan Cross wrote:
In article <10vd1tu$ekvl$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 29/05/2026 21:56, Keith Thompson wrote:
[snip]
Upthread, you asked a question:
And then the point becomes, if you always add the parentheses, what
was the point of having that particular precedence level?
You've made it clear that you were never interested in an answer.
You said this:
"You're asking why C is designed the way it is. We could waste a
great deal of time and effort answering that for you. There are
numerous documents about the design and history of C, and of
its ancestor languages. I could provide you with links."
Actually I'm not asking why C is like that. We're already there.
I'm saying that there is no value in those extra levels, some people
think there is, and I'm arguing about that. I was replying to tTh.
As for my question, what /is/ the point? I'm still waiting!
To clarify: the question is, what is the point of those levels?
How is that different from asking "why C is like that"?
My question is actually independent of C or its history.
I accept those levels exist. I was asking do they currently serve a
useful purpose.
If not, people can choose to ignore them when writing C code, for
example like this where all () are technically superfluous:
crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
And they can choose to not adopt them when devising new languages,
however many still do faithfully recreate the same pattern, with a few
notable exceptions such as Go lang.
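The claim that every pair of parentheses in that line is superfluous can be checked with a sketch like the following. The table contents here are arbitrary placeholders, not the real CRC table, and the function names are invented; only the two spellings of the expression matter.

```c
#include <stdint.h>

/* Placeholder 16-entry table: the values are arbitrary and exist
   only so that the two spellings can be compared. */
const uint32_t s_crc32[16] = {
    0x00000000u, 0x11111111u, 0x22222222u, 0x33333333u,
    0x44444444u, 0x55555555u, 0x66666666u, 0x77777777u,
    0x88888888u, 0x99999999u, 0xAAAAAAAAu, 0xBBBBBBBBu,
    0xCCCCCCCCu, 0xDDDDDDDDu, 0xEEEEEEEEu, 0xFFFFFFFFu
};

/* The line as posted, with explicit parentheses. */
uint32_t step_with_parens(uint32_t crcu32, uint8_t b)
{
    return (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
}

/* The same line relying on precedence alone:
   >> binds tighter than ^, and & binds tighter than ^. */
uint32_t step_without_parens(uint32_t crcu32, uint8_t b)
{
    return crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];
}
```

Some compilers warn about the second form when warnings are enabled, which is itself a hint about how many readers are expected to know these levels by heart.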
On 31/05/2026 17:04, Tim Rentsch wrote:
Richard Harnden <richard.nospam@gmail.invalid> writes:
just write complex expressions in a way that a human can most
easily understand,
Unfortunately, (1) different people have different ideas of what
writing is most easily understood, and (2) different readers have
different notions of which writings are easily understood, and
which writings are not so easily understood. To make things
worse "easily understood" is not a boolean condition, nor is it
necessarily well-ordered -- "most easily understood" isn't always
a well-defined quality, even for a given audience.
Sadly the idea of writing in a way that is "most easily understood"
has resulted in a race to the bottom, where writers are more and
more encouraged to take the view that (some) readers are pretty
much arbitrarily stupid, with the result that expressions become
littered with scads of unnecessary parentheses that actually
detract from ease of reading. Good writing is always a balance
between too much and too little.
Actual examples of too many parentheses?
On 5/31/2026 4:14 AM, David Brown wrote:
On 30/05/2026 22:48, BGB wrote:
On 5/30/2026 6:52 AM, David Brown wrote:
On 29/05/2026 22:16, BGB wrote:
On 5/29/2026 6:22 AM, David Brown wrote:
On 29/05/2026 12:20, BGB wrote:
On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
On 2026-05-28 11:57, BGB wrote:
On 5/28/2026 2:18 AM, Janis Papanagnou wrote:[...]
On 2026-05-28 01:49, BGB wrote:
Also functions with large static arrays.
void SomeFunc()
{
    static char buf[4096];
    ...
}
Where, say, eliminating SomeFunc does not necessarily eliminate buf.
Yes, if you have such code but want to eliminate it, then
-fdata-sections would definitely benefit. I have not seen such code in
practice (at least not with very big static arrays, and that also was
not an essential part of the program). But of course I have only seen
a microscopic part of all C code written - if you come across this
sort of thing, then I appreciate your point.
(There are several ways to make this more "friendly" to builds that
need to be compact, such as putting the buffer and/or SomeFunc in a
separate file or giving it a specific section of its own.)
I have seen this pattern sometimes, though usually in "medium old" code, with newer code more often assuming that the stack is really big and so
can handle putting 1MB or more in a local array. Though, this is not
great on a target which doesn't have a huge stack.
In my case, I usually had 128K as the default stack size in my project.
Where section anchors shine - and where -fdata-sections therefore
has cost - is when a function needs to access more than one piece of
static lifetime data defined in the same translation unit (or
another translation unit if you are using LTO). That happens a lot
in embedded ARM programming at least. I don't know about RISC-V.
If the target normally uses a "small data section" for ram (I know
this is common on PowerPC), then there is, in effect, a program-wide
section anchor already. So it is possible that relatively few
targets have section anchors - but the 32-bit ARM on gcc is a vastly
popular choice in the embedded world, so it is important to
understand the cost of this compiler flag for that target at least.
It depends on the way it is built.
A lot of times though (for non-relocatable static-linked binaries) it
mostly tends to use AUIPC+LD or AUIPC+ST pairs to access global
variables. There is a Global Pointer that needs to be loaded when the
binary is started, unclear what it is used for exactly.
If you have a global pointer, then it will probably be used for
gp+offset access to global data, eliminating the need for section
anchors.
I have not used RISC-V, and am not familiar with its details. I can
see from godbolt that when -fdata-sections is in action and you are
loading from static lifetime variables, the compiler generates
instructions like
    lw a5, a_variable
    lw a4, b_variable
    lw a0, c_variable
When you do not have "-fdata-sections", it uses anchors :
    lla a4, .LANCHOR0
    lw a5, 0(a4)
    lw a3, 4(a4)
    lw a0, 8(a4)
From my (limited) understanding, RISC-V cannot use 32-bit absolute
addressing. So the "lw a5, a_variable" must be a pseudo-instruction -
using register + offset addressing. If there is a global pointer,
then presumably that is used here. Alternatively, the pseudo
instruction might assemble to two real instructions to support the
32-bit address. I know both techniques are used in some targets, but
don't know about RISC-V.
It can use one of two strategies for these (after breaking up
pseudo-instructions):
   LUI    a5, HiAddr      //Abs32, Low 2GB only
   LW     a5, LoAddr(a5)
Or:
   AUIPC  a5, HiAddr      //PC-Rel
   LW     a5, LoAddr(a5)
IIRC, LLA is similar, just using an ADDI as the second instruction.
But, yeah, the latter sequence would be more efficient.
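How an address splits into the HiAddr and LoAddr parts can be sketched in C. The function names are made up. The sketch relies on LUI supplying the upper 20 bits and the load or ADDI immediate being a sign-extended 12-bit value, which is why the upper part has to be rounded up whenever the low 12 bits are 0x800 or more.

```c
#include <stdint.h>

/* Low 12 bits of addr, sign-extended, as used for a load/ADDI
   immediate. */
int32_t lo12(uint32_t addr)
{
    int32_t lo = (int32_t)(addr & 0xFFFu);
    return lo >= 0x800 ? lo - 0x1000 : lo;
}

/* Upper 20 bits, adjusted so that (hi20 << 12) + lo12 == addr
   (modulo 2^32). */
uint32_t hi20(uint32_t addr)
{
    return ((addr + 0x800u) >> 12) & 0xFFFFFu;
}

/* What the two-instruction sequence computes as its final address. */
uint32_t rebuild(uint32_t addr)
{
    return (hi20(addr) << 12) + (uint32_t)lo12(addr);
}
```

For the AUIPC form the same split applies, just to the offset from the program counter rather than to the absolute address.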
I would expect something different if building with -fPIC or -fPIE, but
this depends on if it is a version of GCC built with support for these
(if using a version of GCC built for non-hosted targets, it ignores
these). Where, one effectively needs different GCC builds for bare-metal (like OS kernels) and for hosted Linux development, for whatever bizarre reason...
Certainly it would surprise me if the "lw a5, a_variable" version were
more efficient than using anchors - otherwise why would gcc generate
code with anchors when given a free choice? (Perhaps gcc is not well
tuned for RISC-V code generation - I am wary of making too many
assumptions about the processor just from some simple compiler outputs.)
It is not, it is a 2-op sequence usually.
Plain RISC-V has a bigger problem with 64-bit constants though,
generally needs to either load these from memory (more typical in GCC)
or build them in-place (which needs roughly 6 instructions in RISC-V).
Say (possible, but GCC doesn't do this):
   LUI   t0, valHiA
   LUI   t1, valHiB
   ADDI  t0, t0, valLoA
   ADDI  t1, t1, valLoB
   SLLI  t1, t1, 32
   ADD   a0, t0, t1
In my case, I have extensions for RV that can turn a lot of this stuff
into single instructions (albeit with larger 8 and 12 byte encodings).
In some cases, it can save bytes, for example:
   LW   a1, Disp33s(a0)
As a 64-bit / 8-byte encoding, vs:
   LUI  t0, DispHi
   ADD  t0, t0, a0
   LW   a1, DispLo(t0)
Needing 12 bytes.
My own (more drastic) extensions can save more, by having a few Disp16
instructions, which can access 256K or 512K past GP within a single
32-bit instruction.
But, if/when any of this would end up in mainline RISC-V is uncertain.
Weirdly, there is a lot more emphasis there on big/fancy features (with
niche applicability), rather than on smaller things that can improve the
properties of the base ISA (and that could more generally benefit nearly
all code built for the ISA).
Cygwin has its own wide range of complications. If you want to use
gcc targeting native Windows, msys2 and mingw-64 are probably your
best bet, either compiled natively under msys2 or as a cross-compile
from Linux. But don't place too much emphasis on my advice, as I very
rarely compile C or C++ code for Windows - most of my PC target (Linux
or Windows) coding is in Python.
Yes, I had used MinGW for a while, before mostly moving over to MSVC for native Windows.
The tradeoff is mostly:
MinGW is closer to native for Windows;
Cygwin could give a closer approximation of Linux on Windows, so one can build a lot of Linux software and use "./configure" scripts and similar.
But, as noted, Cygwin's role was mostly displaced by WSL, which
effectively runs a Linux userland on Windows.
There was WSL1, which basically mapped Linux syscalls over to the
Windows kernel, and WSL2, which runs the Linux kernel in a VM.
Though, in my case I was using WSL1 as seemingly MS had decided that my
PC can't do virtualization (and sees it as necessary for WSL2), even
despite having a CPU that can do so, and it is enabled in the BIOS.
On 2026-05-31 11:35, David Brown wrote:
...
Usually, both sub-expressions of a binary operator will be evaluated
before the operator itself, simply because usually the results of the
operator cannot be calculated until the sub-expression's values are
known. But this is not a requirement of the language
"The value computations of the operands of an operator are sequenced
before the value computation of the result of the operator." (6.5.1p3)
- if the compiler
can get the same results without doing so, it is free to pick a
different order.
Correct - but "same results" is crucial; it allows you to invoke the
"as-if" rule. Otherwise, the sequencing specified by 6.5.1p3 must be
honored.
...
If an implementation provides additional semantics to signed integer
arithmetic, such as saturating or trapping overflow, then signed integer
arithmetic operations are no longer associative. But normal C undefined
behaviour on overflow is fully associative (as is wrapping semantics,
for addition, subtraction and multiplication).
I don't follow that. I believe that overflow is guaranteed for (5 +
INT_MAX) + INT_MIN, and completely avoided by 5 + (INT_MAX + INT_MIN),
which differ only by association. Are you saying they both have the
same chance of overflowing?
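The two groupings can be examined without ever executing an overflowing addition. This is a sketch with made-up function names; it assumes a two's complement int, so that INT_MAX + INT_MIN is -1.

```c
#include <limits.h>
#include <stdbool.h>

/* True if a + b would overflow int.  The check itself performs no
   arithmetic that can overflow. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    if (b < 0)
        return a < INT_MIN - b;
    return false;
}

/* The grouping that avoids overflow as written in the source. */
int safe_grouping(void)
{
    return 5 + (INT_MAX + INT_MIN);
}
```

Here add_would_overflow(5, INT_MAX) is true, so the first step of (5 + INT_MAX) + INT_MIN already has undefined behaviour, while neither step of 5 + (INT_MAX + INT_MIN) overflows.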
On 31/05/2026 18:46, James Kuyper wrote:...
On 2026-05-31 11:35, David Brown wrote:
If an implementation provides additional semantics to signed integer
arithmetic, such as saturating or trapping overflow, then signed integer
arithmetic operations are no longer associative. But normal C undefined
behaviour on overflow is fully associative (as is wrapping semantics,
for addition, subtraction and multiplication).
I don't follow that. I believe that overflow is guaranteed for (5 +
INT_MAX) + INT_MIN, and completely avoided by 5 + (INT_MAX + INT_MIN),
which differ only by association. Are you saying they both have the
same chance of overflowing?
No - I see now what you are saying. Overflow is never guaranteed to do anything, including to exist, because it is UB. So the compiler can
happily treat "(5 + INT_MAX) + INT_MIN" as though you had written "5 + (INT_MAX + INT_MIN)". It can freely re-arrange an expression like this
that has a potential overflow into one without risk of overflow, as long
as the same results are given for all values that do not overflow. (The overflow is not part of the observable behaviour.) But it cannot
re-arrange the other way unless it knows that intermediary overflows
have no effect. (And the compiler usually does know this.)
On 31/05/2026 16:24, James Kuyper wrote:[...]
On 2026-05-31 07:18, David Brown wrote:
People might think they affect the order of evaluation, such as when you
have function calls :
u = foo(x) + (foo(y) + foo(z));
Some people might think the use of parentheses means that "foo(y)" and
"foo(z)" are called before "foo(x)", when the order of all these calls
(and the additions) is unspecified. (Again, a given compiler might be
influenced by the parentheses, but the language does not require it.)
You're correct with regard to the function calls, but the
parenthesized addition must be performed first, and the other one
second, which may make a difference, for the same reasons given in my
previous paragraph.
The parentheses do not dictate the order of evaluation. But you are
correct - and it's worth pointing out, so thank you for doing that -
that for floating point operations, the grouping of operations can
affect the result.
If you are talking about floating point arithmetic (I was thinking of
integer arithmetic, but did not specify), then the operations are not necessarily commutative or associative, and the compiler cannot then re-arrange the operations unless it knows that doing so does not
affect the result.
But except for specific cases, the order of evaluation - both for the
values and side-effects - of sub-expressions is unspecified. Indeed,
they are unsequenced - the evaluations can interleave.
Usually, both sub-expressions of a binary operator will be evaluated
before the operator itself, simply because usually the results of the operator cannot be calculated until the sub-expression's values are
known. But this is not a requirement of the language - if the
compiler can get the same results without doing so, it is free to pick
a different order. "(a + b) * 0" does not need to evaluate "a", "b",
or "a + b" at all unless there is a possibility of a side-effect - and
it can perform the side-effects in any order. "a + (b + c)" can check
"a" for a trap representation and deal with that before looking at "b"
and "c" or the results of "b + c", even though it cannot (for floating
point operations) re-arrange the code to do "a + b" first.
In [...] early C, `|` and `&` were logical operators. The
short-circuiting `||` and `&&` came later, but the low
precedence for `|` and `&` was already baked in.
That's the point: the precedence reflects the original use as
boolean operators, not how things evolved for use almost purely
as bitwise operators.
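The practical consequence of that low precedence is the familiar mask-test trap. A minimal sketch, with a made-up flag value and function names:

```c
#define FLAG 0x4

/* Written without parentheses: == binds tighter than &, so this is
   x & (FLAG == 0), which is x & 0 for any nonzero FLAG. */
int test_as_written(int x)
{
    return x & FLAG == 0;
}

/* What was almost certainly meant: is the FLAG bit clear in x? */
int test_as_intended(int x)
{
    return (x & FLAG) == 0;
}
```

For x = 2 the intended test yields 1, because the flag bit is clear, while the unparenthesised version yields 0.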
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In [...] early C, `|` and `&` were logical operators. The
short-circuiting `||` and `&&` came later, but the low
precedence for `|` and `&` was already baked in.
That's the point: the precedence reflects the original use as
boolean operators, not how things evolved for use almost purely
as bitwise operators.
Surely even in pre-K&R C the & and | operators were used for
bitwise-and and bitwise-or as well as logical connectors.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In [...] early C, `|` and `&` were logical operators. The
short-circuiting `||` and `&&` came later, but the low
precedence for `|` and `&` was already baked in.
That's the point: the precedence reflects the original use as
boolean operators, not how things evolved for use almost purely
as bitwise operators.
Surely even in pre-K&R C the & and | operators were used for
bitwise-and and bitwise-or as well as logical connectors.
They were used for both [...]
How about when you intend it to be: '(a << b) + c'?
On Sun, 31 May 2026 10:59:42 +0100, Bart wrote:
How about when you intend it to be: '(a << b) + c'?
I gave real-world examples of the usage that you asked for, how about
you do the same?
On 31/05/2026 17:04, Tim Rentsch wrote:
Richard Harnden <richard.nospam@gmail.invalid> writes:
just write complex expressions in a way that a human can most
easily understand,
Unfortunately, (1) different people have different ideas of what
writing is most easily understood, and (2) different readers have
different notions of which writings are easily understood, and
which writings are not so easily understood. To make things
worse "easily understood" is not a boolean condition, nor is it
necessarily well-ordered -- "most easily understood" isn't always
a well-defined quality, even for a given audience.
Sadly the idea of writing in a way that is "most easily understood"
has resulted in a race to the bottom, where writers are more and
more encouraged to take the view that (some) readers are pretty
much arbitrarily stupid, with the result that expressions become
littered with scads of unnecessary parentheses that actually
detract from ease of reading. Good writing is always a balance
between too much and too little.
Actual examples of too many parentheses?
And then there is ?: :
a > b ? c : d # (a>b)?c:d
a + b ? c : d # (a+b)?c:d
The grouping of the first is probably what is intended. But in the
second, the intent might have been (a+b)?c:d, or a+(b?c:d); we don't
know for sure that the author didn't make a mistake, or we don't know
ourselves.
On 2026-05-31 16:24, David Brown wrote:
On 31/05/2026 18:46, James Kuyper wrote:...
On 2026-05-31 11:35, David Brown wrote:
If an implementation provides additional semantics to signed integer
arithmetic, such as saturating or trapping overflow, then signed integer
arithmetic operations are no longer associative. But normal C undefined
behaviour on overflow is fully associative (as is wrapping semantics,
for addition, subtraction and multiplication).
I don't follow that. I believe that overflow is guaranteed for (5 +
INT_MAX) + INT_MIN, and completely avoided by 5 + (INT_MAX + INT_MIN),
which differ only by association. Are you saying they both have the
same chance of overflowing?
No - I see now what you are saying. Overflow is never guaranteed to do
anything, including to exist, because it is UB.
I only meant that overflow was guaranteed, and that the behavior was
therefore guaranteed to be undefined. I didn't mean to imply that any
particular behavior was guaranteed.
So the compiler can happily treat "(5 + INT_MAX) + INT_MIN" as though
you had written "5 + (INT_MAX + INT_MIN)". It can freely re-arrange an
expression like this that has a potential overflow into one without risk
of overflow, as long as the same results are given for all values that do
not overflow. (The overflow is not part of the observable behaviour.)
But it cannot re-arrange the other way unless it knows that intermediary
overflows have no effect. (And the compiler usually does know this.)
That's what I was mainly concerned about - if I've carefully arranged to
make sure that overflow is impossible, I'd be rather upset by a compiler
which, because "normal C undefined behaviour on overflow is fully
associative", rearranges the associations in my code to make overflow
possible.
I interpreted that comment as meaning that "whether or not the
behavior is undefined is fully associative". I guess that what you
actually meant was "if the behavior is undefined, the compiler is free
to rearrange the associations".
David Brown <david.brown@hesbynett.no> writes:
On 31/05/2026 16:24, James Kuyper wrote:[...]
On 2026-05-31 07:18, David Brown wrote:
People might think they affect the order of evaluation, such as when you
have function calls :
u = foo(x) + (foo(y) + foo(z));
Some people might think the use of parentheses means that "foo(y)" and
"foo(z)" are called before "foo(x)", when the order of all these calls
(and the additions) is unspecified. (Again, a given compiler might be
influenced by the parentheses, but the language does not require it.)
You're correct with regard to the function calls, but the
parenthesized addition must be performed first, and the other one
second, which may make a difference, for the same reasons given in my
previous paragraph.
The parentheses do not dictate the order of evaluation. But you are
correct - and it's worth pointing out, so thank you for doing that -
that for floating point operations, the grouping of operations can
affect the result.
The parentheses do not dictate the order of evaluation *of the
operands*. Each "+" can be evaluated (the addition performed)
only after the values of its operands are known. But regardless
of parentheses or operator precedence, the three operands foo(x),
foo(y), and foo(z) can be evaluated in any of 6 possible orders.
(It's different when you have operations like "&&", "||", and ",",
which impose additional sequence points.)
If you are talking about floating point arithmetic (I was thinking of
integer arithmetic, but did not specify), then the operations are not
necessarily commutative or associative, and the compiler cannot then
re-arrange the operations unless it knows that doing so does not
affect the result.
It's not just floating-point. Signed integer overflow is also relevant.
(INT_MIN + INT_MAX) + 1 is well defined. INT_MIN + INT_MAX + 1
is equivalent, and is also well defined. INT_MIN + (INT_MAX + 1)
has undefined behavior.
But except for specific cases, the order of evaluation - both for the
values and side-effects - of sub-expressions is unspecified. Indeed,
they are unsequenced - the evaluations can interleave.
Usually, both sub-expressions of a binary operator will be evaluated
before the operator itself, simply because usually the results of the
operator cannot be calculated until the sub-expression's values are
known. But this is not a requirement of the language - if the
compiler can get the same results without doing so, it is free to pick
a different order. "(a + b) * 0" does not need to evaluate "a", "b",
or "a + b" at all unless there is a possibility of a side-effect - and
it can perform the side-effects in any order. "a + (b + c)" can check
"a" for a trap representation and deal with that before looking at "b"
and "c" or the results of "b + c", even though it cannot (for floating
point operations) re-arrange the code to do "a + b" first.
Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
required to do so, and (INT_MAX + 1) * 0 still has undefined
behavior. Undefined behavior is determined by the rules of the
abstract machine *without* any adjustments permitted by the as-if
rule.
On 01/06/2026 00:54, Keith Thompson wrote:[...]
(INT_MIN + INT_MAX) + 1 is well defined. INT_MIN + INT_MAX + 1
is equivalent, and is also well defined. INT_MIN + (INT_MAX + 1)
has undefined behavior.
Compilers can re-arrange integer arithmetic, despite new overflows, if
they know the result is the same. On pretty much any current
processor, a compiler generating code for integer "a + b + c" could do
the additions in any order - treating the operations as commutative
and fully associative. The final result will be the same in every
case where the original expression did not overflow (i.e., every case
with defined behaviour).
On 31/05/2026 19:11, Bart wrote:[...]
Actual examples of too many parentheses?
Any source code written in LISP :-)
(And for too few parentheses, any source code in Forth.)
From a quick grep of an SDK in a project I am working on, I saw this
example :
if ((((pData1 == NULL) || (pData2 == NULL))) || (Length == 0U))
The number of parentheses there is so high it's hard to see that not
only is there an unnecessary extra parentheses for the first ||
operator, but there is a second set of extra parentheses around
it. Eliminating these would give :
if ((pData1 == NULL) || (pData2 == NULL) || (Length == 0U))
or, with an extra space for clarity,
if ( (pData1 == NULL) || (pData2 == NULL) || (Length == 0U) )
That still leaves extra parentheses around the equality operators, but
the decision to keep or remove them is subjective (as is the choice of "pData1 == NULL" vs. "!pData1").
And yes, these really are the names of the macro in this code.
#define CONVERTARGB88882ARGB4444(Color) \
((((Color & 0xFFU) >> 4) & 0xFU) |\
(((((Color & 0xFF00U) >> 8) >> 4) & 0xFU) << 4) |\
(((((Color & 0xFF0000U) >> 16) >> 4) & 0xFU) << 8) | \
(((((Color & 0xFF000000U) >> 24) >> 4) & 0xFU) << 12))
#define CONVERTRGB5652ARGB8888(Color) \
(((((((Color >> 11) & 0x1FU) * 527) + 23) >> 6) << 16) |\
((((((Color >> 5) & 0x3FU) * 259) + 33) >> 6) << 8) |\
((((Color & 0x1FU) * 527) + 23) >> 6) | 0xFF000000)
Bart <bc@freeuk.com> writes:
[snip]
Actual examples of too many parentheses?
The point of my comment is that either too many or too few is a
subjective judgment, not an objective one.
And then there is ?: :
a > b ? c : d # (a>b)?c:d
a + b ? c : d # (a+b)?c:d
The grouping of the first is probably what is intended. But in the
second, the intent might have been (a+b)?c:d, or a+(b?c:d); we don't
know for sure that the author didn't make a mistake, or we don't know
ourselves.
This example is so addlebrained that it's hard to imagine anyone
being confused about it. Or that it's worth any expenditure of
thought wondering what to do about people who are.
On 01/06/2026 03:10, Tim Rentsch wrote:
Bart <bc@freeuk.com> writes:
[snip]
Actual examples of too many parentheses?
The point of my comment is that either too many or too few is a
subjective judgment, not an objective one.
My point was that it could be objective, at least for too many. So (a*a)
+ (b*b) would be commonly agreed to have too many, and I was extending
that to other examples in computing.
On 31/05/2026 19:11, Bart wrote:
[snip]
Actual examples of too many parentheses?
Any source code written in LISP :-)
(And for too few parentheses, any source code in Forth.)
From a quick grep of an SDK in a project I am working on, I saw this
example :
    if ((((pData1 == NULL) || (pData2 == NULL))) || (Length == 0U))
The number of parentheses there is so high it's hard to see that not
only is there an unnecessary extra pair of parentheses for the first ||
operator, but there is a second set of extra parentheses around it.
Eliminating these would give :
    if ((pData1 == NULL) || (pData2 == NULL) || (Length == 0U))
or, with an extra space for clarity,
    if ( (pData1 == NULL) || (pData2 == NULL) || (Length == 0U) )
That still leaves extra parentheses around the equality operators, but
the decision to keep or remove them is subjective (as is the choice of
"pData1 == NULL" vs. "!pData1").
But IMHO, the original line had at least two sets of completely
redundant and unhelpful parentheses which made it harder to read - the
reader is left wondering whether these parentheses are there for a
purpose and have an effect on what should have been a simple and clear
expression.
The SDK also contains examples of parentheses used because it mixes
relatively rare operators (shifts and binary operators). Parentheses
around such sub-expressions are not uncommon, and can definitely be
helpful, but the quantity here makes things hard to read. Ironically,
though it is a macro, there are not "safety" parentheses around the
argument in the expression.
And yes, these really are the names of the macros in this code.
#define CONVERTARGB88882ARGB4444(Color) \
    ((((Color & 0xFFU) >> 4) & 0xFU) |\
    (((((Color & 0xFF00U) >> 8) >> 4) & 0xFU) << 4) |\
    (((((Color & 0xFF0000U) >> 16) >> 4) & 0xFU) << 8) | \
    (((((Color & 0xFF000000U) >> 24) >> 4) & 0xFU) << 12))
#define CONVERTRGB5652ARGB8888(Color) \
    (((((((Color >> 11) & 0x1FU) * 527) + 23) >> 6) << 16) |\
    ((((((Color >> 5) & 0x3FU) * 259) + 33) >> 6) << 8) |\
    ((((Color & 0x1FU) * 527) + 23) >> 6) | 0xFF000000)
It can be argued that the parentheses themselves are not the problem
here - it is doing too much in one expression. Static inline functions
would make things clearer, as would a separation of the steps of
breaking down the original colour format into parts, scaling or
conversions, then building up the new colour format. Different named
types for the different formats would go a long way towards usability
and safety - at least using typedefs, but preferably using structs to
make real different types. And surely nicer names could have been found!
David Brown <david.brown@hesbynett.no> writes:
On 31/05/2026 19:11, Bart wrote:[...]
Actual examples of too many parentheses?
Any source code written in LISP :-)
(And for too few parentheses, any source code in Forth.)
From a quick grep of an SDK in a project I am working on, I saw this
example :
if ((((pData1 == NULL) || (pData2 == NULL))) || (Length == 0U))
The number of parentheses there is so high it's hard to see that not
only is there an unnecessary extra parentheses for the first ||
operator, but there is a second set of extra parentheses around
it. Eliminating these would give :
if ((pData1 == NULL) || (pData2 == NULL) || (Length == 0U))
or, with an extra space for clarity,
if ( (pData1 == NULL) || (pData2 == NULL) || (Length == 0U) )
That still leaves extra parentheses around the equality operators, but
the decision to keep or remove them is subjective (as is the choice of
"pData1 == NULL" vs. "!pData1").
Yeah, I'd write that as
if (pData1 == NULL || pData2 == NULL || Length == 0U)
The fact that || binds more loosely than == is one of those things
that I arbitrarily find sufficiently intuitive.
[...]
And yes, these really are the names of the macro in this code.
#define CONVERTARGB88882ARGB4444(Color) \
((((Color & 0xFFU) >> 4) & 0xFU) |\
(((((Color & 0xFF00U) >> 8) >> 4) & 0xFU) << 4) |\
(((((Color & 0xFF0000U) >> 16) >> 4) & 0xFU) << 8) | \
(((((Color & 0xFF000000U) >> 24) >> 4) & 0xFU) << 12))
#define CONVERTRGB5652ARGB8888(Color) \
(((((((Color >> 11) & 0x1FU) * 527) + 23) >> 6) << 16) |\
((((((Color >> 5) & 0x3FU) * 259) + 33) >> 6) << 8) |\
((((Color & 0x1FU) * 527) + 23) >> 6) | 0xFF000000)
In a macro definition, I'd parenthesize each occurrence of Color,
in case the argument is a more complicated expression, as well as parenthesizing the entire definition (the latter was done here).
The rest of the parentheses feel excessive, but I frankly can't be
bothered to figure out which can be omitted without hurting clarity.
On 01/06/2026 08:52, David Brown wrote:
On 31/05/2026 19:11, Bart wrote:
[snip]
Actual examples of too many parentheses?
Any source code written in LISP :-)
(And for too few parentheses, any source code in Forth.)
From a quick grep of an SDK in a project I am working on, I saw this
example :
    if ((((pData1 == NULL) || (pData2 == NULL))) || (Length == 0U))
The number of parentheses there is so high it's hard to see that not
only is there an unnecessary extra parentheses for the first ||
operator, but there is a second set of extra parentheses around it.
Eliminating these would give :
    if ((pData1 == NULL) || (pData2 == NULL) || (Length == 0U))
or, with an extra space for clarity,
    if ( (pData1 == NULL) || (pData2 == NULL) || (Length == 0U) )
That still leaves extra parentheses around the equality operators, but
the decision to keep or remove them is subjective (as is the choice of
"pData1 == NULL" vs. "!pData1").
Maybe it's due to || being a symbol; compare:
    if (pData1 == NULL || pData2 == NULL || Length == 0U)
    if (pData1 == NULL or pData2 == NULL or Length == 0U)
To me, || seems to draw in the terms on either side as strongly as ==.
That happens less using 'or'.
(Both are valid C if using iso646.h.)
But IMHO, the original line had at least two sets of completely
redundant and unhelpful parentheses which made it harder to read - the
reader is left wondering whether these parentheses are there for a
purpose and have an effect on what should have been a simple and clear
expression.
The pattern seems to be '(((a || b)) || c)' so maybe the author
didn't understand that || is parsed LTR anyway.
The SDK also contains examples of parentheses used because it mixes
relatively rare operators (shifts and binary operators). Parentheses
around such sub-expressions are not uncommon, and can definitely be
helpful, but the quantity here makes things hard to read. Ironically,
though it is a macro, there are not "safety" parentheses around the
argument in the expression.
And yes, these really are the names of the macro in this code.
#define CONVERTARGB88882ARGB4444(Color) \
    ((((Color & 0xFFU) >> 4) & 0xFU) |\
    (((((Color & 0xFF00U) >> 8) >> 4) & 0xFU) << 4) |\
    (((((Color & 0xFF0000U) >> 16) >> 4) & 0xFU) << 8) | \
    (((((Color & 0xFF000000U) >> 24) >> 4) & 0xFU) << 12))
#define CONVERTRGB5652ARGB8888(Color) \
    (((((((Color >> 11) & 0x1FU) * 527) + 23) >> 6) << 16) |\
    ((((((Color >> 5) & 0x3FU) * 259) + 33) >> 6) << 8) |\
    ((((Color & 0x1FU) * 527) + 23) >> 6) | 0xFF000000)
It can be argued that the parentheses themselves are not the problem
here - it is doing too much in one expression. Static inline
functions would make things clearer, as would a separation of the
steps of breaking down the original colour format into parts, scaling
or conversions, then building up the new colour format. Different
named types for the different formats would go a long way towards
usability and safety - at least using typedefs, but preferably using
structs to make real different types. And surely nicer names could
have been found!
Your examples actually look reasonable. In fact, it could probably do
with more parentheses around 'Color'... (I've just seen you've already
mentioned this!)
The first part of the second has to apply 6 operations to 'Color' in
strict LTR order. Using parentheses ensures not having to worry about precedence, since the ops are '>> & * + >> <<'
The macro names seem self-explanatory too, although they could do with
some underscores.
But anything involving macros probably doesn't count; you expect () to
be heavily used in the expansion.
This is an example from Lua:
    op_arith(L, l_addi, luai_numadd);
On the face of it, perfectly reasonable. But it expands to this:
{TValue*v1=(&((base+(((void)0),((((int)((((i)>>((((0+7)+8)+1)))& ((~((~(Instruction)0)<<(8)))<<(0))))))))))->val);TValue*v2=(&(( base+(((void)0),((((int)((((i)>>(((((0+7)+8)+1)+8)))&((~((~( Instruction)0)<<(8)))<<(0))))))))))->val);{StkId ra=(base+(((int) ((((i)>>((0+7)))&((~((~(Instruction)0)<<(8)))<<(0)))))));if(((((v1) )->tt_)==(((3)|((0)<<4))))&&((((v2))->tt_)==(((3)|((0)<<4))))){
lua_Integer i1=(((void)0),(((v1)->value_).i));lua_Integer i2=(((void) 0),(((v2)->value_).i));pc++;{TValue*io=((&(ra)->val));((io)->value_) .i=(((lua_Integer)(((lua_Unsigned)(i1))+((lua_Unsigned)(i2)))));((io) ->tt_=(((3)|((0)<<4))));};}else{lua_Number n1;lua_Number n2;if((((((v1)) ->tt_)==(((3)|((1)<<4))))?((n1)=(((void)0),(((v1)->value_).n)),1):((((( v1))->tt_)==(((3)|((0)<<4))))?((n1)=((lua_Number)(((((void)0),(((v1)-> value_).i))))),1):0))&&(((((v2))->tt_)==(((3)|((1)<<4))))?((n2)=(((void) 0),(((v2)->value_).n)),1):(((((v2))->tt_)==(((3)|((0)<<4))))?((n2)=(( lua_Number)(((((void)0),(((v2)->value_).i))))),1):0))){pc++;{TValue* io=((&(ra)->val));((io)->value_).n=(((n1)+(n2)));((io)->tt_=(((3)| ((1)<<4))));};}};};};
(I had fun debugging this at one time in my compiler. I've no idea how
the original developer did so.)
Not too many () in the macro definitions, but I can only see the top
level; here deeply nested macros are used.
In article <10vjdn8$22tgu$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 31/05/2026 19:11, Bart wrote:
[snip]
Actual examples of too many parentheses?
Any source code written in LISP :-)
Hey now. Some of us have programmed in Lisp professionally, and
rather enjoy it.
Lisp is often maligned for its parentheses; I don't think that's
fair. They really aren't that onerous once you start working in
it, and they're unambiguous; one may think of the structure of Lisp
code as a shorthand notation for the resulting program's AST.
(And for too few parentheses, any source code in Forth.)
No comment.
From a quick grep of an SDK in a project I am working on, I saw this
example :
if ((((pData1 == NULL) || (pData2 == NULL))) || (Length == 0U))
The number of parentheses there is so high it's hard to see that not
only is there an unnecessary extra parentheses for the first ||
operator, but there is a second set of extra parentheses around it.
Eliminating these would give :
if ((pData1 == NULL) || (pData2 == NULL) || (Length == 0U))
or, with an extra space for clarity,
if ( (pData1 == NULL) || (pData2 == NULL) || (Length == 0U) )
That still leaves extra parentheses around the equality operators, but
the decision to keep or remove them is subjective (as is the choice of
"pData1 == NULL" vs. "!pData1").
But IMHO, the original line had at least two sets of completely
redundant and unhelpful parentheses which made it harder to read - the
reader is left wondering whether these parentheses are there for a
purpose and have an effect on what should have been a simple and clear
expression.
I see code like this all the time; usually it comes from
hardware vendors (I take it this was from a BSP or something
similar?). I often wonder about vendor programming standards
when I run across things like it.
The SDK also contains examples of parentheses used because it mixes
relatively rare operators (shifts and binary operators). Parentheses
around such sub-expressions are not uncommon, and can definitely be
helpful, but the quantity here makes things hard to read. Ironically,
though it is a macro, there are not "safety" parentheses around the
argument in the expression.
And yes, these really are the names of the macro in this code.
#define CONVERTARGB88882ARGB4444(Color) \
((((Color & 0xFFU) >> 4) & 0xFU) |\
(((((Color & 0xFF00U) >> 8) >> 4) & 0xFU) << 4) |\
(((((Color & 0xFF0000U) >> 16) >> 4) & 0xFU) << 8) | \
(((((Color & 0xFF000000U) >> 24) >> 4) & 0xFU) << 12))
#define CONVERTRGB5652ARGB8888(Color) \
(((((((Color >> 11) & 0x1FU) * 527) + 23) >> 6) << 16) |\
((((((Color >> 5) & 0x3FU) * 259) + 33) >> 6) << 8) |\
((((Color & 0x1FU) * 527) + 23) >> 6) | 0xFF000000)
It can be argued that the parentheses themselves are not the problem
here - it is doing too much in one expression. Static inline functions
would make things clearer, as would a separation of the steps of
breaking down the original colour format into parts, scaling or
conversions, then building up the new colour format. Different named
types for the different formats would go a long way towards usability
and safety - at least using typedefs, but preferably using structs to
make real different types. And surely nicer names could have been found!
Not to mention symbolic names for the magic constants. :-/
This is exactly the sort of thing that, as you point out, a
`static inline` function is far better suited for. Some code
bases don't want to use them for a variety of reasons, usually
compatibility concerns with older code, compilers, or language
standards. Some variants of Unix, for instance, worry about
header compatibility with C90 [and in some cases K&R C] code.
On 01/06/2026 13:04, Dan Cross wrote:
In article <10vjdn8$22tgu$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 31/05/2026 19:11, Bart wrote:
[snip]
Actual examples of too many parentheses?
Any source code written in LISP :-)
Hey now. Some of us have programmed in Lisp professionally, and
rather enjoy it.
Lisp is often maligned for its parentheses; I don't think that's
fair. They really aren't that onerous once you start working in
it, and they're unambiguous; one may think of the structure of Lisp
code as a shorthand notation for the resulting program's AST.
I did include a smiley - I know there are people here who enjoy working
with LISP, and have probably heard a few too many jokes about parentheses!
(And for too few parentheses, any source code in Forth.)
No comment.
From a quick grep of an SDK in a project I am working on, I saw this
example :
if ((((pData1 == NULL) || (pData2 == NULL))) || (Length == 0U))
The number of parentheses there is so high it's hard to see that not
only is there an unnecessary extra set of parentheses for the first ||
operator, but there is a second set of extra parentheses around it.
Eliminating these would give :
if ((pData1 == NULL) || (pData2 == NULL) || (Length == 0U))
or, with an extra space for clarity,
if ( (pData1 == NULL) || (pData2 == NULL) || (Length == 0U) )
That still leaves extra parentheses around the equality operators, but
the decision to keep or remove them is subjective (as is the choice of
"pData1 == NULL" vs. "!pData1").
But IMHO, the original line had at least two sets of completely
redundant and unhelpful parentheses which made it harder to read - the
reader is left wondering whether these parentheses are there for a
purpose and have an effect on what should have been a simple and clear
expression.
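A minimal sketch to check that the extra parentheses change nothing, assuming the SDK line is wrapped in a function. The function names and the unsigned type of Length are made up for illustration; only the condition itself comes from the post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The SDK form, with both redundant layers of parentheses kept. */
bool invalid_original(const void *pData1, const void *pData2, unsigned Length)
{
    return ((((pData1 == NULL) || (pData2 == NULL))) || (Length == 0U));
}

/* No optional parentheses at all: == binds tighter than ||. */
bool invalid_minimal(const void *pData1, const void *pData2, unsigned Length)
{
    return pData1 == NULL || pData2 == NULL || Length == 0U;
}

/* Walk all eight combinations of null/non-null pointers and zero/non-zero
   length and check that both forms give the same answer. */
void check_invalid_forms_agree(void)
{
    int dummy = 0;
    for (int i = 0; i < 8; i++) {
        const void *p1 = (i & 1) ? &dummy : NULL;
        const void *p2 = (i & 2) ? &dummy : NULL;
        const unsigned len = (i & 4) ? 16U : 0U;
        assert(invalid_original(p1, p2, len) == invalid_minimal(p1, p2, len));
    }
}
```

Whether to keep the parentheses around the == terms remains a style choice; the check only shows that the two outer layers carry no meaning.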
I see code like this all the time; usually it comes from
hardware vendors (I take it this was from a BSP or something
similar?). I often wonder about vendor programming standards
when I run across things like it.
Yes, this was from a hardware vendor (who shall remain nameless to
protect the guilty - not that I have found other vendors to be much
better). They have a tendency to be obsessed with MISRA, with sticking
to C90, and with filling headers with huge Doxygen templates giving no
information and obscuring the code. (I'm fine with Doxygen comments
that actually add useful information, but not a dozen lines repeating
the names and types from a function signature.)
The SDK also contains examples of parentheses used because it mixes
relatively rare operators (shifts and binary operators). Parentheses
around such sub-expressions are not uncommon, and can definitely be
helpful, but the quantity here makes things hard to read. Ironically,
though it is a macro, there are not "safety" parentheses around the
argument in the expression.
And yes, these really are the names of the macro in this code.
#define CONVERTARGB88882ARGB4444(Color) \
((((Color & 0xFFU) >> 4) & 0xFU) |\
(((((Color & 0xFF00U) >> 8) >> 4) & 0xFU) << 4) |\
(((((Color & 0xFF0000U) >> 16) >> 4) & 0xFU) << 8) | \
(((((Color & 0xFF000000U) >> 24) >> 4) & 0xFU) << 12))
#define CONVERTRGB5652ARGB8888(Color) \
(((((((Color >> 11) & 0x1FU) * 527) + 23) >> 6) << 16) |\
((((((Color >> 5) & 0x3FU) * 259) + 33) >> 6) << 8) |\
((((Color & 0x1FU) * 527) + 23) >> 6) | 0xFF000000)
It can be argued that the parentheses themselves are not the problem
here - it is doing too much in one expression. Static inline functions
would make things clearer, as would a separation of the steps of
breaking down the original colour format into parts, scaling or
conversions, then building up the new colour format. Different named
types for the different formats would go a long way towards usability
and safety - at least using typedefs, but preferably using structs to
make real different types. And surely nicer names could have been found!
Not to mention symbolic names for the magic constants. :-/
Names for magic constants can be good, but they are not always helpful -
if the magic number is only used once, its definition is far from its
use, and it is polluting the global name space, then it can be a lot
better to simply use the number directly and add a comment at the point
of use. But the shift-and-mask constants could be replaced by either a
struct with bit-fields, or inline functions for field extractions, or at
least separate local variables for the extracted fields.
This is exactly the sort of thing that, as you point out, a
`static inline` function is far better suited for. Some code
bases don't want to use them for a variety of reasons, usually
compatibility concerns with older code, compilers, or language
standards. Some variants of Unix, for instance, worry about
header compatibility with C90 [and in some cases K&R C] code.
Indeed. But even if they don't want to use "inline", a static function
is better - the compiler will do the inlining anyway (if it makes sense
according to its heuristics).
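A rough sketch of what the "separate types plus small functions" suggestion could look like for the CONVERTARGB88882ARGB4444 macro. The struct names and the helper top_nibble are invented for illustration, and the code assumes C99 (<stdint.h> and inline), which a strictly C90 code base would not have.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrapper types: distinct structs, so the compiler rejects
   passing one pixel format where the other is expected. */
typedef struct { uint32_t value; } argb8888_t;
typedef struct { uint16_t value; } argb4444_t;

/* Keep the top 4 bits of the 8-bit channel that starts at bit `shift`. */
static inline uint32_t top_nibble(uint32_t color, unsigned shift)
{
    return (color >> (shift + 4U)) & 0xFU;
}

/* Same result as the macro: each 8-bit channel is truncated to 4 bits. */
static inline argb4444_t argb8888_to_argb4444(argb8888_t in)
{
    const uint32_t blue  = top_nibble(in.value, 0U);
    const uint32_t green = top_nibble(in.value, 8U);
    const uint32_t red   = top_nibble(in.value, 16U);
    const uint32_t alpha = top_nibble(in.value, 24U);
    const argb4444_t out = {
        (uint16_t)(blue | (green << 4) | (red << 8) | (alpha << 12))
    };
    return out;
}

void demo_argb8888_to_argb4444(void)
{
    /* Opaque orange: A=0xFF, R=0xFF, G=0x80, B=0x00. */
    const argb8888_t opaque_orange = { 0xFFFF8000U };
    assert(argb8888_to_argb4444(opaque_orange).value == 0xFF80U);
}
```

Because the argument is an ordinary function parameter, the missing "safety" parentheses around the macro argument stop being an issue as well.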
These are more or less real examples, I just simplified the
terms. Here are some from MZLIB:
return (status == MZ_OK) ? MZ_BUF_ERROR : status;
return (pL == pE) ? (l_len < r_len) : (l < r);
sym = (match_dist < 512) ? s0 : s1;
return ((pState->m_last_status == TINFL_STATUS_DONE) &&
(!pState->m_dict_avail)) ? MZ_STREAM_END : MZ_OK;
I believe that in the first three, all parentheses are superfluous, but
they are used anyway. Why is that?
(My preferences for ?: are that the whole thing is syntax, outside of
the precedence scheme, and that it has mandatory parentheses. That
second line would then look like this:
return (pL == pE ? l_len < r_len : l < r);
There are fewer parentheses in all, and less potential confusion. You
can even have assignments in each branch; they will not interfere with
?:.)
On 01/06/2026 08:52, David Brown wrote:[...]
That still leaves extra parentheses around the equality operators,
but the decision to keep or remove them is subjective (as is the
choice of "pData1 == NULL" vs. "!pData1").
Maybe it's due to || being a symbol; compare:
if (pData1 == NULL || pData2 == NULL || Length == 0U)
if (pData1 == NULL or pData2 == NULL or Length == 0U)
To me, || seems to draw in the terms on either side as strongly as
==. That happens less using 'or'.
(Both are valid C if using iso646.h.)
Bart <bc@freeuk.com> writes:
On 01/06/2026 08:52, David Brown wrote:[...]
That still leaves extra parentheses around the equality operators,
but the decision to keep or remove them is subjective (as is the
choice of "pData1 == NULL" vs. "!pData1").
Maybe it's due to || being a symbol; compare:
if (pData1 == NULL || pData2 == NULL || Length == 0U)
if (pData1 == NULL or pData2 == NULL or Length == 0U)
To me, || seems to draw in the terms on either side as strongly as
==. That happens less using 'or'.
(Both are valid C if using iso646.h.)
The "or" macro in <iso646.h> is exactly equivalent to "||".
If your intuition tells you they have different precedences, that
could be a problem.
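For reference, <iso646.h> defines `or` as a macro expanding to `||` (and `and` as `&&`), so the two spellings of the test parse identically. A small sketch, with made-up function names, that checks this over every combination of inputs:

```c
#include <assert.h>
#include <iso646.h>
#include <stdbool.h>
#include <stddef.h>

/* The test written with the symbolic operator. */
bool invalid_symbolic(const void *pData1, const void *pData2, unsigned Length)
{
    return pData1 == NULL || pData2 == NULL || Length == 0U;
}

/* The same test using the <iso646.h> macro; after preprocessing the
   two function bodies contain the same tokens. */
bool invalid_named(const void *pData1, const void *pData2, unsigned Length)
{
    return pData1 == NULL or pData2 == NULL or Length == 0U;
}

void check_named_matches_symbolic(void)
{
    int dummy = 0;
    for (int i = 0; i < 8; i++) {
        const void *p1 = (i & 1) ? &dummy : NULL;
        const void *p2 = (i & 2) ? &dummy : NULL;
        const unsigned len = (i & 4) ? 1U : 0U;
        assert(invalid_named(p1, p2, len) == invalid_symbolic(p1, p2, len));
    }
}
```

Any difference between the two is purely visual; the precedence relative to == is the same in both.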
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
The "or" macro in <iso646.h> is exactly equivalent to "||".
I don't think so.
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[...]
I vaguely recall that there's some language that uses the ?: syntax
for the conditional operator, but with a different precedence and/or
associativity than C. I can't remember which language it is.
The language I was thinking of is PHP. C's ?: operator associates right-to-left, which makes it possible to write chained conditional expressions like:
cond1 ? expr1 :
cond2 ? expr2 :
cond3 ? expr3 :
default_expr
PHP's ?: operator originally associated left-to-right.
Newer versions of PHP require parentheses.
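A small C illustration of the chained form that right-to-left associativity allows. The function name and the thresholds are invented for the example.

```c
#include <assert.h>
#include <string.h>

/* Because ?: associates right-to-left, each ':' branch can hold the next
   conditional without parentheses; the chain reads like if / else if. */
const char *size_class(int n)
{
    return n < 10   ? "small"  :
           n < 100  ? "medium" :
           n < 1000 ? "large"  :
                      "huge";
}

void demo_size_class(void)
{
    assert(strcmp(size_class(5), "small") == 0);
    assert(strcmp(size_class(50), "medium") == 0);
    assert(strcmp(size_class(500), "large") == 0);
    assert(strcmp(size_class(5000), "huge") == 0);
}
```

With left-to-right associativity the same text would group as ((c1 ? e1 : c2) ? e2 : c3) ? e3 : d, which is rarely what the writer meant.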
In article <10vjsg2$259m3$3@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 01/06/2026 13:04, Dan Cross wrote:
In article <10vjdn8$22tgu$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 31/05/2026 19:11, Bart wrote:
[snip]
Actual examples of too many parentheses?
I see code like this all the time; usually it comes from
hardware vendors (I take it this was from a BSP or something
similar?). I often wonder about vendor programming standards
when I run across things like it.
Yes, this was from a hardware vendor (who shall remain nameless to
protect the guilty - not that I have found other vendors to be much
better). They have a tendency to be obsessed with MISRA, with sticking
to C90, and with filling headers with huge Doxygen templates giving no
information and obscuring the code. (I'm fine with Doxygen comments
that actually add useful information, but not a dozen lines repeating
the names and types from a function signature.)
Yes. I see all of this, and it mystifies me; I have seen how
excessive abstraction can lead to opaque code, but many times
hardware people go in the opposite direction, and one hardly
ever sees useful abstraction; for example, often the same code
sequence could be trivially extracted into a function, but it is
instead repeated multiple times, inline.
Not to mention symbolic names for the magic constants. :-/
Names for magic constants can be good, but they are not always helpful -
if the magic number is only used once, its definition is far from its
use, and it is polluting the global name space, then it can be a lot
better to simply use the number directly and add a comment at the point
of use. But the shift-and-mask constants could be replaced by either a
struct with bit-fields, or inline functions for field extractions, or at
least separate local variables for the extracted fields.
I don't mind some magic: the shift constants and the masks, for
instance, are fine. But the magic 527, 259, 23, and 33, and why
the subsequent values are shifted right by 6, could be better
explained by naming those constants.
Btw, with respect to this specific algorithm, I looked them up,
and they seem to be empirically discovered lore, though derived
from a relatively standard algorithm for projection of a
discrete value into a larger space. This stack overflow page
has some details: https://stackoverflow.com/questions/2442576/how-does-one-convert-16-bit-rgb565-to-24-bit-rgb888
Anyway, I don't think the constants have to be defined far away
from the code; I'd be happy with a local `const uint32_t FOO`,
though in this case it should probably just be a comment.
Here's my offering:
#include <stdint.h>

// Converts a 16-bit RGB16 (5-6-5) value to an ARGB32
// ("ARGB8888") value.
static inline uint32_t
rgb16_to_argb(uint16_t color)
{
    const uint32_t blue5  = (color >>  0) & 0x1F;
    const uint32_t green6 = (color >>  5) & 0x3F;
    const uint32_t red5   = (color >> 11) & 0x1F;
    // Map from a 5 or 6 bit space into an 8 bit space.  A
    // 5-bit number has 32 possibilities; a 6 bit number
    // has 64.  To calculate the projected 8-bit value v_8
    // for a k-bit number v, we can use the formula
    // v_8 = (v*(2^8-1) + (2^k-1)/2)/(2^k-1), i.e.
    // (v*255 + 15)/31 (for k=5) or (v*255 + 31)/63 (for
    // k=6).
    //
    // To replace the division (by 31 or 63) with a
    // shift, the constants below were empirically
    // discovered to generate good results.  See
    // https://stackoverflow.com/questions/2442576/how-does-one-convert-16-bit-rgb565-to-24-bit-rgb888
    // for details.
    const uint32_t blue  = (blue5 * 527 + 23) >> 6;
    const uint32_t green = (green6 * 259 + 33) >> 6;
    const uint32_t red   = (red5 * 527 + 23) >> 6;
    const uint32_t alpha = 0xFF000000;
    return blue | (green << 8) | (red << 16) | alpha;
}
It's longer, yes, but I'd argue it's much easier to understand.
On my compiler, it generates almost identical code, except that
some instructions are in a different order.
This is exactly the sort of thing that, as you point out, a
`static inline` function is far better suited for. Some code
bases don't want to use them for a variety of reasons, usually
compatibility concerns with older code, compilers, or language
standards. Some variants of Unix, for instance, worry about
header compatibility with C90 [and in some cases K&R C] code.
Indeed. But even if they don't want to use "inline", a static function
is better - the compiler will do the inlining anyway (if it makes sense
according to its heuristics).
Assuming the compiler they're working with is known to do so,
then I agree.
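Since the inputs are only 5 or 6 bits wide, the claim that the shift constants reproduce the rounded division can be checked exhaustively. A sketch, with helper names invented for the purpose:

```c
#include <assert.h>
#include <stdint.h>

/* Reference: v * 255 / max rounded to nearest, in integer arithmetic. */
static uint32_t scale_by_division(uint32_t v, uint32_t max)
{
    return (v * 255U + max / 2U) / max;
}

/* The multiply-add-shift forms used in the macro and in the function. */
static uint32_t scale5_by_shift(uint32_t v) { return (v * 527U + 23U) >> 6; }
static uint32_t scale6_by_shift(uint32_t v) { return (v * 259U + 33U) >> 6; }

/* Returns the number of inputs on which the two methods disagree. */
unsigned count_scaling_mismatches(void)
{
    unsigned mismatches = 0;
    for (uint32_t v = 0; v < 32; v++)
        if (scale5_by_shift(v) != scale_by_division(v, 31U))
            mismatches++;
    for (uint32_t v = 0; v < 64; v++)
        if (scale6_by_shift(v) != scale_by_division(v, 63U))
            mismatches++;
    return mismatches;
}

void demo_scaling_agrees(void)
{
    assert(count_scaling_mismatches() == 0);
    assert(scale5_by_shift(31U) == 255U);   /* full scale stays full scale */
    assert(scale6_by_shift(63U) == 255U);
}
```

A check like this could sit in a unit test next to the conversion function, which arguably documents the magic numbers better than naming them would.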
On 01/06/2026 19:48, Dan Cross wrote:
It's longer, yes, but I'd argue it's much easier to understand.
On my compiler, it generates almost identical code, except that
some instructions are in a different order.
The speed probably isn't that important. This can be table-driven: you
use those formulae once to populate some tables (and with the shifts
built-in). Then the routine can be simplified to this:
  uint32_t rgb16_to_argb_bc(uint16_t color) {
      const uint32_t blue5  = (color >>  0) & 0x1F;
      const uint32_t green6 = (color >>  5) & 0x3F;
      const uint32_t red5   = (color >> 11) & 0x1F;
      return bluetab[blue5] | greentab[green6] | redtab[red5] |
             0xFF000000;
  }
On a test I did (one billion conversions cycling over 1M precalculated
random 16-bit numbers), the table version was twice as fast. Maybe a bit
faster if the Alpha value is pre-added to the red-table.
(Results were merely summed, but if writing into a new buffer, then
memory access is probably more dominant.)
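The post does not show how the tables are filled, so the following is only a guess at one way to do it. The table names bluetab, greentab and redtab come from the post; init_rgb16_tables and the other function names are made up, and the alpha value is pre-added to the red table as suggested.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t bluetab[32], greentab[64], redtab[32];

/* Fill the tables once, with each channel already shifted into place. */
void init_rgb16_tables(void)
{
    for (uint32_t v = 0; v < 32; v++) {
        const uint32_t v8 = (v * 527U + 23U) >> 6;
        bluetab[v] = v8;
        redtab[v]  = (v8 << 16) | 0xFF000000U;   /* alpha pre-added */
    }
    for (uint32_t v = 0; v < 64; v++)
        greentab[v] = ((v * 259U + 33U) >> 6) << 8;
}

/* Table-driven conversion: three lookups and two ORs per pixel. */
uint32_t rgb16_to_argb_table(uint16_t color)
{
    return bluetab[color & 0x1F] |
           greentab[(color >> 5) & 0x3F] |
           redtab[(color >> 11) & 0x1F];
}

/* Direct computation, kept only to cross-check the tables. */
uint32_t rgb16_to_argb_direct(uint16_t color)
{
    const uint32_t blue  = (((color >>  0) & 0x1FU) * 527U + 23U) >> 6;
    const uint32_t green = (((color >>  5) & 0x3FU) * 259U + 33U) >> 6;
    const uint32_t red   = (((color >> 11) & 0x1FU) * 527U + 23U) >> 6;
    return blue | (green << 8) | (red << 16) | 0xFF000000U;
}

/* Compares both versions over all 65536 inputs. */
void check_tables_match_direct(void)
{
    init_rgb16_tables();
    for (uint32_t c = 0; c <= 0xFFFFU; c++)
        assert(rgb16_to_argb_table((uint16_t)c) ==
               rgb16_to_argb_direct((uint16_t)c));
}
```

The tables take 512 bytes in total with 32-bit entries; whether that beats two multiplies per channel will depend on the target and on what else competes for cache.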
On 02/06/2026 00:11, Keith Thompson wrote:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[...]
I vaguely recall that there's some language that uses the ?: syntax
for the conditional operator, but with a different precedence and/or
associativity than C. I can't remember which language it is.
The language I was thinking of is PHP. C's ?: operator associates
right-to-left, which makes it possible to write chained conditional
expressions like:
cond1 ? expr1 :
cond2 ? expr2 :
cond3 ? expr3 :
default_expr
PHP's ?: operator originally associated left-to-right.
Newer versions of PHP require parentheses.
I thought you were thinking of C++, where ? has the same precedence as
assignment, while in C it has higher precedence. It does not make a
lot of difference, and if you are writing an expression where it
matters, then I think parentheses would be a good idea.
<https://cppreference.com/c/language/operator_precedence>
<https://cppreference.com/cpp/language/operator_precedence>
On 01/06/2026 22:39, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 01/06/2026 08:52, David Brown wrote:[...]
That still leaves extra parentheses around the equality operators,
but the decision to keep or remove them is subjective (as is the
choice of "pData1 == NULL" vs. "!pData1").
Maybe it's due to || being a symbol; compare:
      if (pData1 == NULL || pData2 == NULL || Length == 0U)
      if (pData1 == NULL or pData2 == NULL or Length == 0U)
To me, || seems to draw in the terms on either side as strongly as
==. That happens less using 'or'.
(Both are valid C if using iso646.h.)
[...]
I'm not saying that, just that having named operators helps to
separate that expression into three groups better than a symbolic operator.
At least for me.
[...]
(Digression: I hate the fact that such a long and sometimes
informative thread has such a stupid subject header.)
David Brown <david.brown@hesbynett.no> writes:
On 02/06/2026 00:11, Keith Thompson wrote:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[...]
I vaguely recall that there's some language that uses the ?: syntax
for the conditional operator, but with a different precedence and/or
associativity than C. I can't remember which language it is.
The language I was thinking of is PHP. C's ?: operator associates
right-to-left, which makes it possible to write chained conditional
expressions like:
cond1 ? expr1 :
cond2 ? expr2 :
cond3 ? expr3 :
default_expr
PHP's ?: operator originally associated left-to-right.
Newer versions of PHP require parentheses.
I thought you were thinking of C++, where ? has the same precedence as
assignment, while in C it has higher precedence. It does not make a
lot of difference, and if you are writing an expression where it
matters, then I think parentheses would be a good idea.
<https://cppreference.com/c/language/operator_precedence>
<https://cppreference.com/cpp/language/operator_precedence>
Hmm. I'm not sure I either follow or trust those tables.
Looking at the grammar in the C++ standard, there is a difference.
C has:
conditional-expression:
logical-OR-expression
logical-OR-expression ? expression : conditional-expression
while C++ has:
conditional-expression:
logical-or-expression
logical-or-expression ? expression : assignment-expression
But the difference isn't mentioned in the Compatibility annex of the C++ standard.
I'd be interested in seeing a conditional expression whose legality or semantics differs between C and C++.
(Digression: I hate the fact that such a long and sometimes
informative thread has such a stupid subject header.)
[...]
Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
required to do so, and (INT_MAX + 1) * 0 still has undefined
behavior. Undefined behavior is determined by the rules of the
abstract machine *without* any adjustments permitted by the as-if
rule.
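One way to stay inside defined behaviour is to test for overflow before doing the addition, so that no overflowing expression is ever evaluated. A minimal sketch; the function names are invented for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Stores a + b in *sum and returns true, or returns false without
   touching *sum if the mathematical result does not fit in an int.
   The comparisons themselves cannot overflow. */
bool checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}

/* (a + b) * 0 computed without relying on the compiler folding it to 0. */
bool sum_times_zero(int a, int b, int *result)
{
    int sum;
    if (!checked_add(a, b, &sum))
        return false;
    *result = sum * 0;
    return true;
}

void demo_sum_times_zero(void)
{
    int r = -1;
    assert(sum_times_zero(1, 2, &r) && r == 0);
    assert(!sum_times_zero(INT_MAX, 1, &r));   /* rejected, not evaluated */
}
```

The point of the sketch is that the check has to be written in terms the abstract machine defines; an optimiser removing the addition later does not make the original expression well defined.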
On 02/06/2026 11:07, Keith Thompson wrote:
David Brown <david.brown@hesbynett.no> writes:
On 02/06/2026 00:11, Keith Thompson wrote:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[...]
I vaguely recall that there's some language that uses the ?: syntax
for the conditional operator, but with a different precedence and/or
associativity than C.  I can't remember which language it is.
The language I was thinking of is PHP.  C's ?: operator associates
right-to-left, which makes it possible to write chained conditional
expressions like:
      cond1 ? expr1 :
      cond2 ? expr2 :
      cond3 ? expr3 :
      default_expr
PHP's ?: operator originally associated left-to-right.
Newer versions of PHP require parentheses.
I thought you were thinking of C++, where ? has the same precedence as
assignment, while in C it has higher precedence.  It does not make a
lot of difference, and if you are writing an expression where it
matters, then I think parentheses would be a good idea.
<https://cppreference.com/c/language/operator_precedence>
<https://cppreference.com/cpp/language/operator_precedence>
Hmm.  I'm not sure I either follow or trust those tables.
cppreference.com is normally very accurate - it is linked from the
isocpp.org website and AFAIUI maintained or checked by people involved
in the C++ standards.  Mistakes here are definitely something that
should be taken seriously.
Looking at the grammar in the C++ standard, there is a difference.
C has:
    conditional-expression:
        logical-OR-expression
        logical-OR-expression ? expression : conditional-expression
while C++ has:
    conditional-expression:
        logical-or-expression
        logical-or-expression ? expression : assignment-expression
But the difference isn't mentioned in the Compatibility annex of the C++
standard.
I'd be interested in seeing a conditional expression whose legality or
semantics differs between C and C++.
There is a little information in the "discussion" page of the C++ side
linked above.  An example is
    true ? a : b = 7;
In C, the ternary operator has higher precedence than assignment and
this therefore parses as :
    (true ? a : b) = 7;
In C, the ternary operator does not return an lvalue, so this is a
constraint error.
In C++, the precedence of ternary and assignment are the same, with
right-to-left associativity, so this is parsed as :
    true ? a : (b = 7)
and evaluates as the value of "a", leaving "b" untouched.
I am not confident enough in my standardese, especially for C++, to
judge if the above explanation is correct according to the standards.
But a quick test on godbolt shows that both gcc and clang follow that
line of reasoning.  (It is possible that they are both wrong, but that
would be surprising.)
The difference in precedences here is, I think, related to the ternary
operator being able to evaluate to an lvalue in C++ but not in C - and
that /is/ mentioned in the C++ compatibility annex.
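In C, both intended meanings can be written unambiguously. A small sketch with invented function names: the first spells out the grouping that C++ picks for the unparenthesised text, the second gets the "assign to whichever was selected" effect by selecting a pointer, since a C conditional expression is not an lvalue.

```c
#include <assert.h>
#include <stdbool.h>

/* Explicit form of cond ? a : (b = 7): when cond is true the result
   is *a and *b is left untouched. */
int pick_or_assign(bool cond, int *a, int *b)
{
    return cond ? *a : (*b = 7);
}

/* The other grouping, (cond ? a : b) = 7, needs an lvalue. In C that
   can be had by choosing between pointers and dereferencing the result. */
void assign_selected(bool cond, int *a, int *b)
{
    *(cond ? a : b) = 7;
}

void demo_conditional_assignment(void)
{
    int a = 1, b = 2;
    assert(pick_or_assign(true, &a, &b) == 1 && b == 2);
    assert(pick_or_assign(false, &a, &b) == 7 && b == 7);

    a = 1; b = 2;
    assign_selected(true, &a, &b);
    assert(a == 7 && b == 2);
}
```

Either way the parentheses make the code mean the same thing to a C compiler and to a C++ compiler, which is the practical upshot of the precedence difference.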
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 2026-05-31 01:43, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...][...]
C's operator precedence rules are complicated and arguably flawed.
I'd say that just the (known) flaw makes them (slightly) complicated;
so you need to remember that "flaw" (or "inconsistency") to be safe.
The rest is completely sensible. And even if one doesn't have a table
to look up the precedences they mostly can be derived (presuming one
has a feeling for the underlying logic of these things or experiences
from other related areas).
Reasonable, but I feel the need to say that that's your personal
opinion. You seem to think that C's precedence rules have one and
only one flaw, and a set of rules with that flaw corrected would
be ideal.
I don't even necessarily disagree, but others are likely to have
different opinions, and those opinions might be perfectly valid.
I don't want to make a huge deal out of this. I honestly don't have
a strong opinion myself. I usually find dealing with the rules
as they exist to be a much better use of my time and attention --
and I don't mean that as a criticism of anyone who chooses to think
about alternatives.
[...]
[...]When designing a new language, there are real advantages in strictly
imitating C's rules, just because so many programmers are familiar
with them.
Huh? - How that? - Are you saying here that practically only C-like
languages are in common use?
Huh? No, I didn't say that at all.
I suggest that if you're designing a somewhat C-like language,
sticking to C's precedence rules has advantages due to programmer familiarity. Even for a language that's not particularly C-like,
but that has C-like expressions, the designer might consider
following C's rules.
Or not.
[...]
(It would have been silly for C++ or Objective-C to
change the precedence rules, even to improve them.) But there
are also real advantages in using precedence rules that are better
(e.g., simpler) than C's.
Or - with reference to that flaw - just more consistent.
Consistent systems are inherently simpler, in the sense of easier to
understand and thus more straightforward to use. A precondition for
that is, as said, at least a basic understanding of such things.
Ah, but consistent with what? Internal consistency and consistency
with existing practice are not necessarily the same thing.
[...]
On 2026-06-01 00:54, Keith Thompson wrote:
[...]
Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
required to do so, and (INT_MAX + 1) * 0 still has undefined
behavior. Undefined behavior is determined by the rules of the
abstract machine *without* any adjustments permitted by the as-if
rule.
This is something I really don't get in the actual C-logic...
Using constants that can be determined at compile time is UB here,
despite the '* 0' mathematically indicating an IMO clear semantics,
but using variables is only UB possibly at runtime? And despite all
that the latter might not even get triggered because it's probably
optimized away? - I can't help, this sounds really crude.
Is there any rationale from the _software designer_'s perspective?
On 31/05/2026 03:37, Janis Papanagnou wrote:
On 2026-05-31 01:43, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
If not, people can choose to ignore them when writing C code,
for example like this where all () are technically superfluous:
    crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
Yes, they can, and I personally tend to agree that they should.
The more complex the expressions are the more structure they need.
IMO, the parentheses above make precedence clear (if unknown!), but
are not contributing to readability. It would have made more sense
to separate the sub-expression within the [...] in an own object to
enhance readability and to more easily understand what's going on.
To emphasize; not the precedences are the problem above, but the
complexity of the expression in connexion with lack of structuring.
This is an example of how readability depends on the reader.  To me,
there is no benefit in having a sub-expression here because the
structure is clear - this is how you do table-based crc's with 4-bit
chunks.
But to someone unfamiliar with CRC calculations, splitting the
expression up might make it clearer.  (Alternatively, a comment block
with an explanation could help.)
I /do/ think the parentheses here are helpful for readability, precisely
because they emphasise the structure of the expression.  You could write:
    crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];
but that needs significantly more cognitive effort to parse when reading
it, could be misinterpreted, and has lost all the structure that makes
it easy to see what is going on.
(I regularly use bit-manipulation and shift instructions in my code -
but I still felt it best to check the details in a precedence table
before writing that.)
The expression as originally parenthesised is thus definitely easier
for /me/ to read, and is almost exactly the way I would write it myself :
    crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
The only differences I would have are the names (why would anyone put
variable types into the names like "crcu32" ?  We are not writing
BASIC), and I'd use a lower case "0xf".  Unlike almost every example
Bart has shown before, it even has nice spacing!
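For readers who have not met the technique, here is a sketch of a complete 4-bit table-driven CRC-32 built around that expression, with the table index split into a named variable as suggested earlier in the thread. The names are invented, and the table is computed at start-up from the usual reflected polynomial 0xEDB88320 rather than copied from any particular library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t crc_table[16];

/* Build the 16-entry table: entry i is the effect of shifting the
   4-bit value i through the reflected CRC-32 polynomial. */
void crc32_init(void)
{
    for (uint32_t i = 0; i < 16; i++) {
        uint32_t c = i;
        for (int bit = 0; bit < 4; bit++)
            c = (c & 1U) ? (c >> 1) ^ 0xEDB88320U : c >> 1;
        crc_table[i] = c;
    }
}

/* Process each byte as two 4-bit chunks, low nibble first. */
uint32_t crc32_nibbles(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFU;
    for (size_t i = 0; i < len; i++) {
        const uint32_t b = data[i];
        uint32_t index = (crc & 0xFU) ^ (b & 0xFU);
        crc = (crc >> 4) ^ crc_table[index];
        index = (crc & 0xFU) ^ (b >> 4);
        crc = (crc >> 4) ^ crc_table[index];
    }
    return ~crc;
}

void demo_crc32_nibbles(void)
{
    crc32_init();
    /* Standard CRC-32 check value for the ASCII string "123456789". */
    assert(crc32_nibbles((const uint8_t *)"123456789", 9) == 0xCBF43926U);
}
```

Whether the named index reads better than the one-line form is exactly the matter of taste being discussed; the generated code should be the same either way.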
On 2026-06-02 11:07, Keith Thompson wrote:
[...]
(Digression: I hate the fact that such a long and sometimes
informative thread has such a stupid subject header.)
And what did prevent you from changing it? :-}
On 2026-06-01 00:54, Keith Thompson wrote:
[...]
Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
required to do so, and (INT_MAX + 1) * 0 still has undefined
behavior. Undefined behavior is determined by the rules of the
abstract machine *without* any adjustments permitted by the as-if
rule.
This is something I really don't get in the actual C-logic...
Using constants that can be determined at compile time is UB here,
despite the '* 0' mathematically indicating an IMO clear semantics,
but using variables is only UB possibly at runtime? [...]
On 01/06/2026 20:48, Dan Cross wrote:
In article <10vjsg2$259m3$3@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 01/06/2026 13:04, Dan Cross wrote:
In article <10vjdn8$22tgu$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 31/05/2026 19:11, Bart wrote:
[snip]
Actual examples of too many parentheses?
[Snipping the LISP stuff - fun, but OT and not really relevant to the
thread branch. And I have never used the language.]
[snip]
I see code like this all the time; usually it comes from
hardware vendors (I take it this was from a BSP or something
similar?). I often wonder about vendor programming standards
when I run across things like it.
Yes, this was from a hardware vendor (who shall remain nameless to
protect the guilty - not that I have found other vendors to be much
better). They have a tendency to be obsessed with MISRA, with sticking
to C90, and with filling headers with huge Doxygen templates giving no
information and obscuring the code. (I'm fine with Doxygen comments
that actually add useful information, but not a dozen lines repeating
the names and types from a function signature.)
Yes. I see all of this, and it mystifies me; I have seen how
excessive abstraction can lead to opaque code, but many times
hardware people go in the opposite direction, and one hardly
ever sees useful abstraction; for example, often the same code
sequence could be trivially extracted into a function, but it is
instead repeated multiple times, inline.
Indeed. There is just /so/ much that is done badly in these SDKs - I
am not going to go into details as it would take all day. I get the
impression that software libraries are very much an afterthought for
most microcontroller design groups - I don't think they ever bother
talking to developers who will use them. In fact, I don't think they
talk much to the software folks when designing the microcontrollers either.
Sometimes, however, they do have abstractions - sometimes multiple
layers of HALs ("Hardware Abstraction Layer"), drivers, interfaces, etc.
Each layer has a completely different way of viewing things - one will
use #define'd constants for everything, another will use a struct with
30 fields passed as a pointer in order to turn a GPIO pin on or off, and
the next layer will use a macro TURN_GPIO_PIN_A14_ON. When you have
figured out which API you are expected to use, toggling a GPIO leads to
a half-dozen nested calls (not including macros) up and down these
stacks when all the hardware needs is a single write to a particular
register. And if you are really lucky, a global HAL_LOCK_MUTEX is
acquired and released along the way.
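To make the "single write" concrete, here is a rough sketch of what the whole operation could look like. The register semantics are an assumption of mine (a "set" register where writing a 1 bit drives that pin high and 0 bits are ignored); real parts differ, and the function name is hypothetical.

```c
#include <stdint.h>

/* Assumed "set" register: writing a 1 bit drives that pin high, 0 bits
   leave the other pins alone. pin is assumed to be in the range 0..31. */
static inline void gpio_pin_on(volatile uint32_t *set_reg, unsigned pin)
{
    *set_reg = UINT32_C(1) << pin;   /* the single register write */
}
```

On real hardware `set_reg` would be the address from the device's memory map; passing it as a parameter just keeps the sketch testable on a host.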
[snip color mapping code]
Yes, that would be vastly better. (I would still prefer to have
different named types for colours in the different encoding schemes.)
This is exactly the sort of thing that, as you point out, a
`static inline` function is far better suited for. Some code
bases don't want to use them for a variety of reasons, usually
compatibility concerns with older code, compilers, or language
standards. Some variants of Unix, for instance, worry about
header compatibility with C90 [and in some cases K&R C] code.
Indeed. But even if they don't want to use "inline", a static function
is better - the compiler will do the inlining anyway (if it makes sense
according to its heuristics).
Assuming the compiler they're working with is known to do so,
then I agree.
If a compiler is not capable of inlining static functions without them
being labelled "inline", then you are unlikely to get efficient results
anyway. (Or the user has not enabled optimisation, and again cannot
expect efficient results.) I don't see the point in pandering to poorly
optimising compilers (including good compilers with optimisation
disabled) in order to produce marginally less big and slow code. There
was a time when a good optimising compiler was a significant investment
and not always within the budget for a project, but such times are far
in the past. I can understand that some developers are hamstrung by
daft C90 restrictions, but I have little sympathy for people wanting
good results from poor tools.
(The exception, perhaps, is people who have to use Microchip development
tools.)
[...]
David Brown <david.brown@hesbynett.no> writes:
[...]
<https://cppreference.com/c/language/operator_precedence>
<https://cppreference.com/cpp/language/operator_precedence>
Your table however also shows || had same precedence as both ?: and
=. There, I couldn't find an example that made a difference.
Still, I'd find that unsettling; I would rather that ?: was distinct
from both, either with its own level, or via other language
rules. (In my stuff it is always written with parentheses.)
On 2026-06-01 00:54, Keith Thompson wrote:
[...]
Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
required to do so, and (INT_MAX + 1) * 0 still has undefined
behavior. Undefined behavior is determined by the rules of the
abstract machine *without* any adjustments permitted by the as-if
rule.
This is something I really don't get in the actual C-logic...
Using constants that can be determined at compile time is UB here,
despite the '* 0' mathematically indicating an IMO clear semantics,
but using variables is only UB possibly at runtime? And despite all
that the latter might not even get triggered because it's probably
optimized away? - I can't help, this sounds really crude.
Is there any rationale from the _software designer_'s perspective?
Digression: Perl borrows most or all of C's operators, and keeps
the same precedences. "Operators borrowed from C keep the same
precedence relationship with each other, even where C's precedence
is slightly screwy." But Perl has "and" and "or" operators that
work like "&&" and "||" but have lower precedence (that turns out
to be convenient in some contexts).
I vaguely recall that there's some language that uses the ?: syntax
for the conditional operator, but with a different precedence and/or
associativity than C. I can't remember which language it is.
In article <10vlvie$2ne3j$2@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 01/06/2026 20:48, Dan Cross wrote:
In article <10vjsg2$259m3$3@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 01/06/2026 13:04, Dan Cross wrote:
In article <10vjdn8$22tgu$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 31/05/2026 19:11, Bart wrote:
[snip]
[snip color mapping code]
Yes, that would be vastly better. (I would still prefer to have
different named types for colours in the different encoding schemes.)
I'll see your named types and raise you a bitfield struct. The
shifting and masking is superfluous.
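A sketch of what the bit-field version might look like. The struct and helper names are hypothetical, and note the assumption: the order in which bit-fields are allocated within a storage unit is implementation-defined, so this only matches a hardware RGB565 layout on compilers that allocate from the least significant bit (GCC and Clang on little-endian targets, for example).

```c
/* RGB565 pixel as bit-fields. Assumption: fields are allocated starting
   from the least significant bit; the C standard does not guarantee it. */
struct rgb565 {
    unsigned blue  : 5;
    unsigned green : 6;
    unsigned red   : 5;
};

/* Hypothetical constructor: the compiler generates the shifts and masks. */
static inline struct rgb565 rgb565_make(unsigned red, unsigned green,
                                        unsigned blue)
{
    struct rgb565 px;
    px.red   = red;     /* values wider than the field are truncated */
    px.green = green;
    px.blue  = blue;
    return px;
}
```

The named fields replace the explicit shifting and masking in the source, at the cost of relying on the implementation's layout choice.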
This is exactly the sort of thing that, as you point out, a
`static inline` function is far better suited for. Some code
bases don't want to use them for a variety of reasons, usually
compatibility concerns with older code, compilers, or language
standards. Some variants of Unix, for instance, worry about
header compatibility with C90 [and in some cases K&R C] code.
Indeed. But even if they don't want to use "inline", a static function
is better - the compiler will do the inlining anyway (if it makes sense
according to its heuristics).
Assuming the compiler they're working with is known to do so,
then I agree.
If a compiler is not capable of inlining static functions without them
being labelled "inline", then you are unlikely to get efficient results
anyway. (Or the user has not enabled optimisation, and again cannot
expect efficient results.) I don't see the point in pandering to poorly
optimising compilers (including good compilers with optimisation
disabled) in order to produce marginally less big and slow code. There
was a time when a good optimising compiler was a significant investment
and not always within the budget for a project, but such times are far
in the past. I can understand that some developers are hamstrung by
daft C90 restrictions, but I have little sympathy for people wanting
good results from poor tools.
(The exception, perhaps, is people who have to use Microchip development
tools.)
It's not just because the optimizer is bad or the developers are
obtuse. Sometimes it's a deliberate decision to support
external tooling, like a debugger or tracing program or similar.
Some projects deliberately tolerate slower code for that.
Moreover, on large code bases, with long life spans, upgrading a
compiler is a significant investment.
Almost invariably the
code has UB somewhere (I work on a code base that has been
evolving since before ANSI C; out of about 11 million lines,
there's lots of code that can be considered "legacy" in it).
From a business standpoint, it's not worth the time or
engineering resources required to go find all of it and make it
strictly conforming; from a technical standpoint, it may not
always be possible to do so anyway (though other superset
standards, like POSIX, are another matter), and in other cases
the resulting obfuscation to meet much stricter demands of ISO C
has been deemed, rightly or wrongly, as simply not worth it. It
may not be ideal, but them's the breaks.
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 2026-06-02 11:07, Keith Thompson wrote:
[...]
(Digression: I hate the fact that such a long and sometimes
informative thread has such a stupid subject header.)
And what did prevent you from changing it? :-}
Futility. At best, I could start a new subthread. The existing
subject line would live on.
I maintain that there are several good reasons why changing the Subject
line is a good thing. Many other people disagree with me, but I don't care
about that.
It is, as you imply, especially a good idea where, as here, the original
(i.e., carried) Subject line is dumb.
On 31/05/2026 10:49, David Brown wrote:
On 31/05/2026 10:12, Richard Harnden wrote:
On 31/05/2026 00:43, Keith Thompson wrote:
C's operator precedence rules are complicated and arguably flawed.
They could have been defined differently. A simpler set of rules,
with fewer levels, *might* have been better. I don't have any
concrete suggestions -- nor do I have any strong preferences.
I accept C's rules as they are. I would accept them if they had
been defined differently.
Can't the compiler easily remove any parens that aren't necessary?
So - just write complex expressions in a way that a human can most
easily understand, it makes your intention clear and probably doesn't
increase the size of the executable.
Of course. Parentheses do not affect the generated code unless they
affect the semantics of the expression. (Some people think parentheses
affect the order of evaluation,
They can do if they make an expression be parsed differently. Do you have
an example where they make no difference but people might think they do?
Bart <bc@freeuk.com> writes:
[...]
[...]
David Brown <david.brown@hesbynett.no> writes:
[...]
<https://cppreference.com/c/language/operator_precedence>
<https://cppreference.com/cpp/language/operator_precedence>
Your table however also shows || had same precedence as both ?: and
=. There, I couldn't find an example that made a difference.
Still, I'd find that unsettling; I would rather that ?: was distinct
from both, either with its own level, or via other language
rules. (In my stuff it is always written with parentheses.)
I think you're misreading the table due to its poor formatting.
In the C++ table (second URL above), the precedence levels are
numbered from 1 to 17, but the number in the first column is aligned
to the *middle* of the list of operators in the second column.
So level 15 is just "a || b", and level 16 goes from "a ? b : c" to
"a &= b a ^= b a |= b". You can tell where the level 16 section
starts by the "Right-to-left" associativity in the last column,
which is aligned with the *first* item in the list. I've submitted
a suggestion to fix it (and then saw that someone else had already
done so), but apparently cppreference.com is being hit by vandalism,
so it might take a while before it's corrected.
Note that in a context that requires a constant expression, overflow is
a constraint violation. For example, a case label like:
case (INT_MAX + 1) * 0:
must be diagnosed at compile time.
In article <10vh1eo$1ei50$2@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 31/05/2026 10:49, David Brown wrote:
On 31/05/2026 10:12, Richard Harnden wrote:
On 31/05/2026 00:43, Keith Thompson wrote:
C's operator precedence rules are complicated and arguably flawed.
They could have been defined differently. A simpler set of rules,
with fewer levels, *might* have been better. I don't have any
concrete suggestions -- nor do I have any strong preferences.
I accept C's rules as they are. I would accept them if they had
been defined differently.
Can't the compiler easily remove any parens that aren't necessary?
So - just write complex expressions in a way that a human can most
easily understand, it makes your intention clear and probably doesn't
increase the size of the executable.
Of course. Parentheses do not affect the generated code unless they
affect the semantics of the expression. (Some people think parentheses
affect the order of evaluation,
They can do if they make an expression be parsed differently. Do you have
an example where they make no difference but people might think they do?
This is all a bit of a distraction from the original point that
David and Richard Harnden were trying to make, which seemed
clear enough to me, but perhaps should have been given with a
better example. Maybe something like:
d = a*b + c;
Is equivalent to,
d = (a*b) + c;
And in this case, the parentheses are superfluous and don't
change the order of evaluation of the expression as far as the
language is concerned. Whether a compiler rearranges it in
generated code in a way that is more convenient or faster or
whatever is another matter.
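The equivalence is easy to check with a few throwaway functions (names invented for this sketch): the first two group identically, and only the third uses parentheses that actually change the grouping.

```c
/* Same grouping: the parentheses in the second function are redundant
   because * binds tighter than +. */
static int mul_add_plain(int a, int b, int c)  { return a*b + c; }
static int mul_add_parens(int a, int b, int c) { return (a*b) + c; }

/* Here the parentheses do change the grouping. */
static int mul_of_sum(int a, int b, int c)     { return a*(b + c); }
```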
I would quibble with this idea that the compiler "removes"
parentheses. I get the intuition, but C is not Go where the
compiler "inserts" semi-colons for you, and has no analogous
concept. Rather, as I think Keith said, expressions are parsed
into some internal representation, and then transformed into
something like an abstract syntax tree, where syntactic
notations like parentheses are lost.
Both expressions above correspond to an AST like:
              ┌───────┐
              │BinOp +│
              └───────┘
               /     \
             /         \
      ┌───────┐       ┌───────┐
      │BinOp *│       │Sym `c`│
      └───────┘       └───────┘
        /   \
      /       \
┌───────┐   ┌───────┐
│Sym `a`│   │Sym `b`│
└───────┘   └───────┘
But to get to that, it may be that the compiler uses a
different initial representation, like a parse tree that more
closely resembles the source language grammar. Here, the
two expressions might have different parsed representations.
E.g., for the first, simplifying heavily, may look something
like this:
               ┌──────┐
               │ expr │
               └──────┘
              /    │    \
            /      │      \
       ┌─────┐     .     ┌─────┐
       │term │    (+)    │term │
       └─────┘     '     └─────┘
       /  │  \              │
     /    │    \            │
┌─────┐   .   ┌─────┐    ┌─────┐
│ident│  (*)  │ident│    │ident│
└─────┘   '   └─────┘    └─────┘
   │             │          │
   │             │          │
  .─.           .─.        .─.
 (`a`)         (`b`)      (`c`)
  `─'           `─'        `─'
While the second might add an extra `expr` node, as in:
               ┌──────┐
               │ expr │
               └──────┘
              /    │    \
            /      │      \
      ┌──────┐     .     ┌─────┐
      │ expr │    (+)    │term │
      └──────┘     '     └─────┘
          │                 │
          │                 │
       ┌─────┐           ┌─────┐
       │term │           │ident│
       └─────┘           └─────┘
       /  │  \              │
     /    │    \            │
┌─────┐   .   ┌─────┐      .─.
│ident│  (*)  │ident│     (`c`)
└─────┘   '   └─────┘      `─'
   │             │
   │             │
  .─.           .─.
 (`a`)         (`b`)
  `─'           `─'
I believe that, for most compilers that parse and
then convert to an AST, the second is more likely to be created
than the first. However, given that the same AST is created
from both parse trees, this is unlikely to have an effect on the
object code ultimately output from the compiler.
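As a rough illustration of parentheses vanishing on the way to the AST, here is a toy recursive-descent parser. It is only a sketch under heavy assumptions: single-letter identifiers, just `+`, `*` and parentheses, well-formed input, and all names invented here. It builds the tree directly, without a separate parse-tree stage, and prints it in prefix form.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* AST node: a symbol (op == 0) or a binary operator '+' or '*'. */
struct node {
    char op;                 /* '+', '*', or 0 for a symbol */
    char sym;                /* symbol name when op == 0 */
    struct node *lhs, *rhs;
};

static const char *cur;      /* parse position (sketch only; not reentrant) */

static struct node *new_node(char op, char sym, struct node *l, struct node *r)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->op = op; n->sym = sym; n->lhs = l; n->rhs = r;
    return n;
}

static void skip_ws(void) { while (*cur == ' ') cur++; }

static struct node *parse_expr(void);

/* primary: identifier | '(' expr ')' */
static struct node *parse_primary(void)
{
    skip_ws();
    if (*cur == '(') {
        cur++;
        struct node *inner = parse_expr();
        skip_ws();
        if (*cur == ')')
            cur++;
        return inner;        /* no node is created for the parentheses */
    }
    if (!isalpha((unsigned char)*cur))
        abort();             /* the sketch assumes well-formed input */
    return new_node(0, *cur++, NULL, NULL);
}

/* term: primary ('*' primary)* */
static struct node *parse_term(void)
{
    struct node *n = parse_primary();
    for (skip_ws(); *cur == '*'; skip_ws()) {
        cur++;
        n = new_node('*', 0, n, parse_primary());
    }
    return n;
}

/* expr: term ('+' term)* */
static struct node *parse_expr(void)
{
    struct node *n = parse_term();
    for (skip_ws(); *cur == '+'; skip_ws()) {
        cur++;
        n = new_node('+', 0, n, parse_term());
    }
    return n;
}

/* Append the tree in prefix form, e.g. "(+ (* a b) c)". */
static void to_prefix(const struct node *n, char *out)
{
    if (n->op == 0) {
        strncat(out, &n->sym, 1);
        return;
    }
    strcat(out, "(");
    strncat(out, &n->op, 1);
    strcat(out, " ");
    to_prefix(n->lhs, out);
    strcat(out, " ");
    to_prefix(n->rhs, out);
    strcat(out, ")");
}

/* Parse src and write its prefix form into out (caller supplies the room). */
static void parse_to_prefix(const char *src, char *out)
{
    cur = src;
    out[0] = '\0';
    to_prefix(parse_expr(), out);   /* nodes are leaked; fine for a sketch */
}
```

Feeding it `a*b + c` and `(a*b) + c` gives the same prefix string, since `parse_primary` simply returns the inner tree for a parenthesised expression.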
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
Note that in a context that requires a constant expression, overflow is
a constraint violation. For example, a case label like:
case (INT_MAX + 1) * 0:
must be diagnosed at compile time.
gcc disagrees with you.
In article <10vlvie$2ne3j$2@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
<snip>
Yes. The way one boots an AMD server SoC, for instance,
requires shipping a bunch of binary data structures around to
little microcontrollers spread across a bunch of AXI buses, that
are then responsible for things like configuring PCIe links and
enumerating IO buses and so on. The vendor code for doing this
is opaque, at best. For example,
https://github.com/openSIL/openSIL/blob/turin_poc/xUSL/Nbio/Brh/NbioPcieComplexDataBrh.c
(and that's a cleaned-up version).
[snip color mapping code]
Yes, that would be vastly better. (I would still prefer to have
different named types for colours in the different encoding schemes.)
I'll see your named types and raise you a bitfield struct. The
shifting and masking is superfluous.
In article <10vh1eo$1ei50$2@dont-email.me>, Bart <bc@freeuk.com> wrote:
Both expressions above correspond to an AST like:
              ┌───────┐
              │BinOp +│
              └───────┘
               /     \
             /         \
      ┌───────┐       ┌───────┐
      │BinOp *│       │Sym `c`│
      └───────┘       └───────┘
        /   \
      /       \
┌───────┐   ┌───────┐
│Sym `a`│   │Sym `b`│
└───────┘   └───────┘
On 02/06/2026 14:05, Dan Cross wrote:
You're describing a 'Concrete Syntax Tree' or CST, versus AST.
Although in that case, I expect to see a discrete node for bracketed
expressions (ie. parenthesised), as those would also have a distinct
production in any formal grammar.
Personally I don't have much use for CSTs for a normal compiler, but
they might be useful for source-to-source translators, or programs that
do source refactoring, where you want to preserve extras such as
parentheses even if they're not strictly needed.
(Injecting the right parentheses for examples like '(a + b) * c' which
would have an AST like '(* (+ a b) c)' is surprisingly tricky. Easier to
just follow the original source!
In any case, that still wouldn't turn ((a+b)) back into the original;
you'd need a suitable CST.)
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <10vh1eo$1ei50$2@dont-email.me>, Bart <bc@freeuk.com> wrote:
Both expressions above correspond to an AST like:
ÚÄÄÄÄÄÄÄ¿
³BinOp +³
ÀÄÄÄÄÄÄÄÙ
? ?
? ?
ÚÄÄÄÄÄÄÄ¿ ÚÄÄÄÄÄÄÄ¿
³BinOp *³ ³Sym `c`³
ÀÄÄÄÄÄÄÄÙ ÀÄÄÄÄÄÄÄÙ
? ?
? ?
ÚÄÄÄÄÄÄÄ¿ ÚÄÄÄÄÄÄÄ¿
³Sym `a`³ ³Sym `b`³
ÀÄÄÄÄÄÄÄÙ ÀÄÄÄÄÄÄÄÙ
Ah, the dangers of assuming everyone uses UTF-8.
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 2026-06-01 00:54, Keith Thompson wrote:
[...]
Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
required to do so, and (INT_MAX + 1) * 0 still has undefined
behavior. Undefined behavior is determined by the rules of the
abstract machine *without* any adjustments permitted by the as-if
rule.
This is something I really don't get in the actual C-logic...
Using constants that can be determined at compile time is UB here,
despite the '* 0' mathematically indicating an IMO clear semantics,
but using variables is only UB possibly at runtime? [...]
There's an important distinction to make here. Consider this
program:
#include <limits.h>
int
foo(){
    int zero = (INT_MAX+1)*0;
    return zero;
}
int
main(){
    return 0;
}
This program does not transgress the bounds of undefined behavior.
Even more than that, the program is strictly conforming, and must be
accepted by a conforming implementation.
Now let's change the program slightly:
#include <limits.h>
int
foo(){
    static int zero = (INT_MAX+1)*0;
    return zero;
}
int
main(){
    return 0;
}
This program does transgress the bounds of undefined behavior. The
reason for the difference is that in the first program the semantics
of foo() is to evaluate the expression to be stored in 'zero' only
at runtime, whereas in the second program the semantics of foo() is
to evaluate the expression to be stored in 'zero' before program
startup (informally, "at compile time"). What matters is not
whether the offending expression /might/ be evaluated "at compile
time", but whether the offending expression /must/ be evaluated "at
compile time". Only in the second case is undefined behavior
inevitable (and thus it does not occur in the first program).
Fine point: strictly speaking, I believe the C standard allows even
the second program to complete translation phase 8 successfully, and
for any offending behavior to occur only when we actually try to run
the program. To say that another way, there is no requirement that
possible nasal demons be made manifest at any point before an actual
attempted execution. On the other hand, because that possibility is
there lurking in the background, there is no requirement that the
program be accepted, and could be rejected by a conforming compiler.
David Brown <david.brown@hesbynett.no> writes:
On 31/05/2026 16:24, James Kuyper wrote:
[...]
On 2026-05-31 07:18, David Brown wrote:
People might think they affect the order of evaluation, such as when you
have function calls:
u = foo(x) + (foo(y) + foo(z));
Some people might think the use of parentheses means that "foo(y)" and
"foo(z)" are called before "foo(x)", when the order of all these calls
(and the additions) is unspecified. (Again, a given compiler might be
influenced by the parentheses, but the language does not require it.)
You're correct with regard to the function calls, but the
parenthesized addition must be performed first, and the other one
second, which may make a difference, for the same reasons given in my
previous paragraph.
The parentheses do not dictate the order of evaluation. But you are
correct - and it's worth pointing out, so thank you for doing that -
that for floating point operations, the grouping of operations can
affect the result.
The parentheses do not dictate the order of evaluation *of the
operands*. Each "+" can be evaluated (the addition performed)
only after the values of its operands are known. But regardless
of parentheses or operator precedence, the three operands foo(x),
foo(y), and foo(z) can be evaluated in any of 6 possible orders.
(It's different when you have operations like "&&", "||", and ",",
which impose additional sequence points.)
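A small sketch that makes the call order observable. The two-argument `foo` and the logging buffer are inventions for this example, not anything from the thread. Which of the six possible orders ends up in the log is up to the compiler, so the only things that can be relied on are the sum and the fact that all three calls happened.

```c
static char call_log[4];     /* records the order in which foo() is called */
static int  call_count;

/* Hypothetical foo(): logs its tag and returns its value. */
static int foo(char tag, int value)
{
    call_log[call_count++] = tag;
    return value;
}

static int sum_three(void)
{
    call_count = 0;
    /* The parentheses group the additions; they do not order the calls. */
    return foo('x', 1) + (foo('y', 2) + foo('z', 3));
}
```

Printing `call_log` after `sum_three()` on a couple of different compilers or optimisation levels is a quick way to see whether they agree.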
If you are talking about floating point arithmetic (I was thinking of
integer arithmetic, but did not specify), then the operations are not
necessarily commutative or associative, and the compiler cannot then
re-arrange the operations unless it knows that doing so does not
affect the result.
It's not just floating-point. Signed integer overflow is also relevant.
INT_MIN + INT_MAX + 1 is well defined. (INT_MIN + INT_MAX) + 1
is equivalent, and is also well defined. INT_MIN + (INT_MAX + 1)
has undefined behavior.
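One way to see the difference without actually triggering the overflow is to test each addition before performing it. The helper below is a sketch with a made-up name; it checks the operands against the limits rather than computing the sum.

```c
#include <limits.h>
#include <stdbool.h>

/* True when a + b would overflow int; checked without doing the addition. */
static bool add_overflows(int a, int b)
{
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}
```

With this, both steps of `(INT_MIN + INT_MAX) + 1` pass the check, while the inner step of `INT_MIN + (INT_MAX + 1)` is flagged.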
But except for specific cases, the order of evaluation - both for the
values and side-effects - of sub-expressions is unspecified. Indeed,
they are unsequenced - the evaluations can interleave.
Usually, both sub-expressions of a binary operator will be evaluated
before the operator itself, simply because usually the results of the
operator cannot be calculated until the sub-expression's values are
known. But this is not a requirement of the language - if the
compiler can get the same results without doing so, it is free to pick
a different order. "(a + b) * 0" does not need to evaluate "a", "b",
or "a + b" at all unless there is a possibility of a side-effect - and
it can perform the side-effects in any order. "a + (b + c)" can check
"a" for a trap representation and deal with that before looking at "b"
and "c" or the results of "b + c", even though it cannot (for floating
point operations) re-arrange the code to do "a + b" first.
Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
required to do so, and (INT_MAX + 1) * 0 still has undefined
behavior. Undefined behavior is determined by the rules of the
abstract machine *without* any adjustments permitted by the as-if
rule.
[...]
On 01/06/2026 03:10, Tim Rentsch wrote:
Bart <bc@freeuk.com> writes:
On 31/05/2026 17:04, Tim Rentsch wrote:
Richard Harnden <richard.nospam@gmail.invalid> writes:
just write complex expressions in a way that a human can most
easily understand,
Unfortunately, (1) different people have different ideas of what
writing is most easily understood, and (2) different readers have
different notions of which writings are easily understood, and
which writings are not so easily understood. To make things
worse "easily understood" is not a boolean condition, nor is it
necessarily well-ordered -- "most easily understood" isn't always
a well-defined quality, even for a given audience.
Sadly the idea of writing in a way that is "most easily understood"
has resulted in a race to the bottom, where writers are more and
more encouraged to take the view that (some) readers are pretty
much arbitrarily stupid, with the result that expressions become
littered with scads of unnecessary parentheses that actually
detract from ease of reading. Good writing is always a balance
between too much and too little.
Actual examples of too many parentheses?
The point of my comment is that either too many or too few is a
subjective judgment, not an objective one.
My point was that it could be objective, at least for too many. So
(a*a) + (b*b) would be commonly agreed to have too many, [...]
And then there is ?: :
a > b ? c : d # (a>b)?c:d
a + b ? c : d # (a+b)?c:d
The grouping of the first is probably what is intended. But in the
second, the intent might have been (a+b)?c:d, or a+(b?c:d); we don't
know for sure that the author didn't make a mistake or we don't know
ourselves.
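For anyone who wants to check the second case rather than look it up, a sketch with three throwaway functions (names invented here). Some compilers may warn about the unparenthesised form, which is rather the point of the discussion.

```c
/* a + b ? c : d groups as (a + b) ? c : d, since + binds tighter than ?:. */
static int cond_plain(int a, int b, int c, int d)   { return a + b ? c : d; }
static int cond_grouped(int a, int b, int c, int d) { return (a + b) ? c : d; }

/* The other reading has to be asked for explicitly. */
static int cond_other(int a, int b, int c, int d)   { return a + (b ? c : d); }
```

With a = 1 and b = -1 the first two select d (because a + b is zero), while the third adds a to c.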
This example is so addlebrained that it's hard to imagine anyone
being confused about it. Or that it's worth any expenditure of
thought wondering what to do about people who are.
I don't understand what the problem is with my examples.
There can be ambiguity in the mind of the person looking at such
code as to how the first terms are grouped.
In article <86ik81cfk5.fsf_-_@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 2026-06-01 00:54, Keith Thompson wrote:
[...]
Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
required to do so, and (INT_MAX + 1) * 0 still has undefined
behavior. Undefined behavior is determined by the rules of the
abstract machine *without* any adjustments permitted by the as-if
rule.
This is something I really don't get in the actual C-logic...
Using constants that can be determined at compile time is UB here,
despite the '* 0' mathematically indicating an IMO clear semantics,
but using variables is only UB possibly at runtime? [...]
There's an important distinction to make here. Consider this
program:
#include <limits.h>
int
foo(){
    int zero = (INT_MAX+1)*0;
    return zero;
}
int
main(){
    return 0;
}
This program does not transgress the bounds of undefined behavior.
Given that `foo` has external linkage, I find this hard to
believe, and `clang -fsanitize=undefined` agrees with me,
both emitting a diagnostic about the overflow and generating
code in `foo` to call into the sanitizer machinery.
Perhaps you mean that this is irrelevant because `foo` is not
invoked, but I see no reason why that need be the case in e.g.
a freestanding environment.
In a hosted environment, I don't
think anything explicitly prevents `foo` from being called after
`main` returns (though I can't imagine that would happen in real
life; it would be weird if it did).
But I'm not sure what _you_ mean by "transgress the bounds of
undefined behavior" here.
Even more than that, the program is strictly conforming, and must be
accepted by a conforming implementation.
See above.
Now let's change the program slightly:
#include <limits.h>
int
foo(){
    static int zero = (INT_MAX+1)*0;
    return zero;
}
int
main(){
    return 0;
}
This program does transgress the bounds of undefined behavior. The
reason for the difference is that in the first program the semantics
of foo() is to evaluate the expression to be stored in 'zero' only
at runtime, whereas in the second program the semantics of foo() is
to evaluate the expression to be stored in 'zero' before program
startup (informally, "at compile time"). What matters is not
whether the offending expression /might/ be evaluated "at compile
time", but whether the offending expression /must/ be evaluated "at
compile time". Only in the second case is undefined behavior
inevitable (and thus it does not occur in the first program).
Fine point: strictly speaking, I believe the C standard allows even
the second program to complete translation phase 8 successfully, and
for any offending behavior to occur only when we actually try to run
the program. To say that another way, there is no requirement that
possible nasal demons be made manifest at any point before an actual
attempted execution. On the other hand, because that possibility is
there lurking in the background, there is no requirement that the
program be accepted, and could be rejected by a conforming compiler.
Indeed. Further, I believe that the same is true for the first
program, as well.
Here's my offering:
#include <stdint.h>

// Converts a 16-bit RGB16 (5-6-5) value to an ARGB32
// ("RGBA8888") value.
static inline uint32_t
rgb16_to_argb(uint16_t color)
{
    const uint32_t blue5 = (color >> 0) & 0x1F;
    const uint32_t green6 = (color >> 5) & 0x3F;
    const uint32_t red5 = (color >> 11) & 0x1F;

    // Map from a 5 or 6 bit space into an 8 bit space.  A
    // 5-bit number has 32 possibilities; a 6-bit number
    // has 64.  To calculate the projected 8-bit value for
    // a k-bit number v, we can use the formula
    // v_8 = (v*(2^8-1) + (2^k-1)/2)/(2^k-1), that is,
    // (v*255 + 15)/31 (for k=5) or (v*255 + 31)/63 (for
    // k=6).
    //
    // To remove the division by 31 or 63 and turn it into
    // a shift, the constants below were empirically
    // discovered to generate good results.  See
    // https://stackoverflow.com/questions/2442576/
    // how-does-one-convert-16-bit-rgb565-to-24-bit-rgb888
    // for details.
    const uint32_t blue = (blue5 * 527 + 23) >> 6;
    const uint32_t green = (green6 * 259 + 33) >> 6;
    const uint32_t red = (red5 * 527 + 23) >> 6;
    const uint32_t alpha = 0xFF000000;

    return blue | (green << 8) | (red << 16) | alpha;
}
It's longer, yes, but I'd argue it's much easier to understand.
On my compiler, it generates almost identical code, except that
some instructions are in a different order.
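Since the shift constants are described as empirically discovered, it is cheap to check them exhaustively against the rounding division. The sketch below does that for all 32 five-bit and 64 six-bit inputs; the function names are made up for the check and are not part of the code above.

```c
#include <stdint.h>

/* Reference scaling with rounding: v * 255 / max, where max is 31 or 63. */
static uint32_t scale_exact(uint32_t v, uint32_t max)
{
    return (v * 255 + max / 2) / max;
}

/* Shift-based forms using the same constants as rgb16_to_argb(). */
static uint32_t scale5_shift(uint32_t v) { return (v * 527 + 23) >> 6; }
static uint32_t scale6_shift(uint32_t v) { return (v * 259 + 33) >> 6; }

/* Returns 1 when the shift forms match the reference for every input. */
static int shift_forms_match(void)
{
    for (uint32_t v = 0; v < 32; v++)
        if (scale5_shift(v) != scale_exact(v, 31))
            return 0;
    for (uint32_t v = 0; v < 64; v++)
        if (scale6_shift(v) != scale_exact(v, 63))
            return 0;
    return 1;
}
```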
Bart <bc@freeuk.com> writes:
My point was that it could be objective, at least for too many. So
(a*a) + (b*b) would be commonly agreed to have too many, [...]
Apparently you misunderstand what is meant by the word objective.
An objective statement is one that is independent of personal
assessment, even collective personal assessment.
Reaching consensus
on a question doesn't make the common view an objective one -- just
a commonly held one.
Here is a story from the earliest weeks of all of the time I have
been programming. In one of the first few programs I ever wrote
(and perhaps even the very first one), I had a statement like so:
x = alpha/beta*gamma
Of course the names here are made up, I don't remember the actual
names used. When x was printed out, it gave a value that was
much different from what I expected. What had happened was I had unconsciously assumed, reasoning by analogy with written
mathematics, that the statement would be interpreted as
           alpha
    x = ------------
         beta*gamma
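The same grouping rule applies in C, whatever language the original program was written in: `/` and `*` have equal precedence and group left to right. A sketch with invented function names showing the two readings side by side:

```c
/* Division and multiplication have equal precedence and group left to right. */
static double as_written(double alpha, double beta, double gamma)
{
    return alpha / beta * gamma;        /* (alpha / beta) * gamma */
}

/* What the written fraction means. */
static double as_fraction(double alpha, double beta, double gamma)
{
    return alpha / (beta * gamma);
}
```

With alpha = 8, beta = 2, gamma = 4 the first gives 16 and the second gives 1.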
If someone really can't learn the rules of expression syntax for the
language they are using, they should be advised to try a different
language, or perhaps give up programming altogether.
It's silly to
worry about something that 999 people out of a 1000 (and the actual
numbers are undoubtedly much higher) are able to navigate without
difficulty. Yet the examples you give insist on focusing on the few
hopeless individuals.
On 04/06/2026 10:34, Tim Rentsch wrote:
Bart <bc@freeuk.com> writes:
My point was that it could be objective, at least for too many. So
(a*a) + (b*b) would be commonly agreed to have too many, [...]
Apparently you misunderstand what is meant by the word objective.
An objective statement is one that is independent of personal
assessment, even collective personal assessment.
I don't know of any infix PL syntax where 'a*a + b*b', as a standalone expression, doesn't mean '(a*a) + (b*b)'.
Google agrees with me (in that 2*2+3*3 shows 13), and so does my Casio calculator.
It's not my personal opinion!
Reaching consensus
on a question doesn't make the common view an objective one -- just
a commonly held one.
So, the number of times in this group where I've been told that everyone else disagrees with me about something so I must be wrong - this was
just your (pl) subjective opinion all along?
In the PL world then it is going to be mainly about subjective opinions! There are few absolute truths.
But what about this example:
   ((((((a))))))
'Too many parentheses' is still subjective?
How about '((((a)))) using more parentheses than (a)'; that surely must
be objective?
Here is a story from the earliest weeks of all of the time I have
been programming. In one of the first few programs I ever wrote
(and perhaps even the very first one), I had a statement like so:
     x = alpha/beta*gamma
Of course the names here are made up, I don't remember the actual
names used. When x was printed out, it gave a value that was
much different from what I expected. What had happened was I had
unconsciously assumed, reasoning by analogy with written
mathematics, that the statement would be interpreted as
            alpha
     x = ------------
          beta*gamma
You will have quickly found out that PL syntax is not mathematics. For a start, mathematics doesn't normally use '*', nor '/' for that matter.
Yes, there is a discrepancy with the precedences of divide and (implied) multiply. However, the a*a + b*b example didn't use divide.
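A minimal sketch of the two readings in C, assuming double operands. The function names as_written and as_fraction are hypothetical, chosen only for this illustration:

```c
/* How C parses alpha/beta*gamma: '/' and '*' share one precedence level
   and associate left to right, so this is (alpha/beta)*gamma. */
double as_written(double alpha, double beta, double gamma)
{
    return alpha / beta * gamma;
}

/* The reading borrowed from written mathematics, where the whole
   product sits under the fraction bar: alpha/(beta*gamma). */
double as_fraction(double alpha, double beta, double gamma)
{
    return alpha / (beta * gamma);
}
```

With alpha = 12, beta = 3 and gamma = 2, as_written gives 8 while as_fraction gives 2.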
(Note that C has its own problems in this area:
   a = b/*p;      // divide b by dereferenced pointer p
Here, /* also happens to start a block comment.)
If someone really can't learn the rules of expression syntax for the
language they are using, they should be advised to try a different
language, or perhaps give up programming altogether.
It can be multiple languages, and they might want to write the same expression the same way in each.
It could be no language: maybe it's pseudo-code, or some unspecified
language in a forum which is not language-specific. They want anybody to just understand it.
This is the scenario I mentioned where you can risk leaving out
parentheses when expressions involve "+ - * /", comparisons, and AND/OR, since generally these are treated sensibly by infix languages (even in
C, almost).
But operators such as '<< >> & ^ |' are treated more diversely. Here you would be taking a bigger risk. You could label such code as 'C
Syntax' (if posting for example) but that is just being lazy.
It's silly to
worry about something that 999 people out of a 1000 (and the actual
numbers are undoubtedly much higher) are able to navigate without
difficulty. Yet the examples you give insist on focusing on the few
hopeless individuals.
Are you saying that whoever wrote code like this:
     crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
is needlessly worrying about the 99.9+% of the readership who you claim
will know C syntax rules precisely? That is, they would find this
version just as clear without any extra cognitive effort:
     crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];
?
If so then you are hopelessly wrong.
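Whatever one thinks of the readability, the two spellings do compute the same thing in C, because >> binds tighter than &, which binds tighter than ^. A minimal sketch that checks this over a range of inputs follows. The table demo_table is a hypothetical stand-in with arbitrary values (it is not the s_crc32 table from miniz), and the function names are made up for the illustration:

```c
#include <stdint.h>

/* Hypothetical 16-entry table with arbitrary values; NOT miniz's s_crc32. */
static const uint32_t demo_table[16] = {
    0x00000000u, 0x11111111u, 0x22222222u, 0x33333333u,
    0x44444444u, 0x55555555u, 0x66666666u, 0x77777777u,
    0x88888888u, 0x99999999u, 0xAAAAAAAAu, 0xBBBBBBBBu,
    0xCCCCCCCCu, 0xDDDDDDDDu, 0xEEEEEEEEu, 0xFFFFFFFFu
};

/* The fully parenthesised spelling. */
uint32_t step_parenthesized(uint32_t crcu32, uint8_t b)
{
    return (crcu32 >> 4) ^ demo_table[(crcu32 & 0xF) ^ (b & 0xF)];
}

/* The same expression relying only on C's precedence rules. */
uint32_t step_bare(uint32_t crcu32, uint8_t b)
{
    return crcu32 >> 4 ^ demo_table[crcu32 & 0xF ^ b & 0xF];
}

/* Returns 1 if both spellings agree on every sampled input, else 0. */
int steps_agree(void)
{
    for (uint32_t i = 0; i < 4096; i++) {
        uint32_t crc = i * 0x9E3779B1u;   /* spread samples over 32 bits */
        for (unsigned b = 0; b < 256; b++) {
            if (step_parenthesized(crc, (uint8_t)b) != step_bare(crc, (uint8_t)b))
                return 0;
        }
    }
    return 1;
}
```

The sketch only shows that the compiler treats both forms identically; it says nothing about which one a human reader parses faster.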
On 04/06/2026 13:40, Bart wrote:
On 04/06/2026 10:34, Tim Rentsch wrote:
Bart <bc@freeuk.com> writes:
My point was that it could be objective, at least for too many. So
(a*a) + (b*b) would be commonly agreed to have too many, [...]
Apparently you misunderstand what is meant by the word objective.
An objective statement is one that is independent of personal
assessment, even collective personal assessment.
I don't know of any infix PL syntax where 'a*a + b*b', as a standalone
expression, doesn't mean '(a*a) + (b*b)'.
Google agrees with me (in that 2*2+3*3 shows 13), and so does my Casio
calculator.
It's not my personal opinion!
You are - again - moving the goalposts.
It is an objective fact that "a * a + b * b" means "(a * a) + (b * b)"
in normal mathematics (at least in the countries I am familiar with),
and also in most mainstream programming languages.
It is an objective fact, therefore, that "(a*a) + (b*b)" has more parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a purely subjective opinion.
If you wrote, for example, that "a << b + c" is ambiguous in C, then you
(Note that C has its own problems in this area:
    a = b/*p;      // divide b by dereferenced pointer p
Here, /* also happens to start a block comment.)
Here you are objectively wrong. C does not have a "problem" with this.
The parsing rules of the language are clear - often called "maximum
munch". The character sequence "/*" is the start of a comment, it is
not two separate operators.
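A minimal sketch of the two usual ways to write the intended division so that the slash and the star are never adjacent. The function names are hypothetical, used only for this illustration:

```c
/* Divides b by the value p points to. The space keeps '/' and the
   unary '*' as two separate tokens. */
int divide_spaced(int b, const int *p)
{
    return b / *p;
}

/* Same result; the parentheses separate the two characters even if
   the spaces are dropped. */
int divide_parenthesized(int b, const int *p)
{
    return b/(*p);
}
```

Both return 3 when b is 12 and p points to an int holding 4.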
It's silly to
worry about something that 999 people out of a 1000 (and the actual
numbers are undoubtedly much higher) are able to navigate without
difficulty. Yet the examples you give insist on focusing on the few
hopeless individuals.
Are you saying that whoever wrote code like this:
     crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
is needlessly worrying about the 99.9+% of the readership who you
claim will know C syntax rules precisely? That is, they would find
this version just as clear without any extra cognitive effort:
     crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];
?
Tim did not write that. That example was not on the list of examples
you gave recently.
On 04/06/2026 13:40, Bart wrote:
[...]
It is an objective fact that "a * a + b * b" means "(a * a) + (b * b)"
in normal mathematics (at least in the countries I am familiar with),
and also in most mainstream programming languages.
It is an objective fact, therefore, that "(a*a) + (b*b)" has more parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a purely subjective opinion. Even if it is true that this is "commonly agreed
to" (and AFAIK you have no basis for that claim), that would still be a subjective opinion - no matter how common that opinion is.
Does that clear up your misunderstanding about "objective" and
"subjective" ?
Sometimes you might voice an opinion that is so extreme or uncommon that people might tell you you are wrong, when saying they disagree would be
more appropriate - discussions here are not formal.
[...]
[...]
On 04/06/2026 13:35, David Brown wrote:
[...]
"(a*a) + (b*b) has too many parentheses", on the other hand, is a
purely subjective opinion.
So, you're arguing 'more than needed' is a completely different thing
from 'too many'.
[...]
I don't have the patience for such nonsense any more:
* The () in '(a * b) + c' are generally unnecessary
* The () in 'a << (b + c)' are advisable
* The () in '(a << b) + c' are necessary if the intent is to have
  what might be the more intuitive meaning.
[...]
On 2026-06-04 15:18, Bart wrote:
[...]
* The () in '(a << b) + c' are necessary if the intent is to have
  what might be the more intuitive meaning.
I've already written in some former post about _unnecessarily_ mixing different types in expressions.
If you stay in such subexpressions with the same types you'll notice
that the parentheses are unnecessary; the C-language's precedences
have been sensibly chosen (in this case[*]).
[*] And even if you add some of ^ | & it's still no problem, unless
you have also any of the comparison operators in your expressions.
Janis
[...]
On 04/06/2026 13:35, David Brown wrote:
On 04/06/2026 13:40, Bart wrote:
On 04/06/2026 10:34, Tim Rentsch wrote:
Bart <bc@freeuk.com> writes:
My point was that it could be objective, at least for too many. So
(a*a) + (b*b) would be commonly agreed to have too many, [...]
Apparently you misunderstand what is meant by the word objective.
An objective statement is one that is independent of personal
assessment, even collective personal assessment.
I don't know of any infix PL syntax where 'a*a + b*b', as a
standalone expression, doesn't mean '(a*a) + (b*b)'.
Google agrees with me (in that 2*2+3*3 shows 13), and so does my
Casio calculator.
It's not my personal opinion!
You are - again - moving the goalposts.
It is an objective fact that "a * a + b * b" means "(a * a) + (b * b)"
in normal mathematics (at least in the countries I am familiar with),
and also in most mainstream programming languages.
It is an objective fact, therefore, that "(a*a) + (b*b)" has more
parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a
purely subjective opinion.
So, you're arguing 'more than needed' is a completely different thing
from 'too many'.
Sigh...
If you wrote, for example, that "a << b + c" is ambiguous in C, then you
It is technically unambiguous in C.
It can be ambiguous in the mind of
somebody who would have to double-check the precedence levels, or where
the C context is missing.
The discussion seems to be about what exactly is 'too many'.
Apparently you can construct a valid C source file where 99.9% of the
text consists of () characters, but if someone - or even a million
people - say that it is too many, then that is just their subjective opinion.
I don't have the patience for such nonsense any more:
* The () in '(a * b) + c' are generally unnecessary
* The () in 'a << (b + c)' are advisable
* The () in '(a << b) + c' are necessary if the intent is to have
  what might be the more intuitive meaning.
If this is not 100% C-specific, then () are needed for both the last two examples, but not the first.
You all know this.
(Note that C has its own problems in this area:
    a = b/*p;      // divide b by dereferenced pointer p
Here, /* also happens to start a block comment.)
Here you are objectively wrong. C does not have a "problem" with
this. The parsing rules of the language are clear - often called
"maximum munch". The character sequence "/*" is the start of a
comment, it is not two separate operators.
This is where it falls down. It's very clearly a 'gotcha', and
consequence of poorly thought-out design.
That the behaviour is deterministic doesn't change that.
It's silly to
worry about something that 999 people out of a 1000 (and the actual
numbers are undoubtedly much higher) are able to navigate without
difficulty. Yet the examples you give insist on focusing on the few
hopeless individuals.
Are you saying that whoever wrote code like this:
     crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
is needlessly worrying about the 99.9+% of the readership who you
claim will know C syntax rules precisely? That is, they would find
this version just as clear without any extra cognitive effort:
     crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];
?
Tim did not write that.
What was the 'something' in "It's silly to worry about something that ..."?
I assume it's people being unable to understand that second example.
Yet I see parentheses being used in such cases a LOT more than 0.1% of
the time. 50% or more would be my guess.
That example was not on the list of examples you gave recently.
It was posted several times.
(https://github.com/richgel999/miniz/blob/master/miniz.c line 81, second
hit for '>>')
On 04/06/2026 15:18, Bart wrote:
It is an objective fact, therefore, that "(a*a) + (b*b)" has more
parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a
purely subjective opinion.
So, you're arguing 'more than needed' is a completely different thing
from 'too many'.
Of course they are different things - albeit related things, rather
than /completely/ different. One is a question of fact, the other a question of opinion, and they do not always coincide.
It is a fact that "a << (b + c)" has more parentheses than needed. But
I think we are both of the opinion that it does not have "too many" parentheses - it has an appropriate number of parentheses.
Sigh...
It is technically unambiguous in C.
If you wrote, for example, that "a << b + c" is ambiguous in C, then you
There is no "technically" about it. It is unambiguous in C.
It can be ambiguous in the mind of somebody who would have to double-
check the precedence levels, or where the C context is missing.
I would not use the word "ambiguous" there - "unclear" would be more appropriate in the situation when someone does not know the C precedence levels.
No, it's an attempt to get you to understand the difference between "objective" and "subjective" - fact and opinion. I don't understand why
you are having such a problem here.
* The () in '(a << b) + c' are necessary if the intent is to have
  what might be the more intuitive meaning.
The parentheses in "(a << b) + c" are necessary if the intent is to
shift "a" by "b", and then add "c" to the result. That is fact, not opinion. Any discussion of "intuitive" is necessarily subjective.
    a = b/*p;      // divide b by dereferenced pointer p
This is where it falls down. It's very clearly a 'gotcha', and
consequence of poorly thought-out design.
It is neither a "gotcha", nor a consequence of poor design. It does not "fall down". It is simply a minor consequence of the choice of operator syntax. Such an expression would occur rarely in code, and to be a
"gotcha" it would need to be realistic for someone to write it, without spaces, and for their code to compile and be used without the mistake
being noticed. Do you think that is in any way realistic? I do not.
And to be "poor design", it needs to be something that is likely to
cause problems
What was the 'something' in "It's silly to worry about something
that ..."?
My mind-reading skills are not that well developed.
David Brown <david.brown@hesbynett.no> writes:
On 04/06/2026 15:18, Bart wrote:
(Note that C has its own problems in this area:
    a = b/*p;      // divide b by dereferenced pointer p
Here, /* also happens to start a block comment.)
Here you are objectively wrong. C does not have a "problem" with
this. The parsing rules of the language are clear - often called
"maximum munch". The character sequence "/*" is the start of a
comment, it is not two separate operators.
This is where it falls down. It's very clearly a 'gotcha', and
consequence of poorly thought-out design.
It is neither a "gotcha", nor a consequence of poor design.
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <86ik81cfk5.fsf_-_@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 2026-06-01 00:54, Keith Thompson wrote:
[...]
Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
required to do so, and (INT_MAX + 1) * 0 still has undefined
behavior. Undefined behavior is determined by the rules of the
abstract machine *without* any adjustments permitted by the as-if
rule.
This is something I really don't get in the actual C-logic...
Using constants that can be determined at compile time is UB here,
despite the '* 0' mathematically indicating an IMO clear semantics,
but using variables is only UB possibly at runtime? [...]
There's an important distinction to make here. Consider this
program:
    #include <limits.h>

    int
    foo(){
        int zero = (INT_MAX+1)*0;
        return zero;
    }

    int
    main(){
        return 0;
    }
This program does not transgress the bounds of undefined behavior.
To clarify, the comments in my posting were meant to be read as
saying the given text is the entire program, and that it is strictly
conforming with respect to conforming hosted implementations.
(Incidentally, given the rules for freestanding implementations, I'm
not sure that it is even possible for any program to be strictly
conforming with respect to conforming freestanding implementations.
In any case my statements were meant only in the context of hosted
implementations.)
[snip]
Perhaps you mean that this is irrelevant because `foo` is not
invoked, but I see no reason why that need be the case in e.g.
a freestanding environment.
I explained the context of my previous statements above. Sorry for
not saying that in the original message.
In a hosted environment, I don't
think anything explicitly prevents `foo` from being called after
`main` returns (though I can't imagine that would happen in real
life; it would be weird if it did).
The semantics described in the ISO C standard don't admit that
possibility. Whether foo() has external linkage or internal
linkage doesn't change that. Only those actions initiated by
statements in main() are ever elaborated.
But I'm not sure what _you_ mean by "transgress the bounds of
undefined behavior" here.
It's a grammatical fine point. I think for present purposes it's
okay to gloss over the distinction, and say this statement may be
read as saying "the program does not have undefined behavior".
Even more than that, the program is strictly conforming, and must be
accepted by a conforming implementation.
See above.
Now let's change the program slightly:
    #include <limits.h>

    int
    foo(){
        static int zero = (INT_MAX+1)*0;
        return zero;
    }

    int
    main(){
        return 0;
    }
This program does transgress the bounds of undefined behavior. The
reason for the difference is that in the first program the semantics
of foo() is to evaluate the expression to be stored in 'zero' only
at runtime, whereas in the second program the semantics of foo() is
to evaluate the expression to be stored in 'zero' before program
startup (informally, "at compile time"). What matters is not
whether the offending expression /might/ be evaluated "at compile
time", but whether the offending expression /must/ be evaluated "at
compile time". Only in the second case is undefined behavior
inevitable (and thus it does not occur in the first program).
Fine point: strictly speaking, I believe the C standard allows even
the second program to complete translation phase 8 successfully, and
for any offending behavior to occur only when we actually try to run
the program. To say that another way, there is no requirement that
possible nasal demons be made manifest at any point before an actual
attempted execution. On the other hand, because that possibility is
there lurking in the background, there is no requirement that the
program be accepted, and could be rejected by a conforming compiler.
Indeed. Further, I believe that the same is true for the first
program, as well.
It isn't. In the first program the offending expression is never
evaluated, because foo() is never called.
On 04/06/2026 17:18, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 04/06/2026 15:18, Bart wrote:
(Note that C has its own problems in this area:
    a = b/*p;      // divide b by dereferenced pointer p
Here, /* also happens to start a block comment.)
Here you are objectively wrong. C does not have a "problem" with
this. The parsing rules of the language are clear - often called
"maximum munch". The character sequence "/*" is the start of a
comment, it is not two separate operators.
This is where it falls down. It's very clearly a 'gotcha', and
consequence of poorly thought-out design.
It is neither a "gotcha", nor a consequence of poor design.
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
How does that not make it bad design?
The preprocessor would strip everything from the /* until the next
matching */, so a chunk of your program goes missing.
On 04/06/2026 15:27, David Brown wrote:
On 04/06/2026 15:18, Bart wrote:
It is an objective fact, therefore, that "(a*a) + (b*b)" has more
parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a
purely subjective opinion.
So, you're arguing 'more than needed' is a completely different thing
from 'too many'.
Of course they are different things - albeit related things, rather
than /completely/ different. One is a question of fact, the other a
question of opinion, and they do not always coincide.
It is a fact that "a << (b + c)" has more parentheses than needed.
But I think we are both of the opinion that it does not have "too
many" parentheses - it has an appropriate number of parentheses.
So saying 'too many' of something will be a subjective opinion?
OK, so let's try compiling this bit of C:
  void F(int, int);
  int main() {
      F(1, 2, 3);
  }
8 out of 9 compilers reported 'Too many arguments'.
[...]
I think we'll leave it here.
[...]
* The () in '(a << b) + c' are necessary if the intent is to have
  what might be the more intuitive meaning.
The parentheses in "(a << b) + c" are necessary if the intent is to
shift "a" by "b", and then add "c" to the result. That is fact, not
opinion. Any discussion of "intuitive" is necessarily subjective.
Intuitive because here << performs the same scaling function as multiply:
  a << b   is the same as a * 2**b
  a * b    is the same as a << log2(b) when b is a power of two
           (or thereabouts!)
The point is: they naturally belong together.
Given 'a * 8 + b' or 'a << 3 + b', it is desirable to freely convert one
to the other without having to restructure the parentheses.
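A minimal sketch of why the conversion is not free in C, assuming unsigned operands small enough that no shift count reaches the width of the type. The function names are hypothetical, chosen only for this illustration:

```c
/* '*' binds tighter than '+', so this is (a * 8) + b. */
unsigned scaled_mul(unsigned a, unsigned b)
{
    return a * 8 + b;
}

/* '+' binds tighter than '<<' in C, so this is a << (3 + b),
   not the shifted counterpart of scaled_mul. */
unsigned shift_bare(unsigned a, unsigned b)
{
    return a << 3 + b;
}

/* Parentheses are needed to get the same value as scaled_mul. */
unsigned shift_paren(unsigned a, unsigned b)
{
    return (a << 3) + b;
}
```

With a = 1 and b = 2, scaled_mul and shift_paren both give 10, while shift_bare gives 1 << 5, which is 32. Some compilers warn about the unparenthesised form when warnings are enabled.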
[...]
On 04/06/2026 15:27, David Brown wrote:
On 04/06/2026 15:18, Bart wrote:
It is an objective fact, therefore, that "(a*a) + (b*b)" has more
parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a
purely subjective opinion.
So, you're arguing 'more than needed' is a completely different thing
from 'too many'.
Of course they are different things - albeit related things, rather
than /completely/ different. One is a question of fact, the other a
question of opinion, and they do not always coincide.
It is a fact that "a << (b + c)" has more parentheses than needed.
But I think we are both of the opinion that it does not have "too
many" parentheses - it has an appropriate number of parentheses.
So saying 'too many' of something will be a subjective opinion? OK, so
let's try compiling this bit of C:
  void F(int, int);
  int main() {
      F(1, 2, 3);
  }
8 out of 9 compilers reported 'Too many arguments'.
According to you, that's only their subjective opinion, not an objective fact?
My mind-reading skills are not that well developed.
It didn't stop you giving an opinion about what you thought he meant!
Bart <bc@freeuk.com> writes:
On 04/06/2026 17:18, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 04/06/2026 15:18, Bart wrote:
(Note that C has its own problems in this area:
    a = b/*p;      // divide b by dereferenced pointer p
Here, /* also happens to start a block comment.)
Here you are objectively wrong. C does not have a "problem" with
this. The parsing rules of the language are clear - often called
"maximum munch". The character sequence "/*" is the start of a
comment, it is not two separate operators.
This is where it falls down. It's very clearly a 'gotcha', and
consequence of poorly thought-out design.
It is neither a "gotcha", nor a consequence of poor design.
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
How does that not make it bad design?
The preprocessor would strip everything from the /* until the next
matching */, so a chunk of your program goes missing.
Whatcha talkin' 'bout willis?
On 04/06/2026 13:35, David Brown wrote:
On 04/06/2026 13:40, Bart wrote:
On 04/06/2026 10:34, Tim Rentsch wrote:
Bart <bc@freeuk.com> writes:
My point was that it could be objective, at least for too many. So
(a*a) + (b*b) would be commonly agreed to have too many, [...]
Apparently you misunderstand what is meant by the word objective.
An objective statement is one that is independent of personal
assessment, even collective personal assessment.
I don't know of any infix PL syntax where 'a*a + b*b', as a
standalone expression, doesn't mean '(a*a) + (b*b)'.
Google agrees with me (in that 2*2+3*3 shows 13), and so does my
Casio calculator.
It's not my personal opinion!
You are - again - moving the goalposts.
It is an objective fact that "a * a + b * b" means "(a * a) + (b *
b)" in normal mathematics (at least in the countries I am familiar
with), and also in most mainstream programming languages.
It is an objective fact, therefore, that "(a*a) + (b*b)" has more
parentheses than needed in the context of most programming
languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a
purely subjective opinion.
So, you're arguing 'more than needed' is a completely different thing
from 'too many'.
Sigh...
If you wrote, for example, that "a << b + c" is ambiguous in C, then
you
It is technically unambiguous in C. It can be ambiguous in the mind of somebody who would have to double-check the precedence levels, or
where the C context is missing.
The discussion seems to be about what exactly is 'too many'.
Apparently you can construct a valid C source file where 99.9% of the
text consists of () characters, but if someone - or even a million
people - say that it is too many, then that is just their subjective
opinion.
I don't have the patience for such nonsense any more:
* The () in '(a * b) + c' are generally unnecessary
* The () in 'a << (b + c)' are advisable
* The () in '(a << b) + c' are necessary if the intent is to have
  what might be the more intuitive meaning.
On 2026-06-04 18:18, Scott Lurndal wrote:
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
Curious; was the comment-handling at some point in history removed
from the Cpp-processing? - If so, when was that? And I assume the
semantics are still the same; is that correct?
On 04/06/2026 19:47, Janis Papanagnou wrote:
On 2026-06-04 18:18, Scott Lurndal wrote:
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
Curious; was the comment-handling at some point in history removed
from the Cpp-processing? - If so, when was that? And I assume the
semantics are still the same; is that correct?
No, at least since the standardisation of the C language (including K&R "standard"), "preprocessing" has been an integral part of the C language
and conversion of comments to space characters is done in phase 3 of the translation. But the C standards do not give an explicit distinction between "preprocessing" and "compiling" - just different translation
phases. (They do not define a "compiler" at all.) It is not uncommon
for implementations to separate translation into two or more programs, especially in the good old days when hosts had much less memory, but logically they are all one implementation. Distinguishing "the compiler itself" is somewhat artificial.
On 04/06/2026 17:46, Bart wrote:
On 04/06/2026 15:27, David Brown wrote:
On 04/06/2026 15:18, Bart wrote:
It is an objective fact, therefore, that "(a*a) + (b*b)" has more
parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a
purely subjective opinion.
So, you're arguing 'more than needed' is a completely different
thing from 'too many'.
Of course they are different things - albeit related things, rather
than /completely/ different. One is a question of fact, the other a
question of opinion, and they do not always coincide.
It is a fact that "a << (b + c)" has more parentheses than needed.
But I think we are both of the opinion that it does not have "too
many" parentheses - it has an appropriate number of parentheses.
So saying 'too many' of something will be a subjective opinion? OK, so
let's try compiling this bit of C:
   void F(int, int);
   int main() {
       F(1, 2, 3);
   }
8 out of 9 compilers reported 'Too many arguments'.
According to you, that's only their subjective opinion, not an
objective fact?
Again - /please/ stop trying to guess what people say or put words in
their mouths. I can't remember ever seeing you do so accurately.
It is an objective fact, therefore, that "(a*a) + (b*b)" has more parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a purely subjective opinion. Even if it is true that this is "commonly agreed
to" (and AFAIK you have no basis for that claim), that would still be a subjective opinion - no matter how common that opinion is.
"Too many parentheses" is subjective, because they affect the ease of reading the code as a human reader.
On Thu, 04 Jun 2026 16:18:07 +0000, Scott Lurndal wrote:
[snip]
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
So, I've looked through "The C Programming Language" (the K&R C)
and the paper "A Tour Through the Portable C Compiler" (S. C.
Johnson, circa 1974), and neither document states that the
preprocessor strips comments. In fact, the mentions of the
preprocessor are exclusively about the #operation operators,
and not about C comments.
In article <10vsh43$b3is$1@dont-email.me>,
Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
On Thu, 04 Jun 2026 16:18:07 +0000, Scott Lurndal wrote:
[snip]
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
So, I've looked through "The C Programming Language" (the K&R C)
and the paper "A Tour Through the Portable C Compiler" (S. C.
Johnson, circa 1974), and neither document states that the
preprocessor strips comments. In fact, the mentions of the
preprocessor are exclusively about the #operation operators,
and not about C comments.
The PDP-11 compiler from 5th Edition research Unix removes
comments in `cc.c`. The 1972 compilers from Dennis Ritchie's
web page remove them in the compiler proper, as they predated
the preprocessor:
https://www.nokia.com/bell-labs/about/dennis-m-ritchie/primevalC.html
On 2026-06-04 18:18, Scott Lurndal wrote:
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
Curious; was the comment-handling at some point in history removed
from the Cpp-processing? - If so, when was that? And I assume the
semantics are still the same; is that correct?
On 04/06/2026 17:47, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 04/06/2026 17:18, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 04/06/2026 15:18, Bart wrote:
(Note that C has its own problems in this area:
    a = b/*p;      // divide b by dereferenced pointer p
Here, /* also happens to start a block comment.)
Here you are objectively wrong.  C does not have a "problem" with
this. The parsing rules of the language are clear - often called
"maximum munch".  The character sequence "/*" is the start of a
comment, it is not two separate operators.
This is where it falls down. It's very clearly a 'gotcha', and
consequence of poorly thought-out design.
It is neither a "gotcha", nor a consequence of poor design.
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
How does that not make it bad design?
The preprocessor would strip everything from the /* until the next
matching */, so a chunk of your program goes missing.
Whatcha talkin' 'bout willis?
What were /you/ talking about? What was your point?
In article <865x3yd21n.fsf@linuxsc.com>,[...]
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <86ik81cfk5.fsf_-_@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
There's an important distinction to make here. Consider this
program:
#include <limits.h>
int
foo(){
    int zero = (INT_MAX+1)*0;
    return zero;
}
int
main(){
    return 0;
}
This program does not transgress the bounds of undefined behavior.
To clarify, the comments in my posting were meant to be read as
saying the given text is the entire program, and that it is strictly
conforming with respect to conforming hosted implementations.
(Incidentally, given the rules for freestanding implementations, I'm
not sure that it is even possible for any program to be strictly
conforming with respect to conforming freestanding implementations.
In any case my statements were meant only in the context of hosted
implementations.)
Ok.
[snip]
Perhaps you mean that this is irrelevant because `foo` is not
invoked, but I see no reason why that need be the case in e.g.
a freestanding environment.
I explained the context of my previous statements above. Sorry for
not saying that in the original message.
In a hosted environment, I don't
think anything explicitly prevents `foo` from being called after
`main` returns (though I can't imagine that would happen in real
life; it would be weird if it did).
The semantics described in the ISO C standard don't admit that
possibility.
Could you please point to where it says this, in the C standard?
I cannot find anything that says that arbitrary code cannot run
after `main()` returns, and I don't see how that could possibly
be true.
Whether foo() has external linkage or internal
linkage doesn't change that.
I disagree. There's no possible way for the implementation to
know whether a function with external linkage will be ultimately
invoked or not; consider a system that supports loadable shared
modules. Nothing prevents even this simple program from being
compiled as a shared module, dynamically loaded, the loading
program explicitly searching for and finding the symbol
corresponding to the `foo` function, and invoking it.
Hence, the compiler _must_ treat with UB as written, which is
why `ubsan` inserts trapping code in `foo`.
In your example, `foo` clearly exhibits UB; I think your
argument is whether that has a realized effect or not, since the
UB is not invoked. I'm saying that in general a compiler cannot
possibly know that when it compiles `foo`, and is free to assume
the worst.
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <10vsh43$b3is$1@dont-email.me>,
Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
On Thu, 04 Jun 2026 16:18:07 +0000, Scott Lurndal wrote:
[snip]
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
So, I've looked through "The C Programming Language" (the K&R C)
and the paper "A Tour Through the Portable C Compiler" (S. C.
Johnson, circa 1974), and neither document states that the
preprocessor strips comments. In fact, the mentions of the
preprocessor are exclusively about the #operation operators,
and not about C comments.
The PDP-11 compiler from 5th Edition research Unix removes
comments in `cc.c`. The 1972 compilers from Dennis Ritchie's
web page remove them in the compiler proper, as they predated
the preprocessor:
https://www.nokia.com/bell-labs/about/dennis-m-ritchie/primevalC.html
The v6 cpp.c processes the comments
and deletes them if the 'passcom' (-C) flag is not set.
[snip]
In article <sglUR.17897$pxGb.10844@fx07.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <10vsh43$b3is$1@dont-email.me>,
Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
On Thu, 04 Jun 2026 16:18:07 +0000, Scott Lurndal wrote:
[snip]
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
So, I've looked through "The C Programming Language" (the K&R C)
and the paper "A Tour Through the Portable C Compiler" (S. C.
Johnson, circa 1974), and neither document states that the
preprocessor strips comments. In fact, the mentions of the
preprocessor are exclusively about the #operation operators,
and not about C comments.
The PDP-11 compiler from 5th Edition research Unix removes
comments in `cc.c`. The 1972 compilers from Dennis Ritchie's
web page remove them in the compiler proper, as they predated
the preprocessor:
https://www.nokia.com/bell-labs/about/dennis-m-ritchie/primevalC.html
The v6 cpp.c processes the comments
and deletes them if the 'passcom' (-C) flag is not set.
[snip]
You sure? That looks like V7 code to me.
On 04/06/2026 19:54, David Brown wrote:[...]
Again - /please/ stop trying to guess what people say or put words
in their mouths.  I can't remember ever seeing you do so accurately.
This is what you actually said:
It is an objective fact, therefore, that "(a*a) + (b*b)" has more
parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a purely
subjective opinion. Even if it is true that this is "commonly agreed
to" (and AFAIK you have no basis for that claim), that would still be a
subjective opinion - no matter how common that opinion is.
You're saying that:
* "more than needed" is objective
* "too many" is subjective
"Too many parentheses" is subjective, because they affect the ease
of reading the code as a human reader.
And 'more than needed' isn't that?!
No, this is just getting ludicrous and suggests not wanting to tackle
the real subject: should people write '(a << b) & c' or 'a << b & c'?
Tim Rentsch I'm sure will prefer the latter because 99.9% of C
programmers are machines, according to him.
Presumably, the same 99.9% will not use indentation, and will write
their programs all on one line anyway, because it is still after all completely unambiguous according to the C standard!
One advantage of having a single program do the whole thing, is that
error messages can mention the actual text of the line where a problem
was detected, without any pre-processing applied.
Bart <bc@freeuk.com> writes:
On 04/06/2026 17:47, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 04/06/2026 17:18, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 04/06/2026 15:18, Bart wrote:
(Note that C has its own problems in this area:
    a = b/*p;      // divide b by dereferenced pointer p
Here, /* also happens to start a block comment.)
Here you are objectively wrong.  C does not have a "problem" with
this. The parsing rules of the language are clear - often called
"maximum munch".  The character sequence "/*" is the start of a
comment, it is not two separate operators.
This is where it falls down. It's very clearly a 'gotcha', and
consequence of poorly thought-out design.
It is neither a "gotcha", nor a consequence of poor design.
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
How does that not make it bad design?
The preprocessor would strip everything from the /* until the next
matching */, so a chunk of your program goes missing.
Whatcha talkin' 'bout willis?
What were /you/ talking about? What was your point?
Your inaccurate characterization that a chunk of the program
went "missing". Nothing meaningful is missing (and the comment
remains in the original source file).
So what do you mean, exactly, when you claim that the output of
the preprocessor is missing a chunk of the program (which doesn't
include whitespace or comments)?
Bart <bc@freeuk.com> writes:
On 04/06/2026 19:54, David Brown wrote:[...]
Again - /please/ stop trying to guess what people say or put words
in their mouths.  I can't remember ever seeing you do so accurately.
This is what you actually said:
It is an objective fact, therefore, that "(a*a) + (b*b)" has more
parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a purely
subjective opinion. Even if it is true that this is "commonly agreed
to" (and AFAIK you have no basis for that claim), that would still be a
subjective opinion - no matter how common that opinion is.
You're saying that:
* "more than needed" is objective
* "too many" is subjective
Stop it. He's not saying that.
You're taking phrases out of context and making false claims that the
full statement was far more general than it actually was.
Nobody said or implied that "too many" is always subjective.
"Too many parentheses" is subjective, because they affect the ease
of reading the code as a human reader.
And 'more than needed' isn't that?!
More than needed *for what*? Without that context, we can't tell
whether "more than needed" is subjective or objective.
You know all this.
[...]
No, this is just getting ludicrous and suggests not wanting to tackle
the real subject: should people write '(a << b) & c' or 'a << b & c'?
Oh, is that the real subject?
I presume you prefer `(a << b) & c` to `a << b & c`.
So do I.
Others might or might not have different opinions. If that was the
"real subject", we've wasted a lot of time debating the difference
between subjectivity and objectivity.
Tim Rentsch I'm sure will prefer the latter because 99.9% of C
programmers are machines, according to him.
Tim didn't say or imply that.
Presumably, the same 99.9% will not use indentation, and will write
their programs all on one line anyway, because it is still after all
completely unambiguous according to the C standard!
Of course not, because 99.9% of C programmers are not idiots.
Your record of guessing incorrectly what other people think is
unbroken. I suggest you stop trying.
On 04/06/2026 21:34, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 04/06/2026 17:47, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 04/06/2026 17:18, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 04/06/2026 15:18, Bart wrote:
(Note that C has its own problems in this area:
    a = b/*p;      // divide b by dereferenced pointer p
Here, /* also happens to start a block comment.)
Here you are objectively wrong.  C does not have a "problem" with
this. The parsing rules of the language are clear - often called
"maximum munch".  The character sequence "/*" is the start of a
comment, it is not two separate operators.
This is where it falls down. It's very clearly a 'gotcha', and
consequence of poorly thought-out design.
It is neither a "gotcha", nor a consequence of poor design.
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
How does that not make it bad design?
The preprocessor would strip everything from the /* until the next
matching */, so a chunk of your program goes missing.
Whatcha talkin' 'bout willis?
What were /you/ talking about? What was your point?
Your inaccurate characterization that a chunk of the program
went "missing". Nothing meaningful is missing (and the comment
remains in the original source file).
So what do you mean, exactly, when you claim that the output of
the preprocessor is missing a chunk of the program (which doesn't
include whitespace or comments)?
This is the example I gave elsewhere:
---------------------------
There are actually other issues associated with /**/ comments; here
someone forgot to terminate the first comment:
puts("one"); /* comment 1
puts("two"); /* comment 2 */
puts("three"); /* comment 3 */
---------------------------
After preprocessing you're left with this:
puts("one");
puts("three");
That middle puts call is missing, and it's meant to be part of the program.
And 'more than needed' isn't that?!
There are actually other issues associated with /**/ comments; here
someone forgot to terminate the first comment:
    puts("one");    /* comment 1
    puts("two");    /* comment 2 */
    puts("three");  /* comment 3 */
Jesus, the subthread has been going long enough.
On 04/06/2026 22:06, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 04/06/2026 19:54, David Brown wrote:[...]
Again - /please/ stop trying to guess what people say or put words
in their mouths.  I can't remember ever seeing you do so accurately.
This is what you actually said:
It is an objective fact, therefore, that "(a*a) + (b*b)" has more
parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a purely
subjective opinion. Even if it is true that this is "commonly agreed
to" (and AFAIK you have no basis for that claim), that would still be a
subjective opinion - no matter how common that opinion is.
You're saying that:
* "more than needed" is objective
* "too many" is subjective
Stop it. He's not saying that.
That is EXACTLY what he's saying: "It is an OBJECTIVE fact .. has more
... than needed", and:
"has too many ... is ... purely subjective".
You're taking phrases out of context and making false claims that the
full statement was far more general than it actually was.
And this is exactly what other people are doing.
So I used TOO MANY instead of MORE THAN NEEDED to describe the exact
same phenomenon.
(1) Why are you all making such a big fucking deal of this?
(2) Why are you all sticking up for each other?
(3) Why don't you discuss the fucking subject instead of going
down these pointless rabbit holes?
It is about how many brackets are too many, more than needed,
superfluous to requirements, etc etc etc.
Presumably, the same 99.9% will not use indentation, and will write
their programs all on one line anyway, because it is still after all
completely unambiguous according to the C standard!
Of course not, because 99.9% of C programmers are not idiots.
Your record of guessing incorrectly what other people think is
unbroken. I suggest you stop trying.
This is what Tim said:
"If someone really can't learn the rules of expression syntax for the language they are using, they should be advised to try a different
language, or perhaps give up programming altogether. It's silly to
worry about something that 999 people out of a 1000 (and the actual
numbers are undoubtedly much higher) are able to navigate without difficulty."
It sounds to me very much as though he expects 99.9% to know all C's precedences by heart and to never need to use superfluous brackets (or
'more than needed' if 'superfluous' is still too subjective).
But of course, I am wrong and he is right, and you will defend his
view (a subjective one) to the death.
Bart <bc@freeuk.com> writes:
On 04/06/2026 22:06, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 04/06/2026 19:54, David Brown wrote:[...]
Again - /please/ stop trying to guess what people say or put words
in their mouths.  I can't remember ever seeing you do so accurately.
This is what you actually said:
It is an objective fact, therefore, that "(a*a) + (b*b)" has more
parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a purely
subjective opinion. Even if it is true that this is "commonly agreed
to" (and AFAIK you have no basis for that claim), that would still be a
subjective opinion - no matter how common that opinion is.
You're saying that:
* "more than needed" is objective
* "too many" is subjective
Stop it. He's not saying that.
That is EXACTLY what he's saying: "It is an OBJECTIVE fact .. has more
... than needed", and:
"has too many ... is ... purely subjective".
You're taking phrases out of context and making false claims that the
full statement was far more general than it actually was.
And this is exactly what other people are doing.
Taken literally, your statement implies that you admit that that's
what you're doing. Is that what you meant? If so, I suggest you
*stop* making such false claims. If not, what did you actually mean?
So I used TOO MANY instead of MORE THAN NEEDED to describe the exact
same phenomenon.
That's not the problem. There is an actual meaningful distinction
here, between what's needed by the compiler and what's useful to
improve clarity for human readers. I have found some of what you've
written to be unclear about that distinction.
Can we agree that the question of whether parentheses in a C
expression are necessary to the compiler can be answered objectively?
Can we agree that the question of whether extra parentheses are
helpful to a human reader is at least partly subjective, and
varies from case to case? Is there really anything else that we fundamentally disagree about?
(1) Why are you all making such a big fucking deal of this?
Why are you?
Why are you?
In article <865x3yd21n.fsf@linuxsc.com>,[...]
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <86ik81cfk5.fsf_-_@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
There's an important distinction to make here. Consider this
program:
#include <limits.h>
int
foo(){
    int zero = (INT_MAX+1)*0;
    return zero;
}
int
main(){
    return 0;
}
This program does not transgress the bounds of undefined behavior.
To clarify, the comments in my posting were meant to be read as
saying the given text is the entire program, and that it is strictly
conforming with respect to conforming hosted implementations.
(Incidentally, given the rules for freestanding implementations, I'm
not sure that it is even possible for any program to be strictly
conforming with respect to conforming freestanding implementations.
In any case my statements were meant only in the context of hosted
implementations.)
Ok.
[snip]
Perhaps you mean that this is irrelevant because `foo` is not
invoked, but I see no reason why that need be the case in e.g.
a freestanding environment.
I explained the context of my previous statements above. Sorry for
not saying that in the original message.
In a hosted environment, I don't
think anything explicitly prevents `foo` from being called after
`main` returns (though I can't imagine that would happen in real
life; it would be weird if it did).
The semantics described in the ISO C standard don't admit that
possibility.
Could you please point to where it says this, in the C standard?
I cannot find anything that says that arbitrary code cannot run
after `main()` returns, and I don't see how that could possibly
be true.
N3220 5.1.2.4, Program semantics.
It defines the *observable behavior* of a program, which consists of
accesses to volatile objects, data written to files, and I/O dynamics of
interactive devices.
If the usual "Hello, world" program prints "Hello, world" followed
by "Goodbye", the implementation is non-conforming. If it formats
my hard drive after printing "Goodbye", it's non-conforming and
dangerous.
Whether foo() has external linkage or internal
linkage doesn't change that.
I disagree. There's no possible way for the implementation to
know whether a function with external linkage will be ultimately
invoked or not; consider a system that supports loadable shared
modules. Nothing prevents even this simple program from being
compiled as a shared module, dynamically loaded, the loading
program explicitly searching for and finding the symbol
corresponding to the `foo` function, and invoking it.
Remember that linking is translation phase 8. The compiler is not
the entire implementation.
Hence, the compiler _must_ treat with UB as written, which is
why `ubsan` inserts trapping code in `foo`.
I don't know what "_must_ treat with UB" means.
foo() has undefined behavior if it's called, so replacing its
body with trapping code is valid. But (I'm reasonably sure that)
an implementation cannot reject a program just because it can't
prove that it has no undefined behavior during execution. It can
reject it if it can prove that it *always* has undefined behavior
during execution.
In your example, `foo` clearly exhibits UB; I think your
argument is whether that has a realized effect or not, since the
UB is not invoked. I'm saying that in general a compiler cannot
possibly know that when it compiles `foo`, and is free to assume
the worst.
foo() exhibits UB if and only if it's called during execution.
Yes, a compiler can't know whether foo() will be called.
An implementation, particularly a linker, might know, but is not
required to. No, it is not free to assume the worst.
I certainly wouldn't want a compiler to reject `1/time(NULL)`
because it can't prove that time(NULL) won't be zero, or reject
`argc+1` because it can't prove that argc < INT_MAX. Code whose
behavior would be undefined if it were executed has no behavior
(and therefore no UB) if it's not executed.
James Kuyper <jameskuyper@alumni.caltech.edu> writes:
[...]
One advantage of having a single program do the whole thing, is that
error messages can mention the actual text of the line where a problem
was detected, without any pre-processing applied.
Typical preprocessors emit directives that tell the compiler about
the current file name and line number, precisely so that diagnostic
messages can refer to the original text.
For example:
$ cat hello.c
#include <stdio.h>
int main(void) {
printf("Hello world!\n");
}
$ gcc -E hello.c | tail
extern int __uflow (FILE *);
extern int __overflow (FILE *, int);
# 983 "/usr/include/stdio.h" 3 4
# 2 "hello.c" 2
# 2 "hello.c"
int main(void) {
printf("Hello world!\n");
}
$
The line `# 2 "hello.c"` is, according to the C standard, a
"non-directive", which is a kind of directive. Executing a
non-directive has undefined behavior, but gcc apparently treats it
very much like a #line directive.
It doesn't really matter whether the preprocessor is a separate program
or not.
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <sglUR.17897$pxGb.10844@fx07.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <10vsh43$b3is$1@dont-email.me>,
Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
On Thu, 04 Jun 2026 16:18:07 +0000, Scott Lurndal wrote:
[snip]
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
So, I've looked through "The C Programming Language" (the K&R C)
and the paper "A Tour Through the Portable C Compiler" (S. C.
Johnson, circa 1974), and neither document states that the
preprocessor strips comments. In fact, the mentions of the
preprocessor are exclusively about the #operation operators,
and not about C comments.
The PDP-11 compiler from 5th Edition research Unix removes
comments in `cc.c`. The 1972 compilers from Dennis Ritchie's
web page remove them in the compiler proper, as they predated
the preprocessor:
https://www.nokia.com/bell-labs/about/dennis-m-ritchie/primevalC.html
The v6 cpp.c processes the comments
and deletes them if the 'passcom' (-C) flag is not set.
[snip]
You sure? That looks like V7 code to me.
Yes, it is. I didn't have a machine readable version of the
v6 compiler handy. Dug it out and here's the v6 version.
getch()
{
    register int c, lastst;
    while ((c=getc1())=='/' && !instring)
    {
        if ((c=getc1())!='*')
        {
            pushback(c);
            return('/');
        }
        if (!skipcom)
            {putc('/',fout); putc('*', fout);}
        lastst=0;
        while ( (c = getc1()) != '\0')
        {
            if (lastst && c=='/')
            {
                if (!skipcom)
                    putc('/', fout);
                break;
            }
            if (c=='\n' || !skipcom)
                putc(c, fout);
            lastst = (c=='*');
        }
        if (c=='\0')break;
    }
    return(c);
}
In article <8xlUR.17899$pxGb.16870@fx07.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <sglUR.17897$pxGb.10844@fx07.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <10vsh43$b3is$1@dont-email.me>,
Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
On Thu, 04 Jun 2026 16:18:07 +0000, Scott Lurndal wrote:
[snip]
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
So, I've looked through "The C Programming Language" (the K&R C)
and the paper "A Tour Through the Portable C Compiler" (S. C.
Johnson, circa 1974), and neither document states that the
preprocessor strips comments. In fact, the mentions of the
preprocessor are exclusively about the #operation operators,
and not about C comments.
The PDP-11 compiler from 5th Edition research Unix removes
comments in `cc.c`. The 1972 compilers from Dennis Ritchie's
web page remove them in the compiler proper, as they predated
the preprocessor:
https://www.nokia.com/bell-labs/about/dennis-m-ritchie/primevalC.html
The v6 cpp.c processes the comments
and deletes them if the 'passcom' (-C) flag is not set.
[snip]
You sure? That looks like V7 code to me.
Yes, it is. I didn't have a machine readable version of the
v6 compiler handy. Dug it out and here's the v6 version.
getch()
{
    register int c, lastst;
    while ((c=getc1())=='/' && !instring)
    {
        if ((c=getc1())!='*')
        {
            pushback(c);
            return('/');
        }
        if (!skipcom)
            {putc('/',fout); putc('*', fout);}
        lastst=0;
        while ( (c = getc1()) != '\0')
        {
            if (lastst && c=='/')
            {
                if (!skipcom)
                    putc('/', fout);
                break;
            }
            if (c=='\n' || !skipcom)
                putc(c, fout);
            lastst = (c=='*');
        }
        if (c=='\0')break;
    }
    return(c);
}
Yeah, that's from `cc.c`, right?
On 05/06/2026 00:09, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 04/06/2026 22:06, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 04/06/2026 19:54, David Brown wrote:[...]
Again - /please/ stop trying to guess what people say or put words
in their mouths.  I can't remember ever seeing you do so accurately.
This is what you actually said:
It is an objective fact, therefore, that "(a*a) + (b*b)" has more
parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a purely
subjective opinion.  Even if it is true that this is "commonly agreed
to" (and AFAIK you have no basis for that claim), that would still be a
subjective opinion - no matter how common that opinion is.
You're saying that:
*  "more than needed" is objective
*  "too many" is subjective
Stop it.  He's not saying that.
That is EXACTLY what he's saying: "It is an OBJECTIVE fact .. has more
... than needed", and:
  "has too many ... is ... purely subjective".
You're taking phrases out of context and making false claims that the
full statement was far more general than it actually was.
And this is exactly what other people are doing.
Taken literally, your statement implies that you admit that that's
what you're doing.  Is that what you meant?  If so, I suggest you
*stop* making such false claims.  If not, what did you actually mean?
So I used TOO MANY instead of MORE THAN NEEDED to describe the exact
same phenomenon.
That's not the problem.  There is an actual meaningful distinction
here, between what's needed by the compiler and what's useful to
improve clarity for human readers.  I have found some of what you've
written to be unclear about that distinction.
Can we agree that the question of whether parentheses in a C
expression are necessary to the compiler can be answered objectively?
Can we agree that the question of whether extra parentheses are
helpful to a human reader is at least partly subjective, and
varies from case to case?  Is there really anything else that we
fundamentally disagree about?
(1) Why are you all making such a big fucking deal of this?
Why are you?
I didn't start this business of something being subjective or objective,
or suggesting that one turn of phrase to discuss the same thing was subjective and the other objective (implying that a subjective opinion
had less worth). TR started that and several people backed him up.
Myself I wouldn't even use those terms. My point was that some overuses
of () for commonly known precedences are more overkill than others.
If that's subjective then so be it; it is not some fundamental law of
the universe. I would just call it common sense.
Why are you?
Since you ask, I was defending my point of view then got sidetracked by
this subjective/objective nonsense. I notice that TR has disappeared
from this subthread.
In article <10vsnl7$lkmu$1@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <865x3yd21n.fsf@linuxsc.com>,[...]
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <86ik81cfk5.fsf_-_@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
There's an important distinction to make here. Consider this
program:
#include <limits.h>
int
foo(){
    int zero = (INT_MAX+1)*0;
    return zero;
}
int
main(){
    return 0;
}
This program does not transgress the bounds of undefined behavior.
To clarify, the comments in my posting were meant to be read as
saying the given text is the entire program, and that it is strictly
conforming with respect to conforming hosted implementations.
(Incidentally, given the rules for freestanding implementations, I'm
not sure that it is even possible for any program to be strictly
conforming with respect to conforming freestanding implementations.
In any case my statements were meant only in the context of hosted
implementations.)
Ok.
[snip]
Perhaps you mean that this is irrelevant because `foo` is not
invoked, but I see no reason why that need be the case in e.g.
a freestanding environment.
I explained the context of my previous statements above. Sorry for
not saying that in the original message.
In a hosted environment, I don't
think anything explicitly prevents `foo` from being called after
`main` returns (though I can't imagine that would happen in real
life; it would be weird if it did).
The semantics described in the ISO C standard don't admit that
possibility.
Could you please point to where it says this, in the C standard?
I cannot find anything that says that arbitrary code cannot run
after `main()` returns, and I don't see how that could possibly
be true.
N3220 5.1.2.4, Program semantics.
It defines the *observable behavior* of a program, which consists of
accesses to volatile objects, data written to files, and I/O dynamics of
interactive devices.
Yes, but it does so for strictly-conforming programs with no UB.
To understand conformance, we have to jump over to section 4,
which explicitly says that, 'Undefined behavior is otherwise
indicated in this document by the words "undefined behavior" or
by the omission of any explicit definition of behavior.' As it
does not say that a program with an instance of undefined
behavior in an integer constant expression that is not executed
must otherwise behave in any given manner, what the program does
is undefined. A constraint violation mandates a diagnostic, but
beyond that, the standard is (AFAICT) silent.
Undefined Behavior, in turn, is not defined as specific only to
execution: the standard simply says that it is "behavior, upon
use of a *nonportable or erroneous program construct*..." for
which there are no requirements, and there are examples of
things that are explicitly UB at translation time, such as
improperly terminated lexemes and so forth.
Furthermore, the expression above is obviously an integer
constant expression as defined by sec 6.6 para 8. Section 6.6,
para 4, reads in part, "Each constant expression shall evaluate
to a constant that is in the range of representable values for
its type." The expression, `(INT_MAX+1)*0` violates this
constraint, and so therefore a diagnostic is mandated as per
sec 5.1.1.3 para 1. That it appears in code that is not
obviously called from `main` doesn't change that.
Moreover, sec 6.6 para 17 says that, "the semantic rules for
evaluation of a constant expression are the same as for
nonconstant expressions." This brings us back to 5.1.2.4,
though I submit that para (4) is a stronger argument for what
you and Tim are saying, as it reads in part, "An actual
implementation is not required to evaluate part of an expression
if it can deduce that its value is not used and that no needed
side effects are produced (including any caused by calling a
function or through volatile access to an object)." I interpret
this to mean that, if the implementation can determine that
there is no way that `foo` can be called, it does not _have_ to
evaluate the above expression. However, it must satisfy the
range constraint from section 6.6, so it likely will, and in any
event, the standard does not say that it, "shall not" evaluate
it, or when.
Once the compiler does that, if it does, and observes UB, the
standard is silent on what requirements it imposes, which means
the behavior is undefined. I see no reason it couldn't arrange
to invoke `foo` at that point.
So no, I do not see how execution according to the rules of the
abstract machine is not guaranteed, here. I certainly see no
way in which this can be regarded as a strictly conforming
program.
If the usual "Hello, world" program prints "Hello, world" followed
by "Goodbye", the implementation is non-conforming. If it formats
my hard drive after printing "Goodbye", it's non-conforming and
dangerous.
Two separate things. My point earlier was that code can
obviously run after `main` terminates. Moreover, I can't
imagine what would _prevent_ a runtime system that invokes
`main` from doing something like printing, "PROGRAM STOPPED"
after `main` returned. C imposes no requirements here.
Whether `foo` could be invoked after, I think, is undefined.
Whether foo() has external linkage or internal
linkage doesn't change that.
I disagree. There's no possible way for the implementation to
know whether a function with external linkage will be ultimately
invoked or not; consider a system that supports loadable shared
modules. Nothing prevents even this simple program from being
compiled as a shared module, dynamically loaded, the loading
program explicitly searching for and finding the symbol
corresponding to the `foo` function, and invoking it.
Remember that linking is translation phase 8. The compiler is not
the entire implementation.
Exactly my point. The compiler cannot know how `foo` might be
used, or how the translated object might be exercised. There's
I don't see how it could possibly know that, given that `foo`
has external linkage.
Hence, the compiler _must_ treat with UB as written, which is
why `ubsan` inserts trapping code in `foo`.
I don't know what "_must_ treat with UB" means.
foo() has undefined behavior if it's called, so replacing its
body with trapping code is valid. But (I'm reasonably sure that)
an implementation cannot reject a program just because it can't
prove that it has no undefined behavior during execution. It can
reject it if it can prove that it *always* has undefined behavior
during execution.
What I'm saying is that, `foo` has undefined behavior _period_.
That's manifest in an integer constant expression, whether it is
executed at runtime or not. I believe that the standard forces
the expression to be evaluated at translation time, via the
"shall" mandate when checking the constraint on the range in sec
6.6 para 4. Further, that evaluation must happen in accordance
with the rules of the abstract machine, as per 5.1.2.4 para 17.
The diagnostic is mandated, as is the translation-time
evaluation. The expression itself manifestly exhibits UB,
and so therefore the result of the rest of the translation is
undefined.
In article <10vspuu$lkmu$3@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
James Kuyper <jameskuyper@alumni.caltech.edu> writes:
[...]
One advantage of having a single program do the whole thing, is that
error messages can mention the actual text of the line where a problem
was detected, without any pre-processing applied.
Typical preprocessors emit directives that tell the compiler about
the current file name and line number, precisely so that diagnostic
messages can refer to the original text.
For example:
$ cat hello.c
#include <stdio.h>
int main(void) {
printf("Hello world!\n");
}
$ gcc -E hello.c | tail
extern int __uflow (FILE *);
extern int __overflow (FILE *, int);
# 983 "/usr/include/stdio.h" 3 4
# 2 "hello.c" 2
# 2 "hello.c"
int main(void) {
printf("Hello world!\n");
}
$
The line `# 2 "hello.c"` is, according to the C standard, a
"non-directive", which is a kind of directive. Executing a
non-directive has undefined behavior, but gcc apparently treats it
very much like a #line directive.
It doesn't really matter whether the preprocessor is a separate program
or not.
In fairness to Kuyper, however, the *text* from the original
source file is lost. E.g.,
term% cat n.c
#include <stdio.h>
#define FOO "hi"; // Note trailing `;`
int
main(void)
{
printf("%s\n", FOO);
return 0;
}
term% clang -fkeep-system-includes -E n.c
# 1 "n.c"
# 1 "<built-in>" 1
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "n.c" 2
#include <stdio.h> /* clang -E -fkeep-system-includes */
# 1 "n.c"
# 2 "n.c" 2
int
main(void)
{
printf("%s\n", "hi";);
return 0;
}
term%
In this example, the preprocessor macro `FOO` has been lost, and
only its expansion remains. The compiler has no information to
give a useful diagnostic.
On 04/06/2026 22:06, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[snip]
Tim Rentsch I'm sure will prefer the latter because 99.9% of C
programmers are machines, according to him.
Tim didn't say or imply that.
So what was his 99.9% all about? Nobody has a clue, except they are
certain that what I think it is is wrong!
Presumably, the same 99.9% will not use indentation, and will write
their programs all on one line anyway, because it is still after all
completely unambiguous according to the C standard!
Of course not, because 99.9% of C programmers are not idiots.
Your record of guessing incorrectly what other people think is
unbroken. I suggest you stop trying.
This is what Tim said:
"If someone really can't learn the rules of expression syntax for the
language they are using, they should be advised to try a different
language, or perhaps give up programming altogether. It's silly to
worry about something that 999 people out of a 1000 (and the actual
numbers are undoubtedly much higher) are able to navigate without
difficulty."
It sounds to me very much as though he expects 99.9% to know all C's
precedences by heart and to never need to use superfluous brackets (or
'more than needed' if 'superfluous' is still too subjective).
But of course, I am wrong and he is right, and you will defend his view
(a subjective one) to the death.
On 04/06/2026 21:34, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 04/06/2026 17:47, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 04/06/2026 17:18, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 04/06/2026 15:18, Bart wrote:
(Note that C has its own problems in this area:
    a = b/*p;      // divide b by dereferenced pointer p
Here, /* also happens to start a block comment.)
Here you are objectively wrong. C does not have a "problem" with
this. The parsing rules of the language are clear - often called
"maximum munch". The character sequence "/*" is the start of a
comment, it is not two separate operators.
This is where it falls down. It's very clearly a 'gotcha', and
consequence of poorly thought-out design.
It is neither a "gotcha", nor a consequence of poor design.
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
How does that not make it bad design?
The preprocessor would strip everything from the /* until the next
matching */, so a chunk of your program goes missing.
Whatcha talkin' 'bout willis?
What were /you/ talking about? What was your point?
Your inaccurate characterization that a chunk of the program
went "missing". Nothing meaningful is missing (and the comment
remains in the original source file).
So what do you mean, exactly, when you claim that the output of
the preprocessor causes a chunk of the program (which doesn't
include whitespace or comments) to go missing?
This is the example I gave elsewhere:
---------------------------
There are actually other issues associated with /**/ comments; here
someone forgot to terminate the first comment:
puts("one"); /* comment 1
puts("two"); /* comment 2 */
puts("three"); /* comment 3 */
---------------------------
After preprocessing you're left with this:
puts("one");
puts("three");
That middle puts call is missing, and it's meant to be part of the program.
This can also be a consequence of an inadvertent /* sequence such as in
'a = b/*p;'.
In article <10vspuu$lkmu$3@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
James Kuyper <jameskuyper@alumni.caltech.edu> writes:
[...]
One advantage of having a single program do the whole thing, is that
error messages can mention the actual text of the line where a problem
was detected, without any pre-processing applied.
Typical preprocessors emit directives that tell the compiler about
the current file name and line number, precisely so that diagnostic
messages can refer to the original text.
In fairness to Kuyper, however, the *text* from the original
source file is lost. E.g.,
term% cat n.c
#include <stdio.h>
#define FOO "hi"; // Note trailing `;`
int
main(void)
{
printf("%s\n", FOO);
return 0;
}
term% clang -fkeep-system-includes -E n.c
# 1 "n.c"
# 1 "<built-in>" 1
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "n.c" 2
#include <stdio.h> /* clang -E -fkeep-system-includes */
# 1 "n.c"
# 2 "n.c" 2
int
main(void)
{
printf("%s\n", "hi";);
return 0;
}
term%
In this example, the preprocessor macro `FOO` has been lost, and
only its expansion remains. The compiler has no information to
give a useful diagnostic.
Ah, but it does, as long as the original file is still there.
$ gcc -c n.c
n.c: In function 'main':
n.c:2:17: error: expected ')' before ';' token
2 | #define FOO "hi"; // Note trailing `;`
| ^
n.c:6:20: note: in expansion of macro 'FOO'
6 | printf("%s\n", FOO);
| ^~~
n.c:6:11: note: to match this '('
6 | printf("%s\n", FOO);
| ^
$
The output of `gcc -E` doesn't include the name FOO, but it does include
the line `# 3 "n.c"`, and that's enough information for the compiler to
open the original source file and copy information from it into an error
message.
(This is perhaps straying slightly off-topic, since the standard
only requires a diagnostic, but it's still interesting to see how
actual compilers do things.)
$ cat n.c
#include <stdio.h>
#define FOO "hi"; // Note trailing `;`
int
main(void)
{
printf("%s\n", FOO);
return 0;
}
$ gcc -E n.c >| n-preprocessed.c
$ grep FOO n-preprocessed.c
$ tail n-preprocessed.c
# 2 "n.c" 2
# 3 "n.c"
int
main(void)
{
printf("%s\n", "hi";);
return 0;
}
$ gcc -c n-preprocessed.c
n.c: In function 'main':
n.c:6:24: error: expected ')' before ';' token
6 | printf("%s\n", FOO);
| ~ ^
| )
$
And if I rename n.c before compiling n-preprocessed.c, the error
message doesn't include that line of code.
[snip]
getch()
{
register int c, lastst;
while ((c=getc1())=='/' && !instring)
{
if ((c=getc1())!='*')
{
pushback(c);
return('/');
}
if (!skipcom)
{putc('/',fout); putc('*', fout);}
lastst=0;
while ( (c = getc1()) != '\0')
{
if (lastst && c=='/')
{
if (!skipcom)
putc('/', fout);
break;
}
if (c=='\n' || !skipcom)
putc(c, fout);
lastst = (c=='*');
}
if (c=='\0')break;
}
return(c);
}
Yeah, that's from `cc.c`, right?
No, it's from cpp.c
$ ls /work/reference/collegetapes/sltape/v6cc/
c0.c c00.c c01.c c02.c c03.c c04.c c05.c c1.h
c10.c c11.c c12.c c13.c c2.h c20.c c21.c cc.c cpp.c
On 04/06/2026 19:54, David Brown wrote:
On 04/06/2026 17:46, Bart wrote:
On 04/06/2026 15:27, David Brown wrote:
On 04/06/2026 15:18, Bart wrote:
It is an objective fact, therefore, that "(a*a) + (b*b)" has more
parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a
purely subjective opinion.
So, you're arguing 'more than needed' is a completely different
thing from 'too many'.
Of course they are different things - albeit related things, rather
than /completely/ different. One is a question of fact, the other a
question of opinion, and they do not always coincide.
It is a fact that "a << (b + c)" has more parentheses than needed.
But I think we are both of the opinion that it does not have "too
many" parentheses - it has an appropriate number of parentheses.
So saying 'too many' of something will be a subjective opinion? OK,
so let's try compiling this bit of C:
   void F(int, int);
   int main() {
       F(1, 2, 3);
   }
8 out of 9 compilers reported 'Too many arguments'.
According to you, that's only their subjective opinion, not an
objective fact?
Again - /please/ stop trying to guess what people say or put words in
their mouths. I can't remember ever seeing you do so accurately.
This is what you actually said:
It is an objective fact, therefore, that "(a*a) + (b*b)" has more
parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a purely
subjective opinion. Even if it is true that this is "commonly agreed
to" (and AFAIK you have no basis for that claim), that would still be a
subjective opinion - no matter how common that opinion is.
You're saying that:
* "more than needed" is objective
* "too many" is subjective
Even though both are about exactly the same thing: superfluous but
harmless parentheses in an expression.
So you are picking on my choice of words, apparently in order to win
some stupid argument on the internet. Even though the same "too many"
phrase used elsewhere can be objective, according to you.
This looks like a pattern: people here seem to have remarkable trouble
debating with me on actual ideas and resort instead to finding hidden
significance in whatever choice of words I happen to use.
"Too many parentheses" is subjective, because they affect the ease of
reading the code as a human reader.
And 'more than needed' isn't that?!
Tim Rentsch I'm sure will prefer the latter because 99.9% of C
programmers are machines, according to him.
Presumably, the same 99.9% will not use indentation, and will write
their programs all on one line anyway, because it is still after all completely unambiguous according to the C standard!
In article <10vsrpo$men2$2@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 04/06/2026 22:06, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[snip]
Tim Rentsch I'm sure will prefer the latter because 99.9% of C
programmers are machines, according to him.
Tim didn't say or imply that.
So what was his 99.9% all about? Nobody has a clue, except they are
certain that what I think it is is wrong!
Have you thought about, I don't know, maybe asking him?
On Thu, 04 Jun 2026 21:04:50 +0200, David Brown wrote:
On 04/06/2026 19:47, Janis Papanagnou wrote:
On 2026-06-04 18:18, Scott Lurndal wrote:
Indeed, and in the early days, the compiler itself would never
have seen '/*' - the preprocessor (cpp) would have removed it
from the source before the source reached the first
pass of the compiler (c0).
Curious; was the comment-handling at some point in history removed
from the Cpp-processing? - If so, when was that? And I assume the
semantics are still the same; is that correct?
No, at least since the standardisation of the C language (including K&R
"standard"), "preprocessing" has been an integral part of the C language
and conversion of comments to space characters is done in phase 3 of the
translation. But the C standards do not give an explicit distinction
between "preprocessing" and "compiling" - just different translation
phases. (They do not define a "compiler" at all.) It is not uncommon
for implementations to separate translation into two or more programs,
especially in the good old days when hosts had much less memory, but
logically they are all one implementation. Distinguishing "the compiler
itself" is somewhat artificial.
In historic Unix (Version 7 and before), the preprocessor was implemented
as a separate program ("cpp") from the compiler ("cc"). The compiler itself had no facility to handle preprocessor directives, and was, itself, often divided into two separate programs ("cc0" and "cc1"). All three phases ("cpp", "cc0" and "cc1") were managed by a program ("cc"), although the program for each phase could be invoked independently through manual execution.
What differs from today is that the preprocessor was an optional component, made available for a programmer's convenience.
[...][...]
[ ... (INT_MAX+1)*0 ]
Furthermore, the expression above is obviously an integer
constant expression as defined by sec 6.6 para 8. Section 6.6,
para 4, reads in part, "Each constant expression shall evaluate
to a constant that is in the range of representable values for
its type." The expression, `(INT_MAX+1)*0` violates this
constraint, and so therefore a diagnostic is mandated as per
sec 5.1.1.3 para 1. That it appears in code that is not
obviously called from `main` doesn't change that.
[...]
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <10vsrpo$men2$2@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 04/06/2026 22:06, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[snip]
Tim Rentsch I'm sure will prefer the latter because 99.9% of C
programmers are machines, according to him.
Tim didn't say or imply that.
So what was his 99.9% all about? Nobody has a clue, except they are
certain that what I think it is is wrong!
Have you thought about, I don't know, maybe asking him?
At the risk of saying what may be obvious to everyone, Bart has
shown that he has no interest in having a serious, constructive,
useful, or productive conversation with anyone. His questions
are all rhetorical; he hasn't asked me a straight question
because he isn't really interested in what I would say. In
short, Bart isn't looking for an answer, he's looking for an
argument. My recommendation is just stop responding to him
altogether. My response to him upthread was a sincere effort to
provide a neutral and helpful answer to his question. Maybe my
remarks were helpful to other people, and if they were that's
good. Any further efforts to interact with Bart are not just a
waste of time but actually counterproductive. What Bart needs is
not help with understanding C but a good therapist. In any case
I'm confident that whatever Bart's needs may be, no one responding
to his postings here is in a position to provide them. Please
consider these remarks before responding to him further.
On 04/06/2026 21:29, Bart wrote:
You're saying that:
How can this be /so/ difficult for you?
* "more than needed" is objective
No, I said that "(a*a) + (b*b) has more parentheses than needed in the
context of most programming languages" is objective.
* "too many" is subjective
No, I said that "(a*a) + (b*b) has too many parentheses" is subjective.
TR:
Sadly the idea of writing in a way that is "most easily understood"
has resulted in a race to the bottom, where writers are more and
more encouraged to take the view that (some) readers are pretty
much arbitrarily stupid, with the result that expressions become
littered with scads of unnecessary parentheses that actually
detract from ease of reading. Good writing is always a balance
between too much and too little.
BC:
Actual examples of too many parentheses?
TR:
The point of my comment is that either too many or too few is a
subjective judgment, not an objective one.
BC:
My point was that it could be objective, at least for too many.
Tim Rentsch I'm sure will prefer the latter because 99.9% of C
programmers are machines, according to him.
Please give a reference for him saying that. (I'll save you the bother,
he has not made any remarks remotely like this in c.l.c. since I have
been here.)
Don't presume - you make a fool out of yourself every time you do.
Presumably, the same 99.9% will not use indentation, and will write
their programs all on one line anyway, because it is still after all
completely unambiguous according to the C standard!
James Kuyper <jameskuyper@alumni.caltech.edu> writes:
[...]
One advantage of having a single program do the whole thing, is
that error messages can mention the actual text of the line where
a problem was detected, without any pre-processing applied.
Typical preprocessors emit directives that tell the compiler
about the current file name and line number, precisely so that
diagnostic messages can refer to the original text.
For example:
$ cat hello.c
#include <stdio.h>
int main(void) {
printf("Hello world!\n");
}
$ gcc -E hello.c | tail
extern int __uflow (FILE *);
extern int __overflow (FILE *, int);
# 983 "/usr/include/stdio.h" 3 4
# 2 "hello.c" 2
# 2 "hello.c"
int main(void) {
printf("Hello world!\n");
}
$
The line `# 2 "hello.c"` is, according to the C standard, a
"non-directive", which is a kind of directive. Executing a
non-directive has undefined behavior,
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
Note that in a context that requires a constant expression, overflow is
a constraint violation. For example, a case label like:
case (INT_MAX + 1) * 0:
must be diagnosed at compile time.
gcc disagrees with you.
What makes you think so?
[...]
But taking a closer look at the standard, I'm not 100% sure that the
language requires a diagnostic, though I think that's the intent.
The relevant constraint is:
Each constant expression shall evaluate to a constant that is
in the range of representable values for its type.
If I squint really hard, I can argue that the entire expression
has to be a constant expression, but it doesn't say that its
subexpressions are constant expressions -- and *if* INT_MAX +
1 evaluates to INT_MIN in the current implementation, then
(INT_MAX + 1) * 0 evaluates to 0 and therefore satisfies the
constraint.
But INT_MAX + 1 could legally trap, for example, and I don't
believe it was intended that a given expression can be a constant
expression or not depending on the vagaries of the behavior of an
instance of UB.
On 05/06/2026 08:29, David Brown wrote:
On 04/06/2026 21:29, Bart wrote:
You're saying that:
How can this be /so/ difficult for you?
* "more than needed" is objective
No, I said that "(a*a) + (b*b) has more parentheses than needed in
the context of most programming languages" is objective.
* "too many" is subjective
No, I said that "(a*a) + (b*b) has too many parentheses" is subjective.
If anyone is interested (which I doubt; bart-bashing is much more fun),
this is the original context:
TR:
Sadly the idea of writing in a way that is "most easily understood"
has resulted in a race to the bottom, where writers are more and
more encouraged to take the view that (some) readers are pretty
much arbitrarily stupid, with the result that expressions become
littered with scads of unnecessary parentheses that actually
detract from ease of reading. Good writing is always a balance
between too much and too little.
BC:
Actual examples of too many parentheses?
TR:
The point of my comment is that either too many or too few is a
subjective judgment, not an objective one.
Here it is clear that 'too many' was just a paraphrase of 'unnecessary'.
Here is my followup to TR:
BC:
My point was that it could be objective, at least for too many.
For an infix syntax where * has higher priority than +, then it is a
fact that the () in (a*a) + (b*b) are not necessary.
So, assume a minimum number of () needed to properly parse an expression according to intent. Then:
(1) TOO FEW: necessarily has to be subjective. It suggests a desire for
more () than the minimum, but the exact number will vary.
(2) TOO MANY, MORE THAN NEEDED, ETC: These can be objective if referring to
any number of extra () above the minimum. This is the point I made
above, the one I defended.
(3) TOO MANY, MORE THAN NEEDED, ETC: These can also be used in a
judgemental manner, and then they are subjective. This is where a certain
number of extra () are accepted for readability etc, but the exact level
will vary.
If this is the point people have been trying to make, then they've been doing it incredibly badly, and been unnecessarily unpleasant and insulting.
My own view is that C syntax has too much of (3), but necessarily so
because of the choices made in its operator levels.
Tim Rentsch I'm sure will prefer the latter because 99.9% of C
programmers are machines, according to him.
Please give a reference for him saying that. (I'll save you the
bother, he has not made any remarks remotely like this in c.l.c. since
I have been here.)
Find out what was the subject of the 99.9% (even if that was an exaggeration). Then we'll talk.
No, he didn't use the word 'machines'; I paraphrased to suggest
supernormal people who know everything and never make mistakes.
You're going to argue about this now?
Don't presume - you make a fool out of yourself every time you do.
Presumably, the same 99.9% will not use indentation, and will write
their programs all on one line anyway, because it is still after all
completely unambiguous according to the C standard!
And you proceed to do exactly the same; Bart must be wrong, but you
don't about what!
In article <1BoUR.3$lmCb.1@fx22.iad>, Scott Lurndal <slp53@pacbell.net> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
<snip>
[snip]
Yeah, that's from `cc.c`, right?
No, it's from cpp.c
$ ls /work/reference/collegetapes/sltape/v6cc/
c0.c c00.c c01.c c02.c c03.c c04.c c05.c c1.h
c10.c c11.c c12.c c13.c c2.h c20.c c21.c cc.c cpp.c
Oh interesting. I don't have a `cpp.c` in my v6 archive.
I wonder what else I'm missing.
value = valp;}
On 05/06/2026 13:39, Bart wrote:
"a << (b + c)" has "more than needed" - that is objective.
"a << (b + c)" does not have "too many" in an objective sense, because
I cannot speak for the intentions of others, but it has certainly been
very frustrating trying to get you to understand the distinction between objective facts and subjective opinions,
Actual examples of too many parentheses?
The point of my comment is that either too many or too few is a
subjective judgment, not an objective one.
Sadly the idea of writing in a way that is "most easily understood"
has resulted in a race to the bottom, where writers are more and
more encouraged to take the view that (some) readers are pretty
much arbitrarily stupid, with the result that expressions become
littered with scads of unnecessary parentheses that actually
detract from ease of reading. Good writing is always a balance
between too much and too little.
On 2026-06-05 01:49, Dan Cross wrote:
[...][...]
[ ... (INT_MAX+1)*0 ]
Furthermore, the expression above is obviously an integer
constant expression as defined by sec 6.6 para 8. Section 6.6,
para 4, reads in part, "Each constant expression shall evaluate
to a constant that is in the range of representable values for
its type." The expression, `(INT_MAX+1)*0` violates this
constraint, and so therefore a diagnostic is mandated as per
sec 5.1.1.3 para 1. That it appears in code that is not
obviously called from `main` doesn't change that.
I'm curious about that "violation"; a violation would require
(at least) two sorts of logical preconditions. - The first is
that all *sequentially* (literally) evaluated sub-expression
values are representable as value - INT_MAX+1 certainly can't
be represented in generated code that conforms to the abstract
*mathematical* value - but is that necessary if _the whole_
expression is (mathematically) just 0 (because of the final
factor). And the second (related) is whether the order of the
sub-expression evaluation is relevant; if we'd assume the
expression evaluation to be considered from right to left then
it would be irrelevant what's inside the parenthesis.
From the standard quotes I cannot really recognize that these
preconditions, how to determine UB/errors/violations, would be
necessary.
I'm no native speaker and I fear my question as formulated was
hard to understand. It's basically the question of the standard
implying (INT_MAX+1)*0 to be analyzed sequentially as written
or whether it could as well analyze it from right to left and
thus recognizing no problem, since from the mathematical view -
but also practically - a concrete representable value of a here
irrelevant sub-expression isn't necessary. Or another try of a
(paraphrased) formulation; for the determination of constraint
violations does the expression have strict (sort of) sequencing
points _after each term_ (and each left-to-right sub-expression
has to be well-defined) or can it be valued/analyzed as a whole
not putting any preconditions about evaluation order etc. when
determining the overall value?
PS: One yet non-considered question that was part of my original
post was: "Is there any rationale from the _software designer_'s perspective?"
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:[...]
The line `# 2 "hello.c"` is, according to the C standard, a
"non-directive", which is a kind of directive. Executing a
non-directive has undefined behavior,
Since it is gcc that is generating the non-directives, for
internal purposes, and gcc that is consuming them, it hardly
seems worth worrying about whether their behavior is defined
or not.
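For comparison, the standard way to spell line control in source code is the `#line` directive. The sketch below is an illustration only; the function names and the file name "virtual.c" are made up for the example and do not come from the thread.

```c
#include <string.h>

/* #line sets the number of the next source line and the presumed file name. */
int line_after_directive(void) {
#line 200 "virtual.c"
    return __LINE__;    /* this source line is now numbered 200 */
}

/* The presumed file name stays in effect until another #line changes it. */
int file_is_virtual(void) {
    return strcmp(__FILE__, "virtual.c") == 0;
}
```

The `# 2 "hello.c"` form discussed above is what gcc writes into its own preprocessed output; the `#line` form is the one the standard defines.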
On 05/06/2026 08:29, David Brown wrote:[...]
On 04/06/2026 21:29, Bart wrote:
TR:
BC: Sadly the idea of writing in a way that is "most easily understood"
has resulted in a race to the bottom, where writers are more and
more encouraged to take the view that (some) readers are pretty
much arbitrarily stupid, with the result that expressions become
littered with scads of unnecessary parentheses that actually
detract from ease of reading. Good writing is always a balance
between too much and too little.
TR: Actual examples of too many parentheses?
The point of my comment is that either too many or too few is a
subjective judgment, not an objective one.
Here it is clear that 'too many' was just a paraphrase of
'unnecessary'.
Tim Rentsch I'm sure will prefer the latter because 99.9% of C
programmers are machines, according to him.
Please give a reference for him saying that. (I'll save you the
bother, he has not made any remarks remotely like this in
c.l.c. since I have been here.)
Find out what was the subject of the 99.9% (even if that was an exaggeration). Then we'll talk.
No, he didn't use the word 'machines'; I paraphrased to suggest
supernormal people who know everything and never make mistakes.
You're going to argue about this now?
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
Note that in a context that requires a constant expression, overflow is
a constraint violation. For example, a case label like:
case (INT_MAX + 1) * 0:
must be diagnosed at compile time.
gcc disagrees with you.
What makes you think so?
[...]
I'm skipping this and proceeding on to the original question.
I see no basis for this belief. My conclusions are based on what
the C standard actually says, rather than guesses about some
unstated "intentions". I think you would do well to reach your
conclusions based more on the actual text of the C standard, and
less on your interpretation of what the text was "intended" to
mean.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[...]
The line `# 2 "hello.c"` is, according to the C standard, a
"non-directive", which is a kind of directive. Executing a
non-directive has undefined behavior,
Since it is gcc that is generating the non-directives, for
internal purposes, and gcc that is consuming them, it hardly
seems worth worrying about whether their behavior is defined
or not.
I wasn't worried. I just mentioned it in passing.
You quoted most of the article, but snipped relevant context in
the middle of a sentence.
Bart <bc@freeuk.com> writes:
On 05/06/2026 08:29, David Brown wrote:[...]
On 04/06/2026 21:29, Bart wrote:
TR:
BC: Sadly the idea of writing in a way that is "most easily understood"
has resulted in a race to the bottom, where writers are more and
more encouraged to take the view that (some) readers are pretty
much arbitrarily stupid, with the result that expressions become
littered with scads of unnecessary parentheses that actually
detract from ease of reading. Good writing is always a balance
between too much and too little.
TR: Actual examples of too many parentheses?
The point of my comment is that either too many or too few is a
subjective judgment, not an objective one.
Here it is clear that 'too many' was just a paraphrase of
'unnecessary'.
No, it is clear that "too many" and "unnecessary" have two different meanings.
The idea that "too many" and "unnecessary" mean the same thing
is your own invention.
[...]
Tim Rentsch I'm sure will prefer the latter because 99.9% of C
programmers are machines, according to him.
Please give a reference for him saying that. (I'll save you the
bother, he has not made any remarks remotely like this in
c.l.c. since I have been here.)
Find out what was the subject of the 99.9% (even if that was an
exaggeration). Then we'll talk.
Only Tim can clarify that point, and he's made it clear that he's
not interested in doing so. Please don't complain to the rest of
us about that.
No, he didn't use the word 'machines'; I paraphrased to suggest
supernormal people who know everything and never make mistakes.
You're going to argue about this now?
Bart, when you make ridiculous and/or false statements, people are going
to argue with you. When you double down on such statements, people are
going to continue to argue with you.
Your use of the word "machines" was ridiculous and false.
Chris M. Thomasson <chris.m.thomasson.1@gmail.com> wrote:
On 6/4/2026 4:44 PM, Bart wrote:
On 05/06/2026 00:09, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 04/06/2026 22:06, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 04/06/2026 19:54, David Brown wrote:[...]
Stop it. He's not saying that.
Again - /please/ stop trying to guess what people say or put words
in their mouths. I can't remember ever seeing you do so accurately.
This is what you actually said:
It is an objective fact, therefore, that "(a*a) + (b*b)" has more
parentheses than needed in the context of most programming languages.
"(a*a) + (b*b) has too many parentheses", on the other hand, is a
purely subjective opinion. Even if it is true that this is "commonly
agreed to" (and AFAIK you have no basis for that claim), that would
still be a subjective opinion - no matter how common that opinion is.
You're saying that:
* "more than needed" is objective
* "too many" is subjective
That is EXACTLY what he's saying: "It is an OBJECTIVE fact .. has more
... than needed", and:
"has too many ... is ... purely subjective".
You're taking phrases out of context and making false claims that the
full statement was far more general than it actually was.
And this is exactly what other people are doing.
Taken literally, your statement implies that you admit that that's
what you're doing. Is that what you meant? If so, I suggest you
*stop* making such false claims. If not, what did you actually mean?
So I used TOO MANY instead of MORE THAN NEEDED to describe the exact
same phenomenon.
That's not the problem. There is an actual meaningful distinction
here, between what's needed by the compiler and what's useful to
improve clarity for human readers. I have found some of what you've
written to be unclear about that distinction.
Can we agree that the question of whether parentheses in a C
expression are necessary to the compiler can be answered objectively?
Can we agree that the question of whether extra parentheses are
helpful to a human reader is at least partly subjective, and
varies from case to case? Is there really anything else that we
fundamentally disagree about?
(1) Why are you all making such a big fucking deal of this?
Why are you?
I didn't start this business of something being subjective or objective,
or suggesting that one turn of phrase to discuss the same thing was
subjective and the other objective (implying that a subjective opinion
had less worth). TR started that and several people backed him up.
Myself I wouldn't even use those terms. My point was that some overuses
of () for commonly known precedences are more overkill than others.
If that's subjective then so be it; it is not some fundamental law of
the universe. I would just call it common sense.
> Why are you?
Since you ask, I was defending my point of view then got sidetracked by
this subjective/objective nonsense. I notice that TR has disappeared
from this subthread.
Wrt the number of ()'s? Might as well go to sleep with the following
song playing in the background:
(The Fate of Ophelia - Taylor Swift (Lyrics) Charlie Puth ft. Selena
Gomez, The Weeknd, Ariana Grande)
AFAICS outer parentheses there are excessive, inner ones look OK.
In article <10vsnl7$lkmu$1@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <865x3yd21n.fsf@linuxsc.com>,[...]
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <86ik81cfk5.fsf_-_@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
To clarify, the comments in my posting were meant to be read as
saying the given text is the entire program, and that it is strictly
conforming with respect to conforming hosted implementations.
(Incidentally, given the rules for freestanding implementations, I'm
not sure that it is even possible for any program to be strictly
conforming with respect to conforming freestanding implementations.
In any case my statements were meant only in the context of hosted
implementations.)
There's an important distinction to make here. Consider this
program:
#include <limits.h>
int
foo(){
int zero = (INT_MAX+1)*0;
return zero;
}
int
main(){
return 0;
}
This program does not transgress the bounds of undefined behavior.
Ok.
[snip]
Perhaps you mean that this is irrelevant because `foo` is not
invoked, but I see no reason why that need be the case in e.g.
a freestanding environment.
I explained the context of my previous statements above. Sorry for
not saying that in the original message.
In a hosted environment, I don't
think anything explicitly prevents `foo` from being called after
`main` returns (though I can't imagine that would happen in real
life; it would be weird if it did).
The semantics described in the ISO C standard don't admit that
possibility.
Could you please point to where it says this, in the C standard?
I cannot find anything that says that arbitrary code cannot run
after `main()` returns, and I don't see how that could possibly
be true.
N3220 5.1.2.4, Program semantics.
It defines the *observable behavior* of a program, which consists of
accesses to volatile objects, data written to files, and I/O dynamics of
interactive devices.
Yes, but it does so for strictly-conforming programs with no UB.
It does so for programs in general, not just strictly conforming
ones. If a program has undefined behavior, all bets are off,
but for example a program that evaluates `printf("%d\n", INT_MAX)`
is not strictly conforming, but it's fully subject to 5.1.2.4.
To understand conformance, we have to jump over to section 4,
which explicitly says that, 'Undefined behavior is otherwise
indicated in this document by the words "undefined behavior" or
by the omission of any explicit definition of behavior.' As it
does not say that a program with an instance of undefined
behavior in an integer constant expression that is not executed
must otherwise behave in any given manner, what the program does
is undefined. A constraint violation mandates a diagnostic, but
beyond that, the standard is (AFAICT) silent.
I don't think an integer constant expression can have undefined
behavior. INT_MAX+1 and 1/0 are not constant expressions, because
neither "evaluate(s) to a constant that is in the range of
representable values for its type".
I claim that an expression that looks like a constant expression
*isn't* a constant-expression if it doesn't appear in a context
that requires a constant-expression.
The program in question, quoted above, has:
int zero = (INT_MAX+1)*0;
`(INT_MAX+1)*0` is not a constant expression, not because of the
overflow, but because a constant expression is not required in
that context. "constant-expression" is defined by a production in
the grammar (it reduces to "conditional-expression"). Even in
int n = 42;
42 is not a constant expression, because the grammar doesn't
call for a constant expression in that context -- even though it
looks like one. Similarly, in `a + b * c`, `a + b` looks like an
additive expression, but it isn't one. (Not a perfect analogy.)
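The grammatical point can be sketched in code. The contexts marked below are ones where the C grammar names the `constant-expression` production directly; the final initializer is reached through `assignment-expression` instead. This is an illustrative sketch only, and the identifiers (`LIMIT`, `flags`, `classify`) are hypothetical.

```c
enum { LIMIT = 4 };                            /* enumerator value */

struct flags { unsigned ready : LIMIT - 3; };  /* bit-field width */

_Static_assert(LIMIT * 2 == 8, "LIMIT must be 4");  /* static assertion */

int classify(int x) {
    switch (x) {
    case LIMIT - 4:          /* case label: constant-expression required */
        return 0;
    default: {
        /* The grammar only asks for an assignment-expression here, even
           though 42 looks like a constant expression. */
        int n = 42;
        return n;
    }
    }
}
```

On the reading given above, an overflow in any of the first four contexts would violate the range constraint, while the same tokens in the last initializer would not.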
Undefined Behavior, in turn, is not defined as specific only to
execution: the standard simply says that it is "behavior, upon
use of a *nonportable or erroneous program construct*..." for
which there are no requirements, and there are examples of
things that are explicitly UB at translation time, such as
improperly terminated lexemes and so forth.
Yes, there are constructs that are explicitly UB at translation time.
(I think that's unfortunate, and there are efforts to clear up some
such cases in C2y.)
Signed integer overflow is not one of those constructs.
Any undefined behavior from evaluating INT_MAX+1 happens during
execution (barring constraint violations).
Furthermore, the expression above is obviously an integer
constant expression as defined by sec 6.6 para 8. Section 6.6,
para 4, reads in part, "Each constant expression shall evaluate
to a constant that is in the range of representable values for
its type." The expression, `(INT_MAX+1)*0` violates this
constraint, and so therefore a diagnostic is mandated as per
sec 5.1.1.3 para 1. That it appears in code that is not
obviously called from `main` doesn't change that.
It satisfies the requirements for an integer constant expression in
6.6p8, but it violates the constraint in 6.6p4. (I presume that an
"integer constant expression" must be a "constant expression".)
But since "constant-expression" is a grammatical production,
it doesn't have to satisfy that constraint, and no diagnostic
is required. (A warning is certainly permitted.)
Similarly, this:
int n = INT_MAX + 1;
at block scope doesn't require a diagnostic, though of course it
has undefined behavior -- but at file scope, the initializer is a
constant expression, so that would be a constraint violation.
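A small sketch of the two initializer contexts, with values chosen so that nothing overflows. The names are hypothetical and the code only illustrates where a constant expression is and is not required.

```c
#include <limits.h>

/* Static storage duration: the initializer has to be a constant
   expression, and INT_MAX - 1 is in range for int. */
static int near_limit = INT_MAX - 1;

/* Clamp instead of overflowing, so every call is well defined. */
static int next_value(int v) {
    return v < INT_MAX ? v + 1 : v;
}

int block_scope_demo(void) {
    /* Automatic storage duration: any expression may initialize n,
       including a function call. */
    int n = next_value(near_limit);
    return n;
}
```

Writing `INT_MAX + 1` in place of `INT_MAX - 1` in the file-scope declaration is the case described above as a constraint violation.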
Moreover, sec 6.6 para 17 says that, "the semantic rules for
evaluation of a constant expression are the same as for
nonconstant expressions." This brings us back to 5.1.2.4,
though I submit that para (4) is a stronger argument for what
you and Tim are saying, as it reads in part, "An actual
implementation is not required to evaluate part of an expression
if it can deduce that its value is not used and that no needed
side effects are produced (including any caused by calling a
function or through volatile access to an object)." I interpret
this to mean that, if the implementation can determine that
there is no way that `foo` can be called, it does not _have_ to
evaluate the above expression. However, it must satisfy the
range constraint from section 6.6, so it likely will, and in any
event, the standard does not say that it, "shall not" evaluate
it, or when.
Overflow in a constant expression is not undefined behavior. It's a
constraint violation. But that doesn't apply here, because the
initializer is not a constant expression. (Sorry if I'm repeating
myself.)
Once the compiler does that, if it does, and observes UB, the
standard is silent on what requirements it imposes, which means
the behavior is undefined. I see no reason it couldn't arrange
to invoke `foo` at that point.
Any UB in the program would occur during execution,
and in fact
it *won't* occur during execution because foo() isn't called.
A compiler can't generate code with arbitrary behavior just because
it can't prove that there will be no UB. If it could, every signed
or floating-point arithmetic operation with unknown operand values
would grant the same permission.
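As a concrete instance of that argument, consider an ordinary addition whose operands are unknown at translation time. The sketch below is hypothetical (neither function appears in the thread); the second function shows one way a caller could check that a given call stays within defined behavior.

```c
#include <limits.h>

/* The translator cannot prove that a + b never overflows, yet it must
   still produce the correct sum for every call that does not overflow. */
int add(int a, int b) {
    return a + b;
}

/* Returns nonzero exactly when a + b is representable as an int.
   The subtractions are arranged so that they cannot overflow. */
int add_is_defined(int a, int b) {
    if (b > 0) {
        return a <= INT_MAX - b;
    }
    return a >= INT_MIN - b;
}
```

A call such as `add(2, 3)` has fully defined behavior even though `add(INT_MAX, 1)` would not.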
So no, I do not see how execution according to the rules of the
abstract machine is not guaranteed, here. I certainly see no
way in which this can be regarded as a strictly conforming
program.
foo()'s behavior would be undefined if it were called. It *isn't*
called, so there's no actual UB. The program does not violate any
of the other requirements for strict conformance.
If the usual "Hello, world" program prints "Hello, world" followed
by "Goodbye", the implementation is non-conforming. If it formats
my hard drive after printing "Goodbye", it's non-conforming and
dangerous.
Two separate things. My point earlier was that code can
obviously run after `main` terminates. Moreover, I can't
imagine what would _prevent_ a runtime system that invokes
`main` from doing something like printing, "PROGRAM STOPPED"
after `main` returned. C imposes no requirements here.
Yes, it does. An OS can print "PROGRAM STOPPED", but not as part
of the execution of the program. On my system, a shell prompt is
printed after a program terminates, but not by the program. If I
execute a "hello, world" program with its output redirected to a file
(on a system that supports that), the resulting file cannot contain
"PROGRAM STOPPED". The requirements in 5.1.2.4 specify both what
the execution of a program must do and what it must not do.
Whether foo() has external linkage or internal
linkage doesn't change that.
I disagree. There's no possible way for the implementation to
know whether a function with external linkage will be ultimately
invoked or not; consider a system that supports loadable shared
modules. Nothing prevents even this simple program from being
compiled as a shared module, dynamically loaded, the loading
program explicitly searching for and finding the symbol
corresponding to the `foo` function, and invoking it.
Remember that linking is translation phase 8. The compiler is not
the entire implementation.
Exactly my point. The compiler cannot know how `foo` might be
used, or how the translated object might be exercised. There's
I don't see how it could possibly know that, given that `foo`
has external linkage.
We were presented with a complete translation unit that included a
function definition for "main". It's a complete program. There's no
valid way for some other program to call foo. If an OS provided such
a mechanism, it would be outside the scope of C.
Hence, the compiler _must_ treat with UB as written, which is
why `ubsan` inserts trapping code in `foo`.
I don't know what "_must_ treat with UB" means.
foo() has undefined behavior if it's called, so replacing its
body with trapping code is valid. But (I'm reasonably sure that)
an implementation cannot reject a program just because it can't
prove that it has no undefined behavior during execution. It can
reject it if it can prove that it *always* has undefined behavior
during execution.
What I'm saying is that, `foo` has undefined behavior _period_.
That's manifest in an integer constant expression, whether it is
executed at runtime or not. I believe that the standard forces
the expression to be evaluated at translation time, via the
"shall" mandate when checking the constraint on the range in sec
6.6 para 4. Further, that evaluation must happen in accordance
with the rules of the abstract machine, as per 5.1.2.4 para 17.
The diagnostic is mandated, as is the translation-time
evaluation. The expression itself manifestly exhibits UB,
and so therefore the result of the rest of the translation is
undefined.
foo is a function. foo does not have undefined behavior; it has no
behavior at all. A *call* to foo during execution has undefined
behavior. (`foo;` is a statement-expression that does nothing;
it does not have undefined behavior.)
[SNIP]
I think the question of whether the initializer is a
constant-expression or not has caused some not entirely relevant
confusion.
Here's another example that avoids that issue.
#include <limits.h>
int foo(void) {
int zero;
zero = INT_MAX;
zero ++;
zero *= 0;
return zero;
}
int main(void) {
return 0;
}
Given my grammatical argument above, I would say that this program
has no constant expressions.
Whether that argument is correct or
not, it certainly has no constant expressions that violate any
constraint or that have undefined behavior. Evaluating `zero ++`
(which doesn't even pretend to be a constant expression) would have
run-time undefined behavior -- *if* foo() were ever called.
And given this translation unit, I don't think there's any way to
construct a multi-TU program that calls foo, so a compiler *can*
determine that foo is never called (but there's no requirement to
do so, or to make any use of that information).
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[snip]
But taking a closer look at the standard, I'm not 100% sure that the
language requires a diagnostic, though I think that's the intent.
The relevant constraint is:
Each constant expression shall evaluate to a constant that is
in the range of representable values for its type.
If I squint really hard, I can argue that the entire expression
has to be a constant expression, but it doesn't say that its
subexpressions are constant expressions -- and *if* INT_MAX +
1 evaluates to INT_MIN in the current implementation, then
(INT_MAX + 1) * 0 evaluates to 0 and therefore satisfies the
constraint.
My reasoning is as follows.
To determine if the constraint is satisfied, the compiler must
first evaluate the expression (INT_MAX + 1) * 0.
To evaluate the expression (INT_MAX + 1) * 0, the compiler must
first evaluate the sub-expression (INT_MAX + 1).
Because the expression (INT_MAX + 1) overflows, the behavior is
undefined, and the compiler is free to decide that the value of
the sub-expression (INT_MAX + 1) is, let's say, 12.
The compiler next evaluates the overall expression as 12*0, which
is 0 (an int).
This result of the overall expression satisfies the constraint,
and so the compiler is not obliged to generate a diagnostic.
[snip]
I see no basis for this belief. My conclusions are based on what
the C standard actually says, rather than guesses about some
unstated "intentions". I think you would do well to reach your
conclusions based more on the actual text of the C standard, and
less on your interpretation of what the text was "intended" to
mean.
On 05/06/2026 08:53, Tim Rentsch wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <10vsrpo$men2$2@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 04/06/2026 22:06, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[snip]
Tim Rentsch I'm sure will prefer the latter because 99.9% of C
programmers are machines, according to him.
Tim didn't say or imply that.
So what was his 99.9% all about? Nobody has a clue, except they are
certain that what I think it is is wrong!
Have you thought about, I don't know, maybe asking him?
Asking him straight questions is usually futile. You can probably guess
this from the response below.
Notice he hasn't tried to enlighten anyone about that 99.9%.
That may just have been a throwaway line like when I say 'nobody likes
X', but I would still dispute that, if it's about what I think it is,
it's anything like a super-majority.
At the risk of saying what may be obvious to everyone, Bart has
shown that he has no interest in having a serious, constructive,
useful, or productive conversation with anyone. His questions
are all rhetorical; he hasn't asked me a straight question
because he isn't really interested in what I would say. In
short, Bart isn't looking for an answer, he's looking for an
argument. My recommendation is just stop responding to him
altogether. My response to him upthread was a sincere effort to
provide a neutral and helpful answer to his question. Maybe my
remarks were helpful to other people, and if they were that's
good. Any further efforts to interact with Bart are not just a
waste of time but actually counterproductive. What Bart needs is
not help with understanding C but a good therapist. In any case
I'm confident that whatever Bart's needs may be, no one responding
to his postings here is in a position to provide them. Please
consider these remarks before responding to him further.
I didn't read Bart's posting. Unfortunately it seems
true that any continued interaction with his comments
is counterproductive.
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <1BoUR.3$lmCb.1@fx22.iad>, Scott Lurndal <slp53@pacbell.net> wrote:<snip>
cross@spitfire.i.gajendra.net (Dan Cross) writes:
[snip]
Yeah, that's from `cc.c`, right?
No, it's from cpp.c
$ ls /work/reference/collegetapes/sltape/v6cc/
c0.c c00.c c01.c c02.c c03.c c04.c c05.c c1.h
c10.c c11.c c12.c c13.c c2.h c20.c c21.c cc.c cpp.c
Oh interesting. I don't have a `cpp.c` in my v6 archive.
I wonder what else I'm missing.
[snip]
In article <10vu703$11s5q$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 05/06/2026 08:53, Tim Rentsch wrote:[...]
[...][...]
Generally speaking, AFAIK, none of the regular posters here are
qualified mental health professionals; as such, we should all
avoid making armchair psychological diagnoses, the
occasional mildly off-color joke aside ("that's crazy!").
In article <10vt7b9$pi3s$1@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <10vsnl7$lkmu$1@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <865x3yd21n.fsf@linuxsc.com>,[...]
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <86ik81cfk5.fsf_-_@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
To clarify, the comments in my posting were meant to be read as
saying the given text is the entire program, and that it is strictly
conforming with respect to conforming hosted implementations.
(Incidentally, given the rules for freestanding implementations, I'm
not sure that it is even possible for any program to be strictly
conforming with respect to conforming freestanding implementations.
In any case my statements were meant only in the context of hosted
implementations.)
There's an important distinction to make here. Consider this
program:
#include <limits.h>
int
foo(){
int zero = (INT_MAX+1)*0;
return zero;
}
int
main(){
return 0;
}
This program does not transgress the bounds of undefined behavior.
Ok.
[snip]
Perhaps you mean that this is irrelevant because `foo` is not
invoked, but I see no reason why that need be the case in e.g.
a freestanding environment.
I explained the context of my previous statements above. Sorry for
not saying that in the original message.
In a hosted environment, I don't
think anything explicitly prevents `foo` from being called after
`main` returns (though I can't imagine that would happen in real
life; it would be weird if it did).
The semantics described in the ISO C standard don't admit that
possibility.
Could you please point to where it says this, in the C standard?
I cannot find anything that says that arbitrary code cannot run
after `main()` returns, and I don't see how that could possibly
be true.
N3220 5.1.2.4, Program semantics.
It defines the *observable behavior* of a program, which consists of
accesses to volatile objects, data written to files, and I/O dynamics of
interactive devices.
Yes, but it does so for strictly-conforming programs with no UB.
It does so for programs in general, not just strictly conforming
ones. If a program has undefined behavior, all bets are off,
but for example a program that evaluates `printf("%d\n", INT_MAX)`
is not strictly conforming, but it's fully subject to 5.1.2.4.
To understand conformance, we have to jump over to section 4,
which explicitly says that, 'Undefined behavior is otherwise
indicated in this document by the words "undefined behavior" or
by the omission of any explicit definition of behavior.' As it
does not say that a program with an instance of undefined
behavior in an integer constant expression that is not executed
must otherwise behave in any given manner, what the program does
is undefined. A constraint violation mandates a diagnostic, but
beyond that, the standard is (AFAICT) silent.
I don't think an integer constant expression can have undefined
behavior. INT_MAX+1 and 1/0 are not constant expressions, because
neither "evaluate(s) to a constant that is in the range of
representable values for its type".
I claim that an expression that looks like a constant expression
*isn't* a constant-expression if it doesn't appear in a context
that requires a constant-expression.
That's a bold claim, but I think I see why you're saying that.
The program in question, quoted above, has:
int zero = (INT_MAX+1)*0;
`(INT_MAX+1)*0` is not a constant expression, not because of the
overflow, but because a constant expression is not required in
that context. "constant-expression" is defined by a production in
the grammar (it reduces to "conditional-expression"). Even in
int n = 42;
42 is not a constant expression, because the grammar doesn't
call for a constant expression in that context -- even though it
looks like one. Similarly, in `a + b * c`, `a + b` looks like an
additive expression, but it isn't one. (Not a perfect analogy.)
Right; I see what you mean. In this case, the
`assignment-expression` production applies, not
`constant-expression`.
Undefined Behavior, in turn, is not defined as specific only to
execution: the standard simply says that it is "behavior, upon
use of a *nonportable or erroneous program construct*..." for
which there are no requirements, and there are examples of
things that are explicitly UB at translation time, such as
improperly terminated lexemes and so forth.
Yes, there are constructs that are explicitly UB at translation time.
(I think that's unfortunate, and there are efforts to clear up some
such cases in C2y.)
It's unclear to me how it could be any other way. If UB was
_only_ an issue at runtime, then how could a compiler take
advantage of it to perform optimizations during translation?
We know that compilers do this.
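A commonly cited illustration of this kind of translation-time reasoning is the comparison below. This is a sketch for discussion, with a hypothetical function name; whether a given compiler actually folds the comparison depends on the compiler and its options, and nothing in the standard requires it.

```c
/* For every x where x + 1 does not overflow, the result is 1. The
   overflowing case has no defined behavior, so a translator may assume
   it never happens and reduce the whole function to "return 1". */
int increment_is_greater(int x) {
    return x + 1 > x;
}
```

The function is only ever required to behave correctly for calls that do not overflow, which is what gives the optimizer its latitude.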
Signed integer overflow is not one of those constructs.
This I'm not sure I agree with. If the compiler detects signed
integer overflow in (perhaps not relevant in _this_ example) an
integer constant expression, I still don't see anything that
makes that anything other than UB. It's a constraint violation,
sure, but nothing says it is not also UB.
Any undefined behavior from evaluating INT_MAX+1 happens during
execution (barring constraint violations).
I'm not sure the standard says that. The standard says this
happens during _evaluation_, and that evaluation must be
performed in accordance with the rules of the abstract
machine. But it doesn't precisely specify _when_ evaluation
takes place, and in particular, there are places in the standard
that explicitly mention evaluation during translation. I still
don't see anything that prohibits a compiler from evaluating
that expression at compile time (indeed, it clearly does, as it
generates a diagnostic about the overflow).
I suppose that changes the matter: does the language merely
leave that unspecified, in which case, this program is not
strictly conforming, or does it say that it _cannot_ make any
translation-time decisions about it? I cannot find a satisfying
argument for the latter.
Furthermore, the expression above is obviously an integer
constant expression as defined by sec 6.6 para 8. Section 6.6,
para 4, reads in part, "Each constant expression shall evaluate
to a constant that is in the range of representable values for
its type." The expression, `(INT_MAX+1)*0` violates this
constraint, and so therefore a diagnostic is mandated as per
sec 5.1.1.3 para 1. That it appears in code that is not
obviously called from `main` doesn't change that.
It satisfies the requirements for an integer constant expression in
6.6p8, but it violates the constraint in 6.6p4. (I presume that an
"integer constant expression" must be a "constant expression".)
But since "constant-expression" is a grammatical production,
it doesn't have to satisfy that constraint, and no diagnostic
is required. (A warning is certainly permitted.)
Fair point. Its grammatical position makes it an
assignment-expression. I clearly misinterpreted that before.
Similarly, this:
int n = INT_MAX + 1;
at block scope doesn't require a diagnostic, though of course it
has undefined behavior -- but at file scope, the initializer is a
constant expression, so that would be a constraint violation.
Right. The semantics of this are defined in sec 6.7.11 para 5.
Moreover, sec 6.6 para 17 says that, "the semantic rules for
evaluation of a constant expression are the same as for
nonconstant expressions." This brings us back to 5.1.2.4,
though I submit that para (4) is a stronger argument for what
you and Tim are saying, as it reads in part, "An actual
implementation is not required to evaluate part of an expression
if it can deduce that its value is not used and that no needed
side effects are produced (including any caused by calling a
function or through volatile access to an object)." I interpret
this to mean that, if the implementation can determine that
there is no way that `foo` can be called, it does not _have_ to
evaluate the above expression. However, it must satisfy the
range constraint from section 6.6, so it likely will, and in any
event, the standard does not say that it, "shall not" evaluate
it, or when.
Overflow in a constant expression is not undefined behavior. It's a
constraint violation. But that doesn't apply here, because the
initializer is not a constant expression. (Sorry if I'm repeating
myself.)
Where does it say that UB and constraint violations are mutually
exclusive? I don't see any such statement in the standard. Am
I missing it?
The standard says that if a constraint is violated, a diagnostic
must be emitted, regardless of whether or not the constraint
violation is the result of something that is UB or not; that is, if
a constraint violation occurs due to something that is UB, the
implementation must still emit a diagnostic: UB is not an escape
hatch from that requirement.
It also says, 'If a "shall" or "shall not" requirement that
appears outside of a constraint or runtime-constraint is
violated, the behavior is undefined. Undefined behavior is
otherwise indicated in this document by the words "undefined
behavior" or by the omission of any explicit definition of
behavior.' However, that does not preclude such behavior being
undefined; it just means that the words "shall" and "shall not"
in a constraint violation do not a priori describe behavior vis
definition.
Once the compiler does that, if it does, and observes UB, the
standard is silent on what requirements it imposes, which means
the behavior is undefined. I see no reason it couldn't arrange
to invoke `foo` at that point.
Any UB in the program would occur during execution,
I suppose; but it's not clear to me that UB is tied _only_ to
execution time.
The standard is explicit that there _are_ things that are
evaluated at translation time, like the initializer for an
object with storage class `constexpr`. It is not clear to me that
a compiler is otherwise _prohibited_ from evaluating an
expression during translation; indeed, one could imagine it
doing so to perform constant folding, and I do not believe there
exists any normative text defining it as such.
I realize this is an extreme interpretation, and not one that is
widely shared. Personally, I think it's rather silly.
However, I think that is _a_ danger of the informality of the C
specification; it does not define the semantics of the abstract
machine in the formally precise way that, say, the SML spec
defines that language's semantics. Rather, it informally
specifies them in prose, and that prose is ambiguous.
Probably much good would be done if C's semantics _were_
rigorously defined, but they are not. Thus, they are open to
radical interpretation, and as extreme as those may be, I do not
see how the normative text of the standard explicitly
_prohibits_ them.
and in fact
it *won't* occur during execution because foo() isn't called.
A compiler can't generate code with arbitrary behavior just because
it can't prove that there will be no UB. If it could, every signed
or floating-point arithmetic operation with unknown operand values
would grant the same permission.
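To make the unknown-operand case concrete, here is a minimal sketch. The names `add_unknown` and `checked_add` are hypothetical and not from the thread. The first function has undefined behavior only for calls whose mathematical result does not fit in `int`; the second shows one portable way to detect that case before it happens.

```c
#include <limits.h>
#include <stdbool.h>

/* Plain signed addition: well defined for most operand values, and
   undefined only when the mathematical sum does not fit in int.
   The compiler generally cannot prove that no caller overflows. */
int add_unknown(int a, int b)
{
    return a + b;
}

/* Hypothetical helper: reports would-be overflow instead of
   performing it. Returns true and stores the sum on success. */
bool checked_add(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *result = a + b;
    return true;
}
```

Under these assumptions, `checked_add(INT_MAX, 1, &r)` returns false and leaves `r` untouched, while `add_unknown(2, 3)` is an ordinary, fully defined call.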
But that's not the situation here. The situation is that the
compiler can prove that something _is_ UB.
Regardless, I think you highlighted an actual problem with the
spec; I don't think that behavior is _explicitly_ prohibited,
therefore, it is likely undefined, but at a minimum unspecified,
whether it actually could happen. If the argument against that
is that this renders the language essentially unusable, then
my response is, "yeah, well, welcome to programming in C in the
2020s." Most compilers would never be that extreme, but I see
no evidence that it would be an invalid reading of the
literal text of the standard if they did.
So no, I do not see how execution according to the rules of the
abstract machine is guaranteed, here. I certainly see no
way in which this can be regarded as a strictly conforming
program.
foo()'s behavior would be undefined if it were called. It *isn't*
called, so there's no actual UB. The program does not violate any
of the other requirements for strict conformance.
I understand _what_ you're saying: despite the expression itself
manifesting undefined behavior, in this case it's not UB because
`foo` is never executed. What I'm saying is that I don't see
anything in the standard that restricts UB to _only_ executed
code. A reputable compiler obviously instruments `foo` with
code to trap into ubsan; if it's not UB, since it's not
executed, then why do so? Granted, that's not evidence of
anything other than the behavior of those compilers, but still.
It is clearly the _intent_ that this be a strictly conforming
program. The C standard, as an imprecise, informal document,
cannot guarantee it.
If the usual "Hello, world" program prints "Hello, world" followed
by "Goodbye", the implementation is non-conforming. If it formats
my hard drive after printing "Goodbye", it's non-conforming and
dangerous.
Two separate things. My point earlier was that code can
obviously run after `main` terminates. Moreover, I can't
imagine what would _prevent_ a runtime system that invokes
`main` from doing something like printing, "PROGRAM STOPPED"
after `main` returned. C imposes no requirements here.
Yes, it does. An OS can print "PROGRAM STOPPED", but not as part
of the execution of the program. On my system, a shell prompt is
printed after a program terminates, but not by the program. If I
execute a "hello, world" program with its output redirected to a file
(on a system that supports that), the resulting file cannot contain
"PROGRAM STOPPED". The requirements in 5.1.2.4 specify both what
the execution of a program must do and what it must not do.
Files are a separate case. There's no guarantee that the
standard output refers to a file; it may well refer to an
"interactive device", the semantics of which are (necessarily)
unspecified.
Here's an example: consider an interactive user who uses a
screen reader device. Suppose that user makes use of an
implementation that includes runtime support for that device,
and that precedes invocation of `main` with a command sequence
causing the screen reader to (perhaps) change intonation; and
succeeds return from main by outputting another command sequence
that resets to the original state.
I do not see how C could prohibit that, assuming that the
implementation takes care to detect whether standard output
really refers to the screen reader, and does not emit the control
sequences if output is redirected to a file. Another user who
runs that same program without a screen reader may see the
standard text printed on the screen, without the control
sequence sandwich.
I don't think a conforming implementation can prohibit that kind
of thing.
Whether foo() has external linkage or internal
linkage doesn't change that.
I disagree. There's no possible way for the implementation to
know whether a function with external linkage will be ultimately
invoked or not; consider a system that supports loadable shared
modules. Nothing prevents even this simple program from being
compiled as a shared module, dynamically loaded, the loading
program explicitly searching for and finding the symbol
corresponding to the `foo` function, and invoking it.
Remember that linking is translation phase 8. The compiler is not
the entire implementation.
Exactly my point. The compiler cannot know how `foo` might be
used, or how the translated object might be exercised.
I don't see how it could possibly know that, given that `foo`
has external linkage.
We were presented with a complete translation unit that included a
function definition for "main". It's a complete program. There's no
valid way for some other program to call foo. If an OS provided such
a mechanism, it would be outside the scope of C.
Given an excessively pedantic and literal reading of the text of
the standard, I don't think an implementation is explicitly
prohibited from evaluating the initializer at translation time,
deducing that the behavior is undefined, and blaming it on the
program, at which point, all bets are off.
Hence, the compiler _must_ treat with UB as written, which is
why `ubsan` inserts trapping code in `foo`.
I don't know what "_must_ treat with UB" means.
foo() has undefined behavior if it's called, so replacing its
body with trapping code is valid. But (I'm reasonably sure that)
an implementation cannot reject a program just because it can't
prove that it has no undefined behavior during execution. It can
reject it if it can prove that it *always* has undefined behavior
during execution.
What I'm saying is that, `foo` has undefined behavior _period_.
That's manifest in an integer constant expression, whether it is
executed at runtime or not. I believe that the standard forces
the expression to be evaluated at translation time, via the
"shall" mandate when checking the constraint on the range in sec
6.6 para 4. Further, that evaluation must happen in accordance
with the rules of the abstract machine, as per 5.1.2.4 para 17.
The diagnostic is mandated, as is the translation-time
evaluation. The expression itself manifestly exhibits UB,
and so therefore the result of the rest of the translation is
undefined.
foo is a function. foo does not have undefined behavior; it has no
behavior at all. A *call* to foo during execution has undefined
behavior. (`foo;` is a statement-expression that does nothing;
it does not have undefined behavior.)
The _evaluation_ of that expression in `foo` has undefined
behavior. The standard does not say that it _cannot_ be
evaluated at translation time.
[SNIP]
I think the question of whether the initializer is a
constant-expression or not has caused some not entirely relevant
confusion.
Here's another example that avoids that issue.
#include <limits.h>
int foo(void) {
    int zero;
    zero = INT_MAX;
    zero ++;
    zero *= 0;
    return zero;
}
int main(void) {
    return 0;
}
Given my grammatical argument above, I would say that this program
has no constant expressions.
Agreed, if by "constant expressions" you mean those mandated to
use the `constant-expression` grammatical production.
Whether that argument is correct or
not, it certainly has no constant expressions that violate any
constraint or that have undefined behavior. Evaluating `zero ++`
(which doesn't even pretend to be a constant expression) would have
run-time undefined behavior -- *if* foo() were ever called.
Let me turn this around in two ways: suppose that the
translation unit _only_ included `foo`. Could the compiler
deduce that the behavior of `foo`, if called, is undefined? If
not, why not?
Second, suppose that `foo` _were_ called, could the compiler
replace this with a program that was the equivalent of,
`int main(void) {printf("check your nose"); abort();}`? If so
why? If not, why not?
And given this translation unit, I don't think there's any way to
construct a multi-TU program that calls foo, so a compiler *can*
determine that foo is never called (but there's no requirement to
do so, or to make any use of that information).
This is the crux of my point, as well. There's no requirement
for the translator to _not_ evaluate the expression and become
privy to UB.
Would it be stupid if a compiler did that? Yes. Do existing
compilers do so? No, not that I'm aware of. Would some dweeb
nerd compiler douche who thinks this would make a compiler
benchmark some microfraction of a percent faster take advantage
of that? I absolutely think so, yes.
The text of the standard explicitly carves this out; or, rather,[...]
it attempts to. If the result of an expression is not
representable in the target type, _regardless of whether that's
due to UB or not_, a diagnostic is required.
In article <DHAUR.47540$0o1c.29921@fx08.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <1BoUR.3$lmCb.1@fx22.iad>, Scott Lurndal <slp53@pacbell.net> wrote:
<snip>
cross@spitfire.i.gajendra.net (Dan Cross) writes:
[snip]
Yeah, that's from `cc.c`, right?
No, it's from cpp.c
$ ls /work/reference/collegetapes/sltape/v6cc/
c0.c c00.c c01.c c02.c c03.c c04.c c05.c c1.h
c10.c c11.c c12.c c13.c c2.h c20.c c21.c cc.c cpp.c
Oh interesting. I don't have a `cpp.c` in my v6 archive.
I wonder what else I'm missing.
[snip]
Thanks! This is an artifact definitely worth preserving. As
far as I know, it's not in any of the extant V6 archives. I'll
shoot you an email, if that's ok.
- Dan C.
I claim that an expression that looks like a constant expression
*isn't* a constant-expression if it doesn't appear in a context
that requires a constant-expression.
On 31/05/2026 19:11, Bart wrote:...
Actual examples of too many parentheses?
Any source code written in LISP :-)
(And for too few parentheses, any source code in Forth.)
PS: One question from my original post that has not yet been
considered: "Is there any rationale from the _software designer_'s perspective?"
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
I claim that an expression that looks like a constant expression
*isn't* a constant-expression if it doesn't appear in a context
that requires a constant-expression.
Right. This question came up years ago in a Defect Report. The
response from the Committee was basically the same as what you
said: the 6.6 constraints for constant expressions apply only in
situations where the C standard expressly requires a constant
expression. (I don't have the DR in front of me; I'm summarizing
based on memory, but am confident the actual wording is consistent
with what I just said.)
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
I claim that an expression that looks like a constant expression
*isn't* a constant-expression if it doesn't appear in a context
that requires a constant-expression.
Right. This question came up years ago in a Defect Report. The
response from the Committee was basically the same as what you
said: the 6.6 constraints for constant expressions apply only in
situations where the C standard expressly requires a constant
expression. (I don't have the DR in front of me; I'm summarizing
based on memory, but am confident the actual wording is consistent
with what I just said.)
C99 DR 261 looks similar to what you're talking about.
https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_261.htm
The Committee Response section says:
In general, the interpretation of an expression for constantness
is context sensitive. For any expression which contains only
constants:
- If the syntax or context only permits a constant expression, the
constraints of 6.6#3 and 6.6#4 shall apply.
- Otherwise, if the expression meets the requirements of 6.6
(including any form accepted in accordance with 6.6#10), it is a
constant expression.
- Otherwise it is not a constant expression.
That's close to what I claimed, but the second bullet point differs.
My claim was that, given:
n = 2+2;
2+2 is not a constant expression because the grammar doesn't require
a constant expression in that context. The Committee's opinion
(at least at the time) was that it is a constant expression because
it meets the requirements of 6.6.
But I *think* it's a distinction without a difference. [...]
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:[...]
That's close to what I claimed, but the second bullet point differs.
My claim was that, given:
n = 2+2;
2+2 is not a constant expression because the grammar doesn't require
a constant expression in that context. The Committee's opinion
(at least at the time) was that it is a constant expression because
it meets the requirements of 6.6.
But I *think* it's a distinction without a difference. [...]
Right. The key point is that the constraints need to be satisfied
only in situations where the C standard expressly requires a
constant expression. Whether a given expression is called a
"constant expression" doesn't matter; all that does matter is
whether the constraints need to be satisfied.
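As an illustration of that context sensitivity, a minimal sketch (the identifiers `FOUR`, `table`, `assign_sum` and `table_length` are made up for this example and do not come from the thread or from DR 261). The first three uses of `2 + 2` sit where the grammar requires a constant expression, so the 6.6 constraints apply; the assignment does not, so only the ordinary expression rules matter.

```c
/* Contexts where the grammar requires a constant expression,
   so the 6.6 constraints must be satisfied: */
enum { FOUR = 2 + 2 };                       /* enumerator value        */
static int table[2 + 2];                     /* file-scope array size   */
_Static_assert(2 + 2 == 4, "2 + 2 is 4");    /* static assertion        */

/* Here the grammar only asks for an assignment-expression.  Whether
   2 + 2 is *called* a constant expression makes no practical
   difference; no 6.6 constraint has to be checked. */
int assign_sum(void)
{
    int n;
    n = 2 + 2;
    return n;
}

/* Number of elements in the array sized by a constant expression. */
int table_length(void)
{
    return (int)(sizeof table / sizeof table[0]);
}
```

Replacing `2 + 2` with an out-of-range expression would require a diagnostic in the first three places, but not in the assignment.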
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <865x3yd21n.fsf@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <86ik81cfk5.fsf_-_@linuxsc.com>,
Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
[...]
There's an important distinction to make here. Consider this
program:
#include <limits.h>
int
foo(){
    int zero = (INT_MAX+1)*0;
    return zero;
}
int
main(){
    return 0;
}
This program does not transgress the bounds of undefined behavior.
To clarify, the comments in my posting were meant to be read as
saying the given text is the entire program, and that it is strictly
conforming with respect to conforming hosted implementations.
(Incidentally, given the rules for freestanding implementations, I'm
not sure that it is even possible for any program to be strictly
conforming with respect to conforming freestanding implementations.
In any case my statements were meant only in the context of hosted
implementations.)
foo() has undefined behavior if it's called, so replacing its
body with trapping code is valid.
But (I'm reasonably sure that)
an implementation cannot reject a program just because it can't
prove that it has no undefined behavior during execution. [...]
In your example, `foo` clearly exhibits UB; I think your
argument is whether that has a realized effect or not, since the
UB is not invoked. I'm saying that in general a compiler cannot
possibly know that when it compiles `foo`, and is free to assume
the worst.
foo() exhibits UB if and only if it's called during execution.
[...]
The text of the standard explicitly carves this out; or, rather,[...]
it attempts to. If the result of an expression is not
representable in the target type, _regardless of whether that's
due to UB or not_, a diagnostic is required.
How would an expression (appearing in a context that requires an
integer constant expression) not "evaluate to a constant that is in
the range of representable values for its type" other than by UB?
I can't think of an example, but I'd be interested in seeing one.
Note in particular that UINT_MAX+1U is well defined, not an overflow.
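A small sketch of that last remark, with hypothetical names (`WRAPPED`, `wrap_at_runtime`) chosen only for illustration: unsigned arithmetic is reduced modulo one more than the maximum value of the type, so `UINT_MAX + 1U` is an integer constant expression that evaluates to 0, which is representable, and the 6.6 range constraint is satisfied.

```c
#include <limits.h>

/* Accepted in a context that requires an integer constant
   expression: the result wraps to 0 and is in range. */
_Static_assert(UINT_MAX + 1U == 0U, "unsigned wraparound is well defined");

/* The same expression used as an enumerator value. */
enum { WRAPPED = (UINT_MAX + 1U == 0U) };

/* The run-time equivalent is equally well defined for every input. */
unsigned wrap_at_runtime(unsigned u)
{
    return u + 1U;
}
```

By contrast, the signed counterpart `INT_MAX + 1` has no such wrapping rule, which is what the rest of this discussion turns on.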
In article <1100gbk$1lt8i$2@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
[...]
The text of the standard explicitly carves this out; or, rather,[...]
it attempts to. If the result of an expression is not
representable in the target type, _regardless of whether that's
due to UB or not_, a diagnostic is required.
How would an expression (appearing in a context that requires an
integer constant expression) not "evaluate to a constant that is in
the range of representable values for its type" other than by UB?
It wouldn't. But because it's UB, it could evaluate to
anything, including something that didn't violate the
constraint.
I can't think of an example, but I'd be interested in seeing one.
In terms of a practical, working compiler? I doubt that one
exists.