• this girl calls c ugly

    From fir@3:633/10 to All on Wed May 27 19:53:26 2026
    https://www.youtube.com/watch?v=I7fEsbksKRE

    as far as i understood.. (because if someone talks english fast my mind
    tends to skip more than half of the message)

    overall this is quite curious...


    she probably read those articles etc. from lisp ponys who say lisp is
    beautiful and c is not so much... but still it is quite
    incompetent to call c ugly...


    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From fir@3:633/10 to All on Wed May 27 20:15:33 2026
    fir wrote:
    https://www.youtube.com/watch?v=I7fEsbksKRE

    as far as i understood..(becouse if someona talks english fast my mind
    tend to skip more than half of the message)

    overally this is quite curious...


    she probably read this articles etc for lisp ponys who say lisp is beautifull and c is not so much... but still this is much of
    incompetence call c ugly...


    well i see she called bjarne stroustrup smart (well, i wouldn't be so bold
    here) - and said he took UGLY c and combined it with OO, and that this made
    c++ a 'successful' language which takes much hate

    so as i understand it, c is ugly and is to blame that c++ is hated... instead
    of coding in lisp... now that's where the comedy possibly gets too strong :3

    (now i'm quite convinced my comment on this is also not too strong in a bad
    sense, but i was kinda really stunned/surprised when someone
    says such things on video) -,-

  • From BGB@3:633/10 to All on Wed May 27 18:49:38 2026
    On 5/27/2026 1:15 PM, fir wrote:
    fir wrote:
    https://www.youtube.com/watch?v=I7fEsbksKRE

    as far as i understood..(becouse if someona talks english fast my mind
    tend to skip more than half of the message)

    overally this is quite curious...


    she probably read this articles etc for lisp ponys who say lisp is
    beautifull and c is not so much... but still this is much of
    incompetence call c ugly...


    well i see she named bjorne stroustrup smart (well i wouldnt be so bold
    here) - and said he took UGLY c and combined it with OO so this make a
    c++ 'succesfull' language which
    takes much hate

    so i understand c is ugly and to blame c++ is hated... instead of coding
    in lisp... now thats where comedy possibly gets too strong :3

    (now im quite convinced my comment on this is also not too strong in bad
    way of its sense but i was kinda really stunned/surprised ehen someona
    talks such things on video ) -,-

    In general I like her videos, and she seems to know what she is talking about...

    But, I am not personally as much of a fan of C++ as she is...


    Like, say:
      Early days (once when I was still young):
        Main compiler I used was Cygwin, but "g++" tended not to work.
        Like, at the time, had the ever-present GCC 2.95.2, ...
        Any attempt to compile C++ resulted in a mess of error messages.
      Later:
        I experimented with it, but:
          It didn't do that much that wasn't syntactic sugar over C.
          I was using customized tools that worked OK with C,
          but C++ made many headaches with inter-op with custom tools.
          If you have C++, but can't really use any features without issues,
          it kinda defeats the point vs just using C.
      Later still:
        My custom compiler can't use more than a 90s era subset;
        So, to really make use of any newer/fancier features,
        I would greatly limit where I could use the code.

    People can be like:
    "But, Templates, and STL containers..."
    And I can be like:
    "But then your binaries are huge and build times suck..."


    Similar reason to why one doesn't build complex patterns or do
    template-like stuff via function macros in the C preprocessor:
    One can do this... But again, bloated binaries and terrible build times.
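
    To make the macro approach concrete, here is a minimal sketch in C. The
    macro DEFINE_ARRAY_MAX and the generated function names are made up for
    this example and come from no library. The only point is that every use
    of the macro expands to another full copy of the function body, which is
    one plausible source of the binary growth described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: stamps out a separate max-finding function per type. */
#define DEFINE_ARRAY_MAX(T, NAME)                      \
    static T NAME(const T *items, size_t count)        \
    {                                                  \
        assert(items != NULL && count > 0);            \
        T best = items[0];                             \
        for (size_t i = 1; i < count; i++) {           \
            if (items[i] > best)                       \
                best = items[i];                       \
        }                                              \
        return best;                                   \
    }

/* Each line below expands to a complete, independent function definition. */
DEFINE_ARRAY_MAX(int, array_max_int)
DEFINE_ARRAY_MAX(double, array_max_double)
DEFINE_ARRAY_MAX(long, array_max_long)
```

    Three types give three near-identical bodies in the object file; a larger
    "container" written this way multiplies in the same manner.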


    So, one is back to the core issue:
    The part that is actually usable, mostly still amounts to syntactic
    sugar over things you can already do in C.


    There are "niceties", granted, but relatively little "actually new".


    And, one of the rare few "actually new" features C++ offers: exceptions.
    These also come with their own drawbacks (code bloat, try/catch+throw is
    usually slow, use with care else program explodes, ...). Unlike many
    other features, C++ exceptions also add a (small but often non-zero)
    code-bloat and performance penalty to pretty much *every* function,
    regardless of whether or not it actually uses exceptions (because it
    could potentially call something, somewhere, that uses exceptions, and so
    the compiler needs to provide for the possibility that an exception could
    come through this section of code).

    Many people's coding styles forbid them and mandate error-code returns or
    similar instead (like in C), with exceptions being globally disabled at
    build time (along with RTTI and similar), which isn't really a strong
    selling point...
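
    For readers less familiar with the error-code style such coding standards
    ask for, a minimal sketch in C follows. The enum, its members and the
    function parse_u32 are hypothetical names chosen for illustration; the
    pattern is simply "status in the return value, result via out-parameter".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; the names are illustrative only. */
typedef enum {
    PARSE_OK = 0,
    PARSE_EMPTY,
    PARSE_BAD_DIGIT,
    PARSE_OVERFLOW
} parse_status;

/* Parses an unsigned decimal number that must fit in 32 bits. The value is
   delivered through 'out'; the return value only reports what happened. */
static parse_status parse_u32(const char *text, unsigned long *out)
{
    unsigned long value = 0;

    assert(out != NULL);
    if (text == NULL || *text == '\0')
        return PARSE_EMPTY;
    for (; *text != '\0'; text++) {
        if (*text < '0' || *text > '9')
            return PARSE_BAD_DIGIT;
        unsigned long digit = (unsigned long)(*text - '0');
        /* Reject before multiplying so the check itself cannot overflow. */
        if (value > (0xFFFFFFFFUL - digit) / 10)
            return PARSE_OVERFLOW;
        value = value * 10 + digit;
    }
    *out = value;
    return PARSE_OK;
}
```

    The caller has to test the status at every call site, which is the
    verbosity that exceptions are meant to remove and the explicitness that
    such coding standards prefer.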


    And, many examples we have of non-trivial projects using C++, such as LLVM/Clang, also tend to own pretty much any computer that one dares use
    to try to compile it...

    Or, Doom3, which isn't *that* much bigger than Quake 3 code wise, and
    still uses a mostly C like subset of C++, but for whatever unexplained
    reason, seems to take an order of magnitude longer to recompile from
    source (... over 10 minutes ...).

    ...



  • From Lawrence D'Oliveiro@3:633/10 to All on Thu May 28 04:53:49 2026
    On Wed, 27 May 2026 18:49:38 -0500, BGB wrote:

    But, I am not personally as much of a fan of C++ as she is...

    C++ syntax is so complex, the language spec has to add rules that say,
    in case of ambiguity, that this interpretation is meant and not that.

    Someone described this as "the principle of most surprise".

  • From Janis Papanagnou@3:633/10 to All on Thu May 28 09:18:15 2026
    On 2026-05-28 01:49, BGB wrote:
    [...]

    In general I like her videos, and she seems to know what she is talking about...

    But, I am not personally as much of a fan of C++ as she is...

    I'm a big fan of abstractions. - So many things beyond "C" are fine!

    [...]

    [ Cygwin ]

    A sensible but imperfect workaround provided for an inferior platform.

    [...]

    Similar reason to why one doesn't build complex patterns or do
    template-like stuff via function macros in the C preprocessor:
    One can do this... But again, bloated binaries and terrible build times.

    Code patterns that are bulky in "C" can be formulated tersely with
    C++/STL (while still preserving an efficient implementation, even with complexities guaranteed); and the framework is flexible, orthogonally
    designed. Easy to reuse high-level concepts as opposed to re-implement
    the same code for different types. Or weaken the code by extensive use
    of casts. All sorts of C's problems with memory can be addressed. (The
    list can be continued; but I wonder why such things aren't recognized.)


    So, one is back to the core issue:
    The part that is actually usable, mostly still amounts to syntactic
    sugar over things you can already do in C.

    Huh? It may depend on the developer/programmer. But it's certainly a
    lot more than "syntactic sugar".


    There are "niceties", granted, but relatively little "actually new".

    Not all concepts are "new", of course; we saw them in other languages
    years or (in some cases) decades ago. But C++' and STL features are a
    lot more than just niceties; it's beyond me how one may come to such
    a valuation. (And now let's compare that formulated demand or wish of
    new things with "C"?)


    And, one of the rare few "actually new" features C++ offers: exceptions.

    We used them already in the 1990's.

    Also comes with its own drawbacks (code bloat, try/catch+throw is
    usually slow, use with care else program explodes, ...). [...]

    I cannot confirm your statements, especially in that generality.

    I recall we had bloat with templates on a specific platform in the
    very early pre-standard era, when they were first supported. But we
    didn't have any [noteworthy] speed degradation with exceptions (or
    templates).


    Many people coding styles often forbid it and mandate that people use error-code returns or similar instead (like in C), with exceptions being globally disabled at build time (along with RTTI and similar), which,
    isn't really a strong selling point...

    Yes, stupid things are done. Mandating to use RCs and forbid to use
    exceptions is particularly stupid as a general rule.

    Probably mentally inconvenient as "new" concept for FORTRAN, BASIC,
    or "C" programmers? I can certainly understand that, psychologically.
    But not technically. Maybe the support on the commercial platforms
    was just better than on Cygwin?

    Janis

    [...]


  • From BGB@3:633/10 to All on Thu May 28 02:35:37 2026
    On 5/27/2026 11:53 PM, Lawrence D'Oliveiro wrote:
    On Wed, 27 May 2026 18:49:38 -0500, BGB wrote:

    But, I am not personally as much of a fan of C++ as she is...

    C++ syntax is so complex, the language spec has to add rules that say,
    in case of ambiguity, that this interpretation is meant and not that.

    Someone described this as "the principle of most surprise".


    Someone could almost come up with a language that is "like C++ but less horrible".


    Core language:
    Like a C / C# hybrid;
    Base language could superficially resemble C++;
    In any case, avoid needlessly changing basic syntax.
    Designed to be easier to write a compiler;
    Designed so that compiler isn't dead slow;
    Goal should be that compiled code performance remain similar to C;
    Shouldn't do things that would exclude it from C like use cases.
    Type-system and memory model is mostly similar to C.

    Could impose that declaration types work more like in C#;
    SI + Interfaces rather than full MI;
    Generic Types rather than Templates;
    ...

    Parsing would follow a "can it reasonably be parsed as X" approach:
    If something can syntactically be parsed as a cast/etc, assume it is so;
    If this assumption turns out to be wrong, compiler error.

    In this case, assume that types/etc may only appear in certain contexts,
    and can apply on "<type_expr> <identifier>" pattern recognition to
    detect declarations. Trying to put a declaration (or generic invocation) somewhere where it doesn't normally go, being a syntax error.

    Say:
      if(int i=0)
        { ... }
    Would be regarded as illegal (but could be allowed as a special case in
    "for()" loops due to popularity).
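
    As background for the "can it be parsed as a cast" rule: even in plain C,
    the token sequence "(name) * p" is either a cast or a multiplication,
    depending on whether "name" currently denotes a type. A small sketch (all
    identifiers here are made up for the example):

```c
#include <assert.h>
#include <stddef.h>

typedef long wide;   /* at file scope, 'wide' names a type */

/* Here "(wide) * p" is a cast applied to the dereference "*p". */
static long as_cast(const int *p)
{
    assert(p != NULL);
    return (wide) * p;
}

/* Inside this function 'wide' is shadowed by a variable, so the very same
   token sequence "(wide) * p" is a multiplication of two longs. */
static long as_multiply(long p)
{
    long wide = 3;
    return (wide) * p;
}
```

    A C parser has to consult the symbol table to tell these apart. A rule
    that commits to the cast reading whenever it is syntactically possible
    would avoid that lookup, at the price of rejecting the second function.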


    Core type-system is C like:
      Type consists of a base-type and any modifiers;
      Base type names:
        char, byte, short, int, long, float, double, ...
      Type Modifiers:
        signed, unsigned, const, volatile, ...
        Where, const and volatile behave like in C.
      Base type sizes (bits):
        char     8   // ASCII / UTF-8
        wchar   16   // UCS-2 / UTF-16
        lchar   32   // UCS-4 / UTF-32
        byte     8
        sbyte    8
        ubyte    8
        short   16
        int     32
        long    64
        float   32
        double  64

    Issue:
    Would want to disallow compound type names, like "long long" or "long
    double".
    As soon as the syntax allows this, an ambiguity is created that
    adversely affects parser speed.

    One other option could be to provide a set of explicit sized types:
    int8, int16, int32, int64, int128
    uint8, uint16, uint32, uint64, uint128
    float16, float32, float64, float128
    ...


    In terms of representation, byte and char would be equivalent, except
    that it might make sense to treat 'char' as not normally an arithmetic
    type, so in order to perform arithmetic on char it would be cast to
    'int' or similar.

    Could assume that a "string" type exists, but primarily exists as an
    opaque "const char *" pointer.

    Nominally, string literals could be stored prefixed with a length stored
    as a transposed UTF-8 codepoint (along with also having a NUL
    terminator). But, unlike "const char *", "string" would not allow
    pointer arithmetic, and so would always point to the start of the string
    literal, or to an explicitly interned string (doing otherwise would be
    erroneous).


    While it is "trendy" in some languages to treat "string" as some sort of object type, actually doing string as an object is needlessly wasteful.

    Likewise, storing a length as a prefix still allows "string.length" or
    similar to be O(1). Could assume default string format is UTF-8, but
    mostly treated as a blob of bytes.
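
    A rough sketch in C of a string that is "just a pointer" but still has an
    O(1) length. Note the assumptions: the post describes a variable-length
    prefix (a transposed UTF-8 codepoint) without giving its encoding, so this
    sketch substitutes a fixed-size size_t prefix; the names lstring,
    lstring_new, lstring_length and lstring_free are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical opaque handle: points at the first byte of the text, with the
   length stored in the bytes immediately before it. */
typedef const char *lstring;

/* Allocates prefix + bytes + NUL and returns a pointer to the bytes. */
static lstring lstring_new(const char *bytes, size_t length)
{
    assert(bytes != NULL);
    char *block = malloc(sizeof(size_t) + length + 1);
    if (block == NULL)
        return NULL;
    memcpy(block, &length, sizeof(size_t));          /* length prefix */
    memcpy(block + sizeof(size_t), bytes, length);   /* payload bytes */
    block[sizeof(size_t) + length] = '\0';           /* C-compatible terminator */
    return block + sizeof(size_t);
}

/* O(1): reads the prefix instead of scanning for the terminator. Only valid
   for pointers returned by lstring_new, never for offsets into a string. */
static size_t lstring_length(lstring s)
{
    size_t length;
    assert(s != NULL);
    memcpy(&length, s - sizeof(size_t), sizeof(size_t));
    return length;
}

static void lstring_free(lstring s)
{
    if (s != NULL)
        free((void *)(s - sizeof(size_t)));
}
```

    The handle can still be passed to anything expecting a "const char *",
    and this is also why pointer arithmetic on it has to be forbidden: an
    offset pointer no longer has a valid prefix in front of it.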

    While some languages (Java and C#) went over to UTF-16, this is
    needlessly wasteful of memory and breaks with C tradition.



    Likely memory management:
      Generic heap, new/delete;
        No GC, as IMO no one thus far has made GC work sufficiently well.
      Zones / arenas;
        Initially resembles heap allocs,
        but all objects in a zone can be bulk freed;
        Some objects could be set to track their self-pointer;
          Self-pointer is NULL'ed if object is freed.
        Essentially, similar to Doom's Z_Malloc.
      Automatic:
        Freed by default as soon as parent frame exits.
    OO:
      Likely treat 'class' and 'struct' as distinct.
      Objects are reference type by default (like in C#/Java);
        However, will typically still have automatic lifetime if local.
        Could likely still have RAII, but may manifest differently.
      Structs would behave like C structs or C++ by_value classes.
        Will explicitly forbid inheritance or virtual methods.
        Could maybe have copy-constructors and destructors.
        More likely to be used for C++ style RAII patterns.

    The more restrictive object model would avoid a "big chunk of evil" that exists if trying to write a C++ compiler. Elimination of full MI
    eliminates a lot of complexity; as does eliminating inheritance on
    by-value types.


    Declaration imports:
    Would likely make sense to replace the reliance on "#include" with
    something more resembling the "import" mechanism from Java;
    Though would differ in that the imports identify source files rather
    than classes;

    Would likely treat package/import scope and namespace scope as two
    different entities. The package/import would resemble the mechanism in
    Java, but would instead merely import things at toplevel scope, and
    within an imported module.

    Namespace would be used in a way more like that of C++ or C# namespaces:
    namespace whatever { using whatever_else; ... }


    It is likely that compiler would first generate a "declaration manifest"
    which would be used for these purposes.

    Compiling Foo:
      import bar;
        Checks if bar has a manifest;
        If yes:
          Import manifest for bar;
          Add 'bar' to object dependency graph;
          Also import any of bar's dependencies.
        If no:
          Trigger frontend only compilation for 'bar';
          If success:
            Import manifest for bar;
            Add 'bar' to object dependency graph;
          Else:
            Compiler error.

    Compiler would likely deal with dependency compilation as a sort of
    stack machine, where imports are dealt with before compiling the main
    body of each module. This being to avoid excessive recursion and memory
    usage during dependency importing.
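
    The "stack machine" handling of imports could look roughly like the C
    sketch below. It is a toy model under stated assumptions: modules are
    entries in a fixed array rather than source files, "producing a manifest"
    is reduced to setting a flag, and the import-cycle check is an addition of
    this sketch that the post does not mention. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MODULES 16
#define MAX_DEPS 4

/* Hypothetical module record; a real compiler would read imports from source. */
typedef struct {
    const char *name;
    int deps[MAX_DEPS];   /* indices of imported modules */
    int dep_count;
    int has_manifest;     /* set once the front end has produced a manifest */
    int next_dep;         /* resume position while the module is on the stack */
    int on_stack;
} module_info;

/* Produces manifests for 'root' and everything it imports, using an explicit
   stack instead of recursion. Appends module indices to 'order' as each
   manifest is completed; returns the count, or -1 on an import cycle. */
static int resolve_imports(module_info *mods, int root, int *order)
{
    int stack[MAX_MODULES];
    int depth = 0, emitted = 0;

    assert(mods != NULL && order != NULL);
    if (mods[root].has_manifest)
        return 0;
    mods[root].on_stack = 1;
    mods[root].next_dep = 0;
    stack[depth++] = root;

    while (depth > 0) {
        module_info *cur = &mods[stack[depth - 1]];
        if (cur->next_dep < cur->dep_count) {
            int dep = cur->deps[cur->next_dep++];
            if (mods[dep].has_manifest)
                continue;                /* manifest exists: just import it */
            if (mods[dep].on_stack)
                return -1;               /* cyclic import: compiler error */
            mods[dep].on_stack = 1;
            mods[dep].next_dep = 0;
            stack[depth++] = dep;        /* handle the dependency first */
        } else {
            cur->has_manifest = 1;       /* all imports done: emit manifest */
            cur->on_stack = 0;
            order[emitted++] = stack[--depth];
        }
    }
    return emitted;
}
```

    With foo importing bar and baz, and bar importing baz, the manifests come
    out in the order baz, bar, foo, and the stack never grows deeper than the
    longest import chain.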

    Would likely make sense to keep a C style preprocessor, but using it for headers could be discouraged (this being a great source of compile-time inefficiency in both C and C++).

    ...


    Well, won't amount to much, just idle thoughts...




  • From Bonita Montero@3:633/10 to All on Thu May 28 11:48:21 2026
    On 28.05.2026 at 06:53, Lawrence D'Oliveiro wrote:

    C++ syntax is so complex, the language spec has to add rules that say,
    in case of ambiguity, that this interpretation is meant and not that.

    There are only a few ambiguities, mainly the most vexing parse.


  • From BGB@3:633/10 to All on Thu May 28 04:57:18 2026
    On 5/28/2026 2:18 AM, Janis Papanagnou wrote:
    On 2026-05-28 01:49, BGB wrote:
    [...]

    In general I like her videos, and she seems to know what she is
    talking about...

    But, I am not personally as much of a fan of C++ as she is...

    I'm a big fan of abstractions. - So many things beyond "C" are fine!


    I am not saying that abstractions are bad, but I haven't usually found
    them to be worth the costs IME.


    [...]

    [ Cygwin ]

    A sensible but imperfect workaround provided for an inferior platform.


    Yeah.

    It was the main thing I had when I was in high-school.
    Well, Cygwin and MinGW.


    By college, the "Windows Platform SDK" became freely available, and I
    mostly ended up jumping over this for native Windows development.


    At least the C++ compiler worked, still failed to fully win me over.
    Tried writing minor things in C++, mixed results.


    [...]

    Similar reason to why one doesn't build complex patterns or do
    template-like stuff via function macros in the C preprocessor:
    One can do this... But again, bloated binaries and terrible build times.

    Code patterns that are bulky in "C" can be formulated tersely with
    C++/STL (while still preserving an efficient implementation, even with complexities guaranteed); and the framework is flexible, orthogonally designed. Easy to reuse high-level concepts as opposed to re-implement
    the same code for different types. Or weaken the code by extensive use
    of casts. All sorts of C's problems with memory can be addressed. (The
    list can be continued; but I wonder why such things aren't recognized.)


    Both have a similar issue when used in a naive way though:
    Non-careful use of either results in code bloat.

    But, there is not really an "easy" way to avoid bloat, other than to write
    code specifically for the cases that are relevant, while also avoiding
    needless duplication and copy/paste (where overuse of copy/paste can also
    lead to bloat, along with turning the code into an ugly mess).

    But, OTOH, factoring things into too-small pieces can negatively
    affect performance (and, for non-leaf functions, prolog/epilog costs for
    too many tiny functions can also be a source of code bloat).


    As can be noted, trying to mimic templates via creative use of C
    preprocessor macros can also easily result in excessive bloat...




    So, one is back to the core issue:
    The part that is actually usable, mostly still amounts to syntactic
    sugar over things you can already do in C.

    Huh? It may depend on the developer/programmer. But it's certainly a
    lot more than "syntactic sugar".


    Well, for example:
    Operator overloading:
    Basically glorified function calls made to resemble operators;
    Classes:
    Can be done with structs, and implementing vtables manually.

    Implementing class hierarchies via structs can be done, but gets ugly
    (GTK's GObject system sorta went this way).
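
    For anyone who has not seen the "structs plus manual vtables" approach, a
    minimal sketch in C follows. It is a generic illustration, not GObject's
    actual layout, and all type and function names are invented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "class" layout: the first member of every object points at a
   table of function pointers shared by all instances of that type. */
typedef struct shape shape;

typedef struct {
    long (*area)(const shape *self);
    const char *(*name)(const shape *self);
} shape_vtable;

struct shape {
    const shape_vtable *vt;
};

/* "Derived" types embed the base struct as their first member. */
typedef struct { shape base; long width, height; } rect;
typedef struct { shape base; long side; } square;

static long rect_area(const shape *self)
{
    const rect *r = (const rect *)self;   /* downcast: base is first member */
    return r->width * r->height;
}
static const char *rect_name(const shape *self) { (void)self; return "rect"; }

static long square_area(const shape *self)
{
    const square *s = (const square *)self;
    return s->side * s->side;
}
static const char *square_name(const shape *self) { (void)self; return "square"; }

static const shape_vtable rect_vt = { rect_area, rect_name };
static const shape_vtable square_vt = { square_area, square_name };

/* "Virtual call": one extra indirection through the table. */
static long shape_area(const shape *s)
{
    assert(s != NULL && s->vt != NULL);
    return s->vt->area(s);
}
```

    This is roughly what a C++ compiler generates for single inheritance with
    virtual methods; the manual version works, but the casts and hand-written
    tables are where the ugliness mentioned above comes from.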

    ...



    There are "niceties", granted, but relatively little "actually new".

    Not all concepts are "new", of course; we saw them in other languages
    years or (in some cases) decades ago. But C++' and STL features are a
    lot more than just niceties; it's beyond me how one may come to such
    a valuation. (And now let's compare that formulated demand or wish of
    new things with "C"?)


    Well, it is a thing that can be done, but is a double-edged sword.
    Saves code one might have to write out manually.
    But it very easily results in things that negatively affect build times.


    Usual strategy is to try to limit how much code is written, and also to
    avoid doing things in ways that result in too much code, or too much cruft.

    Best to avoid both copy paste when reasonable, and sticking anything non-trivial in macros.



    And, one of the rare few "actually new" features C++ offers: exceptions.

    We used them already in the 1990's.


    Here, "new" in the sense that it can't be mapped directly back to stuff
    that can already be expressed natively in C.


    Also comes with its own drawbacks (code bloat, try/catch+throw is
    usually slow, use with care else program explodes, ...). [...]

    I cannot confirm your statements, especially in that generality.

    I recall we had bloat with templates on a specific platform in the
    very early pre-standard era, when they were first supported. But we
    didn't have any [noteworthy] speed degradation with exceptions (or templates).


    The relative impact of try/catch is more modest.

    Typically, it results in every function having an unwind-handling stub
    so that, in case an exception is thrown, it can call any destructors or
    similar.


    This will depend a lot on the target and ABI, but for example, in my
    compiler, having exceptions enabled ends up costing around an extra 48
    bytes for every non-leaf function.

    This can easily add up to a fair number of kB over the size of a program binary even if they aren't used.


    The per-function cost would be higher for functions which have objects
    with destructors or catch handlers. One could easily be looking at 100s
    of additional bytes for each such function.




    Many people coding styles often forbid it and mandate that people use
    error-code returns or similar instead (like in C), with exceptions
    being globally disabled at build time (along with RTTI and similar),
    which, isn't really a strong selling point...

    Yes, stupid things are done. Mandating to use RCs and forbid to use exceptions is particularly stupid as a general rule.

    Probably mentally inconvenient as "new" concept for FORTRAN, BASIC,
    or "C" programmers? I can certainly understand that, psychologically.
    But not technically. Maybe the support on the commercial platforms
    was just better than on Cygwin?


    On Cygwin, in the early 2000s, "g++" tended to often not work at all...
    MSVC gave a better experience in that at least the compiler worked.

    Now, as for the rules like this in many coding standards, can't speak
    for all of them.


    But, the main thing that stands out to me is how many of the projects
    that use C++ have absurdly slow build times and huge binaries.

    Like, someone is like, "Thing takes 10 minutes to compile and the EXE is
    50MB? Seems fine to me..."



    Well, or LLVM, which can take upwards of an hour to rebuild from source,
    and eats huge amounts of RAM and HDD space while doing so.

    So, yeah, I don't really use LLVM...



    My own C compiler rebuilds from source in around 20 seconds or so.
    Well, or 1m38s via GCC...

    But, yeah however long is considered a reasonable amount of time to
    recompile a 370 kLOC C compiler...

    MSVC produces a 5.5MB EXE, whereas GCC produces a 12MB ELF.

    If I switch "-O3" to "-Os":
    Time drops to 0m58s.
    ELF goes to 7MB.

    Still pretty big...


    Granted, my C compiler's codebase is a little bigger than I would
    prefer, but alas...





    Granted, my C compiler isn't itself all that fast at compiling stuff, sadly:
    It also takes around 20 seconds to recompile Doom (vs around 5 seconds
    for MSVC).

    And, GCC takes around 16 seconds to recompile Doom.

    So, yeah, as for its compiling-stuff speeds, it is typically slower than
    GCC.

    Doom sizes:
    MSVC : 776K (x64)
    GCC : 500K (x86-64)
    GCC : 1.2MB (RV64G, ELF)
    BGBCC : 272K (XG3, *)

    *: Custom ISA stuff bolted onto RISC-V (with 32/64/96 bit instructions,
    goes to bigger instructions rather than smaller, and expands register
    fields to 6-bits using both X and F registers as GPRs, so is the
    opposite of RV-C in this sense; but manages to compete for code density
    by reducing overall instruction counts).

    For binary sizes, it is pretty close between XG3 and a
    differently-extended version of RV64GC though. Not yet a clear winner of
    the code-density crown, but it does more solidly win on speed.

    Where, XG3 was more meant for optimizing the design for speed, but to
    some extent optimizing the design for speed also helps with code density.



    Though, for RV64G, GCC seems to produce a lot of bloat in the ELF
    binaries through the use of excessive amounts of metadata, which is
    already fairly bulky in the ELF format in contrast with PE/COFF (which
    is what my compiler generates).


    ...



  • From fir@3:633/10 to All on Thu May 28 17:12:25 2026
    BGB wrote:
    On 5/27/2026 1:15 PM, fir wrote:
    fir wrote:
    https://www.youtube.com/watch?v=I7fEsbksKRE

    as far as i understood..(becouse if someona talks english fast my
    mind tend to skip more than half of the message)

    overally this is quite curious...


    she probably read this articles etc for lisp ponys who say lisp is
    beautifull and c is not so much... but still this is much of
    incompetence call c ugly...


    well i see she named bjorne stroustrup smart (well i wouldnt be so
    bold here) - and said he took UGLY c and combined it with OO so this
    make a c++ 'succesfull' language which
    takes much hate

    so i understand c is ugly and to blame c++ is hated... instead of
    coding in lisp... now thats where comedy possibly gets too strong :3

    (now im quite convinced my comment on this is also not too strong in
    bad way of its sense but i was kinda really stunned/surprised ehen
    someona talks such things on video ) -,-

    In general I like her videos, and she seems to know what she is talking about...


    good to watch for sure, but those statements are still preposterous

    for me it's kinda funny because i didn't think people who say c is ugly
    are real

    though my opinion of c++ recently rose from about -10/10 to maybe
    -9/10 because of these so-called 'references', which after some thinking
    proved to make some sense (though in c++ they probably don't even know it
    makes sense, they just add sh*t, and by chance 1 in 100 has some sense)


    But, I am not personally as much of a fan of C++ as she is...


  • From BGB@3:633/10 to All on Thu May 28 14:07:45 2026
    On 5/28/2026 10:12 AM, fir wrote:
    BGB wrote:
    On 5/27/2026 1:15 PM, fir wrote:
    fir wrote:
    https://www.youtube.com/watch?v=I7fEsbksKRE

    as far as i understood..(becouse if someona talks english fast my
    mind tend to skip more than half of the message)

    overally this is quite curious...


    she probably read this articles etc for lisp ponys who say lisp is
    beautifull and c is not so much... but still this is much of
    incompetence call c ugly...


    well i see she named bjorne stroustrup smart (well i wouldnt be so
    bold here) - and said he took UGLY c and combined it with OO so this
    make a c++ 'succesfull' language which
    takes much hate

    so i understand c is ugly and to blame c++ is hated... instead of
    coding in lisp... now thats where comedy possibly gets too strong :3

    (now im quite convinced my comment on this is also not too strong in
    bad way of its sense but i was kinda really stunned/surprised ehen
    someona talks such things on video ) -,-

    In general I like her videos, and she seems to know what she is
    talking about...


    good to watch for sure, but those statements are still preposterous

    for me its kinda funny becouse i didnt think people who say c is ugly
    are real

    though my opinion on c++ from -10/10 or about rised recently maybe to
    -9/10 becouse of this so called 'references' who after thinking shoved
    to have some sense (thou in c++ they probably dont even know it have
    sense thay just add sh*t (and by chance 1 on 100 has some sense)


    I can't answer for her, but there are differences in aesthetic preferences.


    Some like LISP syntax, others can't stand the excessive parentheses.
    There have been attempts to eliminate the parentheses, but then you can
    end up with an indentation sensitive syntax (like Python):
      defun foo (x y)
        if ( > x y )
          - ( * 2 x ) y
          - ( * 2 y ) x

    But, then you have all the hassles of white-space sensitive syntax...

    Infix notation and precedence rules are pros/cons.

    Many people much preferred Pascal style.

    Smalltalk was once popular, and while arguably some aspects of its
    syntax are "aesthetic", I personally found trying to read anything in
    the language to be almost incomprehensible (so, negative points if I
    can't make any sense of what is going on).

    Trying to awkwardly bolt Smalltalk syntax onto C (as in Objective-C)
    being horribly ugly IMO.


    Early on, I had liked JS and ActionScript, as (compared with LISP and
    Scheme) they scaled a lot easier to "real programming work".

    But, then one faces a tension:
    Light duty scripting: Favors keeping the language dynamic and minimizing structural concerns;
    Implementation work: Favors strongly going in a direction more like the
    C-like languages.

    So, my first major language ended up going from being a small JS clone
    to something more like ActionScript3 (with a large and complex VM).

    Then I rebooted it into something more Java-like (a language I called
    BS2), but then:
    It was no longer good at light scripting tasks;
    It failed to really compete well with C in C's home turf.


    BGBCC has BS2 support, but I rarely use it, as it was often more useful
    to write in C and then invoke BS2 features as C extensions when relevant.

    Well, even if the syntax is ugly, and people can question why BS2
    features and not C++ features.


    Both C++ and BS2 had exceptions, but in both cases, enabling them has
    non-zero overhead.

    This mostly takes the form of:
    Metadata tables to map code-locations to unwind and exception-handler
    entry points;
    Additional stub handlers to be generated in the code mostly to be like
    "all good here, continue on your way."
    Though, maybe could add a flag or something to the EH-unwind metadata to
    flag that no handlers are present, so it should just skip past that step
    and continue on directly. In this case, the idea is that it borrows a
    trick from the Windows X64 ABI of using the function epilog to encode information for the exception unwind via machine instructions (the EH
    unwinder would effectively implement a sort of very naive machine-code interpreter).

    So, the metadata here is mostly a table of packed members to encode the
    start and end RVA (within .text), where to find the epilog, and where to
    find the entry-point for the unwind/catch handlers. I forget the exact
    layout ATM, but do remember it originally came from the PE/COFF spec.
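
    Since the exact table layout is not given, the following C sketch uses an
    invented one purely to show the lookup step: given a code address inside
    .text, find the entry that covers it. The struct fields, the convention
    that handler_rva == 0 means "no handler", and the function name are all
    assumptions of this sketch, not the PE/COFF or BGBCC format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry layout (not the real PE/COFF one). */
typedef struct {
    uint32_t start_rva;    /* first byte of the function in .text */
    uint32_t end_rva;      /* one past the last byte */
    uint32_t epilog_rva;   /* where the unwinder can find the epilog */
    uint32_t handler_rva;  /* unwind/catch entry point, 0 if none (assumed) */
} unwind_entry;

/* Binary search over a table sorted by start_rva with non-overlapping ranges.
   Returns the entry covering 'rva', or NULL if no function contains it. */
static const unwind_entry *find_unwind_entry(const unwind_entry *table,
                                             size_t count, uint32_t rva)
{
    size_t lo = 0, hi = count;

    assert(table != NULL || count == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (rva < table[mid].start_rva)
            hi = mid;
        else if (rva >= table[mid].end_rva)
            lo = mid + 1;
        else
            return &table[mid];
    }
    return NULL;
}
```

    An unwinder would run this once per stack frame. An entry with no handler
    lets it skip straight to interpreting the epilog, which is the shortcut
    suggested above.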



    Did experiment with using the BS2 features in BGBCC to support an EC++
    like subset, but this fell well short of being able to claim actual C++ support.

    Or, Like EC++ but with interfaces and namespaces.

    Didn't add the actual interface keyword though, more like:
      abstract class IFoo {
      public:
          void method1();
          void method2(int x, int y);
      };
    And, it understands it as an interface.

    Did experiment with adding support for templates, but this is mostly
    where I gave up.

    Even getting to C++98 levels would be a strong uphill battle.

    And, then I lose incentive as I don't really use C++, and (unlike C
    land) the C++ people tend to chase after the newest features, rather
    than stick to an older and more conservative subset.


    But, yeah, some priorities I would have for such a language (as another
    attempt at a C++ alternative):
      Scope remains within what is reasonable for a person to implement in a
      custom compiler;
      Language should ideally be able to compete head on with C in C-like
      use-cases (say, for example, including bare-metal programming and ROMs);
      Language should ideally be able to have build times similar or better
      than C for similar code complexity.

    So, say, compiler should be ideally able to chew through at least 10
    kLOC per second or so.

    ...


  • From Chris M. Thomasson@3:633/10 to All on Thu May 28 12:47:56 2026
    On 5/28/2026 12:18 AM, Janis Papanagnou wrote:
    On 2026-05-28 01:49, BGB wrote:
    [...]

    In general I like her videos, and she seems to know what she is
    talking about...

    But, I am not personally as much of a fan of C++ as she is...

    I'm a big fan of abstractions. - So many things beyond "C" are fine!

    [...]

    [ Cygwin ]

    A sensible but imperfect workaround provided for an inferior platform.

    [...]

    Similar reason to why one doesn't build complex patterns or do
    template-like stuff via function macros in the C preprocessor:
    One can do this... But again, bloated binaries and terrible build times.

    Code patterns that are bulky in "C" can be formulated tersely with
    C++/STL (while still preserving an efficient implementation, even with
    complexities guaranteed); and the framework is flexible, orthogonally
    designed. Easy to reuse high-level concepts as opposed to re-implement
    the same code for different types. Or weaken the code by extensive use
    of casts.

    All sorts of C's problems with memory can be addressed. (The
    list can be continued; but I wonder why such things aren't recognized.)

    C's problems with memory? Don't you mean the programmers that make bugs?

    [...]

  • From Lawrence D'Oliveiro@3:633/10 to All on Thu May 28 23:32:16 2026
    On Thu, 28 May 2026 02:35:37 -0500, BGB wrote:

    Someone could almost come up with a language that is "like C++ but less horrible".

    Like this? <https://en.wikipedia.org/wiki/Carbon_(programming_language)>

  • From Lawrence D'Oliveiro@3:633/10 to All on Thu May 28 23:35:09 2026
    On Thu, 28 May 2026 04:57:18 -0500, BGB wrote:

    It was the main thing I had when I was in high-school. Well, Cygwin
    and MinGW.

    Did you ever discover the platform that they were trying to emulate?

  • From Lawrence D'Oliveiro@3:633/10 to All on Thu May 28 23:54:52 2026
    On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:

    Some like LISP syntax, others can't stand the excessive parentheses.

    Worth mentioning that the whole point about Lisp syntax is to get as
    close to no syntax as possible -- essentially, all the "syntax" is
    down to the meaning of the first word after each opening parenthesis
    (function call, macro call or "special form"). This is to achieve
    homoiconicity, which is a core feature of the language.

    Anybody remember PostScript? That, too, was homoiconic. And achieved
    it in a similar way, with an absolutely minimalist compile-time
    syntax.

    Other, newer languages have found a way to implement AST-level macros
    without going to that sort of extreme -- Julia and Rust, I think, can
    manage it. I even did it in Python, with a little bit of fudging.

    There have been attempts to eliminate parentheses ...

    The problem for me is the "parenthesis pileup" layout that seems to be
    traditional among Lisp programmers. I prefer to put parentheses that
    have the interpretation of delimiting statement blocks on lines by
    themselves (and sometimes other constructs as well, if they get too
    long), e.g.

    (defun set_auto_indent (&optional on)
        "lets user change auto-indent setting."
        (interactive)
        (when (eq on nil)
            (setq on
                (y-or-n-p
                    (format
                        "Auto-indent [%s]? "
                        (if
                            (eq
                                (lookup-key (current-global-map) "\015")
                                'auto_indent
                            )
                            "y"
                            "n"
                        ) ; if
                    )
                )
            ) ; setq
        ) ; when
        (cond
            (on
                (global-set-key "\015" 'auto_indent)
                (global-set-key [?\C-\M-m] 'newline)
                (message "Auto-indent on")
            )
            (t
                (global-set-key "\015" 'newline)
                (global-set-key [?\C-\M-m] 'auto_indent)
                (message "Auto-indent off")
            )
        ) ; cond
    ) ; set_auto_indent

    Infix notation and precedence rules are pros/cons.

    Python took over most of the C operator precedence rules, with one
    interesting wrinkle: they moved up the precedence of the bitwise
    operators so that what has to be written like this in C:

        («val» & «mask») == «expected»

    can have the parentheses omitted in Python:

        «val» & «mask» == «expected»
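    A small C sketch of why the parentheses matter there (function names are
    made up for illustration): in C, == binds tighter than &, so the
    unparenthesized form compares first and masks second.

```c
#include <stdbool.h>

/* Intended test: do the masked bits of val equal expected? */
bool masked_equals(unsigned val, unsigned mask, unsigned expected)
{
    return (val & mask) == expected;
}

/* How a C compiler groups "val & mask == expected":
   the comparison happens first, then the result (0 or 1) is ANDed with val.
   The grouping is written out explicitly here to make it visible. */
unsigned as_c_parses_it(unsigned val, unsigned mask, unsigned expected)
{
    return val & (unsigned)(mask == expected);
}
```

    With val=0x12, mask=0xF0, expected=0x10 the first function reports a
    match, while the second yields 0, since mask and expected differ.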

    Smalltalk was once popular, and while arguably some aspects of its
    syntax are "aesthetic", I personally found trying to read anything
    in the language to be almost incomprehensible (so, negative points
    if I can't make any sense of what is going on).

    I did have a look at it at one point. Not too hard to manage. The
    hardest part was trying to find some actual explicit definition of the
    whole syntax.

    All in all, though, I think its approach to object-orientation is a
    bit ancient, compared to, say, Python.

    Early on, I had liked JS and ActionScript, as (compared with LISP
    and Scheme) they scaled a lot easier to "real programming work".

    But, then one faces a tension:
    Light duty scripting: Favors keeping the language dynamic and
    minimizing structural concerns;
    Implementation work: Favors strongly going in a direction more like
    the C-like languages.

    Lua sounds like it was designed for what you had in mind.

    But these days, it's just easier to use Python.

  • From BGB@3:633/10 to All on Thu May 28 20:07:08 2026
    On 5/28/2026 6:32 PM, Lawrence D'Oliveiro wrote:
    On Thu, 28 May 2026 02:35:37 -0500, BGB wrote:

    Someone could almost come up with a language that is "like C++ but less
    horrible".

    Like this? <https://en.wikipedia.org/wiki/Carbon_(programming_language)>

    Well, except that the language I am imagining would probably still look
    more C-like (where, say, C-like core but borrowing some things from C#
    and Java rather than aiming to be like C++; but in terms of
    implementation and ABI would be sorta like C++).


    Possibly something like:
      import C.stdio;

      int main(int argc, const char **argv)
      {
        printf("hello world, called as %s\n", argv[0]);
        return(0);
      }

    Where, say, new language inherits the C standard library, just using
    import rather than #include .


    Then, say, if we wanted a 2D vector type, say:
      namespace foo {
        struct Vector2D {
          double x;
          double y;
          Vector2D(double ax, double ay)
            { x=ax; y=ay; }
        }

        Vector2D operator+(Vector2D a, Vector2D b)
        {
          return (Vector2D) { .x=a.x+b.x, .y=a.y+b.y };
        }
        double operator^(Vector2D a, Vector2D b)
        {
          return a.x*b.x + a.y*b.y;
        }
      }

    Then, say:
      using foo; //can skip 'namespace' keyword, as redundant here.

      Vector2D test1()
      {
        Vector2D v0(1.0, 2.0);
        Vector2D v1(3.0, 4.0);
        Vector2D v2 = v0 + v1;
        return v2;
      }

    Or, for a class type:
      public class Foo {
        double x, y;
        Foo(double ax, double ay)
          { x=ax; y=ay; }
      }
      public interface IBaz {
        //public and virtual by default
        double get_x();
        double get_y();
        void set_x(double val);
        void set_y(double val);
      }
      public class Bar:Foo,IBaz {
        //public virtual via IBaz
        double get_x() { return x; }
        double get_y() { return y; }
        void set_x(double val) { x=val; }
        void set_y(double val) { y=val; }
      }

      ...
      {
        IBaz obj1 = new Bar(1.0, 2.0);
      }


    And, if generics exist, say:
      struct GenericVec2D<T> {
        T x;
        T y;
        ...
      }

      GenericVec2D<float> v2f0;
    This instantiates a new type, say:
      GenericVec2D$float
    Where all instances of type 'T' become 'float'.
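
    For comparison, a rough sketch of how this kind of per-type instantiation
    tends to get faked in plain C with a macro. The macro and the generated
    names (GenericVec2D_float, vec2d_dot_float) are invented for this example
    and are not part of any proposed language.

```c
/* Stamp out one struct type and one helper per element type T.
   Each use of the macro produces a separate copy of the code,
   much like a template instantiation would. */
#define DEFINE_GENERIC_VEC2D(T)                                      \
    typedef struct GenericVec2D_##T { T x; T y; } GenericVec2D_##T;  \
    static inline T vec2d_dot_##T(GenericVec2D_##T a,                \
                                  GenericVec2D_##T b)                \
    { return a.x * b.x + a.y * b.y; }

DEFINE_GENERIC_VEC2D(float)   /* roughly what GenericVec2D$float would be */
DEFINE_GENERIC_VEC2D(double)
```

    Every instantiation duplicates the helper code, which is the same bloat
    concern raised elsewhere in the thread about macro-based "templates".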


    So, more in the vein of C mixed with C# and Java, not like some language
    that diverges more drastically from C family conventions...

    Where, like C# (and unlike C++), classes are always by-reference, and
    structs always by value, and structs will forbid inheritance, ...


    Could maybe also define that any functions declared at the global
    top-level have C like binding, with name-mangling and overloading only
    applying within namespaces or within struct/class methods.



    Note that SI+interfaces (single inheritance plus interfaces) would
    effectively eliminate things like diamond inheritance and virtual
    inheritance (in effect, class layout will be append-only).

    In-memory layout for classes could be sort of like:
      vtable_pointer (shared across all classes)
      data for parent class
      interface vtable pointer(s) for parent class
      -- end of parent class --
      data for child class
      interface vtable pointer(s) for child class
        only for newly-implemented interfaces.
      -- end of child class --


    Each interface VTable could have a layout like, say:
    Index 0: ClassInfo / RTTI
    Index 1: Offset from Interface Vtable pointer to Base-Class.
    Index 2: First Method
    Index 3: Second Method
    ...

    For base classes, it is nearly identical, except that offset is always 0.
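
    A rough C sketch of how an interface call could work with that layout.
    This is only one reading of the description: the vtable is modeled as a
    struct rather than an array of slots, the offset is stored as a negative
    byte count, and all type and function names are hypothetical.

```c
#include <stddef.h>

typedef struct ClassInfo { const char *name; } ClassInfo;

/* Interface vtable, in the slot order listed above. */
typedef struct IBazVtable {
    const ClassInfo *info;          /* index 0: ClassInfo / RTTI */
    ptrdiff_t offset_to_base;       /* index 1: interface slot -> object start */
    double (*get_x)(void *self);    /* index 2: first method */
    double (*get_y)(void *self);    /* index 3: second method */
} IBazVtable;

/* Object layout: main vtable pointer, data, then interface vtable pointer. */
typedef struct Bar {
    const void *vtable;             /* main class vtable (unused in this sketch) */
    double x, y;
    const IBazVtable *ibaz;         /* interface vtable pointer for IBaz */
} Bar;

/* An interface reference points at the interface slot inside the object. */
typedef const IBazVtable **IBazRef;

static double bar_get_x(void *self) { return ((Bar *)self)->x; }
static double bar_get_y(void *self) { return ((Bar *)self)->y; }

static const ClassInfo bar_info = { "Bar" };
static const IBazVtable bar_ibaz_vtable = {
    &bar_info,
    -(ptrdiff_t)offsetof(Bar, ibaz),
    bar_get_x,
    bar_get_y,
};

void bar_init(Bar *b, double x, double y)
{
    b->vtable = NULL;
    b->x = x;
    b->y = y;
    b->ibaz = &bar_ibaz_vtable;
}

IBazRef bar_as_ibaz(Bar *b) { return &b->ibaz; }

/* Interface calls: step back to the object start, then dispatch. */
double ibaz_get_x(IBazRef ref)
{
    const IBazVtable *vt = *ref;
    return vt->get_x((char *)ref + vt->offset_to_base);
}

double ibaz_get_y(IBazRef ref)
{
    const IBazVtable *vt = *ref;
    return vt->get_y((char *)ref + vt->offset_to_base);
}
```

    The caller only ever sees the interface slot; the offset in index 1 is
    what lets the callee recover the full object without knowing its class.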

    ...



  • From Bonita Montero@3:633/10 to All on Fri May 29 08:09:53 2026
    Am 28.05.2026 um 21:07 schrieb BGB:

    Both C++ and BS2 had exceptions, but in both cases, enabling them has non-zero overhead.

    Table-driven exception handling isn't very old but currently it applies
    for every 64 bit platform; under Windows / x86 concatenated stackframes
    are used. With table-driven EH the additional overhead is zero. And with
    the older concatenated stackframes the overhead is very low.

    And, then I lose incentive as I don't really use C++, and (unlike C
    land) the C++ people tend to chase after the newest features, rather
    than stick to an older and more conservative subset.

    There's no language where the users are so detail-focused and open
    to new features. But these new features raise productivity a lot,
    and it was far beyond C even with C++98. With C you have to flip every
    bit yourself over and over, and C++ replaces that with standard
    components.
    This has been emphasized through a lot of C++ channels on YouTube;
    I personally prefer the CppCon vids or the vids of Jason Turner.
    And there are a lot of good books like those of Rainer Grimm and
    Nicolai Josuttis.

  • From Janis Papanagnou@3:633/10 to All on Fri May 29 09:52:44 2026
    On 2026-05-28 11:57, BGB wrote:
    On 5/28/2026 2:18 AM, Janis Papanagnou wrote:
    On 2026-05-28 01:49, BGB wrote:
    [...]
    I'm a big fan of abstractions. - So many things beyond "C" are fine!

    I am not saying that abstractions are bad, but I haven't usually found
    them to be worth the costs IME.

    Wow! - That's completely different from my experience and practice.

    It's what makes usage simple, fast, reliable. Not wasting time for
    details, or fixing technical bugs that should be prevented by the
    language.


    [...]

    Similar reason to why one doesn't build complex patterns or do
    template-like stuff via function macros in the C preprocessor:
    One can do this... But again, bloated binaries and terrible build times.

    Code patterns that are bulky in "C" can be formulated tersely with
    C++/STL (while still preserving an efficient implementation, even with
    complexities guaranteed); and the framework is flexible, orthogonally
    designed. Easy to reuse high-level concepts as opposed to re-implement
    the same code for different types. Or weaken the code by extensive use
    of casts. All sorts of C's problems with memory can be addressed. (The
    list can be continued; but I wonder why such things aren't recognized.)

    Both have a similar issue when used in a naive way though:
      Non-careful use of either results in code bloat.

    Okay, "when used in a naive way". - Let's leave it at that,
    then.


    But, not really an "easy" way to avoid bloat, other than to write code
    specifically for what cases are relevant; while also avoiding needless
    duplication and copy paste (where, overuse of copy/paste can also lead
    to bloat; along with turning the code into an ugly mess).

    Hmm.. - as said, during the very early days there were issues; I
    recall on one platform duplication of template code in more than
    one source unit. And/or some environmental hacks (of the compiler)
    to deposit template code for linking. In the later days I've not
    seen such immature things anymore.


    But, OTOH, factoring things into too small of pieces can negatively
    affect performance (and, for non-leaf functions, prolog/epilog costs for
    too many tiny functions can also be a source of code bloat).

    As can be noted, trying to mimic templates via creative use of C
    preprocessor macros can also easily result in excessive bloat...


    So, one is back to the core issue:
    The part that is actually usable, mostly still amounts to syntactic
    sugar over things you can already do in C.

    Huh? It may depend on the developer/programmer. But it's certainly a
    lot more than "syntactic sugar".

    Well, for example:
      Operator overloading:
        Basically glorified function calls made to resemble operators;
      Classes:
        Can be done with structs, and implementing vtables manually.

    Implementing class hierarchies via structs can be done, but gets ugly
    (GTK's GObject system sorta went this way).

    We obviously disagree completely in what's "syntactic sugar".

    (With that reasoning all ("C" or other languages') features are
    "syntactic sugar" because you can do that also with assembly?)


    There are "niceties", granted, but relatively little "actually new".

    Not all concepts are "new", of course; we saw them in other languages
    years or (in some cases) decades ago. But C++' and STL features are a
    lot more than just niceties; it's beyond me how one may come to such
    a valuation. (And now let's compare that formulated demand or wish of
    new things with "C"?)

    Well, it is a thing that can be done, but is a double-edged sword.
    Saves code one might have to write out manually.
    But, is very easy to result in things that negatively affect build times.

    A simple, less-abstracted language can certainly be easier (thus
    faster) translated to machine code.

    I don't know about your working contexts. In our contexts slightly
    larger build times were no issue. For one, we built using makefiles,
    and only full builds (to create QA test images, or public releases)
    required much time; they typically ran over night; our systems were
    typically very large!

    Build times were also influenced by other more significant factors.
    Mundane sounding things like ordering of functions in libraries and
    some such. (Though nothing that wouldn't have been possible to be
    addressed by the build-management group.)


    Usual strategy is to try to limit how much code is written, and also to
    avoid doing things in ways that result in too much code, or too much
    cruft.

    Best to avoid both copy paste when reasonable, and sticking anything
    non-trivial in macros.

    We avoided macros if possible.


    And, one of the rare few "actually new" features C++ offers: exceptions.

    We used them already in the 1990's.

    Here, "new" in the sense that it can't be mapped directly back to stuff
    that can already be expressed natively in C.

    Okay.


    Also comes with its own drawbacks (code bloat, try/catch+throw is
    usually slow, use with care else program explodes, ...). [...]

    I cannot confirm your statements, especially in that generality.

    I recall we had bloat with templates on a specific platform in the
    very early pre-standard era, when they were first supported. But we
    didn't have any [noteworthy] speed degradation with exceptions (or
    templates).

    The relative impact of try/catch is more modest.

    Aha; I thought that this would have been the source of criticism.


    Typically, it results in every function having an unwind-handling stub
    for, in-case an exception is thrown, it can call any destructors or
    similar.

    I've seen and heard of many ways in which exceptions have been used,
    ranging from a single "catch all" in the main() function, to each
    function instrumented. I will not judge about these extreme cases.
    All I say is that you, as a software designer, have the options to
    sensibly structure and instrument your code with exceptions.

    There's also the characteristic that you may define exception types
    (or use just existing ones); build or add to a hierarchy to handle
    them flexibly, provide context data with the exception objects, etc.
    Handling all that manually and explicitly, without the support of an
    exception concept I'd certainly not prefer.

    Janis

    [...]


  • From Janis Papanagnou@3:633/10 to All on Fri May 29 09:56:57 2026
    On 2026-05-28 21:47, Chris M. Thomasson wrote:
    On 5/28/2026 12:18 AM, Janis Papanagnou wrote:
    [...]

    All sorts of C's problems with memory can be addressed. (The
    list can be continued; but I wonder why such things aren't recognized.)

    C's problems with memory? Don't you mean the programmers that make bugs?

    I'm not sure you're serious here or just joking. - To clarify...

    Yes, the programmers "implement the bugs", and the language makes it
    easy, and obligingly supports the programmers in making such bugs.

    Janis


  • From Janis Papanagnou@3:633/10 to All on Fri May 29 10:02:18 2026
    On 2026-05-29 01:54, Lawrence D'Oliveiro wrote:
    On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:

    Infix notation and precedence rules are pros/cons.

    Python took over most of the C operator precedence rules, with one
    interesting wrinkle: they moved up the precedence of the bitwise
    operators so that what has to be written like this in C:

        («val» & «mask») == «expected»

    can have the parentheses omitted in Python:

        «val» & «mask» == «expected»

    Unsurprisingly; since exactly *that* was the obvious (and single)
    issue with C's precedence definitions.

    Janis


  • From BGB@3:633/10 to All on Fri May 29 04:15:37 2026
    On 5/29/2026 1:09 AM, Bonita Montero wrote:
    Am 28.05.2026 um 21:07 schrieb BGB:

    Both C++ and BS2 had exceptions, but in both cases, enabling them has
    non-zero overhead.

    Table-driven exception handling isn't very old but currently it applies
    for every 64 bit platform; under Windows / x86 concatenated stackframes
    are used. With table-driven EH the additional overhead is zero. And with
    the older concatenated stackframes the overhead is very low.


    IME, it is not quite zero, but it is the least-overhead option I am
    aware of.

    Arguably it is less overhead than doing state-capture and adding
    exception-frames to a linked list, but, ...

    It is also less bloated than needing to drag around DWARF debug-info.




    Looking it up in my specs for the PE/COFF variant I am using for my own
    target, each entry is 64 bits:
      32 bits: Func Base RVA
      32 bits: Packed Lengths
        ( 7: 0): Length of prolog in instruction words;
        (29: 8): Length of function in instruction words.
        (   30): Selector for instruction word size (*1).
        (   31): Set to indicate whether a catch handler exists.

    These go into ".pdata" which is basically an array of these entries.


    *1: Relevant for ISA's that can have 16 or 32 bit instruction words, but
    a given function might only use 32-bit words despite 16 being available.
    Say, for example, RV64GC but only using RV64G for a given function.
    Set to indicate only 32-bit instructions were used.
    Currently unused.
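
    As a rough illustration of the entry layout above, a C sketch that packs
    and unpacks the 32-bit "Packed Lengths" word. The struct and function
    names are made up for this example; only the bit positions come from the
    field list.

```c
#include <stdbool.h>
#include <stdint.h>

/* One 64-bit ".pdata" entry. */
typedef struct PdataEntry {
    uint32_t func_base_rva;  /* 32 bits: Func Base RVA */
    uint32_t packed;         /* 32 bits: Packed Lengths */
} PdataEntry;

/* Build the packed word from its fields. */
uint32_t pdata_pack(uint32_t prolog_words, uint32_t func_words,
                    bool words_32_only, bool has_catch)
{
    return (prolog_words & 0xFFu)               /* bits  7:0  */
         | ((func_words & 0x3FFFFFu) << 8)      /* bits 29:8  */
         | ((uint32_t)words_32_only << 30)      /* bit  30    */
         | ((uint32_t)has_catch << 31);         /* bit  31    */
}

uint32_t pdata_prolog_words(uint32_t packed) { return packed & 0xFFu; }
uint32_t pdata_func_words(uint32_t packed)   { return (packed >> 8) & 0x3FFFFFu; }
bool pdata_words_32_only(uint32_t packed)    { return ((packed >> 30) & 1u) != 0; }
bool pdata_has_catch(uint32_t packed)        { return ((packed >> 31) & 1u) != 0; }
```

    The function-length field is 22 bits wide, so a single entry can describe
    a function of up to about 4M instruction words.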



    Looking at it again, I had forgotten the mechanism:
    The basic handler does some trickery with the saved link register and
    then branches to the epilog, which then does any cleanup and returns
    control back to the unwind code (the original LR is put into an area the
    unwind logic can use for the next step).

    If catch handlers exist, they will check the unwinding RVA against the
    RVA ranges for the try blocks, and if there is a hit, it can check if
    the exception object matches the class. This part is handled by code
    generated by the compiler. If no hit, it goes through the epilog which
    then hands it back to the unwinder.

    So, minimum-case cost is:
      8 bytes of ".pdata" + 4 instruction words
    Or around 24 bytes (with 32-bit instruction words).



    Say, for example, one has a program with 1200 functions, and an
    average-length function of 276 bytes.

    This adds 19K vs 331K to ".text", expanding the overall size of ".text"
    by around 6% (plus ~10K of ".pdata").

    Granted, this is vs, say:
    23K used for string literals;
    144K for the initial contents of ".data";
    ...
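
    The estimate above can be reproduced with a few lines of C. This is just
    the arithmetic, under the assumption of 4-byte instruction words (which
    is what makes 8 bytes plus 4 words come to 24 bytes); the names are
    invented for the sketch.

```c
/* Size breakdown for the per-function EH cost. */
typedef struct EhOverhead {
    unsigned long text_bytes;   /* code size without the EH stubs */
    unsigned long stub_bytes;   /* extra ".text" from stub handlers */
    unsigned long pdata_bytes;  /* size of the ".pdata" table */
} EhOverhead;

EhOverhead eh_overhead(unsigned long functions, unsigned long avg_func_bytes,
                       unsigned long stub_words, unsigned long bytes_per_word,
                       unsigned long pdata_entry_bytes)
{
    EhOverhead r;
    r.text_bytes  = functions * avg_func_bytes;
    r.stub_bytes  = functions * stub_words * bytes_per_word;
    r.pdata_bytes = functions * pdata_entry_bytes;
    return r;
}

/* Growth of ".text" caused by the stubs, in percent. */
double eh_text_growth_percent(EhOverhead r)
{
    return 100.0 * (double)r.stub_bytes / (double)r.text_bytes;
}
```

    Calling eh_overhead(1200, 276, 4, 4, 8) gives 331200 bytes of code, 19200
    bytes of stubs and 9600 bytes of ".pdata", i.e. a bit under 6% growth.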


    It is possible some special case could be defined that changes the size
    layout, say, top bits as tag:
      Tag=00: Same as present
      Tag=01
        ( 7: 0): Size of stack frame in QWORDs;
        (21: 8): Length of function in instruction words.
        (29:22): Length of Epilog in instruction words
        (31:30): Tag (01)
      Tag=10: Reserved
      Tag=11
        ( 7: 0): Length of prolog in instruction words;
        (29: 8): Length of function in instruction words.
        (31:30): Tag (11)

    with the 01 case able to trim off a few instructions.




    Note that a lot of this stuff uses RVAs, where locations within the
    larger VAS are expressed as 32-bit offsets relative to the base address
    of the PE/COFF image. Some metadata structures effectively hold
    Self-RVAs as a trick to allow finding the image base address for other
    RVAs (rather than using full 64-bit pointers; also RVAs don't need
    base-relocs).
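
    A minimal C sketch of the Self-RVA trick, assuming a metadata structure
    whose first field holds its own RVA. The structure and function names are
    hypothetical, not taken from any actual PE/COFF header.

```c
#include <stdint.h>

/* Hypothetical metadata record stored somewhere inside the loaded image. */
typedef struct MetaHeader {
    uint32_t self_rva;    /* RVA of this very field */
    uint32_t target_rva;  /* RVA of something else in the same image */
} MetaHeader;

/* Image base = address of the field minus the RVA the field records. */
const unsigned char *image_base_from_self_rva(const MetaHeader *h)
{
    return (const unsigned char *)&h->self_rva - h->self_rva;
}

/* Turn any other RVA into a pointer once the base is known. */
const void *rva_to_ptr(const unsigned char *image_base, uint32_t rva)
{
    return image_base + rva;
}
```

    No relocation is needed for either field, since both are plain offsets
    that stay valid wherever the image ends up being loaded.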


    Well, contrast ELF64, which uses 64-bit values for everything and so
    needlessly bloats the metadata.

    Can't really go much smaller than RVAs though.
    Well, except base relocs, which typically use a 16-bit format.
    (11: 0): Base offset of relocated field within current 4K page
    (15:12): Type of Base-Reloc to apply
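
    Decoding one of these 16-bit entries is just a mask and a shift; a short
    C sketch (helper names are made up here):

```c
#include <stdint.h>

/* Bits 11:0: offset of the relocated field within the current 4K page. */
unsigned reloc_page_offset(uint16_t entry) { return entry & 0x0FFFu; }

/* Bits 15:12: type of base-reloc to apply. */
unsigned reloc_type(uint16_t entry) { return (entry >> 12) & 0xFu; }

/* Build an entry from a type and an in-page offset. */
uint16_t reloc_pack(unsigned type, unsigned offset)
{
    return (uint16_t)(((type & 0xFu) << 12) | (offset & 0x0FFFu));
}
```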

    Traditional PE/COFF base relocs had 8B per page:
    Page RVA, Count

    I was able to get rid of this cost though, and instead switched to
    mostly using an all-16-bit format; Where Reloc Type 0 is used to encode
    a page-advance and similar. This could also save a bit of space for
    ".reloc" by blobbing all of the relocs together per-section rather than
    per-page.


    And, then I lose incentive as I don't really use C++, and (unlike C
    land) the C++ people tend to chase after the newest features, rather
    than stick to an older and more conservative subset.

    There's no language where the users are so detail-focused and open
    to new features. But these new features raise productivity a lot,
    and it was far beyond C even with C++98. With C you have to flip every
    bit yourself over and over, and C++ replaces that with standard
    components.
    This has been emphasized through a lot of C++ channels on YouTube;
    I personally prefer the CppCon vids or the vids of Jason Turner.
    And there are a lot of good books like those of Rainer Grimm and
    Nicolai Josuttis.

    The issue is not whether they save effort, but rather how they can
    affect build times and binary sizes.

    Like, if one doesn't care that the compiler takes a long time to run and
    the EXE is needlessly large, maybe OK, not great if one does care...

    Having to spend minutes or more waiting for the compiler would seriously
    hurt momentum for many tasks.

    Well, and I have not seen much evidence that moderate-sized C++
    codebases (using modern features) can have sub-minute build times.



    In some cases, one might be looking at things to try to trim to keep
    sizes modest.

    Say, for example, if the Boot ROM requires keeping everything under 32K.
    Or they ideally don't want the OS kernel going over 500K.


    Like, code footprint doesn't always matter, but is not always free.



  • From BGB@3:633/10 to All on Fri May 29 05:20:20 2026
    On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
    On 2026-05-28 11:57, BGB wrote:
    On 5/28/2026 2:18 AM, Janis Papanagnou wrote:
    On 2026-05-28 01:49, BGB wrote:
    [...]
    I'm a big fan of abstractions. - So many things beyond "C" are fine!

    I am not saying that abstractions are bad, but I haven't usually found
    them to be worth the costs IME.

    Wow! - That's completely different from my experience and practice.

    It's what makes usage simple, fast, reliable. Not wasting time for
    details, or fixing technical bugs that should be prevented by the
    language.


    Possibly.

    But, it is also possible I approach programming in a different way.



    [...]

    Similar reason to why one doesn't build complex patterns or do
    template-like stuff via function macros in the C preprocessor:
    One can do this... But again, bloated binaries and terrible build
    times.

    Code patterns that are bulky in "C" can be formulated tersely with
    C++/STL (while still preserving an efficient implementation, even with
    complexities guaranteed); and the framework is flexible, orthogonally
    designed. Easy to reuse high-level concepts as opposed to re-implement
    the same code for different types. Or weaken the code by extensive use
    of casts. All sorts of C's problems with memory can be addressed. (The
    list can be continued; but I wonder why such things aren't recognized.)

    Both have a similar issue when used in a naive way though:
      Non-careful use of either results in code bloat.

    Okay, "when used in a naive way". - Let's leave it at that,
    then.


    But, not really an "easy" way to avoid bloat, other than to write code
    specifically for what cases are relevant; while also avoiding needless
    duplication and copy paste (where, overuse of copy/paste can also lead
    to bloat; along with turning the code into an ugly mess).

    Hmm.. - as said, during the very early days there were issues; I
    recall on one platform duplication of template code in more than
    one source unit. And/or some environmental hacks (of the compiler)
    to deposit template code for linking. In the later days I've not
    seen such immature things anymore.


    Possibly, a lot could depend on how one is counting things as well.


    In a lot of cases when using GCC, I end up using:
    -ffunction-sections -fdata-sections -Wl,-gc-sections

    Because otherwise it likes wasting code space by retaining unreachable
    functions.

    Using "static inline" functions also carries a risk because they can end
    up duplicated across multiple translation units, or in multiple places
    within the same translation unit, so is best used sparingly.



    But, OTOH, factoring things into too small of pieces can negatively
    affect performance (and, for non-leaf functions, prolog/epilog costs
    for too many tiny functions can also be a source of code bloat).

    As can be noted, trying to mimic templates via creative use of C
    preprocessor macros can also easily result in excessive bloat...


    So, one is back to the core issue:
    The part that is actually usable, mostly still amounts to syntactic
    sugar over things you can already do in C.

    Huh? It may depend on the developer/programmer. But it's certainly a
    lot more than "syntactic sugar".

    Well, for example:
      Operator overloading:
        Basically glorified function calls made to resemble operators;
      Classes:
        Can be done with structs, and implementing vtables manually.

    Implementing class hierarchies via structs can be done, but gets ugly
    (GTK's GObject system sorta went this way).

    We obviously disagree completely in what's "syntactic sugar".

    (With that reasoning all ("C" or other languages') features are
    "syntactic sugar" because you can do that also with assembly?)


    To be excluded from being syntactic sugar, it needs to be something that
    is not generally possible to express within the base language.

    So, for example:
    Things like operator overloading or classes are syntactic sugar IMO, as
    what they do can be expressed in C, even if a lot less pretty (or far
    from an idiomatic style).
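
    To make the comparison concrete, a short C sketch of both cases: an
    "overloaded operator" written as an ordinary function, and a class with
    one virtual method done as a struct plus a hand-written vtable. All names
    are invented for the example.

```c
/* "operator+" on a value type, spelled as a plain function. */
typedef struct Vector2D { double x, y; } Vector2D;

Vector2D vec2_add(Vector2D a, Vector2D b)
{
    return (Vector2D){ a.x + b.x, a.y + b.y };
}

/* A base "class" with one virtual method, via a manual vtable. */
typedef struct Shape Shape;
typedef struct ShapeVtable {
    double (*area)(const Shape *self);
} ShapeVtable;
struct Shape { const ShapeVtable *vt; };

/* A derived "class": the base goes first, so a Rect* can be used as a Shape*. */
typedef struct Rect { Shape base; double w, h; } Rect;

static double rect_area(const Shape *self)
{
    const Rect *r = (const Rect *)self;
    return r->w * r->h;
}

static const ShapeVtable rect_vtable = { rect_area };

Rect rect_make(double w, double h)
{
    Rect r = { { &rect_vtable }, w, h };
    return r;
}

/* Virtual call: dispatch through the vtable. */
double shape_area(const Shape *s) { return s->vt->area(s); }
```

    It works, but every cast and vtable initializer is something the
    programmer has to get right by hand, which is where the "a lot less
    pretty" part comes in.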


    I would not consider exceptions or RTTI as syntactic sugar, because
    these involve things that do not map to native C.

    Using longjmp, pointer-tagging, etc, could be considered as analogous,
    but not functionally equivalent, to what C++ is doing in these cases.
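
    A minimal sketch of the longjmp analogue, with made-up function names. It
    shows the "throw"/"catch" control flow only; unlike C++ exceptions,
    nothing is unwound or cleaned up on the way, and there is no matching on
    exception type.

```c
#include <setjmp.h>

/* Single global "handler" slot; real code would keep a stack of these. */
static jmp_buf handler;

static int checked_div(int a, int b)
{
    if (b == 0)
        longjmp(handler, 1);    /* "throw": jump back to the setjmp site */
    return a / b;
}

/* Returns 0 and stores the quotient on success, -1 if the callee "threw". */
int try_div(int a, int b, int *result)
{
    if (setjmp(handler) == 0) { /* "try" */
        *result = checked_div(a, b);
        return 0;
    }
    return -1;                  /* "catch" */
}
```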


    As for assembler:
    Main reasons not to use assembler for everything:
    Needlessly verbose;
    Non-portable.

    However, often one can still end up writing C code that looks like
    assembler sometimes, as this is often an effective way to optimize things.

    Say, for example:
      v0=cs[0];
      v2=cs[2];
      v1=cs[1];
      v3=cs[3];
      ct[0]=v0;
      ct[2]=v2;
      ct[1]=v1;
      ct[3]=v3;
    Vs:
      ct[0]=cs[0];
      ct[1]=cs[1];
      ct[2]=cs[2];
      ct[3]=cs[3];

    Because the extra variables can help sidestep latency from the load
    instructions, and staggering stores can avoid penalties of two
    adjacent stores to the same cache-line in some cache architectures.
    Where, in the latter case, the compiler may fail to as effectively avoid
    the load-latency or realize the need to stagger the stores for best
    performance, ...



    There are "niceties", granted, but relatively little "actually new".

    Not all concepts are "new", of course; we saw them in other languages
    years or (in some cases) decades ago. But C++' and STL features are a
    lot more than just niceties; it's beyond me how one may come to such
    a valuation. (And now let's compare that formulated demand or wish of
    new things with "C"?)

    Well, it is a thing that can be done, but is a double-edged sword.
    Saves code one might have to write out manually.
    But, is very easy to result in things that negatively affect build times.

    A simple, less-abstracted language can certainly be easier (thus
    faster) translated to machine code.

    I don't know about your working contexts. In our contexts slightly
    larger build times were no issue. For one, we built using makefiles,
    and only full builds (to create QA test images, or public releases)
    required much time; they typically ran over night; our systems were
    typically very large!

    Build times were also influenced by other more significant factors.
    Mundane sounding things like ordering of functions in libraries and
    some such. (Though nothing that wouldn't have been possible to be
    addressed by the build-management group.)


    I mostly do stuff myself. But, long build times can kill momentum.

    In some cases, it is unavoidable, for example, Vivado takes 40 minutes
    to synthesize an FPGA design, and Verilator may take multiple hours of
    running to know whether or not something worked.

    This sucks, but little I can do about this.
    I do most of my software level testing in an emulator though, which does
    at least have a fast turn-around...


    Can't run my C compiler on my custom target though, as sadly, it is too
    big and slow (would use too much RAM and take too long to run).

    As noted, my compiler is around 370 kLOC.
    Overall project size is a few MLOC.

    Granted, my compiler has grown a few things related to resource/asset packaging as well, so it also among other things works as an image
    format converter, audio format converter, data packaging tool, and has
    some wacky appendages like a SCAD interpreter and CSG geometry stuff
    glued on (even if, yeah, interpreting SCAD and processing CSG models has little to do with compiling stuff).


    Sometimes one would prefer the code to be smaller, but sometimes it is not.


    Some of my older code has vestiges of a lot of rampant copy/paste
    though. Not ideal. And in my case my compiler, at its core, has its
    basis in code I wrote in high-school (back when my coding skills left something to be desired). Compiler is many layers of rampant hackery though.

    It is more a case of "too bad it isn't less of a mess", but then efforts
    to replace it had mostly fizzled; and the core design of the compiler
    does seem to work effectively.



    Usual strategy is to try to limit how much code is written, and also
    to avoid doing things in ways that result in too much code, or too
    much cruft.

    Best to avoid both copy paste when reasonable, and sticking anything
    non-trivial in macros.

    We avoided macros if possible.


    They are de-facto for constants and similar, but for longer stuff is
    better avoided.

    #define FOO(arg) \
    if(1) { \
    ... big blob of crap ... \
    }

    Being one of those, "yeah, don't do this" things.


    Like, if a person finds themselves tempted to do this, maybe step back
    and try to figure out how one ended up in this situation and if there is
    a way out. If people start doing this too much, the code turns to crap.


    I once implemented a VM backend where I did this a lot, it did not end
    well. Managed to get a VM interpreter that, while fast, took a painfully
    long time to compile due to all the excessive code generated via macros.

    Now stuff mostly just sucks due to other reasons.



    And, one of the rare few "actually new" features C++ offers:
    exceptions.

    We used them already in the 1990's.

    Here, "new" in the sense that it can't be mapped directly back to
    stuff that can already be expressed natively in C.

    Okay.


    Also comes with its own drawbacks (code bloat, try/catch+throw is
    usually slow, use with care else program explodes, ...). [...]

    I cannot confirm your statements, especially in that generality.

    I recall we had bloat with templates on a specific platform in the
    very early pre-standard era, when they were first supported. But we
    didn't have any [noteworthy] speed degradation with exceptions (or
    templates).

    The relative impact of try/catch is more modest.

    Aha; I thought that this would have been the source of criticism.


    Typically, it results in every function having an unwind-handling stub
    for, in-case an exception is thrown, it can call any destructors or
    similar.

    I've seen and heard of many ways in which exceptions have been used,
    ranging from a single "catch all" in the main() function, to each
    function instrumented. I will not judge about these extreme cases.
    All I say is that you, as a software designer, have the options to
    sensibly structure and instrument your code with exceptions.

    There's also the characteristic that you may define exception types
    (or use just existing ones); build or add to a hierarchy to handle
    them flexibly, provide context data with the exception objects, etc.
    Handling all that manually and explicitly, without the support of an exception concept I'd certainly not prefer.


    Well, even when exceptions are not used by a program, one may still see
    a size delta of a few percent by enabling or disabling them...


    But, ironically, I am not totally opposed to exceptions, as they can
    serve a purpose...

    Well, even if sometimes they just amount to a missed uncaught exception
    nuking the program...


    But, things can be considered in relative terms:
    Like, C++ may carry various penalties vs C.

    But, like, it is still going to run circles around something like Python
    or Node.js ...



  • From Bart@3:633/10 to All on Fri May 29 12:19:04 2026
    On 29/05/2026 09:02, Janis Papanagnou wrote:
    On 2026-05-29 01:54, Lawrence D'Oliveiro wrote:
    On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:

    Infix notation and precedence rules are pros/cons.

    Python took over most of the C operator precedence rules, with one
    interesting wrinkle: they moved up the precedence of the bitwise
    operators so that what has to be written like this in C:

         («val» & «mask») == «expected»

    can have the parentheses omitted in Python:

         «val» & «mask» == «expected»
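
A small C sketch of what happens when the parentheses are left out (function names are hypothetical). In C, == binds tighter than &, so the unparenthesised form compares mask with expected first and then ANDs the 0-or-1 result with val.

```c
#if defined(__GNUC__)
/* The unparenthesised form is deliberate; silence the usual warning. */
#pragma GCC diagnostic ignored "-Wparentheses"
#endif

/* What the programmer almost always means. */
int masked_equals(unsigned val, unsigned mask, unsigned expected)
{
    return (val & mask) == expected;
}

/* Without parentheses C groups this as val & (mask == expected). */
unsigned masked_equals_unparenthesised(unsigned val, unsigned mask,
                                       unsigned expected)
{
    return val & mask == expected;
}
```

With val = 6, mask = 4, expected = 4 the first function yields 1, while the second yields 6 & 1, which is 0.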

    Unsurprisingly; since exactly *that* was the obvious (and single)
    issue with C's precedence definitions.

    The only one?

    How about:

    * What is the order here: a ^ b | c

    * Why do bitwise & | ^ need their own level anyway

    * What is most intuitive precedence here: a << 3 + b, and what
    is it in C

    * Why do << >> have their own level anyway

    * Why do == != have a different precedence from < <= >= >

    Further, here: 'a * b + c' the multiplication is done first, but here:

    a *= b += c

    It is done second.

    The issue I have is whether augmented assignments should return a value
    at all. It's just generally too confusing especially with mixed types.
    It's confusing enough with assignments returning a value:

    a = b = x;

    Here, assuming x has no side-effects, you might expect this to mean the
    same as:

    b = x;
    a = x;

    In fact it's more like: 'b = x; a = b;'. Example:

    double a;
    float b;

    a = b = 3.14159265358979323846;

    Here, 'a' will be assigned the less precise 32-bit version of the RHS.
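
This can be checked directly. In the sketch below (hypothetical function names), the value of the inner assignment b = ... has type float, so that already-rounded value is what gets converted to double and stored in a.

```c
/* a receives the value of (b = ...), which has type float. */
double assign_chained(void)
{
    double a;
    float b;

    a = b = 3.14159265358979323846;
    return a;
}

/* Two separate assignments: a keeps full double precision. */
double assign_separately(void)
{
    double a;
    float b;

    b = 3.14159265358979323846;
    a = 3.14159265358979323846;
    (void)b;    /* b is only here to mirror the chained version */
    return a;
}
```

The chained version returns the same value as (double)(float)3.14159265358979323846, which differs from the double constant itself.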



  • From David Brown@3:633/10 to All on Fri May 29 13:22:50 2026
    On 29/05/2026 12:20, BGB wrote:
    On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
    On 2026-05-28 11:57, BGB wrote:
    On 5/28/2026 2:18 AM, Janis Papanagnou wrote:
    On 2026-05-28 01:49, BGB wrote:
    [...]


    But, not really an "easy" way to avoid bloat, other than to write
    code specifically for what cases are relevant; while also avoiding
    needless duplication and copy paste (where, overuse of copy/paste can
    also lead to bloat; along with turning the code into an ugly mess).

    Hmm.. - as said, during the very early days there were issues; I
    recall on one platform duplication of template code in more than
    one source unit. And/or some environmental hacks (of the compiler)
    to deposit template code for linking. In the later days I've not
    seen such immature things anymore.


    Possibly, a lot could depend on how one is counting things as well.


    In a lot of cases when using GCC, I end up using:
      -ffunction-sections -fdata-sections -Wl,-gc-sections

    On many targets, "-fdata-sections" can lead to noticeably larger and
    slower code because it effectively eliminates section anchor
    optimisations. It does not negatively affect x86 AFAICS, because x86
    does not use section anchors.

    <https://godbolt.org/z/zeoq41Y7d>

    With -fsection-anchors (enabled with optimisation on targets that
    support it - generally RISCy load/store architectures), program-lifetime variables are kept together in a lump (as though they were in a struct)
    and often addressed by a pointer to that pretend struct. Thus if a
    function accesses two variables "a" and "b", instead of having to load
    the addresses of each of "a" and "b" into separate registers, it loads
    an "anchor" into one register and accesses the variables with reg+offset addressing.
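
As a rough hand-written imitation of that layout (not compiler output; all names here are hypothetical), compare two independent globals with the same data grouped behind a single base address:

```c
/* Two independent globals; with -fdata-sections each gets its own
   section, so their relative placement is up to the linker. */
int counter_a = 1;
int counter_b = 2;

int sum_separate(void)
{
    return counter_a + counter_b;
}

/* The same data lumped together, as if it were one struct. */
struct lump {
    int a;
    int b;
};

static struct lump vars = { 1, 2 };

int sum_anchored(void)
{
    const struct lump *base = &vars;   /* one address, the "anchor" */

    return base->a + base->b;          /* both reached as base+offset */
}
```

The second function only mimics by hand what section anchors arrange automatically; the generated code for either depends on target and options.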

    I've seen "-fdata-sections" used regularly in embedded systems - it is
    almost always a bad idea.

    ("-ffunction-sections" is often very helpful to reduce code image size,
    so keep that one.)




    Because otherwise it likes wasting code space by retaining unreachable functions.

    Using "static inline" functions also carries a risk because they can end
    up duplicated across multiple translation units, or in multiple places within the same translation unit, so is best used sparingly.


    Usually you would only use static inline functions for small functions
    in headers, where they are a better choice than function-like macros.
    In a C file, there is rarely much point in declaring a function "inline"
    - optimising compilers will inline or not as they see fit, without
    regard for "inline". "static" on its own is, of course, always a good
    idea for functions or data that is not "exported" by the current
    translation unit, and will often make generated code smaller.

    How much or how little duplication of code there will be within one translation unit will depend on compiler settings and the rest of the
    code, and not on whether or not you use "inline".
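
One reason a small static inline function is the better choice than a function-like macro: the macro may evaluate its arguments more than once. A minimal sketch, with hypothetical names:

```c
/* Function-like macro: each argument may be evaluated twice. */
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

/* Typed alternative suitable for a header: arguments evaluated once. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

static int calls;

/* Has a visible side effect so repeated evaluation can be counted. */
static int next_value(void)
{
    return ++calls;
}

int evaluations_with_macro(void)
{
    calls = 0;
    (void)MAX_MACRO(next_value(), 0);   /* next_value() runs twice */
    return calls;
}

int evaluations_with_inline(void)
{
    calls = 0;
    (void)max_int(next_value(), 0);     /* next_value() runs once */
    return calls;
}
```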




    As for assembler:
    Main reasons not to use assembler for everything:
      Needlessly verbose;
      Non-portable.

    However, often one can still end up writing C code that looks like
    assembler sometimes, as this is often an effective way to optimize things.

    Say, for example:
      v0=cs[0];
      v2=cs[2];
      v1=cs[1];
      v3=cs[3];
      ct[0]=v0;
      ct[2]=v2;
      ct[1]=v1;
      ct[3]=v3;
    Vs:
      ct[0]=cs[0];
      ct[1]=cs[1];
      ct[2]=cs[2];
      ct[3]=cs[3];

    Because the extra variables can help sidestep latency from the
    load instructions and staggering stores can avoid penalties of two
    adjacent stores to the same cache-line in some cache architectures.
    Where, in the latter case, the compiler may fail to as effectively avoid
    the load-latency or realize the need to stagger the stores for best performance, ...

    That might be the case for a very simplistic compiler. With an
    optimising compiler, these extra variables will quickly be eliminated.
    If the compiler has a good scheduling model of the device, it will do
    whatever instruction scheduling works best for that processor. If the
    model is not good enough, it will be suboptimal. I would not, however,
    expect any difference in the generated code for the two code snippets.

    Sometimes this kind of "manual optimisation" is helpful when you have to
    try to get efficient results from a weak compiler, however.




    Usual strategy is to try to limit how much code is written, and also
    to avoid doing things in ways that result in too much code, or too
    much cruft.

    Best to avoid both copy paste when reasonable, and sticking anything
    non-trivial in macros.

    We avoided macros if possible.


    They are de-facto for constants and similar, but for longer stuff is
    better avoided.

    Macros are rarely the best way to define constants. They are needed if
    you are using the constants for pre-processor stuff like conditional compilation. But generally you get clearer code, better typing, and potentially several other benefits from using alternative choices like
    "enum" (even for stand-alone integer constants), "static const"
    variables, and in C23, "constexpr" variables. There's no doubt that a
    lot of code /does/ use macros for constants, but I view it as a relic of
    the past rather than good coding practice.
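
A short sketch of the alternatives mentioned, with hypothetical names. It sticks to pre-C23 features so it builds with older compilers; in C23 the static const object could be declared constexpr instead.

```c
#include <stddef.h>

/* Macro: untyped text substitution, visible to the preprocessor. */
#define BUF_SIZE_MACRO 64

/* enum: a genuine int constant expression, usable for array sizes
   and case labels, and it obeys normal scoping rules. */
enum { BUF_SIZE = 64 };

/* static const: a typed object. In C before C23 it is not a constant
   expression, so it cannot portably size a file-scope array. */
static const double scale = 0.5;

size_t buffer_bytes(void)
{
    char buf[BUF_SIZE];

    return sizeof buf;
}

double scaled(double x)
{
    return x * scale;
}
```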



    But, things can be considered in relative terms:
    Like, C++ may carry various penalties vs C.


    I don't find C++ carries noticeable penalties compared to C, for my
    embedded work. But I do disable exceptions and RTTI - exceptions may
    have very little run-time overhead, but the unwind tables can be significant when code size is important in small systems.




  • From Janis Papanagnou@3:633/10 to All on Fri May 29 14:46:35 2026
    On 2026-05-29 13:19, Bart wrote:
    On 29/05/2026 09:02, Janis Papanagnou wrote:
    On 2026-05-29 01:54, Lawrence D'Oliveiro wrote:
    On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:

    Infix notation and precedence rules are pros/cons.

    Python took over most of the C operator precedence rules, with one
    interesting wrinkle: they moved up the precedence of the bitwise
    operators so that what has to be written like this in C:

         («val» & «mask») == «expected»

    can have the parentheses omitted in Python:

         «val» & «mask» == «expected»

    Unsurprisingly; since exactly *that* was the obvious (and single)
    issue with C's precedence definitions.

    The only one?

    Yes. - That group of operators was what I noticed immediately when
    I've learned "C" back then reading K&R. Much later I've seen some
    folks also mentioning that specific disorder. Still later I've got
    information about a paper of some of the "C" authors admitting that
    mistake. (I think I've also seen comments from some regulars here
    that also noted that.) That all together is certainly a solid base
    for a sensible valuation.

    How about:
    [...]

    Your well known very specific views have never been a landmark for reconsidering my personal judgement. (And I'm positive that won't
    ever change; never mind!)

    The "confusions" you listed - not worth quoting - are your personal
    problem. The precedence of assignments and related operations and
    their evaluation order are clear, reasonable, and they can be found
    that way in many existing languages. (Some of your listed "problems"
    have been answered here already in the past - I wonder whether it's
    worth replying to you if you don't learn from the answers. You seem
    to have fun wasting everyone's time.)

    You can continue to assume that all those people, language designers
    and programmers, are wrong, and I accept your astonishment that for
    those folks it's not "confusing" as it is for you.

    Janis


  • From Bonita Montero@3:633/10 to All on Fri May 29 14:58:33 2026
    Am 29.05.2026 um 11:15 schrieb BGB:

    Like, if one doesn't care that the compiler takes a long time to run
    and the EXE is needlessly large, maybe OK, not great if one does care...

    Binary size doesn't matter with Windows.

    Having to spend minutes or more waiting for the compiler would seriously hurt momentum for many tasks.

    Use C++20 modules and parallel builds.

    Say, for example, if the Boot ROM requires keeping everything under 32K.

    C++ was designed for large scale program development.
    With 32K-systems you can stick with C.



  • From Bart@3:633/10 to All on Fri May 29 14:22:19 2026
    On 29/05/2026 13:46, Janis Papanagnou wrote:
    On 2026-05-29 13:19, Bart wrote:
    On 29/05/2026 09:02, Janis Papanagnou wrote:
    On 2026-05-29 01:54, Lawrence D'Oliveiro wrote:
    On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:

    Infix notation and precedence rules are pros/cons.

    Python took over most of the C operator precedence rules, with one
    interesting wrinkle: they moved up the precedence of the bitwise
    operators so that what has to be written like this in C:

         («val» & «mask») == «expected»

    can have the parentheses omitted in Python:

         «val» & «mask» == «expected»

    Unsurprisingly; since exactly *that* was the obvious (and single)
    issue with C's precedence definitions.

    The only one?

    Yes. - That group of operators was what I noticed immediately when
    I've learned "C" back then reading K&R. Much later I've seen some
    folks also mentioning that specific disorder. Still later I've got information about a paper of some of the "C" authors admitting that
    mistake. (I think I've also seen comments from some regulars here
    that also noted that.) That all together is certainly a solid base
    for a sensible valuation.

    How about:
    [...]

    Your well known very specific views have never been a landmark for reconsidering my personal judgement. (And I'm positive that won't
    ever change; never mind!)

    The "confusions" you listed - not worth quoting - are your personal
    problem. The precedence of assignments and related operations and
    their evaluation order are clear, reasonable, and they can be found
    that way in many existing languages. (Some of your listed "problems"
    have been answered here already in the past - I wonder whether it's
    worth replying to you if you don't learn from the answers. You seem
    to have fun wasting everyone's time.)

    You can continue to assume that all those people, language designers
    and programmers, are wrong,

    I noticed that you didn't answer my questions.

    Precedence is 100% for the benefit of human programmers, therefore if
    you have lots of obscure or unintuitive levels, they have to provide
    some extra value.

    Having too many levels, having them unintuitive, and having them differ
    across languages, suggests that the value either doesn't exist or is not worthwhile.

    So, yes, language designers can be wrong. Having lots of obscure
    precedence levels SOUNDS a good idea and it is VERY EASY to implement,
    hence it is commonly found. And newer languages are often obliged to perpetuate those decisions.

    You can agree or disagree with the above, but if the latter, I'd quite
    like to hear your arguments, and some examples that show the advantages
    of having ^ one level above &, or one level below, or whatever the hell
    it is.

    and I accept your astonishment that for
    those folks it's not "confusing" as it is for you.

    It was presumably confusing for Google when they decided to have a much reduced set for their Go language.

    Something you might do when you have time (as I'm busy), is to analyse
    the expressions in some C codebases, and isolate those where removal of parentheses that group terms, would result in exactly the same shape of expressions, and are therefore redundant.

    (You would have to exclude ones using macros.)

    It would be interesting to see if the use of those unnecessary
    parentheses correlated more closely with the less well known precedence levels.

    If so, then having those custom levels was unnecessary.

  • From Janis Papanagnou@3:633/10 to All on Fri May 29 15:22:36 2026
    On 2026-05-29 12:20, BGB wrote:
    On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
    On 2026-05-28 11:57, BGB wrote:
    On 5/28/2026 2:18 AM, Janis Papanagnou wrote:
    On 2026-05-28 01:49, BGB wrote:
    [...]
    I'm a big fan of abstractions. - So many things beyond "C" are fine!

    I am not saying that abstractions are bad, but I haven't usually
    found them to be worth the costs IME.

    Wow! - That's completely different from my experience and practice.

    It's what makes usage simple, fast, reliable. Not wasting time for
    details, or fixing technical bugs that should be prevented by the
    language.

    Possibly.

    But, it is also possible I approach programming in a different way.

    If you're doing your own personal stuff that doesn't surprise me.

    [...]

    Possibly, a lot could depend on how one is counting things as well.

    In a lot of cases when using GCC, I end up using:
      -ffunction-sections -fdata-sections -Wl,-gc-sections

    Well, you mentioned gcc now repeatedly. Myself I had been using gcc
    mainly in private context only. Professionally we (mostly) used the
    tools that came with the vendors on commercial (often Unix) systems.

    I really cannot tell about Windows environments, Cygwin, or using
    (maybe older?) gcc versions for large scale software development.
    For newer systems and versions I'd certainly not expect problems
    as you seem to have encountered.

    [...]
    [...]

    To be excluded from being syntactic sugar, it needs to be something that
    is not generally possible to express within the base language.

    So, for example:
    Things like operator overloading or classes are syntactic sugar IMO, as
    what they do can be expressed in C, even if a lot less pretty (or far
    from an idiomatic style).

    I would not consider exceptions or RTTI as syntactic sugar, because
    these involve things that do not map to native C.

    Using longjmp, pointer-tagging, etc, could be considered as analogous,
    but not functionally equivalent, to what C++ is doing in these cases.

    Thanks for explaining your view on "syntactic sugar".

    [...]

    We avoided macros if possible.

    They are de-facto for constants and similar, but for longer stuff is
    better avoided.

    We've been talking about C++; for constants I regularly use constants.

    In "C", frankly, it appears to me less important; I used to use Cpp
    names for constant items or expressions, mainly because that's "how
    it is done in C". - OTOH, if I can give an item a type, why should
    I use a primitive mechanism. Because it's faster? No memory wasted?

    In our C++ projects we used Cpp features certainly not for constant
    literals. We used them for tags (prevent double inclusion of h-files), conditionals, and some such. Rarely for more tricky things involving
    tasks to circumvent limitations of the "C" language in cases where we
    wrote code that should work with both of these languages.

    Janis

    [...]


  • From Janis Papanagnou@3:633/10 to All on Fri May 29 17:15:04 2026
    On 2026-05-29 15:22, Bart wrote:
    On 29/05/2026 13:46, Janis Papanagnou wrote:
    On 2026-05-29 13:19, Bart wrote:
    On 29/05/2026 09:02, Janis Papanagnou wrote:
    On 2026-05-29 01:54, Lawrence D'Oliveiro wrote:
    On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:
    [...]
    [...]

    The "confusions" you listed - not worth quoting - are your personal
    problem. The precedence of assignments and related operations and
    their evaluation order are clear, reasonable, and they can be found
    that way in many existing languages. (Some of your listed "problems"
    have been answered here already in the past - I wonder whether it's
    worth replying to you if you don't learn from the answers. You seem
    to have fun wasting everyone's time.)

    [...]

    I noticed that you didn't answer my questions.

    Yes, because, as experience shows, it's obviously a waste of time!

    Okay, I'll bite. - I'll go waste my time again and comment on your
    other post where you said you are confused about these cases...

    Janis

    [...]


  • From Scott Lurndal@3:633/10 to All on Fri May 29 15:59:53 2026
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 2026-05-29 15:22, Bart wrote:
    On 29/05/2026 13:46, Janis Papanagnou wrote:
    On 2026-05-29 13:19, Bart wrote:
    On 29/05/2026 09:02, Janis Papanagnou wrote:
    On 2026-05-29 01:54, Lawrence D'Oliveiro wrote:
    On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:
    [...]
    [...]

    The "confusions" you listed - not worth quoting - are your personal
    problem. The precedence of assignments and related operations and
    their evaluation order are clear, reasonable, and they can be found
    that way in many existing languages. (Some of your listed "problems"
    have been answered here already in the past - I wonder whether it's
    worth replying to you if you don't learn from the answers. You seem
    to have fun wasting everyone's time.)

    [...]

    I noticed that you didn't answer my questions.

    Yes, because, as experience shows, it's obviously a waste of time!

    Okay, I'll bite. - I'll go waste my time again and comment on your
    other post where you said you are confused about these cases...

    Why bother? C isn't ever going to change operator precedence
    just to make Bart happy.

  • From Bart@3:633/10 to All on Fri May 29 17:05:56 2026
    On 29/05/2026 16:15, Janis Papanagnou wrote:
    On 2026-05-29 15:22, Bart wrote:
    On 29/05/2026 13:46, Janis Papanagnou wrote:
    On 2026-05-29 13:19, Bart wrote:
    On 29/05/2026 09:02, Janis Papanagnou wrote:
    On 2026-05-29 01:54, Lawrence D'Oliveiro wrote:
    On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:
    [...]
    [...]

    The "confusions" you listed - not worth quoting - are your personal
    problem. The precedence of assignments and related operations and
    their evaluation order are clear, reasonable, and they can be found
    that way in many existing languages. (Some of your listed "problems"
    have been answered here already in the past - I wonder whether it's
    worth replying to you if you don't learn from the answers. You seem
    to have fun wasting everyone's time.)

    [...]

    I noticed that you didn't answer my questions.

    Yes, because, as experience shows, it's obviously a waste of time!

    It can go both ways.

    You always exasperatingly insist that there is only one thing wrong with
    C's precedence rules, and I think you said once that they are
    otherwise perfect. And yet there are endless examples on forums of people
    saying they are confusing.



  • From Janis Papanagnou@3:633/10 to All on Fri May 29 18:10:35 2026
    On 2026-05-29 13:19, Bart wrote:
    How about:

    Generally; if in doubt you can always inspect the precedence
    table that you find in K&R and elsewhere. Below some hints to
    hopefully reduce your confusion...


    * What is the order here: a ^ b | c

    Personally I don't think that there's a prevalent definition
    how these should be ordered. - But note the history of these
    operators, which was BTW why they are in "C" at the positions
    that they are; their (potential) use as boolean sets.

    To memorize the bit-value operations you can associate them
    with the glyphs for the boolean operations; per convention
    AND (&&) comes before OR (||), and insert the bitwise xor in
    between. - As said; there's a rationale for the choice, and
    a historic explanation (that makes sense), but you can in
    case of doubt or to make things clearer also use parentheses.
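
For reference, a small sketch of how C actually groups these (hypothetical function names): & binds tighter than ^, which binds tighter than |, mirroring the "AND before OR, with xor in between" mnemonic.

```c
#if defined(__GNUC__)
/* Parentheses are omitted on purpose; silence the usual warning. */
#pragma GCC diagnostic ignored "-Wparentheses"
#endif

/* Grouped as (a ^ b) | c. */
unsigned xor_or(unsigned a, unsigned b, unsigned c)
{
    return a ^ b | c;
}

/* Grouped as a | (b ^ c). */
unsigned or_xor(unsigned a, unsigned b, unsigned c)
{
    return a | b ^ c;
}

/* Grouped as a ^ (b & c). */
unsigned xor_and(unsigned a, unsigned b, unsigned c)
{
    return a ^ b & c;
}
```

With a = 1, b = 3, c = 1, xor_or gives (1 ^ 3) | 1 = 3, whereas the other grouping, 1 ^ (3 | 1), would give 2.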


    * Why do bitwise & | ^ need their own level anyway

    For historic reasons of potential use. (As said above.)


    * What is most intuitive precedence here: a << 3 + b, and what
      is it in C

    What it is in "C" you should look up in the precedence table.

    What it is intuitively depends on what the programmer wanted to
    express. As it is written a shift by a calculated expression
    seems to have been the intention, and that's also how "C" handles
    that.

    Why would one write it that way if the intent is to make space for
    three bits in 'a' and put the bit-pattern stored in 'b' there? If I
    wanted to express that I'd use a binary '|', and because of the well
    known precedence I'd need no parentheses around it, as in a << 3 | b .

    Of course there's also the potential semantics that you have a
    'b' that exceeds three bits and you want to have that added to the
    shifted value of 'a'. But then I'd not use bit-value operations
    to express that but the clear and more appropriate a * 8 + b .

    Both semantics bit-op and int-arithmetic don't need parentheses
    in "C" because of its clear and sensible precedence.
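
The three spellings discussed above, as a sketch with hypothetical function names. In C, + binds tighter than <<, and << binds tighter than |.

```c
#if defined(__GNUC__)
/* Parentheses are omitted on purpose; silence the usual warning. */
#pragma GCC diagnostic ignored "-Wparentheses"
#endif

/* Grouped as a << (3 + b): a shift by a computed amount. */
unsigned shift_by_sum(unsigned a, unsigned b)
{
    return a << 3 + b;
}

/* Grouped as (a << 3) | b: make room for three bits, then insert b. */
unsigned shift_then_or(unsigned a, unsigned b)
{
    return a << 3 | b;
}

/* Plain arithmetic: also correct when b does not fit in three bits. */
unsigned scale_then_add(unsigned a, unsigned b)
{
    return a * 8 + b;
}
```

With a = 1 and b = 2 the first returns 32 and the other two return 10. With b = 9, which needs four bits, shift_then_or returns 9 while scale_then_add returns 17.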

    You are mixing bit-ops and arithmetic unnecessarily, but you are
    also too lazy to write parentheses to clear up your dirty code?


    * Why do << >> have their own level anyway

    As you have been told so many times; to not require parentheses
    unnecessarily, to be able to omit them and not pollute the code
    with parentheses in a language that is already full of all sorts
    of { } [ ] ( ) and other punctuation characters.


    * Why do == != have a different precedence from < <= >= >

    I've explained that already twice in the past.


    Further, here: 'a * b + c' the multiplication is done first, but here:

       a *= b += c

    It is done second.

    You understand that '=', '*=', and '*' are three different things,
    don't you?

    I hope you understand that '=' should have low precedence. And that
    it makes sense to evaluate that from right to left. Do you follow?

    "C" obviously decided to have them all, =, +=, *=, etc. in a single
    group, and thus evaluated from right to left. - Easy rule, easy to
    memorize. - And that is actually what you are demanding from many
    other operators, to put them in a single group. - But here you are
    complaining about it!
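
A minimal sketch of the right-to-left grouping (hypothetical function names): a *= b += c is parsed as a *= (b += c), so the addition happens first and the multiplication second.

```c
/* Chained form: b becomes b + c, then a becomes a * (new b). */
int chained_compound(int a, int b, int c)
{
    a *= b += c;
    return a;
}

/* The same steps written out one statement at a time. */
int spelled_out(int a, int b, int c)
{
    b = b + c;
    a = a * b;
    return a;
}

/* For contrast: ordinary precedence, multiplication first. */
int multiply_then_add(int a, int b, int c)
{
    return a * b + c;
}
```

With a = 2, b = 3, c = 4 the first two functions return 14 and the third returns 10.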

    Of course the rules for those combined (sort of two-address) operators
    could have been defined differently, in an own group with other rules.
    (Algol 68 had done that, actually; the semantics are like "apply these
    operations from left to right", indicating an incremental modification
    of the underlying value.)

    If "C" would have separated these operators from the assignment and
    created another own group with different evaluation rules you'd also
    have complained.


    The issue I have is whether augmented assignments should return a value
    at all. It's just generally too confusing especially with mixed types.

    If you're confused about something either try to understand the rule,
    or just learn it without understanding it, or look it up, or write
    your programs in a way that you understand; use parentheses, separate
    complex expressions to simpler ones, etc.

    It's confusing enough with assignments returning a value:

       a = b = x;

    If it hurts/confuses you then don't use that feature.


    Here, assuming x has no side-effects, you might expect this to mean the
    same as:

       b = x;
       a = x;

    I cannot tell what your brain makes you expect one semantic or another.
    The semantics are clearly defined. Your interpretation here is wrong.

    All I can suggest is what I've written above already; and to spare you
    a search, it was:

    If you're confused about something either try to understand the rule,
    or just learn it without understanding it, or look it up, or write
    your programs in a way that you understand; use parentheses, separate
    complex expressions to simpler ones, etc.


    In fact it's more like: 'b = x; a = b;'. Example:

        double a;
        float b;

        a = b = 3.14159265358979323846;

    Here, 'a' will be assigned the less precise 32-bit version of the RHS.
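
    The narrowing can be checked directly. This is only a sketch: the
    function name is hypothetical, and it assumes 'float' has less
    precision than 'double', as on common IEEE 754 platforms:

```c
/* Returns 1 if 'a' ends up with the value already rounded to float. */
int chained_assign_narrows(void)
{
    double a;
    float b;

    a = b = 3.14159265358979323846;

    /* The value of the inner assignment is the value of b after the
       store, so a receives (double)(float)constant rather than the
       full double constant. */
    return a == (double)(float)3.14159265358979323846
        && a != 3.14159265358979323846;
}
```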

    And why are you composing such stupid examples (if not only for sake
    of an argument)? - An experienced programmer wouldn't write such an
    expression with mixed types if he intends clear and non-dubious code.

    Janis


  • From Bart@3:633/10 to All on Fri May 29 17:12:07 2026
    On 29/05/2026 16:59, Scott Lurndal wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 2026-05-29 15:22, Bart wrote:
    On 29/05/2026 13:46, Janis Papanagnou wrote:
    On 2026-05-29 13:19, Bart wrote:
    On 29/05/2026 09:02, Janis Papanagnou wrote:
    On 2026-05-29 01:54, Lawrence D?Oliveiro wrote:
    On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:
    [...]
    [...]

    The "confusions" you listed - not worth quoting - are your personal
    problem. The precedence of assignments and related operations and
    their evaluation order are clear, reasonable, and they can be found
    that way in many existing languages. (Some of your listed "problems"
    have been answered here already in the past - I wonder whether it's
    worth replying to you if you don't learn from the answers. You seem
    to have fun wasting everyone's time.)

    [...]

    I noticed that you didn't answer my questions.

    Yes, because, as experience shows, it's obviously a waste of time!

    Okay, I'll bite. - I'll go waste my time again and comment on your
    other post where you said you are confused about these cases...

    Why bother? C isn't ever going to change operator precedence
    just to make Bart happy.

    It would make me happy just for someone to admit there are problems. JP
    always says they are perfect but for one little thing.

    What a C compiler /can/ do is to warn when an expression relies on
    knowing the precedence of the more obscure operators.
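
    GCC and Clang already have a warning of this kind, -Wparentheses
    (included in -Wall), which covers some but not all such cases. A
    small sketch with made-up function names; whether a given compiler
    version flags the first function is something to try rather than
    assume:

```c
/* Hypothetical helpers; compiling with -Wall may flag the first one. */

/* '+' binds tighter than '<<', so this is x << (n + 1),
   not (x << n) + 1. */
unsigned shift_unparenthesized(unsigned x, unsigned n)
{
    return x << n + 1;
}

/* The other reading has to be spelled out. */
unsigned shift_explicit(unsigned x, unsigned n)
{
    return (x << n) + 1;
}
```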



  • From Janis Papanagnou@3:633/10 to All on Fri May 29 18:34:24 2026
    On 2026-05-29 18:05, Bart wrote:
    On 29/05/2026 16:15, Janis Papanagnou wrote:
    On 2026-05-29 15:22, Bart wrote:
    On 29/05/2026 13:46, Janis Papanagnou wrote:
    On 2026-05-29 13:19, Bart wrote:
    On 29/05/2026 09:02, Janis Papanagnou wrote:
    On 2026-05-29 01:54, Lawrence D?Oliveiro wrote:
    On Thu, 28 May 2026 14:07:45 -0500, BGB wrote:
    [...]
    [...]

    The "confusions" you listed - not worth quoting - are your personal
    problem. The precedence of assignments and related operations and
    their evaluation order are clear, reasonable, and they can be found
    that way in many existing languages. (Some of your listed "problems"
    have been answered here already in the past - I wonder whether it's
    worth replying to you if you don't learn from the answers. You seem
    to have fun wasting everyone's time.)

    [...]

    I noticed that you didn't answer my questions.

    Yes, because, as experience shows, it's obviously a waste of time!

    (My reply on your previous examples just posted.)


    It can go both ways.

    You always exasperatingly insist that there is only one thing wrong with
    C's precedence rules, and I think you said once that they are
    otherwise perfect.

    It is true and acknowledged even by the designers of the C-language
    that there's a misplaced group in the table; the three bit-ops.
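
    The usual illustration of the bit-op placement is a flag test. The
    flag name and helpers below are hypothetical:

```c
#define FLAG_READY 0x04u   /* hypothetical flag bit */

/* '==' binds tighter than '&', so this computes
   flags & (FLAG_READY == 0), i.e. flags & 0, which is always 0. */
int ready_clear_buggy(unsigned flags)
{
    return flags & FLAG_READY == 0;
}

/* What was almost certainly meant. */
int ready_clear(unsigned flags)
{
    return (flags & FLAG_READY) == 0;
}
```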

    I don't think that "perfect" is a fitting word, but it's otherwise
    (modulo bit-ops) surely an *appropriate* sensible choice they made.

    The less important options might be defined in one way or another;
    and various language designers have taken different ways how many
    groups they define, or how boolean and numeric operators relate in
    precedence, for example.

    And yet there are endless examples on forums of people
    saying they are confusing.

    Your endless complaint-tirades may have made me miss these dubious
    "endless examples on forums of people saying they are confusing".
    If you are active on these forums I'd at least not be astonished
    if you'd contributed to anyone's confusion given how you behave
    here even on simple and clear facts about "C".

    Janis


  • From Janis Papanagnou@3:633/10 to All on Fri May 29 18:48:49 2026
    On 2026-05-29 18:12, Bart wrote:
    On 29/05/2026 16:59, Scott Lurndal wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 2026-05-29 15:22, Bart wrote:

    I noticed that you didn't answer my questions.

    Yes, because, as experience shows, it's obviously a waste of time!

    Okay, I'll bite. - I'll go waste my time again and comment on your
    other post where you said you are confused about these cases...

    Why bother?  C isn't ever going to change operator precedence
    just to make Bart happy.

    I think it's the "someone is wrong on the internet" syndrome.[*]
    My apologies. :-)


    It would make me happy just for someone to admit there are problems.
    JP always says they are perfect but for one little thing.

    You said in your previous post that I would have said that they
    are perfect. And now you are even saying that I'd always say that.

    Please stop that!

    Just for the record...

    What I would say is that operator precedences are in "C"
    "sensibly and appropriately defined, modulo the bit-ops".

    If you want to cite me in future please quote this statement
    and not anything else.

    Janis

    [*] https://xkcd.com/386/


  • From tTh@3:633/10 to All on Fri May 29 19:29:19 2026
    On 5/29/26 15:22, Bart wrote:

    Something you might do when you have time (as I'm busy), is to analyse
    the expressions in some C codebases, and isolate those where removal
    of parentheses that group terms, would result in exactly the same
    shape of expressions, and are therefore redundant.

    This is a strange exercise. When I write a complex expression,
    I sometimes use redundant parentheses for the clarity of
    my intentions about this computation. I'm thinking that
    those extra (()) are a sort of in-line comments.

    --
    ** **
    * tTh des Bourtoulots *
    * http://maison.tth.netlib.re/ *
    ** **

  • From Bart@3:633/10 to All on Fri May 29 18:53:49 2026
    On 29/05/2026 18:29, tTh wrote:
    On 5/29/26 15:22, Bart wrote:

    Something you might do when you have time (as I'm busy), is to analyse
    the expressions in some C codebases, and isolate those where removal
    of parentheses that group terms, would result in exactly the same
    shape of expressions, and are therefore redundant.

       This is a strange exercise. When I write a complex expression,
       I sometimes use redundant parentheses for the clarity of
       my intentions about this computation. I'm thinking that
       those extra (()) are a sort of in-line comments.


    Sure, but some here like to say that such expressions, if they still
    work without parentheses, are unambiguous anyway.

    They forget that people aren't compilers.

    And then the point becomes, if you always add the parentheses, what was
    the point of having that particular precedence level?

  • From Chris M. Thomasson@3:633/10 to All on Fri May 29 11:00:18 2026
    On 5/29/2026 12:56 AM, Janis Papanagnou wrote:
    On 2026-05-28 21:47, Chris M. Thomasson wrote:
    On 5/28/2026 12:18 AM, Janis Papanagnou wrote:
    [...]

    All sorts of C's problems with memory can be addressed. (The
    list can be continued; but I wonder why such things aren't recognized.)

    C's problems with memory? Don't you mean the programmers that make bugs?

    I'm not sure you're serious here or just joking. - To clarify...

    Yes, the programmers "implement the bugs", and the language makes it
    just easy and obligingly support the programmers to make such bugs.

    Well... Actually, it's not C's fault at all. We have some tools to help
    a programmer, but ultimately it's up to them...?

  • From Bart@3:633/10 to All on Fri May 29 19:09:50 2026
    On 29/05/2026 17:48, Janis Papanagnou wrote:
    On 2026-05-29 18:12, Bart wrote:
    On 29/05/2026 16:59, Scott Lurndal wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 2026-05-29 15:22, Bart wrote:

    I noticed that you didn't answer my questions.

    Yes, because, as experience shows, it's obviously a waste of time!

    Okay, I'll bite. - I'll go waste my time again and comment on your
    other post where you said you are confused about these cases...

    Why bother?  C isn't ever going to change operator precedence
    just to make Bart happy.

    I think it's the "someone is wrong on the internet" syndrome.[*]
    My apologies. :-)


    It would make me happy just for someone to admit there are problems.
    JP always says they are perfect but for one little thing.

    You said in your previous post that I would have said that they
    are perfect. And now you are even saying that I'd always say that.

    Please stop that!

    Just for the record...

    What I would say is that operator precedences are in "C"
    "sensibly and appropriately defined, modulo the bit-ops".


    You actually said this:

    (The point is that - with the exception of & ^ | - the ranking
    makes perfectly sense and should be easily usable without doubt
    by a concept-knowing programmer."

    (14-May-2026 0159 BST.)

    And today:

    Unsurprisingly; since exactly *that* was the obvious (and single)
    issue with C's precedence definitions.

    So you do seem to have a consistent view that there is only one thing
    wrong with C operator precedence.


    However, in that same 14-May post was this exchange:

    Dan Cross:
    Programmers _should_ absolutely learn the rules. But in C,
    there are many of them, and some of them are deceptively subtle.

    JP:
    We agreed.

    So, a hint of something else that is amiss? You just seem reluctant to
    say it yourself, but will disagree with me when I suggest it, while at
    the same time agree when somebody else does so.



  • From Bart@3:633/10 to All on Fri May 29 19:18:19 2026
    On 29/05/2026 17:10, Janis Papanagnou wrote:
    On 2026-05-29 13:19, Bart wrote:

    Further, here: 'a * b + c' the multiplication is done first, but here:

        a *= b += c

    It is done second.

    You understand that '=', '*=', and '*' are three different things,
    don't you?

    I hope you understand that '=' should have low precedence. And that
    it makes sense to evaluate that from right to left. Do you follow?

    "C" obviously decided to have them all, =, +=, *=, etc. in a single
    group, and thus evaluated from right to left. - Easy rule, easy to
    memorize. - And that is actually what you are demanding from many
    other operators, to put them in a single group. - But here you are
    complaining about it!

    Of course the rules for those combined (sort of two-address) operators
    could have been defined differently, in an own group with other rules.
    (Algol 68 had done that, actually; the semantics are like "apply these
    operations from left to right", indicating an incremental modification
    of the underlying value.)

    For those who don't know, the behaviour of this C code:

    a += b += c += d

    is very different from the equivalent Algol68:

    a +:= b +:= c +:= d

    This only modifies 'a'.
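
    To make the difference concrete, here is a sketch in C only. The
    struct and function names are made up, and the second function
    merely emulates the left-to-right reading described for Algol 68
    (C itself rejects '((a += b) += c) += d' because the result of '+='
    is not an lvalue):

```c
/* Hypothetical demo type holding the four operands. */
struct vals { int a, b, c, d; };

/* C: groups as a += (b += (c += d)), so a, b and c are all modified. */
struct vals c_style(struct vals v)
{
    v.a += v.b += v.c += v.d;
    return v;
}

/* Left-to-right reading, spelled out as separate statements:
   only a is modified. */
struct vals left_to_right(struct vals v)
{
    v.a += v.b;
    v.a += v.c;
    v.a += v.d;
    return v;
}
```

    Starting from a=1, b=2, c=3, d=4, both leave a at 10, but the C form
    also changes b to 9 and c to 7, while the left-to-right form leaves
    b and c alone.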

    In fact it's more like: 'b = x; a = b;'. Example:

         double a;
         float b;

         a = b = 3.14159265358979323846;

    Here, 'a' will be assigned the less precise 32-bit version of the RHS.

    And why are you composing such stupid examples (if not only for sake
    of an argument)? - An experienced programmer wouldn't write such an
    expression with mixed types if he intends clear and non-dubious code.

    Maybe the code started off as:

    a = x;
    b = x;

    and somebody decided to refactor it. Or it was 'a = b = x' all along but
    a, b started off as the same type but then one was changed.

    These things happen. When discussing language design we have to consider
    what thousands or millions of programmers might think or assume.

    You seem to like making it 100% about me. How about stopping making it
    always so personal.

  • From Keith Thompson@3:633/10 to All on Fri May 29 12:09:55 2026
    Bart <bc@freeuk.com> writes:
    On 29/05/2026 16:59, Scott Lurndal wrote:
    [...]
    Why bother? C isn't ever going to change operator precedence
    just to make Bart happy.

    It would make me happy just for someone to admit there are
    problems.

    There are problems.

    Are you happy now?

    I didn't think so.

    [...]

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Keith Thompson@3:633/10 to All on Fri May 29 12:28:17 2026
    Bart <bc@freeuk.com> writes:
    On 29/05/2026 18:29, tTh wrote:
    On 5/29/26 15:22, Bart wrote:
    Something you might do when you have time (as I'm busy), is to
    analyse the expressions in some C codebases, and isolate those
    where removal of parentheses that group terms, would result in
    exactly the same shape of expressions, and are therefore redundant.
       This is a strange exercise. When I write a complex expression,
       I sometimes use redundant parentheses for the clarity of
       my intentions about this computation. I'm thinking that
       those extra (()) are a sort of in-line comments.


    Sure, but some here like to say that such expressions, if they still
    work without parentheses, are unambiguous anyway.

    They forget that people aren't compilers.

    To state the obvious, complicated expressions that rely on C's more
    obscure precedence can be confusing to some human readers, but are
    unambiguous to compilers. Most of us will add parentheses that
    are not strictly necessary to the compiler, but that are helpful to
    human readers. There is no universal agreement on which parentheses
    are helpful or necessary and which are not, and there never will be.

    And then the point becomes, if you always add the parentheses, what
    was the point of having that particular precedence level?

    You're asking why C is designed the way it is. We could waste a
    great deal of time and effort answering that for you. There are
    numerous documents about the design and history of C, and of
    its ancestor languages. I could provide you with links.

    What if I did the research and presented you with an explanation for
    all of C's precedence levels? What if I could tell you exactly what
    Ken Thompson had in mind when he specified the expression syntax
    for B, and exactly what Dennis Ritchie had in mind when he based C
    on Thompson's work? What if you clearly and completely understood
    why C's expression syntax is the way it is, including parts of the
    syntax that Ritchie himself regretted, including parts that you
    obviously could have done better?

    What then? Would an answer to your question help you? Would it
    satisfy you? Would you even thank me for the effort if I did that?
    Or would you just keep complaining about something that cannot be
    changed without breaking existing code? Would you talk about C's
    expression syntax in a way intended to help others understand it,
    or would you continue to post contrived examples that make it appear
    as confusing as possible?

    Bart, what do you want?

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Bart@3:633/10 to All on Fri May 29 20:49:02 2026
    On 29/05/2026 20:28, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:

    or would you continue to post contrived examples that make it appear
    as confusing as possible?

    Examples are examples. Do you want me to post one that didn't illustrate
    an issue? It necessarily has to be contrived.


    Bart, what do you want?



    Today I was just replying to this post today that I found annoying:

    JP:
    Unsurprisingly; since exactly *that* was the obvious (and single)
    issue with C's precedence definitions.

    That is, the suggestion, made several times by JP, that there is only
    one thing wrong.


  • From James Kuyper@3:633/10 to All on Fri May 29 15:57:07 2026
    On 2026-05-29 12:10, Janis Papanagnou wrote:
    On 2026-05-29 13:19, Bart wrote:
    ...
    * What is the order here: a ^ b | c

    (a^b)|c

    Personally I don't think that there's a prevalent definition
    how these should be ordered.

    I'm not sure what you mean by "prevalent definition". Ordinarily, I'd
    expect the C standard to qualify - it definitely defines the order, and
    the very purpose of a language standard is to prevail over non-standard
    alternatives. However, I'm sure you're aware of the C standard, and made
    that comment anyway, so I presume you mean something different by it.
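
    For reference, the grouping C defines for 'a ^ b | c' can be checked
    with a few lines. The helper names are hypothetical:

```c
/* '^' binds tighter than '|' in C, so as_written matches xor_first. */
unsigned as_written(unsigned a, unsigned b, unsigned c) { return a ^ b | c; }
unsigned xor_first(unsigned a, unsigned b, unsigned c)  { return (a ^ b) | c; }
unsigned or_first(unsigned a, unsigned b, unsigned c)   { return a ^ (b | c); }
```

    With a = b = c = 1 the two groupings differ: (1 ^ 1) | 1 is 1, while
    1 ^ (1 | 1) is 0.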


  • From Janis Papanagnou@3:633/10 to All on Fri May 29 22:00:30 2026
    On 2026-05-29 20:09, Bart wrote:
    You actually said this:

    You continue your trollish stance to cherry-pick words without
    understanding or trying to understand what's been expressed.

    The insight appears to me that you're taking communication in a
    similar way as you "design" your languages; focusing on personal
    *syntax* preferences instead of the more important *semantics*.

    Despite we're talking in your native language (and not mine) you
    obviously completely miss or deliberately ignore that there's a
    difference between "it makes perfectly sense" and "it's perfect".
    (I said the former, you put the latter in my mouth. And to not
    get identified as a liar you're squirming with such moves. Gee!)

    Or are you again just confused about the difference, and despite you
    have already been advised to quote what I think can neither be
    misrepresented nor misinterpreted (since it doesn't contain common
    word patterns that obviously confuse you)
    >> What I would say is that operator precedences are in "C"
    >> "sensibly and appropriately defined, modulo the bit-ops".
    you're still playing your stupid game; you ignored that. I suggest
    to try to map this statement to either of the above two statements,
    the one I said and the one you (wrongly) attributed, and see which
    one fits. (Hint: the former.)

    [...]

    Dan Cross:
    Programmers _should_ absolutely learn the rules.  But in C,
    there are many of them, and some of them are deceptively subtle.

    JP:
    We agreed.
    Bart, you are incapable of understanding semantics and associating
    context.

    Janis


  • From Janis Papanagnou@3:633/10 to All on Fri May 29 22:03:04 2026
    On 2026-05-29 21:49, Bart wrote:
    On 29/05/2026 20:28, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:

    or would you continue to post contrived examples that make it appear
    as confusing as possible?

    Examples are examples. Do you want me to post one that didn't illustrate
    an issue? It necessarily has to be contrived.


    Bart, what do you want?



    Today I was just replying to this post today that I found annoying:

    JP:
    Unsurprisingly; since exactly *that* was the obvious (and single)
    issue with C's precedence definitions.

    That is, the suggestion, made several times by JP, that there is only
    one thing wrong.

    The C-precedence rules have one issue. The rest is a sensible choice.

    (The C-language has more issues, but that was not the topic here.)

    Janis


  • From BGB@3:633/10 to All on Fri May 29 15:16:51 2026
    On 5/29/2026 6:22 AM, David Brown wrote:
    On 29/05/2026 12:20, BGB wrote:
    On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
    On 2026-05-28 11:57, BGB wrote:
    On 5/28/2026 2:18 AM, Janis Papanagnou wrote:
    On 2026-05-28 01:49, BGB wrote:
    [...]


    But, not really an "easy" way to avoid bloat, other than to write
    code specifically for what cases are relevant; while also avoiding
    needless duplication and copy paste (where, overuse of copy/paste
    can also lead to bloat; along with turning the code into an ugly mess).

    Hmm.. - as said, during the very early days there were issues; I
    recall on one platform duplication of template code in more than
    one source unit. And/or some environmental hacks (of the compiler)
    to deposit template code for linking. In the later days I've not
    seen such immature things anymore.


    Possibly, a lot could depend on how one is counting things as well.


    In a lot of cases when using GCC, I end up using:
       -ffunction-sections -fdata-sections -Wl,-gc-sections

    On many targets, "-fdata-sections" can lead to noticeably larger and
    slower code because it effectively eliminates section anchor
    optimisations.  It does not negatively affect x86 AFAICS, because x86
    does not use section anchors.

    <https://godbolt.org/z/zeoq41Y7d>

    With -fsection-anchors (enabled with optimisation on targets that
    support it - generally RISCy load/store architectures), program-lifetime
    variables are kept together in a lump (as though they were in a struct)
    and often addressed by a pointer to that pretend struct.  Thus if a
    function accesses two variables "a" and "b", instead of having to load
    the addresses of each of "a" and "b" into separate registers, it loads
    an "anchor" into one register and accesses the variables with reg+offset
    addressing.
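
    A tiny test case for looking at this yourself might be the
    following. The variable and function names are invented, and what
    the generated code looks like depends entirely on the target and
    compiler version:

```c
/* Two program-lifetime variables (hypothetical names). */
int counter_a = 1;
int counter_b = 2;

/* On a target that uses section anchors, the compiler may load one
   anchor address and reach both variables as anchor+offset. With
   -fdata-sections each variable gets its own section, so that
   sharing is typically not available. */
int sum_counters(void)
{
    return counter_a + counter_b;
}
```

    Compiling it once with "-O2 -S" and once with "-O2 -fdata-sections
    -S" using a cross compiler for such a target, then comparing the two
    assembly listings, should show whether the address loads differ.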

    I've seen "-fdata-sections" used regularly in embedded systems - it is
    almost always a bad idea.

    ("-ffunction-sections" is often very helpful to reduce code image size,
    so keep that one.)


    Both seem to help on x86, x86-64, and also on RISC-V, at making GCC's
    output at least sorta space-comparable to my own compilers.

    The merit of "-fdata-sections" is mostly that it eliminates unused
    global variables; whereas "-ffunction-sections" eliminates unreachable
    functions.
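
    A sketch of a translation unit for trying this out; every name in it
    is hypothetical:

```c
/* Hypothetical file for experimenting with section garbage collection. */

int used_helper(int x)            /* referenced via entry() */
{
    return x * 2;
}

int unused_helper(int x)          /* never referenced anywhere */
{
    return x * 3;
}

int unused_table[1024] = { 1 };   /* initialized global, never referenced */

int entry(int x)
{
    return used_helper(x) + 1;
}
```

    Linked into a program whose main() only calls entry(), and built
    with the options quoted above, the expectation is that
    unused_helper and unused_table no longer show up in the output of
    "nm" on the final binary, while they do without those options.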

    Neither is needed with my own compiler, which compiles things in a way
    such that it eliminates anything that is unreachable.


    Both posed an issue initially when porting ROTT, because in some cases
    it relied on the ability to go out-of-bounds for one array to access
    data in another array. I ended up reworking some of these cases though
    to use a single larger array.
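
    The kind of rework meant here might look roughly like this. It is a
    sketch only; the names and sizes are invented and not taken from the
    ROTT source:

```c
/* Hypothetical sizes for the two logical arrays. */
enum { FIRST_LEN = 4, SECOND_LEN = 4 };

/* One backing array replaces two separate globals whose relative
   placement the old code silently depended on. */
static int combined[FIRST_LEN + SECOND_LEN] = { 0, 1, 2, 3, 10, 11, 12, 13 };

static int *const first  = combined;              /* elements 0..3 */
static int *const second = combined + FIRST_LEN;  /* elements 4..7 */

/* Indexing past the logical end of 'first' now stays inside one object,
   so it is valid for 0 <= i < FIRST_LEN + SECOND_LEN. */
int read_from_first(int i)
{
    return first[i];
}

int read_from_second(int i)
{
    return second[i];
}
```

    Since both views point into the same object, the linker is free to
    place or drop sections however it likes without changing what
    first[5] refers to.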



    Have noted though that GCC targeting RISC-V still tends to produce
    fairly large binaries even with "-Os". Its code for the basic subset
    (RV64G) does tend to be a little faster than what BGBCC generates, but
    also a fair bit more bulky. Though, the final ELF file ends up bigger
    still, as a significant chunk of the file ends up needing to hold ELF
    related metadata (comparably, PE/COFF can end up much leaner here).


    Though, on the other side, with modern MSVC, despite the relative
    leanness of the PE/COFF format, MSVC tends to produce binaries with much
    larger ".text" sections.

    This issue was a lot less with VS2008 though, which tended to generate
    less-bloated binaries (with code-size more competitive with GCC).


    Also in modern MSVC, there is little distinction between "/O1" and
    "/Os", both being more space-efficient than "/O2" (though, "/O2" is
    usually faster, but also more prone to misguided attempts at
    auto-vectorization).





    Because otherwise it likes wasting code space by retaining unreachable
    functions.

    Using "static inline" functions also carries a risk because they can
    end up duplicated across multiple translation units, or in multiple
    places within the same translation unit, so is best used sparingly.


    Usually you would only use static inline functions for small functions
    in headers, where they are a better choice than function-like macros. In
    a C file, there is rarely much point in declaring a function "inline" -
    optimising compilers will inline or not as they see fit, without regard
    for "inline".  "static" on its own is, of course, always a good idea for
    functions or data that is not "exported" by the current translation
    unit, and will often make generated code smaller.

    How much or how little duplication of code there will be within one
    translation unit will depend on compiler settings and the rest of the
    code, and not on whether or not you use "inline".


    OK.

    But, yeah, small functions are usually better than macros in at least
    that the compiler can avoid duplicating them (or maybe merge them
    between translation units when it notices that the contents are
    identical).
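
    Apart from code size, there is also the well-known argument
    evaluation difference between the two. A minimal sketch, with all
    names made up:

```c
/* Hypothetical header content. */
#define SQUARE_MACRO(x) ((x) * (x))   /* argument is expanded twice */

static inline int square_fn(int x)    /* argument is evaluated once */
{
    return x * x;
}

static int calls;                     /* counts evaluations of next_value() */
static int next_value(void) { return ++calls; }

/* Each helper returns how often the argument expression was evaluated. */
int count_macro_evals(void)
{
    calls = 0;
    (void)SQUARE_MACRO(next_value());
    return calls;
}

int count_fn_evals(void)
{
    calls = 0;
    (void)square_fn(next_value());
    return calls;
}
```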





    As for assembler:
    Main reasons not to use assembler for everything:
       Needlessly verbose;
       Non-portable.

    However, often one can still end up writing C code that looks like
    assembler sometimes, as this is often an effective way to optimize
    things.

    Say, for example:
       v0=cs[0];
       v2=cs[2];
       v1=cs[1];
       v3=cs[3];
       ct[0]=v0;
       ct[2]=v2;
       ct[1]=v1;
       ct[3]=v3;
    Vs:
       ct[0]=cs[0];
       ct[1]=cs[1];
       ct[2]=cs[2];
       ct[3]=cs[3];

    Because the extra variables can help sidestep latency from the
    load instructions and staggering stores can avoid penalties of two
    adjacent stores to the same cache-line in some cache architectures.
    Where, in the latter case, the compiler may fail to as effectively
    avoid the load-latency or realize the need to stagger the stores for
    best performance, ...

    That might be the case for a very simplistic compiler.  With an
    optimising compiler, these extra variables will quickly be eliminated.
    If the compiler has a good scheduling model of the device, it will do
    whatever instruction scheduling works best for that processor.  If the
    model is not good enough, it will be suboptimal.  I would not, however,
    expect any difference in the generated code for the two code snippets.

    Sometimes this kind of "manual optimisation" is helpful when you have to
    try to get efficient results from a weak compiler, however.
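
    To compare the two forms in a compiler of your choice, they can be
    wrapped as functions. This is only a sketch with made-up function
    names and an assumed element type of int:

```c
/* 'cs' is the source and 'ct' the target, as in the snippets above. */
void copy4_staggered(int *ct, const int *cs)
{
    int v0, v1, v2, v3;

    /* All loads first, then the stores, in the interleaved order. */
    v0 = cs[0];  v2 = cs[2];
    v1 = cs[1];  v3 = cs[3];
    ct[0] = v0;  ct[2] = v2;
    ct[1] = v1;  ct[3] = v3;
}

void copy4_plain(int *ct, const int *cs)
{
    ct[0] = cs[0];
    ct[1] = cs[1];
    ct[2] = cs[2];
    ct[3] = cs[3];
}
```

    One thing worth noting: without "restrict", the compiler has to
    assume that ct and cs may overlap, so in the plain version it may
    not be allowed to move a later load ahead of an earlier store. The
    staggered version does all loads up front, which sidesteps that
    question (and behaves differently if the buffers really do overlap).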


    Possibly, but this sort of thing can help with both BGBCC and with MSVC IME.


    While BGBCC does shuffle instructions to reorder them, it may
    fail to do so in some cases:
      If the instructions end up mapped to the same CPU register;
      If its heuristics can't prove non-alias.

    Though, in the simple example given, it could (probably) turn the latter
    into the former, but "better" to write code such that things are in
    closer to the optimal order by default.

    Note that using different variables with overlapping scopes reduces the
    likelihood of the compiler assigning both to the same register, which is
    a much more real risk if relying on implicit temporaries (whose lifetime
    only exists within a single expression).

    But, in my case, a lot of this comes down to trying to tweak the
    compilers' internal register allocation heuristics for best results (and
    the tight balance between how many registers to save/restore for the
    function, vs avoiding assigning short-lived temporaries to the same
    register too quickly and hindering the instruction-scheduling).

    Arguably, could make sense to instead do the reordering at the 3AC
    level, rather than reordering at the level of ISA instructions, but this
    is just sorta how I ended up doing things (and one can know the
    effective timing latency of a CPU instruction a lot more easily than a
    3AC op).





    Usual strategy is to try to limit how much code is written, and also
    to avoid doing things in ways that result in too much code, or too
    much cruft.

    Best to avoid both copy paste when reasonable, and sticking anything
    non-trivial in macros.

    We avoided macros if possible.


    They are de-facto for constants and similar, but for longer stuff is
    better avoided.

    Macros are rarely the best way to define constants.  They are needed if
    you are using the constants for pre-processor stuff like conditional
    compilation.  But generally you get clearer code, better typing, and
    potentially several other benefits from using alternative choices like
    "enum" (even for stand-alone integer constants), "static const"
    variables, and in C23, "constexpr" variables.  There's no doubt that a
    lot of code /does/ use macros for constants, but I view it as a relic of
    the past rather than good coding practice.


    They are traditional...

    Like:
    static const double M_PI = 3.14159265358979;

    Could also make sense, but people don't usually do this, they usually
    use macros...
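
    The three spellings side by side, as a sketch with invented names
    (a different name than M_PI is used, since many <math.h>
    implementations already define M_PI as a macro):

```c
/* Hypothetical names throughout. */
#define BUF_LEN_MACRO 16                          /* visible to #if, unscoped */

enum { BUF_LEN = 16 };                            /* integer constant expression */
static const double PI_CONST = 3.14159265358979;  /* typed, scoped object */

#if BUF_LEN_MACRO < 8
#error "buffer too small"       /* only the macro can be tested here */
#endif

static char buffer[BUF_LEN];    /* enum constant works as an array size */

double circle_area(double r)
{
    return PI_CONST * r * r;
}

unsigned long buffer_size(void)
{
    return (unsigned long)sizeof buffer;
}
```

    One practical reason macros persist: before C23, a "static const"
    integer object is not a constant expression in C, so it cannot size
    a file-scope array or serve as a case label, whereas an enum
    constant or a macro can.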

    In BGBCC, both can be handled as constants, just they end up being
    handled at different stages:
      #define: Constant ends up inlined in the preprocessor/parse stage;
      const: Constant shows up in the "reducer" (which evaluates constant
        expressions).

    Where, as noted, BGBCC's pipeline looks kinda like:
      Toplevel:
        Ingest each named source file;

      Then, in the C case, per translation unit:
        Preprocess;
        Parse;
        Frontend Compile + Reduce;
          This does an AST walk, but at each stage,
            invokes the reducer to see if it can perform AST level rewrites;
          Reducer can also implement some edge-case features.
            So, is mostly necessary, vs an optional optimization thing.
        Emits output as a Stack IL.
          May be output to a file, or used as input to next stage.
          The stack IL partly resembles a mix of JVM and .NET bytecode.
            The IL ops themselves operate more like in .NET bytecode.
          This serves the role of static libraries and object files.
            For a static library, all the stack IL gets blobbed together.
            So, every translation unit ends up effectively appended on.

      Middle Stage (processes IL Blobs):
        Processes Stack IL, translates to 3AC (loosely SSA form);
        Builds a big table of all global declarations, etc.

      Backend:
        Walks call-graph to determine dependencies;
          Unreachable functions/globals/etc are marked as culled.
        Ranks all the functions and variables by priority;
        Sorts them into roughly priority order;
        Then does shuffling to try to density-optimize globals;
          Swaps globals when doing so would allow more memory density.
          May also apply random shuffling and clustering heuristics.
        Then, compiles each function:
          Figure out stack-frame layout,
            how many registers to reserve,
            etc.
          Emit machine code for 3AC ops;
          Try to shuffle instructions to improve instruction scheduling;
        Or, if a variable:
          Figure out whether it goes in ".data" or ".bss"
          If initialized, deal with initialization stuff;
          ...
        Or, an ASM Blob:
          Assemble it.
        Or, Ingests contents that go into ".rsrc" section;
          May involve image and audio converters, etc;
          BGBCC uses different resource sections from Windows though.
        ...
      Output:
        Gets its input as a set of sections, symbols, and relocs;
        Figures out layout within the output image (eg, PE/COFF);
        Figures out how much space it needs for base relocs, etc.
          Builds up a table of "initial base-relocs"
        Splats the sections into the image buffer;
        Applies relevant relocs;
        Sorts base relocs by RVA;
        Generates actual ".reloc" section contents.
        Fill in PE/COFF headers and similar;
        If applicable, LZ4 compress the image.
          I tend to store EXE's in LZ4 compressed form,
            the image is decompressed during load.
          This format leaves the initial PE/COFF headers uncompressed.
            Need the headers to figure out where to load the image.
            Else would need a temporary buffer to decompress into.


    Typical loader process:
    Look at headers;
    Figure out where to load to, etc;
    Read in (or decompress) image contents;
    Apply base relocs;
    Pull in any DLLs, etc;
    Go.
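
    The "apply base relocs" step can be sketched roughly as below. This is
    not BGBCC's or Windows' actual code: the flat array of RVAs and the name
    apply_base_relocs are assumptions made for illustration (real PE/COFF
    ".reloc" data is grouped into per-page blocks with typed entries), and
    every relocated slot is assumed to hold a 64-bit absolute address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Rebase an image that was linked for preferred_base but loaded at
 * actual_base: add the difference to every absolute address slot.
 * rvas[] is a hypothetical, already-flattened list of slot offsets. */
void apply_base_relocs(unsigned char *image, size_t image_size,
                       const uint32_t *rvas, size_t count,
                       uint64_t preferred_base, uint64_t actual_base)
{
    uint64_t delta = actual_base - preferred_base;  /* wraps correctly if negative */

    for (size_t i = 0; i < count; i++) {
        uint64_t value;

        /* Refuse slots that would run past the end of the image. */
        assert((size_t)rvas[i] + sizeof value <= image_size);

        /* memcpy avoids unaligned-access and aliasing problems. */
        memcpy(&value, image + rvas[i], sizeof value);
        value += delta;
        memcpy(image + rvas[i], &value, sizeof value);
    }
}
```

    If the image happens to load at its preferred base, delta is zero and
    the loop changes nothing.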

    The LZ4 compression is mostly because:
    Loader is often IO bound;
    May save memory in some cases;
    LZ4 decompression is faster than more IO;
    It also seems to be effective against program binaries (*).

    *: I have my RP2 format, which generally does better for general purpose
    data compression, but slightly worse for compressing program binaries,
    so LZ4 has mostly won here. Also generally don't want a "stronger"
    compressor, like Deflate, both because an Inflater is a much bigger
    chunk of code, and also much slower than LZ4.



    Can note that BGBCC also mostly takes over the role of the "resource
    compiler" as well, so can process resources. These are generally listed
    as a text file of entries to import, giving an internal "lumpname",
    external filename, and a tag to specify which file conversions to apply.

    I am using a very different resource section type than Windows though,
    in that I just sorta replaced it with a modified version of the Quake
    WAD2 format (not to be confused with PAK, where PAK serves a different
    role). Note that the WAD2 directory in this case uses RVA's and not
    WAD-file offsets (so, effectively, it is integrated into the PE/COFF
    image, not just a WAD file that was shoved in).

    Generally, one can access lumps from C land with declarations like:
    extern unsigned char __rsrc_lumpname[];


    Typically, formats used internally are things like BMP and WAV. Though,
    when using BMP, it is typically 16 color or 256 color to avoid wasting
    space. Sometimes monochrome or 4 color. One downside of BMP is that for
    a full 256-color palette it needs 1K of memory just for the palette,
    tempting to consider a non-standard variant that uses RGB555 for the
    palette (thus reducing it to 512 bytes). For small images it is often
    smaller to store them as 16-bit hi-color to avoid the space penalty of
    the color palette.

    There are already non-standard BMP variants though, like BMP with LZ compression. A lot depends on what is needed for a particular use-case.


    For WAV typical formats are 2 or 4 bit ADPCM at 8/11/16 kHz.
    2 bit ADPCM: 16/22/32 kbps.
    4 bit ADPCM: 32/44/64 kbps.
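
    The listed rates are just bits per sample times sample rate, for one
    channel and ignoring any block headers. The sketch below works that out;
    adpcm_bps is a made-up helper name, and reading "11 kHz" as 11025 Hz is
    an assumption.

```c
#include <assert.h>

/* Raw payload bit rate (bits per second) of mono ADPCM:
 * bits per sample times samples per second, headers ignored. */
unsigned long adpcm_bps(unsigned bits_per_sample, unsigned long sample_rate_hz)
{
    assert(bits_per_sample > 0);
    return (unsigned long)bits_per_sample * sample_rate_hz;
}
```

    For example, adpcm_bps(2, 8000) gives 16000 and adpcm_bps(4, 16000)
    gives 64000, matching the 16 and 64 kbps ends of the two ranges.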

    Have found encoder-side tricks to make ADPCM more compressible with LZ4. Basically, it tries to do a reverse LZ search when encoding and encodes
    audio following patterns when the pattern would be a "close enough" match.

    had also experimented before with using some trickery involving FIR
    filters and lookup tables to improve perceptual quality of 8kHz/2b ADPCM
    to try to make it sound "less like total crap". But, this requires
    additional metadata and a more complex process to decode (and to get
    best results with this will result in worse audio quality if the audio
    is just naively decoded as 8kHz/2b ADPCM without the filters).

    But, yeah, with these tricks can reduce the effective bitrate (when LZ4 compressed) down to around 8-12 kbps. Note that while entropy coding
    could help more, it is modest, and the most effective strategy (range
    coding) being mostly too slow to be worthwhile.

    Also, of the things I have tested, ADPCM was still the front runner for "actually passable" audio quality in this domain (to me, some of the
    modern cellphone codecs sound like unintelligible broken garbage, and
    require much more complex decoders, not worth the bother).


    only thing I have found that gets much lower bitrate is, say:
    One divides the audio up into chunks of 64 samples (1/125 second for 8kHz); Pick the top 4 square-waves from between 1 and 4 kHz;
    Encode the phase and intensity of each square wave.

    Typically, the strategy was to break it into 4 half-octaves and pick the highest peak in each half-octave; and then totally ignore everything
    below 1kHz. If the frequency and amplitude are encoded in around 16 bits
    each, this achieves an effective bitrate of 8 kbps.

    Though, another strategy was 8 quarter octaves and pick the top 4 loudest.

    But, audio quality is worse than 2b ADPCM.

    Can push it to 4kbps by only encoding the top 2 waveforms.
    But, then speech sounds robotic and borderline unintelligible.
    Note that dropping to 62.5 Hz sampling also makes speech unintelligible.
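
    The 8 kbps and 4 kbps figures follow from chunk rate times waves per
    chunk times bits per wave. A small sketch of that arithmetic, assuming
    "around 16 bits each" means roughly 16 bits per encoded wave; the helper
    name tone_codec_bps is hypothetical.

```c
#include <assert.h>

/* Bit rate of a codec that sends a fixed number of waves per chunk.
 * Integer division is fine when the rate divides evenly by the chunk
 * size, as with 8000 Hz and 64-sample chunks (125 chunks per second). */
unsigned long tone_codec_bps(unsigned long sample_rate_hz,
                             unsigned chunk_samples,
                             unsigned waves_per_chunk,
                             unsigned bits_per_wave)
{
    assert(chunk_samples > 0);
    unsigned long chunks_per_second = sample_rate_hz / chunk_samples;
    return chunks_per_second * waves_per_chunk * bits_per_wave;
}
```

    With 8000 Hz audio and 64-sample chunks, four waves give 8000 bits per
    second and two waves give 4000.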

    While traditionally, this used sine-waves (sinewave synthesis) I had
    better results with square waves (simpler/cheaper, also better results audio-wise). Computational cost for decoding is fairly modest (mostly
    some "for()" loops and fixed-point arithmetic).
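
    To give a feel for the "for() loops and fixed-point arithmetic", here is
    a minimal sketch of summing a few square waves into one 64-sample chunk.
    The SquareWave layout, the 16-bit phase accumulator and the name
    synth_chunk are all assumptions for illustration, not the actual decoder
    or bitstream format described above.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_SAMPLES 64

/* Hypothetical per-wave state for one chunk. */
typedef struct {
    uint16_t phase;  /* fixed-point phase; one full cycle = 65536 */
    uint16_t step;   /* phase increment per sample = freq * 65536 / rate */
    int16_t  amp;    /* peak amplitude */
} SquareWave;

/* Mix 'count' square waves into one chunk of signed 16-bit samples. */
void synth_chunk(int16_t out[CHUNK_SAMPLES], SquareWave *waves, int count)
{
    assert(count >= 0);
    for (int i = 0; i < CHUNK_SAMPLES; i++) {
        int32_t acc = 0;
        for (int k = 0; k < count; k++) {
            /* Top phase bit selects the high or low half of the cycle. */
            acc += (waves[k].phase & 0x8000u) ? -waves[k].amp : waves[k].amp;
            waves[k].phase = (uint16_t)(waves[k].phase + waves[k].step);
        }
        /* Clamp the mix to the 16-bit output range. */
        if (acc > INT16_MAX) acc = INT16_MAX;
        if (acc < INT16_MIN) acc = INT16_MIN;
        out[i] = (int16_t)acc;
    }
}
```

    A 1 kHz tone at 8 kHz sampling would use step = 1000 * 65536 / 8000 =
    8192, giving four positive samples followed by four negative ones.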

    Though, effective bitrate may be lower, because it seems that speech
    encoded this way is often LZ compressible as well (and can be helped
    along with pattern matching tricks).

    ...


    But, yeah, generally want images and audio to be fairly compact when
    shoving them inside an EXE or DLL, for more general asset data,
    generally better to use an external file.

    I had often used a custom "WAD4" format here, which is kinda like "WAD2
    but with longer names and a directory tree". It then exists as a lower
    cost option to the ZIP format (while semi-popular, ZIP is a
    high-overhead format to be used this way).

    Also can use WAD4 as a sort of VFS packaging.




    But, things can be considered in relative terms:
    Like, C++ may carry various penalties vs C.


    I don't find C++ carries noticeable penalties compared to C, for my
    embedded work. But I do disable exceptions and RTTI - exceptions may
    have very little run-time overhead, but the unwind tables can be
    significant when code size is important in small systems.


    Yes, that is the main thing.
    They carry zero performance penalty in practice;
    But, have a non-zero penalty for image size.

    Not enough to be a deal-breaker towards using them if they are used, but enough that one wants them disabled if not used...



  • From Janis Papanagnou@3:633/10 to All on Fri May 29 22:17:54 2026
    On 2026-05-29 20:18, Bart wrote:
    On 29/05/2026 17:10, Janis Papanagnou wrote:
    On 2026-05-29 13:19, Bart wrote:

    Further, here: 'a * b + c' the multiplication is done first, but here:

        a *= b += c

    It is done second.

    You understand that '=', '*=', and '*' are three different things,
    don't you?

    I hope you understand that '=' should have low precedence. And that
    it makes sense to evaluate that from right to left. Do you follow?

    "C" obviously decided to have them all, =, +=, *=, etc. in a single
    group, and thus evaluated from right to left. - Easy rule, easy to
    memorize. - And that is actually what you are demanding from many
    other operators, to put them in a single group. - But here you are
    complaining about it!
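
    A small C sketch of that right-to-left grouping, using the values 2, 3
    and 4 purely as an example. The function names are made up. Pointers are
    used only so that the change to both variables is visible to the caller.

```c
#include <assert.h>

/* a *= b += c groups as a *= (b += c): b is updated first,
 * then a is multiplied by the new value of b. */
void chained_compound(int *a, int *b, int c)
{
    assert(a != b);  /* modifying one object twice here would be undefined */
    *a *= *b += c;
}

/* For contrast: in a * b + c the multiplication binds tighter. */
int mul_then_add(int a, int b, int c)
{
    return a * b + c;
}
```

    Starting from a = 2 and b = 3 with c = 4, chained_compound leaves b at 7
    and a at 14, while mul_then_add(2, 3, 4) returns 10.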

    Of course the rules for those combined (sort of two-address) operators
    could have been defined differently, in an own group with other rules.
    (Algol 68 had done that, actually; the semantics are like "apply these
    operations from left to right", indicating an incremental modification
    of the underlying value.)

    For those who don't know, the behaviour of this C code:

    Those who have read my post already know that, since that was
    what I explained as a possible alternative rule for these sorts
    of operators. (It's still quoted above.) Folks here are capable
    of understanding that without your echoing post.


       a += b += c += d

    is very different from the equivalent Algol68:

       a +:= b +:= c +:= d

    This only modifies 'a'.

    I explained already in my post that there's a difference. (Are you
    so proud of having understood that that you want to repeat it? -
    "Look Ma, no hands!")

    (And neither is "perfect"; both are sensible choices. - Not sure
    you understand that.)

    [...]

    You seem to like making it 100% about me. How about stopping making it always so personal.

    What you expose here (about your personality) is nothing new, and
    it's about your personality; you obviously aren't really interested
    to know or understand or learn the facts.

    You had asked, even insisted for answers to your samples because
    you obviously weren't intellectually capable of understanding the
    topic, and all you posted is this reply! - Pathetic!

    Janis


  • From Janis Papanagnou@3:633/10 to All on Fri May 29 22:34:08 2026
    On 2026-05-29 21:57, James Kuyper wrote:
    On 2026-05-29 12:10, Janis Papanagnou wrote:
    On 2026-05-29 13:19, Bart wrote:
    ...
    * What is the order here: a ^ b | c

    (a^b)|c

    Personally I don't think that there's a prevalent definition
    how these should be ordered.

    I'm not sure what you mean by "prevalent definition". Ordinarily, I'd
    expect the C standard to qualify - it definitely defines the order, and
    the very purpose of a language standard is to prevail over non-standard alternatives. However, I'm sure you're aware of the C standard, and made
    that comment anyway, so I presume you mean something different by it.

    It was about Bart's confusion; he seems to be looking for something
    "naturally" understandable, like the common * and / have precedence
    over + and - , which is known by most non-IT people from basic math.

    There is no such commonly known ("prevalent") definition for the bit
    operations. So we need to look that up in appropriate documents to
    get to know their evaluation order. - That was what I intended to
    express.

    Sorry if that was unclear, and thanks for asking to clarify that.

    How "C" defines their precedence can of course be read in any book
    about "C", there's not even a "C Standard" document necessary.

    In addition I gave an explanation why they decided to have these
    operators in three separate precedence groups, and hinted at what
    was the rationale for the same order as the boolean && and || .
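
    For reference, a short C sketch of the three bitwise levels: & binds
    tighter than ^, which binds tighter than |, in the same relative order
    as && over ||. The function names are invented for this example, and
    some compilers will suggest parentheses for the first function.

```c
#include <assert.h>

/* Parsed as (a ^ b) | c because ^ has higher precedence than |. */
unsigned xor_then_or(unsigned a, unsigned b, unsigned c)
{
    return a ^ b | c;
}

/* The other grouping has to be requested explicitly. */
unsigned xor_of_or(unsigned a, unsigned b, unsigned c)
{
    return a ^ (b | c);
}

/* Parsed as a | (b & c) because & has higher precedence than |. */
unsigned or_of_and(unsigned a, unsigned b, unsigned c)
{
    return a | b & c;
}
```

    With a = 1, b = 2, c = 1 the first function yields 3 and the second
    yields 2, so the grouping does change the result.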

    Janis


  • From Bart@3:633/10 to All on Fri May 29 21:47:40 2026
    On 29/05/2026 21:17, Janis Papanagnou wrote:
    On 2026-05-29 20:18, Bart wrote:

    (Are you
    so proud of having understood that that you want to repeat it? -
    "Look Ma, no hands!")

    What you expose here (about your personality) is nothing new, and
    it's about your personality; you obviously aren't really interested
    to know or understand or learn the facts.

    you obviously weren't intellectually capable of understanding the
    topic, and all you posted is this reply!

    - Pathetic!

    It doesn't look like a civil discussion is possible here, so long as you
    keep up the personal insults.

    I thank you for those replies but there doesn't seem any point in taking
    this further.

  • From Keith Thompson@3:633/10 to All on Fri May 29 13:56:35 2026
    Bart <bc@freeuk.com> writes:
    On 29/05/2026 20:28, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    or would you continue to post contrived examples that make it appear
    as confusing as possible?

    Examples are examples. Do you want me to post one that didn't
    illustrate an issue? It necessarily has to be contrived.

    Bart, what do you want?



    Today I was just replying to this post today that I found annoying:

    JP:
    Unsurprisingly; since exactly *that* was the obvious (and single)
    issue with C's precedence definitions.

    That is, the suggestion, made several times by JP, that there is only
    one thing wrong.

    I note your refusal to address most of what I wrote.

    Upthread, you asked a question:

    And then the point becomes, if you always add the parentheses, what
    was the point of having that particular precedence level?

    You've made it clear that you were never interested in an answer.

    Please do not waste everyone's time by asking questions when you're
    not interested in the answers. Please do not assume that anyone
    can tell whether one of your questions is sincere, figurative,
    or rhetorical.

    Bart, what do you want?

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Bart@3:633/10 to All on Fri May 29 22:14:32 2026
    On 29/05/2026 21:00, Janis Papanagnou wrote:
    On 2026-05-29 20:09, Bart wrote:
    You actually said this:

    You continue your trollish stance to cherry-pick words without
    understanding or trying to understand what's been expressed.

    The insight appears to me that you're taking communication in a
    similar way as you "design" your languages; focusing on personal
    *syntax* preferences instead of the more important *semantics*.

    Despite we're talking in your native language

    English is my second language, technically.

    (and not mine) you
    obviously completely miss or deliberately ignore that there's a
    difference between "it makes perfectly sense" and "it's perfect".
    (I said the former, you put the latter in my mouth.

    It's paraphrasing. Here is your quote again:

    (The point is that - with the exception of & ^ | - the ranking
    makes perfect[ly] sense and should be easily usable without doubt
    by a concept-knowing programmer."

    I've isolated the 'ly' as that is incorrect grammar and have ignored it.

    I don't know what impression someone can take from this other than you
    think it's all dandy apart from that one exception.

    So, what am I missing? Did you mean the rest is all fine, considering
    this is C, but it is not perfect?

    In that case, what /would/ be perfect in your view? Assume a fantasy
    language where anything is possible.



     >> What I would say is that operator precedences are in "C"
     >> "sensibly and appropriately defined, modulo the bit-ops".
    you're still playing your stupid game; you ignored that. I suggest
    to try to map this statement to either of the above two statements,
    the one I said and the one you (wrongly) attributed, and see which
    one fits. (Hint: the former.)

    OK, so why don't you list all the things you think are amiss with C
    operator precedences. Apart from that exception.



  • From Bart@3:633/10 to All on Fri May 29 22:54:08 2026
    On 29/05/2026 21:56, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 29/05/2026 20:28, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    or would you continue to post contrived examples that make it appear
    as confusing as possible?

    Examples are examples. Do you want me to post one that didn't
    illustrate an issue? It necessarily has to be contrived.

    Bart, what do you want?



    Today I was just replying to this post today that I found annoying:

    JP:
    Unsurprisingly; since exactly *that* was the obvious (and single)
    issue with C's precedence definitions.

    That is, the suggestion, made several times by JP, that there is only
    one thing wrong.

    I note your refusal to address most of what I wrote.

    Upthread, you asked a question:

    And then the point becomes, if you always add the parentheses, what
    was the point of having that particular precedence level?

    You've made it clear that you were never interested in an answer.


    You said this:

    "You're asking why C is designed the way it is. We could waste a
    great deal of time and effort answering that for you. There are
    numerous documents about the design and history of C, and of
    its ancestor languages. I could provide you with links."

    Actually I'm not asking why C is like that. We're already there.

    I'm saying that there is no value in those extra levels, some people
    think there is, and I'm arguing about that. I was replying to tTh.

    As for my question, what /is/ the point? I'm still waiting!

    Of course, I want the answer to be that there isn't any point if
    parentheses will be used anyway.


    Bart, what do you want?

    What answer do you want from me? As I said it was a reply to JP. You
    didn't need to step in.



  • From Keith Thompson@3:633/10 to All on Fri May 29 15:52:14 2026
    Bart <bc@freeuk.com> writes:
    On 29/05/2026 21:56, Keith Thompson wrote:
    [...]
    I note your refusal to address most of what I wrote.

    Upthread, you asked a question:

    And then the point becomes, if you always add the
    parentheses, what was the point of having that particular
    precedence level?

    You've made it clear that you were never interested in an answer.

    You said this:

    "You're asking why C is designed the way it is. We could waste a
    great deal of time and effort answering that for you. There are
    numerous documents about the design and history of C, and of
    its ancestor languages. I could provide you with links."

    Actually I'm not asking why C is like that. We're already there.

    Then your question was unclear. The only reasonable interpretation
    I could see for your question, quoted above, is that you wanted
    to know why Dennis Ritchie chose the specific precedence rules
    that he chose when he was designing the C language in the 1970s.
    (The precedence rules have been stable since then.)

    What did you mean to ask? Was your question meant to be rhetorical?
    Did you just mean to let us know that you don't like C's precedence
    rules? I think we all knew that.

    [...]

    As for my question, what /is/ the point? I'm still waiting!

    I see that (what *is* the point) as a different question from what
    you wrote earlier (what *was* the point).

    The rules are what they are.

    I honestly don't have any strong opinions about what the rules
    *should* be.

    Of course, I want the answer to be that there isn't any point if
    parentheses will be used anyway.

    Parentheses are not always used. Some programmers know the
    precedence rules well enough, and expect their readers to know them
    well enough, that they don't need to add parentheses. I don't bother
    with parentheses in `a = b + c` or `a + b * c`. Others might not
    bother with parentheses in more obscure cases where I would use them.

    C compilers must implement the rules as specified in the standard.
    Future editions of the standard are unlikely to reorder the
    precedence rules, since that would quietly break existing code.

    C programmers may or may not choose to remember and/or take advantage
    of all the precedence rules. I haven't memorized all of them myself.

    [snip]

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Lawrence D'Oliveiro@3:633/10 to All on Fri May 29 23:18:42 2026
    On Fri, 29 May 2026 12:19:04 +0100, Bart wrote:

    * Why do bitwise & | ^ need their own level anyway

    So that you can do shifting and masking with minimal parentheses.

    * Why do << >> have their own level anyway

    So that shift expressions can use common arithmetic operators with
    minimal parentheses.

    Further, here: 'a * b + c' the multiplication is done first, but
    here:

    a *= b += c

    It is done second.

    That kind of thing is disallowed in Python, for some reason.

  • From Lawrence D'Oliveiro@3:633/10 to All on Fri May 29 23:20:49 2026
    On Fri, 29 May 2026 08:09:53 +0200, Bonita Montero wrote:

    There's no language where the users are so detail focussed and open
    to new features [than C++].

    But they still don't have "try-finally".

  • From Bart@3:633/10 to All on Sat May 30 01:26:47 2026
    On 30/05/2026 00:18, Lawrence D'Oliveiro wrote:
    On Fri, 29 May 2026 12:19:04 +0100, Bart wrote:

    * Why do bitwise & | ^ need their own level anyway

    So that you can do shifting and masking with minimal parentheses.

    Can you give examples?

    Because you can do 'a << b & c' without << >> needing their own private
    level; it only needs to be lower than bitwise ops.

    For example they could have the same level as * and / as they
    essentially do the same thing.

    * Why do << >> have their own level anyway

    So that shift expressions can use common arithmetic operators with
    minimal parentheses.

    Again, examples?



    Further, here: 'a * b + c' the multiplication is done first, but
    here:

    a *= b += c

    It is done second.

    That kind of thing is disallowed in Python, for some reason.

    I disallow it too (in my stuff). It's too confusing, no matter that it is
    100% unambiguous according to some arcane language rules.

  • From James Kuyper@3:633/10 to All on Fri May 29 20:31:50 2026
    On 2026-05-29 18:52, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    ...
    Of course, I want the answer to be that there isn't any point if
    parentheses will be used anyway.

    The answer, of course, is that the condition of your "if" clause is not
    true. In the overwhelming majority of the cases, people do not use
    parentheses to clarify the order of evaluation that is guaranteed by C's grammar rules. They only use them in the cases where they feel that
    there's a significant chance of confusion. Of course, that depends upon
    your audience. If I was required to write code in such a way that you
    would have trouble misunderstanding it, I'd write

    a = m*x + b;

    as

    a = ((m*x)+b);

    I internalized C's grammar rules a long time ago (which causes problems
    on the rare occasions when they've changed them). The main exception are
    the bit-wise operators which are well known as having the wrong
    precedence - but I've seldom needed to use those operators.
    As a result, I seldom have any confusion as to the order of evaluation,
    which makes it very hard for me to realize that it might be a good idea
    to put in some redundant parentheses to clarify that order for other
    people. That means that some of my code is probably more cryptic than it
    should be.
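
    The usual illustration of the bitwise precedence problem is a mask test
    written without parentheses. A minimal sketch, with invented function
    names; == binds tighter than &, so the first version does not test what
    it appears to test (and many compilers warn about it).

```c
#include <assert.h>

/* Looks like "are the masked bits clear?", but parses as
 * flags & (mask == 0), which is a different question. */
int bits_clear_buggy(unsigned flags, unsigned mask)
{
    return flags & mask == 0;
}

/* Parentheses give the intended meaning. */
int bits_clear(unsigned flags, unsigned mask)
{
    return (flags & mask) == 0;
}
```

    With flags = 4 and mask = 1 the masked bit really is clear, so
    bits_clear returns 1, while the unparenthesized version returns 0.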

  • From Bart@3:633/10 to All on Sat May 30 02:03:54 2026
    On 30/05/2026 01:31, James Kuyper wrote:
    On 2026-05-29 18:52, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    ...
    Of course, I want the answer to be that there isn't any point if
    parentheses will be used anyway.

    The answer, of course, is that the condition of your "if" clause is not
    true. In the overwhelming majority of the cases, people do not use parentheses to clarify the order of evaluation that is guaranteed by C's grammar rules. They only use them in the cases where they feel that
    there's a significant chance of confusion.

    Those are the cases we're talking about! That is:

    << >> & | ^

    Maybe add == != and < <= >= > if someone wants to take advantage of
    their different levels, but I guess 99% wouldn't even know about that.

    Most of the rest, there tends to be agreement across languages:

    school arithmetic group - comparisons - logical and/or

    I haven't included ?: as that's too weird.

  • From Keith Thompson@3:633/10 to All on Fri May 29 19:02:36 2026
    Bart <bc@freeuk.com> writes:
    On 30/05/2026 01:31, James Kuyper wrote:
    On 2026-05-29 18:52, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    ...
    Of course, I want the answer to be that there isn't any point if
    parentheses will be used anyway.

    The answer, of course, is that the condition of your "if" clause is
    not true. In the overwhelming majority of the cases, people do not
    use parentheses to clarify the order of evaluation that is guaranteed
    by C's grammar rules. They only use them in the cases where they feel
    that there's a significant chance of confusion.

    Those are the cases we're talking about! That is:

    << >> & | ^

    Maybe add == != and < <= >= > if someone wants to take advantage of
    their different levels, but I guess 99% wouldn't even know about that.

    Most of the rest, there tends to be agreement across languages:

    school arithmetic group - comparisons - logical and/or

    I haven't included ?: as that's too weird.

    So what is your question? I had thought that you meant to ask why
    Ritchie defined the precedences that way, but apparently that's
    not what you meant.

    Do you even have a question? Is there anything anyone could tell
    you that you don't think you already know?

    If you have a question, can you restate it in unambiguous terms?
    If not, what are we talking about?

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Lawrence D'Oliveiro@3:633/10 to All on Sat May 30 03:49:45 2026
    On Fri, 29 May 2026 05:20:20 -0500, BGB wrote:

    To be excluded from being syntactic sugar, it needs to be something
    that is not generally possible to express within the base language.

    So, for example: Things like operator overloading or classes are
    syntactic sugar IMO, as what they do can be expressed in C, even if
    a lot less pretty (or far from an idiomatic style).

    I would not consider exceptions or RTTI as syntactic sugar, because
    these involve things that do not map to native C.

    But surely *anything* that "is not generally possible to express
    within the base language" would "not map to native C".

  • From Lawrence D'Oliveiro@3:633/10 to All on Sat May 30 04:25:31 2026
    On Sat, 30 May 2026 01:26:47 +0100, Bart wrote:

    On 30/05/2026 00:18, Lawrence D'Oliveiro wrote:

    On Fri, 29 May 2026 12:19:04 +0100, Bart wrote:

    * Why do bitwise & | ^ need their own level anyway

    So that you can do shifting and masking with minimal parentheses.

    Can you give examples?

    You haven't done much bit manipulation, have you?

    Extracting RGB components from a pixel:

        const unsigned int
            r = pixel >> 16 & 255,
            g = pixel >> 8 & 255,
            b = pixel & 255;

    Combining RGBA components into a pixel:

        colors[i] =
                channel[0] << 24
            |
                channel[1] << 16
            |
                channel[2] << 8
            |
                channel[3];

    * Why do << >> have their own level anyway

    So that shift expressions can use common arithmetic operators with
    minimal parentheses.

    Again, examples?

    From the same code module, putting together a subpicture image
    consisting of 2 bits per pixel:

    pixbuf[bufpixels / 4] |= histogram[histindex].index << bufpixels % 4 * 2;

    <https://bitbucket.org/ldo17/dvd_menu_animator/src/master/spuhelper.c>
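
    The three snippets above can be wrapped into a compilable sketch like
    the following. The function names, the Rgb struct and the fixed-width
    types are assumptions added here, not taken from the linked source. The
    casts to uint32_t are also an addition: if the channel values are
    narrower than int they get promoted to signed int, and shifting a value
    of 128 or more left by 24 would overflow.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { unsigned r, g, b; } Rgb;  /* illustrative helper type */

/* >> binds tighter than &, so no parentheses are needed here. */
Rgb unpack_rgb(uint32_t pixel)
{
    Rgb c;
    c.r = pixel >> 16 & 255;
    c.g = pixel >> 8 & 255;
    c.b = pixel & 255;
    return c;
}

/* << binds tighter than |; the casts bind tighter than both. */
uint32_t pack_rgba(const uint8_t channel[4])
{
    return (uint32_t)channel[0] << 24
         | (uint32_t)channel[1] << 16
         | (uint32_t)channel[2] << 8
         | channel[3];
}

/* % and * bind tighter than <<, so the shift count is (bufpixels % 4) * 2. */
void put_2bpp(uint8_t *pixbuf, unsigned bufpixels, unsigned index)
{
    assert(index < 4);  /* must fit in 2 bits */
    pixbuf[bufpixels / 4] |= index << bufpixels % 4 * 2;
}
```

    For instance, unpack_rgb(0x112233) yields r = 0x11, g = 0x22, b = 0x33,
    and writing index 3 at pixel 5 sets bits 2 and 3 of the second byte.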

  • From BGB@3:633/10 to All on Sat May 30 01:04:14 2026
    On 5/29/2026 7:58 AM, Bonita Montero wrote:
    On 29.05.2026 at 11:15, BGB wrote:

    Like, if one doesn't care that the compiler takes a long time to run
    and the EXE is needlessly large, maybe OK, not great if one does care...

    Binary size doesn't matter with Windows.


    On a typical modern desktop PC...

    Does still get annoying if it is needlessly large for no good reason.

    Even if, yeah, modern PC will not care much about loading a 50 or 100MB
    EXE file...



    Having to spend minutes or more waiting for the compiler would
    seriously hurt momentum for many tasks.

    Use C++20 modules and parallel builds.


    Possibly, I have my reasons, and not all of my current development is
    limited to PC class systems.



    Say, for example, if the Boot ROM requires keeping everything under 32K.

    C++ was designed for large scale program development.
    With 32K-systems you can stick with C.


    There are intermediate options, where one's RAM is measured in MB.


    Or, basically, say imagine writing software on something where CPU
    speeds and RAM sizes are basically similar to what things were like in
    the 1990s.

    Comparably, a desktop PC is much faster, and with almost limitless RAM.


    Well, and one thing I am often messing with:
    Well, I am using a PE/COFF variant...
    But, the OS is not on Windows, and the ISA is not x86 based, ...
    Still has EXE's and DLL's though.

    Can't use C++ there, because no (native) C++ compiler exists.
    Well, except if using GCC to generate RV64G; can run RV64G on it;
    But, RV64G's performance is lacking, and the ELF files are bloated.
    Kinda sucks when a significant part of the binary is just metadata.


    Then again, the PE variant is non-standard:
    No MZ stub;
    LZ4 compression;
    Mostly using 64-byte section alignment;
    And a FileOffset==RVA constraint, ...
    Various structures have been tweaked;
    ...


    Well, and the format is itself an offshoot of a WinCE variant of the
    format rather than from mainline Windows. Well, imagine an OS that sort
    of took design inspiration both from WinCE and the Unix family OS's.

    And ended up with a CLI experience kinda similar to Cygwin. And a very
    crap attempt at making a GUI (that mostly just launches with a terminal
    window that can be used to start other programs).


    Well, a basic view of my crappy GUI can be seen here: https://www.youtube.com/watch?v=HAyMDRzxYzY

    Though, the point of this video was more Doom but with the musical notes
    replaced with DTMF-like tones (but still having octaves and similar so
    it at least sorta sounds like music). As sort of a bit of hackery with
    the MIDI playback code (tweaking the FM synthesis to play DTMF
    tones rather than the normal FM instruments).

    Well, that and another recent-ish video of Doom modified to sorta
    resemble the monochrome style of "Return of the Obra Dinn" (well, and
    also this Doom port also has a 3D glasses mod, and 3D+Obra which
    actually works pretty OK, ...).

    ...



  • From Bonita Montero@3:633/10 to All on Sat May 30 11:18:21 2026
    On 30.05.2026 at 01:20, Lawrence D'Oliveiro wrote:

    But they still don't have "try-finally".

    There's RAII:

    #pragma once
    #include <utility>
    #include <concepts>
    #include <type_traits>
    #include "nui.hpp"

    template<std::invocable Fn>
    struct defer final
    {
        defer( Fn &&fn, bool enabled = true ) :
            m_fn( std::forward<Fn>( fn ) ),
            m_enabled( enabled )
        {
        }
        defer( defer const & ) = delete;
        void operator =( defer const & ) = delete;
        ~defer() noexcept( std::is_nothrow_invocable_v<Fn> )
        {
            if( m_enabled ) [[likely]]
                m_fn();
        }
        template<typename ... Fns>
        bool operator ()( defer<Fns> &... additional )
            noexcept( std::is_nothrow_invocable_v<Fn> && (std::is_nothrow_invocable_v<Fns> && ...) )
        {
            if( !m_enabled ) [[unlikely]]
                return false;
            m_enabled = false;
            m_fn();
            return (additional() && ...);
        }
        template<typename ... Fns>
        void disable( defer<Fns> &... additional ) noexcept
        {
            m_enabled = false;
            (additional.disable(), ...);
        }
        template<typename ... Fns>
        void enable( defer<Fns> &... additional ) noexcept
        {
            m_enabled = true;
            (additional.enable(), ...);
        }
    private:
        NO_UNIQUE_ADDRESS Fn m_fn;
        bool m_enabled;
    };

  • From Bart@3:633/10 to All on Sat May 30 12:01:27 2026
    On 30/05/2026 05:25, Lawrence D'Oliveiro wrote:
    On Sat, 30 May 2026 01:26:47 +0100, Bart wrote:

    On 30/05/2026 00:18, Lawrence D'Oliveiro wrote:

    On Fri, 29 May 2026 12:19:04 +0100, Bart wrote:

    * Why do bitwise & | ^ need their own level anyway

    So that you can do shifting and masking with minimal parentheses.

    Can you give examples?

    You haven't done much bit manipulation, have you?

    Extracting RGB components from a pixel:

    const unsigned int
    r = pixel >> 16 & 255,
    g = pixel >> 8 & 255,
    b = pixel & 255;

    This merely requires >>'s precedence to be higher than &'s.

    It doesn't need & | ^ to be distinct (only one is used here anyway).

    It doesn't need << >> to be in a distinct group from multiply or add groups.


    Combining RGBA components into a pixel:

    colors[i] =
    channel[0] << 24
    |
    channel[1] << 16
    |
    channel[2] << 8
    |
    channel[3];


    Exactly the same applies here. But if one of those | was & or ^, then
    you might start needing parentheses.
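
    The two quoted fragments (extracting and combining channels) can be
    sketched as a pair of functions. This is only an illustration: the names
    pack_rgba, unpack_channel and demo_rgba are made up here, and the casts to
    uint32_t are an addition so that "<< 24" is never applied to a promoted
    signed int.

```c
#include <assert.h>
#include <stdint.h>

/* Combine four 8-bit channels into one 32-bit pixel.
   No parentheses are needed because << binds more tightly than |.
   The casts keep the shifts in unsigned 32-bit arithmetic. */
static uint32_t pack_rgba(const uint8_t channel[4])
{
    return (uint32_t)channel[0] << 24
         | (uint32_t)channel[1] << 16
         | (uint32_t)channel[2] << 8
         | (uint32_t)channel[3];
}

/* Extract one 8-bit channel; shift must be 0, 8, 16 or 24. */
static uint32_t unpack_channel(uint32_t pixel, unsigned shift)
{
    assert(shift <= 24 && shift % 8 == 0);
    return pixel >> shift & 255;    /* parsed as (pixel >> shift) & 255 */
}

/* Round trip: pack four channels, then pull each one back out. */
static void demo_rgba(void)
{
    const uint8_t ch[4] = { 0x12, 0x34, 0x56, 0xFF };
    const uint32_t pixel = pack_rgba(ch);

    assert(pixel == 0x123456FFu);
    assert(unpack_channel(pixel, 24) == 0x12);
    assert(unpack_channel(pixel, 0) == 0xFF);
}
```

    Calling demo_rgba() from a test or from main exercises both directions.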


    * Why do << >> have their own level anyway

    So that shift expressions can use common arithmetic operators with
    minimal parentheses.

    Again, examples?

    From the same code module, putting together a subpicture image
    consisting of 2 bits per pixel:

    pixbuf[bufpixels / 4] |= histogram[histindex].index << bufpixels % 4 * 2;

    This is a better example, as is this from your link:

    pixbuf[nr_buf_pixels] = colors[srcpix >> src_pix_index * 2 & 3];

    This can indeed be written with fewer parentheses given the priority of
    << relative to * and &.

    But it is also not clear because the part after >> is sprawling. You'd
    want it like this:

    pixbuf[nr_buf_pixels] = colors[srcpix >> (src_pix_index * 2) & 3];

    Now there is less analysis to do to establish the span of the shift-count.
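
    A small check that the terse and the parenthesised spellings of that
    2-bits-per-pixel lookup select the same bits. The function names are
    hypothetical and the colors[] table is left out; only the index
    expression is compared.

```c
#include <assert.h>
#include <stdint.h>

/* Index expression as written without parentheses:
   * binds tighter than >>, and >> binds tighter than &. */
static unsigned pix2_terse(uint32_t srcpix, unsigned src_pix_index)
{
    assert(src_pix_index < 16);     /* 16 two-bit pixels per 32-bit word */
    return srcpix >> src_pix_index * 2 & 3;
}

/* The same expression with every grouping spelled out. */
static unsigned pix2_explicit(uint32_t srcpix, unsigned src_pix_index)
{
    assert(src_pix_index < 16);
    return (srcpix >> (src_pix_index * 2)) & 3;
}

/* Compare both spellings for every pixel position in one word. */
static void demo_pix2(uint32_t srcpix)
{
    for (unsigned i = 0; i < 16; i++)
        assert(pix2_terse(srcpix, i) == pix2_explicit(srcpix, i));
}
```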

    These are examples from MZLIB:

    crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
    crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b >> 4)];

    C's precedence rules say that many of those parentheses are not strictly needed, which means the following are exactly equivalent:

    crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];
    crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b >> 4];

    So why were they added? Could it be that they make things clearer?

    Remove ambiguity in the mind of the reader? Lead to fewer surprises
    when a new term needs to be added?

    With the original, NOBODY NEEDS TO CARE what the hell the precedences of
    ^ & are with respect to each other. Port the fragment to a language with
    slightly different rules and it would still work.

    Post that fragment somewhere, and people will know what it means
    *without needing to know which exact language it is*.

    This is why I think it is pointless to devote 4 dedicated levels to <<
    & | ^, and poor to rely on them for the meaning of your code.
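
    The equivalence claimed above for the two CRC lines can be checked
    mechanically. In this sketch the table contents are arbitrary placeholder
    values (an assumption, not the real CRC-32 nibble table), since only the
    grouping of the operators is being tested; the function names are made up
    for the test.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder table: arbitrary distinct values, NOT real CRC constants. */
static const uint32_t s_crc32[16] = {
    0x00000000u, 0x11111111u, 0x22222222u, 0x33333333u,
    0x44444444u, 0x55555555u, 0x66666666u, 0x77777777u,
    0x88888888u, 0x99999999u, 0xAAAAAAAAu, 0xBBBBBBBBu,
    0xCCCCCCCCu, 0xDDDDDDDDu, 0xEEEEEEEEu, 0xFFFFFFFFu
};

/* The two lines as posted, with the "redundant" parentheses. */
static uint32_t step_parens(uint32_t crcu32, uint8_t b)
{
    crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
    crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b >> 4)];
    return crcu32;
}

/* The same two lines relying only on C's precedence rules. */
static uint32_t step_bare(uint32_t crcu32, uint8_t b)
{
    crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];
    crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b >> 4];
    return crcu32;
}

/* Compare both versions for every byte value and one starting CRC. */
static void demo_crc(uint32_t start)
{
    for (unsigned b = 0; b < 256; b++)
        assert(step_parens(start, (uint8_t)b) == step_bare(start, (uint8_t)b));
}
```

    A compiler with parentheses warnings enabled may well flag step_bare,
    which is itself a hint about how the unparenthesised form reads.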





  • From Bart@3:633/10 to All on Sat May 30 12:12:29 2026
    On 30/05/2026 03:02, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 30/05/2026 01:31, James Kuyper wrote:
    On 2026-05-29 18:52, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    ...
    Of course, I want the answer to be that there isn't any point if
    parentheses will be used anyway.

    The answer, of course, is that the condition of your "if" clause is
    not true. In the overwhelming majority of the cases, people do not
    use parentheses to clarify the order of evaluation that is guaranteed
    by C's grammar rules. They only use them in the cases where they feel
    that there's a significant chance of confusion.

    Those are the cases we're talking about! That is:

    << >> & | ^

    Maybe add == != and < <= >= > if someone wants to take advantage of
    their different levels, but I guess 99% wouldn't even know about that.

    Most of the rest, there tends to be agreement across languages:

    school arithmetic group - comparisons - logical and/or

    I haven't included ?: as that's too weird.

    So what is your question? I had thought that you meant to ask why
    Ritchie defined the precedences that way, but apparently that's
    not what you meant.

    You seem to have a problem with context. I was replying to JK who was
    replying to a quote of mine from within one of your posts (maybe he's killfiled me so couldn't respond directly).

    There was no question posted. He suggested that most of the time,
    parentheses are not used and gave examples using * and +.

    My original remarks were about the widespread use of parentheses to
    clarify the grouping of operators with the more obscure priorities, and
    my reply addressed that.

    See also the examples I posted a few minutes ago involving >> & and ^.

    That is, if you are interested in my point, which I doubt. You seem more intent on some personal campaign.

  • From David Brown@3:633/10 to All on Sat May 30 13:52:43 2026
    On 29/05/2026 22:16, BGB wrote:
    On 5/29/2026 6:22 AM, David Brown wrote:
    On 29/05/2026 12:20, BGB wrote:
    On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
    On 2026-05-28 11:57, BGB wrote:
    On 5/28/2026 2:18 AM, Janis Papanagnou wrote:
    On 2026-05-28 01:49, BGB wrote:
    [...]


    But, not really an "easy" way to avoid bloat, other than to write
    code specifically for what cases are relevant; while also avoiding
    needless duplication and copy paste (where, overuse of copy/paste
    can also lead to bloat; along with turning the code into an ugly
    mess).

    Hmm.. - as said, during the very early days there were issues; I
    recall on one platform duplication of template code in more than
    one source unit. And/or some environmental hacks (of the compiler)
    to deposit template code for linking. In the later days I've not
    seen such immature things anymore.


    Possibly, a lot could depend on how one is counting things as well.


    In a lot of cases when using GCC, I end up using:
      -ffunction-sections -fdata-sections -Wl,-gc-sections

    On many targets, "-fdata-sections" can lead to noticeably larger and
    slower code because it effectively eliminates section anchor
    optimisations. It does not negatively affect x86 AFAICS, because x86
    does not use section anchors.

    <https://godbolt.org/z/zeoq41Y7d>

    With -fsection-anchors (enabled with optimisation on targets that
    support it - generally RISCy load/store architectures), program-
    lifetime variables are kept together in a lump (as though they were in
    a struct) and often addressed by a pointer to that pretend struct.
    Thus if a function accesses two variables "a" and "b", instead of
    having to load the addresses of each of "a" and "b" into separate
    registers, it loads an "anchor" into one register and accesses the
    variables with reg+offset addressing.
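
    As a source-level analogy only (this is not what GCC emits, and the
    names below are invented for the illustration), the "pretend struct"
    idea looks roughly like this: separate variables each need their own
    address, while members of one struct share a base pointer.

```c
#include <assert.h>

/* Two independent static-lifetime variables: without anchors, a
   load/store target may need to materialise two separate addresses. */
static int a;
static int b;

int sum_separate(void)
{
    return a + b;
}

/* The same data grouped as if by a section anchor: one base address,
   then base+offset accesses for each member. */
struct anchored { int a; int b; };
static struct anchored vars;

int sum_anchored(void)
{
    const struct anchored *anchor = &vars;  /* one address load */
    return anchor->a + anchor->b;           /* offset 0 and sizeof(int) */
}

/* Store the same values in both layouts and confirm they agree. */
void set_and_check(int x, int y)
{
    a = x;
    b = y;
    vars.a = x;
    vars.b = y;
    assert(sum_separate() == sum_anchored());
}
```

    With "-fdata-sections" each variable lands in its own section, so the
    compiler can no longer assume a fixed distance between "a" and "b".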

    I've seen "-fdata-sections" used regularly in embedded systems - it is
    almost always a bad idea.

    ("-ffunction-sections" is often very helpful to reduce code image
    size, so keep that one.)


    Both seem to help on x86, x86-64, and also on RISC-V, at making GCC's
    output at least sorta space-comparable to my own compilers.

    The merit of "-fdata-sections" is mostly that it eliminates unused
    global variables; whereas "-ffunction-sections" eliminates unreachable functions.

    That is the point of them, yes. "-ffunction-sections" can be useful at removing unused code from more general code. For microcontrollers,
    SDK's and manufacturers' driver code will normally contain a large
    number of functions that can be eliminated in this way, saving a lot of
    code space.

    However, in practice, "-fdata-sections" rarely eliminates a significant
    amount - most programs do not have large amounts of statically-allocated
    data that is not used. Gcc, and I think most other compilers, put the
    static lifetime data for each translation unit in its own section, so if
    no data from a translation unit is used it will be eliminated at link
    time even with -fno-data-sections. And of course it makes no difference
    for heap data or stack data.

    In my testing, "-ffunction-sections" is absolutely worth using (on
    targets where code space is relevant - there's no need for PC software).
    On some targets, it may mean a few lost opportunities for shorter
    jump/call instructions between functions in the same translation unit,
    but the cost is rarely anything more than a slightly longer link time.
    But "-fdata-sections" typically gives almost no ram space savings, and
    makes code bigger and slower.

    As I noted, gcc on x86 does not support section anchors, so there is not
    likely to be much code cost for -fdata-sections.

    Where section anchors shine - and where -fdata-sections therefore has
    cost - is when a function needs to access more than one piece of static
    lifetime data defined in the same translation unit (or another
    translation unit if you are using LTO). That happens a lot in embedded
    ARM programming at least. I don't know about RISC-V. If the target
    normally uses a "small data section" for ram (I know this is common on
    PowerPC), then there is, in effect, a program-wide section anchor
    already. So it is possible that relatively few targets have section
    anchors - but the 32-bit ARM on gcc is a vastly popular choice in the
    embedded world, so it is important to understand the cost of this
    compiler flag for that target at least.


    Neither is needed with my own compiler, which compiles things in a way
    such that it eliminates anything that is unreachable.

    [...]


    That might be the case for a very simplistic compiler. With an
    optimising compiler, these extra variables will quickly be eliminated.
    If the compiler has a good scheduling model of the device, it will do
    whatever instruction scheduling works best for that processor. If the
    model is not good enough, it will be suboptimal. I would not,
    however, expect any difference in the generated code for the two code
    snippets.

    Sometimes this kind of "manual optimisation" is helpful when you have
    to try to get efficient results from a weak compiler, however.


    Possibly, but this sort of thing can help with both BGBCC and with MSVC
    IME

    I don't tend to think of MSVC as a highly optimising compiler - but it
    is not a tool I have much use for, as it does not handle the targets I
    need. When I have sometimes looked at the generated code on godbolt, it
    has not impressed me at all. So it could well fall into the "helpful
    when using a weaker compiler" category.





    Usual strategy is to try to limit how much code is written, and
    also to avoid doing things in ways that result in too much code, or
    too much cruft.

    Best to avoid both copy paste when reasonable, and sticking
    anything non-trivial in macros.

    We avoided macros if possible.


    They are de-facto for constants and similar, but for longer stuff is
    better avoided.

    Macros are rarely the best way to define constants. They are needed
    if you are using the constants for pre-processor stuff like
    conditional compilation. But generally you get clearer code, better
    typing, and potentially several other benefits from using alternative
    choices like "enum" (even for stand-alone integer constants), "static
    const" variables, and in C23, "constexpr" variables. There's no doubt
    that a lot of code /does/ use macros for constants, but I view it as a
    relic of the past rather than good coding practice.


    They are traditional...

    Like:
      static const double M_PI = 3.14159265358979;

    Could also make sense, but people don't usually do this, they usually
    use macros...

    They should not do so (IMHO, of course). Yes, macros are traditional -
    but there are no plus sides to using them for this kind of thing.
    (There are no plus sides to using all-caps either, but people do that too.)
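
    A minimal sketch of the macro-free alternatives mentioned above, using
    invented lowercase names (max_channels, pi_approx). The C23 "constexpr"
    form is shown only in a comment so that the code still builds with
    older compilers.

```c
#include <assert.h>
#include <stddef.h>

/* An enum constant is an integer constant expression, so it can size
   arrays and appear in case labels, unlike a "static const int" in C. */
enum { max_channels = 4 };

/* A typed, scoped floating-point constant instead of a #define. */
static const double pi_approx = 3.14159265358979;

/* In C23 this could instead be written:
       constexpr double pi_approx = 3.14159265358979;               */

static double gains[max_channels];

static size_t gain_count(void)
{
    return sizeof gains / sizeof gains[0];
}

static double circle_area(double radius)
{
    assert(radius >= 0.0);
    return pi_approx * radius * radius;
}
```

    Note that naming such a variable M_PI would clash with the macro of the
    same name that many <math.h> implementations provide as an extension.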

    (I'm snipping all the details of your own C compiler, because there is
    very little I can comment on.)




    But, things can be considered in relative terms:
    Like, C++ may carry various penalties vs C.


    I don't find C++ carries noticeable penalties compared to C, for my
    embedded work. But I do disable exceptions and RTTI - exceptions may
    have very little run-time overhead, but the unwind tables can be
    significant when code size is important in small systems.


    Yes, that is the main thing.
      They carry zero performance penalty in practice;
      But, have a non-zero penalty for image size.

    Not enough to be a deal-breaker towards using them if they are used, but enough that one wants them disabled if not used...


    Agreed.

    (I could also note that I make heavy use of templates in C++ code - it
    often leads to smaller and faster results.)





  • From David Brown@3:633/10 to All on Sat May 30 14:07:02 2026
    On 29/05/2026 19:53, Bart wrote:
    On 29/05/2026 18:29, tTh wrote:
    On 5/29/26 15:22, Bart wrote:

    Something you might do when you have time (as I'm busy), is to
    analyse the expressions in some C codebases, and isolate those where
    removal of parentheses that group terms, would result in exactly the
    same shape of expressions, and are therefore redundant.

       This is a strange exercise. When I write a complex expression,
       I sometimes use redundant parentheses for the clarity of
       my intentions about this computation. I'm thinking that
       those extra (()) are a sort of in-line comments.


    Sure, but some here like to say that such expressions, if they still
    work without parentheses, are unambiguous anyway.

    They are obviously unambiguous to compilers, and they are unambiguous to people who either know the precedence rules, or are able to look them up
    to be sure of them at the time. For those that don't know the rules and
    would rather guess randomly, they might misinterpret the expressions but
    the expressions themselves are still unambiguous. So yes, complex
    expressions without parentheses /are/ unambiguous.

    However, being unambiguous does not mean people will not make mistakes
    when reading or writing them, or that they can read and write them
    correctly without effort. As you say, people are not compilers.

    Parentheses can certainly reduce the cognitive effort for people reading
    or writing complex expressions, and can significantly reduce the risk of errors. Extra local variables for sub-expressions can do this too
    (especially when the language has good scoping rules and allows
    variables to be declared when you need them).

    So it is wrong to suggest that expressions written in C without extra parentheses are somehow "ambiguous" - but it is correct to say that
    adding extra parentheses (within reason) can often help the readability
    of code.

    And this applies equally in all languages, no matter what precedence the operators have, or how many levels there are. I can certainly agree
    with you that C would have been slightly nicer if the bitwise operators
    and equality operators were at different precedences. There are a
    number of changes I would have preferred - some of which you would agree
    with, some not. But even if C were to have those changes overnight, it
    would not change anything about what I wrote above. Regardless of the operator precedence, expressions are not ambiguous, but parentheses or splitting into sub-expressions can make code clearer.


  • From Dan Cross@3:633/10 to All on Sat May 30 12:29:15 2026
    In article <10vd1tu$ekvl$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 29/05/2026 21:56, Keith Thompson wrote:
    [snip]
    Upthread, you asked a question:

    And then the point becomes, if you always add the parentheses, what
    was the point of having that particular precedence level?

    You've made it clear that you were never interested in an answer.

    You said this:

    "You're asking why C is designed the way it is. We could waste a
    great deal of time and effort answering that for you. There are
    numerous documents about the design and history of C, and of
    its ancestor languages. I could provide you with links."

    Actually I'm not asking why C is like that. We're already there.

    I'm saying that there is no value in those extra levels, some people
    think there is, and I'm arguing about that. I was replying to tTh.

    As for my question, what /is/ the point? I'm still waiting!

    To clarify: the question is, what is the point of those levels?

    How is that different from asking "why C is like that"?

    Of course, I want the answer to be that there isn't any point if
    parentheses will be used anyway.

    There is a point, but it is history. That is, the "point" of
    those precedence levels is the history and evolution of the
    language.

    In PL/1 and early C, `|` and `&` were logical operators. The
    short-circuiting `||` and `&&` came later, but the low
    precedence for `|` and `&` was already baked in.

    That's the point: the precedence reflects the original use as
    boolean operators, not how things evolved for use almost purely
    as bitwise operators.
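
    A short illustration of what that inherited precedence means today.
    The function names are hypothetical; the point is only how each
    expression groups. Compilers with parentheses warnings enabled will
    likely complain about the first and third functions.

```c
#include <assert.h>

/* Intended: "are none of the mask bits set?"
   Actual parse: flags & (mask == 0), because == binds tighter than &. */
static int no_bits_set_wrong(unsigned flags, unsigned mask)
{
    return flags & mask == 0;
}

/* The bitwise reading has to be requested explicitly. */
static int no_bits_set_right(unsigned flags, unsigned mask)
{
    return (flags & mask) == 0;
}

/* The boolean-style use the precedence was made for:
   parsed as (a == b) & (c == d), a non-short-circuit "and". */
static int both_equal(int a, int b, int c, int d)
{
    return a == b & c == d;
}

static void demo_precedence(void)
{
    assert(no_bits_set_right(4u, 3u) == 1);  /* 4 & 3 is 0          */
    assert(no_bits_set_wrong(4u, 3u) == 0);  /* 4 & (3 == 0) is 0   */
    assert(both_equal(1, 1, 2, 2) == 1);
}
```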

    - Dan C.


  • From Janis Papanagnou@3:633/10 to All on Sat May 30 14:40:57 2026
    On 2026-05-30 13:52, David Brown wrote:
    On 29/05/2026 22:16, BGB wrote:
    On 5/29/2026 6:22 AM, David Brown wrote:
    On 29/05/2026 12:20, BGB wrote:
    On 5/29/2026 2:52 AM, Janis Papanagnou wrote:

    We avoided macros if possible.

    They are de-facto for constants and similar, but for longer stuff is
    better avoided.

    Macros are rarely the best way to define constants. They are needed
    if you are using the constants for pre-processor stuff like
    conditional compilation. But generally you get clearer code, better
    typing, and potentially several other benefits from using alternative
    choices like "enum" (even for stand-alone integer constants), "static
    const" variables, and in C23, "constexpr" variables. There's no
    doubt that a lot of code /does/ use macros for constants, but I view
    it as a relic of the past rather than good coding practice.

    They are traditional...

    Like:
      static const double M_PI = 3.14159265358979;

    Could also make sense, but people don't usually do this, they usually
    use macros...

    They should not do so (IMHO, of course). Yes, macros are traditional -
    but there are no plus sides to using them for this kind of thing. (There
    are no plus sides to using all-caps either, but people do that too.)

    Because in the early days Cpp constants were used, and Cpp stuff was often
    capitalized[*]. Our C++ coding rules back then had mandated lowercase
    also for constants, but strangely some folks were so used to uppercase
    Cpp literals that they disliked writing constants (as other objects)
    in lowercase, and stated opinions were sometimes heated like religious
    topics.

    I wonder what lexical convention regular "C" (or C++) programmers here
    use for constants nowadays.

    Curiously I inspected my latest C-source to see what convention I've
    actually followed recently. But I noticed that I had no hard constants
    used at all; all parameters came from a configuration file and through
    the command line interface. (That makes sense, I guess.)

    Janis

    [*] Strangely there were C-function-macros that were written lowercase,
    though.

    [...]


  • From Bart@3:633/10 to All on Sat May 30 13:56:48 2026
    On 30/05/2026 13:29, Dan Cross wrote:
    In article <10vd1tu$ekvl$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 29/05/2026 21:56, Keith Thompson wrote:
    [snip]
    Upthread, you asked a question:

    And then the point becomes, if you always add the parentheses, what
    was the point of having that particular precedence level?

    You've made it clear that you were never interested in an answer.

    You said this:

    "You're asking why C is designed the way it is. We could waste a
    great deal of time and effort answering that for you. There are
    numerous documents about the design and history of C, and of
    its ancestor languages. I could provide you with links."

    Actually I'm not asking why C is like that. We're already there.

    I'm saying that there is no value in those extra levels, some people
    think there is, and I'm arguing about that. I was replying to tTh.

    As for my question, what /is/ the point? I'm still waiting!

    To clarify: the question is, what is the point of those levels?

    How is that different from asking "why C is like that"?

    My question is actually independent of C or its history.

    I accept those levels exist. I was asking do they currently serve a
    useful purpose.

    If not, people can choose to ignore them when writing C code, for
    example like this where all () are technically superfluous:

    crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];

    And they can choose to not adopt them when devising new languages,
    however many still do faithfully recreate the same pattern, with a few
    notable exceptions such as Go lang.





  • From David Brown@3:633/10 to All on Sat May 30 16:36:12 2026
    On 30/05/2026 14:40, Janis Papanagnou wrote:
    On 2026-05-30 13:52, David Brown wrote:
    On 29/05/2026 22:16, BGB wrote:
    On 5/29/2026 6:22 AM, David Brown wrote:
    On 29/05/2026 12:20, BGB wrote:
    On 5/29/2026 2:52 AM, Janis Papanagnou wrote:

    We avoided macros if possible.

    They are de-facto for constants and similar, but for longer stuff
    is better avoided.

    Macros are rarely the best way to define constants. They are needed
    if you are using the constants for pre-processor stuff like
    conditional compilation. But generally you get clearer code, better
    typing, and potentially several other benefits from using
    alternative choices like "enum" (even for stand-alone integer
    constants), "static const" variables, and in C23, "constexpr"
    variables. There's no doubt that a lot of code /does/ use macros
    for constants, but I view it as a relic of the past rather than good
    coding practice.

    They are traditional...

    Like:
      static const double M_PI = 3.14159265358979;

    Could also make sense, but people don't usually do this, they usually
    use macros...

    They should not do so (IMHO, of course). Yes, macros are traditional
    - but there are no plus sides to using them for this kind of thing.
    (There are no plus sides to using all-caps either, but people do that
    too.)

    Because in the early days Cpp constants were used, and Cpp stuff was often
    capitalized[*]. Our C++ coding rules back then had mandated lowercase
    also for constants, but strangely some folks were so used to uppercase
    Cpp literals that they disliked writing constants (as other objects)
    in lowercase, and stated opinions were sometimes heated like religious
    topics.

    I wonder what lexical convention regular "C" (or C++) programmers here
    use for constants nowadays.

    Curiously I inspected my latest C-source to see what convention I've
    actually followed recently. But I noticed that I had no hard constants
    used at all; all parameters came from a configuration file and through
    the command line interface. (That makes sense, I guess.)

    Janis

    [*] Strangely there were C-function-macros that were written lowercase, though.


    I think there is a reasonable case for all-caps for macros that are
    doing something "weird", as a warning to users. You know that it's
    risky trying to write "MAX(a++, b++)", as it might evaluate one or both
    of the parameter expressions twice. But if you also use all-caps for well-behaved macros, that dilutes the warning effect.
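
    The double evaluation is easy to demonstrate. This sketch assumes the
    common textbook definition of MAX (the post does not give one), and
    max_int is a hypothetical function equivalent used for comparison.

```c
#include <assert.h>

/* Classic function-like macro: each argument may be evaluated twice. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Function version: each argument is evaluated exactly once. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

static void demo_max(void)
{
    int a = 1, b = 5;

    /* Expands to ((a++) > (b++) ? (a++) : (b++)):
       the comparison bumps both, then the chosen branch bumps b again. */
    int m = MAX(a++, b++);
    assert(m == 6 && a == 2 && b == 7);

    a = 1;
    b = 5;
    m = max_int(a++, b++);      /* arguments are 1 and 5, evaluated once */
    assert(m == 5 && a == 2 && b == 6);
}
```

    The macro result is well defined here, since ?: has a sequence point
    after its first operand, but it is not the value a caller would expect.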

    I use all-caps for define names that I expect to come from outside the
    source files - like a command line flag "-DPROG_VARIANT=2" in a
    makefile, and that kind of thing. That, to me, counts as a "weird" macro.

    I am happy to use macros where they make sense, but I would not use a
    macro if a static const, enum, static inline function, or constexpr
    variable will do just as well.



  • From BGB@3:633/10 to All on Sat May 30 15:48:28 2026
    On 5/30/2026 6:52 AM, David Brown wrote:
    On 29/05/2026 22:16, BGB wrote:
    On 5/29/2026 6:22 AM, David Brown wrote:
    On 29/05/2026 12:20, BGB wrote:
    On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
    On 2026-05-28 11:57, BGB wrote:
    On 5/28/2026 2:18 AM, Janis Papanagnou wrote:
    On 2026-05-28 01:49, BGB wrote:
    [...]


    But, not really an "easy" way to avoid bloat, other than to write
    code specifically for what cases are relevant; while also avoiding
    needless duplication and copy paste (where, overuse of copy/paste
    can also lead to bloat; along with turning the code into an ugly
    mess).

    Hmm.. - as said, during the very early days there were issues; I
    recall on one platform duplication of template code in more than
    one source unit. And/or some environmental hacks (of the compiler)
    to deposit template code for linking. In the later days I've not
    seen such immature things anymore.


    Possibly, a lot could depend on how one is counting things as well.


    In a lot of cases when using GCC, I end up using:
      -ffunction-sections -fdata-sections -Wl,-gc-sections

    On many targets, "-fdata-sections" can lead to noticeably larger and
    slower code because it effectively eliminates section anchor
    optimisations. It does not negatively affect x86 AFAICS, because x86
    does not use section anchors.

    <https://godbolt.org/z/zeoq41Y7d>

    With -fsection-anchors (enabled with optimisation on targets that
    support it - generally RISCy load/store architectures), program-
    lifetime variables are kept together in a lump (as though they were
    in a struct) and often addressed by a pointer to that pretend struct.
    Thus if a function accesses two variables "a" and "b", instead of
    having to load the addresses of each of "a" and "b" into separate
    registers, it loads an "anchor" into one register and accesses the
    variables with reg+offset addressing.

    I've seen "-fdata-sections" used regularly in embedded systems - it
    is almost always a bad idea.

    ("-ffunction-sections" is often very helpful to reduce code image
    size, so keep that one.)


    Both seem to help on x86, x86-64, and also on RISC-V, at making GCC's
    output at least sorta space-comparable to my own compilers.

    The merit of "-fdata-sections" is mostly that it eliminates unused
    global variables; whereas "-ffunction-sections" eliminates unreachable
    functions.

    That is the point of them, yes. "-ffunction-sections" can be useful at
    removing unused code from more general code. For microcontrollers,
    SDK's and manufacturers' driver code will normally contain a large
    number of functions that can be eliminated in this way, saving a lot of
    code space.

    However, in practice, "-fdata-sections" rarely eliminates a significant
    amount - most programs do not have large amounts of statically-allocated
    data that is not used. Gcc, and I think most other compilers, put the
    static lifetime data for each translation unit in its own section, so if
    no data from a translation unit is used it will be eliminated at link
    time even with -fno-data-sections. And of course it makes no difference
    for heap data or stack data.


    The main place it makes a difference is global arrays from a translation
    unit that is included, but for functions that are not included.

    Also functions with large static arrays.


    void SomeFunc()
    {
        static char buf[4096];
        ...
    }

    Where, say, eliminating SomeFunc does not necessarily eliminate buf.


    In my testing, "-ffunction-sections" is absolutely worth using (on
    targets where code space is relevant - there's no need for PC software).
    On some targets, it may mean a few lost opportunities for shorter
    jump/call instructions between functions in the same translation unit,
    but the cost is rarely anything more than a slightly longer link time.
    But "-fdata-sections" typically gives almost no ram space savings, and
    makes code bigger and slower.

    As I noted, gcc on x86 does not support section anchors, so there is not
    likely to be much code cost for -fdata-sections.

    Where section anchors shine - and where -fdata-sections therefore has
    cost - is when a function needs to access more than one piece of static
    lifetime data defined in the same translation unit (or another
    translation unit if you are using LTO). That happens a lot in embedded
    ARM programming at least. I don't know about RISC-V. If the target
    normally uses a "small data section" for ram (I know this is common on
    PowerPC), then there is, in effect, a program-wide section anchor
    already. So it is possible that relatively few targets have section
    anchors - but the 32-bit ARM on gcc is a vastly popular choice in the
    embedded world, so it is important to understand the cost of this
    compiler flag for that target at least.


    It depends on the way it is built.


    A lot of times though (for non-relocatable static-linked binaries) it
    mostly tends to use AUIPC+LD or AUIPC+ST pairs to access global
    variables. There is a Global Pointer that needs to be loaded when the
    binary is started, unclear what it is used for exactly.


    in PIC/PIE binaries, it uses AUIPC+ADDI to get a GOT pointer and then
    uses the GOT pointers to access global variables (via fetching the
    address of the variable from the GOT).


    Can note that BGBCC targeting RV works differently, instead using GP to
    access global variables, and clustering the commonly accessed global
    variables around GP (GP is initialized to point at the start of
    the ".data" section for the main EXE at program startup, though in my
    ABI this may actually be a copy allocated elsewhere in RAM, and not
    actually pointing at the version of the section located in the original
    PE image; note that the loader also applies base relocs for the data
    section separately when locating it; in effect the base relocs being internally partitioned per-section, rather than the per-page
    partitioning scheme used in the original PE/COFF).


    For my target, I mostly end up needing to use PIE binaries with GCC, as
    it needs to be able to load the binary at different locations.

    However, I am using a custom C library, as I (still) haven't managed to
    get the "ld-linux.so" stuff working. Not yet figured out whatever poorly-documented arcane dark magic is needed to get this part working.


    As noted, my compiler's output (including for plain RISC-V) uses
    PE/COFF, which was also the native format for the OS.


    Note that for Linux binaries in this case it would mimic the Linux
    syscall interface; though as I hadn't gotten very far with the PIE
    loader, most of the syscalls are still not implemented.


    My own makeshift OS has a different syscall mechanism, ironically using
    the same registers, but a syscall number of -1 (Linux uses positive
    syscall numbers).

    They work in different ways. For Linux, IIRC:
    X10..X15: Arg1..Arg6
    X16: Unused, 0
    X17: Syscall Number (always positive)

    In my case, syscalls took a different form, IIRC:
    X10: Object ID (Handle)
    X11: Method Number (Integer)
    X12: Method Args List (Pointer)
    X13: Return Value (Pointer)
    X14..X16: Unused, 0 (RV)
    X17: Holds -1 (RV).

    In this case, system calls and many OS APIs take the form of object
    method calls, with a special range of low-numbered object IDs (Eg, 0 or
    NULL) mapping to core/basic syscalls.
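
    The two register layouts listed above can be sketched in C as follows.
    This is only an illustration of the description in this post: the
    RegFile type and the function names are hypothetical and are not taken
    from any real kernel or from the OS discussed here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the integer register file; only x[10]..x[17]
   are used by either convention. */
typedef struct {
    uint64_t x[32];
} RegFile;

/* Linux-style call: X10..X15 = Arg1..Arg6, X16 = 0, X17 = syscall number. */
void pack_linux_syscall(RegFile *r, int64_t number, const uint64_t args[6])
{
    assert(number >= 0);            /* Linux syscall numbers are not negative */
    for (int i = 0; i < 6; i++)
        r->x[10 + i] = args[i];
    r->x[16] = 0;
    r->x[17] = (uint64_t)number;
}

/* Object-method call: X10 = object handle, X11 = method number,
   X12 = pointer to argument list, X13 = pointer to return value,
   X14..X16 = 0, X17 = -1 (all bits set). */
void pack_object_call(RegFile *r, uint64_t handle, uint32_t method,
                      const void *args, void *ret)
{
    r->x[10] = handle;
    r->x[11] = method;
    r->x[12] = (uint64_t)(uintptr_t)args;
    r->x[13] = (uint64_t)(uintptr_t)ret;
    r->x[14] = 0;
    r->x[15] = 0;
    r->x[16] = 0;
    r->x[17] = UINT64_MAX;          /* -1 as a 64-bit register value */
}

/* A dispatcher can tell the two conventions apart by looking at X17. */
int is_object_call(const RegFile *r)
{
    return r->x[17] == UINT64_MAX;
}
```

    Because X17 is never negative for a Linux call, a single check on that
    register is enough to route a trap to one handler or the other.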


    But, yeah, some OS APIs would take the form of objects which would be
    wrapped in a VTable struct, say:
    SomeApi_Vt **api;
    (*api)->ApiMethod(api, arg1, arg2);
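
    A minimal sketch of that calling pattern, assuming a plain C struct of
    function pointers as the VTable. SomeObj, its bias field and
    SomeObj_Init are made up for the example; only the SomeApi_Vt and
    ApiMethod names come from the snippet above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SomeApi_Vt_s SomeApi_Vt;

/* The VTable: a bare struct of function pointers. */
struct SomeApi_Vt_s {
    int (*ApiMethod)(SomeApi_Vt **self, int arg1, int arg2);
};

/* A concrete object. The VTable pointer is the first member, so a
   pointer to the object can also be used as a SomeApi_Vt **. */
typedef struct {
    SomeApi_Vt *vt;
    int bias;                       /* hypothetical per-object state */
} SomeObj;

static int SomeObj_ApiMethod(SomeApi_Vt **self, int arg1, int arg2)
{
    SomeObj *obj = (SomeObj *)self; /* valid: vt is the first member */
    return obj->bias + arg1 + arg2;
}

static SomeApi_Vt someobj_vt = { SomeObj_ApiMethod };

/* Set up an object and hand back the interface pointer callers use. */
SomeApi_Vt **SomeObj_Init(SomeObj *obj, int bias)
{
    assert(obj != NULL);
    obj->vt = &someobj_vt;
    obj->bias = bias;
    return &obj->vt;
}
```

    A caller then writes (*api)->ApiMethod(api, 1, 2), passing the
    interface pointer back in as the first argument so the method can find
    its own state.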

    In this case, well, there were two major ways of requesting APIs:
    Pairs of EIGHTCC values, for some public APIs
    Or as FOURCC's for shorthand (zero padded to 64 bits).
    As a UUID / GUID:
    Primarily used for local / private interfaces.


    Well, people probably can't guess where this mechanism originally came
    from...


    Well, not exactly the same as the inspiration, as there is no IDL
    compiler involved, mostly just bare C structs representing the VTables.
    There is essentially a blob of generic reusable method-wrappers (and a
    whole generic reusable VTable) that is shared across many of these
    objects, so calling a method on an object then just sorta translates it
    into the corresponding system call to invoke that method slot.

    Well, and this mechanism being part of why (for RV) I stuck with an ABI variant that passes everything in X registers (separate X and F
    registers would make a big ugly mess for this whenever a method has a floating-point argument).



    Neither is needed with my own compiler, which compiles things in a way
    such that it eliminates anything that is unreachable.

    [...]


    That might be the case for a very simplistic compiler. With an
    optimising compiler, these extra variables will quickly be
    eliminated. If the compiler has a good scheduling model of the
    device, it will do whatever instruction scheduling works best for that
    processor. If the model is not good enough, it will be suboptimal.
    I would not, however, expect any difference in the generated code for
    the two code snippets.

    Sometimes this kind of "manual optimisation" is helpful when you have
    to try to get efficient results from a weak compiler, however.


    Possibly, but this sort of thing can help with both BGBCC and with
    MSVC IME

    I don't tend to think of MSVC as a highly optimising compiler - but it
    is not a tool I have much use for, as it does not handle the targets I
    need. When I have sometimes looked at the generated code on godbolt, it
    has not impressed me at all. So it could well fall into the "helpful
    when using a weaker compiler" category.


    Depends on what target I am building for:
    Windows Native: Typically MSVC
    WSL: Usually GCC or Clang
    Seems to have: GCC 13.2.0; Clang 18.1.3
    RISC-V GCC: Also 13.2.0 (also via WSL)
    Linux: Typically GCC

    I rarely much use Cygwin anymore, as it was mostly rendered obsolete by
    WSL (on Win10 or similar).
    Though, Cygwin may still be relevant on Win7 or WinXP systems.

    For BGBCC, it can build both on native Windows and on Linux/WSL (though recently noted that this build was broken, mostly by GCC and Clang being
    more pedantic about missing prototypes, and a few prototypes were being
    missed by my function-prototype mining tool). Went and fixed this, but
    haven't posted this yet.


    As for optimizing in MSVC, yeah, it is in the area of not terrible, but
    not super clever either.

    If one expects the sort of high-level code-rewriting cleverness that GCC
    or Clang often does, one will be disappointed.

    But, sometimes, the main "heavy hitter" optimizations are things like constant-folding and register allocation, which it does do effectively.

    Though, both MSVC and BGBCC seem to use one sort of strategy for
    register allocation:


    Static assign things to callee-save registers and use remaining
    registers for dynamic allocation within basic-blocks. Variables with
    finite non-overlapping lifetimes (that do not cross basic-block
    boundaries) may potentially share a register (this more generally
    applies to things like temporaries).

    And, GCC and Clang use another: Assign dynamically but carry values
    across basic-block boundaries along control-flow paths.

    Both tend to give different patterns though, and seem to favor different
    types of code.






    Usual strategy is to try to limit how much code is written, and
    also to avoid doing things in ways that result in too much code,
    or too much cruft.

    Best to avoid both copy paste when reasonable, and sticking
    anything non-trivial in macros.

    We avoided macros if possible.


    They are de-facto for constants and similar, but for longer stuff is
    better avoided.

    Macros are rarely the best way to define constants. They are needed
    if you are using the constants for pre-processor stuff like
    conditional compilation. But generally you get clearer code, better
    typing, and potentially several other benefits from using alternative
    choices like "enum" (even for stand-alone integer constants), "static
    const" variables, and in C23, "constexpr" variables. There's no
    doubt that a lot of code /does/ use macros for constants, but I view
    it as a relic of the past rather than good coding practice.
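
    To make the comparison concrete, here is a small sketch of the three
    pre-C23 options side by side. All names are invented for the example,
    and C23 "constexpr" is left out so that the code builds with older
    compilers.

```c
#include <assert.h>
#include <stddef.h>

/* Traditional macro: plain text substitution, no type, no scope. */
#define BUF_SIZE_MACRO 64

/* Enum constant: a true integer constant expression, so it can be
   used for array sizes and case labels, and it obeys scope rules. */
enum { BUF_SIZE = 64 };

/* Typed constant object. Before C23 this is not a constant expression
   in C, so it cannot size a file-scope array, but it is fine for
   ordinary arithmetic. */
static const double circle_pi = 3.14159265358979;

_Static_assert(BUF_SIZE == BUF_SIZE_MACRO, "both spellings must agree");

static char buf[BUF_SIZE];

size_t buffer_size(void)
{
    return sizeof buf;
}

double circle_area(double radius)
{
    assert(radius >= 0.0);
    return circle_pi * radius * radius;
}
```

    The macro is still the only one of the three that the pre-processor
    itself can see, which is the conditional-compilation case mentioned
    above.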


    They are traditional...

    Like:
      static const double M_PI = 3.14159265358979;

    Could also make sense, but people don't usually do this, they usually
    use macros...

    They should not do so (IMHO, of course). Yes, macros are traditional -
    but there are no plus sides to using them for this kind of thing. (There
    are no plus sides to using all-caps either, but people do that too.)


    It is more tradition...

    My conventions, as noted, are sorta like:
    Macros / Constants: All caps;
    Functions:
    LIBNAME_SubSys_FirstLetterCaps //externally callable within LIBNAME
    libname_subsys_nocaps //usually private to a subsystem
    LIBNAME_FirstLetterCaps //main API for a private library
    somefunction //C library convention
    libname_somefunction //some OS API stuff
    libFirstLetterCaps //GL-like, common for public APIs
    FirstLetterCaps //was common in Win32 API, unused
    Conventions are looser for standalone programs, but:
    somename, some_name, ... //small programs
    S_SomeFunc //id Software like, letter for major subsystem


    In my own makeshift OS, had used OpenGL like naming for a lot of OS APIs.
    tkWhatever //TestKern OS APIs
    tkgdiWhatever //TKGDI: Basically Graphics/GUI stuff


    As noted, some amount of these are implemented as object wrappers.
    Likewise for my OpenGL implementation.

    Though, OpenGL is more annoying in that it usually works via a "GetProcAddress()" type mechanism, so you need to fetch each function
    pointer internally (and provide a lookup mechanism for each function).
    Where, the main (static linked) part of the GL API is effectively
    wrapper functions over function pointers gained via said
    "GetProcAddress()" mechanism, which then go into the userland
    implementation of those functions (typically, with part of the GL API
    running in userland, and a backend part that runs "elsewhere", such as
    in the GUI process, and is reached over an Object / COM interface).


    FWIW, I didn't design this part of the GL API, but if I had, probably
    would have just used COM objects internally.

    Still better IMO to provide a nice C API wrapper over said COM objects
    though, rather than go the DirectX route and be like "Hey, application
    code, have fun with these here bare COM objects!".


    Exposing bare COM objects and GUIDs in a public API is poor design IMO.

    Well, even if in effect many of the API calls are just:
    void apiDoSomething()
    { (*someapi_context)->DoSomething(someapi_context); }


    This differs some from Linux APIs, which often like sharing bare
    functions and variables across API boundaries (so, no real wall of
    separation between the library and application in this sense).


    This partly runs into an issue in that in my case, BGBCC (like MSVC)
    requires being explicit about DLL imports and exports, and sharing
    global variables across DLL boundaries is generally discouraged.

    Note that the DLL mechanism doesn't actually support sharing global
    variables directly, so if you try to share a global across a DLL
    boundary, what you actually get is a hidden function-call that returns a pointer to the variable.

    __declspec(dllimport) int somevar; //not actually a variable.

    x=somevar;
    Is more like:
    x=*(int *)(__get_somevar());

    But, generally discouraged.
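
    The rewrite described above can be imitated in portable C with an
    accessor function and a macro. This is only a model of the behaviour
    as described in this post, not what any particular toolchain emits;
    get_somevar_ptr is a stand-in for the hidden accessor and is not a
    real compiler-generated symbol.

```c
#include <assert.h>
#include <stddef.h>

/* "Library" side: the real storage, plus an accessor that returns
   its address. */
static int somevar_storage = 0;

int *get_somevar_ptr(void)
{
    return &somevar_storage;
}

/* "Application" side: every use of somevar becomes a call followed
   by a dereference, so it is not really a variable any more. */
#define somevar (*get_somevar_ptr())

int read_somevar(void)
{
    int x = somevar;                /* x = *get_somevar_ptr(); */
    return x;
}

void write_somevar(int value)
{
    int *p = get_somevar_ptr();
    assert(p != NULL);
    *p = value;                     /* same effect as: somevar = value; */
}
```

    Each access pays for a call and an indirection, which is one reason to
    prefer exporting functions rather than data.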


    Sharing variables across DLLs is bad practice IMO, and ideally only
    sharing functions that represent a public API (and not, "whatever random
    stuff happens to be in the library"). Contrast to the Linux "shared
    object" approach which does tend to take more of a "share everything"
    approach, and libraries tend to not maintain as strong of a
    library/application separation.

    Well, and then Cygwin goes and tries to fake Linux behavior on top of
    DLLs (in which case a large library can also find itself running into
    the hard limit on the maximum number of DLL exports).

    ...


    Can note that I had approached C library linking in a different way from MSVC/Windows:
    Windows:
    Main EXE and every DLL get their own static-linked C library.
    Can opt into a shared DLL for the C library, but this adds wonk.
    BGBCC+TestKern:
    Main EXE gets a static-linked C library;
    Exports a COM interface that DLLs can use;
    DLLs get a static linked C-library stub;
    Invokes main C library via a hidden COM interface.


    This basically allows things like malloc/free and stdio to work across
    DLL boundaries (unlike Windows where each gets their own local heap and
    stdio, and trying to invoke a pointer from one DLL in another tends to
    cause stuff to explode).

    Granted, the DLLs effectively pulling the C library from the main EXE
    via COM objects may seem a little unorthodox, but it seemed like the
    best way to address my use cases.


    (I'm snipping all the details of your own C compiler, because there is
    very little I can comment on.)




    But, things can be considered in relative terms:
    Like, C++ may carry various penalties vs C.


    I don't find C++ carries noticeable penalties compared to C, for my
    embedded work. But I do disable exceptions and RTTI - exceptions may
    have very little run-time overhead, but the unwind tables can be
    significant when code size is important in small systems.


    Yes, that is the main thing.
      They carry zero performance penalty in practice;
      But, have a non-zero penalty for image size.

    Not enough to be a deal-breaker towards using them if they are used,
    but enough that one wants them disabled if not used...


    Agreed.

    (I could also note that I make heavy use of templates in C++ code - it
    often leads to smaller and faster results.)

    Curious...


    I had tended to use the "write everything one off for the task at hand" approach, but this is a higher-effort approach.

    ...


  • From Keith Thompson@3:633/10 to All on Sat May 30 16:43:16 2026
    Bart <bc@freeuk.com> writes:
    On 30/05/2026 13:29, Dan Cross wrote:
    In article <10vd1tu$ekvl$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 29/05/2026 21:56, Keith Thompson wrote:
    [snip]
    Upthread, you asked a question:

    And then the point becomes, if you always add the parentheses, what
    was the point of having that particular precedence level?

    You've made it clear that you were never interested in an answer.

    You said this:

    "You're asking why C is designed the way it is. We could waste a
    great deal of time and effort answering that for you. There are
    numerous documents about the design and history of C, and of
    its ancestor languages. I could provide you with links."

    Actually I'm not asking why C is like that. We're already there.

    I'm saying that there is no value in those extra levels, some people
    think there is, and I'm arguing about that. I was replying to tTh.

    As for my question, what /is/ the point? I'm still waiting!
    To clarify: the question is, what is the point of those levels?
    How is that different from asking "why C is like that"?

    My question is actually independent of C or its history.

    I accept those levels exist. I was asking do they currently serve a
    useful purpose.

    That's very different from your original question, which is quoted
    above. Your original question, with its use of the past tense,
    seemed clearly (to me) to be about how C was originally designed.

    I don't have a straightforward yes or no answer to your restated
    question.

    C's operator precedence rules are complicated and arguably flawed.
    They could have been defined differently. A simpler set of rules,
    with fewer levels, *might* have been better. I don't have any
    concrete suggestions -- nor do I have any strong preferences.
    I accept C's rules as they are. I would accept them if they had
    been defined differently.

    Nothing about the current rules particularly bothers me. There are
    no objective criteria for deciding what the rules *should* be.
    Even having multiplication bind more tightly than addition is
    fundamentally an arbitrary choice (though one that's almost
    universally recognized, even outside the context of programming
    languages).

    Of course all C implementations must implement the expression
    syntax as it's defined by the standard, and any changes in future
    editions of the standard would be impractical. As a programmer,
    I don't have to be as strict; I can add parentheses when writing
    code, and I can look up the rules as needed when reading code.

    If not, people can choose to ignore them when writing C code,
    for example like this where all () are technically superfluous:

    crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];

    Yes, they can, and I personally tend to agree that they should.
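
    The claim that every pair of parentheses in that line is redundant can
    be checked mechanically. In the sketch below the s_crc32 table holds
    arbitrary placeholder values, not a real CRC-32 table, and the function
    names are invented; the point is only that both spellings parse the
    same way.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values only: this is NOT a real CRC-32 table. */
static const uint32_t s_crc32[16] = {
    0x00000000u, 0x13572468u, 0x2468ACE0u, 0x3579BDF1u,
    0x48C048C0u, 0x5A5A5A5Au, 0x6B7C8D9Eu, 0x7F00FF00u,
    0x80808080u, 0x91A2B3C4u, 0xA5A5A5A5u, 0xB4C3D2E1u,
    0xC0FFEE00u, 0xD00DF00Du, 0xE1E2E3E4u, 0xFEDCBA98u
};

/* The line as quoted, with explicit grouping. */
uint32_t step_with_parens(uint32_t crcu32, uint8_t b)
{
    return (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
}

/* Same line with every pair of parentheses removed: both >> and &
   bind more tightly than ^, so the parse is unchanged. */
uint32_t step_without_parens(uint32_t crcu32, uint8_t b)
{
    return crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];
}

/* Compare the two forms over a range of inputs. */
void check_forms_agree(void)
{
    for (uint32_t crc = 0; crc < 4096; crc++)
        for (unsigned b = 0; b < 256; b++)
            assert(step_with_parens(crc, (uint8_t)b) ==
                   step_without_parens(crc, (uint8_t)b));
}
```

    Some compilers may warn about the second form when extra warnings are
    enabled, which rather supports the point that the first form is kinder
    to the reader.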

    And they can choose to not adopt them when devising new languages,
    however many still do faithfully recreate the same pattern, with a few notable exceptions such as Go lang.

    When designing a new language, there are real advantages in strictly
    imitating C's rules, just because so many programmers are familiar
    with them. (It would have been silly for C++ or Objective-C to
    change the precedence rules, even to improve them.) But there
    are also real advantages in using precedence rules that are better
    (e.g., simpler) than C's. It depends on the nature of the language.
    It could be an interesting discussion for comp.lang.misc.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Lawrence D'Oliveiro@3:633/10 to All on Sun May 31 00:29:40 2026
    On Sat, 30 May 2026 12:01:27 +0100, Bart wrote:

    It doesn't need << >> to be in a distinct group from multiply or add
    groups.

    But it is also not clear because the part after >> is sprawling.

    It's a counterexample to your claim that "<< >> [don't need] to be in
    a distinct group", isn't it?

    You'd want it like this:

    Because they are in a distinct group, you don't need it like this.

    Remove ambiguity in the mind of the reader? Lead to fewer
    surprises when a new term needs to be added?

    The new terms will most likely fit into the existing ones in the
    natural way.

  • From Janis Papanagnou@3:633/10 to All on Sun May 31 03:37:50 2026
    On 2026-05-31 01:43, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [...]
    [...]

    C's operator precedence rules are complicated and arguably flawed.

    I'd say that just the (known) flaw makes them (slightly) complicated;
    so you need to remember that "flaw" (or "inconsistency") to be safe.
    The rest is completely sensible. And even if one doesn't have a table
    to look up the precedences they mostly can be derived (presuming one
    has a feeling for the underlying logic of these things or experiences
    from other related areas).

    They could have been defined differently. A simpler set of rules,
    with fewer levels, *might* have been better. I don't have any
    concrete suggestions -- nor do I have any strong preferences.
    I accept C's rules as they are. I would accept them if they had
    been defined differently.

    Nothing about the current rules particularly bothers me. There are
    no objective criteria for deciding what the rules *should* be.

    There are. (What I called above as "derived underlying logic".) Some
    aspects have already been formulated in this and other threads here.
    But maybe not obvious to recognize without background in mathematics,
    logic, or CS.

    Even having multiplication bind more tightly than addition is
    fundamentally an arbitrary choice

    (Now opinions are getting really strange; in the above stated sense.)

    (though one that's almost
    universally recognized, even outside the context of programming
    languages).

    [...]

    If not, people can choose to ignore them when writing C code,
    for example like this where all () are technically superfluous:

    crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];

    Yes, they can, and I personally tend to agree that they should.

    The more complex the expressions are the more structure they need.

    IMO, the parentheses above make precedence clear (if unknown!), but
    are not contributing to readability. It would have made more sense
    to separate the sub-expression within the [...] in an own object to
    enhance readability and to more easily understand what's going on.

    To emphasize; not the precedences are the problem above, but the
    complexity of the expression in connexion with lack of structuring.
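
    A sketch of the restructuring suggested here, pulling the table index
    out into its own named object. The variable and function names are
    invented, and the table again contains arbitrary placeholder values
    rather than a real CRC-32 table.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values only: this is NOT a real CRC-32 table. */
static const uint32_t s_crc32[16] = {
    0x00000000u, 0x13572468u, 0x2468ACE0u, 0x3579BDF1u,
    0x48C048C0u, 0x5A5A5A5Au, 0x6B7C8D9Eu, 0x7F00FF00u,
    0x80808080u, 0x91A2B3C4u, 0xA5A5A5A5u, 0xB4C3D2E1u,
    0xC0FFEE00u, 0xD00DF00Du, 0xE1E2E3E4u, 0xFEDCBA98u
};

/* Everything in one expression, as in the quoted line. */
uint32_t crc_step_inline(uint32_t crcu32, uint8_t b)
{
    return (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
}

/* Same computation with the index named: the low nibble of the
   running value is combined with the low nibble of the input byte. */
uint32_t crc_step_named(uint32_t crcu32, uint8_t b)
{
    const uint32_t index = (crcu32 & 0xF) ^ (b & 0xF);
    const uint32_t remaining = crcu32 >> 4;

    assert(index < 16);             /* always within the table */
    return remaining ^ s_crc32[index];
}
```

    With the index named, the final line reads as "shift out a nibble and
    fold in a table entry", and the bound on the index becomes easy to
    state and check.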

    [...]

    When designing a new language, there are real advantages in strictly imitating C's rules, just because so many programmers are familiar
    with them.

    Huh? - How that? - Are you saying here that practically only C-like
    languages are in common use? - But even if so; there's quite some
    languages with differing precedence rules, not C-based, and without
    such a flaw like the one being discussed. - When designing a *new*
    language I'd certainly choose one of the sensible precedence rules,
    and just without those obvious flaws. (And not use "C" as base, of
    course.)

    (It would have been silly for C++ or Objective-C to
    change the precedence rules, even to improve them.) But there
    are also real advantages in using precedence rules that are better
    (e.g., simpler) than C's.

    Or - with reference to that flaw - just more consistent.

    Consistent systems are inherently simpler, in the sense of easier to
    understand and thus more straightforward to use. A precondition for
    that is, as said, at least a basic understanding of such things.

    It depends on the nature of the language.
    It could be an interesting discussion for comp.lang.misc.

    Janis


  • From Keith Thompson@3:633/10 to All on Sat May 30 19:53:40 2026
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 2026-05-31 01:43, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [...]
    [...]
    C's operator precedence rules are complicated and arguably flawed.

    I'd say that just the (known) flaw makes them (slightly) complicated;
    so you need to remember that "flaw" (or "inconsistency") to be safe.
    The rest is completely sensible. And even if one doesn't have a table
    to look up the precedences they mostly can be derived (presuming one
    has a feeling for the underlying logic of these things or experiences
    from other related areas).

    Reasonable, but I feel the need to say that that's your personal
    opinion. You seem to think that C's precedence rules have one and
    only one flaw, and a set of rules with that flaw corrected would
    be ideal.

    I don't even necessarily disagree, but others are likely to have
    different opinions, and those opinions might be perfectly valid.

    I don't want to make a huge deal out of this. I honestly don't have
    a strong opinion myself. I usually find dealing with the rules
    as they exist to be a much better use of my time and attention --
    and I don't mean that as a criticism of anyone who choose to think
    about alternatives.

    They could have been defined differently. A simpler set of rules,
    with fewer levels, *might* have been better. I don't have any
    concrete suggestions -- nor do I have any strong preferences.
    I accept C's rules as they are. I would accept them if they had
    been defined differently.
    Nothing about the current rules particularly bothers me. There are
    no objective criteria for deciding what the rules *should* be.

    There are. (What I called above as "derived underlying logic".) Some
    aspects have already been formulated in this and other threads here.
    But maybe not obvious to recognize without background in mathematics,
    logic, or CS.

    Even having multiplication bind more tightly than addition is
    fundamentally an arbitrary choice

    (Now opinions are getting really strange; in the above stated sense.)

    Mathematical notation almost universally has multiplication binding
    more tightly than addition. It's consistent because the consistency
    itself has big advantages so that you can write x + y * z (or x +
    y × z) and everyone knows what you mean. Strict left-to-right
    evaluation would also have been a valid choice. (I don't know the
    history, but it probably goes back several centuries.)

    (though one that's almost
    universally recognized, even outside the context of programming
    languages).
    [...]

    If not, people can choose to ignore them when writing C code,
    for example like this where all () are technically superfluous:

    crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
    Yes, they can, and I personally tend to agree that they should.

    The more complex the expressions are the more structure they need.

    IMO, the parentheses above make precedence clear (if unknown!), but
    are not contributing to readability. It would have made more sense
    to separate the sub-expression within the [...] in an own object to
    enhance readability and to more easily understand what's going on.

    To emphasize; not the precedences are the problem above, but the
    complexity of the expression in connexion with lack of structuring.

    [...]
    When designing a new language, there are real advantages in strictly
    imitating C's rules, just because so many programmers are familiar
    with them.

    Huh? - How that? - Are you saying here that practically only C-like
    languages are in common use?

    Huh? No, I didn't say that at all.

    I suggest that if you're designing a somewhat C-like language,
    sticking to C's precedence rules has advantages due to programmer
    familiarity. Even for a language that's not particularly C-like,
    but that has C-like expressions, the designer might consider
    following C's rules.

    Or not.

    - But even if so; there's quite some
    languages with differing precedence rules, not C-based, and without
    such a flaw like the one being discussed. - When designing a *new*
    language I'd certainly choose one of the sensible precedence rules,
    and just without those obvious flaws. (And not use "C" as base, of
    course.)

    Certainly.

    (It would have been silly for C++ or Objective-C to
    change the precedence rules, even to improve them.) But there
    are also real advantages in using precedence rules that are better
    (e.g., simpler) than C's.

    Or - with reference to that flaw - just more consistent.

    Consistent systems are inherently simpler, in the sense of easier to understand and thus more straightforward to use. A precondition for
    that is, as said, at least a basic understanding of such things.

    Ah, but consistent with what? Internal consistency and consistency
    with existing practice are not necessarily the same thing.

    It depends on the nature of the language.
    It could be an interesting discussion for comp.lang.misc.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Richard Harnden@3:633/10 to All on Sun May 31 09:12:31 2026
    On 31/05/2026 00:43, Keith Thompson wrote:
    C's operator precedence rules are complicated and arguably flawed.
    They could have been defined differently. A simpler set of rules,
    with fewer levels,*might* have been better. I don't have any
    concrete suggestions -- nor do I have any strong preferences.
    I accept C's rules as they are. I would accept them if they had
    been defined differently.

    Can't the compiler easily remove any parens that aren't necessary?
    So - just write complex expressions in a way that a human can most
    easily understand, it makes your intention clear and probably doesn't
    increase the size of the executable.



  • From David Brown@3:633/10 to All on Sun May 31 11:14:29 2026
    On 30/05/2026 22:48, BGB wrote:
    On 5/30/2026 6:52 AM, David Brown wrote:
    On 29/05/2026 22:16, BGB wrote:
    On 5/29/2026 6:22 AM, David Brown wrote:
    On 29/05/2026 12:20, BGB wrote:
    On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
    On 2026-05-28 11:57, BGB wrote:
    On 5/28/2026 2:18 AM, Janis Papanagnou wrote:
    On 2026-05-28 01:49, BGB wrote:
    [...]


    But, not really an "easy" way to avoid bloat, other than to write
    code specifically for what cases are relevant; while also
    avoiding needless duplication and copy paste (where, overuse of
    copy/paste can also lead to bloat; along with turning the code
    into an ugly mess).

    Hmm.. - as said, during the very early days there were issues; I
    recall on one platform duplication of template code in more than
    one source unit. And/or some environmental hacks (of the compiler)
    to deposit template code for linking. In the later days I've not
    seen such immature things anymore.


    Possibly, a lot could depend on how one is counting things as well.


    In a lot of cases when using GCC, I end up using:
      -ffunction-sections -fdata-sections -Wl,-gc-sections

    On many targets, "-fdata-sections" can lead to noticeably larger and
    slower code because it effectively eliminates section anchor
    optimisations. It does not negatively affect x86 AFAICS, because
    x86 does not use section anchors.

    <https://godbolt.org/z/zeoq41Y7d>

    With -fsection-anchors (enabled with optimisation on targets that
    support it - generally RISCy load/store architectures), program-
    lifetime variables are kept together in a lump (as though they were
    in a struct) and often addressed by a pointer to that pretend
    struct. Thus if a function accesses two variables "a" and "b",
    instead of having to load the addresses of each of "a" and "b" into
    separate registers, it loads an "anchor" into one register and
    accesses the variables with reg+offset addressing.
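
    As a rough source-level picture of what the anchor optimisation
    achieves, compare two independent variables with the same data grouped
    in one struct. This is only an analogy written by hand: the names are
    invented, and what a given compiler actually emits depends on the
    target and flags.

```c
#include <assert.h>

/* Two separate static-lifetime variables. On a target that uses
   section anchors the compiler may keep them adjacent and reach both
   from one base register; with -fdata-sections each goes into its own
   section, so they can no longer share a base address. */
static int a = 1;
static int b = 2;

int sum_separate(void)
{
    return a + b;
}

/* Hand-written version of the "pretend struct": one base address,
   each member at a constant offset from it. */
static struct {
    int a;
    int b;
} anchored = { 1, 2 };

int sum_anchored(void)
{
    return anchored.a + anchored.b;
}

/* Update both copies; the two access styles must always agree. */
void set_pair(int x, int y)
{
    a = x;
    b = y;
    anchored.a = x;
    anchored.b = y;
    assert(sum_separate() == sum_anchored());
}
```

    Comparing the assembly for sum_separate with and without
    -fdata-sections on a 32-bit ARM build is one way to see the cost being
    described.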

    I've seen "-fdata-sections" used regularly in embedded systems - it
    is almost always a bad idea.

    ("-ffunction-sections" is often very helpful to reduce code image
    size, so keep that one.)


    Both seem to help on x86, x86-64, and also on RISC-V, at making GCC's
    output at least sorta space-comparable to my own compilers.

    The merit of "-fdata-sections" is mostly that it eliminates unused
    global variables; whereas "-ffunction-sections" eliminates
    unreachable functions.

    That is the point of them, yes. "-ffunction-sections" can be useful
    at removing unused code from more general code. For microcontrollers,
    SDK's and manufacturers' driver code will normally contain a large
    number of functions that can be eliminated in this way, saving a lot
    of code space.

    However, in practice, "-fdata-sections" rarely eliminates a
    significant amount - most programs do not have large amounts of
    statically-allocated data that is not used.  Gcc, and I think most
    other compilers, put the static lifetime data for each translation
    unit in its own section, so if no data from a translation unit is used
    it will be eliminated at link time even with -fno-data-sections.  And
    of course it makes no difference for heap data or stack data.


    The main place it makes a difference is global arrays from a translation unit that is included, but for functions that are not included.

    Also functions with large static arrays.


    void SomeFunc()
    {
      static char buf[4096];
      ...
    }

    Where, say, eliminating SomeFunc does not necessarily eliminate buf.

    Yes, if you have such code but want to eliminate it, then
    -fdata-sections would definitely benefit. I have not seen such code in practice (at least not with very big static arrays, and that also was
    not an essential part of the program). But of course I have only seen a microscopic part of all C code written - if you come across this sort of thing, then I appreciate your point.

    (There are several ways to make this more "friendly" to builds that need
    to be compact, such as putting the buffer and/or SomeFunc in a separate
    file or giving it a specific section of its own.)



    In my testing, "-ffunction-sections" is absolutely worth using (on
    targets where code space is relevant - there's no need for PC
    software).  On some targets, it may mean a few lost opportunities for
    shorter jump/call instructions between functions in the same
    translation unit, but the cost is rarely anything more than a slightly
    longer link time. But "-fdata-sections" typically gives almost no ram
    space savings, and makes code bigger and slower.

    As I noted, gcc on x86 does not support section anchors, so there is
    not likely to be much code cost for -fdata-sections.

    Where section anchors shine - and where -fdata-sections therefore has
    cost - is when a function needs to access more than one piece of
    static lifetime data defined in the same translation unit (or another
    translation unit if you are using LTO).  That happens a lot in
    embedded ARM programming at least.  I don't know about RISC-V.  If the
    target normally uses a "small data section" for ram (I know this is
    common on PowerPC), then there is, in effect, a program-wide section
    anchor already.  So it is possible that relatively few targets have
    section anchors - but the 32-bit ARM on gcc is a vastly popular choice
    in the embedded world, so it is important to understand the cost of
    this compiler flag for that target at least.


    It depends on the way it is built.


    A lot of times though (for non-relocatable static-linked binaries) it
    mostly tends to use AUIPC+LD or AUIPC+ST pairs to access global
    variables. There is a Global Pointer that needs to be loaded when the
    binary is started, unclear what it is used for exactly.


    If you have a global pointer, then it will probably be used for
    gp+offset access to global data, eliminating the need for section anchors.

    I have not used RISC-V, and am not familiar with its details. I can see
    from godbolt that when -fdata-sections is in action and you are loading
    from static lifetime variables, the compiler generates instructions like

    lw a5, a_variable
    lw a4, b_variable
    lw a0, c_variable

    When you do not have "-fdata-sections", it uses anchors :

    lla a4, .LANCHOR0
    lw a5, 0(a4)
    lw a3, 4(a4)
    lw a0, 8(a4)

    From my (limited) understanding, RISC-V cannot use 32-bit absolute addressing. So the "lw a5, a_variable" must be a pseudo-instruction -
    using register + offset addressing. If there is a global pointer, then presumably that is used here. Alternatively, the pseudo instruction
    might assemble to two real instructions to support the 32-bit address. I
    know both techniques are used in some targets, but don't know about RISC-V.

    Certainly it would surprise me if the "lw a5, a_variable" version were
    more efficient than using anchors - otherwise why would gcc generate
    code with anchors when given a free choice? (Perhaps gcc is not well
    tuned for RISC-V code generation - I am wary of making too many
    assumptions about the processor just from some simple compiler outputs.)

    (clang does not, apparently, support section anchors as an optimisation technique. Both with and without -fdata-sections, on RISC-V it first
    uses two instructions to load ".L_MergedGlobals" into a register and
    then uses that register plus offset to access data.)



    I don't tend to think of MSVC as a highly optimising compiler - but it
    is not a tool I have much use for, as it does not handle the targets I
    need.  When I have sometimes looked at the generated code on godbolt,
    it has not impressed me at all.  So it could well fall into the
    "helpful when using a weaker compiler" category.


    Depends on what target I am building for:
      Windows Native: Typically MSVC
      WSL: Usually GCC or Clang
        Seems to have: GCC 13.2.0; Clang 18.1.3
        RISC-V GCC: Also 13.2.0 (also via WSL)
      Linux: Typically GCC

    I rarely much use Cygwin anymore, as it was mostly rendered obsolete by
    WSL (on Win10 or similar).
    Though, Cygwin may still be relevant on Win7 or WinXP systems.


    Cygwin has its own wide range of complications. If you want to use gcc targeting native Windows, msys2 and mingw-64 are probably your best bet, either compiled natively under msys2 or as a cross-compile from Linux.
    But don't place too much emphasis on my advice, as I very rarely compile
    C or C++ code for Windows - most of my PC target (Linux or Windows)
    coding is in Python.

    For BGBCC, it can build both on native Windows and on Linux/WSL (though recently noted that this build was broken, mostly by GCC and Clang being more pedantic about missing prototypes, and a few prototypes were being missed by my function-prototype mining tool). Went and fixed this, but haven't posted this yet.


    As for optimizing in MSVC, yeah, it is in the area of not terrible, but
    not super clever either.

    If one expects the sort of high-level code-rewriting cleverness that GCC
    or Clang often does, one will be disappointed.

    But, sometimes, the main "heavy hitter" optimizations are things like constant-folding and register allocation, which it does do effectively.

    Though, both MSVC and BGBCC seem to use one sort of strategy for
    register allocation:


    Static assign things to callee-save registers and use remaining
    registers for dynamic allocation within basic-blocks. Variables with
    finite non-overlapping lifetimes (that do not cross basic-block
    boundaries) may potentially share a register (this more generally
    applies to things like temporaries).

    And, GCC and Clang use another: Assign dynamically but carry values
    across basic-block boundaries along control-flow paths.

    Both tend to give different patterns though, and seem to favor different types of code.


    [...]


    (I could also note that I make heavy use of templates in C++ code - it
    often leads to smaller and faster results.)

    Curious...


    I had tended to use the "write everything one off for the task at hand" approach, but this is a higher-effort approach.


    A lot of code tends to fall into the category of shuffling data around
    or doing simple checks or conversions. It's also common to have wrapper functions for libraries to get something nicer, safer and more
    convenient than some API that belongs in the early 1990's. Good C++
    templates (and sometimes even good macros in C) can make the use of
    these things far nicer, and most of the code that the templates appear
    to generate inline in the caller disappears in optimisation.



  • From David Brown@3:633/10 to All on Sun May 31 11:47:50 2026
    On 31/05/2026 03:37, Janis Papanagnou wrote:
    On 2026-05-31 01:43, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [...]

    If not, people can choose to ignore them when writing C code,
    for example like this where all () are technically superfluous:

        crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];

    Yes, they can, and I personally tend to agree that they should.

    The more complex the expressions are the more structure they need.

    IMO, the parentheses above make precedence clear (if unknown!), but
    are not contributing to readability. It would have made more sense
    to separate the sub-expression within the [...] in an own object to
    enhance readability and to more easily understand what's going on.

    To emphasize; not the precedences are the problem above, but the
    complexity of the expression in connexion with lack of structuring.

    This is an example of how readability depends on the reader. To me,
    there is no benefit in having a sub-expression here because the
    structure is clear - this is how you do table-based crc's with 4-bit
    chunks. But to someone unfamiliar with CRC calculations, splitting the expression up might make it clearer. (Alternatively, a comment block
    with an explanation could help.)

    I /do/ think the parentheses here are helpful for readability, precisely because they emphasise the structure of the expression. You could write:

    crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];

    but that needs significantly more cognitive effort to parse when reading
    it, could be misinterpreted, and has lost all the structure that makes
    it easy to see what is going on.

    (I regularly use bit-manipulation and shift instructions in my code -
    but I still felt it best to check the details in a precedence table
    before writing that.)
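
    As a sketch of the claim that the two spellings mean the same thing,
    the step can be written both ways and compared. Everything here beyond
    the expression itself is an assumption for illustration: the function
    names are hypothetical, and the 16-entry table is assumed to be the
    usual one for the reflected CRC-32 polynomial 0xEDB88320, built at run
    time rather than copied from anywhere.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* Assumed 4-bit table for the reflected CRC-32 polynomial. */
    static uint32_t s_crc32[16];

    void crc_init(void)
    {
        for (uint32_t i = 0; i < 16; i++) {
            uint32_t c = i;
            for (int k = 0; k < 4; k++)
                c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
            s_crc32[i] = c;
        }
    }

    /* One 4-bit step, with the "superfluous" parentheses. */
    uint32_t step_paren(uint32_t crc, uint8_t b)
    {
        return (crc >> 4) ^ s_crc32[(crc & 0xF) ^ (b & 0xF)];
    }

    /* The same step relying on precedence alone: >> binds tighter than &,
       and & binds tighter than ^.  Compilers may warn about this form. */
    uint32_t step_bare(uint32_t crc, uint8_t b)
    {
        return crc >> 4 ^ s_crc32[crc & 0xF ^ b & 0xF];
    }

    /* CRC-32 of a buffer: low nibble of each byte first, then high. */
    uint32_t crc32_nibble(const uint8_t *p, size_t n)
    {
        uint32_t crc = 0xFFFFFFFFu;
        crc_init();
        while (n--) {
            uint8_t b = *p++;
            crc = step_paren(crc, b);
            crc = step_paren(crc, (uint8_t)(b >> 4));
        }
        return ~crc;
    }
    ```

    Both step functions compute the same value for any input; the
    difference is only in how much the reader has to work out.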

    The expression as originally parenthesised is thus definitely easier for
    /me/ to read, and is almost exactly the way I would write it myself :

    crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];

    The only differences I would have are the names (why would anyone put
    variable types into the names like "crcu32" ? We are not writing
    BASIC), and I'd use a small case "0xf". Unlike almost every example
    Bart has shown before, it even has nice spacing!




  • From David Brown@3:633/10 to All on Sun May 31 11:49:24 2026
    On 31/05/2026 10:12, Richard Harnden wrote:
    On 31/05/2026 00:43, Keith Thompson wrote:
    C's operator precedence rules are complicated and arguably flawed.
    They could have been defined differently.  A simpler set of rules,
    with fewer levels, *might* have been better.  I don't have any
    concrete suggestions -- nor do I have any strong preferences.
    I accept C's rules as they are.  I would accept them if they had
    been defined differently.

    Can't the compiler easily remove any parens that aren't necessary?
    So - just write complex expressions in a way that a human can most
    easily understand, it makes your intention clear and probably doesn't increase the size of the executable.



    Of course. Parentheses do not affect the generated code unless they
    affect the semantics of the expression. (Some people think parentheses
    affect the order of evaluation, but that is not the case for most
    compilers.)


  • From Bart@3:633/10 to All on Sun May 31 10:59:42 2026
    On 31/05/2026 01:29, Lawrence D'Oliveiro wrote:
    On Sat, 30 May 2026 12:01:27 +0100, Bart wrote:

    It doesn't need << >> to be in a distinct group from multiply or add
    groups.

    But it is also not clear because the part after >> is sprawling.

    It's a counterexample to your claim that "<< >> [don't need] to be in
    a distinct group", isn't it?

    Sure, when an expression exactly suits how its current level works, such as:

    a << b + c

    WHEN you intend it to be 'a << (b + c)'. How about when you intend it to
    be: '(a << b) + c'?

    This is arguably more intuitive since << scales numbers in the same way
    as '*'. As it is:

    a << 3 + b means a << (3+b)
    a * 3 + b means (a*3) + b

    And also

    a << 3 | b means (a<<3) | b
    a << 3 + b means a << (3+b)

    Both examples have a similar function but thanks to the odd priorities
    are quite different when choosing between << and *, or | and +.
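
    The groupings listed above can be checked with a few one-line
    functions. This is only a sketch; the function names are made up for
    the purpose, and a compiler may warn about the unparenthesised shifts.

    ```c
    /* In C, << binds less tightly than + and * but more tightly than |. */

    /* Parsed as a << (3 + b). */
    unsigned shift_add_bare(unsigned a, unsigned b) { return a << 3 + b; }

    /* Parsed as (a * 3) + b. */
    unsigned mul_add_bare(unsigned a, unsigned b) { return a * 3 + b; }

    /* Parsed as (a << 3) | b. */
    unsigned shift_or_bare(unsigned a, unsigned b) { return a << 3 | b; }

    /* What someone thinking of << as "scaling" may have intended. */
    unsigned shift_add_intended(unsigned a, unsigned b) { return (a << 3) + b; }
    ```

    With a = 1 and b = 1, the bare shift-and-add shifts by 4, while the
    intended version shifts by 3 and then adds 1.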


  • From Bart@3:633/10 to All on Sun May 31 11:10:31 2026
    On 31/05/2026 10:49, David Brown wrote:
    On 31/05/2026 10:12, Richard Harnden wrote:
    On 31/05/2026 00:43, Keith Thompson wrote:
    C's operator precedence rules are complicated and arguably flawed.
    They could have been defined differently.  A simpler set of rules,
    with fewer levels, *might* have been better.  I don't have any
    concrete suggestions -- nor do I have any strong preferences.
    I accept C's rules as they are.  I would accept them if they had
    been defined differently.

    Can't the compiler easily remove any parens that aren't necessary?
    So - just write complex expressions in a way that a human can most
    easily understand, it makes your intention clear and probably doesn't
    increase the size of the executable.



    Of course.  Parentheses do not affect the generated code unless they
    affect the semantics of the expression.  (Some people think parentheses affect the order of evaluation,

    They can do if they make an expression be parsed differently. Do you have
    an example where they make no difference but people might think they do?



  • From Keith Thompson@3:633/10 to All on Sun May 31 03:45:50 2026
    Richard Harnden <richard.nospam@gmail.invalid> writes:
    On 31/05/2026 00:43, Keith Thompson wrote:
    C's operator precedence rules are complicated and arguably flawed.
    They could have been defined differently. A simpler set of rules,
    with fewer levels, *might* have been better. I don't have any
    concrete suggestions -- nor do I have any strong preferences.
    I accept C's rules as they are. I would accept them if they had
    been defined differently.

    Can't the compiler easily remove any parens that aren't necessary?
    So - just write complex expressions in a way that a human can most
    easily understand, it makes your intention clear and probably doesn't increase the size of the executable.

    Compilers generally remove *all* parens, necessary or not.
    The output of a compiler is assembly or machine code. You almost
    certainly can't tell from the generated code whether the input was,
    for example, `a * b + c`, `(a * b) + c`, or `(((a) * (b)) + (c))`.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Keith Thompson@3:633/10 to All on Sun May 31 04:02:25 2026
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    Richard Harnden <richard.nospam@gmail.invalid> writes:
    On 31/05/2026 00:43, Keith Thompson wrote:
    C's operator precedence rules are complicated and arguably flawed.
    They could have been defined differently. A simpler set of rules,
    with fewer levels, *might* have been better. I don't have any
    concrete suggestions -- nor do I have any strong preferences.
    I accept C's rules as they are. I would accept them if they had
    been defined differently.

    Can't the compiler easily remove any parens that aren't necessary?
    So - just write complex expressions in a way that a human can most
    easily understand, it makes your intention clear and probably doesn't
    increase the size of the executable.

    Compilers generally remove *all* parens, necessary or not.
    The output of a compiler is assembly or machine code. You almost
    certainly can't tell from the generated code whether the input was,
    for example, `a * b + c`, `(a * b) + c`, or `(((a) * (b)) + (c))`.

    I realize I missed part of the point of your question.

    Adding parentheses to an expression in a way that yields
    an equivalent expression almost certainly will not affect the
    generated code. Any parentheses that "restate" the precedence
    rules are only for the convenience of human readers.

    Ideally, you should always use exactly the right number of
    parentheses to optimize readability. But since humans are not
    compilers, there is no one way to do that. I would probably
    add parentheses to `x == y & z`, assuming I really wanted the
    semantics of `(x == y) & z` for some reason, but I would find the
    superfluous parentheses in `x + (y * z)` or `x = (y + z)` annoying.
    (Almost as annoying as the poor choice of variable names.)
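
    A small sketch of the `x == y & z` case, with hypothetical function
    names: `==` has higher precedence than `&`, so the unparenthesised
    form compares first and then masks the 0-or-1 result.

    ```c
    /* Parsed as (x == y) & z. */
    int eq_then_and(int x, int y, int z) { return x == y & z; }

    /* The grouping a reader might wrongly assume. */
    int eq_of_and(int x, int y, int z) { return x == (y & z); }
    ```

    For x = 2, y = 2, z = 1 the first yields 1 and the second yields 0,
    which is the sort of difference the added parentheses make visible.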

    It's possible to have too few parentheses or too many.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From David Brown@3:633/10 to All on Sun May 31 13:18:56 2026
    On 31/05/2026 12:10, Bart wrote:
    On 31/05/2026 10:49, David Brown wrote:
    On 31/05/2026 10:12, Richard Harnden wrote:
    On 31/05/2026 00:43, Keith Thompson wrote:
    C's operator precedence rules are complicated and arguably flawed.
    They could have been defined differently.  A simpler set of rules,
    with fewer levels, *might* have been better.  I don't have any
    concrete suggestions -- nor do I have any strong preferences.
    I accept C's rules as they are.  I would accept them if they had
    been defined differently.

    Can't the compiler easily remove any parens that aren't necessary?
    So - just write complex expressions in a way that a human can most
    easily understand, it makes your intention clear and probably doesn't
    increase the size of the executable.



    Of course.  Parentheses do not affect the generated code unless they
    affect the semantics of the expression.  (Some people think
    parentheses affect the order of evaluation,

    They can do if they make an expression be parsed differently.

    As I said, they "do not affect the generated code unless they affect the semantics of the expression." Obviously that only applies to extra parentheses. If that's what you mean by "parsed differently", then we
    agree - clearly "(a + b) * c" gives different code from "a + (b * c)".

    But you might consider "(a + b) + c" to be "parsed differently" than "a
    + (b + c)", because of how a particular compiler implements its parser.
    It's possible that this results in different code for a particular
    compiler, but there is no difference in the meaning for the C language.

    I perhaps expressed it poorly when I said extra parentheses "do not"
    affect the generated code - extra parentheses do not change the meaning
    of the C code, and so compilers don't have to consider them in any way
    (and optimising compilers generally don't). But a compiler is, of
    course, free to be influenced by them and generate code that varies with
    extra parentheses, as long as the final results match those required by
    the standard.

    Do you have
    an example where they make no difference but people might think they do?


    People might think they affect the order of evaluation, such as when you
    have function calls :

    u = foo(x) + (foo(y) + foo(z));

    Some people might think the use of parentheses means that "foo(y)" and "foo(z)" are called before "foo(x)", when the order of all these calls
    (and the additions) is unspecified. (Again, a given compiler might be influenced by the parentheses, but the language does not require it.)
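
    A minimal sketch of that example, assuming a hypothetical "foo" that
    records the order in which it is called. The sum is fixed by the
    language; the contents of the log are not, and may differ between
    compilers or optimisation levels.

    ```c
    #include <stddef.h>

    static int call_log[3];      /* arguments, in the order foo() ran */
    static size_t call_count;

    static int foo(int v)
    {
        if (call_count < 3)
            call_log[call_count++] = v;
        return v * 10;
    }

    int sum_of_calls(int x, int y, int z)
    {
        call_count = 0;
        /* The parentheses group the additions; they do not require
           foo(y) and foo(z) to be called before foo(x). */
        return foo(x) + (foo(y) + foo(z));
    }
    ```

    Code that needs a particular call order has to say so explicitly, for
    example by storing each result in its own variable first.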





  • From David Brown@3:633/10 to All on Sun May 31 13:24:03 2026
    On 31/05/2026 02:29, Lawrence D'Oliveiro wrote:
    On Sat, 30 May 2026 12:01:27 +0100, Bart wrote:

    It doesn't need << >> to be in a distinct group from multiply or add
    groups.

    But it is also not clear because the part after >> is sprawling.

    It's a counterexample to your claim that "<< >> [don't need] to be in
    a distinct group", isn't it?


    No, it is not. If << and >> had been in the same group as
    multiplication and division, your code (the snippets that Bart
    referenced) would have had the same semantics. Other code might have different semantics, but Bart was entirely correct in this case.


  • From James Kuyper@3:633/10 to All on Sun May 31 10:15:01 2026
    On 2026-05-31 05:49, David Brown wrote:
    ...
    Of course. Parentheses do not affect the generated code unless they
    affect the semantics of the expression. (Some people think parentheses affect the order of evaluation, but that is not the case for most compilers.)

    I assume that last sentence is meant to apply only to parentheses which
    don't change the semantics? Otherwise it seems manifestly false.

  • From James Kuyper@3:633/10 to All on Sun May 31 10:24:30 2026
    On 2026-05-31 07:18, David Brown wrote:
    On 31/05/2026 12:10, Bart wrote:
    On 31/05/2026 10:49, David Brown wrote:
    On 31/05/2026 10:12, Richard Harnden wrote:
    On 31/05/2026 00:43, Keith Thompson wrote:
    ...
    But you might consider "(a + b) + c" to be "parsed differently" than "a
    + (b + c)", because of how a particular compiler implements its parser.
    It's possible that this results in different code for a particular
    compiler, but there is no difference in the meaning for the C language.

    (a + b) + c mandates adding a to b, then adding the result to c. a + (b
    + c) mandates adding b to c then adding the result to a. As far as
    mathematics is concerned, that's the same thing, but in computer math it
    can make a difference if one of the two results in overflow or
    unnecessary loss of precision, and the other does not.
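
    A sketch of the loss-of-precision case, assuming IEEE 754 double
    arithmetic. The function names are hypothetical, and the volatile
    temporaries are there only to make each intermediate sum a real
    double.

    ```c
    /* (a + b) + c */
    double group_left(double a, double b, double c)
    {
        volatile double t = a + b;
        return t + c;
    }

    /* a + (b + c) */
    double group_right(double a, double b, double c)
    {
        volatile double t = b + c;
        return a + t;
    }
    ```

    With a = 1e20, b = -1e20 and c = 1.0, the left grouping gives 1.0,
    while the right grouping gives 0.0 because 1.0 is lost when added to
    -1e20.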

    ...
    Do you have
    an example where they make no difference but people might think they do?


    People might think they affect the order of evaluation, such as when you have function calls :

    u = foo(x) + (foo(y) + foo(z));

    Some people might think the use of parentheses means that "foo(y)" and "foo(z)" are called before "foo(x)", when the order of all these calls
    (and the additions) is unspecified. (Again, a given compiler might be influenced by the parentheses, but the language does not require it.)

    You're correct with regard to the function calls, but the parenthesized addition must be performed first, and the other one second, which may
    make a difference, for the same reasons given in my previous paragraph.

  • From David Brown@3:633/10 to All on Sun May 31 16:29:38 2026
    On 31/05/2026 16:15, James Kuyper wrote:
    On 2026-05-31 05:49, David Brown wrote:
    ...
    Of course. Parentheses do not affect the generated code unless they
    affect the semantics of the expression. (Some people think parentheses
    affect the order of evaluation, but that is not the case for most
    compilers.)

    I assume that last sentence is meant to apply only to parentheses which
    don't change the semantics? Otherwise it seems manifestly false.

    Yes. I thought I was quite clear in this, given that I wrote almost
    exactly that in the previous sentence (which you also quoted above).


  • From David Brown@3:633/10 to All on Sun May 31 17:35:49 2026
    On 31/05/2026 16:24, James Kuyper wrote:
    On 2026-05-31 07:18, David Brown wrote:
    On 31/05/2026 12:10, Bart wrote:
    On 31/05/2026 10:49, David Brown wrote:
    On 31/05/2026 10:12, Richard Harnden wrote:
    On 31/05/2026 00:43, Keith Thompson wrote:
    ...
    But you might consider "(a + b) + c" to be "parsed differently" than "a
    + (b + c)", because of how a particular compiler implements its parser.
    It's possible that this results in different code for a particular
    compiler, but there is no difference in the meaning for the C language.

    (a + b) + c mandates adding a to b, then adding the result to c. a + (b
    + c) mandates adding b to c then adding the result to a. As far as mathematics is concerned, that's the same thing, but in computer math it
    can make a difference if one of the two results in overflow or
    unnecessary loss of precision, and the other does not.

    ...
    Do you have
    an example where they make no difference but people might think they do?

    People might think they affect the order of evaluation, such as when you
    have function calls :

    u = foo(x) + (foo(y) + foo(z));

    Some people might think the use of parentheses means that "foo(y)" and
    "foo(z)" are called before "foo(x)", when the order of all these calls
    (and the additions) is unspecified. (Again, a given compiler might be
    influenced by the parentheses, but the language does not require it.)

    You're correct with regard to the function calls, but the parenthesized addition must be performed first, and the other one second, which may
    make a difference, for the same reasons given in my previous paragraph.

    The parentheses do not dictate the order of evaluation. But you are
    correct - and it's worth pointing out, so thank you for doing that -
    that for floating point operations, the grouping of operations can
    affect the result.

    If you are talking about floating point arithmetic (I was thinking of
    integer arithmetic, but did not specify), then the operations are not necessarily commutative or associative, and the compiler cannot then re-arrange the operations unless it knows that doing so does not affect
    the result.

    But except for specific cases, the order of evaluation - both for the
    values and side-effects - of sub-expressions is unspecified. Indeed,
    they are unsequenced - the evaluations can interleave.

    Usually, both sub-expressions of a binary operator will be evaluated
    before the operator itself, simply because usually the results of the
    operator cannot be calculated until the sub-expression's values are
    known. But this is not a requirement of the language - if the compiler
    can get the same results without doing so, it is free to pick a
    different order. "(a + b) * 0" does not need to evaluate "a", "b", or
    "a + b" at all unless there is a possibility of a side-effect - and it
    can perform the side-effects in any order. "a + (b + c)" can check "a"
    for a trap representation and deal with that before looking at "b" and
    "c" or the results of "b + c", even though it cannot (for floating point operations) re-arrange the code to do "a + b" first.

    If an implementation provides additional semantics to signed integer arithmetic, such as saturating or trapping overflow, then signed integer arithmetic operations are no longer associative. But normal C undefined behaviour on overflow is fully associative (as is wrapping semantics,
    for addition, subtraction and multiplication).

    So for non-associative operations, parentheses can affect the semantics
    - and therefore the most likely (but not required) order of evaluation
    of at least some parts of the sub-expressions. However, that also then
    means we are no longer talking about parentheses that do not affect the semantics of the expression, which is what this thread branch is about.



  • From Tim Rentsch@3:633/10 to All on Sun May 31 09:04:43 2026
    Richard Harnden <richard.nospam@gmail.invalid> writes:

    just write complex expressions in a way that a human can most
    easily understand,

    Unfortunately, (1) different people have different ideas of what
    writing is most easily understood, and (2) different readers have
    different notions of which writings are easily understood, and
    which writings are not so easily understood. To make things
    worse "easily understood" is not a boolean condition, nor is it
    necessarily well-ordered -- "most easily understood" isn't always
    a well-defined quality, even for a given audience.

    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

  • From James Kuyper@3:633/10 to All on Sun May 31 12:46:24 2026
    On 2026-05-31 11:35, David Brown wrote:
    ...
    Usually, both sub-expressions of a binary operator will be evaluated
    before the operator itself, simply because usually the results of the operator cannot be calculated until the sub-expression's values are
    known. But this is not a requirement of the language

    "The value computations of the operands of an operator are sequenced
    before the value computation of the result of the operator." (6.5.1p3)

    - if the compiler
    can get the same results without doing so, it is free to pick a
    different order.

    Correct - but "same results" is crucial; it allows you to invoke the
    "as-if" rule. Otherwise, the sequencing specified by 6.5.1p3 must be
    honored.

    ...
    If an implementation provides additional semantics to signed integer arithmetic, such as saturating or trapping overflow, then signed integer arithmetic operations are no longer associative. But normal C undefined behaviour on overflow is fully associative (as is wrapping semantics,
    for addition, subtraction and multiplication).

    I don't follow that. I believe that overflow is guaranteed for (5 +
    INT_MAX) + INT_MIN, and completely avoided by 5 + (INT_MAX + INT_MIN),
    which differ only by association. Are you saying they both have the
    same chance of overflowing?
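
    The two groupings can be compared without actually executing an
    overflowing signed addition. This sketch assumes the GCC/Clang
    extension __builtin_add_overflow (it is not standard C), and the
    helper names are hypothetical.

    ```c
    #include <limits.h>
    #include <stdbool.h>

    /* True if evaluating (a + b) + c in int would overflow at either
       addition.  Uses the GCC/Clang __builtin_add_overflow extension. */
    bool left_grouping_overflows(int a, int b, int c)
    {
        int t, r;
        return __builtin_add_overflow(a, b, &t)
            || __builtin_add_overflow(t, c, &r);
    }

    /* True if evaluating a + (b + c) in int would overflow. */
    bool right_grouping_overflows(int a, int b, int c)
    {
        int t, r;
        return __builtin_add_overflow(b, c, &t)
            || __builtin_add_overflow(a, t, &r);
    }
    ```

    For 5, INT_MAX and INT_MIN the left grouping reports an overflow and
    the right grouping does not, which is the distinction being asked
    about.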

  • From Bart@3:633/10 to All on Sun May 31 18:11:05 2026
    On 31/05/2026 17:04, Tim Rentsch wrote:
    Richard Harnden <richard.nospam@gmail.invalid> writes:

    just write complex expressions in a way that a human can most
    easily understand,

    Unfortunately, (1) different people have different ideas of what
    writing is most easily understood, and (2) different readers have
    different notions of which writings are easily understood, and
    which writings are not so easily understood. To make things
    worse "easily understood" is not a boolean condition, nor is it
    necessarily well-ordered -- "most easily understood" isn't always
    a well-defined quality, even for a given audience.

    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

    Actual examples of too many parentheses?

    I don't think they are needed for the three main groups (unless you need
    to override the normal behaviour):

    * Arithmetic ops that everyone knows

    * Comparison ops which can be considered a single level
    (in C there are two, but they are rarely chained)

    * Logical AND/OR ops

    Most involved in coding should know the order of these groups and will
    know that AND takes precedence over OR because that is common.

    That leaves the following, which are not used outside programming and
    whose precedence differs across languages:

    << >> & | ^

    There it makes sense to use parentheses to make things clear when any of
    these appear, but only if there is more than one and they are mixed.

    I don't think that is particularly onerous to have to write, or too much clutter to read.

    I wouldn't call anyone stupid for using () in such cases; more pragmatic.
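
    As a hedged illustration of why these operators invite parentheses, here is a minimal C sketch. The function names and values are made up for the example; the only facts relied on are that C ranks == above & and + above <<, so the bare forms group differently from what many readers expect.

```c
/* Hypothetical helper names; only the grouping rules come from C itself. */

/* Parses as x & (mask == value): == binds tighter than & in C. */
int flag_test_bare(int x, int mask, int value)
{
    return x & mask == value;
}

/* What was almost certainly intended. */
int flag_test_parenthesised(int x, int mask, int value)
{
    return (x & mask) == value;
}

/* Parses as 1 << (n + 1): + binds tighter than <<. */
int shift_bare(int n)
{
    return 1 << n + 1;
}

/* Shift first, then add. */
int shift_parenthesised(int n)
{
    return (1 << n) + 1;
}
```

    With x = 6, mask = 2, value = 2 the bare test yields 0 and the parenthesised one yields 1. GCC and Clang typically warn about the bare forms under -Wparentheses.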

    There are some odd ones such as "." (not even considered a binary
    operator in some languages), and assignment, but these also commonly
    behave the same way across languages.

    And then there is ?: :

    a > b ? c : d # (a>b)?c:d
    a + b ? c : d # (a+b)?c:d

    The grouping of the first is probably what is intended. But in the
    second, the intent might have been (a+b)?c:d, or a+(b?c:d); we can't
    be sure that the author didn't make a mistake, or we may not know the
    rule ourselves.

    Another candidate for parentheses when there are leading or trailing
    binary ops involved.
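
    The two readings of the second expression can be compared directly. This is only a sketch with illustrative function names; it relies on nothing beyond the C rule that + binds tighter than ?: .

```c
/* Illustrative names only. */

/* Groups as (a + b) ? c : d, because + binds tighter than ?: */
int cond_as_written(int a, int b, int c, int d)
{
    return a + b ? c : d;
}

/* The other plausible intent, which needs explicit parentheses. */
int cond_inner(int a, int b, int c, int d)
{
    return a + (b ? c : d);
}
```

    For a = 1, b = 0, c = 10, d = 20 the first returns 10 and the second returns 21, so a wrong guess about the grouping changes the result silently.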


  • From BGB@3:633/10 to All on Sun May 31 13:25:51 2026
    On 5/31/2026 4:14 AM, David Brown wrote:
    On 30/05/2026 22:48, BGB wrote:
    On 5/30/2026 6:52 AM, David Brown wrote:
    On 29/05/2026 22:16, BGB wrote:
    On 5/29/2026 6:22 AM, David Brown wrote:
    On 29/05/2026 12:20, BGB wrote:
    On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
    On 2026-05-28 11:57, BGB wrote:
    On 5/28/2026 2:18 AM, Janis Papanagnou wrote:
    On 2026-05-28 01:49, BGB wrote:
    [...]


    But, not really an "easy" way to avoid bloat, other than to
    write code specifically for what cases are relevant; while also
    avoiding needless duplication and copy paste (where, overuse of
    copy/paste can also lead to bloat; along with turning the code
    into an ugly mess).

    Hmm.. - as said, during the very early days there were issues; I
    recall on one platform duplication of template code in more than
    one source unit. And/or some environmental hacks (of the compiler)
    to deposit template code for linking. In the later days I've not
    seen such immature things anymore.


    Possibly, a lot could depend on how one is counting things as well.

    In a lot of cases when using GCC, I end up using:
      -ffunction-sections -fdata-sections -Wl,-gc-sections

    On many targets, "-fdata-sections" can lead to noticeably larger
    and slower code because it effectively eliminates section anchor
    optimisations. It does not negatively affect x86 AFAICS, because
    x86 does not use section anchors.

    <https://godbolt.org/z/zeoq41Y7d>

    With -fsection-anchors (enabled with optimisation on targets that
    support it - generally RISCy load/store architectures), program-lifetime
    variables are kept together in a lump (as though they were in a struct)
    and often addressed by a pointer to that pretend struct. Thus if a
    function accesses two variables "a" and "b", instead of having to load
    the addresses of each of "a" and "b" into separate registers, it loads
    an "anchor" into one register and accesses the variables with
    reg+offset addressing.
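
    A small translation unit of the kind being described might look like the sketch below. The variable and function names are invented for illustration; the comments describe what the compiler is allowed to do, not what any particular GCC version is guaranteed to emit.

```c
/* Illustrative file-scope variables; the names are made up. */
static int a_variable = 1;
static int b_variable = 2;
static int c_variable = 3;

/* Keeps the variables writable so the loads cannot be constant-folded. */
void set_all(int a, int b, int c)
{
    a_variable = a;
    b_variable = b;
    c_variable = c;
}

/* With section anchors the compiler may form one base address and use
   base+0, base+4, base+8. With -fdata-sections each variable lands in
   its own section, so each address has to be formed separately. */
int sum_all(void)
{
    return a_variable + b_variable + c_variable;
}
```

    Compiling this with "gcc -O2 -S" for a load/store target, once with and once without -fdata-sections, is one way to compare the two code shapes.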

    I've seen "-fdata-sections" used regularly in embedded systems - it
    is almost always a bad idea.

    ("-ffunction-sections" is often very helpful to reduce code image
    size, so keep that one.)


    Both seem to help on x86, x86-64, and also on RISC-V, at making
    GCC's output at least sorta space-comparable to my own compilers.

    The merit of "-fdata-sections" is mostly that it eliminates unused
    global variables; whereas "-ffunction-sections" eliminates
    unreachable functions.

    That is the point of them, yes. "-ffunction-sections" can be useful
    at removing unused code from more general code. For
    microcontrollers, SDKs and manufacturers' driver code will normally
    contain a large number of functions that can be eliminated in this
    way, saving a lot of code space.

    However, in practice, "-fdata-sections" rarely eliminates a
    significant amount - most programs do not have large amounts of
    statically-allocated data that is not used. Gcc, and I think most
    other compilers, put the static lifetime data for each translation
    unit in its own section, so if no data from a translation unit is
    used it will be eliminated at link time even with
    -fno-data-sections. And of course it makes no difference for heap
    data or stack data.


    The main place it makes a difference is global arrays from a
    translation unit that is included, but for functions that are not
    included.

    Also functions with large static arrays.


    void SomeFunc()
    {
        static char buf[4096];
        ...
    }

    Where, say, eliminating SomeFunc does not necessarily eliminate buf.

    Yes, if you have such code but want to eliminate it, then
    -fdata-sections would definitely benefit. I have not seen such code in
    practice (at least not with very big static arrays, and that also was
    not an essential part of the program). But of course I have only seen
    a microscopic part of all C code written - if you come across this
    sort of thing, then I appreciate your point.

    (There are several ways to make this more "friendly" to builds that need
    to be compact, such as putting the buffer and/or SomeFunc in a separate
    file or giving it a specific section of its own.)
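
    One way to read "a specific section of its own" is the GCC/Clang section attribute, sketched below. This assumes an ELF target and a GCC-compatible compiler; the section name and the body of SomeFunc are hypothetical and chosen only so the example does something observable.

```c
#include <stddef.h>
#include <string.h>

/* GCC/Clang extension on ELF targets; the section name is hypothetical. */
#define SOMEFUNC_SECTION __attribute__((section(".bss.somefunc_buf")))

/* Moved from function scope to file scope so the placement is explicit. */
static char somefunc_buf[4096] SOMEFUNC_SECTION;

/* Copies msg into the buffer (truncating) and returns the stored length. */
size_t SomeFunc(const char *msg)
{
    size_t n = strlen(msg);
    if (n >= sizeof somefunc_buf)
        n = sizeof somefunc_buf - 1;
    memcpy(somefunc_buf, msg, n);
    somefunc_buf[n] = '\0';
    return n;
}
```

    With the buffer in its own section, a link using -ffunction-sections and section garbage collection can drop it together with SomeFunc, without turning on -fdata-sections for the whole build.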


    I have seen this pattern sometimes, though usually in "medium old" code,
    with newer code more often assuming that the stack is really big and so
    can handle putting 1MB or more in a local array. Though, this is not
    great on a target which doesn't have a huge stack.

    In my case, I usually had 128K as the default stack size in my project.




    In my testing, "-ffunction-sections" is absolutely worth using (on
    targets where code space is relevant - there's no need for PC
    software). On some targets, it may mean a few lost opportunities
    for shorter jump/call instructions between functions in the same
    translation unit, but the cost is rarely anything more than a
    slightly longer link time. But "-fdata-sections" typically gives
    almost no ram space savings, and makes code bigger and slower.

    As I noted, gcc on x86 does not support section anchors, so there is
    not likely to be much code cost for -fdata-sections.

    Where section anchors shine - and where -fdata-sections therefore has
    cost - is when a function needs to access more than one piece of
    static lifetime data defined in the same translation unit (or another
    translation unit if you are using LTO). That happens a lot in
    embedded ARM programming at least. I don't know about RISC-V. If
    the target normally uses a "small data section" for ram (I know this
    is common on PowerPC), then there is, in effect, a program-wide
    section anchor already. So it is possible that relatively few
    targets have section anchors - but the 32-bit ARM on gcc is a vastly
    popular choice in the embedded world, so it is important to
    understand the cost of this compiler flag for that target at least.


    It depends on the way it is built.


    A lot of times though (for non-relocatable static-linked binaries) it
    mostly tends to use AUIPC+LD or AUIPC+ST pairs to access global
    variables. There is a Global Pointer that needs to be loaded when the
    binary is started, unclear what it is used for exactly.


    If you have a global pointer, then it will probably be used for
    gp+offset access to global data, eliminating the need for section anchors.

    I have not used RISC-V, and am not familiar with its details. I can see
    from godbolt that when -fdata-sections is in action and you are loading
    from static lifetime variables, the compiler generates instructions like

        lw a5, a_variable
        lw a4, b_variable
        lw a0, c_variable

    When you do not have "-fdata-sections", it uses anchors:

        lla a4, .LANCHOR0
        lw a5, 0(a4)
        lw a3, 4(a4)
        lw a0, 8(a4)

    From my (limited) understanding, RISC-V cannot use 32-bit absolute
    addressing. So the "lw a5, a_variable" must be a pseudo-instruction -
    using register + offset addressing. If there is a global pointer, then
    presumably that is used here. Alternatively, the pseudo instruction
    might assemble to two real instructions to support the 32-bit address. I
    know both techniques are used in some targets, but don't know about
    RISC-V.


    It can use one of two strategies for these (after breaking up
    pseudo-instructions):
      LUI    a5, HiAddr      //Abs32, Low 2GB only
      LW     a5, LoAddr(a5)
    Or:
      AUIPC  a5, HiAddr      //PC-Rel
      LW     a5, LoAddr(a5)

    IIRC, LLA is similar, just using an ADDI as the second instruction.
    But, yeah, the latter sequence would be more efficient.


    I would expect something different if building with -fPIC or -fPIE, but
    this depends on if it is a version of GCC built with support for these
    (if using a version of GCC built for non-hosted targets, it ignores
    these). Where, one effectively needs different GCC builds for bare-metal
    (like OS kernels) and for hosted Linux development, for whatever bizarre reason...


    Certainly it would surprise me if the "lw a5, a_variable" version were
    more efficient than using anchors - otherwise why would gcc generate
    code with anchors when given a free choice? (Perhaps gcc is not well
    tuned for RISC-V code generation - I am wary of making too many
    assumptions about the processor just from some simple compiler outputs.)


    It is not, it is a 2-op sequence usually.

    Plain RISC-V has a bigger problem with 64-bit constants though,
    generally needs to either load these from memory (more typical in GCC)
    or build them in-place (which needs roughly 6 instructions in RISC-V).

    Say (possible, but GCC doesn't do this):
      LUI   t0, ValHiA
      LUI   t1, valHiB
      ADDI  t0, t0, valLoA
      ADDI  t1, t1, valLoB
      SLLI  t1, t1, 32
      ADD   a0, t0, t1
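
    To make the six-instruction shape concrete, here is a hedged C model of it. The helper names (split32, rv_lui, rv_addi, build_const64) are invented for this sketch; what it assumes about the ISA is only that RV64 LUI sign-extends its 32-bit result and ADDI sign-extends its 12-bit immediate, which is why the halves cannot simply be cut at bit boundaries.

```c
#include <stdint.h>

/* Split a 32-bit value into a LUI part (upper 20 bits) and an ADDI part
   (signed 12-bit immediate). ADDI sign-extends, so when bit 11 is set the
   upper part is bumped by one to compensate. */
static void split32(uint32_t v, uint32_t *hi20, int32_t *lo12)
{
    int32_t lo = (int32_t)(v & 0xFFFu);
    if (lo >= 0x800)
        lo -= 0x1000;
    *lo12 = lo;
    *hi20 = ((v - (uint32_t)lo) >> 12) & 0xFFFFFu;
}

/* Model of RV64 LUI: (imm20 << 12), sign-extended from 32 to 64 bits.
   The uint32_t -> int32_t conversion is assumed to wrap, as it does on
   GCC and Clang. */
static uint64_t rv_lui(uint32_t imm20)
{
    return (uint64_t)(int64_t)(int32_t)(imm20 << 12);
}

/* Model of RV64 ADDI: add a sign-extended 12-bit immediate. */
static uint64_t rv_addi(uint64_t reg, int32_t imm12)
{
    return reg + (uint64_t)(int64_t)imm12;
}

/* Follows the LUI/LUI/ADDI/ADDI/SLLI/ADD shape from the listing above. */
uint64_t build_const64(uint64_t value)
{
    uint32_t hi_a, hi_b;
    int32_t lo_a, lo_b;

    split32((uint32_t)value, &hi_a, &lo_a);
    uint64_t t0 = rv_addi(rv_lui(hi_a), lo_a);   /* LUI + ADDI, low half */

    /* t0 may carry sign-extension bits in its upper half, so the upper
       word is chosen as whatever makes the final ADD come out right. */
    uint32_t upper = (uint32_t)((value - t0) >> 32);
    split32(upper, &hi_b, &lo_b);
    uint64_t t1 = rv_addi(rv_lui(hi_b), lo_b);   /* LUI + ADDI, high half */

    t1 <<= 32;                                   /* SLLI t1, t1, 32 */
    return t0 + t1;                              /* ADD  a0, t0, t1 */
}
```

    The adjustment of the upper word is the part an assembler or compiler has to get right when picking the four immediates; the instruction count itself stays at six.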


    In my case, I have extensions for RV that can turn a lot of this stuff
    into single instructions (albeit with larger 8 and 12 byte encodings).

    In some cases, it can save bytes, for example:
      LW   a1, Disp33s(a0)
    As a 64-bit / 8-byte encoding, vs:
      LUI  t0, DispHi
      ADD  t0, t0, a0
      LW   a1, DispLo(t0)
    Needing 12 bytes.


    My own (more drastic) extensions can save more, by having a few Disp16 instructions, which can access 256K or 512K past GP within a single
    32-bit instruction.


    But, if/when any of this would end up in mainline RISC-V is uncertain.
    Weirdly, there is a lot more emphasis there on big/fancy features (with
    niche applicability), rather than on smaller things that can improve the properties of the base ISA (and that could more generally benefit nearly
    all code built for the ISA).


    (clang does not, apparently, support section anchors as an optimisation
    technique. Both with and without -fdata-sections, on RISC-V it first
    uses two instructions to load ".L_MergedGlobals" into a register and
    then uses that register plus offset to access data.)


    Yeah.

    As noted, BGBCC mostly uses the GP register to access globals for RV
    based targets; sorting them out so that the most common ones come first
    and so are typically a single instruction.

    This is one merit though of not using separate compilation.
    However, the approach used by my compiler is much more memory intensive.




    I don't tend to think of MSVC as a highly optimising compiler - but
    it is not a tool I have much use for, as it does not handle the
    targets I need. When I have sometimes looked at the generated code
    on godbolt, it has not impressed me at all. So it could well fall
    into the "helpful when using a weaker compiler" category.


    Depends on what target I am building for:
      Windows Native: Typically MSVC
      WSL: Usually GCC or Clang
        Seems to have: GCC 13.2.0; Clang 18.1.3
        RISC-V GCC: Also 13.2.0 (also via WSL)
      Linux: Typically GCC

    I rarely much use Cygwin anymore, as it was mostly rendered obsolete
    by WSL (on Win10 or similar).
    Though, Cygwin may still be relevant on Win7 or WinXP systems.


    Cygwin has its own wide range of complications. If you want to use gcc
    targeting native Windows, msys2 and mingw-64 are probably your best bet,
    either compiled natively under msys2 or as a cross-compile from Linux.
    But don't place too much emphasis on my advice, as I very rarely compile
    C or C++ code for Windows - most of my PC target (Linux or Windows)
    coding is in Python.


    Yes, I had used MinGW for a while, before mostly moving over to MSVC for native Windows.

    The tradeoff is mostly:
    MinGW is closer to native for Windows;
    Cygwin could give a closer approximation of Linux on Windows, so one can
    build a lot of Linux software and use "./configure" scripts and similar.


    But, as noted, Cygwin's role was mostly displaced by WSL, which
    effectively runs a Linux userland on Windows.


    There was WSL1, which basically mapped Linux syscalls over to the
    Windows kernel, and WSL2, which runs the Linux kernel in a VM.


    Though, in my case I was using WSL1 as seemingly MS had decided that my
    PC can't do virtualization (and sees it as necessary for WSL2), even
    despite having a CPU that can do so, and it is enabled in the BIOS.


    For BGBCC, it can build both on native Windows and on Linux/WSL
    (though recently noted that this build was broken, mostly by GCC and
    Clang being more pedantic about missing prototypes, and a few
    prototypes were being missed by my function-prototype mining tool).
    Went and fixed this, but haven't posted this yet.


    As for optimizing in MSVC, yeah, it is in the area of not terrible,
    but not super clever either.

    If one expects the sort of high-level code-rewriting cleverness that
    GCC or Clang often does, one will be disappointed.

    But, sometimes, the main "heavy hitter" optimizations are things like
    constant-folding and register allocation, which it does do effectively.

    Though, both MSVC and BGBCC seem to use one sort of strategy for
    register allocation:


    Static assign things to callee-save registers and use remaining
    registers for dynamic allocation within basic-blocks. Variables with
    finite non-overlapping lifetimes (that do not cross basic-block
    boundaries) may potentially share a register (this more generally
    applies to things like temporaries).

    And, GCC and Clang use another: Assign dynamically but carry values
    across basic-block boundaries along control-flow paths.

    Both tend to give different patterns though, and seem to favor
    different types of code.


    [...]


    (I could also note that I make heavy use of templates in C++ code -
    it often leads to smaller and faster results.)

    Curious...


    I had tended to use the "write everything one off for the task at
    hand" approach, but this is a higher-effort approach.


    A lot of code tends to fall into the category of shuffling data around
    or doing simple checks or conversions. It's also common to have wrapper
    functions for libraries to get something nicer, safer and more
    convenient than some API that belongs in the early 1990's. Good C++
    templates (and sometimes even good macros in C) can make the use of
    these things far nicer, and most of the code that the templates appear
    to generate inline in the caller disappears in optimisation.


    OK.



  • From Dan Cross@3:633/10 to All on Sun May 31 19:11:57 2026
    In article <10vemqf$r5qe$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 30/05/2026 13:29, Dan Cross wrote:
    In article <10vd1tu$ekvl$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 29/05/2026 21:56, Keith Thompson wrote:
    [snip]
    Upthread, you asked a question:

    And then the point becomes, if you always add the parentheses, what
    was the point of having that particular precedence level?

    You've made it clear that you were never interested in an answer.

    You said this:

    "You're asking why C is designed the way it is. We could waste a
    great deal of time and effort answering that for you. There are
    numerous documents about the design and history of C, and of
    its ancestor languages. I could provide you with links."

    Actually I'm not asking why C is like that. We're already there.

    I'm saying that there is no value in those extra levels, some people
    think there is, and I'm arguing about that. I was replying to tTh.

    As for my question, what /is/ the point? I'm still waiting!

    To clarify: the question is, what is the point of those levels?

    How is that different from asking "why C is like that"?

    My question is actually independent of C or its history.

    I accept those levels exist. I was asking do they currently serve a
    useful purpose.

    That is a distinction without a difference: I do not see how the
    two can be separated from one another.

    The useful purpose the C rules serve is allowing existing code
    to compile unmodified; the reason that existing code was written
    that way is because that's how the language was defined; the
    language was defined that way due to the aforementioned history.

    If not, people can choose to ignore them when writing C code, for
    example like this, where all () are technically superfluous:

    crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];

    And they can choose to not adopt them when devising new languages,
    however many still do faithfully recreate the same pattern, with a few
    notable exceptions such as Go lang.
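
    The claim that every parenthesis in the quoted CRC line is superfluous can be checked mechanically. The sketch below is an assumption-laden illustration: the contents of s_crc32 are not given in the thread, so the table is generated here from the reflected CRC-32 polynomial 0xEDB88320, and the surrounding loop is a plausible nibble-at-a-time CRC rather than the original author's code.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-entry nibble table; generated here because the thread does not show it. */
static uint32_t s_crc32[16];

static void init_table(void)
{
    for (uint32_t i = 0; i < 16; i++) {
        uint32_t c = i;
        for (int k = 0; k < 4; k++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        s_crc32[i] = c;
    }
}

/* Fully parenthesised, as in the quoted line. */
uint32_t crc32_parenthesised(const unsigned char *p, size_t n)
{
    uint32_t crcu32 = ~(uint32_t)0;
    init_table();
    while (n--) {
        uint8_t b = *p++;
        crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
        crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b >> 4)];
    }
    return ~crcu32;
}

/* Same expressions with every optional parenthesis removed:
   >> binds tighter than &, and & binds tighter than ^. */
uint32_t crc32_bare(const unsigned char *p, size_t n)
{
    uint32_t crcu32 = ~(uint32_t)0;
    init_table();
    while (n--) {
        uint8_t b = *p++;
        crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];
        crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b >> 4];
    }
    return ~crcu32;
}
```

    Both functions compute the same value for any input, which supports the "technically superfluous" claim; whether the bare version is as easy to read is the question the thread is arguing about.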

    Languages should, presumably, do what makes sense for them.
    Lots of languages echo parts of C's syntax where that has proven
    to be convenient and popular; curly braces for grouping might be
    an example there. Others have not, or have purposely discarded
    parts of C syntax that have proven awkward or unpopular. An
    example there might be the variable declaration syntax, or the
    structure of `typedef`.

    I can't think of many languages that keep the exact same parsing
    rules with respect to operator precedence. You mentioned Go;
    neither Rust nor Zig follow C's rules, either.

    - Dan C.


  • From Dan Cross@3:633/10 to All on Sun May 31 19:34:00 2026
    In article <10vhq39$1lpo1$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 31/05/2026 17:04, Tim Rentsch wrote:
    Richard Harnden <richard.nospam@gmail.invalid> writes:

    just write complex expressions in a way that a human can most
    easily understand,

    Unfortunately, (1) different people have different ideas of what
    writing is most easily understood, and (2) different readers have
    different notions of which writings are easily understood, and
    which writings are not so easily understood. To make things
    worse "easily understood" is not a boolean condition, nor is it
    necessarily well-ordered -- "most easily understood" isn't always
    a well-defined quality, even for a given audience.

    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

    Actual examples of too many parentheses?

    I was working on some code in a Unix-like kernel the other day
    where the original author wrote, `if ((a == 0) && (b == 1))`
    type expressions. The inner parentheses were totally
    superfluous. I removed them.

    As Tim wrote, there's obviously a balance to be struck between
    excessive verbosity and extreme concision. Over time,
    programmers working in a language (or a code base) do tend to
    internalize that some operations are more frequently
    misunderstood than others, and parenthesize accordingly.

    - Dan C.


  • From David Brown@3:633/10 to All on Sun May 31 22:14:48 2026
    On 31/05/2026 20:25, BGB wrote:
    On 5/31/2026 4:14 AM, David Brown wrote:
    On 30/05/2026 22:48, BGB wrote:
    On 5/30/2026 6:52 AM, David Brown wrote:
    On 29/05/2026 22:16, BGB wrote:
    On 5/29/2026 6:22 AM, David Brown wrote:
    On 29/05/2026 12:20, BGB wrote:
    On 5/29/2026 2:52 AM, Janis Papanagnou wrote:
    On 2026-05-28 11:57, BGB wrote:
    On 5/28/2026 2:18 AM, Janis Papanagnou wrote:
    On 2026-05-28 01:49, BGB wrote:
    [...]

    Also functions with large static arrays.


    void SomeFunc()
    {
        static char buf[4096];
        ...
    }

    Where, say, eliminating SomeFunc does not necessarily eliminate buf.

    Yes, if you have such code but want to eliminate it, then
    -fdata-sections would definitely benefit. I have not seen such code in
    practice (at least not with very big static arrays, and that also was
    not an essential part of the program). But of course I have only seen
    a microscopic part of all C code written - if you come across this
    sort of thing, then I appreciate your point.

    (There are several ways to make this more "friendly" to builds that
    need to be compact, such as putting the buffer and/or SomeFunc in a
    separate file or giving it a specific section of its own.)


    I have seen this pattern sometimes, though usually in "medium old" code,
    with newer code more often assuming that the stack is really big and so
    can handle putting 1MB or more in a local array. Though, this is not
    great on a target which doesn't have a huge stack.

    In my case, I usually had 128K as the default stack size in my project.


    OK. My code typically has a stack of 1 KB or less per thread. It is
    not inconceivable that I would have a static array like this, but it
    would not be in code that is likely to be unused.


    Where section anchors shine - and where -fdata-sections therefore
    has cost - is when a function needs to access more than one piece of
    static lifetime data defined in the same translation unit (or
    another translation unit if you are using LTO). That happens a lot
    in embedded ARM programming at least. I don't know about RISC-V.
    If the target normally uses a "small data section" for ram (I know
    this is common on PowerPC), then there is, in effect, a program-wide
    section anchor already. So it is possible that relatively few
    targets have section anchors - but the 32-bit ARM on gcc is a vastly
    popular choice in the embedded world, so it is important to
    understand the cost of this compiler flag for that target at least.


    It depends on the way it is built.


    A lot of times though (for non-relocatable static-linked binaries) it
    mostly tends to use AUIPC+LD or AUIPC+ST pairs to access global
    variables. There is a Global Pointer that needs to be loaded when the
    binary is started, unclear what it is used for exactly.


    If you have a global pointer, then it will probably be used for
    gp+offset access to global data, eliminating the need for section
    anchors.

    I have not used RISC-V, and am not familiar with its details. I can
    see from godbolt that when -fdata-sections is in action and you are
    loading from static lifetime variables, the compiler generates
    instructions like

        lw a5, a_variable
        lw a4, b_variable
        lw a0, c_variable

    When you do not have "-fdata-sections", it uses anchors:

        lla a4, .LANCHOR0
        lw a5, 0(a4)
        lw a3, 4(a4)
        lw a0, 8(a4)

    From my (limited) understanding, RISC-V cannot use 32-bit absolute
    addressing. So the "lw a5, a_variable" must be a pseudo-instruction -
    using register + offset addressing. If there is a global pointer,
    then presumably that is used here. Alternatively, the pseudo
    instruction might assemble to two real instructions to support the
    32-bit address. I know both techniques are used in some targets, but
    don't know about RISC-V.


    It can use one of two strategies for these (after breaking up
    pseudo-instructions):
      LUI    a5, HiAddr      //Abs32, Low 2GB only
      LW     a5, LoAddr(a5)
    Or:
      AUIPC  a5, HiAddr      //PC-Rel
      LW     a5, LoAddr(a5)

    IIRC, LLA is similar, just using an ADDI as the second instruction.
    But, yeah, the latter sequence would be more efficient.


    Thanks. That clears things up for me. And in particular, it shows that section anchors (and therefore no "-fdata-sections") can make a
    significant difference to gcc code for RISC-V.


    I would expect something different if building with -fPIC or -fPIE, but
    this depends on if it is a version of GCC built with support for these
    (if using a version of GCC built for non-hosted targets, it ignores
    these). Where, one effectively needs different GCC builds for bare-metal (like OS kernels) and for hosted Linux development, for whatever bizarre reason...


    Certainly it would surprise me if the "lw a5, a_variable" version were
    more efficient than using anchors - otherwise why would gcc generate
    code with anchors when given a free choice? (Perhaps gcc is not well
    tuned for RISC-V code generation - I am wary of making too many
    assumptions about the processor just from some simple compiler outputs.)


    It is not, it is a 2-op sequence usually.

    Plain RISC-V has a bigger problem with 64-bit constants though,
    generally needs to either load these from memory (more typical in GCC)
    or build them in-place (which needs roughly 6 instructions in RISC-V).

    Say (possible, but GCC doesn't do this):
      LUI   t0, ValHiA
      LUI   t1, valHiB
      ADDI  t0, t0, valLoA
      ADDI  t1, t1, valLoB
      SLLI  t1, t1, 32
      ADD   a0, t0, t1


    In my case, I have extensions for RV that can turn a lot of this stuff
    into single instructions (albeit with larger 8 and 12 byte encodings).

    In some cases, it can save bytes, for example:
      LW   a1, Disp33s(a0)
    As a 64-bit / 8-byte encoding, vs:
      LUI  t0, DispHi
      ADD  t0, t0, a0
      LW   a1, DispLo(t0)
    Needing 12 bytes.


    My own (more drastic) extensions can save more, by having a few Disp16
    instructions, which can access 256K or 512K past GP within a single
    32-bit instruction.


    But, if/when any of this would end up in mainline RISC-V is uncertain. Weirdly, there is a lot more emphasis there on big/fancy features (with niche applicability), rather than on smaller things that can improve the properties of the base ISA (and that could more generally benefit nearly
    all code built for the ISA).



    [...]


    Cygwin has its own wide range of complications. If you want to use
    gcc targeting native Windows, msys2 and mingw-64 are probably your
    best bet, either compiled natively under msys2 or as a cross-compile
    from Linux. But don't place too much emphasis on my advice, as I very
    rarely compile C or C++ code for Windows - most of my PC target (Linux
    or Windows) coding is in Python.


    Yes, I had used MinGW for a while, before mostly moving over to MSVC for native Windows.

    The tradeoff is mostly:
    MinGW is closer to native for Windows;
    Cygwin could give a closer approximation of Linux on Windows, so one can build a lot of Linux software and use "./configure" scripts and similar.


    Note that MinGW and Mingw-w64 are very, very different. (And the corresponding environments and utility collections, msys and msys2, are equally different.) Mingw-w64, as I understand it, is somewhat of a
    balance between old MinGW and Cygwin in being close to native for most purposes, but providing more POSIX compliance than MinGW. It is also
    much newer, much better maintained, with modern language support in its
    tools (last I heard, with MinGW you did not even get C99 support in the standard library). And of course it has 64-bit support.

    You may well find WSL or MSVC to be a better choice for your
    requirements, but don't mistake Mingw-w64 for MinGW.


    But, as noted, Cygwin's role was mostly displaced by WSL, which
    effectively runs a Linux userland on Windows.


    There was WSL1, which basically mapped Linux syscalls over to the
    Windows kernel, and WSL2, which runs the Linux kernel in a VM.


    Though, in my case I was using WSL1 as seemingly MS had decided that my
    PC can't do virtualization (and sees it as necessary for WSL2), even
    despite having a CPU that can do so, and it is enabled in the BIOS.



  • From David Brown@3:633/10 to All on Sun May 31 22:24:51 2026
    On 31/05/2026 18:46, James Kuyper wrote:
    On 2026-05-31 11:35, David Brown wrote:
    ...
    Usually, both sub-expressions of a binary operator will be evaluated
    before the operator itself, simply because usually the results of the
    operator cannot be calculated until the sub-expression's values are
    known. But this is not a requirement of the language

    "The value computations of the operands of an operator are sequenced
    before the value computation of the result of the operator." (6.5.1p3)

    - if the compiler
    can get the same results without doing so, it is free to pick a
    different order.

    Correct - but "same results" is crucial; it allows you to invoke the
    "as-if" rule. Otherwise, the sequencing specified by 6.5.1p3 must be
    honored.


    OK.

    ...
    If an implementation provides additional semantics to signed integer
    arithmetic, such as saturating or trapping overflow, then signed integer
    arithmetic operations are no longer associative. But normal C undefined
    behaviour on overflow is fully associative (as is wrapping semantics,
    for addition, subtraction and multiplication).

    I don't follow that. I believe that overflow is guaranteed for (5 +
    INT_MAX) + INT_MIN, and completely avoided by 5 + (INT_MAX + INT_MIN),
    which differ only by association. Are you saying they both have the
    same chance of overflowing?

    No - I see now what you are saying. Overflow is never guaranteed to do
    anything, including to exist, because it is UB. So the compiler can
    happily treat "(5 + INT_MAX) + INT_MIN" as though you had written "5 +
    (INT_MAX + INT_MIN)". It can freely re-arrange an expression like this
    that has a potential overflow into one without risk of overflow, as long
    as the same results are given for all values that do not overflow. (The
    overflow is not part of the observable behaviour.) But it cannot
    re-arrange the other way unless it knows that intermediary overflows
    have no effect. (And the compiler usually does know this.)

    What I am trying to say - but described inaccurately - is that
    expressions can be re-arranged by the compiler without preserving
    overflow behaviour, but it must avoid introducing /new/ overflow risks
    if they can affect the results. It may, however, introduce new
    intermediary overflows if they do not affect the results.
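
    To make the two groupings concrete, here is a minimal sketch that checks
    each intermediate addition for overflow without ever performing an
    overflowing addition. The helper names (add_overflows,
    left_grouping_overflows, right_grouping_overflows) are made up for
    illustration and are not from the thread or from any library.

```c
#include <limits.h>
#include <stdbool.h>

/* True if a + b would overflow int. The check itself never overflows:
   INT_MAX - b is safe for b > 0, and INT_MIN - b is safe for b < 0. */
static bool add_overflows(int a, int b)
{
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}

/* Does (a + b) + c overflow at either step in the abstract machine? */
static bool left_grouping_overflows(int a, int b, int c)
{
    if (add_overflows(a, b))
        return true;
    return add_overflows(a + b, c);
}

/* Does a + (b + c) overflow at either step in the abstract machine? */
static bool right_grouping_overflows(int a, int b, int c)
{
    if (add_overflows(b, c))
        return true;
    return add_overflows(a, b + c);
}
```

    With a = 5, b = INT_MAX, c = INT_MIN the left grouping reports an
    overflow and the right grouping does not, which is the asymmetry being
    discussed: the source-level grouping decides whether the behaviour is
    defined, even if the generated code computes the sum in another order.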




  • From James Kuyper@3:633/10 to All on Sun May 31 18:26:53 2026
    On 2026-05-31 16:24, David Brown wrote:
    On 31/05/2026 18:46, James Kuyper wrote:
    On 2026-05-31 11:35, David Brown wrote:
    ...
    If an implementation provides additional semantics to signed integer
    arithmetic, such as saturating or trapping overflow, then signed integer
    arithmetic operations are no longer associative. But normal C undefined
    behaviour on overflow is fully associative (as is wrapping semantics,
    for addition, subtraction and multiplication).

    I don't follow that. I believe that overflow is guaranteed for (5 +
    INT_MAX) + INT_MIN, and completely avoided by 5 + (INT_MAX + INT_MIN),
    which differ only by association. Are you saying they both have the
    same chance of overflowing?

    No - I see now what you are saying. Overflow is never guaranteed to do
    anything, including to exist, because it is UB. So the compiler can

    I only meant that overflow was guaranteed, and that the behavior was
    therefore guaranteed to be undefined. I didn't mean to imply that any
    particular behavior was guaranteed.

    happily treat "(5 + INT_MAX) + INT_MIN" as though you had written "5 +
    (INT_MAX + INT_MIN)". It can freely re-arrange an expression like this
    that has a potential overflow into one without risk of overflow, as long
    as the same results are given for all values that do not overflow. (The
    overflow is not part of the observable behaviour.) But it cannot
    re-arrange the other way unless it knows that intermediary overflows
    have no effect. (And the compiler usually does know this.)

    That's what I was mainly concerned about - if I've carefully arranged to
    make sure that overflow is impossible, I'd be rather upset by a compiler
    which, because "normal C undefined behaviour on overflow is fully
    associative", rearranges the associations in my code to make overflow
    possible. I interpreted that comment as meaning that "whether or not the
    behavior is undefined is fully associative". I guess that what you
    actually meant was "if the behavior is undefined, the compiler is free
    to rearrange the associations".

  • From Keith Thompson@3:633/10 to All on Sun May 31 15:54:47 2026
    David Brown <david.brown@hesbynett.no> writes:
    On 31/05/2026 16:24, James Kuyper wrote:
    On 2026-05-31 07:18, David Brown wrote:
    [...]
    People might think they affect the order of evaluation, such as when you
    have function calls :

    u = foo(x) + (foo(y) + foo(z));

    Some people might think the use of parentheses means that "foo(y)" and
    "foo(z)" are called before "foo(x)", when the order of all these calls
    (and the additions) is unspecified. (Again, a given compiler might be
    influenced by the parentheses, but the language does not require it.)

    You're correct with regard to the function calls, but the
    parenthesized addition must be performed first, and the other one
    second, which may make a difference, for the same reasons given in my
    previous paragraph.

    The parentheses do not dictate the order of evaluation. But you are
    correct - and it's worth pointing out, so thank you for doing that -
    that for floating point operations, the grouping of operations can
    affect the result.

    The parentheses do not dictate the order of evaluation *of the
    operands*. Each "+" can be evaluated (the addition performed)
    only after the values of its operands are known. But regardless
    of parentheses or operator precedence, the three operands foo(x),
    foo(y), and foo(z) can be evaluated in any of 6 possible orders.
    (It's different when you have operations like "&&", "||", and ",",
    which impose additional sequence points.)
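
    A small sketch of that point, assuming a hypothetical foo() that records
    a tag each time it is called (the logging helper and the tags are
    invented here, not part of the thread's code). The only portable way to
    fix the call order is to sequence the calls in separate statements.

```c
#include <stddef.h>

/* Records the order in which foo() is called, as a short string. */
static char call_log[8];
static size_t call_count;

static int foo(char tag, int value)
{
    if (call_count < sizeof call_log - 1)
        call_log[call_count++] = tag;
    call_log[call_count] = '\0';
    return value;
}

/* The three calls may happen in any of the 6 possible orders;
   the parentheses only say which additions are grouped together. */
static int sum_unspecified_order(void)
{
    call_count = 0;
    return foo('x', 1) + (foo('y', 2) + foo('z', 3));
}

/* Separate statements are sequenced, so the call order is y, z, x. */
static int sum_forced_order(void)
{
    call_count = 0;
    int fy = foo('y', 2);
    int fz = foo('z', 3);
    int fx = foo('x', 1);
    return fx + (fy + fz);
}
```

    After sum_forced_order() the log always reads "yzx". After
    sum_unspecified_order() it holds some permutation of x, y and z, and
    which one you get can differ between compilers or optimisation levels.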

    If you are talking about floating point arithmetic (I was thinking of
    integer arithmetic, but did not specify), then the operations are not
    necessarily commutative or associative, and the compiler cannot then
    re-arrange the operations unless it knows that doing so does not
    affect the result.

    It's not just floating-point. Signed integer overflow is also relevant.

    (INT_MIN + INT_MAX) + 1 is well defined. (INT_MIN + INT_MAX) +1
    is equivalent, and is also well defined. INT_MIN + (INT_MAX +1)
    has undefined behavior.

    But except for specific cases, the order of evaluation - both for the
    values and side-effects - of sub-expressions is unspecified. Indeed,
    they are unsequenced - the evaluations can interleave.

    Usually, both sub-expressions of a binary operator will be evaluated
    before the operator itself, simply because usually the results of the
    operator cannot be calculated until the sub-expression's values are
    known. But this is not a requirement of the language - if the
    compiler can get the same results without doing so, it is free to pick
    a different order. "(a + b) * 0" does not need to evaluate "a", "b",
    or "a + b" at all unless there is a possibility of a side-effect - and
    it can perform the side-effects in any order. "a + (b + c)" can check
    "a" for a trap representation and deal with that before looking at "b"
    and "c" or the results of "b + c", even though it cannot (for floating
    point operations) re-arrange the code to do "a + b" first.

    Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
    required to do so, and (INT_MAX + 1) * 0 still has undefined
    behavior. Undefined behavior is determined by the rules of the
    abstract machine *without* any adjustments permitted by the as-if
    rule.

    [...]

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Tim Rentsch@3:633/10 to All on Sun May 31 16:08:04 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:

    In [...] early C, `|` and `&` were logical operators. The
    short-circuiting `||` and `&&` came later, but the usage low
    precedence for `|` and `&` was already baked in.

    That's the point: the precedence reflects the original use as
    boolean operators, not how things evolved for use almost purely
    as bitwise operators.

    Surely even in pre-K&R C the & and | operators were used for
    bitwise-and and bitwise-or as well as logical connectors.

  • From Keith Thompson@3:633/10 to All on Sun May 31 16:32:17 2026
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In [...] early C, `|` and `&` were logical operators. The
    short-circuiting `||` and `&&` came later, but the usage low
    precedence for `|` and `&` was already baked in.

    That's the point: the precedence reflects the original use as
    boolean operators, not how things evolved for use almost purely
    as bitwise operators.

    Surely even in pre-K&R C the & and | operators were used for
    bitwise-and and bitwise-or as well as logical connectors.

    They were used for both (and that was the problem).

    The "original use" being referred to is in BCPL and B, and in *very*
    early C.

    Reference:
    https://www.nokia.com/bell-labs/about/dennis-m-ritchie/chist.pdf

    Neonatal C

    Rapid changes continued after the language had been named,
    for example the introduction of the && and || operators. In
    BCPL and B, the evaluation of expressions depends on context:
    within if and other conditional statements that compare an
    expression's value with zero, these languages place a special
    interpretation on the and (&) and or (|) operators. In ordinary
    contexts, they operate bitwise, but in the B statement

    if (e1 & e2) ...

    the compiler must evaluate e1 and if it is non-zero, evaluate e2,
    and if it too is non-zero, elaborate the statement dependent on
    the if. The requirement descends recursively on & and | operators
    within e1 and e2. The short-circuit semantics of the Boolean
    operators in such 'truth-value' context seemed desirable,
    but the overloading of the operators was difficult to explain
    and use. At the suggestion of Alan Snyder, I introduced the &&
    and || operators to make the mechanism more explicit.

    Their tardy introduction explains an infelicity of C's
    precedence rules. In B one writes

    if (a==b & c) ...

    to check whether a equals b and c is non-zero; in such a
    conditional expression it is better that & have lower precedence
    than ==. In converting from B to C, one wants to replace & by
    && in such a statement; to make the conversion less painful,
    we decided to keep the precedence of the & operator the same
    relative to ==, and merely split the precedence of && slightly
    from &. Today, it seems that it would have been preferable to
    move the relative precedences of & and ==, and thereby simplify
    a common C idiom: to test a masked value against another value,
    one must write

    if ((a&mask) == b) ...

    where the inner parentheses are required but easily forgotten.
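
    The idiom from the quoted passage can be sketched as two functions. The
    function names are invented for this illustration. The second one
    spells out, with explicit parentheses, how "a & mask == b" is grouped
    when the inner parentheses are forgotten.

```c
#include <stdbool.h>

/* Intended test: do the masked bits of a equal b? */
static bool masked_equals(unsigned a, unsigned mask, unsigned b)
{
    return (a & mask) == b;
}

/* What `a & mask == b` means in C: == binds tighter than &, so a is
   ANDed with the 0-or-1 result of the comparison. */
static bool masked_equals_unparenthesized(unsigned a, unsigned mask, unsigned b)
{
    return a & (mask == b);
}
```

    For a = 0x12, mask = 0x0F, b = 0x02 the first function returns true and
    the second returns false, so the two forms disagree in both directions
    depending on the inputs.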

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Tim Rentsch@3:633/10 to All on Sun May 31 17:12:23 2026
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    cross@spitfire.i.gajendra.net (Dan Cross) writes:

    In [...] early C, `|` and `&` were logical operators. The
    short-circuiting `||` and `&&` came later, but the usage low
    precedence for `|` and `&` was already baked in.

    That's the point: the precedence reflects the original use as
    boolean operators, not how things evolved for use almost purely
    as bitwise operators.

    Surely even in pre-K&R C the & and | operators were used for
    bitwise-and and bitwise-or as well as logical connectors.

    They were used for both [...]

    That's all I was saying.

  • From Lawrence D'Oliveiro@3:633/10 to All on Mon Jun 1 00:33:03 2026
    On Sun, 31 May 2026 10:59:42 +0100, Bart wrote:

    How about when you intend it to be: '(a << b) + c'?

    I gave real-world examples of the usage that you asked for, how about
    you do the same?

  • From Bart@3:633/10 to All on Mon Jun 1 02:26:17 2026
    On 01/06/2026 01:33, Lawrence D'Oliveiro wrote:
    On Sun, 31 May 2026 10:59:42 +0100, Bart wrote:

    How about when you intend it to be: '(a << b) + c'?

    I gave real-world examples of the usage that you asked for, how about
    you do the same?

    Can do, but they wouldn't be in C. The problems are when I port to C or
    port from C or simply try to understand it.

    Examples:

    hsum := hsum << 4 - hsum + c

    lxvalue := lxvalue << 8 + (pstart+i-1)^

    macro makemodrm(mode, opc, rm) = mode<<6 + opc<<3 + rm

    genxrm(0xD9 + mf << 1, code, a)

    am.sib := scaletable[scale]<<6 + index<<3 + base

    p++^ := r>>5<<5 + g>>5<<2 + b>>6

    scale := (sib>>6 + 1| 1, 2, 4, 8 |0)

    hdr.usedc[i+1] := t>>4 + 1

    index := r<<5 + g<<2 + b

    rgb := b<<16 + g<<8 + r

    shortopc := ttt<<3 + rmopc

    Here, '<< >>' have same precedence as '* /'.

    Notice I like to use '+' rather than '|', which in my syntax is 'ior'.
    But '+ -' and 'iand ior ixor' all have the same precedence so I'd never
    have to worry about it anyway

    In C, '+ -' and '& | ^' are on opposite sides of '<< >>'.
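
    As a sketch of what that means when porting, take the
    "index := r<<5 + g<<2 + b" line. The C function names below are
    hypothetical. The second function writes out, with explicit
    parentheses, how C groups the same token sequence if it is copied over
    unchanged.

```c
/* The grouping intended in the original syntax, where << binds
   tighter than +. */
static unsigned pack_index_intended(unsigned r, unsigned g, unsigned b)
{
    return (r << 5) + (g << 2) + b;
}

/* How C groups `r << 5 + g << 2 + b`: + binds tighter than <<, and
   << associates left to right. Only call this with small g and b so
   the shift counts stay below the width of unsigned. */
static unsigned pack_index_as_c_parses(unsigned r, unsigned g, unsigned b)
{
    return (r << (5 + g)) << (2 + b);
}
```

    With r = g = b = 1 the intended grouping gives 37 and the C grouping
    gives 512, so a port has to add the parentheses (or switch to '|',
    which sits below the shifts in C).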

    And yes, sometimes I need to use parentheses to override; it is no big deal.

    Generally, C seems to need at least 20% more parentheses (as a
    proportion of all tokens) than code written in my syntax despite all
    these extra levels to help you write fewer.

    Bear in mind that C uses {...} to enclose data where I'd need to use
    (...), but those {} aren't counted.


  • From Tim Rentsch@3:633/10 to All on Sun May 31 19:10:13 2026
    Bart <bc@freeuk.com> writes:

    On 31/05/2026 17:04, Tim Rentsch wrote:

    Richard Harnden <richard.nospam@gmail.invalid> writes:

    just write complex expressions in a way that a human can most
    easily understand,

    Unfortunately, (1) different people have different ideas of what
    writing is most easily understood, and (2) different readers have
    different notions of which writings are easily understood, and
    which writings are not so easily understood. To make things
    worse "easily understood" is not a boolean condition, nor is it
    necessarily well-ordered -- "most easily understood" isn't always
    a well-defined quality, even for a given audience.

    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

    Actual examples of too many parentheses?

    The point of my comment is that either too many or too few is a
    subjective judgment, not an objective one.

    And then there is ?: :

    a > b ? c : d # (a>b)?c:d
    a + b ? c : d # (a+b)?c:d

    The grouping of the first is probably what is intended. But in the
    second, the intent might have been (a+b)?c:d, or a+(b?c:d); we don't
    know for sure that the author didn't make a mistake or we don't know
    ourselves.

    This example is so addlebrained that it's hard to imagine anyone
    being confused about it. Or that it's worth any expenditure of
    thought wondering what to do about people who are.

  • From David Brown@3:633/10 to All on Mon Jun 1 08:28:53 2026
    On 01/06/2026 00:26, James Kuyper wrote:
    On 2026-05-31 16:24, David Brown wrote:
    On 31/05/2026 18:46, James Kuyper wrote:
    On 2026-05-31 11:35, David Brown wrote:
    ...
    If an implementation provides additional semantics to signed integer
    arithmetic, such as saturating or trapping overflow, then signed integer
    arithmetic operations are no longer associative. But normal C undefined
    behaviour on overflow is fully associative (as is wrapping semantics,
    for addition, subtraction and multiplication).

    I don't follow that. I believe that overflow is guaranteed for (5 +
    INT_MAX) + INT_MIN, and completely avoided by 5 + (INT_MAX + INT_MIN),
    which differ only by association. Are you saying they both have the
    same chance of overflowing?

    No - I see now what you are saying. Overflow is never guaranteed to do
    anything, including to exist, because it is UB. So the compiler can

    I only meant that overflow was guaranteed, and that the behavior was
    therefore guaranteed to be undefined. I didn't mean to imply that any
    particular behavior was guaranteed.

    That's a useful distinction.


    happily treat "(5 + INT_MAX) + INT_MIN" as though you had written "5 +
    (INT_MAX + INT_MIN)". It can freely re-arrange an expression like this
    that has a potential overflow into one without risk of overflow, as long
    as the same results are given for all values that do not overflow. (The
    overflow is not part of the observable behaviour.) But it cannot
    re-arrange the other way unless it knows that intermediary overflows
    have no effect. (And the compiler usually does know this.)

    That's what I was mainly concerned about - if I've carefully arranged to
    make sure that overflow is impossible, I'd be rather upset by a compiler
    which, because "normal C undefined behaviour on overflow is fully
    associative", rearranges the associations in my code to make overflow
    possible.

    I am quite happy for the compiler to make such re-arrangements, as long
    as it knows that doing so gives the same results on the target. I too
    would be most upset if it made these re-arrangements when I had the flag
    "-fsanitize=signed-integer-overflow" (or equivalent on other compilers)
    in action and halted my program when it "discovered" an overflow bug.
    But I am quite happy for it to make these re-arrangements for the code
    it generates - I am always happy with more efficient object code that
    follows the "as if" rule.

    (If the expression is floating point, or the target has unusual
    capabilities, so that an overflow is detectable, then of course such
    re-arrangements are not valid as they would break the "as-if" rule.)


    I interpreted that comment as meaning that "whether or not the
    behavior is undefined is fully associative". I guess that what you
    actually meant was "if the behavior is undefined, the compiler is free
    to rearrange the associations".

    That is better phrasing than I used. Thanks.



  • From David Brown@3:633/10 to All on Mon Jun 1 08:39:03 2026
    On 01/06/2026 00:54, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 31/05/2026 16:24, James Kuyper wrote:
    On 2026-05-31 07:18, David Brown wrote:
    [...]
    People might think they affect the order of evaluation, such as when you
    have function calls :

    u = foo(x) + (foo(y) + foo(z));

    Some people might think the use of parentheses means that "foo(y)" and
    "foo(z)" are called before "foo(x)", when the order of all these calls
    (and the additions) is unspecified. (Again, a given compiler might be
    influenced by the parentheses, but the language does not require it.)

    You're correct with regard to the function calls, but the
    parenthesized addition must be performed first, and the other one
    second, which may make a difference, for the same reasons given in my
    previous paragraph.

    The parentheses do not dictate the order of evaluation. But you are
    correct - and it's worth pointing out, so thank you for doing that -
    that for floating point operations, the grouping of operations can
    affect the result.

    The parentheses do not dictate the order of evaluation *of the
    operands*. Each "+" can be evaluated (the addition performed)
    only after the values of its operands are known. But regardless
    of parentheses or operator precedence, the three operands foo(x),
    foo(y), and foo(z) can be evaluated in any of 6 possible orders.
    (It's different when you have operations like "&&", "||", and ",",
    which impose additional sequence points.)


    Yes. And I have seen code where the author believed that the
    parentheses /did/ affect the order of evaluation of the "foo" calls. It
    is definitely a misunderstanding people can make, though I of course
    have no idea how often people make it.

    If you are talking about floating point arithmetic (I was thinking of
    integer arithmetic, but did not specify), then the operations are not
    necessarily commutative or associative, and the compiler cannot then
    re-arrange the operations unless it knows that doing so does not
    affect the result.

    It's not just floating-point. Signed integer overflow is also relevant.

    (INT_MIN + INT_MAX) + 1 is well defined. (INT_MIN + INT_MAX) +1
    is equivalent, and is also well defined. INT_MIN + (INT_MAX +1)
    has undefined behavior.

    Compilers can re-arrange integer arithmetic, despite new overflows, if
    they know the result is the same. On pretty much any current processor,
    a compiler generating code for integer "a + b + c" could do the
    additions in any order - treating the operations as commutative and
    fully associative. The final result will be the same in every case
    where the original expression did not overflow (i.e., every case with
    defined behaviour).

    If the implementation makes overflow detectable in some way (such as by
    "-fsanitize=signed-integer-overflow"), or the hardware does something
    that gives different results from overflow (saturating, hardware traps),
    then it's an entirely different matter.
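
    One way to see why the order does not matter on wrapping hardware is to
    model the machine's addition with unsigned arithmetic, which C defines
    as wrapping modulo UINT_MAX + 1. This is only a model of what typical
    two's complement hardware does, and the function names are invented for
    the sketch.

```c
#include <limits.h>

/* (a + b) + c with wrapping, as an ordinary add instruction behaves. */
static unsigned wrapping_sum_left(unsigned a, unsigned b, unsigned c)
{
    return (a + b) + c;
}

/* a + (b + c) with wrapping: same bits, whatever the intermediate value. */
static unsigned wrapping_sum_right(unsigned a, unsigned b, unsigned c)
{
    return a + (b + c);
}
```

    Feeding in the bit patterns of INT_MIN, INT_MAX and 1 (converted to
    unsigned, which is well defined) gives 0 from both functions, even
    though the right-hand grouping passes through the pattern for
    INT_MAX + 1 on the way. That is the sense in which a compiler may
    regroup signed additions in the code it generates.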


    But except for specific cases, the order of evaluation - both for the
    values and side-effects - of sub-expressions is unspecified. Indeed,
    they are unsequenced - the evaluations can interleave.

    Usually, both sub-expressions of a binary operator will be evaluated
    before the operator itself, simply because usually the results of the
    operator cannot be calculated until the sub-expression's values are
    known. But this is not a requirement of the language - if the
    compiler can get the same results without doing so, it is free to pick
    a different order. "(a + b) * 0" does not need to evaluate "a", "b",
    or "a + b" at all unless there is a possibility of a side-effect - and
    it can perform the side-effects in any order. "a + (b + c)" can check
    "a" for a trap representation and deal with that before looking at "b"
    and "c" or the results of "b + c", even though it cannot (for floating
    point operations) re-arrange the code to do "a + b" first.

    Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
    required to do so, and (INT_MAX + 1) * 0 still has undefined
    behavior. Undefined behavior is determined by the rules of the
    abstract machine *without* any adjustments permitted by the as-if
    rule.


    Sure. (And it's a good point to make.)



  • From David Brown@3:633/10 to All on Mon Jun 1 09:52:08 2026
    On 31/05/2026 19:11, Bart wrote:
    On 31/05/2026 17:04, Tim Rentsch wrote:
    Richard Harnden <richard.nospam@gmail.invalid> writes:

    just write complex expressions in a way that a human can most
    easily understand,

    Unfortunately, (1) different people have different ideas of what
    writing is most easily understood, and (2) different readers have
    different notions of which writings are easily understood, and
    which writings are not so easily understood. To make things
    worse "easily understood" is not a boolean condition, nor is it
    necessarily well-ordered -- "most easily understood" isn't always
    a well-defined quality, even for a given audience.

    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

    Actual examples of too many parentheses?

    Any source code written in LISP :-)

    (And for too few parentheses, any source code in Forth.)


    From a quick grep of an SDK in a project I am working on, I saw this
    example :

    if ((((pData1 == NULL) || (pData2 == NULL))) || (Length == 0U))

    The number of parentheses there is so high it's hard to see that not
    only is there an unnecessary extra set of parentheses for the first ||
    operator, but there is a second set of extra parentheses around it.
    Eliminating these would give :

    if ((pData1 == NULL) || (pData2 == NULL) || (Length == 0U))

    or, with an extra space for clarity,

    if ( (pData1 == NULL) || (pData2 == NULL) || (Length == 0U) )

    That still leaves extra parentheses around the equality operators, but
    the decision to keep or remove them is subjective (as is the choice of
    "pData1 == NULL" vs. "!pData1").

    But IMHO, the original line had at least two sets of completely
    redundant and unhelpful parentheses which made it harder to read - the
    reader is left wondering whether these parentheses are there for a
    purpose and have an effect on what should have been a simple and clear
    expression.


    The SDK also contains examples of parentheses used because it mixes
    relatively rare operators (shifts and binary operators). Parentheses
    around such sub-expressions are not uncommon, and can definitely be
    helpful, but the quantity here makes things hard to read. Ironically,
    though it is a macro, there are no "safety" parentheses around the
    argument in the expression.

    And yes, these really are the names of the macros in this code.


    #define CONVERTARGB88882ARGB4444(Color) \
    ((((Color & 0xFFU) >> 4) & 0xFU) |\
    (((((Color & 0xFF00U) >> 8) >> 4) & 0xFU) << 4) |\
    (((((Color & 0xFF0000U) >> 16) >> 4) & 0xFU) << 8) | \
    (((((Color & 0xFF000000U) >> 24) >> 4) & 0xFU) << 12))

    #define CONVERTRGB5652ARGB8888(Color) \
    (((((((Color >> 11) & 0x1FU) * 527) + 23) >> 6) << 16) |\
    ((((((Color >> 5) & 0x3FU) * 259) + 33) >> 6) << 8) |\
    ((((Color & 0x1FU) * 527) + 23) >> 6) | 0xFF000000)

    It can be argued that the parentheses themselves are not the problem
    here - it is doing too much in one expression. Static inline functions
    would make things clearer, as would a separation of the steps of
    breaking down the original colour format into parts, scaling or
    conversions, then building up the new colour format. Different named
    types for the different formats would go a long way towards usability
    and safety - at least using typedefs, but preferably using structs to
    make real different types. And surely nicer names could have been found!
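
    A rough sketch of the static-inline approach, assuming 32-bit ARGB8888
    and 16-bit RGB565/ARGB4444 values held in fixed-width integers. The
    function names are invented here (the SDK only provides the macros
    quoted above), and distinct struct types per format are left out to
    keep the sketch short. The scaling constants are copied from the macro.

```c
#include <stdint.h>

/* Keep the top four bits of each 8-bit channel of an ARGB8888 value. */
static inline uint16_t argb8888_to_argb4444(uint32_t color)
{
    uint32_t a = (color >> 28) & 0xFu;
    uint32_t r = (color >> 20) & 0xFu;
    uint32_t g = (color >> 12) & 0xFu;
    uint32_t b = (color >> 4) & 0xFu;
    return (uint16_t)((a << 12) | (r << 8) | (g << 4) | b);
}

/* Scale a 5-bit or 6-bit channel up to 8 bits (constants from the macro). */
static inline uint32_t scale5_to_8(uint32_t v) { return (v * 527u + 23u) >> 6; }
static inline uint32_t scale6_to_8(uint32_t v) { return (v * 259u + 33u) >> 6; }

/* Split RGB565 into channels, scale each, then rebuild as opaque ARGB8888. */
static inline uint32_t rgb565_to_argb8888(uint16_t color)
{
    uint32_t r = scale5_to_8((color >> 11) & 0x1Fu);
    uint32_t g = scale6_to_8((color >> 5) & 0x3Fu);
    uint32_t b = scale5_to_8(color & 0x1Fu);
    return 0xFF000000u | (r << 16) | (g << 8) | b;
}
```

    Each step (extract, scale, rebuild) now sits on its own line with at
    most two levels of parentheses, and the argument is evaluated exactly
    once, which also removes the need for "safety" parentheses around it.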



  • From Keith Thompson@3:633/10 to All on Mon Jun 1 02:33:39 2026
    David Brown <david.brown@hesbynett.no> writes:
    On 01/06/2026 00:54, Keith Thompson wrote:
    [...]
    (INT_MIN + INT_MAX) + 1 is well defined. (INT_MIN + INT_MAX) +1
    is equivalent, and is also well defined. INT_MIN + (INT_MAX +1)
    has undefined behavior.

    Oops, I forgot to delete some parentheses. I meant to write that
    INT_MIN + INT_MAX + 1 is equivalent to (INT_MIN + INT_MAX) + 1.
    The redundant parentheses don't impose any change in semantics.

    Compilers can re-arrange integer arithmetic, despite new overflows, if
    they know the result is the same. On pretty much any current
    processor, a compiler generating code for integer "a + b + c" could do
    the additions in any order - treating the operations as commutative
    and fully associative. The final result will be the same in every
    case where the original expression did not overflow (i.e., every case
    with defined behaviour).

    Right, good point. Since (INT_MIN + INT_MAX) + 1 is well defined,
    if a compiler rearranges it so it evaluates INT_MAX + 1 as an
    intermediate result, that's permitted **if** the result is the same.
    (It makes more sense if the operands are variables with those values
    rather than constants.) A compiler can take advantage of how the
    hardware works. UB applies to the C source code, not (necessarily)
    to the operations that are performed by the actual machine. If it
    generates code that yields the correct result by consulting a Ouija
    board, that's still conforming.

    [...]

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Keith Thompson@3:633/10 to All on Mon Jun 1 02:42:20 2026
    David Brown <david.brown@hesbynett.no> writes:
    On 31/05/2026 19:11, Bart wrote:
    [...]
    Actual examples of too many parentheses?

    Any source code written in LISP :-)

    (And for too few parentheses, any source code in Forth.)


    From a quick grep of an SDK in a project I am working on, I saw this
    example :

    if ((((pData1 == NULL) || (pData2 == NULL))) || (Length == 0U))

    The number of parentheses there is so high it's hard to see that not
    only is there an unnecessary extra set of parentheses for the first ||
    operator, but there is a second set of extra parentheses around
    it. Eliminating these would give :

    if ((pData1 == NULL) || (pData2 == NULL) || (Length == 0U))

    or, with an extra space for clarity,

    if ( (pData1 == NULL) || (pData2 == NULL) || (Length == 0U) )

    That still leaves extra parentheses around the equality operators, but
    the decision to keep or remove them is subjective (as is the choice of
    "pData1 == NULL" vs. "!pData1").

    Yeah, I'd write that as

    if (pData1 == NULL || pData2 == NULL || Length == 0U)

    The fact that || binds more loosely than == is one of those things
    that I arbitrarily find sufficiently intuitive.

    [...]

    And yes, these really are the names of the macros in this code.


    #define CONVERTARGB88882ARGB4444(Color) \
    ((((Color & 0xFFU) >> 4) & 0xFU) |\
    (((((Color & 0xFF00U) >> 8) >> 4) & 0xFU) << 4) |\
    (((((Color & 0xFF0000U) >> 16) >> 4) & 0xFU) << 8) | \
    (((((Color & 0xFF000000U) >> 24) >> 4) & 0xFU) << 12))

    #define CONVERTRGB5652ARGB8888(Color) \
    (((((((Color >> 11) & 0x1FU) * 527) + 23) >> 6) << 16) |\
    ((((((Color >> 5) & 0x3FU) * 259) + 33) >> 6) << 8) |\
    ((((Color & 0x1FU) * 527) + 23) >> 6) | 0xFF000000)

    In a macro definition, I'd parenthesize each occurrence of Color,
    in case the argument is a more complicated expression, as well as
    parenthesizing the entire definition (the latter was done here).
    The rest of the parentheses feel excessive, but I frankly can't be
    bothered to figure out which can be omitted without hurting clarity.
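
    A minimal sketch of why the parameter needs its own parentheses. The
    two macros and the wrapper functions are hypothetical, chosen only to
    show the effect with a conditional expression as the argument.

```c
#include <stdint.h>

/* Parameter not parenthesized: the argument's operators can mix with &. */
#define LOW_BYTE_UNSAFE(Color) (Color & 0xFFU)

/* Parameter parenthesized: the argument is always evaluated as a unit. */
#define LOW_BYTE_SAFE(Color) ((Color) & 0xFFU)

/* Expands to (use_high ? v >> 8 : v & 0xFFU), so the mask only
   applies to the "else" branch. */
static uint32_t low_byte_unsafe(int use_high, uint32_t v)
{
    return LOW_BYTE_UNSAFE(use_high ? v >> 8 : v);
}

/* Expands to ((use_high ? v >> 8 : v) & 0xFFU), as intended. */
static uint32_t low_byte_safe(int use_high, uint32_t v)
{
    return LOW_BYTE_SAFE(use_high ? v >> 8 : v);
}
```

    With v = 0x12345678 and use_high set, the unsafe version yields
    0x123456 while the safe version yields 0x56. With use_high clear they
    agree, which is what makes this kind of bug easy to miss in testing.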

    [...]

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Bart@3:633/10 to All on Mon Jun 1 11:12:00 2026
    On 01/06/2026 03:10, Tim Rentsch wrote:
    Bart <bc@freeuk.com> writes:

    On 31/05/2026 17:04, Tim Rentsch wrote:

    Richard Harnden <richard.nospam@gmail.invalid> writes:

    just write complex expressions in a way that a human can most
    easily understand,

    Unfortunately, (1) different people have different ideas of what
    writing is most easily understood, and (2) different readers have
    different notions of which writings are easily understood, and
    which writings are not so easily understood. To make things
    worse "easily understood" is not a boolean condition, nor is it
    necessarily well-ordered -- "most easily understood" isn't always
    a well-defined quality, even for a given audience.

    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

    Actual examples of too many parentheses?

    The point of my comment is that either too many or too few is a
    subjective judgment, not an objective one.

    My point was that it could be objective, at least for too many. So (a*a)
    + (b*b) would be commonly agreed to have too many, and I was extending
    that to other examples in computing.


    And then there is ?: :

    a > b ? c : d # (a>b)?c:d
    a + b ? c : d # (a+b)?c:d

    The grouping of the first is probably what is intended. But in the
    second, the intent might have been (a+b)?c:d, or a+(b?c:d); we don't
    know for sure that the author didn't make a mistake, or we may not
    know ourselves.
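
    A small sketch of the two readings, with made-up function names. C
    gives + higher precedence than ?:, so the unparenthesized form groups
    as (a+b)?c:d. Some compilers may warn about the first function, which
    is rather the point being discussed.

```c
/* What the compiler sees: + binds tighter than ?:, so this is (a+b)?c:d. */
static inline int as_written(int a, int b, int c, int d)
{
    return a + b ? c : d;
}

/* The grouping the language actually applies, spelled out. */
static inline int cond_on_sum(int a, int b, int c, int d)
{
    return (a + b) ? c : d;
}

/* The other plausible intent; it needs explicit parentheses. */
static inline int add_to_choice(int a, int b, int c, int d)
{
    return a + (b ? c : d);
}
```

    For a = 1, b = -1, c = 10, d = 20 the first two return 20 (the sum is
    zero, so d is chosen) and the third returns 11.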

    This example is so addlebrained that it's hard to imagine anyone
    being confused about it. Or that it's worth any expenditure of
    thought wondering what to do about people who are.

    I don't understand what the problem is with my examples. There can be ambiguity in the mind of the person looking at such code as to how the
    first terms are grouped.

    These are more or less real examples, I just simplified the terms. Here
    are some from MZLIB:

    return (status == MZ_OK) ? MZ_BUF_ERROR : status;

    return (pL == pE) ? (l_len < r_len) : (l < r);

    sym = (match_dist < 512) ? s0 : s1;

    return ((pState->m_last_status == TINFL_STATUS_DONE) && (!pState->m_dict_avail)) ? MZ_STREAM_END : MZ_OK;

    I believe that in the first three, all parentheses are superfluous, but
    they are used anyway. Why is that?

    (My preferences for ?: are that the whole thing is syntax, outside of
    the precedence scheme, and that it has mandatory parentheses. That
    second line would then look like this:

    return (pL == pE ? l_len < r_len : l < r);

    There are fewer parentheses in all, and less potential confusion. You
    can even have assignments in each branch; they will not interfere with ?:.)

    As for the last one, I haven't figured it out yet. But simplifying the
    terms:

    return ((a == b) && (!c)) ? d : e;

    then the same applies: this could be:

    return a == b && !c ? d : e;

    However, I had to confirm this by comparing the ASTs for both.
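
    Short of comparing ASTs, the equivalence can also be spot-checked by
    brute force. This is only a sketch with invented function names, and a
    check over a small range is a sanity test rather than a proof; the
    actual justification is the precedence order ! then == then && then ?:.

```c
/* MZLIB-style form, with the terms simplified as in the post. */
static inline int pick_with_parens(int a, int b, int c, int d, int e)
{
    return ((a == b) && (!c)) ? d : e;
}

/* Same expression relying on precedence alone. */
static inline int pick_without_parens(int a, int b, int c, int d, int e)
{
    return a == b && !c ? d : e;
}

/* Returns 1 if both forms agree for every a, b, c in -2..2, else 0. */
static inline int forms_agree(void)
{
    for (int a = -2; a <= 2; a++)
        for (int b = -2; b <= 2; b++)
            for (int c = -2; c <= 2; c++)
                if (pick_with_parens(a, b, c, 10, 20) !=
                    pick_without_parens(a, b, c, 10, 20))
                    return 0;
    return 1;
}
```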

    I'd say that MZLIB is doing the right thing by not being too clever.

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From David Brown@3:633/10 to All on Mon Jun 1 12:36:39 2026
    On 01/06/2026 12:12, Bart wrote:
    On 01/06/2026 03:10, Tim Rentsch wrote:
    Bart <bc@freeuk.com> writes:

    On 31/05/2026 17:04, Tim Rentsch wrote:

    Richard Harnden <richard.nospam@gmail.invalid> writes:

    just write complex expressions in a way that a human can most
    easily understand,

    Unfortunately, (1) different people have different ideas of what
    writing is most easily understood, and (2) different readers have
    different notions of which writings are easily understood, and
    which writings are not so easily understood. To make things
    worse "easily understood" is not a boolean condition, nor is it
    necessarily well-ordered -- "most easily understood" isn't always
    a well-defined quality, even for a given audience.

    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

    Actual examples of too many parentheses?

    The point of my comment is that either too many or too few is a
    subjective judgment, not an objective one.

    My point was that it could be objective, at least for too many. So (a*a)
    + (b*b) would be commonly agreed to have too many, and I was extending
    that to other examples in computing.


    No, it is all still subjective. But the more levels of parentheses, the
    more consensus you are likely to get on the subjective opinions.

    To be "objective", you would have to have some kind of measure, with statistically significant results. If someone were to conduct a survey
    and measure the accuracy and thinking time for people to understand expressions written in different ways with different levels of
    parentheses, then there would be a basis for calling things "objective".



    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Bart@3:633/10 to All on Mon Jun 1 11:47:04 2026
    On 01/06/2026 08:52, David Brown wrote:
    On 31/05/2026 19:11, Bart wrote:
    On 31/05/2026 17:04, Tim Rentsch wrote:
    Richard Harnden <richard.nospam@gmail.invalid> writes:

    just write complex expressions in a way that a human can most
    easily understand,

    Unfortunately, (1) different people have different ideas of what
    writing is most easily understood, and (2) different readers have
    different notions of which writings are easily understood, and
    which writings are not so easily understood. To make things
    worse "easily understood" is not a boolean condition, nor is it
    necessarily well-ordered -- "most easily understood" isn't always
    a well-defined quality, even for a given audience.

    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

    Actual examples of too many parentheses?

    Any source code written in LISP :-)

    (And for too few parentheses, any source code in Forth.)


    From a quick grep of an SDK in a project I am working on, I saw this example :

        if ((((pData1 == NULL) || (pData2 == NULL))) || (Length == 0U))

    The number of parentheses there is so high it's hard to see that not
    only is there an unnecessary extra parentheses for the first ||
    operator, but there is a second set of extra parentheses around it. Eliminating these would give :

        if ((pData1 == NULL) || (pData2 == NULL) || (Length == 0U))

    or, with an extra space for clarity,

        if ( (pData1 == NULL) || (pData2 == NULL) || (Length == 0U) )

    That still leaves extra parentheses around the equality operators, but
    the decision to keep or remove them is subjective (as is the choice of "pData1 == NULL" vs. "!pData1").

    Maybe it's due to || being a symbol; compare:

    if (pData1 == NULL || pData2 == NULL || Length == 0U)

    if (pData1 == NULL or pData2 == NULL or Length == 0U)

    To me, || seems to draw in the terms on either side as strongly as ==.
    That happens less using 'or'.

    (Both are valid C if using iso646.h.)
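
    For reference, a minimal sketch of the two spellings side by side.
    <iso646.h> is a standard header that defines "or" as a macro for ||;
    the function names and parameter types here are invented for the
    example and do not come from the SDK.

```c
#include <iso646.h>   /* defines or, and, not, ... as operator macros */
#include <stddef.h>   /* NULL */

/* Symbol spelling, without the inner parentheses. */
static inline int args_invalid_sym(const void *pData1, const void *pData2,
                                   unsigned Length)
{
    return pData1 == NULL || pData2 == NULL || Length == 0U;
}

/* Word spelling; after preprocessing this is identical to the above. */
static inline int args_invalid_word(const void *pData1, const void *pData2,
                                    unsigned Length)
{
    return pData1 == NULL or pData2 == NULL or Length == 0U;
}
```

    Since "or" is only a macro, precedence and short-circuit behaviour are
    exactly those of ||; any difference is purely visual.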


    But IMHO, the original line had at least two sets of completely
    redundant and unhelpful parentheses which made it harder to read - the reader is left wondering whether these parentheses are there for a
    purpose and have an effect on what should have been a simple and clear expression.

    The pattern seems to be '((a || b)) || c', so maybe the author
    didn't understand that || is parsed LTR anyway.


    The SDK also contains examples of parentheses used because it mixes
    relatively rare operators (shifts and binary operators). Parentheses
    around such sub-expressions are not uncommon, and can definitely be
    helpful, but the quantity here makes things hard to read. Ironically,
    though it is a macro, there are not "safety" parentheses around the
    argument in the expression.

    And yes, these really are the names of the macro in this code.


    #define CONVERTARGB88882ARGB4444(Color) \
        ((((Color & 0xFFU) >> 4) & 0xFU) |\
        (((((Color & 0xFF00U) >> 8) >> 4) & 0xFU) << 4) |\
        (((((Color & 0xFF0000U) >> 16) >> 4) & 0xFU) << 8) | \
        (((((Color & 0xFF000000U) >> 24) >> 4) & 0xFU) << 12))
    #define CONVERTRGB5652ARGB8888(Color) \
        (((((((Color >> 11) & 0x1FU) * 527) + 23) >> 6) << 16) |\
        ((((((Color >> 5) & 0x3FU) * 259) + 33) >> 6) << 8) |\
        ((((Color & 0x1FU) * 527) + 23) >> 6) | 0xFF000000)

    It can be argued that the parentheses themselves are not the problem
    here - it is doing too much in one expression. Static inline functions
    would make things clearer, as would a separation of the steps of
    breaking down the original colour format into parts, scaling or
    conversions, then building up the new colour format. Different named
    types for the different formats would go a long way towards usability
    and safety - at least using typedefs, but preferably using structs to
    make real different types. And surely nicer names could have been found!

    Your examples actually look reasonable. In fact, it could probably do
    with more parentheses around 'Color'... (I've just seen you've already mentioned this!)

    The first part of the second has to apply 6 operations to 'Color' in
    strict LTR order. Using parentheses ensures not having to worry about precedence, since the ops are '>> & * + >> <<'

    The macro names seem self-explanatory too, although they could do with
    some underscores.

    But anything involving macros probably doesn't count; you expect () to
    be heavily used in the expansion.

    This is an example from Lua:

    op_arith(L, l_addi, luai_numadd);

    On the face of it, perfectly reasonable. But it expands to this:

    {TValue*v1=(&((base+(((void)0),((((int)((((i)>>((((0+7)+8)+1)))& ((~((~(Instruction)0)<<(8)))<<(0))))))))))->val);TValue*v2=(&(( base+(((void)0),((((int)((((i)>>(((((0+7)+8)+1)+8)))&((~((~( Instruction)0)<<(8)))<<(0))))))))))->val);{StkId ra=(base+(((int) ((((i)>>((0+7)))&((~((~(Instruction)0)<<(8)))<<(0)))))));if(((((v1) )->tt_)==(((3)|((0)<<4))))&&((((v2))->tt_)==(((3)|((0)<<4))))){
    lua_Integer i1=(((void)0),(((v1)->value_).i));lua_Integer i2=(((void) 0),(((v2)->value_).i));pc++;{TValue*io=((&(ra)->val));((io)->value_) .i=(((lua_Integer)(((lua_Unsigned)(i1))+((lua_Unsigned)(i2)))));((io) ->tt_=(((3)|((0)<<4))));};}else{lua_Number n1;lua_Number n2;if((((((v1)) ->tt_)==(((3)|((1)<<4))))?((n1)=(((void)0),(((v1)->value_).n)),1):((((( v1))->tt_)==(((3)|((0)<<4))))?((n1)=((lua_Number)(((((void)0),(((v1)-> value_).i))))),1):0))&&(((((v2))->tt_)==(((3)|((1)<<4))))?((n2)=(((void) 0),(((v2)->value_).n)),1):(((((v2))->tt_)==(((3)|((0)<<4))))?((n2)=(( lua_Number)(((((void)0),(((v2)->value_).i))))),1):0))){pc++;{TValue* io=((&(ra)->val));((io)->value_).n=(((n1)+(n2)));((io)->tt_=(((3)| ((1)<<4))));};}};};};

    (I had fun debugging this at one time in my compiler. I've no idea how
    the original developer did so.)

    Not too many () in the macro definitions, but I can only see the top
    level; here deeply nested macros are used.

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From David Brown@3:633/10 to All on Mon Jun 1 12:50:34 2026
    On 01/06/2026 11:42, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 31/05/2026 19:11, Bart wrote:
    [...]
    Actual examples of too many parentheses?

    Any source code written in LISP :-)

    (And for too few parentheses, any source code in Forth.)


    From a quick grep of an SDK in a project I am working on, I saw this
    example :

    if ((((pData1 == NULL) || (pData2 == NULL))) || (Length == 0U))

    The number of parentheses there is so high it's hard to see that not
    only is there an unnecessary extra parentheses for the first ||
    operator, but there is a second set of extra parentheses around
    it. Eliminating these would give :

    if ((pData1 == NULL) || (pData2 == NULL) || (Length == 0U))

    or, with an extra space for clarity,

    if ( (pData1 == NULL) || (pData2 == NULL) || (Length == 0U) )

    That still leaves extra parentheses around the equality operators, but
    the decision to keep or remove them is subjective (as is the choice of
    "pData1 == NULL" vs. "!pData1").

    Yeah, I'd write that as

    if (pData1 == NULL || pData2 == NULL || Length == 0U)

    The fact that || binds more loosely than == is one of those things
    that I arbitrarily find sufficiently intuitive.


    Yes, the precedence levels of "==" and "||" (and "&&") are clearly intentional, and I think a lot of C programmers are happy with skipping
    the parentheses here. But some people would prefer to have the sub-expressions parenthesised, and I think that is fair enough too -
    it's not going to cause anyone extra difficulties in reading the line.

    [...]

    And yes, these really are the names of the macro in this code.


    #define CONVERTARGB88882ARGB4444(Color) \
    ((((Color & 0xFFU) >> 4) & 0xFU) |\
    (((((Color & 0xFF00U) >> 8) >> 4) & 0xFU) << 4) |\
    (((((Color & 0xFF0000U) >> 16) >> 4) & 0xFU) << 8) | \
    (((((Color & 0xFF000000U) >> 24) >> 4) & 0xFU) << 12))

    #define CONVERTRGB5652ARGB8888(Color) \
    (((((((Color >> 11) & 0x1FU) * 527) + 23) >> 6) << 16) |\
    ((((((Color >> 5) & 0x3FU) * 259) + 33) >> 6) << 8) |\
    ((((Color & 0x1FU) * 527) + 23) >> 6) | 0xFF000000)

    In a macro definition, I'd parenthesize each occurrence of Color,
    in case the argument is a more complicated expression, as well as parenthesizing the entire definition (the latter was done here).
    The rest of the parentheses feel excessive, but I frankly can't be
    bothered to figure out which can be omitted without hurting clarity.


    That's the problem with code like that. People will think "that's a
    mess - I'll just assume / hope that it is correct". It is very
    difficult to check in code reviews, or to maintain, modify or adapt, so
    no one will bother figuring it out. It is "write-only" code.

    But while I know there are certainly some of the parentheses that could
    be removed, I am not sure that would actually improve the readability significantly. Like many people, I prefer not to rely on knowledge of
    the relative precedences of shifts and bitwise operators. My preference
    would be for major refactoring, not for removing (or adding) parentheses.




    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From David Brown@3:633/10 to All on Mon Jun 1 12:55:59 2026
    On 01/06/2026 12:47, Bart wrote:
    On 01/06/2026 08:52, David Brown wrote:
    On 31/05/2026 19:11, Bart wrote:
    On 31/05/2026 17:04, Tim Rentsch wrote:
    Richard Harnden <richard.nospam@gmail.invalid> writes:

    just write complex expressions in a way that a human can most
    easily understand,

    Unfortunately, (1) different people have different ideas of what
    writing is most easily understood, and (2) different readers have
    different notions of which writings are easily understood, and
    which writings are not so easily understood. To make things
    worse "easily understood" is not a boolean condition, nor is it
    necessarily well-ordered -- "most easily understood" isn't always
    a well-defined quality, even for a given audience.

    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

    Actual examples of too many parentheses?

    Any source code written in LISP :-)

    (And for too few parentheses, any source code in Forth.)


    From a quick grep of an SDK in a project I am working on, I saw this
    example :

        if ((((pData1 == NULL) || (pData2 == NULL))) || (Length == 0U))

    The number of parentheses there is so high it's hard to see that not
    only is there an unnecessary extra parentheses for the first ||
    operator, but there is a second set of extra parentheses around it.
    Eliminating these would give :

        if ((pData1 == NULL) || (pData2 == NULL) || (Length == 0U))

    or, with an extra space for clarity,

        if ( (pData1 == NULL) || (pData2 == NULL) || (Length == 0U) )

    That still leaves extra parentheses around the equality operators, but
    the decision to keep or remove them is subjective (as is the choice of
    "pData1 == NULL" vs. "!pData1").

    Maybe it's due to || being a symbol; compare:

        if (pData1 == NULL || pData2 == NULL || Length == 0U)

        if (pData1 == NULL or pData2 == NULL or Length == 0U)

    To me, || seems to draw in the terms on either side as strongly as ==.
    That happens less using 'or'.

    (Both are valid C if using iso646.h.)


    But IMHO, the original line had at least two sets of completely
    redundant and unhelpful parentheses which made it harder to read - the
    reader is left wondering whether these parentheses are there for a
    purpose and have an effect on what should have been a simple and clear
    expression.

    The pattern seems to be '((a || b)) || c', so maybe the author
    didn't understand that || is parsed LTR anyway.


    The SDK also contains examples of parentheses used because it mixes
    relatively rare operators (shifts and binary operators). Parentheses
    around such sub-expressions are not uncommon, and can definitely be
    helpful, but the quantity here makes things hard to read. Ironically,
    though it is a macro, there are not "safety" parentheses around the
    argument in the expression.

    And yes, these really are the names of the macro in this code.


    #define CONVERTARGB88882ARGB4444(Color) \
        ((((Color & 0xFFU) >> 4) & 0xFU) |\
        (((((Color & 0xFF00U) >> 8) >> 4) & 0xFU) << 4) |\
        (((((Color & 0xFF0000U) >> 16) >> 4) & 0xFU) << 8) | \
        (((((Color & 0xFF000000U) >> 24) >> 4) & 0xFU) << 12))
    #define CONVERTRGB5652ARGB8888(Color) \
        (((((((Color >> 11) & 0x1FU) * 527) + 23) >> 6) << 16) |\
        ((((((Color >> 5) & 0x3FU) * 259) + 33) >> 6) << 8) |\
        ((((Color & 0x1FU) * 527) + 23) >> 6) | 0xFF000000)

    It can be argued that the parentheses themselves are not the problem
    here - it is doing too much in one expression. Static inline
    functions would make things clearer, as would a separation of the
    steps of breaking down the original colour format into parts, scaling
    or conversions, then building up the new colour format. Different
    named types for the different formats would go a long way towards
    usability and safety - at least using typedefs, but preferably using
    structs to make real different types. And surely nicer names could
    have been found!

    Your examples actually look reasonable. In fact, it could probably do
    with more parentheses around 'Color'... (I've just seen you've already mentioned this!)

    The first part of the second has to apply 6 operations to 'Color' in
    strict LTR order. Using parentheses ensures not having to worry about precedence, since the ops are '>> & * + >> <<'

    The macro names seem self-explanatory too, although they could do with
    some underscores.

    Indeed.


    But anything involving macros probably doesn't count; you expect () to
    be heavily used in the expansion.

    I think macro definitions "count", as do how the macros are used in
    code. But the full expansions do not "count" as they are not something normally read or written by the programmer. (I appreciate that you need
    to see such things sometimes when implementing a compiler, and
    occasionally people look at the output of a pre-processor, but in normal
    use, the appearance of a macro expansion does not matter.)


    This is an example from Lua:

        op_arith(L, l_addi, luai_numadd);

    On the face of it, perfectly reasonable. But it expands to this:

    {TValue*v1=(&((base+(((void)0),((((int)((((i)>>((((0+7)+8)+1)))& ((~((~(Instruction)0)<<(8)))<<(0))))))))))->val);TValue*v2=(&(( base+(((void)0),((((int)((((i)>>(((((0+7)+8)+1)+8)))&((~((~( Instruction)0)<<(8)))<<(0))))))))))->val);{StkId ra=(base+(((int) ((((i)>>((0+7)))&((~((~(Instruction)0)<<(8)))<<(0)))))));if(((((v1) )->tt_)==(((3)|((0)<<4))))&&((((v2))->tt_)==(((3)|((0)<<4))))){
    lua_Integer i1=(((void)0),(((v1)->value_).i));lua_Integer i2=(((void) 0),(((v2)->value_).i));pc++;{TValue*io=((&(ra)->val));((io)->value_) .i=(((lua_Integer)(((lua_Unsigned)(i1))+((lua_Unsigned)(i2)))));((io) ->tt_=(((3)|((0)<<4))));};}else{lua_Number n1;lua_Number n2;if((((((v1)) ->tt_)==(((3)|((1)<<4))))?((n1)=(((void)0),(((v1)->value_).n)),1):((((( v1))->tt_)==(((3)|((0)<<4))))?((n1)=((lua_Number)(((((void)0),(((v1)-> value_).i))))),1):0))&&(((((v2))->tt_)==(((3)|((1)<<4))))?((n2)=(((void) 0),(((v2)->value_).n)),1):(((((v2))->tt_)==(((3)|((0)<<4))))?((n2)=(( lua_Number)(((((void)0),(((v2)->value_).i))))),1):0))){pc++;{TValue* io=((&(ra)->val));((io)->value_).n=(((n1)+(n2)));((io)->tt_=(((3)| ((1)<<4))));};}};};};

    (I had fun debugging this at one time in my compiler. I've no idea how
    the original developer did so.)

    I assume the author built it up in parts. The readability is in
    the source - and the source is "op_arith(L, l_addi, luai_numadd);" -
    there are not too many parentheses there.


    Not too many () in the macro definitions, but I can only see the top
    level; here deeply nested macros are used.



    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Dan Cross@3:633/10 to All on Mon Jun 1 11:04:34 2026
    In article <10vjdn8$22tgu$1@dont-email.me>,
    David Brown <david.brown@hesbynett.no> wrote:
    On 31/05/2026 19:11, Bart wrote:
    On 31/05/2026 17:04, Tim Rentsch wrote:
    Richard Harnden <richard.nospam@gmail.invalid> writes:

    just write complex expressions in a way that a human can most
    easily understand,

    Unfortunately, (1) different people have different ideas of what
    writing is most easily understood, and (2) different readers have
    different notions of which writings are easily understood, and
    which writings are not so easily understood. To make things
    worse "easily understood" is not a boolean condition, nor is it
    necessarily well-ordered -- "most easily understood" isn't always
    a well-defined quality, even for a given audience.

    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

    Actual examples of too many parentheses?

    Any source code written in LISP :-)

    Hey now. Some of us have programmed in Lisp professionally, and
    rather enjoy it.

    Lisp is often maligned for its parentheses; I don't think that's
    fair. They really aren't that onerous once you start working in
    it, and they're unambiguous; one may think of the structure of Lisp
    code as a shorthand notation for the resulting program's AST.

    (And for too few parentheses, any source code in Forth.)

    No comment.

    From a quick grep of an SDK in a project I am working on, I saw this
    example :

    if ((((pData1 == NULL) || (pData2 == NULL))) || (Length == 0U))

    The number of parentheses there is so high it's hard to see that not
    only is there an unnecessary extra parentheses for the first ||
    operator, but there is a second set of extra parentheses around it.
    Eliminating these would give :

    if ((pData1 == NULL) || (pData2 == NULL) || (Length == 0U))

    or, with an extra space for clarity,

    if ( (pData1 == NULL) || (pData2 == NULL) || (Length == 0U) )

    That still leaves extra parentheses around the equality operators, but
    the decision to keep or remove them is subjective (as is the choice of
    "pData1 == NULL" vs. "!pData1").

    But IMHO, the original line had at least two sets of completely
    redundant and unhelpful parentheses which made it harder to read - the
    reader is left wondering whether these parentheses are there for a
    purpose and have an effect on what should have been a simple and clear
    expression.

    I see code like this all the time; usually it comes from
    hardware vendors (I take it this was from a BSP or something
    similar?). I often wonder about vendor programming standards
    when I run across things like it.

    The SDK also contains examples of parentheses used because it mixes
    relatively rare operators (shifts and binary operators). Parentheses
    around such sub-expressions are not uncommon, and can definitely be
    helpful, but the quantity here makes things hard to read. Ironically,
    though it is a macro, there are not "safety" parentheses around the
    argument in the expression.

    And yes, these really are the names of the macro in this code.

    #define CONVERTARGB88882ARGB4444(Color) \
    ((((Color & 0xFFU) >> 4) & 0xFU) |\
    (((((Color & 0xFF00U) >> 8) >> 4) & 0xFU) << 4) |\
    (((((Color & 0xFF0000U) >> 16) >> 4) & 0xFU) << 8) | \
    (((((Color & 0xFF000000U) >> 24) >> 4) & 0xFU) << 12))

    #define CONVERTRGB5652ARGB8888(Color) \
    (((((((Color >> 11) & 0x1FU) * 527) + 23) >> 6) << 16) |\
    ((((((Color >> 5) & 0x3FU) * 259) + 33) >> 6) << 8) |\
    ((((Color & 0x1FU) * 527) + 23) >> 6) | 0xFF000000)

    It can be argued that the parentheses themselves are not the problem
    here - it is doing too much in one expression. Static inline functions
    would make things clearer, as would a separation of the steps of
    breaking down the original colour format into parts, scaling or
    conversions, then building up the new colour format. Different named
    types for the different formats would go a long way towards usability
    and safety - at least using typedefs, but preferably using structs to
    make real different types. And surely nicer names could have been found!

    Not to mention symbolic names for the magic constants. :-/

    This is exactly the sort of thing that, as you point out, a
    `static inline` function is far better suited for. Some code
    bases don't want to use them for a variety of reasons, usually
    compatibility concerns with older code, compilers, or language
    standards. Some variants of Unix, for instance, worry about
    header compatibility with C90 [and in some cases K&R C] code.
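
    As a rough sketch of what such a refactoring might look like for the
    RGB565 conversion (C99 or later assumed, so not usable where C90
    header compatibility matters): all type and function names below are
    invented for the example, and only the constants come from the
    original macro. The wrapper structs are one way to get the distinct
    types suggested above.

```c
#include <stdint.h>

/* Distinct wrapper types so the two pixel formats cannot be mixed up. */
typedef struct { uint16_t raw; } rgb565_t;
typedef struct { uint32_t raw; } argb8888_t;

static inline rgb565_t rgb565_from_raw(uint16_t raw)
{
    rgb565_t c = { raw };
    return c;
}

/* Scale a 5-bit channel (0..31) to 8 bits (0..255), as in the macro. */
static inline uint32_t scale5_to_8(uint32_t v5)
{
    return (v5 * 527U + 23U) >> 6;
}

/* Scale a 6-bit channel (0..63) to 8 bits (0..255), as in the macro. */
static inline uint32_t scale6_to_8(uint32_t v6)
{
    return (v6 * 259U + 33U) >> 6;
}

static inline argb8888_t rgb565_to_argb8888(rgb565_t c)
{
    /* Step 1: break the source format into its fields. */
    uint32_t r5 = (c.raw >> 11) & 0x1FU;
    uint32_t g6 = (c.raw >> 5) & 0x3FU;
    uint32_t b5 = c.raw & 0x1FU;

    /* Step 2: scale each field; step 3: assemble with opaque alpha. */
    argb8888_t out = {
        0xFF000000U | (scale5_to_8(r5) << 16)
                    | (scale6_to_8(g6) << 8)
                    |  scale5_to_8(b5)
    };
    return out;
}
```

    Each line now carries at most one level of parentheses beyond what
    the shift-and-mask steps need, and the argument is evaluated once.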

    - Dan C.


    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From David Brown@3:633/10 to All on Mon Jun 1 14:04:18 2026
    On 01/06/2026 13:04, Dan Cross wrote:
    In article <10vjdn8$22tgu$1@dont-email.me>,
    David Brown <david.brown@hesbynett.no> wrote:
    On 31/05/2026 19:11, Bart wrote:
    On 31/05/2026 17:04, Tim Rentsch wrote:
    Richard Harnden <richard.nospam@gmail.invalid> writes:

    just write complex expressions in a way that a human can most
    easily understand,

    Unfortunately, (1) different people have different ideas of what
    writing is most easily understood, and (2) different readers have
    different notions of which writings are easily understood, and
    which writings are not so easily understood. To make things
    worse "easily understood" is not a boolean condition, nor is it
    necessarily well-ordered -- "most easily understood" isn't always
    a well-defined quality, even for a given audience.

    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

    Actual examples of too many parentheses?

    Any source code written in LISP :-)

    Hey now. Some of us have programmed in Lisp professionally, and
    rather enjoy it.

    Lisp is often maligned for its parentheses; I don't think that's
    fair. They really aren't that onerous once you start working in
    it, and they're unambiguous; one may think of the structure of Lisp
    code as a shorthand notation for the resulting program's AST.


    I did include a smiley - I know there are people here who enjoy working
    with LISP, and have probably heard a few too many jokes about parentheses!

    (And for too few parentheses, any source code in Forth.)

    No comment.

    From a quick grep of an SDK in a project I am working on, I saw this
    example :

    if ((((pData1 == NULL) || (pData2 == NULL))) || (Length == 0U))

    The number of parentheses there is so high it's hard to see that not
    only is there an unnecessary extra parentheses for the first ||
    operator, but there is a second set of extra parentheses around it.
    Eliminating these would give :

    if ((pData1 == NULL) || (pData2 == NULL) || (Length == 0U))

    or, with an extra space for clarity,

    if ( (pData1 == NULL) || (pData2 == NULL) || (Length == 0U) )

    That still leaves extra parentheses around the equality operators, but
    the decision to keep or remove them is subjective (as is the choice of
    "pData1 == NULL" vs. "!pData1").

    But IMHO, the original line had at least two sets of completely
    redundant and unhelpful parentheses which made it harder to read - the
    reader is left wondering whether these parentheses are there for a
    purpose and have an effect on what should have been a simple and clear
    expression.

    I see code like this all the time; usually it comes from
    hardware vendors (I take it this was from a BSP or something
    similar?). I often wonder about vendor programming standards
    when I run across things like it.


    Yes, this was from a hardware vendor (who shall remain nameless to
    protect the guilty - not that I have found other vendors to be much
    better). They have a tendency to be obsessed with MISRA, with sticking
    to C90, and with filling headers with huge Doxygen templates giving no information and obscuring the code. (I'm fine with Doxygen comments
    that actually add useful information, but not a dozen lines repeating
    the names and types from a function signature.)

    The SDK also contains examples of parentheses used because it mixes
    relatively rare operators (shifts and binary operators). Parentheses
    around such sub-expressions are not uncommon, and can definitely be
    helpful, but the quantity here makes things hard to read. Ironically,
    though it is a macro, there are not "safety" parentheses around the
    argument in the expression.

    And yes, these really are the names of the macro in this code.

    #define CONVERTARGB88882ARGB4444(Color) \
    ((((Color & 0xFFU) >> 4) & 0xFU) |\
    (((((Color & 0xFF00U) >> 8) >> 4) & 0xFU) << 4) |\
    (((((Color & 0xFF0000U) >> 16) >> 4) & 0xFU) << 8) | \
    (((((Color & 0xFF000000U) >> 24) >> 4) & 0xFU) << 12))

    #define CONVERTRGB5652ARGB8888(Color) \
    (((((((Color >> 11) & 0x1FU) * 527) + 23) >> 6) << 16) |\
    ((((((Color >> 5) & 0x3FU) * 259) + 33) >> 6) << 8) |\
    ((((Color & 0x1FU) * 527) + 23) >> 6) | 0xFF000000)
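
    To make the point about the missing "safety" parentheses concrete, here
    is a minimal sketch. RED5_UNSAFE and RED5_SAFE are made-up, cut-down
    macros for illustration only (they are not from the SDK); the first
    leaves its argument bare in the same way as the vendor macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down red-channel extraction; these names are
   illustrative and do not come from the SDK. */
#define RED5_UNSAFE(Color) ((Color >> 11) & 0x1FU)
#define RED5_SAFE(Color)   (((Color) >> 11) & 0x1FU)

static uint32_t red5_unsafe(uint32_t hi, uint32_t lo)
{
    /* Expands to ((hi | lo >> 11) & 0x1FU): the shift applies to lo only. */
    return RED5_UNSAFE(hi | lo);
}

static uint32_t red5_safe(uint32_t hi, uint32_t lo)
{
    /* Expands to (((hi | lo) >> 11) & 0x1FU), as intended. */
    return RED5_SAFE(hi | lo);
}

/* Usage: the same call gives different answers for the two macros. */
void check_macro_grouping(void)
{
    assert(red5_safe(0xF800u, 0x001Fu) == 31u);
    assert(red5_unsafe(0xF800u, 0x001Fu) == 0u);
}
```

    With a plain variable as the argument the two forms agree, which is
    presumably why the omission goes unnoticed.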

    It can be argued that the parentheses themselves are not the problem
    here - it is doing too much in one expression. Static inline functions
    would make things clearer, as would a separation of the steps of
    breaking down the original colour format into parts, scaling or
    conversions, then building up the new colour format. Different named
    types for the different formats would go a long way towards usability
    and safety - at least using typedefs, but preferably using structs to
    make real different types. And surely nicer names could have been found!
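
    As a rough sketch of what "real different types" could look like for
    the first macro: the type and function names below are hypothetical,
    and the channel layout is simply read off the macro (top four bits of
    each 8-bit channel are kept).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrapper types: one struct per pixel format, so that
   passing the wrong format is a compile-time error. */
typedef struct { uint32_t raw; } argb8888_t;
typedef struct { uint16_t raw; } argb4444_t;

/* Keep the top 4 bits of each 8-bit channel, as the macro does. */
static inline argb4444_t argb8888_to_argb4444(argb8888_t in)
{
    const uint32_t b = (in.raw >>  4) & 0xFU;
    const uint32_t g = (in.raw >> 12) & 0xFU;
    const uint32_t r = (in.raw >> 20) & 0xFU;
    const uint32_t a = (in.raw >> 28) & 0xFU;

    argb4444_t out = { (uint16_t)(b | (g << 4) | (r << 8) | (a << 12)) };
    return out;
}

/* Usage: A=0x12, R=0x34, G=0x56, B=0x78 keeps the high nibbles 1,3,5,7. */
void check_argb_conversion(void)
{
    argb8888_t in = { 0x12345678u };
    assert(argb8888_to_argb4444(in).raw == 0x1357u);
}
```

    Passing a bare uint32_t, or an argb4444_t, to this function is then
    rejected by the compiler rather than silently converted.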

    Not to mention symbolic names for the magic constants. :-/

    Names for magic constants can be good, but they are not always helpful -
    if the magic number is only used once, its definition is far from its
    use, and it is polluting the global name space, then it can be a lot
    better to simply use the number directly and add a comment at the point
    of use. But the shift-and-mask constants could be replaced by either a
    struct with bit-fields, or inline functions for field extractions, or at
    least separate local variables for the extracted fields.


    This is exactly the sort of thing that, as you point out, a
    `static inline` function is far better suited for. Some code
    bases don't want to use them for a variety of reasons, usually
    compatibility concerns with older code, compilers, or language
    standards. Some variants of Unix, for instance, worry about
    header compatibility with C90 [and in some cases K&R C] code.


    Indeed. But even if they don't want to use "inline", a static function
    is better - the compiler will do the inlining anyway (if it makes sense
    according to its heuristics).



    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Dan Cross@3:633/10 to All on Mon Jun 1 18:48:37 2026
    In article <10vjsg2$259m3$3@dont-email.me>,
    David Brown <david.brown@hesbynett.no> wrote:
    On 01/06/2026 13:04, Dan Cross wrote:
    In article <10vjdn8$22tgu$1@dont-email.me>,
    David Brown <david.brown@hesbynett.no> wrote:
    On 31/05/2026 19:11, Bart wrote:
    [snip]
    Actual examples of too many parentheses?

    Any source code written in LISP :-)

    Hey now. Some of us have programmed in Lisp professionally, and
    rather enjoy it.

    Lisp is often maligned for its parentheses; I don't think that's
    fair. They really aren't that onerous once you start working in
    it, and they're unambiguous; one may think of the structure of Lisp
    code as a shorthand notation for the resulting program's AST.

    I did include a smiley - I know there are people here who enjoy working
    with LISP, and have probably heard a few too many jokes about parentheses!

    It's fine; many variants of Lisp are deserving of criticism, and
    that community has a tendency to get too touchy about defending
    the language's honor. People like Stallman are fond of calling
    Lisp "the most powerful language" but I think that's nonsense.

    A problem with many Lisp variants is that they're dynamically
    typed; I once had to fix a production outage that happened when
    a programmer converted a pair of integers into a triple. The
    pair had been represented using a single `CONS` cell, but when
    it became apparent that a triple was needed, it was changed into
    a proper list. The operation for retrieving the first half of a
    `CONS` cell is `CAR`; the operation for retrieving the second
    half is `CDR`. Lisp hackers usually refer to the two halves as
    "the CAR" and "the CDR" of the cell.

    If a `CONS` cell just holds a pair of scalar values, as in this
    example, these functions give back those scalars. However,
    lists are built from `CONS` cells, where the CAR of the list is
    the first element, and the CDR is the tail of the list, which is
    itself a list.

    Anyway, to access the values from the pair, the programmer used
    `CAR` and `CDR`, but when the pair was converted to a list, this
    was no longer correct; the first element was still accessible as
    the CAR, but the CDR was now a list; to get the second element
    of the list one would use `CADR` (or the better named `SECOND`).

    Unfortunately, the programmer missed one place, and passed the
    CDR of the list to a function that expected a `FIXNUM` and tried
    to do arithmetic on it. Lisp is usually strongly typed, so you
    can't just add a list to a number; that raises a "condition"
    (which is like an exception, except that you can often restart
    the thing that raised the condition). In this program, that
    resulted in an ISE and an error returned to the user.

    The fix was trivial, but it struck me at the time that in a
    statically typed language it would have been a compile time
    error.

    (And for too few parentheses, any source code in Forth.)

    No comment.

    From a quick grep of an SDK in a project I am working on, I saw this
    example :

    if ((((pData1 == NULL) || (pData2 == NULL))) || (Length == 0U))

    The number of parentheses there is so high it's hard to see that not
    only is there an unnecessary extra parentheses for the first ||
    operator, but there is a second set of extra parentheses around it.
    Eliminating these would give :

    if ((pData1 == NULL) || (pData2 == NULL) || (Length == 0U))

    or, with an extra space for clarity,

    if ( (pData1 == NULL) || (pData2 == NULL) || (Length == 0U) )

    That still leaves extra parentheses around the equality operators, but
    the decision to keep or remove them is subjective (as is the choice of
    "pData1 == NULL" vs. "!pData1").

    But IMHO, the original line had at least two sets of completely
    redundant and unhelpful parentheses which made it harder to read - the
    reader is left wondering whether these parentheses are there for a
    purpose and have an effect on what should have been a simple and clear
    expression.

    I see code like this all the time; usually it comes from
    hardware vendors (I take it this was from a BSP or something
    similar?). I often wonder about vendor programming standards
    when I run across things like it.

    Yes, this was from a hardware vendor (who shall remain nameless to
    protect the guilty - not that I have found other vendors to be much
    better). They have a tendency to be obsessed with MISRA, with sticking
    to C90, and with filling headers with huge Doxygen templates giving no
    information and obscuring the code. (I'm fine with Doxygen comments
    that actually add useful information, but not a dozen lines repeating
    the names and types from a function signature.)

    Yes. I see all of this, and it mystifies me; I have seen how
    excessive abstraction can lead to opaque code, but many times
    hardware people go in the opposite direction, and one hardly
    ever sees useful abstraction; for example, often the same code
    sequence could be trivially extracted into a function, but it is
    instead repeated multiple times, inline.

    The SDK also contains examples of parentheses used because it mixes
    relatively rare operators (shifts and binary operators). Parentheses
    around such sub-expressions are not uncommon, and can definitely be
    helpful, but the quantity here makes things hard to read. Ironically,
    though it is a macro, there are not "safety" parentheses around the
    argument in the expression.

    And yes, these really are the names of the macros in this code.

    #define CONVERTARGB88882ARGB4444(Color) \
    ((((Color & 0xFFU) >> 4) & 0xFU) |\
    (((((Color & 0xFF00U) >> 8) >> 4) & 0xFU) << 4) |\
    (((((Color & 0xFF0000U) >> 16) >> 4) & 0xFU) << 8) | \
    (((((Color & 0xFF000000U) >> 24) >> 4) & 0xFU) << 12))

    #define CONVERTRGB5652ARGB8888(Color) \
    (((((((Color >> 11) & 0x1FU) * 527) + 23) >> 6) << 16) |\
    ((((((Color >> 5) & 0x3FU) * 259) + 33) >> 6) << 8) |\
    ((((Color & 0x1FU) * 527) + 23) >> 6) | 0xFF000000)

    It can be argued that the parentheses themselves are not the problem
    here - it is doing too much in one expression. Static inline functions
    would make things clearer, as would a separation of the steps of
    breaking down the original colour format into parts, scaling or
    conversions, then building up the new colour format. Different named
    types for the different formats would go a long way towards usability
    and safety - at least using typedefs, but preferably using structs to
    make real different types. And surely nicer names could have been found!
    Not to mention symbolic names for the magic constants. :-/

    Names for magic constants can be good, but they are not always helpful -
    if the magic number is only used once, its definition is far from its
    use, and it is polluting the global name space, then it can be a lot
    better to simply use the number directly and add a comment at the point
    of use. But the shift-and-mask constants could be replaced by either a
    struct with bit-fields, or inline functions for field extractions, or at
    least separate local variables for the extracted fields.

    I don't mind some magic: the shift constants and the masks, for
    instance, are fine. But the magic 527, 259, 23, and 33, and why
    the subsequent values are shifted right by 6, could be better
    explained by naming those constants.

    Btw, with respect to this specific algorithm, I looked them up,
    and they seem to be empirically discovered lore, though derived
    from a relatively standard algorithm for projection of a
    discrete value into a larger space. This stack overflow page
    has some details: https://stackoverflow.com/questions/2442576/how-does-one-convert-16-bit-rgb565-to-24-bit-rgb888

    Anyway, I don't think the constants have to be defined far away
    from the code; I'd be happy with a local `const uint32_t FOO`,
    though in this case it should probably just be a comment.
    Here's my offering:

    // Converts a 16-bit RGB16 (5-6-5) value to an ARGB32
    // ("RGBA8888") value.
    static inline uint32_t
    rgb16_to_argb(uint16_t color)
    {
    const uint32_t blue5 = (color >> 0) & 0x1F;
    const uint32_t green6 = (color >> 5) & 0x3F;
    const uint32_t red5 = (color >> 11) & 0x1F;

    // Map from a 5 or 6 bit space into an 8 bit space. A
    // 5-bit number has 32 possibilities; a 6-bit number
    // has 64. To calculate the projected 8-bit value for
    // a k-bit number v, we can use the formula
    // v_8 = (v*(2^8-1) + (2^k-1)/2)/(2^k-1), i.e.
    // (v*255 + 15)/31 (for k=5) or (v*255 + 31)/63 (for
    // k=6).
    //
    // To replace the division with a multiply and a
    // shift, the constants below were empirically
    // discovered to generate good results. See
    // https://stackoverflow.com/questions/2442576/how-does-one-convert-16-bit-rgb565-to-24-bit-rgb888
    // for details.
    const uint32_t blue = (blue5 * 527 + 23) >> 6;
    const uint32_t green = (green6 * 259 + 33) >> 6;
    const uint32_t red = (red5 * 527 + 23) >> 6;
    const uint32_t alpha = 0xFF000000;

    return blue | (green << 8) | (red << 16) | alpha;
    }

    It's longer, yes, but I'd argue it's much easier to understand.
    On my compiler, it generates almost identical code, except that
    some instructions are in a different order.

    This is exactly the sort of thing that, as you point out, a
    `static inline` function is far better suited for. Some code
    bases don't want to use them for a variety of reasons, usually
    compatibility concerns with older code, compilers, or language
    standards. Some variants of Unix, for instance, worry about
    header compatibility with C90 [and in some cases K&R C] code.

    Indeed. But even if they don't want to use "inline", a static function
    is better - the compiler will do the inlining anyway (if it makes sense
    according to its heuristics).

    Assuming the compiler they're working with is known to do so,
    then I agree.

    - Dan C.


    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Bart@3:633/10 to All on Mon Jun 1 21:04:01 2026
    On 01/06/2026 19:48, Dan Cross wrote:
    In article <10vjsg2$259m3$3@dont-email.me>,

    Names for magic constants can be good, but they are not always helpful -
    if the magic number is only used once, its definition is far from its
    use, and it is polluting the global name space, then it can be a lot
    better to simply use the number directly and add a comment at the point
    of use. But the shift-and-mask constants could be replaced by either a
    struct with bit-fields, or inline functions for field extractions, or at
    least separate local variables for the extracted fields.

    I don't mind some magic: the shift constants and the masks, for
    instance, are fine. But the magic 527, 259, 23, and 33, and why
    the subsequent values are shifted right by 6, could be better
    explained by naming those constants.

    Btw, with respect to this specific algorithm, I looked them up,
    and they seem to be empirically discovered lore, though derived
    from a relatively standard algorithm for projection of a
    discrete value into a larger space. This stack overflow page
    has some details: https://stackoverflow.com/questions/2442576/how-does-one-convert-16-bit-rgb565-to-24-bit-rgb888

    Anyway, I don't think the constants have to be defined far away
    from the code; I'd be happy with a local `const uint32_t FOO`,
    though in this case it should probably just be a comment.
    Here's my offering:

    // Converts a 16-bit RGB16 (5-6-5) value to an ARGB32
    // ("RGBA8888") value.
    static inline uint32_t
    rgb16_to_argb(uint16_t color)
    {
    const uint32_t blue5 = (color >> 0) & 0x1F;
    const uint32_t green6 = (color >> 5) & 0x3F;
    const uint32_t red5 = (color >> 11) & 0x1F;

    // Map from a 5 or 6 bit space into an 8 bit space. A
    // 5-bit number has 32 possibilities; a 6-bit number
    // has 64. To calculate the projected 8-bit value for
    // a k-bit number v, we can use the formula
    // v_8 = (v*(2^8-1) + (2^k-1)/2)/(2^k-1), i.e.
    // (v*255 + 15)/31 (for k=5) or (v*255 + 31)/63 (for
    // k=6).
    //
    // To replace the division with a multiply and a
    // shift, the constants below were empirically
    // discovered to generate good results. See
    // https://stackoverflow.com/questions/2442576/how-does-one-convert-16-bit-rgb565-to-24-bit-rgb888
    // for details.
    const uint32_t blue = (blue5 * 527 + 23) >> 6;
    const uint32_t green = (green6 * 259 + 33) >> 6;
    const uint32_t red = (red5 * 527 + 23) >> 6;
    const uint32_t alpha = 0xFF000000;

    return blue | (green << 8) | (red << 16) | alpha;
    }

    It's longer, yes, but I'd argue it's much easier to understand.
    On my compiler, it generates almost identical code, except that
    some instructions are in a different order.

    The speed probably isn't that important. This can be table-driven: you
    use those formulae once to populate some tables (and with the shifts
    built-in). Then the routine can be simplified to this:


    uint32_t rgb16_to_argb_bc(uint16_t color) {
    const uint32_t blue5 = (color >> 0) & 0x1F;
    const uint32_t green6 = (color >> 5) & 0x3F;
    const uint32_t red5 = (color >> 11) & 0x1F;

    return bluetab[blue5] | greentab[green6] | redtab[red5] |
    0xFF000000;
    }

    On a test I did (one billion conversions cycling over 1M precalculated
    random 16-bit numbers), the table version was twice as fast. Maybe a bit
    faster if the Alpha value is pre-added to the red-table.

    (Results were merely summed, but if writing into a new buffer, then
    memory access is probably more dominant.)
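
    For completeness, a sketch of how the tables might be populated. The
    table names follow the routine above; init_color_tables() and
    rgb16_to_argb_tab() are hypothetical names, and this version assumes
    the alpha value is pre-added to the red table as suggested.

```c
#include <assert.h>
#include <stdint.h>

/* Lookup tables with the channel shifts built in. */
static uint32_t bluetab[32], greentab[64], redtab[32];

/* Hypothetical helper: must be called once before any conversion. */
void init_color_tables(void)
{
    for (uint32_t v = 0; v < 32; v++) {
        const uint32_t c8 = (v * 527 + 23) >> 6;
        bluetab[v] = c8;
        redtab[v]  = (c8 << 16) | 0xFF000000u;  /* alpha pre-added */
    }
    for (uint32_t v = 0; v < 64; v++)
        greentab[v] = ((v * 259 + 33) >> 6) << 8;
}

uint32_t rgb16_to_argb_tab(uint16_t color)
{
    return bluetab[color & 0x1F]
         | greentab[(color >> 5) & 0x3F]
         | redtab[(color >> 11) & 0x1F];
}

/* Usage: pure white and pure red in 5-6-5 map to full-intensity ARGB. */
void check_color_tables(void)
{
    init_color_tables();
    assert(rgb16_to_argb_tab(0xFFFF) == 0xFFFFFFFFu);
    assert(rgb16_to_argb_tab(0xF800) == 0xFFFF0000u);
}
```

    The three tables take 512 bytes in total, which may or may not be
    acceptable on the kind of small target discussed elsewhere in the
    thread.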

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Keith Thompson@3:633/10 to All on Mon Jun 1 14:26:44 2026
    Bart <bc@freeuk.com> writes:
    [...]
    These are more or less real examples, I just simplified the
    terms. Here are some from MZLIB:

    return (status == MZ_OK) ? MZ_BUF_ERROR : status;

    return (pL == pE) ? (l_len < r_len) : (l < r);

    sym = (match_dist < 512) ? s0 : s1;

    return ((pState->m_last_status == TINFL_STATUS_DONE) &&
    (!pState->m_dict_avail)) ? MZ_STREAM_END : MZ_OK;

    I believe that in the first three, all parentheses are superfluous, but
    they are used anyway. Why is that?

    Obviously it's because the author of the code thought it was
    clearer with the parentheses (or was working under a coding standard
    written by someone who thought so). I don't think there are any
    deeper conclusions to be drawn. I would have written most of them
    differently, but it's not a big deal.

    (My preferences for ?: are that the whole thing is syntax, outside of
    the precedence scheme, and that it has mandatory parentheses. That
    second line would then look like this:

    return (pL == pE ? l_len < r_len : l < r);

    There are fewer parentheses in all, and less potential confusion. You
    can even have assignments in each branch; they will not interfere with
    ?:.)

    But the precedence scheme *is* syntax. If you prefer to think of ?:
    as something other than an operator, something that doesn't
    follow the same set of rules as other operators, and if that works
    for you, then that's fine. But then how do you know that
    return (pL == pE ? l_len < r_len : l < r);
    means
    return ((pL == pE) ? (l_len < r_len) : l < r);
    and not
    return (pL == (pE ? l_len < r_len : l < r));
    ?

    I know that because I know that ?: is an operator that binds more
    loosely than "==".

    In any case, however you think about ?:, it's clear that
    "pL == pE ? l_len < r_len : l < r" is an expression, and "return"
    *is* outside of the precedence scheme. The outer parentheses are
    superfluous but harmless. (Personally, I dislike parenthesizing
    the expression in a return statement.)
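
    A small sketch of the two readings, for anyone who wants to try it.
    The function names are made up, and the operands are plain ints here
    rather than the pointers and lengths of the MZLIB original; only the
    grouping is the point.

```c
#include <assert.h>

/* Unparenthesised form: relies on ?: binding more loosely than == and <. */
int pick_bare(int pL, int pE, int l_len, int r_len, int l, int r)
{
    return pL == pE ? l_len < r_len : l < r;
}

/* Fully parenthesised form showing how the compiler groups it. */
int pick_grouped(int pL, int pE, int l_len, int r_len, int l, int r)
{
    return (pL == pE) ? (l_len < r_len) : (l < r);
}

/* The other reading, which C does NOT choose. */
int pick_other(int pL, int pE, int l_len, int r_len, int l, int r)
{
    return pL == (pE ? l_len < r_len : l < r);
}

/* Usage: one set of inputs on which the two readings disagree. */
void check_conditional_grouping(void)
{
    assert(pick_bare(5, 5, 1, 2, 0, 0) == pick_grouped(5, 5, 1, 2, 0, 0));
    assert(pick_bare(5, 5, 1, 2, 0, 0) != pick_other(5, 5, 1, 2, 0, 0));
}
```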

    [...]

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Keith Thompson@3:633/10 to All on Mon Jun 1 14:39:04 2026
    Bart <bc@freeuk.com> writes:
    On 01/06/2026 08:52, David Brown wrote:
    [...]
    That still leaves extra parentheses around the equality operators,
    but the decision to keep or remove them is subjective (as is the
    choice of "pData1 == NULL" vs. "!pData1").

    Maybe it's due to || being a symbol; compare:

    if (pData1 == NULL || pData2 == NULL || Length == 0U)

    if (pData1 == NULL or pData2 == NULL or Length == 0U)

    To me, || seems to draw in the terms on either side as strongly as
    ==. That happens less using 'or'.

    (Both are valid C if using iso646.h.)

    The "and" macro in <iso646.h> is exactly equivalent to "||".
    If your intuition tells you they have different precedences, that
    could be a problem. On the other hand, if you choose to use them
    differently in ways that don't break anything, that's fine.

    Digression: Perl borrows most or all of C's operators, and keeps
    the same precedences. "Operators borrowed from C keep the same
    precedence relationship with each other, even where C's precedence
    is slightly screwy." But Perl has "and" and "or" operators that
    work like "&&" and "||" but have lower precedence (that turns out
    to be convenient in some contexts).

    I vaguely recall that there's some language that uses the ?: syntax
    for the conditional operator, but with a different precedence and/or
    associativity than C. I can't remember which language it is.

    [...]

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Keith Thompson@3:633/10 to All on Mon Jun 1 15:11:21 2026
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [...]
    I vaguely recall that there's some language that uses the ?: syntax
    for the conditional operator, but with a different precedence and/or
    associativity than C. I can't remember which language it is.

    The language I was thinking of is PHP. C's ?: operator associates
    right-to-left, which makes it possible to write chained conditional
    expressions like:

    cond1 ? expr1 :
    cond2 ? expr2 :
    cond3 ? expr3 :
    default_expr

    PHP's ?: operator originally associated left-to-right.
    Newer versions of PHP require parentheses.
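
    A concrete C instance of the chained form, as a sketch; size_class()
    and its thresholds are invented for illustration.

```c
#include <assert.h>
#include <string.h>

/* Chained conditional: the first true test wins, else the default.
   Right-to-left associativity means each ':' branch holds the rest
   of the chain. */
const char *size_class(int n)
{
    return n < 10   ? "small"  :
           n < 100  ? "medium" :
           n < 1000 ? "large"  :
                      "huge";
}

/* Usage: one value per branch. */
void check_size_class(void)
{
    assert(strcmp(size_class(5), "small") == 0);
    assert(strcmp(size_class(50), "medium") == 0);
    assert(strcmp(size_class(500), "large") == 0);
    assert(strcmp(size_class(5000), "huge") == 0);
}
```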

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Tim Rentsch@3:633/10 to All on Mon Jun 1 15:23:09 2026
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    The "and" macro in <iso646.h> is exactly equivalent to "||".

    I don't think so.

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Bart@3:633/10 to All on Mon Jun 1 23:24:10 2026
    On 01/06/2026 22:39, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 01/06/2026 08:52, David Brown wrote:
    [...]
    That still leaves extra parentheses around the equality operators,
    but the decision to keep or remove them is subjective (as is the
    choice of "pData1 == NULL" vs. "!pData1").

    Maybe it's due to || being a symbol; compare:

    if (pData1 == NULL || pData2 == NULL || Length == 0U)

    if (pData1 == NULL or pData2 == NULL or Length == 0U)

    To me, || seems to draw in the terms on either side as strongly as
    ==. That happens less using 'or'.

    (Both are valid C if using iso646.h.)

    The "and" macro in <iso646.h> is exactly equivalent to "||".

    I don't think so.

    If your intuition tells you they have different precedences, that
    could be a problem.

    I'm not saying that, just that having named operators helps to
    separate that expression into three groups better than a symbolic
    operator.

    At least for me.



    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Keith Thompson@3:633/10 to All on Mon Jun 1 16:06:52 2026
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    The "and" macro in <iso646.h> is exactly equivalent to "||".

    I don't think so.

    Right, that was a typo/thinko.

    The "and" macro is (almost) exactly equivalent to "&&".
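
    A quick sketch showing that the word forms group exactly as the
    symbols do; with_words() and with_symbols() are made-up names. Some
    compilers may warn about mixing the two operators without parentheses
    in the first function, which is rather the point.

```c
#include <assert.h>
#include <iso646.h>
#include <stdbool.h>

/* Written with the <iso646.h> spellings. */
bool with_words(bool a, bool b, bool c)
{
    return a or b and c;
}

/* Same expression with the usual symbols; && binds tighter than ||. */
bool with_symbols(bool a, bool b, bool c)
{
    return a || (b && c);
}

/* Usage: (a or b) and c would be false here, but the result is true. */
void check_iso646_grouping(void)
{
    assert(with_words(true, false, false) == true);
    assert(with_words(true, false, false) == with_symbols(true, false, false));
}
```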

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From David Brown@3:633/10 to All on Tue Jun 2 08:41:49 2026
    On 02/06/2026 00:11, Keith Thompson wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [...]
    I vaguely recall that there's some language that uses the ?: syntax
    for the conditional operator, but with a different precedence and/or
    associativity than C. I can't remember which language it is.

    The language I was thinking of is PHP. C's ?: operator associates
    right-to-left, which makes it possible to write chained conditional
    expressions like:

    cond1 ? expr1 :
    cond2 ? expr2 :
    cond3 ? expr3 :
    default_expr

    PHP's ?: operator originally associated left-to-right.
    Newer versions of PHP require parentheses.


    I thought you were thinking of C++, where ? has the same precedence as
    assignment, while in C it has higher precedence. It does not make a lot
    of difference, and if you are writing an expression where it matters,
    then I think parentheses would be a good idea.

    <https://cppreference.com/c/language/operator_precedence>
    <https://cppreference.com/cpp/language/operator_precedence>


    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From David Brown@3:633/10 to All on Tue Jun 2 09:09:02 2026
    On 01/06/2026 20:48, Dan Cross wrote:
    In article <10vjsg2$259m3$3@dont-email.me>,
    David Brown <david.brown@hesbynett.no> wrote:
    On 01/06/2026 13:04, Dan Cross wrote:
    In article <10vjdn8$22tgu$1@dont-email.me>,
    David Brown <david.brown@hesbynett.no> wrote:
    On 31/05/2026 19:11, Bart wrote:
    [snip]
    Actual examples of too many parentheses?


    [Snipping the LISP stuff - fun, but OT and not really relevant to the
    thread branch. And I have never used the language.]



    I see code like this all the time; usually it comes from
    hardware vendors (I take it this was from a BSP or something
    similar?). I often wonder about vendor programming standards
    when I run across things like it.

    Yes, this was from a hardware vendor (who shall remain nameless to
    protect the guilty - not that I have found other vendors to be much
    better). They have a tendency to be obsessed with MISRA, with sticking
    to C90, and with filling headers with huge Doxygen templates giving no
    information and obscuring the code. (I'm fine with Doxygen comments
    that actually add useful information, but not a dozen lines repeating
    the names and types from a function signature.)

    Yes. I see all of this, and it mystifies me; I have seen how
    excessive abstraction can lead to opaque code, but many times
    hardware people go in the opposite direction, and one hardly
    ever sees useful abstraction; for example, often the same code
    sequence could be trivially extracted into a function, but it is
    instead repeated multiple times, inline.


    Indeed. There is just /so/ much that is done badly in these SDKs - I
    am not going to go into details as it would take all day. I get the
    impression that software libraries are very much an afterthought for
    most microcontroller design groups - I don't think they ever bother
    talking to developers who will use them. In fact, I don't think they
    talk much to the software folks when designing the microcontrollers
    either.

    Sometimes, however, they do have abstractions - sometimes multiple
    layers of HALs ("Hardware Abstraction Layer"), drivers, interfaces, etc.
    Each layer has a completely different way of viewing things - one will
    use #define'd constants for everything, another will use a struct with
    30 fields passed as a pointer in order to turn a GPIO pin on or off, and
    the next layer will use a macro TURN_GPIO_PIN_A14_ON. When you have
    figured out which API you are expected to use, toggling a GPIO leads to
    a half-dozen nested calls (not including macros) up and down these
    stacks when all the hardware needs is a single write to a particular
    register. And if you are really lucky, a global HAL_LOCK_MUTEX is
    acquired and released along the way.

    The most extreme example I saw was working on a very small 8-bit
    microcontroller - 2 KB of code flash. I wanted to use the ADC, and
    thought I'd save reading the datasheet and reference manual by using the
    "wizard" and SDK. The result needed 4 KB of flash - twice what the chip
    had - and half of its RAM. I looked in the manual and got the same
    results I needed with a single line of C code that compiled to just one
    assembly instruction.

    (Time to cut the rant short.)


    Not to mention symbolic names for the magic constants. :-/

    Names for magic constants can be good, but they are not always helpful -
    if the magic number is only used once, its definition is far from its
    use, and it is polluting the global name space, then it can be a lot
    better to simply use the number directly and add a comment at the point
    of use. But the shift-and-mask constants could be replaced by either a
    struct with bit-fields, or inline functions for field extractions, or at
    least separate local variables for the extracted fields.

    I don't mind some magic: the shift constants and the masks, for
    instance, are fine. But the magic 527, 259, 23, and 33, and why
    the subsequent values are shifted right by 6, could be better
    explained by naming those constants.


    Agreed - those are the "magic" ones, and need explaining (or perhaps
    calculating, at compile time, from something that makes sense to the
    reader and maintainer).
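
    One way to tie the numbers to something readable, as a sketch: write
    the rounded division directly and check the multiply-and-shift
    constants against it. scale_to_8bit() and check_magic_constants() are
    hypothetical names; the constants are the ones from the vendor macro,
    and the check tests the agreement rather than assuming it.

```c
#include <assert.h>
#include <stdint.h>

/* Scale a k-bit channel value to 8 bits by rounded division:
   round(v * 255 / max), where max = 2^k - 1. */
static inline uint32_t scale_to_8bit(uint32_t v, uint32_t max)
{
    return (v * 255u + max / 2u) / max;
}

/* Verify that the multiply-and-shift constants give the same result
   as the rounded division for every 5-bit and 6-bit channel value. */
void check_magic_constants(void)
{
    for (uint32_t v = 0; v < 32; v++)
        assert(((v * 527u + 23u) >> 6) == scale_to_8bit(v, 31u));
    for (uint32_t v = 0; v < 64; v++)
        assert(((v * 259u + 33u) >> 6) == scale_to_8bit(v, 63u));
}
```

    With optimisation enabled a compiler will normally turn the constant
    division into a multiply and shift of its own choosing, so the
    readable form need not cost anything.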

    Btw, with respect to this specific algorithm, I looked them up,
    and they seem to be empirically discovered lore, though derived
    from a relatively standard algorithm for projection of a
    discrete value into a larger space. This stack overflow page
    has some details: https://stackoverflow.com/questions/2442576/how-does-one-convert-16-bit-rgb565-to-24-bit-rgb888


    A URL in comments in the code would be a lot better than just the
    numbers.

    Anyway, I don't think the constants have to be defined far away
    from the code; I'd be happy with a local `const uint32_t FOO`,
    though in this case it should probably just be a comment.
    Here's my offering:

    // Converts a 16-bit RGB16 (5-6-5) value to an ARGB32
    // ("RGBA8888") value.
    static inline uint32_t
    rgb16_to_argb(uint16_t color)
    {
    const uint32_t blue5 = (color >> 0) & 0x1F;
    const uint32_t green6 = (color >> 5) & 0x3F;
    const uint32_t red5 = (color >> 11) & 0x1F;

    // Map from a 5 or 6 bit space into an 8 bit space. A
    // 5-bit number has 32 possibilities; a 6-bit number
    // has 64. To calculate the projected 8-bit value for
    // a k-bit number v, we can use the formula
    // v_8 = (v*(2^8-1) + (2^k-1)/2)/(2^k-1), i.e.
    // (v*255 + 15)/31 (for k=5) or (v*255 + 31)/63 (for
    // k=6).
    //
    // To replace the division with a multiply and a
    // shift, the constants below were empirically
    // discovered to generate good results. See
    // https://stackoverflow.com/questions/2442576/how-does-one-convert-16-bit-rgb565-to-24-bit-rgb888
    // for details.
    const uint32_t blue = (blue5 * 527 + 23) >> 6;
    const uint32_t green = (green6 * 259 + 33) >> 6;
    const uint32_t red = (red5 * 527 + 23) >> 6;
    const uint32_t alpha = 0xFF000000;

    return blue | (green << 8) | (red << 16) | alpha;
    }

    It's longer, yes, but I'd argue it's much easier to understand.
    On my compiler, it generates almost identical code, except that
    some instructions are in a different order.

    Yes, that would be vastly better. (I would still prefer to have
    different named types for colours in the different encoding schemes.)


    This is exactly the sort of thing that, as you point out, a
    `static inline` function is far better suited for. Some code
    bases don't want to use them for a variety of reasons, usually
    compatibility concerns with older code, compilers, or language
    standards. Some variants of Unix, for instance, worry about
    header compatibility with C90 [and in some cases K&R C] code.

    Indeed. But even if they don't want to use "inline", a static function
    is better - the compiler will do the inlining anyway (if it makes sense
    according to its heuristics).

    Assuming the compiler they're working with is known to do so,
    then I agree.


    If a compiler is not capable of inlining static functions without them
    being labelled "inline", then you are unlikely to get efficient results
    anyway. (Or the user has not enabled optimisation, and again cannot
    expect efficient results.) I don't see the point in pandering to poorly
    optimising compilers (including good compilers with optimisation
    disabled) in order to produce marginally less big and slow code. There
    was a time when a good optimising compiler was a significant investment
    and not always within the budget for a project, but such times are far
    in the past. I can understand that some developers are hamstrung by
    daft C90 restrictions, but I have little sympathy for people wanting
    good results from poor tools.

    (The exception, perhaps, is people who have to use Microchip development tools.)





  • From David Brown@3:633/10 to All on Tue Jun 2 09:17:01 2026
    On 01/06/2026 22:04, Bart wrote:
    On 01/06/2026 19:48, Dan Cross wrote:
    In article <10vjsg2$259m3$3@dont-email.me>,

    Names for magic constants can be good, but they are not always helpful -
    if the magic number is only used once, its definition is far from its
    use, and it is polluting the global name space, then it can be a lot
    better to simply use the number directly and add a comment at the point
    of use. But the shift-and-mask constants could be replaced by either a
    struct with bit-fields, or inline functions for field extractions, or at
    least separate local variables for the extracted fields.

    I don't mind some magic: the shift constants and the masks, for
    instance, are fine. But the magic 527, 259, 23, and 33, and why
    the subsequent values are shifted right by 6, could be better
    explained by naming those constants.

    Btw, with respect to this specific algorithm, I looked them up,
    and they seem to be empirically discovered lore, though derived
    from a relatively standard algorithm for projection of a
    discrete value into a larger space. This Stack Overflow page
    has some details:
    https://stackoverflow.com/questions/2442576/how-does-one-convert-16-bit-rgb565-to-24-bit-rgb888

    Anyway, I don't think the constants have to be defined far away
    from the code; I'd be happy with a local `const uint32_t FOO`,
    though in this case it should probably just be a comment.
    Here's my offering:

    // Converts a 16-bit RGB16 (5-6-5) value to an ARGB32
    // ("RGBA8888") value.
    static inline uint32_t
    rgb16_to_argb(uint16_t color)
    {
        const uint32_t blue5  = (color >>  0) & 0x1F;
        const uint32_t green6 = (color >>  5) & 0x3F;
        const uint32_t red5   = (color >> 11) & 0x1F;

        // Map from a 5 or 6 bit space into an 8 bit space. A
        // 5-bit number has 32 possibilities; a 6-bit number
        // has 64. To calculate the projected 8-bit value for
        // a k-bit number v, we can use the formula
        // v_8 = (v*(2^8-1) + (2^k-1)/2)/(2^k-1), i.e.
        // (v*255 + 15)/31 (for k=5) or (v*255 + 31)/63 (for
        // k=6).
        //
        // To remove the division by 31 or 63 and turn it into
        // a shift, the constants below were empirically
        // discovered to generate good results. See
        // https://stackoverflow.com/questions/2442576/how-does-one-convert-16-bit-rgb565-to-24-bit-rgb888
        // for details.
        const uint32_t blue  = (blue5 * 527 + 23) >> 6;
        const uint32_t green = (green6 * 259 + 33) >> 6;
        const uint32_t red   = (red5 * 527 + 23) >> 6;
        const uint32_t alpha = 0xFF000000;

        return blue | (green << 8) | (red << 16) | alpha;
    }

    It's longer, yes, but I'd argue it's much easier to understand.
    On my compiler, it generates almost identical code, except that
    some instructions are in a different order.

    The speed probably isn't that important. This can be table-driven: you
    use those formulae once to populate some tables (and with the shifts
    built-in). Then the routine can be simplified to this:


      uint32_t rgb16_to_argb_bc(uint16_t color) {
          const uint32_t blue5  = (color >>  0) & 0x1F;
          const uint32_t green6 = (color >>  5) & 0x3F;
          const uint32_t red5   = (color >> 11) & 0x1F;

          return bluetab[blue5] | greentab[green6] | redtab[red5] |
                 0xFF000000;
      }

    On a test I did (one billion conversions cycling over 1M precalculated
    random 16-bit numbers), the table version was twice as fast. Maybe a bit
    faster if the Alpha value is pre-added to the red-table.

    (Results were merely summed, but if writing into a new buffer, then
    memory access is probably more dominant.)
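
    The post does not show how the tables get populated, so here is a minimal sketch of one way to do it, together with an exhaustive comparison against the computed version. The table names come from the post; init_rgb16_tables() and check_rgb16_tables() are hypothetical names, and the shifts are assumed to be folded into the tables as described.

```c
#include <assert.h>
#include <stdint.h>

/* Lookup tables with the output shifts already applied. */
static uint32_t bluetab[32];
static uint32_t greentab[64];
static uint32_t redtab[32];

/* Populate the tables once, using the same formulae as the computed version. */
void init_rgb16_tables(void)
{
    for (uint32_t v = 0; v < 32; v++) {
        const uint32_t c8 = (v * 527 + 23) >> 6;
        bluetab[v] = c8;
        redtab[v]  = c8 << 16;
    }
    for (uint32_t v = 0; v < 64; v++)
        greentab[v] = ((v * 259 + 33) >> 6) << 8;
}

/* Computed version, as quoted earlier in the thread. */
uint32_t rgb16_to_argb(uint16_t color)
{
    const uint32_t blue5  = (color >>  0) & 0x1F;
    const uint32_t green6 = (color >>  5) & 0x3F;
    const uint32_t red5   = (color >> 11) & 0x1F;
    const uint32_t blue   = (blue5 * 527 + 23) >> 6;
    const uint32_t green  = (green6 * 259 + 33) >> 6;
    const uint32_t red    = (red5 * 527 + 23) >> 6;
    return blue | (green << 8) | (red << 16) | 0xFF000000u;
}

/* Table-driven version; init_rgb16_tables() must have been called first. */
uint32_t rgb16_to_argb_bc(uint16_t color)
{
    const uint32_t blue5  = (color >>  0) & 0x1F;
    const uint32_t green6 = (color >>  5) & 0x3F;
    const uint32_t red5   = (color >> 11) & 0x1F;
    return bluetab[blue5] | greentab[green6] | redtab[red5] | 0xFF000000u;
}

/* Asserts that both versions agree on all 65536 possible inputs. */
void check_rgb16_tables(void)
{
    init_rgb16_tables();
    for (uint32_t c = 0; c <= 0xFFFFu; c++)
        assert(rgb16_to_argb((uint16_t)c) == rgb16_to_argb_bc((uint16_t)c));
}
```

    The three tables take 128 entries (512 bytes with 32-bit entries), which is part of the speed versus table-space tradeoff on small targets.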

    Such timing results are, as they stand, totally useless - the best
    choice of algorithm is entirely dependent on the target device,
    tradeoffs for speed and code/table space, and on how the code is used
    in practice.

    It is absolutely true that a table-based approach can give faster
    results. (And there's no doubt that your table code here is vastly
    clearer than the original macro.) On some microcontrollers, avoiding
    the multiplications would give code that is an order of magnitude
    faster. On others, table lookup would be a lot slower.

    So it is definitely worth thinking about alternative approaches such as
    this, but testing on a PC gives very little information about real-world
    speed.





  • From Keith Thompson@3:633/10 to All on Tue Jun 2 02:07:48 2026
    David Brown <david.brown@hesbynett.no> writes:
    On 02/06/2026 00:11, Keith Thompson wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [...]
    I vaguely recall that there's some language that uses the ?: syntax
    for the conditional operator, but with a different precedence and/or
    associativity than C. I can't remember which language it is.

    The language I was thinking of is PHP. C's ?: operator associates
    right-to-left, which makes it possible to write chained conditional
    expressions like:
        cond1 ? expr1 :
        cond2 ? expr2 :
        cond3 ? expr3 :
        default_expr
    PHP's ?: operator originally associated left-to-right.
    Newer versions of PHP require parentheses.
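
    To make the right-to-left grouping concrete, here is a small sketch in C. The classify() function and its thresholds are invented for illustration; the second function spells out the grouping that the chained form gets implicitly.

```c
#include <assert.h>
#include <string.h>

/* Chained form: relies on ?: associating right-to-left. */
const char *classify(int n)
{
    return n < 0  ? "negative" :
           n == 0 ? "zero"     :
           n < 10 ? "small"    :
                    "large";
}

/* The same expression with the implicit grouping written out. */
const char *classify_explicit(int n)
{
    return n < 0 ? "negative"
                 : (n == 0 ? "zero" : (n < 10 ? "small" : "large"));
}

/* Asserts that both spellings agree over a small range of inputs. */
void check_classify(void)
{
    for (int n = -3; n <= 12; n++)
        assert(strcmp(classify(n), classify_explicit(n)) == 0);
}
```

    With left-to-right grouping, the result of the first conditional would become the condition of the next one, which is why the chained idiom did not carry over to older PHP.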

    I thought you were thinking of C++, where ? has the same precedence as
    assignment, while in C it has higher precedence. It does not make a
    lot of difference, and if you are writing an expression where it
    matters, then I think parentheses would be a good idea.

    <https://cppreference.com/c/language/operator_precedence>
    <https://cppreference.com/cpp/language/operator_precedence>

    Hmm. I'm not sure I either follow or trust those tables.

    Looking at the grammar in the C++ standard, there is a difference.
    C has:

        conditional-expression:
            logical-OR-expression
            logical-OR-expression ? expression : conditional-expression

    while C++ has:

        conditional-expression:
            logical-or-expression
            logical-or-expression ? expression : assignment-expression

    But the difference isn't mentioned in the Compatibility annex of the C++ standard.

    I'd be interested in seeing a conditional expression whose legality or semantics differs between C and C++.

    (Digression: I hate the fact that such a long and sometimes
    informative thread has such a stupid subject header.)

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Janis Papanagnou@3:633/10 to All on Tue Jun 2 11:35:57 2026
    On 2026-06-02 00:24, Bart wrote:
    On 01/06/2026 22:39, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 01/06/2026 08:52, David Brown wrote:
    [...]
    That still leaves extra parentheses around the equality operators,
    but the decision to keep or remove them is subjective (as is the
    choice of "pData1 == NULL" vs. "!pData1").

    Maybe it's due to || being a symbol; compare:

          if (pData1 == NULL || pData2 == NULL || Length == 0U)

          if (pData1 == NULL or pData2 == NULL or Length == 0U)

    To me, || seems to draw in the terms on either side as strongly as
    ==. That happens less using 'or'.

    (Both are valid C if using iso646.h.)
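
    A minimal sketch of the iso646.h point: the header defines or as a macro for ||, so the two spellings compile to the same test. The function names are hypothetical; the parameter names follow the quoted example.

```c
#include <assert.h>
#include <iso646.h>
#include <stdbool.h>
#include <stddef.h>

/* Spelled with the iso646.h alternative token. */
bool args_invalid_word(const void *pData1, const void *pData2, unsigned Length)
{
    return pData1 == NULL or pData2 == NULL or Length == 0U;
}

/* Spelled with the symbolic operator. */
bool args_invalid_symbol(const void *pData1, const void *pData2, unsigned Length)
{
    return pData1 == NULL || pData2 == NULL || Length == 0U;
}

/* Asserts that both spellings give the same answer for a few cases. */
void check_args_invalid(void)
{
    int x = 0, y = 0;
    assert(args_invalid_word(&x, &y, 4U) == args_invalid_symbol(&x, &y, 4U));
    assert(args_invalid_word(NULL, &y, 4U) == args_invalid_symbol(NULL, &y, 4U));
    assert(args_invalid_word(&x, &y, 0U) == args_invalid_symbol(&x, &y, 0U));
}
```

    Precedence is unchanged by the spelling, since or is only a macro for ||; any difference is purely visual.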

    [...]

    [...]

    I'm not saying that, just that having named operators helps to
    separate that expression into three groups better than a symbolic
    operator.

    At least for me.

    I suppose because, as words, they stand out, are easier to distinguish,
    especially among the mass of punctuation characters that exist in "C",
    and psychologically suggest their dominance? - Yes, maybe.[*] - But
    don't count on such "perception logic"; it generally won't make you
    happy.[**]

    Janis

    [*] In the above quoted example where there's identifiers around the
    operators it appears to me that the 'and'/'or' variants would be worse,
    though, concerning visual perceivability.
    It would certainly be different if the example had used numbers, like
    if ( pData1 == 0 || pData2 == 0 || Length == 0 )
    or the '!var' variant (for those who prefer that)
    if ( !pData1 || !pData2 || !Length )

    [**] For example, with Pascal. While that language has only very few
    precedence groups - as you said you'd prefer fewer to many groups! -
    they put the 'and' together with arithmetic * and / , and the 'or'
    together with + and - . Given that all the comparisons are in the
    lowest precedence group you will have to use parentheses; say,
    'a < b && c == d' (in "C") would be '(a < b) and (c = d)' (in Pascal).
    The keyword didn't help given the precedence groups' design with only
    few precedence groups [in Pascal].
    Back in those days, that demand for parentheses slightly annoyed me,
    since I thought (and still think) that the boolean keywords would
    sufficiently hint at the semantic intention, and with more precedence
    levels in the
    language design they could easily have simplified those common cases.
    To each [language] its own [flaw].

  • From Janis Papanagnou@3:633/10 to All on Tue Jun 2 11:38:38 2026
    On 2026-06-02 11:07, Keith Thompson wrote:
    [...]

    (Digression: I hate the fact that such a long and sometimes
    informative thread has such a stupid subject header.)

    And what did prevent you from changing it? :-}

    Janis


  • From David Brown@3:633/10 to All on Tue Jun 2 11:46:36 2026
    On 02/06/2026 11:07, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 02/06/2026 00:11, Keith Thompson wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [...]
    I vaguely recall that there's some language that uses the ?: syntax
    for the conditional operator, but with a different precedence and/or
    associativity than C. I can't remember which language it is.

    The language I was thinking of is PHP. C's ?: operator associates
    right-to-left, which makes it possible to write chained conditional
    expressions like:
        cond1 ? expr1 :
        cond2 ? expr2 :
        cond3 ? expr3 :
        default_expr
    PHP's ?: operator originally associated left-to-right.
    Newer versions of PHP require parentheses.

    I thought you were thinking of C++, where ? has the same precedence as
    assignment, while in C it has higher precedence. It does not make a
    lot of difference, and if you are writing an expression where it
    matters, then I think parentheses would be a good idea.

    <https://cppreference.com/c/language/operator_precedence>
    <https://cppreference.com/cpp/language/operator_precedence>

    Hmm. I'm not sure I either follow or trust those tables.

    cppreference.com is normally very accurate - it is linked from the
    isocpp.org website and AFAIUI maintained or checked by people involved
    in the C++ standards. Mistakes here are definitely something that
    should be taken seriously.


    Looking at the grammar in the C++ standard, there is a difference.
    C has:

        conditional-expression:
            logical-OR-expression
            logical-OR-expression ? expression : conditional-expression

    while C++ has:

        conditional-expression:
            logical-or-expression
            logical-or-expression ? expression : assignment-expression

    But the difference isn't mentioned in the Compatibility annex of the C++ standard.

    I'd be interested in seeing a conditional expression whose legality or semantics differs between C and C++.

    There is a little information in the "discussion" page of the C++ side
    linked above. An example is

    true ? a : b = 7;

    In C, the ternary operator has higher precedence than assignment and
    this therefore parses as :

    (true ? a : b) = 7;

    In C, the ternary operator does not return an lvalue, so this is a
    constraint error.

    In C++, the precedence of ternary and assignment are the same, with
    right-to-left associativity, so this is parsed as :

    true ? a : (b = 7)

    and evaluates as the value of "a", leaving "b" untouched.

    I am not confident enough in my standardese, especially for C++, to
    judge if the above explanation is correct according to the standards.
    But a quick test on godbolt shows that both gcc and clang follow that
    line of reasoning. (It is possible that they are both wrong, but that
    would be surprising.)

    The difference in precedences here is, I think, related to the ternary
    operator being able to evaluate to an lvalue in C++ but not in C - and
    that /is/ mentioned in the C++ compatibility annex.
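
    As a hedged illustration in C only: the explicitly parenthesised form below is accepted by both languages and leaves b untouched, and the pointer form is one common way in C to assign to a conditionally chosen object, since the C conditional operator does not yield an lvalue. The function names are invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Evaluates "true ? a : (b = 7)" and reports b afterwards via *b_after. */
int select_without_assigning(int *b_after)
{
    int a = 1;
    int b = 2;
    /* Explicit spelling of the C++ parse; valid in both C and C++. */
    int result = true ? a : (b = 7);
    *b_after = b;   /* still 2: the third operand is never evaluated */
    return result;
}

/* In C, select the object through pointers and assign through the result. */
void assign_selected(bool cond, int *a, int *b, int value)
{
    *(cond ? a : b) = value;
}

/* Asserts the behaviour described in the comments above. */
void check_ternary_examples(void)
{
    int b_after = 0;
    assert(select_without_assigning(&b_after) == 1);
    assert(b_after == 2);

    int a = 1, b = 2;
    assign_selected(true, &a, &b, 7);
    assert(a == 7 && b == 2);
}
```

    Writing the parentheses out removes the dependence on which language's grammar is in play.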


    (Digression: I hate the fact that such a long and sometimes
    informative thread has such a stupid subject header.)


    Agreed.


  • From Janis Papanagnou@3:633/10 to All on Tue Jun 2 11:48:35 2026
    On 2026-06-01 00:54, Keith Thompson wrote:
    [...]

    Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
    required to do so, and (INT_MAX + 1) * 0 still has undefined
    behavior. Undefined behavior is determined by the rules of the
    abstract machine *without* any adjustments permitted by the as-if
    rule.

    This is something I really don't get in the actual C-logic...

    Using constants that can be determined at compile time is UB here,
    despite the '* 0' mathematically indicating an IMO clear semantics,
    but using variables is only UB possibly at runtime? And despite all
    that the latter might not even get triggered because it's probably
    optimized away? - I can't help, this sounds really crude.

    Is there any rationale from the _software designer_'s perspective?

    Janis


  • From Bart@3:633/10 to All on Tue Jun 2 11:09:12 2026
    On 02/06/2026 10:46, David Brown wrote:
    On 02/06/2026 11:07, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 02/06/2026 00:11, Keith Thompson wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [...]
    I vaguely recall that there's some language that uses the ?: syntax
    for the conditional operator, but with a different precedence and/or
    associativity than C. I can't remember which language it is.

    The language I was thinking of is PHP. C's ?: operator associates
    right-to-left, which makes it possible to write chained conditional
    expressions like:
          cond1 ? expr1 :
          cond2 ? expr2 :
          cond3 ? expr3 :
          default_expr
    PHP's ?: operator originally associated left-to-right.
    Newer versions of PHP require parentheses.

    I thought you were thinking of C++, where ? has the same precedence as
    assignment, while in C it has higher precedence. It does not make a
    lot of difference, and if you are writing an expression where it
    matters, then I think parentheses would be a good idea.

    <https://cppreference.com/c/language/operator_precedence>
    <https://cppreference.com/cpp/language/operator_precedence>

    Hmm. I'm not sure I either follow or trust those tables.

    cppreference.com is normally very accurate - it is linked from the
    isocpp.org website and AFAIUI maintained or checked by people involved
    in the C++ standards. Mistakes here are definitely something that
    should be taken seriously.


    Looking at the grammar in the C++ standard, there is a difference.
    C has:

         conditional-expression:
             logical-OR-expression
             logical-OR-expression ? expression : conditional-expression

    while C++ has:

         conditional-expression:
             logical-or-expression
             logical-or-expression ? expression : assignment-expression

    But the difference isn't mentioned in the Compatibility annex of the C++
    standard.

    I'd be interested in seeing a conditional expression whose legality or
    semantics differs between C and C++.

    There is a little information in the "discussion" page of the C++ side
    linked above. An example is

        true ? a : b = 7;

    In C, the ternary operator has higher precedence than assignment and
    this therefore parses as :

        (true ? a : b) = 7;

    In C, the ternary operator does not return an lvalue, so this is a
    constraint error.

    In C++, the precedence of ternary and assignment are the same, with
    right-to-left associativity, so this is parsed as :

        true ? a : (b = 7)

    and evaluates as the value of "a", leaving "b" untouched.

    I am not confident enough in my standardese, especially for C++, to
    judge if the above explanation is correct according to the standards.
    But a quick test on godbolt shows that both gcc and clang follow that
    line of reasoning. (It is possible that they are both wrong, but that
    would be surprising.)

    The difference in precedences here is, I think, related to the ternary
    operator being able to evaluate to an lvalue in C++ but not in C - and
    that /is/ mentioned in the C++ compatibility annex.


    I was surprised that there would be such a subtle difference given that
    the languages are that close; there are totally unrelated languages that
    slavishly follow C precedence rules more closely!

    But the behaviour only seems to vary in code that would be invalid in C
    anyway (unbracketed ?: term on LHS of '=' operator).

    Your table however also shows || had same precedence as both ?: and =.
    There, I couldn't find an example that made a difference.

    Still, I'd find that unsettling; I would rather that ?: was distinct
    from both, either with its own level, or via other language rules. (In
    my stuff it is always written with parentheses.)

  • From Janis Papanagnou@3:633/10 to All on Tue Jun 2 12:16:09 2026
    On 2026-05-31 04:53, Keith Thompson wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 2026-05-31 01:43, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [...]
    [...]
    C's operator precedence rules are complicated and arguably flawed.

    I'd say that just the (known) flaw makes them (slightly) complicated;
    so you need to remember that "flaw" (or "inconsistency") to be safe.
    The rest is completely sensible. And even if one doesn't have a table
    to look up the precedences they mostly can be derived (presuming one
    has a feeling for the underlying logic of these things or experiences
    from other related areas).

    Reasonable, but I feel the need to say that that's your personal
    opinion. You seem to think that C's precedence rules have one and
    only one flaw, and a set of rules with that flaw corrected would
    be ideal.

    Erm, no. Not "ideal". (This is just another formulation for what
    Bart expressed with the word "perfect" that he'd put in my mouth.)

    What I'm saying is that with the knowledge of the contexts of the
    underlying models (mathematical and logical calculi, based on a
    sensible definition) the possible (sensible) options are sparse.

    We should always keep in mind that there's an inherent difference
    of subjective opinions and knowledge based sensible conventions.
    Such conventions are not as universal as natural laws, they can't
    be because they are human-made, but they can be sensibly defined
    (often due to practical reasoning). Arithmetic is such a case, and
    the hierarchy of lower-level operations to higher-level (+, *, **)
    straightforwardly defined. Practical consideration, like usage in
    positional notation systems add to the form of such conventions.
    That's certainly far from an unfounded just "subjective opinion".


    I don't even necessarily disagree, but others are likely to have
    different opinions, and those opinions might be perfectly valid.

    It's not about opinions. (See above.)


    I don't want to make a huge deal out of this. I honestly don't have
    a strong opinion myself. I usually find dealing with the rules
    as they exist to be a much better use of my time and attention --
    and I don't mean that as a criticism of anyone who chooses to think
    about alternatives.

    Oh, I'm not handling that differently in practice. When I had read
    my K&R translation (from 1983) to learn the C-language I just made
    a small note in my book (http://volatile.gridbug.de/C-op-prec.png,
    where the faint comment "sinnvoll" means "sensible") and carried on.

    That comment was actually useful to immediately see that flaw when
    looking up the precedences while programming in "C" to not make a
    programming mistake.

    [...]

    [...]
    When designing a new language, there are real advantages in strictly
    imitating C's rules, just because so many programmers are familiar
    with them.

    Huh? - How that? - Are you saying here that practically only C-like
    languages are in common use?

    Huh? No, I didn't say that at all.

    I suggest that if you're designing a somewhat C-like language,
    sticking to C's precedence rules has advantages due to programmer
    familiarity. Even for a language that's not particularly C-like,
    but that has C-like expressions, the designer might consider
    following C's rules.

    Oh, I see.

    Or not.

    [...]

    (It would have been silly for C++ or Objective-C to
    change the precedence rules, even to improve them.) But there
    are also real advantages in using precedence rules that are better
    (e.g., simpler) than C's.

    Or - with reference to that flaw - just more consistent.

    Consistent systems are inherently simpler, in the sense of easier to
    understand and thus more straightforward to use. A precondition for
    that is, as said, at least a basic understanding of such things.

    Ah, but consistent with what? Internal consistency and consistency
    with existing practice are not necessarily the same thing.

    Right. And both should be considered when designing such things.

    Janis

    [...]


  • From James Kuyper@3:633/10 to All on Tue Jun 2 06:37:17 2026
    On 2026-06-02 05:48, Janis Papanagnou wrote:
    On 2026-06-01 00:54, Keith Thompson wrote:
    [...]

    Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
    required to do so, and (INT_MAX + 1) * 0 still has undefined
    behavior. Undefined behavior is determined by the rules of the
    abstract machine *without* any adjustments permitted by the as-if
    rule.

    This is something I really don't get in the actual C-logic...

    Using constants that can be determined at compile time is UB here,
    despite the '* 0' mathematically indicating an IMO clear semantics,
    but using variables is only UB possibly at runtime? And despite all
    that the latter might not even get triggered because it's probably
    optimized away? - I can't help, this sounds really crude.

    Is there any rationale from the _software designer_'s perspective?

    Yes - the rationale is to keep things simple. The abstract machine has
    exactly the semantics specified in the standard. Whether or not a given
    expression has undefined behavior depends only upon the operator and
    its operands, and not on the context in which it is invoked. It's only
    when the abstract machine has defined observable behavior when executing
    a program that it becomes meaningful to allow optimizations that
    preserve that behavior.
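
    One practical consequence, sketched here under the assumption that the operands are plain int: if the sum might overflow, the check has to happen before the addition, because multiplying by zero afterwards does not undo anything. The helper names checked_add() and sum_times_zero() are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Stores a + b in *sum and returns true when the sum is representable
   in int; returns false and leaves *sum untouched otherwise. */
bool checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}

/* Computes (a + b) * 0 without ever evaluating an overflowing a + b. */
bool sum_times_zero(int a, int b, int *out)
{
    int sum;
    if (!checked_add(a, b, &sum))
        return false;
    *out = sum * 0;
    return true;
}

/* Asserts the in-range and out-of-range cases behave as described. */
void check_sum_times_zero(void)
{
    int out = -1;
    assert(sum_times_zero(2, 3, &out) && out == 0);
    assert(!sum_times_zero(INT_MAX, 1, &out));
    assert(!sum_times_zero(INT_MIN, -1, &out));
}
```

    The range test only uses INT_MAX - b and INT_MIN - b in the cases where those subtractions themselves cannot overflow.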

  • From Janis Papanagnou@3:633/10 to All on Tue Jun 2 12:55:05 2026
    On 2026-05-31 11:47, David Brown wrote:
    On 31/05/2026 03:37, Janis Papanagnou wrote:
    On 2026-05-31 01:43, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [...]

    If not, people can choose to ignore them when writing C code,
    for example like this where all () are technically superfluous:

        crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];

    Yes, they can, and I personally tend to agree that they should.

    The more complex the expressions are the more structure they need.

    IMO, the parenthesis above make precedence clear (if unknown!), but
    are not contributing to readability. It would have made more sense
    to separate the sub-expression within the [...] in an own object to
    enhance readability and to more easily understand what's going on.

    To emphasize; not the precedences are the problem above, but the
    complexity of the expression in connexion with lack of structuring.
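
    A sketch of what splitting the expression could look like, assuming the usual reflected CRC-32 polynomial 0xEDB88320 and a 16-entry nibble table generated at startup. Only crcu32, s_crc32 and the one-line expression come from the thread; the function names and the table generator are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t s_crc32[16];

/* Build the 16-entry table for the reflected polynomial 0xEDB88320. */
void crc32_init_table(void)
{
    for (uint32_t i = 0; i < 16; i++) {
        uint32_t c = i;
        for (int bit = 0; bit < 4; bit++)
            c = (c & 1u) ? (c >> 1) ^ 0xEDB88320u : (c >> 1);
        s_crc32[i] = c;
    }
}

/* One-line form, as quoted in the thread (low nibble, then high nibble). */
uint32_t crc32_oneline(const uint8_t *p, size_t n)
{
    uint32_t crcu32 = 0xFFFFFFFFu;
    while (n--) {
        const uint8_t b = *p++;
        crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
        crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b >> 4)];
    }
    return ~crcu32;
}

/* Split form: the table index gets its own named object. */
static uint32_t crc32_nibble_step(uint32_t crc, uint32_t nibble)
{
    const uint32_t index = (crc & 0xF) ^ (nibble & 0xF);
    return (crc >> 4) ^ s_crc32[index];
}

uint32_t crc32_split(const uint8_t *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (n--) {
        const uint8_t b = *p++;
        crc = crc32_nibble_step(crc, b);        /* low nibble  */
        crc = crc32_nibble_step(crc, b >> 4);   /* high nibble */
    }
    return ~crc;
}

/* Asserts that both forms agree on a short test message. */
void check_crc32_forms(void)
{
    static const uint8_t msg[] = "123456789";
    crc32_init_table();
    assert(crc32_split(msg, 9) == crc32_oneline(msg, 9));
}
```

    Whether the extra function reads better is exactly the matter of taste under discussion here; a compiler would be expected to inline it either way.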

    This is an example of how readability depends on the reader. To me,
    there is no benefit in having a sub-expression here because the
    structure is clear - this is how you do table-based crc's with 4-bit
    chunks.

    To me, the precedence is as clear as the structure. That's not the
    issue I see with that expression.

    It's the overloaded expression that is what makes it "unreadable".
    (It's actually similar to those overloaded expressions that we saw
    in another recent sub-/thread about color-conversions.)

    But to someone unfamiliar with CRC calculations, splitting the
    expression up might make it clearer. (Alternatively, a comment block
    with an explanation could help.)

    And that also has nothing to do with table-based algorithms
    (which are a triviality), nor with the CRC (or other) coding-programs.
    (Note that I'm saying that as someone who has implemented a lot of
    such things, various CRCs, directly calculated and table-based, and
    much more demanding coding software than these simple CRCs.)


    I /do/ think the parentheses here are helpful for readability, precisely
    because they emphasise the structure of the expression. You could write:

        crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];

    but that needs significantly more cognitive effort to parse when reading
    it, could be misinterpreted, and has lost all the structure that makes
    it easy to see what is going on.

    Yes, I recognize that in that example the parentheses help combining
    parts. (But as said, I see the primary problem in the complexity of
    the expression.)


    (I regularly use bit-manipulation and shift instructions in my code -
    but I still felt it best to check the details in a precedence table
    before writing that.)

    Agreed.


    The expression as originally parenthesised is thus definitely easier
    for /me/ to read, and is almost exactly the way I would write it myself :

        crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];

    Acknowledged.


    The only differences I would have are the names (why would anyone put
    variable types into the names like "crcu32" ?

    Given the more obvious problem I see with that expression I hadn't
    commented on that; but you are right and I certainly agree.

    The coding algorithms that I implemented had always been just plain
    straight ("unsigned") registers. (So there's no need to reflect that
    property in the names of such variables.)

    We are not writing
    BASIC), and I'd use a small case "0xf". Unlike almost every example
    Bart has shown before, it even has nice spacing!

    I'm not that picky with the hexadecimal constants it seems; I seem to
    use both forms, depending on subjective readability in the respective
    context. A quick glimpse into my table-driven CRC C-code shows that I
    used lower-case, and that seems generally to be the prevalent form.[*]

    Janis

    [*] Anecdotally: lowercase tables may have their issues though; once
    I had implemented a DES-3 algorithm, which was full of tabular data.
    (That was at times when we could obtain standards only as paper.)
    My code failed against my tests in about 30% of the test cases. The
    reason of the problem was a single table entry 'db' vs. 'bd' (or vv.)
    It was a horror to find that typo in that huge stack of table data;
    somehow it was difficult to find that even after having read several
    times across all that constant data. I'm not sure the same could have
    happened with uppercase, though...


  • From Tim Rentsch@3:633/10 to All on Tue Jun 2 04:16:48 2026
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    [syntax for conditional expressions]

    Looking at the grammar in the C++ standard, there is a difference.
    C has:

        conditional-expression:
            logical-OR-expression
            logical-OR-expression ? expression : conditional-expression

    while C++ has:

        conditional-expression:
            logical-or-expression
            logical-or-expression ? expression : assignment-expression

    But the difference isn't mentioned in the Compatibility annex of the
    C++ standard.

    Like I have said before, there are lots of differences between C
    and C++ that aren't mentioned in the Compatibility annex of the
    C++ standard. It isn't surprising to find another one.

  • From Keith Thompson@3:633/10 to All on Tue Jun 2 05:01:38 2026
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 2026-06-02 11:07, Keith Thompson wrote:
    [...]
    (Digression: I hate the fact that such a long and sometimes
    informative thread has such a stupid subject header.)

    And what did prevent you from changing it? :-}

    Futility. At best, I could start a new subthread. The existing
    subject line would live on.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Tim Rentsch@3:633/10 to All on Tue Jun 2 05:06:18 2026
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 2026-06-01 00:54, Keith Thompson wrote:

    [...]

    Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
    required to do so, and (INT_MAX + 1) * 0 still has undefined
    behavior. Undefined behavior is determined by the rules of the
    abstract machine *without* any adjustments permitted by the as-if
    rule.

    This is something I really don't get in the actual C-logic...

    Using constants that can be determined at compile time is UB here,
    despite the '* 0' mathematically indicating an IMO clear semantics,
    but using variables is only UB possibly at runtime? [...]

    There's an important distinction to make here. Consider this
    program:

    #include <limits.h>

    int
    foo(){
        int zero = (INT_MAX+1)*0;
        return zero;
    }

    int
    main(){
        return 0;
    }

    This program does not transgress the bounds of undefined behavior.
    Even more than that, the program is strictly conforming, and must be
    accepted by a conforming implementation.

    Now let's change the program slightly:

    #include <limits.h>

    int
    foo(){
        static int zero = (INT_MAX+1)*0;
        return zero;
    }

    int
    main(){
        return 0;
    }

    This program does transgress the bounds of undefined behavior. The
    reason for the difference is that in the first program the semantics
    of foo() is to evaluate the expression to be stored in 'zero' only
    at runtime, whereas in the second program the semantics of foo() is
    to evaluate the expression to be stored in 'zero' before program
    startup (informally, "at compile time"). What matters is not
    whether the offending expression /might/ be evaluated "at compile
    time", but whether the offending expression /must/ be evaluated "at
    compile time". Only in the second case is undefined behavior
    inevitable (and thus it does not occur in the first program).

    Fine point: strictly speaking, I believe the C standard allows even
    the second program to complete translation phase 8 successfully, and
    for any offending behavior to occur only when we actually try to run
    the program. To say that another way, there is no requirement that
    possible nasal demons be made manifest at any point before an actual
    attempted execution. On the other hand, because that possibility is
    there lurking in the background, there is no requirement that the
    program be accepted, and it could be rejected by a conforming compiler.

  • From Dan Cross@3:633/10 to All on Tue Jun 2 12:07:42 2026
    In article <10vlvie$2ne3j$2@dont-email.me>,
    David Brown <david.brown@hesbynett.no> wrote:
    On 01/06/2026 20:48, Dan Cross wrote:
    In article <10vjsg2$259m3$3@dont-email.me>,
    David Brown <david.brown@hesbynett.no> wrote:
    On 01/06/2026 13:04, Dan Cross wrote:
    In article <10vjdn8$22tgu$1@dont-email.me>,
    David Brown <david.brown@hesbynett.no> wrote:
    On 31/05/2026 19:11, Bart wrote:
    [snip]
    Actual examples of too many parentheses?


    [Snipping the LISP stuff - fun, but OT and not really relevant to the
    thread branch. And I have never used the language.]

    Fair!

    [snip]
    I see code like this all the time; usually it comes from
    hardware vendors (I take it this was from a BSP or something
    similar?). I often wonder about vendor programming standards
    when I run across things like it.

    Yes, this was from a hardware vendor (who shall remain nameless to
    protect the guilty - not that I have found other vendors to be much
    better). They have a tendency to be obsessed with MISRA, with sticking
    to C90, and with filling headers with huge Doxygen templates giving no
    information and obscuring the code. (I'm fine with Doxygen comments
    that actually add useful information, but not a dozen lines repeating
    the names and types from a function signature.)

    Yes. I see all of this, and it mystifies me; I have seen how
    excessive abstraction can lead to opaque code, but many times
    hardware people go in the opposite direction, and one hardly
    ever sees useful abstraction; for example, often the same code
    sequence could be trivially extracted into a function, but it is
    instead repeated multiple times, inline.

    Indeed. There is just /so/ much that is done badly in these SDK's - I
    am not going to go into details as it would take all day. I get the
    impression that software libraries are very much an afterthought for
    most microcontroller design groups - I don't think they ever bother
    talking to developers who will use them. In fact, I don't think they
    talk much to the software folks when designing the microcontrollers either.

    I think that's exactly what happens: the uCtlr companies don't
    have robust software development organizations, and it's seen as
    a side-bag to their core business. The same is true for the
    bigger hardware vendors, as well (lookin' at you, Intel). Cue
    the famous story about Fred Brooks throwing Gene Amdahl out of
    his office until the latter came up with a hardware design for
    the IBM 360 with byte addressing and power of two widths for
    primitive data types.

    Sometimes, however, they do have abstractions - sometimes multiple
    layers of HALs ("Hardware Abstraction Layer"), drivers, interfaces, etc.
    Each layer has a completely different way of viewing things - one will
    use #define'd constants for everything, another will use a struct with
    30 fields passed as a pointer in order to turn a GPIO pin on or off, and
    the next layer will use a macro TURN_GPIO_PIN_A14_ON. When you have
    figured out which API you are expected to use, toggling a GPIO leads to
    a half-dozen nested calls (not including macros) up and down these
    stacks when all the hardware needs is a single write to a particular
    register. And if you are really lucky, a global HAL_LOCK_MUTEX is
    acquired and released along the way.
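
    For contrast, a minimal sketch of the "single write to a particular
    register" case. Everything here is hypothetical: fake_gpio_set stands in
    for a memory-mapped register whose real address would come from a
    datasheet, and gpio_set_pin is not any vendor's API.

```c
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped GPIO "set" register. On real
   hardware this would be a fixed address, not an ordinary variable. */
static volatile uint32_t fake_gpio_set;

/* Turn on one pin with a single register write. pin must be below 32. */
static inline void gpio_set_pin(volatile uint32_t *set_reg, unsigned pin)
{
    *set_reg = UINT32_C(1) << pin;
}
```

    A call such as gpio_set_pin(&fake_gpio_set, 14) is the whole operation;
    an optimising compiler can typically reduce it to one store.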

    Yes. The way one boots an AMD server SoC, for instance,
    requires shipping a bunch of binary data structures around to
    little microcontrollers spread across a bunch of AXI buses, that
    are then responsible for things like configuring PCIe links and
    enumerating IO buses and so on. The vendor code for doing this
    is opaque, at best. For example,
    https://github.com/openSIL/openSIL/blob/turin_poc/xUSL/Nbio/Brh/NbioPcieComplexDataBrh.c
    (and that's a cleaned-up version).

    [snip color mapping code]

    Yes, that would be vastly better. (I would still prefer to have
    different named types for colours in the different encoding schemes.)

    I'll see your named types and raise you a bitfield struct. The
    shifting and masking is superfluous.

    This is exactly the sort of thing that, as you point out, a
    `static inline` function is far better suited for. Some code
    bases don't want to use them for a variety of reasons, usually
    compatibility concerns with older code, compilers, or language
    standards. Some variants of Unix, for instance, worry about
    header compatibility with C90 [and in some cases K&R C] code.

    Indeed. But even if they don't want to use "inline", a static function
    is better - the compiler will do the inlining anyway (if it makes sense
    according to its heuristics).

    Assuming the compiler they're working with is known to do so,
    then I agree.

    If a compiler is not capable of inlining static functions without them
    being labelled "inline", then you are unlikely to get efficient results
    anyway. (Or the user has not enabled optimisation, and again cannot
    expect efficient results.) I don't see the point in pandering to poorly
    optimising compilers (including good compilers with optimisation
    disabled) in order to produce marginally less big and slow code. There
    was a time when a good optimising compiler was a significant investment
    and not always within the budget for a project, but such times are far
    in the past. I can understand that some developers are hamstrung by
    daft C90 restrictions, but I have little sympathy for people wanting
    good results from poor tools.

    (The exception, perhaps, is people who have to use Microchip development
    tools.)

    It's not just because the optimizer is bad or the developers are
    obtuse. Sometimes it's a deliberate decision to support
    external tooling, like a debugger or tracing program or similar.
    Some projects deliberately tolerate slower code for that.

    Moreover, on large code bases, with long life spans, upgrading a
    compiler is a significant investment. Almost invariably the
    code has UB somewhere (I work on a code base that has been
    evolving since before ANSI C; out of about 11 million lines,
    there's lots of code that can be considered "legacy" in it).
    From a business standpoint, it's not worth the time or
    engineering resources required to go find all of it and make it
    strictly conforming; from a technical standpoint, it may not
    always be possible to do so anyway (though other superset
    standards, like POSIX, are another matter), and in other cases
    the resulting obfuscation to meet much stricter demands of ISO C
    has been deemed, rightly or wrongly, as simply not worth it. It
    may not be ideal, but them's the breaks.

    - Dan C.


  • From Keith Thompson@3:633/10 to All on Tue Jun 2 05:25:53 2026
    Bart <bc@freeuk.com> writes:
    [...]
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    <https://cppreference.com/c/language/operator_precedence>
    <https://cppreference.com/cpp/language/operator_precedence>
    [...]
    Your table however also shows || had same precedence as both ?: and
    =. There, I couldn't find an example that made a difference.

    Still, I'd find that unsettling; I would rather that ?: was distinct
    from both, either with its own level, or via other language
    rules. (In my stuff it is always written with parentheses.)

    I think you're misreading the table due to its poor formatting.

    In the C++ table (second URL above), the precedence levels are
    numbered from 1 to 17, but the number in the first column is aligned
    to the *middle* of the list of operators in the second column.
    So level 15 is just "a || b", and level 16 goes from "a ? b : c" to
    "a &= b a ^= b a |= b". You can tell where the level 16 section
    starts by the "Right-to-left" associativity in the last column,
    which is aligned with the *first* item in the list. I've submitted
    a suggestion to fix it (and then saw that someone else had already
    done so), but apparently cppreference.com is being hit by vandalism,
    so it might take a while before it's corrected.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Keith Thompson@3:633/10 to All on Tue Jun 2 05:35:43 2026
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 2026-06-01 00:54, Keith Thompson wrote:
    [...]
    Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
    required to do so, and (INT_MAX + 1) * 0 still has undefined
    behavior. Undefined behavior is determined by the rules of the
    abstract machine *without* any adjustments permitted by the as-if
    rule.

    This is something I really don't get in the actual C-logic...

    Using constants that can be determined at compile time is UB here,
    despite the '* 0' mathematically indicating an IMO clear semantics,
    but using variables is only UB possibly at runtime? And despite all
    that the latter might not even get triggered because it's probably
    optimized away? - I can't help, this sounds really crude.

    Is there any rationale from the _software designer_'s perspective?

    In the abstract machine, every operator and subexpression is
    evaluated (barring things like "||", "&&", and "?:"). (INT_MAX + 1)
    has undefined behavior due to overflow, therefore any expression
    that has (INT_MAX + 1) as a subexpression has undefined behavior.

    Replacing (expr * 0) by 0 is an optimization, and optimizations
    are *optional*. A naive implementation could generate code that
    performs the addition and the multiplication by 0; if the addition
    traps, it traps.
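
    A small sketch of the "barring ?:" exception mentioned above (the name
    saturating_increment is invented for this example). Because only one of
    the second and third operands of ?: is evaluated, the overflowing
    addition is never evaluated in the abstract machine when x is INT_MAX,
    so no undefined behavior arises:

```c
#include <limits.h>

/* Returns x + 1, saturating at INT_MAX. The addition is the second operand
   of ?: and is evaluated only when x < INT_MAX, so it cannot overflow. */
static int saturating_increment(int x)
{
    return x < INT_MAX ? x + 1 : INT_MAX;
}
```

    This relies on the language rules for ?: rather than on any
    optimization, which is why it holds for every conforming implementation.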

    Note that in a context that requires a constant expression, overflow is
    a constraint violation. For example, a case label like:

    case (INT_MAX + 1) * 0:

    must be diagnosed at compile time.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Kenny McCormack@3:633/10 to All on Tue Jun 2 12:36:25 2026
    Subject: Operator precedence in other (non-C, but "C-like") languages (Was: something about a girl)

    In article <10vku5o$2glfs$2@kst.eternal-september.org>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    ...
    Digression: Perl borrows most or all of C's operators, and keeps
    the same precedences. "Operators borrowed from C keep the same
    precedence relationship with each other, even where C's precedence
    is slightly screwy." But Perl has "and" and "or" operators that
    work like "&&" and "||" but have lower precedence (that turns out
    to be convenient in some contexts).

    I vaguely recall that there's some language that uses the ?: syntax
    for the conditional operator, but with a different precedence and/or
    associativity than C. I can't remember which language it is.

    (It turns out it was PHP that you were thinking of)

    There is another language that claims to be C-like in terms of its
    operators and functions (although its overall syntax and reason for
    existence are completely not like C), but which has the quirk that || and
    && work like in C, except that they don't do "short circuit" evaluation.
    Both sides of the operator are always evaluated. Working in this language,
    I found this lack of short-circuit jarring, but when I mentioned it on the support board (for this particular language), they had no idea what I was talking about...

    And that language is: WinBatch.
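
    For reference, a minimal C sketch of the short-circuit behaviour being
    contrasted here. The helper names (touch, operands_evaluated_and,
    operands_evaluated_or) are made up for the illustration; the counter
    records how many operands actually get evaluated.

```c
static int calls;               /* number of times touch() has run */

/* Returns its argument and records that it was evaluated. */
static int touch(int value)
{
    ++calls;
    return value;
}

/* How many operands of (lhs && 1) are evaluated. */
static int operands_evaluated_and(int lhs)
{
    calls = 0;
    (void)(touch(lhs) && touch(1));
    return calls;
}

/* How many operands of (lhs || 1) are evaluated. */
static int operands_evaluated_or(int lhs)
{
    calls = 0;
    (void)(touch(lhs) || touch(1));
    return calls;
}
```

    In C, a false left operand of && or a true left operand of || means the
    right operand is skipped; a language without short-circuit evaluation
    would evaluate both operands in every case.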

    --
    Men rarely (if ever) manage to dream up a God superior to themselves.
    Most Gods have the manners and morals of a spoiled child.

  • From David Brown@3:633/10 to All on Tue Jun 2 14:37:11 2026
    On 02/06/2026 14:07, Dan Cross wrote:
    In article <10vlvie$2ne3j$2@dont-email.me>,
    David Brown <david.brown@hesbynett.no> wrote:
    On 01/06/2026 20:48, Dan Cross wrote:
    In article <10vjsg2$259m3$3@dont-email.me>,
    David Brown <david.brown@hesbynett.no> wrote:
    On 01/06/2026 13:04, Dan Cross wrote:
    In article <10vjdn8$22tgu$1@dont-email.me>,
    David Brown <david.brown@hesbynett.no> wrote:
    On 31/05/2026 19:11, Bart wrote:
    [snip]

    [snip color mapping code]

    Yes, that would be vastly better. (I would still prefer to have
    different named types for colours in the different encoding schemes.)

    I'll see your named types and raise you a bitfield struct. The
    shifting and masking is superfluous.

    Sure. "Named types" does not preclude bit-fields. I'd prefer some kind
    of struct, for type safety, but even a typedef is better than nothing.
    And when you have a struct for something like this, bit-fields are an
    obvious choice (at least for code that doesn't have to be portable to different endian systems).
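
    A rough sketch of a named struct type with bit-fields. The colour code
    under discussion was snipped, so the RGB565 layout and the names
    struct rgb565 and rgb565_make are assumptions chosen purely for
    illustration. How bit-fields are laid out in memory is
    implementation-defined, which is the portability caveat noted above.

```c
/* A 16-bit colour as its own type: it cannot be mixed up with a colour in
   another encoding, and no shifting or masking appears in user code. */
struct rgb565 {
    unsigned int blue  : 5;
    unsigned int green : 6;
    unsigned int red   : 5;
};

/* Build a colour from components; each value is reduced modulo the width
   of its field, as for any conversion to an unsigned bit-field. */
static struct rgb565 rgb565_make(unsigned r, unsigned g, unsigned b)
{
    struct rgb565 c;
    c.red = r;
    c.green = g;
    c.blue = b;
    return c;
}
```

    Reading a component is then just c.green; the compiler generates
    whatever shifts and masks the target needs.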


    This is exactly the sort of thing that, as you point out, a
    `static inline` function is far better suited for. Some code
    bases don't want to use them for a variety of reasons, usually
    compatibility concerns with older code, compilers, or language
    standards. Some variants of Unix, for instance, worry about
    header compatibility with C90 [and in some cases K&R C] code.

    Indeed. But even if they don't want to use "inline", a static function
    is better - the compiler will do the inlining anyway (if it makes sense
    according to its heuristics).

    Assuming the compiler they're working with is known to do so,
    then I agree.

    If a compiler is not capable of inlining static functions without them
    being labelled "inline", then you are unlikely to get efficient results
    anyway. (Or the user has not enabled optimisation, and again cannot
    expect efficient results.) I don't see the point in pandering to poorly
    optimising compilers (including good compilers with optimisation
    disabled) in order to produce marginally less big and slow code. There
    was a time when a good optimising compiler was a significant investment
    and not always within the budget for a project, but such times are far
    in the past. I can understand that some developers are hamstrung by
    daft C90 restrictions, but I have little sympathy for people wanting
    good results from poor tools.

    (The exception, perhaps, is people who have to use Microchip development
    tools.)

    It's not just because the optimizer is bad or the developers are
    obtuse. Sometimes it's a deliberate decision to support
    external tooling, like a debugger or tracing program or similar.
    Some projects deliberately tolerate slower code for that.

    That's fine - you are knowingly picking a different tradeoff.


    Moreover, on large code bases, with long life spans, upgrading a
    compiler is a significant investment.

    In any given project, I consider the toolchain as part of the project.
    I don't upgrade or replace it without very good reason. If I pull an
    old project out of its mothballs to make a change, the first thing I do
    is a clean rebuild and compare the binary to the one stored with the
    project, to be sure that everything builds cleanly and identically. (My record for doing this was almost exactly 20 years between code changes -
    and that code was in C90. But the compiler was happy to do inlining optimisations without the "inline" keyword.)

    Almost invariably the
    code has UB somewhere (I work on a code base that has been
    evolving since before ANSI C; out of about 11 million lines,
    there's lots of code that can be considered "legacy" in it).
    From a business standpoint, it's not worth the time or
    engineering resources required to go find all of it and make it
    strictly conforming; from a technical standpoint, it may not
    always be possible to do so anyway (though other superset
    standards, like POSIX, are another matter), and in other cases
    the resulting obfuscation to meet much stricter demands of ISO C
    has been deemed, rightly or wrongly, as simply not worth it. It
    may not be ideal, but them's the breaks.


    Fair enough.

    I am a little spoiled in that most of the code I work with, I wrote.
    But that is less true now than it used to be. In the old days, I would
    rarely need anything from the standard library and nothing from microcontroller vendors or third parties. (Before that, I would
    typically use assembly for microcontrollers, and then everything was my
    own work.) I have sometimes had to add "-fno-strict-aliasing -fwrapv"
    to deal with UB in other people's code, which always feels uncomfortable.

    And of course sometimes you get handed code written by a muppet with no
    clue what they are doing. I once had to debug code written by someone
    who did not see the point in keeping the number and type of parameters
    in sync between function definitions and function calls. The parts of
    the program that worked did so by sheer chance.



  • From Kenny McCormack@3:633/10 to All on Tue Jun 2 12:39:08 2026
    Subject: It is not futile to change the subject line (Was: this girl calls c ugly)

    In article <10vmgn2$2tjoi$1@kst.eternal-september.org>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 2026-06-02 11:07, Keith Thompson wrote:
    [...]
    (Digression: I hate the fact that such a long and sometimes
    informative thread has such a stupid subject header.)

    And what did prevent you from changing it? :-}

    Futility. At best, I could start a new subthread. The existing
    subject line would live on.

    See. That wasn't so hard, was it?

    I maintain that there are several good reasons why changing the Subject
    line is a good thing. Many other people disagree with me, but I don't care about that.

    It is, as you imply, especially a good idea where, as here, the original
    (i.e., carried) Subject line is dumb.

    --
    I shot a man on Fifth Avenue, just to see him die.

  • From Kenny McCormack@3:633/10 to All on Tue Jun 2 12:42:09 2026
    Subject: Re: It is not futile to change the subject line (Was: this girl calls c ugly)

    In article <10vmitc$o9ge$2@news.xmission.com>,
    Kenny McCormack <gazelle@shell.xmission.com> wrote:
    ...
    I maintain that there are several good reasons why changing the Subject
    line is a good thing. Many other people disagree with me, but I don't care
    about that.

    It is, as you imply, especially a good idea where, as here, the original
    (i.e., carried) Subject line is dumb.

    Oh, and please read this, which I recently composed on the subject:

    3 Reasons Why People Don't Change Subject Lines

    1) Because they don't know how (and can't be bothered to learn). Or,
    eqv, that whatever crappy newsreader they are using makes it difficult
    or impossible to do so.

    2) Because they think it is a violation of netiquette to do so. I.e.,
    they think it "breaks the thread". The theory is that doing so creates problems for people who use poor newsreaders.

    3) Because they get a perverse thrill out of keeping an old, totally inappropriate thread title, when they clearly know better.

    --
    Many people in the American South think that DJT is, and will be remembered
    as, one of the best presidents in US history. They are absolutely correct.

    He is currently at number 46 on the list. High praise, indeed!

  • From Dan Cross@3:633/10 to All on Tue Jun 2 13:05:08 2026
    In article <10vh1eo$1ei50$2@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 31/05/2026 10:49, David Brown wrote:
    On 31/05/2026 10:12, Richard Harnden wrote:
    On 31/05/2026 00:43, Keith Thompson wrote:
    C's operator precedence rules are complicated and arguably flawed.
    They could have been defined differently. A simpler set of rules,
    with fewer levels, *might* have been better. I don't have any
    concrete suggestions -- nor do I have any strong preferences.
    I accept C's rules as they are. I would accept them if they had
    been defined differently.

    Can't the compiler easily remove any parens that aren't necessary?
    So - just write complex expressions in a way that a human can most
    easily understand, it makes your intention clear and probably doesn't
    increase the size of the executable.

    Of course. Parentheses do not affect the generated code unless they
    affect the semantics of the expression. (Some people think parentheses
    affect the order of evaluation,

    They can do if they make an expression be parsed differently. Do you have
    an example where they make no difference but people might think they do?

    This is all a bit of a distraction from the original point that
    David and Richard Harnden were trying to make, which seemed
    clear enough to me, but perhaps should have been given with a
    better example. Maybe something like:

    d = a*b + c;

    Is equivalent to,

    d = (a*b) + c;

    And in this case, the parentheses are superfluous and don't
    change the order of evaluation of the expression as far as the
    language is concerned. Whether a compiler rearranges it in
    generated code in a way that is more convenient or faster or
    whatever is another matter.

    I would quibble with this idea that the compiler "removes"
    parentheses. I get the intuition, but C is not Go where the
    compiler "inserts" semi-colons for you, and has no analogous
    concept. Rather, as I think Keith said, expressions are parsed
    into some internal representation, and then transformed into
    something like an abstract syntax tree, where syntactic
    notations like parentheses are lost.

    Both expressions above correspond to an AST like:

                  ┌───────┐
                  │BinOp +│
                  └───────┘
                  /       \
                /           \
          ┌───────┐       ┌───────┐
          │BinOp *│       │Sym `c`│
          └───────┘       └───────┘
          /       \
        /           \
    ┌───────┐   ┌───────┐
    │Sym `a`│   │Sym `b`│
    └───────┘   └───────┘

    But to get to that, it may be that the compiler uses a
    different initial representation, like a parse tree that more
    closely resembles the source language grammar. Here, the
    two expressions might have different parsed representations.
    E.g., the first, simplifying heavily, may look something
    like this:

                 ┌──────┐
                 │ expr │
                 └──────┘
                 /  │   \
               /    │     \
          ┌─────┐   .    ┌─────┐
          │term │  (+)   │term │
          └─────┘   '    └─────┘
          /  │  \           │
        /    │    \         │
    ┌─────┐  .  ┌─────┐  ┌─────┐
    │ident│ (*) │ident│  │ident│
    └─────┘  '  └─────┘  └─────┘
       │           │        │
       │           │        │
      .─.         .─.      .─.
     (`a`)       (`b`)    (`c`)
      `─'         `─'      `─'

    While the second might add an extra `expr` node, as in:

                 ┌──────┐
                 │ expr │
                 └──────┘
                 /  │   \
               /    │     \
         ┌──────┐   .    ┌─────┐
         │ expr │  (+)   │term │
         └──────┘   '    └─────┘
             │              │
             │              │
          ┌─────┐        ┌─────┐
          │term │        │ident│
          └─────┘        └─────┘
          /  │  \           │
        /    │    \         │
    ┌─────┐  .  ┌─────┐    .─.
    │ident│ (*) │ident│   (`c`)
    └─────┘  '  └─────┘    `─'
       │           │
       │           │
      .─.         .─.
     (`a`)       (`b`)
      `─'         `─'

    I believe that, for most compilers that parse and
    then convert to an AST, the second is more likely to be created
    than the first. However, given that the same AST is created
    from both parse trees, this is unlikely to have an effect on the
    object code ultimately output from the compiler.
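
    To make the "parentheses are lost" point concrete, here is a minimal
    sketch of a recursive-descent parser that builds an AST directly. It is
    a toy, not how any particular compiler works: it assumes single-letter
    identifiers, only + and *, no whitespace, and well-formed input, and all
    names (struct node, parse, same_tree) are invented for the example.
    Nodes are never freed, which is acceptable only for a demonstration.

```c
#include <stdlib.h>

/* AST node: a symbol leaf, or a binary operator with two children. */
struct node {
    char op;                    /* '+' or '*'; 0 marks a symbol leaf */
    char name;                  /* symbol name when op == 0 */
    struct node *lhs, *rhs;
};

static struct node *new_node(char op, char name,
                             struct node *lhs, struct node *rhs)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->op = op;
    n->name = name;
    n->lhs = lhs;
    n->rhs = rhs;
    return n;
}

static struct node *parse_expr(const char **s);

/* factor := identifier | '(' expr ')' */
static struct node *parse_factor(const char **s)
{
    if (**s == '(') {
        ++*s;                               /* consume '(' */
        struct node *inner = parse_expr(s);
        if (**s == ')')
            ++*s;                           /* consume ')' */
        return inner;       /* no node is created for the parentheses */
    }
    return new_node(0, *(*s)++, NULL, NULL);
}

/* term := factor { '*' factor } */
static struct node *parse_term(const char **s)
{
    struct node *lhs = parse_factor(s);
    while (**s == '*') {
        ++*s;
        lhs = new_node('*', 0, lhs, parse_factor(s));
    }
    return lhs;
}

/* expr := term { '+' term } */
static struct node *parse_expr(const char **s)
{
    struct node *lhs = parse_term(s);
    while (**s == '+') {
        ++*s;
        lhs = new_node('+', 0, lhs, parse_term(s));
    }
    return lhs;
}

/* Parse a whitespace-free expression string into an AST. */
static struct node *parse(const char *text)
{
    return parse_expr(&text);
}

/* Structural equality of two trees. */
static int same_tree(const struct node *a, const struct node *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return a->op == b->op && a->name == b->name
        && same_tree(a->lhs, b->lhs) && same_tree(a->rhs, b->rhs);
}
```

    With this sketch, parse("a*b+c") and parse("(a*b)+c") produce trees that
    same_tree reports as identical, while parse("a*(b+c)") produces a
    different one, because there the parentheses change the grouping.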

    - Dan C.


  • From Bart@3:633/10 to All on Tue Jun 2 14:20:15 2026
    On 02/06/2026 13:25, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [...]
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    <https://cppreference.com/c/language/operator_precedence>
    <https://cppreference.com/cpp/language/operator_precedence>
    [...]
    Your table however also shows || had same precedence as both ?: and
    =. There, I couldn't find an example that made a difference.

    Still, I'd find that unsettling; I would rather that ?: was distinct
    from both, either with its own level, or via other language
    rules. (In my stuff it is always written with parentheses.)

    I think you're misreading the table due to its poor formatting.

    In the C++ table (second URL above), the precedence levels are
    numbered from 1 to 17, but the number in the first column is aligned
    to the *middle* of the list of operators in the second column.
    So level 15 is just "a || b", and level 16 goes from "a ? b : c" to
    "a &= b a ^= b a |= b". You can tell where the level 16 section
    starts by the "Right-to-left" associativity in the last column,
    which is aligned with the *first* item in the list. I've submitted
    a suggestion to fix it (and then saw that someone else had already
    done so), but apparently cppreference.com is being hit by vandalism,
    so it might take a while before it's corrected.


    You're right, it is a confusing layout. But it might explain why I
    couldn't find different behaviours between C/C++ in my examples
    involving || and ?:.

  • From Tim Rentsch@3:633/10 to All on Tue Jun 2 06:29:01 2026
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Note that in a context that requires a constant expression, overflow is
    a constraint violation. For example, a case label like:

    case (INT_MAX + 1) * 0:

    must be diagnosed at compile time.

    gcc disagrees with you.

  • From Bart@3:633/10 to All on Tue Jun 2 14:38:10 2026
    On 02/06/2026 14:05, Dan Cross wrote:
    In article <10vh1eo$1ei50$2@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 31/05/2026 10:49, David Brown wrote:
    On 31/05/2026 10:12, Richard Harnden wrote:
    On 31/05/2026 00:43, Keith Thompson wrote:
    C's operator precedence rules are complicated and arguably flawed.
    They could have been defined differently. A simpler set of rules,
    with fewer levels, *might* have been better. I don't have any
    concrete suggestions -- nor do I have any strong preferences.
    I accept C's rules as they are. I would accept them if they had
    been defined differently.

    Can't the compiler easily remove any parens that aren't necessary?
    So - just write complex expressions in a way that a human can most
    easily understand, it makes your intention clear and probably doesn't
    increase the size of the executable.

    Of course. Parentheses do not affect the generated code unless they
    affect the semantics of the expression. (Some people think parentheses
    affect the order of evaluation,

    They can do if they make an expression be parsed differently. Do you have
    an example where they make no difference but people might think they do?

    This is all a bit of a distraction from the original point that
    David and Richard Harnden were trying to make, which seemed
    clear enough to me, but perhaps should have been given with a
    better example. Maybe something like:

    d = a*b + c;

    Is equivalent to,

    d = (a*b) + c;

    And in this case, the parentheses are superfluous and don't
    change the order of evaluation of the expression as far as the
    language is concerned. Whether a compiler rearranges it in
    generated code in a way that is more convenient or faster or
    whatever is another matter.

    I would quibble with this idea that the compiler "removes"
    parentheses. I get the intuition, but C is not Go where the
    compiler "inserts" semi-colons for you, and has no analogous
    concept. Rather, as I think Keith said, expressions are parsed
    into some internal representation, and then transformed into
    something like an abstract syntax tree, where syntactic
    notations like parentheses are lost.

    Both expressions above correspond to an AST like:

                  ┌───────┐
                  │BinOp +│
                  └───────┘
                  /       \
                /           \
          ┌───────┐       ┌───────┐
          │BinOp *│       │Sym `c`│
          └───────┘       └───────┘
          /       \
        /           \
    ┌───────┐   ┌───────┐
    │Sym `a`│   │Sym `b`│
    └───────┘   └───────┘

    But to get to that, it may be that the compiler uses a
    different initial representation, like a parse tree that more
    closely resembles the source language grammar. Here, the
    two expressions might have different parsed representations.
    E.g., the first, simplifying heavily, may look something
    like this:

                 ┌──────┐
                 │ expr │
                 └──────┘
                 /  │   \
               /    │     \
          ┌─────┐   .    ┌─────┐
          │term │  (+)   │term │
          └─────┘   '    └─────┘
          /  │  \           │
        /    │    \         │
    ┌─────┐  .  ┌─────┐  ┌─────┐
    │ident│ (*) │ident│  │ident│
    └─────┘  '  └─────┘  └─────┘
       │           │        │
       │           │        │
      .─.         .─.      .─.
     (`a`)       (`b`)    (`c`)
      `─'         `─'      `─'

    While the second might add an extra `expr` node, as in:

                  +------+
                  | expr |
                  +------+
                  /   |   \
                /     |      \
          +------+    .     +-----+
          | expr |   (+)    |term |
          +------+    '     +-----+
              |                |
              |                |
           +-----+          +-----+
           |term |          |ident|
           +-----+          +-----+
            / | \              |
          /   |   \            |
    +-----+   .   +-----+     .-.
    |ident|  (*)  |ident|    (`c`)
    +-----+   '   +-----+     `-'
       |             |
       |             |
      .-.           .-.
     (`a`)         (`b`)
      `-'           `-'

    I believe that, for most compilers that parse and then convert
    to an AST, the second is more likely to be created than the
    first. However, given that the same AST is created
    from both parse trees, this is unlikely to have an effect on the
    object code ultimately output from the compiler.
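
    (A toy illustration of the claim above, not taken from any real
    compiler. It assumes single-letter identifiers, only `+`, `*` and
    parentheses, and does no error handling; every name in it is
    hypothetical. The AST is serialized as an S-expression, and the
    parenthesized case adds no node of its own.)

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy parser.  Each parse function writes the AST of
 * what it consumed, as an S-expression, into a buffer of AST_MAX
 * bytes supplied by the caller. */
enum { AST_MAX = 128 };

static const char *src;                 /* cursor into the input text */

static void skip_spaces(void)
{
    while (isspace((unsigned char)*src))
        src++;
}

static void parse_expr(char *out);

/* factor := identifier | '(' expr ')' */
static void parse_factor(char *out)
{
    skip_spaces();
    if (*src == '(') {
        src++;                          /* consume '(' */
        parse_expr(out);                /* parentheses add no AST node */
        skip_spaces();
        if (*src == ')')
            src++;                      /* consume ')' */
    } else if (isalpha((unsigned char)*src)) {
        out[0] = *src++;                /* single-letter identifier */
        out[1] = '\0';
    } else {
        out[0] = '\0';                  /* malformed input: empty leaf */
    }
}

/* Shared loop for one left-associative binary operator. */
static void parse_binary(char *out, char op, void (*operand)(char *))
{
    char lhs[AST_MAX], rhs[AST_MAX];

    operand(lhs);
    for (skip_spaces(); *src == op; skip_spaces()) {
        char node[AST_MAX];

        src++;                          /* consume the operator */
        operand(rhs);
        snprintf(node, sizeof node, "(%c %s %s)", op, lhs, rhs);
        strcpy(lhs, node);              /* the new node becomes the left side */
    }
    strcpy(out, lhs);
}

/* term := factor ('*' factor)*    expr := term ('+' term)* */
static void parse_term(char *out) { parse_binary(out, '*', parse_factor); }
static void parse_expr(char *out) { parse_binary(out, '+', parse_term); }

/* Parse text and store its AST in out (at least AST_MAX bytes). */
void ast_of(const char *text, char *out)
{
    src = text;
    parse_expr(out);
}
```

    Calling ast_of() on "a*b + c" and on "(a*b) + c" fills both
    buffers with the same string, "(+ (* a b) c)".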


    You're describing a 'Concrete Syntax Tree' or CST, versus AST.

    Although in that case, I expect to see a discrete node for bracketed expressions (ie. parenthesised), as those would also have a distinct production in any formal grammar.

    Personally I don't have much use for CSTs for a normal compiler, but
    they might be useful for source-to-source translators, or programs that
    do source refactoring, where you want to preserve extras such as
    parentheses even if they're not strictly needed.

    (Injecting the right parentheses for examples like `(a + b) * c' which
    would have an AST like '(* (+ a b) c)' is surprisingly tricky. Easier to
    just follow the original source!

    In any case, that still wouldn't turn ((a+b)) back into the original;
    you'd need a suitable CST.)
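
    (A rough sketch of the "injecting parentheses" problem, under the
    assumption of a tiny AST with only `+` and `*`; the node layout and
    all names are hypothetical. A child is wrapped when it binds more
    loosely than its parent, and a right-hand child of equal precedence
    is wrapped too, so that a+(b+c) keeps its grouping.)

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical AST node: op is '+' or '*', or 0 for a leaf named sym. */
struct node {
    char op;
    char sym;
    const struct node *lhs, *rhs;
};

static int prec(char op) { return op == '*' ? 2 : 1; }

/* Append one character to the NUL-terminated string in out, if it fits. */
static void put(char *out, size_t cap, char c)
{
    size_t len = strlen(out);

    if (len + 1 < cap) {
        out[len] = c;
        out[len + 1] = '\0';
    }
}

/* Emit n, adding parentheses only if its operator's precedence is
 * below min_prec, the minimum the surrounding context tolerates. */
static void emit(const struct node *n, int min_prec, char *out, size_t cap)
{
    int wrap;

    if (n->op == 0) {
        put(out, cap, n->sym);
        return;
    }
    wrap = prec(n->op) < min_prec;
    if (wrap)
        put(out, cap, '(');
    emit(n->lhs, prec(n->op), out, cap);        /* left: same level is fine */
    put(out, cap, n->op);
    emit(n->rhs, prec(n->op) + 1, out, cap);    /* right: must bind tighter */
    if (wrap)
        put(out, cap, ')');
}

/* Write the source form of root into out (capacity cap). */
void unparse(const struct node *root, char *out, size_t cap)
{
    if (cap == 0)
        return;
    out[0] = '\0';
    emit(root, 0, out, cap);
}
```

    For the AST (* (+ a b) c) this yields "(a+b)*c", and for
    (+ (* a b) c) it yields "a*b+c". It cannot reproduce ((a+b)),
    since the AST no longer records the extra pair.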

  • From David Brown@3:633/10 to All on Tue Jun 2 16:10:31 2026
    On 02/06/2026 15:29, Tim Rentsch wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Note that in a context that requires a constant expression, overflow is
    a constraint violation. For example, a case label like:

    case (INT_MAX + 1) * 0:

    must be diagnosed at compile time.

    gcc disagrees with you.

    My testing shows all versions of gcc that I tested on godbolt gave a
    warning, even without any options. I don't believe that INT_MAX can
    have any type suffixes that would avoid the overflow.

    What version of gcc and/or flags let that case label pass without a diagnostic?

    (I don't know if Keith is correct about it being a constraint violation
    - I have not looked at the details there.)




  • From Scott Lurndal@3:633/10 to All on Tue Jun 2 15:06:51 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <10vlvie$2ne3j$2@dont-email.me>,
    David Brown <david.brown@hesbynett.no> wrote:
    <snip>

    Yes. The way one boots an AMD server SoC, for instance,
    requires shipping a bunch of binary data structures around to
    little microcontrollers spread across a bunch of AXI buses, that
    are then responsible for things like configuring PCIe links and
    enumerating IO buses and so on. The vendor code for doing this
    is opaque, at best. For example,
    https://github.com/openSIL/openSIL/blob/turin_poc/xUSL/Nbio/Brh/NbioPcieComplexDataBrh.c
    (and that's a cleaned-up version).

    Indeed, even the high-speed SERDES now have small microprocessors
    that need proprietary firmware loaded at power-on.

    Most vendors prefer to keep such details proprietary, for various
    reasons both good and bad.

    I'll agree that software (firmware) development at chip vendors
    has been, in the past, an afterthought with the primary emphasis
    on the hardware side. In modern chip design, software has taken
    a larger role in hardware definition, and software quality has
    improved somewhat.


    [snip color mapping code]

    Yes, that would be vastly better. (I would still prefer to have
    different named types for colours in the different encoding schemes.)

    I'll see your named types and raise you a bitfield struct. The
    shifting and masking is superfluous.

    Or useful helpers:

    a = bit::extract(value, 12, 0); /* Extract bits 12:0 from value */

    b = bit::insert(b, 0x10, 5, 5); /* Insert 0x10 into b starting at bit 5 for 5 bits */
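
    (For readers staying in plain C, helpers of this kind might look
    roughly like the sketch below. The names bit_extract and bit_insert
    and the (lsb, width) parameter convention are assumptions made for
    illustration; they are not the bit:: helpers quoted above.)

```c
#include <stdint.h>

/* Hypothetical helpers.  lsb is the position of the field's lowest
 * bit and width its size in bits; callers must keep lsb below 64. */

/* Mask with the low `width` bits set (all 64 bits if width >= 64). */
static uint64_t low_mask(unsigned width)
{
    return width >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << width) - 1;
}

/* Return the width-bit field of value that starts at bit lsb. */
uint64_t bit_extract(uint64_t value, unsigned lsb, unsigned width)
{
    return (value >> lsb) & low_mask(width);
}

/* Return target with its width-bit field at bit lsb replaced by
 * the low bits of field. */
uint64_t bit_insert(uint64_t target, uint64_t field,
                    unsigned lsb, unsigned width)
{
    uint64_t mask = low_mask(width) << lsb;

    return (target & ~mask) | ((field << lsb) & mask);
}
```

    Under this convention bit_extract(value, 0, 13) reads bits 12:0,
    and bit_insert(b, 0x10, 5, 5) stores 0x10 in the five bits
    starting at bit 5.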

    One might also define data structures for control and status registers using bitfield structs.

    e.g. for the SATA UAHC_GBL_OOBR register:

    union UAHC_GBL_OOBR {
        uint32_t u;
        struct UAHC_GBL_OOBR_s {
    #if __BYTE_ORDER == __BIG_ENDIAN
            uint32_t we    : 1; /**< R/W/H - Write enable. */
            uint32_t cwmin : 7; /**< R/W/H - COMWAKE minimum value. Writable only if WE is set. */
            uint32_t cwmax : 8; /**< R/W/H - COMWAKE maximum value. Writable only if WE is set. */
            uint32_t cimin : 8; /**< R/W/H - COMINIT minimum value. Writable only if WE is set. */
            uint32_t cimax : 8; /**< R/W/H - COMINIT maximum value. Writable only if WE is set. */
    #else
            uint32_t cimax : 8;
            uint32_t cimin : 8;
            uint32_t cwmax : 8;
            uint32_t cwmin : 7;
            uint32_t we    : 1;
    #endif
        } s;
    };
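
    (A short usage sketch for a union of this shape. It repeats only
    one branch of the layout under a shortened, hypothetical name, and
    it relies on implementation-defined behavior: that the compiler
    accepts uint32_t bit-fields and packs these five fields into
    exactly the 32 bits of `u`. The accessor names are made up.)

```c
#include <stdint.h>

/* Hypothetical, shortened copy of the register layout (one branch only). */
union oobr {
    uint32_t u;
    struct {
        uint32_t cimax : 8;
        uint32_t cimin : 8;
        uint32_t cwmax : 8;
        uint32_t cwmin : 7;
        uint32_t we    : 1;
    } s;
};

/* Read the COMWAKE minimum out of a raw register image. */
unsigned oobr_cwmin(uint32_t raw)
{
    union oobr r;

    r.u = raw;
    return r.s.cwmin;
}

/* Return the raw image with only the COMWAKE minimum replaced;
 * the compiler generates the shifting and masking. */
uint32_t oobr_with_cwmin(uint32_t raw, unsigned cwmin)
{
    union oobr r;

    r.u = raw;
    r.s.cwmin = cwmin & 0x7Fu;
    return r.u;
}
```

    Which bits of `u` a given field occupies still depends on the
    compiler and ABI, which is why the original carries both an
    endian-specific ordering and a plain `u` member.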

  • From Scott Lurndal@3:633/10 to All on Tue Jun 2 15:10:10 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <10vh1eo$1ei50$2@dont-email.me>, Bart <bc@freeuk.com> wrote:


    Both expressions above correspond to an AST like:

    ÚÄÄÄÄÄÄÄ¿
    ³BinOp +³
    ÀÄÄÄÄÄÄÄÙ
    ? ?
    ? ?
    ÚÄÄÄÄÄÄÄ¿ ÚÄÄÄÄÄÄÄ¿
    ³BinOp *³ ³Sym `c`³
    ÀÄÄÄÄÄÄÄÙ ÀÄÄÄÄÄÄÄÙ
    ? ?
    ? ?
    ÚÄÄÄÄÄÄÄ¿ ÚÄÄÄÄÄÄÄ¿
    ³Sym `a`³ ³Sym `b`³
    ÀÄÄÄÄÄÄÄÙ ÀÄÄÄÄÄÄÄÙ

    Ah, the dangers of assuming everyone uses UTF-8.

  • From Dan Cross@3:633/10 to All on Tue Jun 2 15:19:04 2026
    In article <10vmmc2$2utb2$2@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 02/06/2026 14:05, Dan Cross wrote:
    You're describing a 'Concrete Syntax Tree' or CST, versus AST.

    Yes. "Concrete Syntax Tree" is another name for "Parse Tree".

    Although in that case, I expect to see a discrete node for bracketed
    expressions (ie. parenthesised), as those would also have a distinct
    production in any formal grammar.

    Was that not in the second parse tree diagram I presented?
    Granted, I called it "expr", but as I noted, I was simplifying
    heavily, mostly for space.

    Personally I don't have much use for CSTs for a normal compiler, but
    they might be useful for source-to-source translators, or programs that
    do source refactoring, where you want to preserve extras such as
    parentheses even if they're not strictly needed.

    I think you're missing the point, here.

    The question was whether, given some compiler, `a*b + c`
    generates different code from `(a*b) + c`, and what it means for
    the compiler to "remove the parentheses." I submit that, with
    respect to the former, the answer is "very very unlikely" and
    with respect to the latter, the question is a category error.

    (Injecting the right parentheses for examples like `(a + b) * c' which
    would have an AST like '(* (+ a b) c)' is surprisingly tricky. Easier to
    just follow the original source!

    In any case, that still wouldn't turn ((a+b)) back into the original;
    you'd need a suitable CST.)

    That's not related to what I was trying to convey.

    - Dan C.


  • From Dan Cross@3:633/10 to All on Tue Jun 2 15:31:45 2026
    In article <mnCTR.17470$_BG8.10863@fx24.iad>,
    Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <10vh1eo$1ei50$2@dont-email.me>, Bart <bc@freeuk.com> wrote:


    Both expressions above correspond to an AST like:

    ÚÄÄÄÄÄÄÄ¿
    ³BinOp +³
    ÀÄÄÄÄÄÄÄÙ
    ? ?
    ? ?
    ÚÄÄÄÄÄÄÄ¿ ÚÄÄÄÄÄÄÄ¿
    ³BinOp *³ ³Sym `c`³
    ÀÄÄÄÄÄÄÄÙ ÀÄÄÄÄÄÄÄÙ
    ? ?
    ? ?
    ÚÄÄÄÄÄÄÄ¿ ÚÄÄÄÄÄÄÄ¿
    ³Sym `a`³ ³Sym `b`³
    ÀÄÄÄÄÄÄÄÙ ÀÄÄÄÄÄÄÄÙ

    Ah, the dangers of assuming everyone uses UTF-8.

    Yeah, my bad. Here:

               +-------+
               |BinOp +|
               +-------+
                /     \
              /         \
         +-------+   +-------+
         |BinOp *|   |Sym `c`|
         +-------+   +-------+
           /   \
         /       \
    +-------+ +-------+
    |Sym `a`| |Sym `b`|
    +-------+ +-------+

    (The original looks bad in my newsreader, too.)

    - Dan C.


  • From Dan Cross@3:633/10 to All on Tue Jun 2 16:28:36 2026
    In article <86ik81cfk5.fsf_-_@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 2026-06-01 00:54, Keith Thompson wrote:

    [...]

    Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
    required to do so, and (INT_MAX + 1) * 0 still has undefined
    behavior. Undefined behavior is determined by the rules of the
    abstract machine *without* any adjustments permitted by the as-if
    rule.

    This is something I really don't get in the actual C-logic...

    Using constants that can be determined at compile time is UB here,
    despite the '* 0' mathematically indicating an IMO clear semantics,
    but using variables is only UB possibly at runtime? [...]

    There's an important distinction to make here. Consider this
    program:

    #include <limits.h>

    int
    foo(){
        int zero = (INT_MAX+1)*0;
        return zero;
    }

    int
    main(){
        return 0;
    }

    This program does not transgress the bounds of undefined behavior.

    Given that `foo` has external linkage, I find this hard to
    believe, and `clang -fsanitize=undefined` agrees with me,
    both emitting a diagnostic about the overflow and generating
    code in `foo` to call into the sanitizer machinery.

    Perhaps you mean that this is irrelevant because `foo` is not
    invoked, but I see no reason why that need be the case in e.g.
    a freestanding environment. In a hosted environment, I don't
    think anything explicitly prevents `foo` from being called after
    `main` returns (though I can't imagine that would happen in real
    life; it would be weird if it did).

    But I'm not sure what _you_ mean by "transgress the bounds of
    undefined behavior" here.

    Even more than that, the program is strictly conforming, and must be
    accepted by a conforming implementation.

    See above.

    Now let's change the program slightly:

    #include <limits.h>

    int
    foo(){
        static int zero = (INT_MAX+1)*0;
        return zero;
    }

    int
    main(){
        return 0;
    }

    This program does transgress the bounds of undefined behavior. The
    reason for the difference is that in the first program the semantics
    of foo() is to evaluate the expression to be stored in 'zero' only
    at runtime, whereas in the second program the semantics of foo() is
    to evaluate the expression to be stored in 'zero' before program
    startup (informally, "at compile time"). What matters is not
    whether the offending expression /might/ be evaluated "at compile
    time", but whether the offending expression /must/ be evaluated "at
    compile time". Only in the second case is undefined behavior
    inevitable (and thus it does not occur in the first program).
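
    (A side sketch of the timing difference only; it deliberately
    avoids the overflowing expression, and the function names are
    invented. An automatic object is initialized every time execution
    reaches its declaration, whereas a static object is initialized
    once, before program startup, which is why its initializer has to
    be a constant expression.)

```c
/* Automatic storage: the initializer is evaluated on every call. */
int bump_automatic(void)
{
    int n = 40 + 2;         /* re-evaluated each time foo-style code runs */

    return ++n;
}

/* Static storage: the initializer must be a constant expression and
 * the object receives its value once, before program startup. */
int bump_static(void)
{
    static int n = 40 + 2;  /* not re-evaluated on later calls */

    return ++n;
}
```

    Calling each function twice returns 43 and 43 from the first,
    but 43 and then 44 from the second.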

    Fine point: strictly speaking, I believe the C standard allows even
    the second program to complete translation phase 8 successfully, and
    for any offending behavior to occur only when we actually try to run
    the program. To say that another way, there is no requirement that
    possible nasal demons be made manifest at any point before an actual
    attempted execution. On the other hand, because that possibility is
    there lurking in the background, there is no requirement that the
    program be accepted, and it could be rejected by a conforming compiler.

    Indeed. Further, I believe that the same is true for the first
    program, as well.

    - Dan C.


  • From Chris M. Thomasson@3:633/10 to All on Tue Jun 2 13:59:24 2026
    On 5/31/2026 3:54 PM, Keith Thompson wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 31/05/2026 16:24, James Kuyper wrote:
    On 2026-05-31 07:18, David Brown wrote:
    [...]
    People might think they affect the order of evaluation, such as when you
    have function calls:

    u = foo(x) + (foo(y) + foo(z));

    Some people might think the use of parentheses means that "foo(y)" and
    "foo(z)" are called before "foo(x)", when the order of all these calls
    (and the additions) is unspecified. (Again, a given compiler might be
    influenced by the parentheses, but the language does not require it.)

    You're correct with regard to the function calls, but the
    parenthesized addition must be performed first, and the other one
    second, which may make a difference, for the same reasons given in my
    previous paragraph.

    The parentheses do not dictate the order of evaluation. But you are
    correct - and it's worth pointing out, so thank you for doing that -
    that for floating point operations, the grouping of operations can
    affect the result.

    The parentheses do not dictate the order of evaluation *of the
    operands*. Each "+" can be evaluated (the addition performed)
    only after the values of its operands are known. But regardless
    of parentheses or operator precedence, the three operands foo(x),
    foo(y), and foo(z) can be evaluated in any of 6 possible orders.
    (It's different when you have operations like "&&", "||", and ",",
    which impose additional sequence points.)
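
    (A small experiment along these lines, with invented names: each
    call records a tag in a log, so the order a particular compiler
    happened to choose can be inspected afterwards. The sum is fixed by
    the language; the order of the three calls is not, so no particular
    log content should be relied upon.)

```c
#include <string.h>

static char call_log[8];    /* tags of the calls, in the order made */

/* Record the tag, then return a value that depends only on the tag. */
static int foo(char tag)
{
    size_t n = strlen(call_log);

    if (n + 1 < sizeof call_log) {
        call_log[n] = tag;
        call_log[n + 1] = '\0';
    }
    return tag == 'x' ? 1 : tag == 'y' ? 2 : 4;
}

/* Evaluate foo(x) + (foo(y) + foo(z)) with a fresh log. */
int sum_and_log(void)
{
    call_log[0] = '\0';
    return foo('x') + (foo('y') + foo('z'));
}

/* The order the calls were made in during the last sum_and_log(). */
const char *last_call_order(void)
{
    return call_log;
}
```

    sum_and_log() always returns 7, while last_call_order() may be
    any of the six permutations of "xyz", parentheses or not.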

    If you are talking about floating point arithmetic (I was thinking of
    integer arithmetic, but did not specify), then the operations are not
    necessarily commutative or associative, and the compiler cannot then
    re-arrange the operations unless it knows that doing so does not
    affect the result.

    It's not just floating-point. Signed integer overflow is also relevant.

    INT_MIN + INT_MAX + 1 is well defined. (INT_MIN + INT_MAX) + 1
    is equivalent, and is also well defined. INT_MIN + (INT_MAX + 1)
    has undefined behavior.
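
    (One way to check this without ever performing an overflowing
    addition, sketched with hypothetical helper names: test each
    addition against the limits first, and let `&&` short-circuit so
    the later sum is only formed when the earlier one is known to be
    safe.)

```c
#include <limits.h>
#include <stdbool.h>

/* True if a + b would overflow int; the addition is never performed. */
bool add_would_overflow(int a, int b)
{
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}

/* Is (a + b) + c free of overflow at every step? */
bool left_grouping_ok(int a, int b, int c)
{
    return !add_would_overflow(a, b) && !add_would_overflow(a + b, c);
}

/* Is a + (b + c) free of overflow at every step? */
bool right_grouping_ok(int a, int b, int c)
{
    return !add_would_overflow(b, c) && !add_would_overflow(a, b + c);
}
```

    For (INT_MIN, INT_MAX, 1) the left grouping passes and the right
    grouping fails, matching the two cases described above.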

    But except for specific cases, the order of evaluation - both for the
    values and side-effects - of sub-expressions is unspecified. Indeed,
    they are unsequenced - the evaluations can interleave.

    Usually, both sub-expressions of a binary operator will be evaluated
    before the operator itself, simply because usually the results of the
    operator cannot be calculated until the sub-expression's values are
    known. But this is not a requirement of the language - if the
    compiler can get the same results without doing so, it is free to pick
    a different order. "(a + b) * 0" does not need to evaluate "a", "b",
    or "a + b" at all unless there is a possibility of a side-effect - and
    it can perform the side-effects in any order. "a + (b + c)" can check
    "a" for a trap representation and deal with that before looking at "b"
    and "c" or the results of "b + c", even though it cannot (for floating
    point operations) re-arrange the code to do "a + b" first.

    Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
    required to do so, and (INT_MAX + 1) * 0 still has undefined
    behavior. Undefined behavior is determined by the rules of the
    abstract machine *without* any adjustments permitted by the as-if
    rule.

    [...]


    10 + 5 - 7 + 3

    Oh my, this is an error in the programmer's logic! They forgot to do:

    10 + 5 - (7 + 3)

    ?


  • From Keith Thompson@3:633/10 to All on Tue Jun 2 15:12:41 2026
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [...]
    David Brown <david.brown@hesbynett.no> writes:
    [...]
    <https://cppreference.com/c/language/operator_precedence>
    <https://cppreference.com/cpp/language/operator_precedence>
    [...]

    Both tables are now much clearer. Someone added dividing lines
    between the precedence levels.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Keith Thompson@3:633/10 to All on Tue Jun 2 15:29:50 2026
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    Note that in a context that requires a constant expression, overflow is
    a constraint violation. For example, a case label like:

    case (INT_MAX + 1) * 0:

    must be diagnosed at compile time.

    gcc disagrees with you.

    What makes you think so?

    $ cat c.c
    #include <limits.h>
    int main(void) {
        switch (0) {
            case (INT_MAX + 1) * 0:
                break;
        }
    }
    $ gcc -std=c17 -pedantic-errors -c c.c
    c.c: In function 'main':
    c.c:4:23: warning: integer overflow in expression of type 'int' results in '-2147483648' [-Woverflow]
        4 |         case (INT_MAX + 1) * 0:
          |                       ^
    c.c:4:9: error: overflow in constant expression [-Woverflow]
        4 |         case (INT_MAX + 1) * 0:
          |         ^~~~
    $

    But taking a closer look at the standard, I'm not 100% sure that the
    language requires a diagnostic, though I think that's the intent.
    The relevant constraint is:

    Each constant expression shall evaluate to a constant that is
    in the range of representable values for its type.

    If I squint really hard, I can argue that the entire expression
    has to be a constant expression, but it doesn't say that its
    subexpressions are constant expressions -- and *if* INT_MAX +
    1 evaluates to INT_MIN in the current implementation, then
    (INT_MAX + 1) * 0 evaluates to 0 and therefore satisfies the
    constraint.

    But INT_MAX + 1 could legally trap, for example, and I don't believe
    it was intended that a given expression can be a constant expression
    or not depending on the vagaries of the behavior of an instance
    of UB.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Tim Rentsch@3:633/10 to All on Thu Jun 4 02:34:59 2026
    Bart <bc@freeuk.com> writes:

    On 01/06/2026 03:10, Tim Rentsch wrote:

    Bart <bc@freeuk.com> writes:

    On 31/05/2026 17:04, Tim Rentsch wrote:

    Richard Harnden <richard.nospam@gmail.invalid> writes:

    just write complex expressions in a way that a human can most
    easily understand,

    Unfortunately, (1) different people have different ideas of what
    writing is most easily understood, and (2) different readers have
    different notions of which writings are easily understood, and
    which writings are not so easily understood. To make things
    worse "easily understood" is not a boolean condition, nor is it
    necessarily well-ordered -- "most easily understood" isn't always
    a well-defined quality, even for a given audience.

    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

    Actual examples of too many parentheses?

    The point of my comment is that either too many or too few is a
    subjective judgment, not an objective one.

    My point was that it could be objective, at least for too many. So
    (a*a) + (b*b) would be commonly agreed to have too many, [...]

    Apparently you misunderstand what is meant by the word objective.
    An objective statement is one that is independent of personal
    assessment, even collective personal assessment. Reaching consensus
    on a question doesn't make the common view an objective one -- just
    a commonly held one. Saying the Sun rises in the East is an
    objective statement. Saying the temperature is too hot in the month
    of September is not an objective statement, even if most people
    think so.

    And then there is ?: :

    a > b ? c : d # (a>b)?c:d
    a + b ? c : d # (a+b)?c:d

    The grouping of the first is probably what is intended. But in the
    second, the intent might have been (a+b)?c:d, or a+(b?c:d); we don't
    know for sure that the author didn't make a mistake or we don't know
    ourselves.
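
    (The two readings can be compared directly. In this sketch, whose
    function names are invented, the unparenthesized form behaves like
    the first explicit grouping, because `?:` has lower precedence than
    `+` in C.)

```c
/* As written: parsed by C as (a + b) ? c : d. */
int ungrouped(int a, int b, int c, int d)     { return a + b ? c : d; }

/* The grouping the language actually applies. */
int cond_of_sum(int a, int b, int c, int d)   { return (a + b) ? c : d; }

/* The other reading a human might have intended. */
int sum_with_cond(int a, int b, int c, int d) { return a + (b ? c : d); }
```

    With a=1, b=-1, c=10, d=20 the first two return 20 (a+b is
    zero), while the third returns 11.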

    This example is so addlebrained that it's hard to imagine anyone
    being confused about it. Or that it's worth any expenditure of
    thought wondering what to do about people who are.

    I don't understand what the problem is with my examples.

    Here is a story from the earliest weeks of all of the time I have
    been programming. In one of the first few programs I ever wrote
    (and perhaps even the very first one), I had a statement like so:

    x = alpha/beta*gamma

    Of course the names here are made up, I don't remember the actual
    names used. When x was printed out, it gave a value that was
    much different from what I expected. What had happened was I had
    unconsciously assumed, reasoning by analogy with written
    mathematics, that the statement would be interpreted as

               alpha
    x = ------------
             beta*gamma

    After getting the program output back, and seeing the unexpected
    result, someone explained to me that the statement was interpreted
    as

    x = (alpha/beta)*gamma

    because that was how the language worked. Of course I was surprised
    but I learned the rule and after that had no further problems with
    how to read such expressions.
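
    (The same rule holds in C, where `/` and `*` share a precedence
    level and group left to right. A minimal sketch, with function
    names invented for the comparison:)

```c
/* As written: '/' and '*' have equal precedence, grouped left to right. */
double as_written(double alpha, double beta, double gamma)
{
    return alpha/beta*gamma;
}

/* The grouping the language applies to the line above. */
double as_parsed(double alpha, double beta, double gamma)
{
    return (alpha/beta)*gamma;
}

/* The reading suggested by written mathematics. */
double as_in_math(double alpha, double beta, double gamma)
{
    return alpha/(beta*gamma);
}
```

    With alpha=12, beta=3, gamma=2 the first two give 8 and the
    third gives 2.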

    There can be ambiguity in the mind of the person looking at such
    code as to how the first terms are grouped.

    This statement illustrates the problem with examples that you give.
    Not only is the presumed reader sort of arbitrarily naive, he or she
    is apparently incapable of learning. Everyone who has ever learned
    to program has had an experience of a program doing something other
    than what was expected, because of a misunderstanding about how the
    language works. When that happens, most people simply learn about
    their misunderstanding and correct it. The readers in your examples
    are like people who started programming after developing Alzheimer's
    disease (and no offense meant to anyone afflicted with Alzheimer's).
    Maybe there are such people, whether or not caused by a medical
    condition, but it doesn't match most programmers' experience, and in
    any case is not worth worrying about. If someone can't understand
    the rules of the road they shouldn't be behind the wheel of a car.
    If someone really can't learn the rules of expression syntax for the
    language they are using, they should be advised to try a different
    language, or perhaps give up programming altogether. It's silly to
    worry about something that 999 people out of a 1000 (and the actual
    numbers are undoubtedly much higher) are able to navigate without
    difficulty. Yet the examples you give insist on focusing on the few
    hopeless individuals. It shouldn't be a surprise that most people
    don't share your concerns.

  • From Tim Rentsch@3:633/10 to All on Thu Jun 4 03:37:24 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:

    In article <86ik81cfk5.fsf_-_@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 2026-06-01 00:54, Keith Thompson wrote:

    [...]

    Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
    required to do so, and (INT_MAX + 1) * 0 still has undefined
    behavior. Undefined behavior is determined by the rules of the
    abstract machine *without* any adjustments permitted by the as-if
    rule.

    This is something I really don't get in the actual C-logic...

    Using constants that can be determined at compile time is UB here,
    despite the '* 0' mathematically indicating an IMO clear semantics,
    but using variables is only UB possibly at runtime? [...]

    There's an important distinction to make here. Consider this
    program:

    #include <limits.h>

    int
    foo(){
        int zero = (INT_MAX+1)*0;
        return zero;
    }

    int
    main(){
        return 0;
    }

    This program does not transgress the bounds of undefined behavior.

    To clarify, the comments in my posting were meant to be read as
    saying the given text is the entire program, and that it is strictly
    conforming with respect to conforming hosted implementations.
    (Incidentally, given the rules for freestanding implementations, I'm
    not sure that it is even possible for any program to be strictly
    conforming with respect to conforming freestanding implementations.
    In any case my statements were meant only in the context of hosted implementations.)

    Given that `foo` has external linkage, I find this hard to
    believe, and `clang -fsanitize=undefined` agrees with me,
    both emitting a diagnostic about the overflow and generating
    code in `foo` to call into the sanitizer machinery.

    A conforming implementation is free to emit a diagnostic whenever it
    chooses, for any reason at all, regardless of whether the program
    source is legal C or not. (I feel obliged to point out that, if a preprocessing #error directive is encountered, then there may be an
    exception to that statement; however, there is no such #error in
    the program shown above.)

    Perhaps you mean that this is irrelevant because `foo` is not
    invoked, but I see no reason why that need be the case in e.g.
    a freestanding environment.

    I explained the context of my previous statements above. Sorry for
    not saying that in the original message.

    In a hosted environment, I don't
    think anything explicitly prevents `foo` from being called after
    `main` returns (though I can't imagine that would happen in real
    life; it would be weird if it did).

    The semantics described in the ISO C standard don't admit that
    possibility. Whether foo() has external linkage or internal
    linkage doesn't change that. Only those actions initiated by
    statements in main() are ever elaborated.

    But I'm not sure what _you_ mean by "transgress the bounds of
    undefined behavior" here.

    It's a grammatical fine point. I think for present purposes it's
    okay to gloss over the distinction, and say this statement may be
    read as saying "the program does not have undefined behavior".

    Even more than that, the program is strictly conforming, and must be
    accepted by a conforming implementation.

    See above.

    Now let's change the program slightly:

    #include <limits.h>

    int
    foo(){
        static int zero = (INT_MAX+1)*0;
        return zero;
    }

    int
    main(){
        return 0;
    }

    This program does transgress the bounds of undefined behavior. The
    reason for the difference is that in the first program the semantics
    of foo() is to evaluate the expression to be stored in 'zero' only
    at runtime, whereas in the second program the semantics of foo() is
    to evaluate the expression to be stored in 'zero' before program
    startup (informally, "at compile time"). What matters is not
    whether the offending expression /might/ be evaluated "at compile
    time", but whether the offending expression /must/ be evaluated "at
    compile time". Only in the second case is undefined behavior
    inevitable (and thus it does not occur in the first program).

    Fine point: strictly speaking, I believe the C standard allows even
    the second program to complete translation phase 8 successfully, and
    for any offending behavior to occur only when we actually try to run
    the program. To say that another way, there is no requirement that
    possible nasal demons be made manifest at any point before an actual
    attempted execution. On the other hand, because that possibility is
    there lurking in the background, there is no requirement that the
    program be accepted, and it could be rejected by a conforming compiler.

    Indeed. Further, I believe that the same is true for the first
    program, as well.

    It isn't. In the first program the offending expression is never
    evaluated, because foo() is never called.

  • From Tim Rentsch@3:633/10 to All on Thu Jun 4 03:58:35 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:

    [discussing how to produce a 32-bit color value from rgb16]

    Here's my offering:

    // Converts a 16-bit RGB16 (5-6-5) value to an ARGB32
    // ("RGBA8888") value.
    static inline uint32_t
    rgb16_to_argb(uint16_t color)
    {
        const uint32_t blue5 = (color >> 0) & 0x1F;
        const uint32_t green6 = (color >> 5) & 0x3F;
        const uint32_t red5 = (color >> 11) & 0x1F;

        // Map from a 5 or 6 bit space into an 8 bit space. A
        // 5-bit number has 32 possibilities; a 6-bit number
        // has 64. To calculate the projected 8-bit value for
        // a k-bit number v, we can use the formula
        // v_8 = (v*(2^8-1) + (2^k-1)/2)/(2^k-1), that is,
        // (v*255 + 15)/31 (for k=5) or (v*255 + 31)/63 (for
        // k=6).
        //
        // To remove the division and turn it into a
        // shift, the constants below were empirically
        // discovered to generate good results. See
        // https://stackoverflow.com/questions/2442576/
        // how-does-one-convert-16-bit-rgb565-to-24-bit-rgb888
        // for details.
        const uint32_t blue = (blue5 * 527 + 23) >> 6;
        const uint32_t green = (green6 * 259 + 33) >> 6;
        const uint32_t red = (red5 * 527 + 23) >> 6;
        const uint32_t alpha = 0xFF000000;

        return blue | (green << 8) | (red << 16) | alpha;
    }
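
    (The "empirically discovered" constants are easy to check
    exhaustively, since there are only 32 + 64 inputs. The sketch below
    compares the shift forms with the division forms given in the
    comment; the function names are invented for the comparison.)

```c
#include <stdint.h>

/* Shift-based scaling, as used in the function above. */
uint32_t scale5_shift(uint32_t v) { return (v * 527 + 23) >> 6; }
uint32_t scale6_shift(uint32_t v) { return (v * 259 + 33) >> 6; }

/* Division-based reference, as given in the comment above. */
uint32_t scale5_div(uint32_t v) { return (v * 255 + 15) / 31; }
uint32_t scale6_div(uint32_t v) { return (v * 255 + 31) / 63; }

/* Count the inputs for which the two forms disagree. */
unsigned mismatches(void)
{
    unsigned bad = 0;
    uint32_t v;

    for (v = 0; v < 32; v++)            /* every 5-bit value */
        bad += scale5_shift(v) != scale5_div(v);
    for (v = 0; v < 64; v++)            /* every 6-bit value */
        bad += scale6_shift(v) != scale6_div(v);
    return bad;
}
```

    mismatches() returns 0, and both forms map the maximum input
    (31 or 63) to 255.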

    It's longer, yes, but I'd argue it's much easier to understand.
    On my compiler, it generates almost identical code, except that
    some instructions are in a different order.

    I would choose a different approach, for two reasons. One is that,
    for code that is likely to be in a header file, my preference is
    that it be compilable under C90 rules if possible. The other is
    that, given the simple nature of the transformation, it should be
    able to produce a constant expression if given a constant input
    value. Here is a possible implementation:

    #define SOLID_RGB24_of_RGB16( rgb16 )                       \
        ARGB32_( 255ul,                                         \
            SCALE_5_to_8_( BITS_AT_OF_( 5, 11, (rgb16) ) ),     \
            SCALE_6_to_8_( BITS_AT_OF_( 6,  5, (rgb16) ) ),     \
            SCALE_5_to_8_( BITS_AT_OF_( 5,  0, (rgb16) ) )      \
        )

    #define ARGB32_( alpha, red, green, blue ) (                \
        alpha << 24 | red << 16 | green << 8 | blue             \
    )

    #define SCALE_5_to_8_( u )  ( u *527ul +23 >>6 )
    #define SCALE_6_to_8_( u )  ( u *259ul +33 >>6 )

    #define BITS_AT_OF_(width,where,u)  ( u >> where & (1ul << width)-1 )
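
    (To make the constant-expression point concrete: the sketch below
    repeats the macros so that it compiles on its own, then uses them
    in the initializer of a static array, a context that requires
    constant expressions and would reject a function call. The array
    and accessor names are invented, and the RGB565 inputs are chosen
    only as examples.)

```c
/* Macros repeated from the post so this block stands alone. */
#define SOLID_RGB24_of_RGB16( rgb16 )                       \
    ARGB32_( 255ul,                                         \
        SCALE_5_to_8_( BITS_AT_OF_( 5, 11, (rgb16) ) ),     \
        SCALE_6_to_8_( BITS_AT_OF_( 6,  5, (rgb16) ) ),     \
        SCALE_5_to_8_( BITS_AT_OF_( 5,  0, (rgb16) ) )      \
    )
#define ARGB32_( alpha, red, green, blue ) (                \
    alpha << 24 | red << 16 | green << 8 | blue             \
)
#define SCALE_5_to_8_( u )  ( u *527ul +23 >>6 )
#define SCALE_6_to_8_( u )  ( u *259ul +33 >>6 )
#define BITS_AT_OF_(width,where,u)  ( u >> where & (1ul << width)-1 )

/* Static storage duration: every initializer here must be a
 * constant expression, which the macro expansion is. */
static const unsigned long palette[] = {
    SOLID_RGB24_of_RGB16( 0xF800u ),    /* full red in RGB565   */
    SOLID_RGB24_of_RGB16( 0x07E0u ),    /* full green in RGB565 */
    SOLID_RGB24_of_RGB16( 0x001Fu ),    /* full blue in RGB565  */
};

/* Hypothetical accessor; the index wraps around the three entries. */
unsigned long palette_entry(unsigned i)
{
    return palette[i % 3];
}
```

    The three entries come out as 0xFFFF0000, 0xFF00FF00 and
    0xFF0000FF. With warnings enabled, some compilers will suggest
    extra parentheses inside these macros.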


    And here is a simple test driver:

    const unsigned long some_red = SOLID_RGB24_of_RGB16( 29u << 11 );
    const unsigned long some_green = SOLID_RGB24_of_RGB16( 59u << 5 );
    const unsigned long some_blue = SOLID_RGB24_of_RGB16( 29u << 0 );

    #include <stdio.h>

    int
    main(){
        printf( " red: %#8lx\n", some_red );
        printf( " green: %#8lx\n", some_green );
        printf( " blue: %#8lx\n", some_blue );
    }
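
    The constant-expression property mentioned above can also be exercised
    at compile time without leaving C90. The following is a sketch added for
    illustration: the macros are copied from the post, while
    COMPILE_TIME_CHECK_, solid_rgb24_of_rgb16 and check_test_driver_values
    are hypothetical names that do not appear in the thread.

```c
#include <assert.h>

#define SOLID_RGB24_of_RGB16( rgb16 ) \
    ARGB32_( 255ul, \
        SCALE_5_to_8_( BITS_AT_OF_( 5, 11, (rgb16) ) ), \
        SCALE_6_to_8_( BITS_AT_OF_( 6, 5, (rgb16) ) ), \
        SCALE_5_to_8_( BITS_AT_OF_( 5, 0, (rgb16) ) ) \
    )

#define ARGB32_( alpha, red, green, blue ) ( \
    alpha << 24 | red << 16 | green << 8 | blue \
)

#define SCALE_5_to_8_( u ) ( u *527ul +23 >>6 )
#define SCALE_6_to_8_( u ) ( u *259ul +33 >>6 )

#define BITS_AT_OF_(width,where,u) ( u >> where & (1ul << width)-1 )

/* Hypothetical C90-style compile-time check: the array size becomes -1,
   and compilation fails, when cond is false. This only compiles at all
   because the macro result is an integer constant expression. */
#define COMPILE_TIME_CHECK_( name, cond ) typedef char name[ (cond) ? 1 : -1 ]

COMPILE_TIME_CHECK_( white_maps_to_white,
    SOLID_RGB24_of_RGB16( 0xFFFFu ) == 0xFFFFFFFFul );
COMPILE_TIME_CHECK_( black_maps_to_opaque_black,
    SOLID_RGB24_of_RGB16( 0x0000u ) == 0xFF000000ul );

/* The same macro applied to a run-time value. */
unsigned long solid_rgb24_of_rgb16( unsigned rgb16 )
{
    return SOLID_RGB24_of_RGB16( rgb16 );
}

/* Spot checks for the three inputs used by the test driver. */
void check_test_driver_values( void )
{
    assert( solid_rgb24_of_rgb16( 29u << 11 ) == 0xFFEF0000ul );
    assert( solid_rgb24_of_rgb16( 59u << 5 ) == 0xFF00EF00ul );
    assert( solid_rgb24_of_rgb16( 29u << 0 ) == 0xFF0000EFul );
}
```

    The typedef trick is one common C90 idiom; in C11 and later,
    _Static_assert would do the same job more directly.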


  • From Bart@3:633/10 to All on Thu Jun 4 12:40:45 2026
    On 04/06/2026 10:34, Tim Rentsch wrote:
    Bart <bc@freeuk.com> writes:

    My point was that it could be objective, at least for too many. So
    (a*a) + (b*b) would be commonly agreed to have too many, [...]

    Apparently you misunderstand what is meant by the word objective.
    An objective statement is one that is independent of personal
    assessment, even collective personal assessment.

    I don't know of any infix PL syntax where 'a*a + b*b', as a standalone expression, doesn't mean '(a*a) + (b*b)'.

    Google agrees with me (in that 2*2+3*3 shows 13), and so does my Casio calculator.

    It's not my personal opinion!

    I'm sure you can trawl for some obscure languages where that expression
    works differently, or where you can reassign priority or meaning to
    those operators, but that is just being contrary for the sake of it.


    Reaching consensus
    on a question doesn't make the common view an objective one -- just
    a commonly held one.

    So, the number of times in this group where I've been told that everyone
    else disagrees with me about something so I must be wrong - this was
    just your (pl) subjective opinion all along?

    In the PL world then it is going to be mainly about subjective opinions!
    There are few absolute truths.

    But what about this example:

    ((((((a))))))

    'Too many parentheses' is still subjective?

    How about '((((a)))) using more parentheses than (a)'; that surely must
    be objective?

    Here is a story from the earliest weeks of all of the time I have
    been programming. In one of the first few programs I ever wrote
    (and perhaps even the very first one), I had a statement like so:

    x = alpha/beta*gamma

    Of course the names here are made up, I don't remember the actual
    names used. When x was printed out, it gave a value that was
    much different from what I expected. What had happened was I had unconsciously assumed, reasoning by analogy with written
    mathematics, that the statement would be interpreted as

    alpha
    x = ------------
    beta*gamma

    You will have quickly found out that PL syntax is not mathematics. For a start, mathematics doesn't normally use '*', nor '/' for that matter.

    Yes, there is a discrepancy with the precedences of divide and (implied) multiply. However, the a*a + b*b example didn't use divide.

    (Note that C has its own problems in this area:

    a = b/*p; // divide b by dereferenced pointer p

    Here, /* also happens to start a block comment.)



    If someone really can't learn the rules of expression syntax for the
    language they are using, they should be advised to try a different
    language, or perhaps give up programming altogether.

    It can be multiple languages, and they might want to write the same
    expression the same way in each.

    It could be no language: maybe it's pseudo-code, or some unspecified
    language in a forum which is not language-specific. They want anybody to
    just understand it.

    This is the scenario I mentioned where you can risk not using
    precedences when expressions involve "+ - * /", comparisons, and AND/OR
    since generally these are treated sensibly by infix languages (even in
    C, almost).

    But operators such as '<< >> & ^ |' are treated more diversely. Here you
    would be taking a bigger risk. You could label such code as 'C Syntax'
    (if posting for example) but that is just being lazy.

    It's silly to
    worry about something that 999 people out of a 1000 (and the actual
    numbers are undoubtedly much higher) are able to navigate without
    difficulty. Yet the examples you give insist on focusing on the few
    hopeless individuals.

    Are you saying that whoever wrote code like this:

    crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];

    is needlessly worrying about the 99.9+% of the readership who you claim
    will know C syntax rules precisely? That is, they would find this
    version just as clear without any extra cognitive effort:

    crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];

    ?

    If so then you are hopelessly wrong.
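
    For reference, the two statements mean the same thing to a C compiler,
    because >> binds tighter than &, which in turn binds tighter than ^. The
    following sketch, added for illustration, only checks that equivalence;
    it says nothing about which spelling is easier to read. The table
    contents are arbitrary placeholder values, not a real CRC table, and the
    function names are hypothetical.

```c
#include <assert.h>

/* Placeholder 16-entry table; the values are arbitrary. */
static const unsigned long s_crc32[16] = {
    0x00000000ul, 0x11111111ul, 0x22222222ul, 0x33333333ul,
    0x44444444ul, 0x55555555ul, 0x66666666ul, 0x77777777ul,
    0x88888888ul, 0x99999999ul, 0xAAAAAAAAul, 0xBBBBBBBBul,
    0xCCCCCCCCul, 0xDDDDDDDDul, 0xEEEEEEEEul, 0xFFFFFFFFul
};

/* Spelling with explicit parentheses. */
unsigned long step_parenthesized(unsigned long crcu32, unsigned long b)
{
    return (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];
}

/* Spelling that relies on operator precedence alone. */
unsigned long step_bare(unsigned long crcu32, unsigned long b)
{
    return crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];
}

/* Compare both spellings over a range of inputs. */
void check_steps_agree(void)
{
    unsigned long crc, b;

    for (crc = 0; crc < 4096ul; crc++)
        for (b = 0; b < 256ul; b++)
            assert(step_parenthesized(crc, b) == step_bare(crc, b));
}
```

    Compilers with parentheses warnings enabled will typically flag
    step_bare, which is itself a data point in the readability argument.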



  • From David Brown@3:633/10 to All on Thu Jun 4 14:35:25 2026
    On 04/06/2026 13:40, Bart wrote:
    On 04/06/2026 10:34, Tim Rentsch wrote:
    Bart <bc@freeuk.com> writes:

    My point was that it could be objective, at least for too many. So
    (a*a) + (b*b) would be commonly agreed to have too many, [...]

    Apparently you misunderstand what is meant by the word objective.
    An objective statement is one that is independent of personal
    assessment, even collective personal assessment.

    I don't know of any infix PL syntax where 'a*a + b*b', as a standalone expression, doesn't mean '(a*a) + (b*b)'.

    Google agrees with me (in that 2*2+3*3 shows 13), and so does my Casio calculator.

    It's not my personal opinion!

    You are - again - moving the goalposts.

    It is an objective fact that "a * a + b * b" means "(a * a) + (b * b)"
    in normal mathematics (at least in the countries I am familiar with),
    and also in most mainstream programming languages.

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more
    parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a purely subjective opinion. Even if it is true that this is "commonly agreed
    to" (and AFAIK you have no basis for that claim), that would still be a subjective opinion - no matter how common that opinion is.

    Does that clear up your misunderstanding about "objective" and
    "subjective" ?


    Reaching consensus
    on a question doesn't make the common view an objective one -- just
    a commonly held one.

    So, the number of times in this group where I've been told that everyone else disagrees with me about something so I must be wrong - this was
    just your (pl) subjective opinion all along?

    Facts and opinions are different. You regularly get facts about C
    wrong, and you are told you are wrong - that is objective. You
    regularly give opinions that people disagree with, and are told they
    disagree - that is subjective.

    If you wrote, for example, that "a << b + c" is ambiguous in C, then you
    would be factually and objectively wrong. If you wrote that it is
    unclear, then you would be expressing a subjective opinion, and people
    may or may not agree with you.

    Sometimes you might voice an opinion that is so extreme or uncommon that people might tell you you are wrong, when saying they disagree would be
    more appropriate - discussions here are not formal.


    In the PL world then it is going to be mainly about subjective opinions! There are few absolute truths.

    The programming language world is full of absolute truths. The C
    standards, for example, are full of facts about the C language. It is
    not just a collection of guidelines or ideas for people to like or dislike.


    But what about this example:

       ((((((a))))))

    'Too many parentheses' is still subjective?

    Yes, obviously. "More parentheses than necessary" is objective, "too
    many parentheses" is subjective. I expect most people will share the
    same opinion, but it is still an opinion.


    How about '((((a)))) using more parentheses than (a)'; that surely must
    be objective?

    Yes.


    Here is a story from the earliest weeks of all of the time I have
    been programming. In one of the first few programs I ever wrote
    (and perhaps even the very first one), I had a statement like so:

         x = alpha/beta*gamma

    Of course the names here are made up, I don't remember the actual
    names used. When x was printed out, it gave a value that was
    much different from what I expected. What had happened was I had
    unconsciously assumed, reasoning by analogy with written
    mathematics, that the statement would be interpreted as

                alpha
         x = ------------
              beta*gamma

    You will have quickly found out that PL syntax is not mathematics. For a start, mathematics doesn't normally use '*', nor '/' for that matter.

    It's not so much the symbols, as the layout. A mathematician would not
    write "a÷b×c" either. They would write it in a way that makes the
    intended precedence obvious to other mathematicians reading it, taking
    into account the exact symbols used ("a.b" or "ab" might be considered
    to bind tighter than "a×b"), the spacing, the position of the symbols on
    the page, and - importantly - the context.

    Programming can definitely be viewed as a sort of mathematics, but
    writing code is not the same as writing mathematics.


    Yes, there is a discrepancy with the precedences of divide and (implied) multiply. However, the a*a + b*b example didn't use divide.

    (Note that C has its own problems in this area:

       a = b/*p;      // divide b by dereferenced pointer p

    Here, /* also happens to start a block comment.)


    Here you are objectively wrong. C does not have a "problem" with this.
    The parsing rules of the language are clear - often called "maximum
    munch". The character sequence "/*" is the start of a comment, it is
    not two separate operators.
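
    The rule is easy to see in a small test. The following is a sketch
    added for illustration, with made-up function names: the first function
    has a space between the two characters and divides; the second has none,
    so the rest of the line up to the comment terminator is discarded.

```c
#include <assert.h>

/* With a space, '/' and '*' are two tokens: divide b by the value
   that p points to. */
int divide_by_pointee(int b, const int *p)
{
    return b / *p;
}

/* Without the space, the two characters open a comment that runs to
   the next comment terminator, so the division disappears and the
   statement is just "return b;". */
int division_lost_to_comment(int b, const int *p)
{
    (void)p;
    return b/*p; this text is inside the comment */;
}

void check_comment_rule(void)
{
    int d = 4;

    assert(divide_by_pointee(12, &d) == 3);
    assert(division_lost_to_comment(12, &d) == 12);
}
```

    In the second function the statement still compiles only because a
    terminator and a semicolon follow on the same line; without them the
    comment would swallow whatever code came next.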

    You might personally have a problem with this. Whether you do or do not
    is also an objective fact, but one that only you can judge. And you can
    have a subjective opinion as to whether or not you like the rules of C here.



    If someone really can't learn the rules of expression syntax for the
    language they are using, they should be advised to try a different
    language, or perhaps give up programming altogether.

    It can be multiple languages, and they might want to write the same expression the same way in each.


    Sure.

    I also don't think people should be required to learn all the details of
    a language in order to use it. Indeed, for bigger languages (say, C++
    or Python) it would be infeasible to learn everything. Exactly where
    you draw the lines of what you need to know and what you can look up if necessary will vary by person, and by the type of tasks they are doing
    in a language.


    It could be no language: maybe it's pseudo-code, or some unspecified
    language in a forum which is not language-specific. They want anybody to just understand it.

    This is the scenario I mentioned where you can risk not using
    precedences when expressions involve "+ - * /", comparisons, and AND/OR since generally these are treated sensibly by infix languages (even in
    C, almost).

    But operators such as '<< >> & ^ |' are treated more diversely. Here you would be taking a bigger risk. You could label such code as 'C
    Syntax' (if posting for example) but that is just being lazy.


    It is correct that details here vary more. Whether you think extra parentheses should or should not be used is, however, a subjective
    opinion. (My opinion is probably more in line with yours than Tim's
    here - but it is still subjective.)

    It's silly to
    worry about something that 999 people out of a 1000 (and the actual
    numbers are undoubtedly much higher) are able to navigate without
    difficulty. Yet the examples you give insist on focusing on the few
    hopeless individuals.

    Are you saying that whoever wrote code like this:

         crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];

    is needlessly worrying about the 99.9+% of the readership who you claim
    will know C syntax rules precisely? That is, they would find this
    version just as clear without any extra cognitive effort:

         crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];

    ?

    Tim did not write that. That example was not on the list of examples
    you gave recently. The examples a couple of posts up in this branch
    were a lot simpler. (That does not mean that Tim's "999 out of 1000"
    figures are based on evidence.)


    If so then you are hopelessly wrong.




  • From Bart@3:633/10 to All on Thu Jun 4 14:18:05 2026
    On 04/06/2026 13:35, David Brown wrote:
    On 04/06/2026 13:40, Bart wrote:
    On 04/06/2026 10:34, Tim Rentsch wrote:
    Bart <bc@freeuk.com> writes:

    My point was that it could be objective, at least for too many. So
    (a*a) + (b*b) would be commonly agreed to have too many, [...]

    Apparently you misunderstand what is meant by the word objective.
    An objective statement is one that is independent of personal
    assessment, even collective personal assessment.

    I don't know of any infix PL syntax where 'a*a + b*b', as a standalone
    expression, doesn't mean '(a*a) + (b*b)'.

    Google agrees with me (in that 2*2+3*3 shows 13), and so does my Casio
    calculator.

    It's not my personal opinion!

    You are - again - moving the goalposts.

    It is an objective fact that "a * a + b * b" means "(a * a) + (b * b)"
    in normal mathematics (at least in the countries I am familiar with),
    and also in most mainstream programming languages.

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a purely subjective opinion.

    So, you're arguing 'more than needed' is a completely different thing
    from 'too many'.

    Sigh...


    If you wrote, for example, that "a << b + c" is ambiguous in C, then you

    It is technically unambiguous in C. It can be ambiguous in the mind of somebody who would have to double-check the precedence levels, or where
    the C context is missing.

    The discussion seems to be about what exactly is 'too many'.

    Apparently you can construct a valid C source file where 99.9% of the
    text consists of () characters, but if someone - or even a million
    people - say that it is too many, then that is just their subjective
    opinion.

    I don't have the patience for such nonsense any more:

    * The () in '(a * b) + c' are generally unnecessary

    * The () in 'a << (b + c)' are advisable

    * The () in '(a << b) + c' are necessary if the intent is to have
    what might be the more intuitive meaning.

    If this is not 100% C-specific, then () are needed for both of the last two examples, but not the first.

    You all know this.


    (Note that C has its own problems in this area:

        a = b/*p;      // divide b by dereferenced pointer p

    Here, /* also happens to start a block comment.)


    Here you are objectively wrong. C does not have a "problem" with this.
    The parsing rules of the language are clear - often called "maximum
    munch". The character sequence "/*" is the start of a comment, it is
    not two separate operators.

    This is where it falls down. It's very clearly a 'gotcha', and
    consequence of poorly thought-out design.

    That the behaviour is deterministic doesn't change that.

    It's silly to
    worry about something that 999 people out of a 1000 (and the actual
    numbers are undoubtedly much higher) are able to navigate without
    difficulty. Yet the examples you give insist on focusing on the few
    hopeless individuals.

    Are you saying that whoever wrote code like this:

          crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];

    is needlessly worrying about the 99.9+% of the readership who you
    claim will know C syntax rules precisely? That is, they would find
    this version just as clear without any extra cognitive effort:

          crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];

    ?

    Tim did not write that.

    What was the 'something' in "It's silly to worry about something that ..."?

    I assume it's people being unable to understand that second example.

    Yet I see parentheses being used in such cases a LOT more than 0.1% of
    the time. 50% or more would be my guess.


    That example was not on the list of examples
    you gave recently.


    It was posted several times.

    (https://github.com/richgel999/miniz/blob/master/miniz.c line 81, second
    hit for '>>')



  • From Janis Papanagnou@3:633/10 to All on Thu Jun 4 15:21:23 2026
    In your post you already addressed a lot that I'd have written as well
    (more or less).

    On 2026-06-04 14:35, David Brown wrote:
    On 04/06/2026 13:40, Bart wrote:
    [...]

    [...]

    It is an objective fact that "a * a + b * b" means "(a * a) + (b * b)"
    in normal mathematics (at least in the countries I am familiar with),
    and also in most mainstream programming languages.

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a purely subjective opinion. Even if it is true that this is "commonly agreed
    to" (and AFAIK you have no basis for that claim), that would still be a subjective opinion - no matter how common that opinion is.

    Does that clear up your misunderstanding about "objective" and
    "subjective" ?

    [...]

    Sometimes you might voice an opinion that is so extreme or uncommon that people might tell you you are wrong, when saying they disagree would be
    more appropriate - discussions here are not formal.

    Right. - And thus I've considered Bart's informal "too many parentheses"
    as just a sloppy formulation of "more than necessary" or "some spurious" parentheses. - We should grant him at least the same inaccuracies in his
    heated posts as he receives in any heated replies.

    Of course in the sense of clearness of communication it's not wrong to
    point out inaccurate statements. - Especially if such inaccuracies are deliberately used to rhetorically obfuscate previous wrong statements!

    Janis

    [...]


  • From Tim Rentsch@3:633/10 to All on Thu Jun 4 06:38:19 2026
    Bart <bc@freeuk.com> writes:

    [...]

    Thank you for your response. I'm sorry my comments weren't
    more helpful to you.

  • From Janis Papanagnou@3:633/10 to All on Thu Jun 4 15:47:38 2026
    On 2026-06-04 15:18, Bart wrote:
    On 04/06/2026 13:35, David Brown wrote:
    [...]

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a
    purely subjective opinion.

    So, you're arguing 'more than needed' is a completely different thing
    from 'too many'.

    It's a different thing, indeed. The suspicious keyword is "too many";
    a valuation, and subjective. - It's no biggie to me, and in my other
    post I said that I'd just read it as a sloppy formulated variant of
    "more than necessary" or some such. So while inaccurately formulated
    I'm fine with that; I understood what you had intended to express.

    But the "completely", BTW, in your "is a completely different thing"
    is a cheap rhetorical exaggeration to obfuscate or diminish the issue
    with your valuating statement. (I don't like such primitive rhetoric
    moves.)

    [...]

    I don't have the patience for such nonsense any more:

    * The () in '(a * b) + c' are generally unnecessary

    Right.


    * The () in 'a << (b + c)' are advisable

    Maybe, maybe not. (Depending on the involved persons, and on how they
    handle the cases shown below; whether they mix types in subexpressions
    or not.)


    * The () in '(a << b) + c' are necessary if the intent is to have
      what might be the more intuitive meaning.

    I've already written in some former post about _unnecessarily_ mixing
    different types in expressions.

    If you stay in such subexpressions with the same types you'll notice
    that the parentheses are unnecessary; the C-language's precedences
    have been sensibly chosen (in this case[*]).

    [*] And even if you add some of ^ | & it's still no problem, unless
    you have also any of the comparison operators in your expressions.
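
    One instance of the caveat in the footnote: in C, == and != bind
    tighter than &, ^ and |, so a mask test written without parentheses
    compares first and masks second. The following is a sketch added for
    illustration; FLAG_READY and the function names are hypothetical.

```c
#include <assert.h>

#define FLAG_READY 4u   /* hypothetical flag bit */

/* Parses as flags & (FLAG_READY == 0). The comparison yields 0,
   so the result is 0 whatever flags contains. */
int ready_is_clear_bare(unsigned flags)
{
    return flags & FLAG_READY == 0;
}

/* Parenthesized form: 1 when the FLAG_READY bit is not set, else 0. */
int ready_is_clear(unsigned flags)
{
    return (flags & FLAG_READY) == 0;
}

void check_mask_tests(void)
{
    assert(ready_is_clear(0u) == 1);
    assert(ready_is_clear(4u) == 0);
    assert(ready_is_clear_bare(0u) == 0);   /* differs from ready_is_clear */
    assert(ready_is_clear_bare(4u) == 0);
}
```

    Many compilers warn about the unparenthesized form when parentheses
    warnings are enabled.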

    Janis

    [...]


  • From Janis Papanagnou@3:633/10 to All on Thu Jun 4 15:57:39 2026
    On 2026-06-04 15:47, Janis Papanagnou wrote:
    On 2026-06-04 15:18, Bart wrote:
    [...]

    * The () in '(a << b) + c' are necessary if the intent is to have
       what might be the more intuitive meaning.

    I've already written in some former post about _unnecessarily_ mixing different types in expressions.

    To not cause misunderstandings here; by "different types" I meant the
    bit-logic and int-arithmetic, as explained in my mentioned former post.
    (The technical data types are of course both just some sort of 'int'.)

    If you stay in such subexpressions with the same types you'll notice
    that the parentheses are unnecessary; the C-language's precedences
    have been sensibly chosen (in this case[*]).

    [*] And even if you add some of ^ | & it's still no problem, unless
    you have also any of the comparison operators in your expressions.

    Janis

    [...]



  • From David Brown@3:633/10 to All on Thu Jun 4 16:27:28 2026
    On 04/06/2026 15:18, Bart wrote:
    On 04/06/2026 13:35, David Brown wrote:
    On 04/06/2026 13:40, Bart wrote:
    On 04/06/2026 10:34, Tim Rentsch wrote:
    Bart <bc@freeuk.com> writes:

    My point was that it could be objective, at least for too many. So
    (a*a) + (b*b) would be commonly agreed to have too many, [...]

    Apparently you misunderstand what is meant by the word objective.
    An objective statement is one that is independent of personal
    assessment, even collective personal assessment.

    I don't know of any infix PL syntax where 'a*a + b*b', as a
    standalone expression, doesn't mean '(a*a) + (b*b)'.

    Google agrees with me (in that 2*2+3*3 shows 13), and so does my
    Casio calculator.

    It's not my personal opinion!

    You are - again - moving the goalposts.

    It is an objective fact that "a * a + b * b" means "(a * a) + (b * b)"
    in normal mathematics (at least in the countries I am familiar with),
    and also in most mainstream programming languages.

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more
    parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a
    purely subjective opinion.

    So, you're arguing 'more than needed' is a completely different thing
    from 'too many'.


    Of course they are different things - albeit related things, rather than /completely/ different. One is a question of fact, the other a question
    of opinion, and they do not always coincide.

    It is a fact that "a << (b + c)" has more parentheses than needed. But
    I think we are both of the opinion that it does not have "too many" parentheses - it has an appropriate number of parentheses.


    Sigh...


    If you wrote, for example, that "a << b + c" is ambiguous in C, then you

    It is technically unambiguous in C.

    There is no "technically" about it. It is unambiguous in C.

    It can be ambiguous in the mind of
    somebody who would have to double-check the precedence levels, or where
    the C context is missing.

    I would not use the word "ambiguous" there - "unclear" would be more appropriate in the situation when someone does not know the C precedence levels.

    If you are given the expression and don't know it is in C, it's a very different matter - there are all kinds of things it could mean. In C++,
    it could mean concatenating two strings and passing the result to a
    output stream. In Forth, it could mean anything you like. With no
    context, the expression is not "ambiguous" because that implies that
    there is a number of reasonable interpretations - and without context,
    there is no limit to the interpretations.

    So while I entirely agree that "a << b + c" may not be clear, and may
    easily be misinterpreted, "ambiguous" is the wrong word to use.


    The discussion seems to be about what exactly is 'too many'.

    No, it's an attempt to get you to understand the difference between "objective" and "subjective" - fact and opinion. I don't understand why
    you are having such a problem here.


    Apparently you can construct a valid C source file where 99.9% of the
    text consists of () characters, but if someone - or even a million
    people - say that it is too many, then that is just their subjective opinion.

    64 levels of nested parentheses is /factually/ and /objectively/ too
    many to be guaranteed supported by a conforming C compiler. It takes a
    far smaller number to be viewed as too many in the subjective opinion of
    a large proportion of people.


    I don't have the patience for such nonsense any more:

    * The () in '(a * b) + c' are generally unnecessary


    Yes. They are unnecessary in C (that is a fact), and most people would
    not find them helpful in understanding the expression (that is a claimed
    fact, given without evidence, about people's opinions. It is my opinion
    that this claimed fact is true).


    * The () in 'a << (b + c)' are advisable

    That is a subjective opinion. /I/ would generally advise including the parentheses here. Other people might have a different opinion. And
    people can have different opinions depending on the target audience.


    * The () in '(a << b) + c' are necessary if the intent is to have
      what might be the more intuitive meaning.

    The parentheses in "(a << b) + c" are necessary if the intent is to
    shift "a" by "b", and then add "c" to the result. That is fact, not
    opinion. Any discussion of "intuitive" is necessarily subjective.


    If this is not 100% C-specific, then () are needed for both of the last two examples, but not the first.

    You all know this.


    Do /you/ know what is fact and what is opinion here? Do you understand
    the difference, after spoon-feeding you these examples?

    And do you understand why it is important in a discussion to be able to
    make these distinctions? It matters, even if you and I would likely
    both want the parentheses mostly in the same places.



    (Note that C has its own problems in this area:

        a = b/*p;      // divide b by dereferenced pointer p

    Here, /* also happens to start a block comment.)


    Here you are objectively wrong. C does not have a "problem" with
    this. The parsing rules of the language are clear - often called
    "maximum munch". The character sequence "/*" is the start of a
    comment, it is not two separate operators.

    This is where it falls down. It's very clearly a 'gotcha', and
    consequence of poorly thought-out design.

    It is neither a "gotcha", nor a consequence of poor design. It does not
    "fall down". It is simply a minor consequence of the choice of operator syntax. Such an expression would occur rarely in code, and to be a
    "gotcha" it would need to be realistic for someone to write it, without spaces, and for their code to compile and be used without the mistake
    being noticed. Do you think that is in any way realistic? I do not.

    And to be "poor design", it needs to be something that is likely to
    cause problems (which it is not), or which requires significant effort
    to work around. Writing "a = b / *p;" is not challenging, and a lot of
    people prefer spaces around binary operators anyway.

    I'd say you were making a mountain out of a molehill, but I don't think
    it's as big as a molehill.


    That the behaviour is deterministic doesn't change that.


    Of course it does. If some compilers treated it differently, then there
    might be a chance that someone wrote such code and got the expected
    results from the tool they were using, even though it was treated
    differently by other tools.


    It's silly to
    worry about something that 999 people out of a 1000 (and the actual
    numbers are undoubtedly much higher) are able to navigate without
    difficulty. Yet the examples you give insist on focusing on the few
    hopeless individuals.

    Are you saying that whoever wrote code like this:

          crcu32 = (crcu32 >> 4) ^ s_crc32[(crcu32 & 0xF) ^ (b & 0xF)];

    is needlessly worrying about the 99.9+% of the readership who you
    claim will know C syntax rules precisely? That is, they would find
    this version just as clear without any extra cognitive effort:

          crcu32 = crcu32 >> 4 ^ s_crc32[crcu32 & 0xF ^ b & 0xF];

    ?

    Tim did not write that.

    What was the 'something' in "It's silly to worry about something that ..."?


    My mind-reading skills are not that well developed.

    I assume it's people being unable to understand that second example.

    He did not say he was talking about those examples. Given that the
    "crc" examples are more distant in the Usenet thread, it seems a stretch
    to assume he was referring to them, rather than to the code examples you
    had just given. (It would, perhaps, have been helpful if Tim had not
    snipped those examples.)


    Yet I see parentheses being used in such cases a LOT more than 0.1% of
    the time. 50% or more would be my guess.


    That example was not on the list of examples you gave recently.


    It was posted several times.

    (https://github.com/richgel999/miniz/blob/master/miniz.c line 81, second
    hit for '>>')




  • From Bart@3:633/10 to All on Thu Jun 4 16:46:21 2026
    On 04/06/2026 15:27, David Brown wrote:
    On 04/06/2026 15:18, Bart wrote:

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more
    parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a
    purely subjective opinion.

    So, you're arguing 'more than needed' is a completely different thing
    from 'too many'.

    Of course they are different things - albeit related things, rather
    than /completely/ different. One is a question of fact, the other a question of opinion, and they do not always coincide.

    It is a fact that "a << (b + c)" has more parentheses than needed. But
    I think we are both of the opinion that it does not have "too many" parentheses - it has an appropriate number of parentheses.

    So saying 'too many' of something will be a subjective opinion? OK, so
    let's try compiling this bit of C:

    void F(int, int);

    int main() {
    F(1, 2, 3);
    }

    8 out of 9 compilers reported 'Too many arguments'.

    According to you, that's only their subjective opinion, not an objective
    fact?

    I tried a version in Go for good measure; it also used 'Too many'.

    I think we'll leave it here.




    Sigh...


    If you wrote, for example, that "a << b + c" is ambiguous in C, then you

    It is technically unambiguous in C.

    There is no "technically" about it. It is unambiguous in C.

    It can be ambiguous in the mind of somebody who would have to double-
    check the precedence levels, or where the C context is missing.

    I would not use the word "ambiguous" there - "unclear" would be more appropriate in the situation when someone does not know the C precedence levels.

    What would you think if you saw this:

    r << 16 + g << 8 + b

    Did they really mean 'r << (16 + g) << (8 + b)' ?


    No, it's an attempt to get you to understand the difference between "objective" and "subjective" - fact and opinion. I don't understand why
    you are having such a problem here.

    See my example above with compilers. Maybe you can give all their
    authors the same patronising talk.

    * The () in '(a << b) + c)' are necessary if the intent is to have
      what might be the more intuitive meaning.

    The parentheses in "(a << b) + c" are necessary if the intent is to
    shift "a" by "b", and then add "c" to the result. That is fact, not opinion. Any discussion of "intuitive" is necessarily subjective.

    Intuitive because here << performs the same scaling function as multiply:

    a << b is the same as a * 2**b

    a * b is the same as a << log2(b) when b is a power of two
    (or thereabouts!)

    The point is: they naturally belong together.

    Given 'a * 8 + b' or 'a << 3 + b', it is desirable to freely convert one
    to the other without having to restructure the parentheses.

       a = b/*p;      // divide b by dereferenced pointer p

    This is where it falls down. It's very clearly a 'gotcha', and
    consequence of poorly thought-out design.

    It is neither a "gotcha", nor a consequence of poor design. It does not "fall down". It is simply a minor consequence of the choice of operator syntax. Such an expression would occur rarely in code, and to be a
    "gotcha" it would need to be realistic for someone to write it, without spaces, and for their code to compile and be used without the mistake
    being noticed. Do you think that is in any way realistic? I do not.

    It's a poor show. This program:

    #include <stdio.h>
    int main() {
        int a=1, b=200, c=3, d=77;
        int *p = &d;

        a = b / *p;
        c = d /* comment*/ + 5;

        printf("%d\n", a);
        printf("%d\n", c);
    }

    displays 2 82. If that space between / and * is lost, it still compiles,
    but displays 205 3.

    Yes, it's unlikely, but so what? You don't dismiss such issues in a PL
    by crossing your fingers and suggesting it's unlikely to come up.

    There are actually other issues associated with /**/ comments; here
    someone forgot to terminate the first comment:

    puts("one"); /* comment 1
    puts("two"); /* comment 2 */
    puts("three"); /* comment 3 */

    The middle line is silently elided. This is one with // comments:

    puts("one"); // file c:\cx\
    puts("two");
    puts("three");

    Again, the middle line is commented out.

    I'd say C comments have a few issues. That the standard explains exactly
    how they work doesn't help.


    And to be "poor design", it needs to be something that is likely to
    cause problems

    But you would choose not to have these issues in a new language.

    What was the 'something' in "It's silly to worry about something
    that ..."?


    My mind-reading skills are not that well developed.

    It didn't stop you giving an opinion about what you thought he meant!


    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Scott Lurndal@3:633/10 to All on Thu Jun 4 16:18:07 2026
    David Brown <david.brown@hesbynett.no> writes:
    On 04/06/2026 15:18, Bart wrote:

    (Note that C has its own problems in this area:

       a = b/*p;      // divide b by dereferenced pointer p

    Here, /* also happens to start a block comment.)


    Here you are objectively wrong. C does not have a "problem" with
    this. The parsing rules of the language are clear - often called
    "maximum munch". The character sequence "/*" is the start of a
    comment, it is not two separate operators.

    This is where it falls down. It's very clearly a 'gotcha', and
    consequence of poorly thought-out design.

    It is neither a "gotcha", nor a consequence of poor design.

    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Bart@3:633/10 to All on Thu Jun 4 17:23:16 2026
    On 04/06/2026 17:18, Scott Lurndal wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 04/06/2026 15:18, Bart wrote:

    (Note that C has its own problems in this area:

       a = b/*p;      // divide b by dereferenced pointer p

    Here, /* also happens to start a block comment.)


    Here you are objectively wrong. C does not have a "problem" with
    this. The parsing rules of the language are clear - often called
    "maximum munch". The character sequence "/*" is the start of a
    comment, it is not two separate operators.

    This is where it falls down. It's very clearly a 'gotcha', and
    consequence of poorly thought-out design.

    It is neither a "gotcha", nor a consequence of poor design.

    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    How does that not make it bad design?

    The preprocessor would strip everything from the /* until the next
    matching */, so a chunk of your program goes missing.

    If lucky, what's left will be an error, but not always.

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Dan Cross@3:633/10 to All on Thu Jun 4 16:31:35 2026
    In article <865x3yd21n.fsf@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <86ik81cfk5.fsf_-_@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 2026-06-01 00:54, Keith Thompson wrote:
    [...]

    Yes, a compiler can reduce (a + b) * 0 to just 0. But it's not
    required to do so, and (INT_MAX + 1) * 0 still has undefined
    behavior. Undefined behavior is determined by the rules of the
    abstract machine *without* any adjustments permitted by the as-if
    rule.

    This is something I really don't get in the actual C-logic...

    Using constants that can be determined at compile time is UB here,
    despite the '* 0' mathematically indicating an IMO clear semantics,
    but using variables is only UB possibly at runtime? [...]

    There's an important distinction to make here. Consider this
    program:

    #include <limits.h>

    int
    foo(){
        int zero = (INT_MAX+1)*0;
        return zero;
    }

    int
    main(){
        return 0;
    }

    This program does not transgress the bounds of undefined behavior.

    To clarify, the comments in my posting were meant to be read as
    saying the given text is the entire program, and that it is strictly
    conforming with respect to conforming hosted implementations.
    (Incidentally, given the rules for freestanding implementations, I'm
    not sure that it is even possible for any program to be strictly
    conforming with respect to conforming freestanding implementations.
    In any case my statements were meant only in the context of hosted
    implementations.)

    Ok.

    [snip]
    Perhaps you mean that this is irrelevant because `foo` is not
    invoked, but I see no reason why that need be the case in e.g.
    a freestanding environment.

    I explained the context of my previous statements above. Sorry for
    not saying that in the original message.

    In a hosted environment, I don't
    think anything explicitly prevents `foo` from being called after
    `main` returns (though I can't imagine that would happen in real
    life; it would be weird if it did).

    The semantics described in the ISO C standard don't admit that
    possibility.

    Could you please point to where it says this, in the C standard?

    I cannot find anything that says that arbitrary code cannot run
    after `main()` returns, and I don't see how that could possibly
    be true.

    Whether foo() has external linkage or internal
    linkage doesn't change that.

    I disagree. There's no possible way for the implementation to
    know whether a function with external linkage will be ultimately
    invoked or not; consider a system that supports loadable shared
    modules. Nothing prevents even this simple program from being
    compiled as a shared module, dynamically loaded, the loading
    program explicitly searching for and finding the symbol
    corresponding to the `foo` function, and invoking it.

    Hence, the compiler _must_ treat with UB as written, which is
    why `ubsan` inserts trapping code in `foo`.

    In your example, `foo` clearly exhibits UB; I think your
    argument is whether that has a realized effect or not, since the
    UB is not invoked. I'm saying that in general a compiler cannot
    possibly know that when it compiles `foo`, and is free to assume
    the worst.

    Only those actions initiated by
    statements in main() are ever elaborated.

    This is not true: code can obviously run outside of the bounds
    of `main`, for several reasons.

    First, there is the issue of static initializers, which you had
    mentioned earlier. Not at play here, but it does invalidate
    your statement above, as these run "before" main is invoked.

    Second, we know that code can run after because `atexit` can be
    used to register handlers that will run after it terminates: as
    section 5.1.2.3.4 of n3220 says, "a return from the initial call
    to the main function is equivalent to calling the exit function
    with the value returned by the main function as its argument",
    which means that it will run `atexit` handlers. (But, as the
    footnote warns, lifetimes of variables with automatic storage
    duration have ended in this case in accordance with sec 6.2.4,
    since `main` has terminated.)

    Third, it is possible to invoke code that may conditionally be
    executed, such as signal handlers, in response to external
    events. Certainly, `signal(SIGINT, some_handler);` does not
    immediately guarantee that `some_handler` is run, but it does
    not prevent it from running, either.

    Of course, for the second and third points, we must acknowledge
    that one might quibble about what it means to say that a program
    invokes "actions initiated by statements in main()".
    Registering signal and exit handlers is (generally) going to be
    something done in as the consequence of an "action initiated by
    statements in main()". And subsequent invocation of (say)
    `some_handler` in the example above in response to receipt of a
    `SIGINT` signal is arguably a consequence of that.

    But I'm not sure what _you_ mean by "transgress the bounds of
    undefined behavior" here.

    It's a grammatical fine point. I think for present purposes it's
    okay to gloss over the distinction, and say this statement may be
    read as saying "the program does not have undefined behavior".

    Except it does. `foo` is an example of what Regehr calls a
    "Type 3" function in https://blog.regehr.org/archives/213.

    Also you are discounting time-travel; code need not actually
    invoke UB to suffer from it. The mere existence of it can be
    enough.
    https://devblogs.microsoft.com/oldnewthing/20140627-00/?p=633

    Moreover, undefined behavior simply is; the definition from the
    C standard does not say that time-traveling "post-modern"
    compilers are free to assume and do anything if they observe UB
    anywhere in a program.

    Here, because `foo` has external linkage, the compiler cannot
    know whether `foo` is invoked or not, and I see nothing
    preventing it from assuming that the entire program is in error.
    In particular, I don't think there is anything that prevents a
    compiler from simply emitting `int main(void) { abort(); }`.

    Even more than that, the program is strictly conforming, and must be
    accepted by a conforming implementation.

    See above.

    Now let's change the program slightly:

    #include <limits.h>

    int
    foo(){
        static int zero = (INT_MAX+1)*0;
        return zero;
    }

    int
    main(){
        return 0;
    }

    This program does transgress the bounds of undefined behavior. The
    reason for the difference is that in the first program the semantics
    of foo() is to evaluate the expression to be stored in 'zero' only
    at runtime, whereas in the second program the semantics of foo() is
    to evaluate the expression to be stored in 'zero' before program
    startup (informally, "at compile time"). What matters is not
    whether the offending expression /might/ be evaluated "at compile
    time", but whether the offending expression /must/ be evaluated "at
    compile time". Only in the second case is undefined behavior
    inevitable (and thus it does not occur in the first program).

    Fine point: strictly speaking, I believe the C standard allows even
    the second program to complete translation phase 8 successfully, and
    for any offending behavior to occur only when we actually try to run
    the program. To say that another way, there is no requirement that
    possible nasal demons be made manifest at any point before an actual
    attempted execution. On the other hand, because that possibility is
    there lurking in the background, there is no requirement that the
    program be accepted, and could be rejected by a conforming compiler.

    Indeed. Further, I believe that the same is true for the first
    program, as well.

    It isn't. In the first program the offending expression is never
    evaluated, because foo() is never called.

    See above.

    Of course, I don't think any of this would _actually_ happen,
    and if it did, one should take the compiler that does it and
    toss it in the trash. But I don't think it's prohibited,
    either; such is one of the consequences of an informal
    specification like the C standard. Time-travel is especially
    pernicious in post-modern compilers.

    - Dan C.


    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Scott Lurndal@3:633/10 to All on Thu Jun 4 16:47:50 2026
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 17:18, Scott Lurndal wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 04/06/2026 15:18, Bart wrote:

    (Note that C has its own problems in this area:

       a = b/*p;      // divide b by dereferenced pointer p

    Here, /* also happens to start a block comment.)


    Here you are objectively wrong. C does not have a "problem" with
    this. The parsing rules of the language are clear - often called
    "maximum munch". The character sequence "/*" is the start of a
    comment, it is not two separate operators.

    This is where it falls down. It's very clearly a 'gotcha', and
    consequence of poorly thought-out design.

    It is neither a "gotcha", nor a consequence of poor design.

    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    How does that not make it bad design?

    The preprocessor would strip everything from the /* until the next
    matching */, so a chunk of your program goes missing.

    Whatcha talkin' 'bout willis?


    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Janis Papanagnou@3:633/10 to All on Thu Jun 4 19:47:25 2026
    On 2026-06-04 18:18, Scott Lurndal wrote:

    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    Curious; was the comment-handling at some point in history removed
    from the Cpp-processing? - If so, when was that? And I assume the
    semantics are still the same; is that correct?

    Janis


    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Janis Papanagnou@3:633/10 to All on Thu Jun 4 20:15:35 2026
    On 2026-06-04 17:46, Bart wrote:
    On 04/06/2026 15:27, David Brown wrote:
    On 04/06/2026 15:18, Bart wrote:

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more
    parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a
    purely subjective opinion.

    So, you're arguing 'more than needed' is a completely different thing
    from 'too many'.

    Of course they are different things - albeit related things, rather
    than /completely/ different. One is a question of fact, the other a
    question of opinion, and they do not always coincide.

    It is a fact that "a << (b + c)" has more parentheses than needed.
    But I think we are both of the opinion that it does not have "too
    many" parentheses - it has an appropriate number of parentheses.

    So saying 'too many' of something will be a subjective opinion?

    Oh, I feel guilty! - My hint on the _suspicious_ "too many" keyword
    got (wrongly!) *generalized* as always being a subjective valuation
    in all cases. - I apologize to have contributed to your confusion.

    OK, so let's try compiling this bit of C:

      void F(int, int);

      int main() {
          F(1, 2, 3);
      }

    8 out of 9 compilers reported 'Too many arguments'.

    And that is correct in _this semantic context_. - The rules require
    two integers and providing three is too many of course; that's a fact.

    In x=(((((((a))))))); the rules allow the parentheses, but they are
    neither required nor do they seem to serve any sensible purpose. Here
    the "too many" (w.r.t. clearness of the expression) is a subjectively
    common and sensible valuation.

    [...]

    I think we'll leave it here.

    (I'd have hoped we could leave all that from the beginning!)

    [...]

    * The () in '(a << b) + c)' are necessary if the intent is to have
      what might be the more intuitive meaning.

    The parentheses in "(a << b) + c" are necessary if the intent is to
    shift "a" by "b", and then add "c" to the result. That is fact, not
    opinion. Any discussion of "intuitive" is necessarily subjective.

    Intuitive because here << performs the same scaling function as multiply:

      a << b   is the same as a * 2**b

      a * b    is the same as a << log2(b) when b is a power of two
               (or thereabouts!)

    ("or thereabouts"? - You are squirming and lacking the precision that
    would be necessary here for a sensible consideration of the concepts.)


    The point is: they naturally belong together.

    You can express the shift by arithmetic, yes. And you can express some
    *special cases* of arithmetic by the shifts. - That doesn't imply that
    you should thus mix types. Rather the opposite; if you stay within the
    respective operation class, the precedences of the C language support
    you, with no parentheses necessary.

    The point is, if you operate on bits you should best use bit-operations
    and if you do arithmetic you should best use arithmetic operations.
    (The word "best" expresses my personal valuation based on explanations
    I already gave before and repeat below - it may be worth to think about
    that for a moment before continuing.)


    Given 'a * 8 + b' or 'a << 3 + b', it is desirable to freely convert one
    to the other without having to restructure the parentheses.

    No, it's undesirable if you want to express a cleanly typed expression.

    If (for some reason) you don't care about a clear separation *then* you
    might *need*, and probably *should* use parentheses to reestablish the clearness that you gave up in the first place by mixing the arithmetic
    and bit type operations.

    I suggest if you intend arithmetic write a * 8 + b (not a * 8 | b ),
    if you intend bit operations write u << 3 | v (unless there's reason)
    and you need no parentheses. So that then, in the cases where the shift
    value is to be calculated, you may write u << a + b
    and also need no parentheses (but you can of course use them if you're
    unsure about readers' understanding or if you as programmer are unsure
    about it despite existing precedence tables and given explanations).[*]

    The precedences in "C" are sensibly defined in all those cases.

    Janis

    [*] Note that I used different letters to enhance comprehensibility for
    you. (Where you used the same names because you haven't been capable of recognizing the point and throw all in one bag thus missing the point
    of differentiating the two operation classes.)

    [...]


    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Lew Pitcher@3:633/10 to All on Thu Jun 4 18:45:23 2026
    On Thu, 04 Jun 2026 16:18:07 +0000, Scott Lurndal wrote:

    [snip]
    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    So, I've looked through "The C Programming Language" (the K&R C)
    and the paper "A Tour Through the Portable C Compiler" (S. C.
    Johnson, circa 1974), and neither document states that the
    preprocessor strips comments. In fact, the mentions of the
    preprocessor are exclusively about the #operation operators,
    and not about C comments.

    In "A Tour Through the Portable C Compiler", Mr. Johnson explicitly
    states that the 1st compiler pass (which follows the preprocessor pass)
    takes care of the comments. Specifically, Mr. Johnson says
    "Pass 1
    The first pass does lexical analysis, parsing, symbol table
    maintenance, tree building, optimization, and a number of
    machine dependant things. ...

    Lexical Analysis
    The lexical analyzer is a conceptually simple routine that reads
    the input and returns the tokens of the C language as it encounters
    them ... The conceptual simplicity of this job is confounded a bit
    by several other simple jobs that unfortunately must go on
    simultaneously. These include
    ...
    * Skipping comments
    "
    It appears that, in at least one seminal C compiler, the job of reducing comments to whitespace was not part of the preprocessor's responsibility
    but was instead implemented as part of the first (lexical) pass of the
    compiler proper.

    --
    Lew Pitcher
    "In Skills We Trust"
    Not LLM output - I'm just like this.

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From David Brown@3:633/10 to All on Thu Jun 4 20:54:42 2026
    On 04/06/2026 17:46, Bart wrote:
    On 04/06/2026 15:27, David Brown wrote:
    On 04/06/2026 15:18, Bart wrote:

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more
    parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a
    purely subjective opinion.

    So, you're arguing 'more than needed' is a completely different thing
    from 'too many'.

    Of course they are different things - albeit related things, rather
    than /completely/ different. One is a question of fact, the other a
    question of opinion, and they do not always coincide.

    It is a fact that "a << (b + c)" has more parentheses than needed.
    But I think we are both of the opinion that it does not have "too
    many" parentheses - it has an appropriate number of parentheses.

    So saying 'too many' of something will be a subjective opinion? OK, so
    let's try compiling this bit of C:

      void F(int, int);

      int main() {
          F(1, 2, 3);
      }

    8 out of 9 compilers reported 'Too many arguments'.

    According to you, that's only their subjective opinion, not an objective fact?

    Again - /please/ stop trying to guess what people say or put words in
    their mouths. I can't remember ever seeing you do so accurately.

    "Too many parentheses" is subjective, because they affect the ease of
    reading the code as a human reader. "Too many arguments in a function
    call" affects the semantics of the code - it is objective fact. It is
    not something that involves human opinions.

    I think it would be easier to explain this to my cat than to you.
    Simple logic seems to be completely beyond your grasp.


    My mind-reading skills are not that well developed.

    It didn't stop you giving an opinion about what you thought he meant!


    I did not claim to know, or even assume, what he /meant/ - I commented
    on what he /said/. That was factual. And I made a comment on what I /thought/ it was likely that he meant (or did not mean). That was
    opinion, and clearly so. The words are in the post for all to see, the thoughts behind those words are not.


    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Bart@3:633/10 to All on Thu Jun 4 19:57:58 2026
    On 04/06/2026 17:47, Scott Lurndal wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 17:18, Scott Lurndal wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 04/06/2026 15:18, Bart wrote:

    (Note that C has its own problems in this area:

       a = b/*p;      // divide b by dereferenced pointer p

    Here, /* also happens to start a block comment.)


    Here you are objectively wrong. C does not have a "problem" with
    this. The parsing rules of the language are clear - often called
    "maximum munch". The character sequence "/*" is the start of a
    comment, it is not two separate operators.

    This is where it falls down. It's very clearly a 'gotcha', and
    consequence of poorly thought-out design.

    It is neither a "gotcha", nor a consequence of poor design.

    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    How does that not make it bad design?

    The preprocessor would strip everything from the /* until the next
    matching */, so a chunk of your program goes missing.

    Whatcha talkin' 'bout willis?


    What were /you/ talking about? What was your point?


    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Keith Thompson@3:633/10 to All on Thu Jun 4 11:59:49 2026
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 13:35, David Brown wrote:
    On 04/06/2026 13:40, Bart wrote:
    On 04/06/2026 10:34, Tim Rentsch wrote:
    Bart <bc@freeuk.com> writes:

    My point was that it could be objective, at least for too many. So
    (a*a) + (b*b) would be commonly agreed to have too many, [...]

    Apparently you misunderstand what is meant by the word objective.
    An objective statement is one that is independent of personal
    assessment, even collective personal assessment.

    I don't know of any infix PL syntax where 'a*a + b*b', as a
    standalone expression, doesn't mean '(a*a) + (b*b)'.

    Google agrees with me (in that 2*2+3*3 shows 13), and so does my
    Casio calculator.

    It's not my personal opinion!

    You are - again - moving the goalposts.

    It is an objective fact that "a * a + b * b" means "(a * a) + (b *
    b)" in normal mathematics (at least in the countries I am familiar
    with), and also in most mainstream programming languages.

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more
    parentheses than needed in the context of most programming
    languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a
    purely subjective opinion.

    So, you're arguing 'more than needed' is a completely different thing
    from 'too many'.

    Sigh...

    Yes, it's a different thing, assuming at least one reasonable
    interpretation of "more than needed". But if you use the phrase
    "more than needed" without specifying *for what purpose*, you have
    ammunition for a long pointless argument.

    `(a*a) + (b*b)` objectively has more parentheses than are needed
    *for the purpose of telling the compiler which operations go with
    which operands*. Assuming it's a full expression, it's exactly
    equivalent to `a*a + b*b`.

    My subjective opinion is that `(a*a) + (b*b)` has "too many"
    parentheses. The relative precedences of "*" and "+" are
    sufficiently well known that I find the parentheses distracting.

    A subjective opinion doesn't become objective just because almost
    everyone agrees with it.

    Even the idea that "*" should bind more tightly than "+" is
    subjective. It's a rule that only goes back to the 1600s or so.
    Mathematicians *invented* it. There are real advantages to that
    choice, and *tremendous* advantages to having a near-universal
    convention, but for example strict left-to-right association would
    also have been a valid choice. (And implementing an expression
    parser that binds "+" more tightly than "*" could be an interesting
    exercise, though few would want to use it in practice.) Again, a
    subjective preference doesn't become objective just because nearly
    everyone agrees with it.

    On the other hand, if I'm explaining the precedence rules, I might
    say (as I did above) that `a*a + b*b` is equivalent to `(a*a) + (b*b)`.
    In that context the parentheses are not "too many"; they're a
    necessary part of the explanation. I find them to be "too many"
    for the purpose of writing clear code, but not for some more
    specialized purposes.

    If you wrote, for example, that "a << b + c" is ambiguous in C, then
    you

    It is technically unambiguous in C. It can be ambiguous in the mind of somebody who would have to double-check the precedence levels, or
    where the C context is missing.

    Agreed.

    The discussion seems to be about what exactly is 'too many'.

    If so, then we need to be clear what "too many" means. Too many
    for what purpose?

    Apparently you can construct a valid C source file where 99.9% of the
    text consists of () characters, but if someone - or even a million
    people - say that it is too many, then that is just their subjective
    opinion.

    (((((((((((((((a))))))))))))))), without any more context, is likely
    to be "too many" parentheses. If it's the result of a complicated
    macro expansion or machine-generated code, it might not matter.
    If the purpose is to test what depth of parentheses a compiler
    supports without crashing, it's probably not nearly enough.

    I don't have the patience for such nonsense any more:

    * The () in '(a * b) + c' are generally unnecessary

    * The () in 'a << (b + c)' are advisable

    * The () in '(a << b) + c)' are necessary if the intent is to have
    what might be the more intuitive meaning.

    I agree on all three points (apart from the mismatched ")" in the
    last one).

    [...]

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From David Brown@3:633/10 to All on Thu Jun 4 21:04:50 2026
    On 04/06/2026 19:47, Janis Papanagnou wrote:
    On 2026-06-04 18:18, Scott Lurndal wrote:

    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    Curious; was the comment-handling at some point in history removed
    from the Cpp-processing? - If so, when was that? And I assume the
    semantics are still the same; is that correct?


    No, at least since the standardisation of the C language (including K&R "standard"), "preprocessing" has been an integral part of the C language
    and conversion of comments to space characters is done in phase 3 of the translation. But the C standards do not give an explicit distinction
    between "preprocessing" and "compiling" - just different translation
    phases. (They do not define a "compiler" at all.) It is not uncommon
    for implementations to separate translation into two or more programs, especially in the good old days when hosts had much less memory, but
    logically they are all one implementation. Distinguishing "the compiler itself" is somewhat artificial.


    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Keith Thompson@3:633/10 to All on Thu Jun 4 12:11:45 2026
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 2026-06-04 18:18, Scott Lurndal wrote:
    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    Curious; was the comment-handling at some point in history removed
    from the Cpp-processing? - If so, when was that? And I assume the
    semantics are still the same; is that correct?

    According to the standard, each comment is replaced by one space
    character in translation phase 3. For implementations where the
    preprocessor is a separate program, it typically handles translation
    phases 1-6 or 1-7. ("gcc -E" doesn't splice string literals.)

    The semantics may have been different in some ancient
    implementations. For example, I vaguely recall that it was common
    for ABC/**/DEF to be equivalent to ABCDEF. K&R1 says that comments
    are treated as whitespace.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Lew Pitcher@3:633/10 to All on Thu Jun 4 19:13:36 2026
    On Thu, 04 Jun 2026 21:04:50 +0200, David Brown wrote:

    On 04/06/2026 19:47, Janis Papanagnou wrote:
    On 2026-06-04 18:18, Scott Lurndal wrote:

    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    Curious; was the comment-handling at some point in history removed
    from the Cpp-processing? - If so, when was that? And I assume the
    semantics are still the same; is that correct?


    No, at least since the standardisation of the C language (including K&R "standard"), "preprocessing" has been an integral part of the C language
    and conversion of comments to space characters is done in phase 3 of the translation. But the C standards do not give an explicit distinction between "preprocessing" and "compiling" - just different translation
    phases. (They do not define a "compiler" at all.) It is not uncommon
    for implementations to separate translation into two or more programs, especially in the good old days when hosts had much less memory, but logically they are all one implementation. Distinguishing "the compiler itself" is somewhat artificial.

    In historic Unix (Version 7 and before), the preprocessor was implemented
    as a separate program ("cpp") from the compiler ("cc"). The compiler itself
    had no facility to handle preprocessor directives, and was, itself, often divided into two separate programs ("cc0" and "cc1"). All three phases
    ("cpp", "cc0" and "cc1") were managed by a program ("cc"), although the
    program for each phase could be invoked independently through manual
    execution.

    What differs from today is that the preprocessor was an optional component, made available for a programmer's convenience.


    --
    Lew Pitcher
    "In Skills We Trust"
    Not LLM output - I'm just like this.

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Bart@3:633/10 to All on Thu Jun 4 20:29:39 2026
    On 04/06/2026 19:54, David Brown wrote:
    On 04/06/2026 17:46, Bart wrote:
    On 04/06/2026 15:27, David Brown wrote:
    On 04/06/2026 15:18, Bart wrote:

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more
    parentheses than needed in the context of most programming languages.
    "(a*a) + (b*b) has too many parentheses", on the other hand, is a
    purely subjective opinion.

    So, you're arguing 'more than needed' is a completely different
    thing from 'too many'.

    Of course they are different things - albeit related things, rather
    than /completely/ different. One is a question of fact, the other a
    question of opinion, and they do not always coincide.

    It is a fact that "a << (b + c)" has more parentheses than needed.
    But I think we are both of the opinion that it does not have "too
    many" parentheses - it has an appropriate number of parentheses.

    So saying 'too many' of something will be a subjective opinion? OK, so
    let's try compiling this bit of C:

       void F(int, int);

       int main() {
           F(1, 2, 3);
       }

    8 out of 9 compilers reported 'Too many arguments'.

    According to you, that's only their subjective opinion, not an
    objective fact?

    Again - /please/ stop trying to guess what people say or put words in
    their mouths. I can't remember ever seeing you do so accurately.

    This is what you actually said:

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a purely subjective opinion. Even if it is true that this is "commonly agreed
    to" (and AFAIK you have no basis for that claim), that would still be a subjective opinion - no matter how common that opinion is.

    You're saying that:

    * "more than needed" is objective
    * "too many" is subjective

    Even though both are about exactly the same thing: superfluous but
    harmless parentheses in an expression.

    So you are picking on my choice of words, apparently in order to win
    some stupid argument on the internet. Even though the same "too many"
    phrase used elsewhere can be objective, according to you.

    This looks like a pattern: people here seem to have remarkable trouble
    debating with me on actual ideas and resort instead to finding hidden
    significance in whatever choice of words I happen to use.


    "Too many parentheses" is subjective, because they affect the ease of reading the code as a human reader.

    And 'more than needed' isn't that?!

    Why don't you write a bunch of expressions with variable numbers of parentheses, and against each tick off whether 'more than needed' and
    'too many' is true.

    I'd be interested in whether there would be any difference in the two
    columns, and if there is one, at what point they would diverge.

    No, this is just getting ludicrous and suggests not wanting to tackle
    the real subject: should people write '(a << b) & c' or 'a << b & c'?

    Tim Rentsch I'm sure will prefer the latter because 99.9% of C
    programmers are machines, according to him.

    Presumably, the same 99.9% will not use indentation, and will write
    their programs all on one line anyway, because it is still after all completely unambiguous according to the C standard!



    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Dan Cross@3:633/10 to All on Thu Jun 4 20:19:58 2026
    In article <10vsh43$b3is$1@dont-email.me>,
    Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
    On Thu, 04 Jun 2026 16:18:07 +0000, Scott Lurndal wrote:

    [snip]
    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    So, I've looked through "The C Programming Language" (the K&R C)
    and the paper "A Tour Through the Portable C Compiler" (S. C.
    Johnson, circa 1974), and neither document states that the
    preprocessor strips comments. In fact, the mentions of the
    preprocessor are exclusively about the #operation operators,
    and not about C comments.

    The PDP-11 compiler from 5th Edition research Unix removes
    comments in `cc.c`. The 1972 compilers from Dennis Ritchie's
    web page remove them in the compiler proper, as they predated
    the preprocessor: https://www.nokia.com/bell-labs/about/dennis-m-ritchie/primevalC.html

    - Dan C.


    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Scott Lurndal@3:633/10 to All on Thu Jun 4 20:31:20 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <10vsh43$b3is$1@dont-email.me>,
    Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
    On Thu, 04 Jun 2026 16:18:07 +0000, Scott Lurndal wrote:

    [snip]
    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    So, I've looked through "The C Programming Language" (the K&R C)
    and the paper "A Tour Through the Portable C Compiler" (S. C.
    Johnson, circa 1974), and neither document states that the
    preprocessor strips comments. In fact, the mentions of the
    preprocessor are exclusively about the #operation operators,
    and not about C comments.

    The PDP-11 compiler from 5th Edition research Unix removes
    comments in `cc.c`. The 1972 compilers from Dennis Ritchie's
    web page remove them in the compiler proper, as they predated
    the preprocessor: https://www.nokia.com/bell-labs/about/dennis-m-ritchie/primevalC.html

    The v6 cpp.c processes the comments
    and deletes them if the 'passcom' (-C) flag is not set.


    case '/': for (;;) {
    if (*p++=='*') {/* comment */
    if (!passcom) {inp=p-2; dump(); ++flslvl;}
    for (;;) {
    while (!iscom(*p++));
    if (p[-1]=='*') for (;;) {
    if (*p++=='/') goto endcom;
    if (eob(--p)) {
    if (!passcom) {inp=p; p=refill(p);}
    else if ((p-inp)>=BUFSIZ) {/* split long comment */
    inp=p; p=refill(p); /* last char written is '*' */
    putc('/',fout); /* terminate first part */
    /* and fake start of 2nd */
    outp=inp=p-=3; *p++='/'; *p++='*'; *p++='*';
    } else p=refill(p);
    } else break;
    } else if (p[-1]=='\n') {
    ++lineno[ifno]; if (!passcom) putc('\n',fout);
    } else if (eob(--p)) {
    if (!passcom) {inp=p; p=refill(p);}
    else if ((p-inp)>=BUFSIZ) {/* split long comment */
    inp=p; p=refill(p);
    putc('*',fout); putc('/',fout);
    outp=inp=p-=2; *p++='/'; *p++='*';
    } else p=refill(p);
    } else ++p; /* ignore null byte */
    }
    endcom:
    if (!passcom) {outp=inp=p; --flslvl; goto again;}
    break;
    }
    if (eob(--p)) p=refill(p);
    else break;

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From James Kuyper@3:633/10 to All on Thu Jun 4 16:33:52 2026
    On 2026-06-04 13:47, Janis Papanagnou wrote:
    On 2026-06-04 18:18, Scott Lurndal wrote:

    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    Curious; was the comment-handling at some point in history removed
    from the Cpp-processing? - If so, when was that? And I assume the
    semantics are still the same; is that correct?

    That question can only be answered in the context of a particular implementation of C. The C standard defines only what the entire
    implementation must do when translating and executing a program. Whether
    all of those tasks are performed by a single program, or whether
    responsibility for different parts of the process is given to different programs is an implementation detail outside the scope of the C standard.
    cpp basically implemented translation phases 1-4. cc implemented phases
    5-7. The linker implemented phase 8. But those statements are only
    partially accurate, and other implementations divided the tasks
    differently.

    One advantage of having a single program do the whole thing, is that
    error messages can mention the actual text of the line where a problem
    was detected, without any pre-processing applied.

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Scott Lurndal@3:633/10 to All on Thu Jun 4 20:34:23 2026
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 17:47, Scott Lurndal wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 17:18, Scott Lurndal wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 04/06/2026 15:18, Bart wrote:

    (Note that C has its own problems in this area:

        a = b/*p;      // divide b by dereferenced pointer p

    Here, /* also happens to start a block comment.)


    Here you are objectively wrong. C does not have a "problem" with
    this. The parsing rules of the language are clear - often called
    "maximum munch". The character sequence "/*" is the start of a
    comment, it is not two separate operators.

    This is where it falls down. It's very clearly a 'gotcha', and
    consequence of poorly thought-out design.

    It is neither a "gotcha", nor a consequence of poor design.

    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    How does that not make it bad design?

    The preprocessor would strip everything from the /* until the next
    matching */, so a chunk of your program goes missing.

    Whatcha talkin' 'bout willis?


    What were /you/ talking about? What was your point?


    Your inaccurate characterization that a chunk of the program
    went "missing". Nothing meaningful is missing (and the comment
    remains in the original source file).

    So what do you mean, exactly, when you claim that the output of
    the preprocessor causes a chunk of the program (which doesn't
    include whitespace or comments) to go missing?

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Keith Thompson@3:633/10 to All on Thu Jun 4 13:36:52 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <865x3yd21n.fsf@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <86ik81cfk5.fsf_-_@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    [...]
    There's an important distinction to make here. Consider this
    program:

    #include <limits.h>

    int
    foo(){
    int zero = (INT_MAX+1)*0;
    return zero;
    }

    int
    main(){
    return 0;
    }

    This program does not transgress the bounds of undefined behavior.

    To clarify, the comments in my posting were meant to be read as
    saying the given text is the entire program, and that it is strictly
    conforming with respect to conforming hosted implementations.
    (Incidentally, given the rules for freestanding implementations, I'm
    not sure that it is even possible for any program to be strictly
    conforming with respect to conforming freestanding implementations.
    In any case my statements were meant only in the context of hosted
    implementations.)

    Ok.

    [snip]
    Perhaps you mean that this is irrelevant because `foo` is not
    invoked, but I see no reason why that need be the case in e.g.
    a freestanding environment.

    I explained the context of my previous statements above. Sorry for
    not saying that in the original message.

    In a hosted environment, I don't
    think anything explicitly prevents `foo` from being called after
    `main` returns (though I can't imagine that would happen in real
    life; it would be weird if it did).

    The semantics described in the ISO C standard don't admit that
    possibility.

    Could you please point to where it says this, in the C standard?

    I cannot find anything that says that arbitrary code cannot run
    after `main()` returns, and I don't see how that could possibly
    be true.

    N3220 5.1.2.4, Program semantics.

    It defines the *observable behavior* of a program, which consists of
    accesses to volatile objects, data written to files, and I/O dynamics of interactive devices.

    If the usual "Hello, world" program prints "Hello, world" followed
    by "Goodbye", the implementation is non-conforming. If it formats
    my hard drive after printing "Goodbye", it's non-conforming and
    dangerous.

    Whether foo() has external linkage or internal
    linkage doesn't change that.

    I disagree. There's no possible way for the implementation to
    know whether a function with external linkage will be ultimately
    invoked or not; consider a system that supports loadable shared
    modules. Nothing prevents even this simple program from being
    compiled as a shared module, dynamically loaded, the loading
    program explicitly searching for and finding the symbol
    corresponding to the `foo` function, and invoking it.

    Remember that linking is translation phase 8. The compiler is not
    the entire implementation.

    Hence, the compiler _must_ treat with UB as written, which is
    why `ubsan` inserts trapping code in `foo`.

    I don't know what "_must_ treat with UB" means.

    foo() has undefined behavior if it's called, so replacing its
    body with trapping code is valid. But (I'm reasonably sure that)
    an implementation cannot reject a program just because it can't
    prove that it has no undefined behavior during execution. It can
    reject it if it can prove that it *always* has undefined behavior
    during execution.

    In your example, `foo` clearly exhibits UB; I think your
    argument is whether that has a realized effect or not, since the
    UB is not invoked. I'm saying that in general a compiler cannot
    possibly know that when it compiles `foo`, and is free to assume
    the worst.

    foo() exhibits UB if and only if it's called during execution.

    Yes, a compiler can't know whether foo() will be called.
    An implementation, particularly a linker, might know, but is not
    required to. No, it is not free to assume the worst.

    I certainly wouldn't want a compiler to reject `1/time(NULL)`
    because it can't prove that time(NULL) won't be zero, or reject
    `argc+1` because it can't prove that argc < INT_MAX. Code whose
    behavior would be undefined if it were executed has no behavior
    (and therefore no UB) if it's not executed.

    [...]

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Dan Cross@3:633/10 to All on Thu Jun 4 20:41:28 2026
    In article <sglUR.17897$pxGb.10844@fx07.iad>,
    Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <10vsh43$b3is$1@dont-email.me>,
    Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
    On Thu, 04 Jun 2026 16:18:07 +0000, Scott Lurndal wrote:

    [snip]
    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    So, I've looked through "The C Programming Language" (the K&R C)
    and the paper "A Tour Through the Portable C Compiler" (S. C.
    Johnson, circa 1974), and neither document states that the
    preprocessor strips comments. In fact, the mentions of the
    preprocessor are exclusively about the #operation operators,
    and not about C comments.

    The PDP-11 compiler from 5th Edition research Unix removes
    comments in `cc.c`. The 1972 compilers from Dennis Ritchie's
    web page remove them in the compiler proper, as they predated
    the preprocessor: https://www.nokia.com/bell-labs/about/dennis-m-ritchie/primevalC.html

    The v6 cpp.c processes the comments
    and deletes them if the 'passcom' (-C) flag is not set.

    [snip]

    You sure? That looks like V7 code to me.

    - Dan C.


    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Scott Lurndal@3:633/10 to All on Thu Jun 4 20:49:08 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <sglUR.17897$pxGb.10844@fx07.iad>,
    Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <10vsh43$b3is$1@dont-email.me>,
    Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
    On Thu, 04 Jun 2026 16:18:07 +0000, Scott Lurndal wrote:

    [snip]
    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    So, I've looked through "The C Programming Language" (the K&R C)
    and the paper "A Tour Through the Portable C Compiler" (S. C.
    Johnson, circa 1974), and neither document states that the
    preprocessor strips comments. In fact, the mentions of the
    preprocessor are exclusively about the #operation operators,
    and not about C comments.

    The PDP-11 compiler from 5th Edition research Unix removes
    comments in `cc.c`. The 1972 compilers from Dennis Ritchie's
    web page remove them in the compiler proper, as they predated
    the preprocessor: https://www.nokia.com/bell-labs/about/dennis-m-ritchie/primevalC.html

    The v6 cpp.c processes the comments
    and deletes them if the 'passcom' (-C) flag is not set.

    [snip]

    You sure? That looks like V7 code to me.

    Yes, it is. I didn't have a machine readable version of the
    v6 compiler handy. Dug it out and here's the v6 version.


    getch()
    {
        register int c, lastst;

        while ((c=getc1())=='/' && !instring)
        {
            if ((c=getc1())!='*')
            {
                pushback(c);
                return('/');
            }
            if (!skipcom)
                {putc('/',fout); putc('*', fout);}
            lastst=0;
            while ( (c = getc1()) != '\0')
            {
                if (lastst && c=='/')
                {
                    if (!skipcom)
                        putc('/', fout);
                    break;
                }
                if (c=='\n' || !skipcom)
                    putc(c, fout);
                lastst = (c=='*');
            }
            if (c=='\0')break;
        }
        return(c);
    }

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Keith Thompson@3:633/10 to All on Thu Jun 4 14:06:23 2026
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 19:54, David Brown wrote:
    [...]
    Again - /please/ stop trying to guess what people say or put words
    in their mouths. I can't remember ever seeing you do so accurately.

    This is what you actually said:

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more
    parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a purely
    subjective opinion. Even if it is true that this is "commonly agreed
    to" (and AFAIK you have no basis for that claim), that would still be a
    subjective opinion - no matter how common that opinion is.

    You're saying that:

    * "more than needed" is objective
    * "too many" is subjective

    Stop it. He's not saying that.

    You're taking phrases out of context and making false claims that the
    full statement was far more general than it actually was.

    Nobody said or implied that "too many" is always subjective.

    "Too many parentheses" is subjective, because they affect the ease
    of reading the code as a human reader.

    And 'more than needed' isn't that?!

    More than needed *for what*? Without that context, we can't tell
    whether "more than needed" is subjective or objective.

    You know all this.

    [...]

    No, this is just getting ludicrous and suggests not wanting to tackle
    the real subject: should people write '(a << b) & c' or 'a << b & c'?

    Oh, is that the real subject?

    I presume you prefer `(a << b) & c` to `a << b & c`.

    So do I.

    Others might or might not have different opinions. If that was the
    "real subject", we've wasted a lot of time debating the difference
    between subjectivity and objectivity.

    Tim Rentsch I'm sure will prefer the latter because 99.9% of C
    programmers are machines, according to him.

    Tim didn't say or imply that.

    Presumably, the same 99.9% will not use indentation, and will write
    their programs all on one line anyway, because it is still after all completely unambiguous according to the C standard!

    Of course not, because 99.9% of C programmers are not idiots.
    Your record of guessing incorrectly what other people think is
    unbroken. I suggest you stop trying.

    If people are having a debate about some controversial topic, have
    you found that arguing against some unrealistic parody of the other
    person's position is ever useful (unless your goal is to prolong
    the debate)? Stop telling people what they think.

    Tim probably prefers fewer parentheses than most C programmers do.
    You probably prefer more. There *might* be an interesting discussion
    to be had about that difference, but I doubt it.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Keith Thompson@3:633/10 to All on Thu Jun 4 14:16:14 2026
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    [...]
    One advantage of having a single program do the whole thing, is that
    error messages can mention the actual text of the line where a problem
    was detected, without any pre-processing applied.

    Typical preprocessors emit directives that tell the compiler about
    the current file name and line number, precisely so that diagnostic
    messages can refer to the original text.

    For example:

    $ cat hello.c
    #include <stdio.h>
    int main(void) {
    printf("Hello world!\n");
    }
    $ gcc -E hello.c | tail
    extern int __uflow (FILE *);
    extern int __overflow (FILE *, int);
    # 983 "/usr/include/stdio.h" 3 4

    # 2 "hello.c" 2

    # 2 "hello.c"
    int main(void) {
    printf("Hello world!\n");
    }
    $

    The line `# 2 "hello.c"` is, according to the C standard, a
    "non-directive", which is a kind of directive. Executing a
    non-directive has undefined behavior, but gcc apparently treats it
    very much like a #line directive.

    It doesn't really matter whether the preprocessor is a separate program
    or not.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Bart@3:633/10 to All on Thu Jun 4 22:28:30 2026
    On 04/06/2026 21:34, Scott Lurndal wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 17:47, Scott Lurndal wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 17:18, Scott Lurndal wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 04/06/2026 15:18, Bart wrote:

    (Note that C has its own problems in this area:

        a = b/*p;      // divide b by dereferenced pointer p

    Here, /* also happens to start a block comment.)


    Here you are objectively wrong. C does not have a "problem" with
    this. The parsing rules of the language are clear - often called
    "maximum munch". The character sequence "/*" is the start of a
    comment, it is not two separate operators.

    This is where it falls down. It's very clearly a 'gotcha', and
    consequence of poorly thought-out design.

    It is neither a "gotcha", nor a consequence of poor design.

    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    How does that not make it bad design?

    The preprocessor would strip everything from the /* until the next
    matching */, so a chunk of your program goes missing.

    Whatcha talkin' 'bout willis?


    What were /you/ talking about? What was your point?


    Your inaccurate characterization that a chunk of the program
    went "missing". Nothing meaningful is missing (and the comment
    remains in the original source file).

    So what do you mean, exactly, when you claim that the output of
    the preprocessor causes a chunk of the program (which doesn't
    include whitespace or comments) to go missing?

    This is the example I gave elsewhere:

    ---------------------------
    There are actually other issues associated with /**/ comments; here
    someone forgot to terminate the first comment:

    puts("one"); /* comment 1
    puts("two"); /* commmet 2 */
    puts("three"); /* comment 3 */
    ---------------------------

    After preprocessing you're left with this:

    puts("one");
    puts("three");

    That middle puts call is missing, and it's meant to be part of the program.

    This can also be a consequence of an inadvertent /* sequence such as in
    'a = b/*p;'.


    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Bart@3:633/10 to All on Thu Jun 4 22:47:36 2026
    On 04/06/2026 22:06, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 19:54, David Brown wrote:
    [...]
    Again - /please/ stop trying to guess what people say or put words
    in their mouths. I can't remember ever seeing you do so accurately.

    This is what you actually said:

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more
    parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a purely
    subjective opinion. Even if it is true that this is "commonly agreed
    to" (and AFAIK you have no basis for that claim), that would still be a
    subjective opinion - no matter how common that opinion is.

    You're saying that:

    * "more than needed" is objective
    * "too many" is subjective

    Stop it. He's not saying that.

    That is EXACTLY what he's saying: "It is an OBJECTIVE fact .. has more
    ... than needed", and:

    "has too many ... is ... purely subjective".




    You're taking phrases out of context and making false claims that the
    full statement was far more general than it actually was.

    And this is exactly what other people are doing.

    So I used TOO MANY instead of MORE THAN NEEDED to describe the exact
    same phenomenon.

    (1) Why are you all making such a big fucking deal of this?

    (2) Why are you all sticking up for each other?

    (3) Why don't you discuss the fucking subject instead of going down
    these pointless rabbit holes?


    Nobody said or implied that "too many" is always subjective.

    "Too many parentheses" is subjective, because they affect the ease
    of reading the code as a human reader.

    And 'more than needed' isn't that?!

    More than needed *for what*? Without that context, we can't tell
    whether "more than needed" is subjective or objective.

    Jesus, the subthread has been going long enough.

    It is about how many brackets are too many, more than needed,
    superfluous to requirements, etc etc etc.

    Yes, I've finally broken and refuse to call round brackets 'parentheses' anymore.

    Except that I really no longer care. Do whatever the hell you like with
    your fucking language.

    This is not a civil discussion forum, it is a bear-pit.




    You know all this.

    [...]

    No, this is just getting ludicrous and suggests not wanting to tackle
    the real subject: should people write '(a << b) & c' or 'a << b & c'?

    Oh, is that the real subject?

    I presume you prefer `(a << b) & c` to `a << b & c`.

    So do I.

    Others might or might not have different opinions. If that was the
    "real subject", we've wasted a lot of time debating the difference
    between subjectivity and objectivity.

    Tim Rentsch I'm sure will prefer the latter because 99.9% of C
    programmers are machines, according to him.

    Tim didn't say or imply that.

    So what was his 99.9% all about? Nobody has a clue, except they are
    certain that what I think it is is wrong!


    Presumably, the same 99.9% will not use indentation, and will write
    their programs all on one line anyway, because it is still after all
    completely unambiguous according to the C standard!

    Of course not, because 99.9% of C programmers are not idiots.
    Your record of guessing incorrectly what other people think is
    unbroken. I suggest you stop trying.

    This is what Tim said:

    "If someone really can't learn the rules of expression syntax for the
    language they are using, they should be advised to try a different
    language, or perhaps give up programming altogether. It's silly to
    worry about something that 999 people out of a 1000 (and the actual
    numbers are undoubtedly much higher) are able to navigate without
    difficulty."

    It sounds to me very much as though he expects 99.9% to know all C's
    precedences by heart and to never need to use superfluous brackets (or
    'more than needed' if 'superfluous' is still too subjective).

    But of course, I am wrong and he is right, and you will defend his view
    (a subjective one) to the death.


  • From Scott Lurndal@3:633/10 to All on Thu Jun 4 21:58:14 2026
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 21:34, Scott Lurndal wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 17:47, Scott Lurndal wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 17:18, Scott Lurndal wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 04/06/2026 15:18, Bart wrote:

    (Note that C has its own problems in this area:

        a = b/*p;      // divide b by dereferenced pointer p
    Here, /* also happens to start a block comment.)


    Here you are objectively wrong. C does not have a "problem" with
    this. The parsing rules of the language are clear - often called
    "maximum munch". The character sequence "/*" is the start of a
    comment, it is not two separate operators.

    This is where it falls down. It's very clearly a 'gotcha', and a
    consequence of poorly thought-out design.

    It is neither a "gotcha", nor a consequence of poor design.

    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    How does that not make it bad design?

    The preprocessor would strip everything from the /* until the next
    matching */, so a chunk of your program goes missing.

    Whatcha talkin' 'bout willis?


    What were /you/ talking about? What was your point?


    Your inaccurate characterization that a chunk of the program
    went "missing". Nothing meaningful is missing (and the comment
    remains in the original source file).

    So what do you mean, exactly, when you claim that the output of
    the preprocessor causes a chunk of the program (which doesn't
    include whitespace or comments) to go missing?

    This is the example I gave elsewhere:

    ---------------------------
    There are actually other issues associated with /**/ comments; here
    someone forgot to terminate the first comment:

    puts("one"); /* comment 1
    puts("two"); /* commmet 2 */
    puts("three"); /* comment 3 */
    ---------------------------

    After preprocessing you're left with this:

    puts("one");
    puts("three");

    That middle puts call is missing, and it's meant to be part of the program.

    Of course. There is no functional difference between removing the commented text in cpp and leaving it in. In both cases, puts("two"); will be treated as a comment and will be ignored by the rest of the compiler.


  • From Chris M. Thomasson@3:633/10 to All on Thu Jun 4 15:25:08 2026
    On 6/4/2026 12:29 PM, Bart wrote:
    [...]

    And 'more than needed' isn't that?!

    All hail extra ()'s! :^)

    ((branch) ? (cond0) : (cond1))

    Well, I like to make my ? operators explicitly separated with extra
    ()'s... I basically never use (?:) anyway. Some times I did in a crazy
    macro expression along the lines of the chaos PP lib... Oh my.

    ;^o

    [...]
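One place where extra ()'s around `?:` are more than decoration is inside macros. A small sketch, assuming two hypothetical macros `MAX_BARE` and `MAX_PAREN` that are not from any library, shows how the unparenthesised form regroups when it is used inside a larger expression:

```c
#include <assert.h>

// Hypothetical macros for illustration only.
#define MAX_BARE(a, b)   a > b ? a : b
#define MAX_PAREN(a, b)  ((a) > (b) ? (a) : (b))

// Expands to 2 * 3 > 5 ? 3 : 5, i.e. (6 > 5) ? 3 : 5
int scaled_bare(void)  { return 2 * MAX_BARE(3, 5); }

// Expands to 2 * ((3) > (5) ? (3) : (5)), i.e. 2 * 5
int scaled_paren(void) { return 2 * MAX_PAREN(3, 5); }

void check_macros(void)
{
    assert(scaled_bare() == 3);
    assert(scaled_paren() == 10);
}
```

In ordinary expressions the same parentheses are a readability choice; in a macro body they decide what the expansion means.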

  • From Richard Harnden@3:633/10 to All on Thu Jun 4 23:25:53 2026
    On 04/06/2026 22:28, Bart wrote:
    There are actually other issues associated with /**/ comments; here
    someone forgot to terminate the first comment:

        puts("one");    /* comment 1
        puts("two");    /* commmet 2 */
        puts("three");  /* comment 3 */

    I get ...

    $ gcc x.c
    x.c:6:21: warning: '/*' within block comment [-Wcomment]
    6 | puts("two"); /* commmet 2 */
    | ^
    1 warning generated.

    ... so I don't see it as a big deal.

    It's up there with typing 'foo():' when I meant 'foo();' - I'll get lots
    of errors which all boil down to my inability to release the shift-key.
    I don't blame the keyboard layout. Or the font. Or my eyesight.

    I'm sure there are plenty of editors that have a nested-comments angry
    colour.
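For anyone who wants to reproduce the effect, here is a hedged sketch of the same three-line example wrapped in a function that counts how many calls survive. The names `say`, `calls_made` and `run_demo` are made up for this illustration; compiling it with gcc should produce the same -Wcomment warning as shown above.

```c
#include <assert.h>
#include <stdio.h>

int calls_made = 0;

// Print a line and record that the call really happened.
void say(const char *s)
{
    assert(s != NULL);
    puts(s);
    calls_made++;
}

// Returns the number of say() calls that were actually compiled in.
int run_demo(void)
{
    calls_made = 0;
    say("one");     /* comment 1: terminator forgotten
    say("two");     /* comment 2 */
    say("three");   /* comment 3 */
    return calls_made;
}
```

`run_demo()` returns 2: the middle call sits inside the first comment, which only ends at the terminator that was written for comment 2.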

  • From Janis Papanagnou@3:633/10 to All on Fri Jun 5 00:27:00 2026
    On 2026-06-04 23:47, Bart wrote:

    Jesus, the subthread has been going long enough.

    I'd dare to say that there's an extremely high chance
    that *everyone* in this group is agreeing with you on
    this statement! - I suggest pinning it at the wall. :-)

    Janis

  • From Keith Thompson@3:633/10 to All on Thu Jun 4 16:09:28 2026
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 22:06, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 19:54, David Brown wrote:
    [...]
    Again - /please/ stop trying to guess what people say or put words
    in their mouths. I can't remember ever seeing you do so accurately.

    This is what you actually said:

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more
    parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a purely
    subjective opinion. Even if it is true that this is "commonly agreed
    to" (and AFAIK you have no basis for that claim), that would still be a
    subjective opinion - no matter how common that opinion is.

    You're saying that:

    * "more than needed" is objective
    * "too many" is subjective
    Stop it. He's not saying that.

    That is EXACTLY what he's saying: "It is an OBJECTIVE fact .. has more
    ... than needed", and:

    "has too many ... is ... purely subjective".

    You're taking phrases out of context and making false claims that the
    full statement was far more general than it actually was.

    And this is exactly what other people are doing.

    Taken literally, your statement implies that you admit that that's
    what you're doing. Is that what you meant? If so, I suggest you
    *stop* making such false claims. If not, what did you actually mean?

    So I used TOO MANY instead of MORE THAN NEEDED to describe the exact
    same phenomenon.

    That's not the problem. There is an actual meaningful distinction
    here, between what's needed by the compiler and what's useful to
    improve clarity for human readers. I have found some of what you've
    written to be unclear about that distinction.

    Can we agree that the question of whether parentheses in a C
    expression are necessary to the compiler can be answered objectively?
    Can we agree that the question of whether extra parentheses are
    helpful to a human reader is at least partly subjective, and
    varies from case to case? Is there really anything else that we
    fundamentally disagree about?

    (1) Why are you all making such a big fucking deal of this?

    Why are you?

    (2) Why are you all sticking up for each other?

    Most of us happen to agree with each other on most of the points being discussed. I'm not "sticking up" for anyone. I have expressed
    disagreement in this thread with people other than you.

    (3) Why don't you discuss the fucking subject instead of going
    down these pointless rabbit holes?

    OK, what subject do you want to discuss? Please be clear and specific.

    [...]

    It is about how many brackets are too many, more than needed,
    superfluous to requirements, etc etc etc.

    There is of course no objective answer to that, only opinions.
    A substantial percentage of this thread has been about exactly
    what you now say you want it to be about. I've said myself that
    I think the parentheses in `(a*a) + (b*b)` are excessive, but the
    parentheses in `(a << b) & c` are appropriate.

    ?
    ?
    ?
    That (pointing to the previous paragraph) was me talking about exactly
    what you want us to be talking about. Consider acknowledging that.

    [...]

    Presumably, the same 99.9% will not use indentation, and will write
    their programs all on one line anyway, because it is still after all
    completely unambiguous according to the C standard!

    Of course not, because 99.9% of C programmers are not idiots.
    Your record of guessing incorrectly what other people think is
    unbroken. I suggest you stop trying.

    This is what Tim said:

    "If someone really can't learn the rules of expression syntax for the language they are using, they should be advised to try a different
    language, or perhaps give up programming altogether. It's silly to
    worry about something that 999 people out of a 1000 (and the actual
    numbers are undoubtedly much higher) are able to navigate without difficulty."

    And you inferred from that that he opposes using indentation.

    https://en.wikipedia.org/wiki/Straw_man

    Or maybe you were being figurative, but I honestly can't tell.

    It sounds to me very much as though he expects 99.9% to know all C's
    precedences by heart and to never need to use superfluous brackets (or
    'more than needed' if 'superfluous' is still too subjective).

    But of course, I am wrong and he is right, and you will defend his
    view (a subjective one) to the death.

    Nope.

    I don't know whether that's his opinion or not. Perhaps you haven't
    noticed that I don't always agree with Tim. I don't know whether
    he thinks that the parentheses in `(a << b) & c` are excessive, or
    whether he finds `a << b & c` clearer. He can certainly express his
    own opinion if he wants to. If he thinks (subjectively) that those
    parentheses are excessive, then I (subjectively) disagree with him.
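The objective half of this question can be checked mechanically. The sketch below (function names are hypothetical, chosen only for this illustration) confirms that `a << b & c` groups as `(a << b) & c`, and that the parentheses in `(a*a) + (b*b)` do not change the value; whether either form reads better remains a matter of opinion.

```c
#include <assert.h>

// No parentheses: << binds tighter than &, so this is (a << b) & c.
unsigned bare(unsigned a, unsigned b, unsigned c)        { return a << b & c; }

// The grouping the language actually uses, spelled out.
unsigned shift_first(unsigned a, unsigned b, unsigned c) { return (a << b) & c; }

// The other grouping a reader might wrongly assume.
unsigned mask_first(unsigned a, unsigned b, unsigned c)  { return a << (b & c); }

void check_grouping(void)
{
    assert(bare(1u, 4u, 0xF0u) == shift_first(1u, 4u, 0xF0u));
    assert(bare(1u, 4u, 0xF0u) != mask_first(1u, 4u, 0xF0u));
    assert(3*3 + 4*4 == (3*3) + (4*4));
}
```

With `a = 1`, `b = 4`, `c = 0xF0` the first two functions give `0x10` and the third gives `1`, which is the kind of silent difference that makes some readers prefer the explicit form.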

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Bart@3:633/10 to All on Fri Jun 5 00:44:45 2026
    On 05/06/2026 00:09, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 22:06, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 19:54, David Brown wrote:
    [...]
    Again - /please/ stop trying to guess what people say or put words
    in their mouths. I can't remember ever seeing you do so accurately.

    This is what you actually said:

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more
    parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a purely
    subjective opinion. Even if it is true that this is "commonly agreed
    to" (and AFAIK you have no basis for that claim), that would still be a
    subjective opinion - no matter how common that opinion is.

    You're saying that:

    * "more than needed" is objective
    * "too many" is subjective
    Stop it. He's not saying that.

    That is EXACTLY what he's saying: "It is an OBJECTIVE fact .. has more
    ... than needed", and:

    "has too many ... is ... purely subjective".

    You're taking phrases out of context and making false claims that the
    full statement was far more general than it actually was.

    And this is exactly what other people are doing.

    Taken literally, your statement implies that you admit that that's
    what you're doing. Is that what you meant? If so, I suggest you
    *stop* making such false claims. If not, what did you actually mean?

    So I used TOO MANY instead of MORE THAN NEEDED to describe the exact
    same phenomenon.

    That's not the problem. There is an actual meaningful distinction
    here, between what's needed by the compiler and what's useful to
    improve clarity for human readers. I have found some of what you've
    written to be unclear about that distinction.

    Can we agree that the question of whether parentheses in a C
    expression are necessary to the compiler can be answered objectively?
    Can we agree that the question of whether extra parentheses are
    helpful to a human reader is at least partly subjective, and
    varies from case to case? Is there really anything else that we fundamentally disagree about?

    (1) Why are you all making such a big fucking deal of this?

    Why are you?

    I didn't start this business of something being subjective or objective,
    or suggesting that one turn of phrase to discuss the same thing was
    subjective and the other objective (implying that a subjective opinion
    had less worth). TR started that and several people backed him up.

    Myself I wouldn't even use those terms. My point was that some overuses
    of () for commonly known precedences are more overkill than others.

    If that's subjective then so be it; it is not some fundamental law of
    the universe. I would just call it common sense.

    Why are you?

    Since you ask, I was defending my point of view then got sidetracked by
    this subjective/objective nonsense. I notice that TR has disappeared
    from this subthread.


  • From Dan Cross@3:633/10 to All on Thu Jun 4 23:49:43 2026
    In article <10vsnl7$lkmu$1@kst.eternal-september.org>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <865x3yd21n.fsf@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <86ik81cfk5.fsf_-_@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    [...]
    There's an important distinction to make here. Consider this
    program:

    #include <limits.h>

    int
    foo(){
        int zero = (INT_MAX+1)*0;
        return zero;
    }

    int
    main(){
        return 0;
    }

    This program does not transgress the bounds of undefined behavior.

    To clarify, the comments in my posting were meant to be read as
    saying the given text is the entire program, and that it is strictly
    conforming with respect to conforming hosted implementations.
    (Incidentally, given the rules for freestanding implementations, I'm
    not sure that it is even possible for any program to be strictly
    conforming with respect to conforming freestanding implementations.
    In any case my statements were meant only in the context of hosted
    implementations.)

    Ok.

    [snip]
    Perhaps you mean that this is irrelevant because `foo` is not
    invoked, but I see no reason why that need be the case in e.g.
    a freestanding environment.

    I explained the context of my previous statements above. Sorry for
    not saying that in the original message.

    In a hosted environment, I don't
    think anything explicitly prevents `foo` from being called after
    `main` returns (though I can't imagine that would happen in real
    life; it would be weird if it did).

    The semantics described in the ISO C standard don't admit that
    possibility.

    Could you please point to where it says this, in the C standard?

    I cannot find anything that says that arbitrary code cannot run
    after `main()` returns, and I don't see how that could possibly
    be true.

    N3220 5.1.2.4, Program semantics.

    It defines the *observable behavior* of a program, which consists of
    accesses to volatile objects, data written to files, and I/O dynamics of
    interactive devices.

    Yes, but it does so for strictly-conforming programs with no UB.

    To understand conformance, we have to jump over to section 4,
    which explicitly says that, 'Undefined behavior is otherwise
    indicated in this document by the words "undefined behavior" or
    by the omission of any explicit definition of behavior.' As it
    does not say that a program with an instance of undefined
    behavior in an integer constant expression that is not executed
    must otherwise behave in any given manner, what the program does
    is undefined. A constraint violation mandates a diagnostic, but
    beyond that, the standard is (AFAICT) silent.

    Undefined Behavior, in turn, is not defined as specific only to
    execution: the standard simply says that it is "behavior, upon
    use of a *nonportable or erroneous program construct*..." for
    which there are no requirements, and there are examples of
    things that are explicitly UB at translation time, such as
    improperly terminated lexemes and so forth.

    Furthermore, the expression above is obviously an integer
    constant expression as defined by sec 6.6 para 8. Section 6.6,
    para 4, reads in part, "Each constant expression shall evaluate
    to a constant that is in the range of representable values for
    its type." The expression, `(INT_MAX+1)*0` violates this
    constraint, and so therefore a diagnostic is mandated as per
    sec 5.1.1.3 para 1. That it appears in code that is not
    obviously called from `main` doesn't change that.

    Moreover, sec 6.6 para 17 says that, "the semantic rules for
    evaluation of a constant expression are the same as for
    nonconstant expressions." This brings us back to 5.1.2.4,
    though I submit that para (4) is a stronger argument for what
    you and Tim are saying, as it reads in part, "An actual
    implementation is not required to evaluate part of an expression
    if it can deduce that its value is not used and that no needed
    side effects are produced (including any caused by calling a
    function or through volatile access to an object)." I interpret
    this to mean that, if the implementation can determine that
    there is no way that `foo` can be called, it does not _have_ to
    evaluate the above expression. However, it must satisfy the
    range constraint from section 6.6, so it likely will, and in any
    event, the standard does not say that it, "shall not" evaluate
    it, or when.

    Once the compiler does that, if it does, and observes UB, the
    standard is silent on what requirements it imposes, which means
    the behavior is undefined. I see no reason it couldn't arrange
    to invoke `foo` at that point.

    So no, I do not see how execution according to the rules of the
    abstract machine is not guaranteed, here. I certainly see no
    way in which this can be regarded as a strictly conforming
    program.

    If the usual "Hello, world" program prints "Hello, world" followed
    by "Goodbye", the implementation is non-conforming. If it formats
    my hard drive after printing "Goodbye", it's non-conforming and
    dangerous.

    Two separate things. My point earlier was that code can
    obviously run after `main` terminates. Moreover, I can't
    imagine what would _prevent_ a runtime system that invokes
    `main` from doing something like printing, "PROGRAM STOPPED"
    after `main` returned. C imposes no requirements here.

    Whether `foo` could be invoked after, I think, is undefined.

    Whether foo() has external linkage or internal
    linkage doesn't change that.

    I disagree. There's no possible way for the implementation to
    know whether a function with external linkage will be ultimately
    invoked or not; consider a system that supports loadable shared
    modules. Nothing prevents even this simple program from being
    compiled as a shared module, dynamically loaded, the loading
    program explicitly searching for and finding the symbol
    corresponding to the `foo` function, and invoking it.

    Remember that linking is translation phase 8. The compiler is not
    the entire implementation.

    Exactly my point. The compiler cannot know how `foo` might be
    used, or how the translated object might be exercised. I don't
    see how it could possibly know that, given that `foo` has
    external linkage.

    Hence, the compiler _must_ treat with UB as written, which is
    why `ubsan` inserts trapping code in `foo`.

    I don't know what "_must_ treat with UB" means.

    foo() has undefined behavior if it's called, so replacing its
    body with trapping code is valid. But (I'm reasonably sure that)
    an implementation cannot reject a program just because it can't
    prove that it has no undefined behavior during execution. It can
    reject it if it can prove that it *always* has undefined behavior
    during execution.

    What I'm saying is that, `foo` has undefined behavior _period_.
    That's manifest in an integer constant expression, whether it is
    executed at runtime or not. I believe that the standard forces
    the expression to be evaluated at translation time, via the
    "shall" mandate when checking the constraint on the range in sec
    6.6 para 4. Further, that evaluation must happen in accordance
    with the rules of the abstract machine, as per 5.1.2.4 para 17.
    The diagnostic is mandated, as is the translation-time
    evaluation. The expression itself manifestly exhibits UB,
    and so therefore the result of the rest of the translation is
    undefined.

    I could be wrong; this is all excessively pedantic. And of
    course, if an implementation does something silly and emits
    garbage for Tim's program, then I argue it should be chucked
    onto the dustbin of excessive fawning over the standard. But
    I'm not convinced that the standard _prohibits_ such an extreme
    interpretation.

    In your example, `foo` clearly exhibits UB; I think your
    argument is whether that has a realized effect or not, since the
    UB is not invoked. I'm saying that in general a compiler cannot
    possibly know that when it compiles `foo`, and is free to assume
    the worst.

    foo() exhibits UB if and only if it's called during execution.

    Yes, a compiler can't know whether foo() will be called.
    An implementation, particularly a linker, might know, but is not
    required to. No, it is not free to assume the worst.

    See above.

    I certainly wouldn't want a compiler to reject `1/time(NULL)`
    because it can't prove that time(NULL) won't be zero, or reject
    `argc+1` because it can't prove that argc < INT_MAX. Code whose
    behavior would be undefined if it were executed has no behavior
    (and therefore no UB) if it's not executed.

    That's categorically different; what you are describing are what
    Regehr calls, "Type-2" functions, and I agree with you for
    those.

    The program that Tim posted has a "Type-3" function, and
    constraints dictate that the UB expression must be evaluated at
    translation time, and a diagnostic emitted. In the most
    charitable interpretation, it cannot be considered a strictly
    conforming program, even if the implementation is smart enough
    to avoid evaluating the constant expression, as it is
    unspecified whether it's evaluated or not, and strictly
    conforming programs shall not rely on unspecified behavior.
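For the run-time cases Keith mentions (`1/time(NULL)`, `argc+1`), the usual way to keep the problematic expression from ever being evaluated is a guard in front of it. The following is only a sketch with hypothetical helper names (`checked_increment`, `checked_divide`); it illustrates the point that the operation has no behavior, defined or otherwise, on paths where it is not executed.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

// Stores x + 1 in *out and returns 1, or returns 0 if x + 1 would overflow.
int checked_increment(int x, int *out)
{
    assert(out != NULL);
    if (x == INT_MAX)
        return 0;           // x + 1 is never evaluated on this path
    *out = x + 1;
    return 1;
}

// Stores num / den in *out and returns 1, or returns 0 if the division
// would be undefined (zero divisor, or INT_MIN / -1).
int checked_divide(int num, int den, int *out)
{
    assert(out != NULL);
    if (den == 0 || (num == INT_MIN && den == -1))
        return 0;
    *out = num / den;
    return 1;
}
```

This says nothing either way about the translation-time question debated above, where the operands are constants and no guard is possible.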

    - Dan C.


  • From Dan Cross@3:633/10 to All on Fri Jun 5 00:02:26 2026
    In article <10vspuu$lkmu$3@kst.eternal-september.org>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    [...]
    One advantage of having a single program do the whole thing, is that
    error messages can mention the actual text of the line where a problem
    was detected, without any pre-processing applied.

    Typical preprocessors emit directives that tell the compiler about
    the current file name and line number, precisely so that diagnostic
    messages can refer to the original text.

    For example:

    $ cat hello.c
    #include <stdio.h>
    int main(void) {
    printf("Hello world!\n");
    }
    $ gcc -E hello.c | tail
    extern int __uflow (FILE *);
    extern int __overflow (FILE *, int);
    # 983 "/usr/include/stdio.h" 3 4

    # 2 "hello.c" 2

    # 2 "hello.c"
    int main(void) {
    printf("Hello world!\n");
    }
    $

    The line `# 2 "hello.c"` is, according to the C standard, a
    "non-directive", which is a kind of directive. Executing a
    non-directive has undefined behavior, but gcc apparently treats it
    very much like a #line directive.

    It doesn't really matter whether the preprocessor is a separate program
    or not.

    In fairness to Kuyper, however, the *text* from the original
    source file is lost. E.g.,

    term% cat n.c
    #include <stdio.h>
    #define FOO "hi"; // Note trailing `;`
    int
    main(void)
    {
    printf("%s\n", FOO);
    return 0;
    }
    term% clang -fkeep-system-includes -E n.c
    # 1 "n.c"
    # 1 "<built-in>" 1
    # 1 "<command line>" 1
    # 1 "<built-in>" 2
    # 1 "n.c" 2
    #include <stdio.h> /* clang -E -fkeep-system-includes */
    # 1 "n.c"
    # 2 "n.c" 2

    int
    main(void)
    {
    printf("%s\n", "hi";);
    return 0;
    }
    term%

    In this example, the preprocessor macro `FOO` has been lost, and
    only its expansion remains. The compiler has no information to
    give a useful diagnostic.
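The standard counterpart of the `# 2 "hello.c"` markers is the `#line` directive, and its effect can be observed from inside a program. A minimal sketch, assuming a made-up file name `original.c` and hypothetical function names, showing that line and file information (but not macro text) is what gets carried across:

```c
#include <assert.h>
#include <string.h>

// #line renumbers the line that follows it and renames the file.
int line_seen(void)
{
#line 100 "original.c"
    return __LINE__;
}

// From here on __FILE__ reports the name given in the directive.
const char *file_seen(void)
{
    return __FILE__;
}

void check_line_control(void)
{
    assert(line_seen() == 100);
    assert(strcmp(file_seen(), "original.c") == 0);
}
```

A diagnostic issued after the directive would likewise be attributed to `original.c`, which is how a separate compiler pass can point back at the right source line even though it never saw the unexpanded text.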

    - Dan C.


  • From Dan Cross@3:633/10 to All on Fri Jun 5 00:03:11 2026
    In article <8xlUR.17899$pxGb.16870@fx07.iad>,
    Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <sglUR.17897$pxGb.10844@fx07.iad>,
    Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <10vsh43$b3is$1@dont-email.me>,
    Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
    On Thu, 04 Jun 2026 16:18:07 +0000, Scott Lurndal wrote:

    [snip]
    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    So, I've looked through "The C Programming Language" (the K&R C)
    and the paper "A Tour Through the Portable C Compiler" (S. C.
    Johnson, circa 1974), and neither document states that the
    preprocessor strips comments. In fact, the mentions of the
    preprocessor are exclusively about the #operation operators,
    and not about C comments.

    The PDP-11 compiler from 5th Edition research Unix removes
    comments in `cc.c`. The 1972 compilers from Dennis Ritchie's
    web page remove them in the compiler proper, as they predated
    the preprocessor:
    https://www.nokia.com/bell-labs/about/dennis-m-ritchie/primevalC.html

    The v6 cpp.c processes the comments
    and deletes them if the 'passcom' (-C) flag is not set.

    [snip]

    You sure? That looks like V7 code to me.

    Yes, it is. I didn't have a machine readable version of the
    v6 compiler handy. Dug it out and here's the v6 version.


    getch()
    {
        register int c, lastst;

        while ((c=getc1())=='/' && !instring)
        {
            if ((c=getc1())!='*')
            {
                pushback(c);
                return('/');
            }
            if (!skipcom)
                {putc('/',fout); putc('*', fout);}
            lastst=0;
            while ( (c = getc1()) != '\0')
            {
                if (lastst && c=='/')
                {
                    if (!skipcom)
                        putc('/', fout);
                    break;
                }
                if (c=='\n' || !skipcom)
                    putc(c, fout);
                lastst = (c=='*');
            }
            if (c=='\0')break;
        }
        return(c);
    }

    Yeah, that's from `cc.c`, right?

    I think 7th Ed was the first where `cpp` was liberated from the
    compiler proper (or the driver, anyway).

    - Dan C.


  • From Scott Lurndal@3:633/10 to All on Fri Jun 5 00:18:05 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <8xlUR.17899$pxGb.16870@fx07.iad>,
    Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <sglUR.17897$pxGb.10844@fx07.iad>,
    Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <10vsh43$b3is$1@dont-email.me>,
    Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
    On Thu, 04 Jun 2026 16:18:07 +0000, Scott Lurndal wrote:

    [snip]
    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    So, I've looked through "The C Programming Language" (the K&R C)
    and the paper "A Tour Through the Portable C Compiler" (S. C.
    Johnson, circa 1974), and neither document states that the
    preprocessor strips comments. In fact, the mentions of the
    preprocessor are exclusively about the #operation operators,
    and not about C comments.

    The PDP-11 compiler from 5th Edition research Unix removes
    comments in `cc.c`. The 1972 compilers from Dennis Ritchie's
    web page remove them in the compiler proper, as they predated
    the preprocessor:
    https://www.nokia.com/bell-labs/about/dennis-m-ritchie/primevalC.html

    The v6 cpp.c processes the comments
    and deletes them if the 'passcom' (-C) flag is not set.

    [snip]

    You sure? That looks like V7 code to me.

    Yes, it is. I didn't have a machine readable version of the
    v6 compiler handy. Dug it out and here's the v6 version.


    getch()
    {
    register int c, lastst;

    while ((c=getc1())=='/' && !instring)
    {
    if ((c=getc1())!='*')
    {
    pushback(c);
    return('/');
    }
    if (!skipcom)
    {putc('/',fout); putc('*', fout);}
    lastst=0;
    while ( (c = getc1()) != '\0')
    {
    if (lastst && c=='/')
    {
    if (!skipcom)
    putc('/', fout);
    break;
    }
    if (c=='\n' || !skipcom)
    putc(c, fout);
    lastst = (c=='*');
    }
    if (c=='\0')break;
    }
    return(c);
    }

    Yeah, that's from `cc.c`, right?

    No, it's from cpp.c

    $ ls /work/reference/collegetapes/sltape/v6cc/
    c0.c c00.c c01.c c02.c c03.c c04.c c05.c c1.h
    c10.c c11.c c12.c c13.c c2.h c20.c c21.c cc.c cpp.c
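
For readers who find the v6 listing hard to follow, here is a minimal modern-C sketch of the same idea: drop block comments, leave string literals alone, and keep newlines found inside comments so that line numbers stay aligned. It is not the v6 code. The name `strip_comments` and the buffer-to-buffer interface are assumptions made for illustration, and character constants are deliberately not handled.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from any Unix source tree.
   Copies src to dst without its block comments.  dst must have room
   for strlen(src) + 1 bytes.  Returns the number of bytes written,
   not counting the terminating NUL. */
size_t strip_comments(const char *src, char *dst)
{
    size_t n = 0;
    int in_string = 0;              /* plays the role of `instring` */

    while (*src != '\0') {
        if (in_string) {
            if (*src == '\\' && src[1] != '\0')
                dst[n++] = *src++;  /* copy the backslash; the escaped
                                       character is copied just below */
            else if (*src == '"')
                in_string = 0;      /* closing quote ends the literal */
            dst[n++] = *src++;
        } else if (src[0] == '/' && src[1] == '*') {
            src += 2;               /* skip the comment opener */
            while (*src != '\0' && !(src[0] == '*' && src[1] == '/')) {
                if (*src == '\n')
                    dst[n++] = '\n';    /* preserve line structure */
                src++;
            }
            if (*src != '\0')
                src += 2;           /* skip the comment terminator */
        } else {
            if (*src == '"')
                in_string = 1;      /* opening quote starts a literal */
            dst[n++] = *src++;
        }
    }
    dst[n] = '\0';
    return n;
}
```

As in the listing above, an unterminated comment simply swallows the rest of the input; a real preprocessor would diagnose that.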


  • From Chris M. Thomasson@3:633/10 to All on Thu Jun 4 17:26:11 2026
    On 6/4/2026 4:44 PM, Bart wrote:
    On 05/06/2026 00:09, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 22:06, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 19:54, David Brown wrote:
    [...]
    Again - /please/ stop trying to guess what people say or put words
    in their mouths. I can't remember ever seeing you do so accurately.
    This is what you actually said:

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more
    parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a
    purely subjective opinion. Even if it is true that this is "commonly
    agreed to" (and AFAIK you have no basis for that claim), that would
    still be a subjective opinion - no matter how common that opinion is.

    You're saying that:

    * "more than needed" is objective
    * "too many" is subjective

    Stop it. He's not saying that.

    That is EXACTLY what he's saying: "It is an OBJECTIVE fact .. has more
    ... than needed", and:

      "has too many ... is ... purely subjective".

    You're taking phrases out of context and making false claims that the
    full statement was far more general than it actually was.

    And this is exactly what other people are doing.

    Taken literally, your statement implies that you admit that that's
    what you're doing. Is that what you meant? If so, I suggest you
    *stop* making such false claims. If not, what did you actually mean?

    So I used TOO MANY instead of MORE THAN NEEDED to describe the exact
    same phenomenon.

    That's not the problem. There is an actual meaningful distinction
    here, between what's needed by the compiler and what's useful to
    improve clarity for human readers. I have found some of what you've
    written to be unclear about that distinction.

    Can we agree that the question of whether parentheses in a C
    expression are necessary to the compiler can be answered objectively?
    Can we agree that the question of whether extra parentheses are
    helpful to a human reader is at least partly subjective, and
    varies from case to case? Is there really anything else that we
    fundamentally disagree about?

    (1) Why are you all making such a big fucking deal of this?

    Why are you?

    I didn't start this business of something being subjective or objective,
    or suggesting that one turn of phrase to discuss the same thing was
    subjective and the other objective (implying that a subjective opinion
    had less worth). TR started that and several people backed him up.

    Myself I wouldn't even use those terms. My point was that some overuses
    of () for commonly known precedences are more overkill than others.

    If that's subjective then so be it; it is not some fundamental law of
    the universe. I would just call it common sense.

    Why are you?

    Since you ask, I was defending my point of view then got sidetracked by
    this subjective/objective nonsense. I notice that TR has disappeared
    from this subthread.


    Wrt the number of ()'s? Might as well go to sleep with the following
    song playing in the background:

    (The Fate of Ophelia - Taylor Swift (Lyrics) Charlie Puth ft. Selena
    Gomez, the weekd, ariana grande)

    https://youtu.be/yleL-JbEHc8?list=RDyleL-JbEHc8



  • From Keith Thompson@3:633/10 to All on Thu Jun 4 18:04:38 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <10vsnl7$lkmu$1@kst.eternal-september.org>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <865x3yd21n.fsf@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <86ik81cfk5.fsf_-_@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    [...]
    There's an important distinction to make here. Consider this
    program:

    #include <limits.h>

    int
    foo(){
    int zero = (INT_MAX+1)*0;
    return zero;
    }

    int
    main(){
    return 0;
    }

    This program does not transgress the bounds of undefined behavior.

    To clarify, the comments in my posting were meant to be read as
    saying the given text is the entire program, and that it is strictly
    conforming with respect to conforming hosted implementations.
    (Incidentally, given the rules for freestanding implementations, I'm
    not sure that it is even possible for any program to be strictly
    conforming with respect to conforming freestanding implementations.
    In any case my statements were meant only in the context of hosted
    implementations.)

    Ok.

    [snip]
    Perhaps you mean that this is irrelevant because `foo` is not
    invoked, but I see no reason why that need be the case in e.g.
    a freestanding environment.

    I explained the context of my previous statements above. Sorry for
    not saying that in the original message.

    In a hosted environment, I don't
    think anything explicitly prevents `foo` from being called after
    `main` returns (though I can't imagine that would happen in real
    life; it would be weird if it did).

    The semantics described in the ISO C standard don't admit that
    possibility.

    Could you please point to where it says this, in the C standard?

    I cannot find anything that says that arbitrary code cannot run
    after `main()` returns, and I don't see how that could possibly
    be true.

    N3220 5.1.2.4, Program semantics.

    It defines the *observable behavior* of a program, which consists of
    accesses to volatile objects, data written to files, and I/O dynamics of
    interactive devices.

    Yes, but it does so for strictly-conforming programs with no UB.

    It does so for programs in general, not just strictly conforming
    ones. If a program has undefined behavior, all bets are off,
    but for example a program that evaluates `printf("%d\n", INT_MAX)`
    is not strictly conforming, but it's fully subject to 5.1.2.4.

    To understand conformance, we have to jump over to section 4,
    which explicitly says that, 'Undefined behavior is otherwise
    indicated in this document by the words "undefined behavior" or
    by the omission of any explicit definition of behavior.' As it
    does not say that a program with an instance of undefined
    behavior in an integer constant expression that is not executed
    must otherwise behave in any given manner, what the program does
    is undefined. A constraint violation mandates a diagnostic, but
    beyond that, the standard is (AFAICT) silent.

    I don't think an integer constant expression can have undefined
    behavior. INT_MAX+1 and 1/0 are not constant expressions, because
    neither "evaluate(s) to a constant that is in the range of
    representable values for its type".

    I claim that an expression that looks like a constant expression
    *isn't* a constant-expression if it doesn't appear in a context
    that requires a constant-expression.

    The program in question, quoted above, has:

    int zero = (INT_MAX+1)*0;

    `(INT_MAX+1)*0` is not a constant expression, not because of the
    overflow, but because a constant expression is not required in
    that context. "constant-expression" is defined by a production in
    the grammar (it reduces to "conditional-expression"). Even in

    int n = 42;

    42 is not a constant expression, because the grammar doesn't
    call for a constant expression in that context -- even though it
    looks like one. Similarly, in `a + b * c`, `a + b` looks like an
    additive expression, but it isn't one. (Not a perfect analogy.)

    Undefined Behavior, in turn, is not defined as specific only to
    execution: the standard simply says that it is "behavior, upon
    use of a *nonportable or erroneous program construct*..." for
    which there are no requirements, and there are examples of
    things that are explicitly UB at translation time, such as
    improperly terminated lexemes and so forth.

    Yes, there are constructs that are explicitly UB at translation time.
    (I think that's unfortunate, and there are efforts to clear up some
    such cases in C2y.)

    Signed integer overflow is not one of those constructs.
    Any undefined behavior from evaluating INT_MAX+1 happens during
    execution (barring constraint violations).

    Furthermore, the expression above is obviously an integer
    constant expression as defined by sec 6.6 para 8. Section 6.6,
    para 4, reads in part, "Each constant expression shall evaluate
    to a constant that is in the range of representable values for
    its type." The expression, `(INT_MAX+1)*0` violates this
    constraint, and so therefore a diagnostic is mandated as per
    sec 5.1.1.3 para 1. That it appears in code that is not
    obviously called from `main` doesn't change that.

    It satisfies the requirements for an integer constant expression in
    6.6p8, but it violates the constraint in 6.6p4. (I presume that an
    "integer constant expression" must be a "constant expression".)
    But since "constant-expression" is a grammatical production,
    it doesn't have to satisfy that constraint, and no diagnostic
    is required. (A warning is certainly permitted.)

    Similarly, this:
    int n = INT_MAX + 1;
    at block scope doesn't require a diagnostic, though of course it
    has undefined behavior -- but at file scope, the initializer is a
    constant expression, so that would be a constraint violation.

    Moreover, sec 6.6 para 17 says that, "the semantic rules for
    evaluation of a constant expression are the same as for
    nonconstant expressions." This brings us back to 5.1.2.4,
    though I submit that para (4) is a stronger argument for what
    you and Tim are saying, as it reads in part, "An actual
    implementation is not required to evaluate part of an expression
    if it can deduce that its value is not used and that no needed
    side effects are produced (including any caused by calling a
    function or through volatile access to an object)." I interpret
    this to mean that, if the implementation can determine that
    there is no way that `foo` can be called, it does not _have_ to
    evaluate the above expression. However, it must satisfy the
    range constraint from section 6.6, so it likely will, and in any
    event, the standard does not say that it, "shall not" evaluate
    it, or when.

    Overflow in a constant expression is not undefined behavior. It's a
    constraint violation. But that doesn't apply here, because the
    initializer is not a constant expression. (Sorry if I'm repeating
    myself.)

    Once the compiler does that, if it does, and observes UB, the
    standard is silent on what requirements it imposes, which means
    the behavior is undefined. I see no reason it couldn't arrange
    to invoke `foo` at that point.

    Any UB in the program would occur during execution, and in fact
    it *won't* occur during execution because foo() isn't called.
    A compiler can't generate code with arbitrary behavior just because
    it can't prove that there will be no UB. If it could, every signed
    or floating-point arithmetic operation with unknown operand values
    would grant the same permission.

    So no, I do not see how execution according to the rules of the
    abstract machine is not guaranteed, here. I certainly see no
    way in which this can be regarded as a strictly conforming
    program.

    foo()'s behavior would be undefined if it were called. It *isn't*
    called, so there's no actual UB. The program does not violate any
    of the other requirements for strict conformance.

    If the usual "Hello, world" program prints "Hello, world" followed
    by "Goodbye", the implementation is non-conforming. If it formats
    my hard drive after printing "Goodbye", it's non-conforming and
    dangerous.

    Two separate things. My point earlier was that code can
    obviously run after `main` terminates. Moreover, I can't
    imagine what would _prevent_ a runtime system that invokes
    `main` from doing something like printing, "PROGRAM STOPPED"
    after `main` returned. C imposes no requirements here.

    Yes, it does. An OS can print "PROGRAM STOPPED", but not as part
    of the execution of the program. On my system, a shell prompt is
    printed after a program terminates, but not by the program. If I
    execute a "hello, world" program with its output redirected to a file
    (on a system that supports that), the resulting file cannot contain
    "PROGRAM STOPPED". The requirements in 5.1.2.4 specify both what
    the execution of a program must do and what it must not do.

    Whether `foo` could be invoked after, I think, is undefined.

    Whether foo() has external linkage or internal
    linkage doesn't change that.

    I disagree. There's no possible way for the implementation to
    know whether a function with external linkage will be ultimately
    invoked or not; consider a system that supports loadable shared
    modules. Nothing prevents even this simple program from being
    compiled as a shared module, dynamically loaded, the loading
    program explicitly searching for and finding the symbol
    corresponding to the `foo` function, and invoking it.

    Remember that linking is translation phase 8. The compiler is not
    the entire implementation.

    Exactly my point. The compiler cannot know how `foo` might be
    used, or how the translated object might be exercised. I don't
    see how it could possibly know that, given that `foo` has
    external linkage.

    We were presented with a complete translation unit that included a
    function definition for "main". It's a complete program. There's no
    valid way for some other program to call foo. If an OS provided such
    a mechanism, it would be outside the scope of C.

    Hence, the compiler _must_ treat with UB as written, which is
    why `ubsan` inserts trapping code in `foo`.

    I don't know what "_must_ treat with UB" means.

    foo() has undefined behavior if it's called, so replacing its
    body with trapping code is valid. But (I'm reasonably sure that)
    an implementation cannot reject a program just because it can't
    prove that it has no undefined behavior during execution. It can
    reject it if it can prove that it *always* has undefined behavior
    during execution.

    What I'm saying is that, `foo` has undefined behavior _period_.
    That's manifest in an integer constant expression, whether it is
    executed at runtime or not. I believe that the standard forces
    the expression to be evaluated at translation time, via the
    "shall" mandate when checking the constraint on the range in sec
    6.6 para 4. Further, that evaluation must happen in accordance
    with the rules of the abstract machine, as per 5.1.2.4 para 17.
    The diagnostic is mandated, as is the translation-time
    evaluation. The expression itself manifestly exhibits UB,
    and so therefore the result of the rest of the translation is
    undefined.

    foo is a function. foo does not have undefined behavior; it has no
    behavior at all. A *call* to foo during execution has undefined
    behavior. (`foo;` is a statement-expression that does nothing;
    it does not have undefined behavior.)

    [SNIP]

    I think the question of whether the initializer is a
    constant-expression or not has caused some not entirely relevant
    confusion.

    Here's another example that avoids that issue.

    #include <limits.h>

    int foo(void) {
    int zero;
    zero = INT_MAX;
    zero ++;
    zero *= 0;
    return zero;
    }

    int main(void) {
    return 0;
    }

    Given my grammatical argument above, I would say that this program
    has no constant expressions. Whether that argument is correct or
    not, it certainly has no constant expressions that violate any
    constraint or that have undefined behavior. Evaluating `zero ++`
    (which doesn't even pretend to be a constant expression) would have
    run-time undefined behavior -- *if* foo() were ever called.

    And given this translation unit, I don't think there's any way to
    construct a multi-TU program that calls foo, so a compiler *can*
    determine that foo is never called (but there's no requirement to
    do so, or to make any use of that information).
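
As a side note on the run-time case, here is a minimal sketch of how the increment could be written so that it never reaches the overflow at all. The helper name `checked_increment` is an assumption for illustration and appears nowhere in the thread or the standard.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: adds 1 to *value and returns true, or leaves
   *value untouched and returns false when the addition would overflow.
   The test happens before the arithmetic, so the overflowing
   expression is never evaluated. */
bool checked_increment(int *value)
{
    if (*value == INT_MAX)
        return false;
    *value += 1;
    return true;
}
```

With this in place, the `zero ++` step in foo() would become a call whose failure the caller has to handle, instead of an operation with undefined behavior.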

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Keith Thompson@3:633/10 to All on Thu Jun 4 18:36:46 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <10vspuu$lkmu$3@kst.eternal-september.org>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    [...]
    One advantage of having a single program do the whole thing, is that
    error messages can mention the actual text of the line where a problem
    was detected, without any pre-processing applied.

    Typical preprocessors emit directives that tell the compiler about
    the current file name and line number, precisely so that diagnostic
    messages can refer to the original text.

    For example:

    $ cat hello.c
    #include <stdio.h>
    int main(void) {
    printf("Hello world!\n");
    }
    $ gcc -E hello.c | tail
    extern int __uflow (FILE *);
    extern int __overflow (FILE *, int);
    # 983 "/usr/include/stdio.h" 3 4

    # 2 "hello.c" 2

    # 2 "hello.c"
    int main(void) {
    printf("Hello world!\n");
    }
    $

    The line `# 2 "hello.c"` is, according to the C standard, a
    "non-directive", which is a kind of directive. Executing a
    non-directive has undefined behavior, but gcc apparently treats it
    very much like a #line directive.

    It doesn't really matter whether the preprocessor is a separate program
    or not.

    In fairness to Kuyper, however, the *text* from the original
    source file is lost. E.g.,

    term% cat n.c
    #include <stdio.h>
    #define FOO "hi"; // Note trailing `;`
    int
    main(void)
    {
    printf("%s\n", FOO);
    return 0;
    }
    term% clang -fkeep-system-includes -E n.c
    # 1 "n.c"
    # 1 "<built-in>" 1
    # 1 "<command line>" 1
    # 1 "<built-in>" 2
    # 1 "n.c" 2
    #include <stdio.h> /* clang -E -fkeep-system-includes */
    # 1 "n.c"
    # 2 "n.c" 2

    int
    main(void)
    {
    printf("%s\n", "hi";);
    return 0;
    }
    term%

    In this example, the preprocessor macro `FOO` has been lost, and
    only its expansion remains. The compiler has no information to
    give a useful diagnostic.

    Ah, but it does, as long as the original file is still there.

    $ gcc -c n.c
    n.c: In function 'main':
    n.c:2:17: error: expected ')' before ';' token
    2 | #define FOO "hi"; // Note trailing `;`
    | ^
    n.c:6:20: note: in expansion of macro 'FOO'
    6 | printf("%s\n", FOO);
    | ^~~
    n.c:6:11: note: to match this '('
    6 | printf("%s\n", FOO);
    | ^
    $

    The output of `gcc -E` doesn't include the name FOO, but it does include
    the line `# 3 "n.c"`, and that's enough information for the compiler to
    open the original source file and copy information from it into an error message.

    (This is perhaps straying slightly off-topic, since the standard
    only requires a diagnostic, but it's still interesting to see how
    actual compilers do things.)

    $ cat n.c
    #include <stdio.h>
    #define FOO "hi"; // Note trailing `;`
    int
    main(void)
    {
    printf("%s\n", FOO);
    return 0;
    }
    $ gcc -E n.c >| n-preprocessed.c
    $ grep FOO n-preprocessed.c
    $ tail n-preprocessed.c
    # 2 "n.c" 2


    # 3 "n.c"
    int
    main(void)
    {
    printf("%s\n", "hi";);
    return 0;
    }
    $ gcc -c n-preprocessed.c
    n.c: In function 'main':
    n.c:6:24: error: expected ')' before ';' token
    6 | printf("%s\n", FOO);
    | ~ ^
    | )
    $

    And if I rename n.c before compiling n-preprocessed.c, the error
    message doesn't include that line of code.
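
The standard counterpart of those line markers is the `#line` directive, and a minimal sketch shows the effect from inside the program. The file name `original.c` and both function names are made up for illustration. This shows only the standard directive, not gcc's `# N "file"` form.

```c
/* The directive below makes the compiler treat the line that follows
   it as line 100 of a file called "original.c". */
#line 100 "original.c"
int reported_line(void) { return __LINE__; }         /* presumed line 100 */
const char *reported_file(void) { return __FILE__; } /* presumed file name */
```

Here `reported_line()` returns 100 and `reported_file()` returns "original.c", whatever the translation unit is really called. Diagnostics for the code after the directive are attributed the same way.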

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Dan Cross@3:633/10 to All on Fri Jun 5 02:47:35 2026
    In article <10vsrpo$men2$2@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 04/06/2026 22:06, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    [snip]
    Tim Rentsch I'm sure will prefer the latter because 99.9% of C
    programmers are machines, according to him.

    Tim didn't say or imply that.

    So what was his 99.9% all about? Nobody has a clue, except they are
    certain that what I think it is is wrong!

    Have you thought about, I don't know, maybe asking him?

    Presumably, the same 99.9% will not use indentation, and will write
    their programs all on one line anyway, because it is still after all
    completely unambiguous according to the C standard!

    Of course not, because 99.9% of C programmers are not idiots.
    Your record of guessing incorrectly what other people think is
    unbroken. I suggest you stop trying.

    This is what Tim said:

    "If someone really can't learn the rules of expression syntax for the
    language they are using, they should be advised to try a different
    language, or perhaps give up programming altogether. It's silly to
    worry about something that 999 people out of a 1000 (and the actual
    numbers are undoubtedly much higher) are able to navigate without
    difficulty."

    It sounds to me very much as though he expects 99.9% to know all C's
    precedences by heart and to never need to use superfluous brackets (or
    'more than needed' if 'superfluous' is still too subjective).

    But of course, I am wrong and he is right, and you will defend his view
    (a subjective one) to the death.

    You omitted some of what reads to me like fairly important
    context before the part you posted:

    |This statement illustrates the problem with examples that you give.
    |Not only is the presumed reader sort of arbitrarily naive, he or she
    |is apparently incapable of learning. Everyone who has ever learned
    |to program has had an experience of a program doing something other
    |than what was expected, because of a misunderstanding about how the
    |language works. When that happens, most people simply learn about
    |their misunderstanding and correct it. The readers in your examples
    |are like people who started programming after developing Alzheimer's
    |disease (and no offense meant to anyone afflicted with Alzheimer's).
    |Maybe there are such people, whether or not caused by a medical
    |condition, but it doesn't match most programmers' experience, and in
    |any case is not worth worrying about. If someone can't understand
    |the rules of the road they shouldn't be behind the wheel of a car.

    I don't presume to speak for him, but his point appears to be
    that most programmers (999 out of 1000) learn from their
    mistakes. Part of that may be developing techniques to prevent
    future recurrence of those mistakes.

    Programmers make mistakes; it happens all the time. Many C
    programmers may well have experienced mistakes with operator
    precedence; it's well-known that the rules have some rough
    edges. Usually this is fairly easy to spot in testing; it may
    result in a momentary head scratching, perhaps a, "huh...that's
    weird..." followed by looking at a table or puzzling over the
    grammar for a moment, and then an, "ohh....I see." Perhaps
    the programmer thinks, "wow, that confused me.... I'm going to
    put in some parentheses to make it clear what's going on the
    next time I'm in here..." or maybe they don't. That's the part
    that is subjective.
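
One commonly cited rough edge, sketched here as an illustration (the function names are hypothetical): `==` binds tighter than `&`, so `flags & mask == 0` does not test whether the masked bits are clear.

```c
/* How the unparenthesized `flags & mask == 0` is actually grouped:
   the comparison happens first, then the bitwise AND. */
int as_parsed(int flags, int mask)
{
    return flags & (mask == 0);
}

/* What the author of such an expression usually intends. */
int as_intended(int flags, int mask)
{
    return (flags & mask) == 0;
}
```

With `flags` equal to 6 and `mask` equal to 1, `as_parsed` yields 0 while `as_intended` yields 1. This is the kind of surprise that tends to show up in testing.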

    The point is, not just most programmers, but most people in
    general, make mistakes and then learn from them. If one cannot
    learn from those mistakes vis-a-vis a particular activity (like
    programming, or maybe driving) then maybe one should not be
    doing that activity, whatever it is. I suppose one might
    struggle to learn from one's mistakes and still enjoy
    programming, perhaps as a hobby. I don't see any harm in that;
    driving might be another matter: cars are big, heavy, and go
    fast enough to kill someone.

    Where you seem to go off the rails in _this_ discussion is what
    others have already told you: you are mistaking an expression of
    preference with measurable facts. What constitutes "too many"
    or "too few" parentheses is not well-defined: one cannot go look
    in a textbook and find a definition of "too many" here. And
    even though most people agree that `((((((((a * b))))))))` is
    "too many", that's still an opinion: someone else may disagree.
    _I_ may think that the person who wrote that and anyone who
    agrees with them has no taste and an utter lack of class, but
    that's nothing more than my opinion.

    Here's an example: when I use the ternary operator, I _usually_
    wrap the first expression in parens. Necessary? Almost never.
    I just like the way it looks, but aesthetics are purely
    subjective.

    - Dan C.


  • From Dan Cross@3:633/10 to All on Fri Jun 5 02:49:49 2026
    In article <10vsqlu$men2$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 04/06/2026 21:34, Scott Lurndal wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 17:47, Scott Lurndal wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 17:18, Scott Lurndal wrote:
    David Brown <david.brown@hesbynett.no> writes:
    On 04/06/2026 15:18, Bart wrote:

    (Note that C has its own problems in this area:

        a = b/*p;      // divide b by dereferenced pointer p
    Here, /* also happens to start a block comment.)


    Here you are objectively wrong. C does not have a "problem" with
    this. The parsing rules of the language are clear - often called
    "maximum munch". The character sequence "/*" is the start of a
    comment, it is not two separate operators.

    This is where it falls down. It's very clearly a 'gotcha', and
    consequence of poorly thought-out design.

    It is neither a "gotcha", nor a consequence of poor design.

    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    How does that not make it bad design?

    The preprocessor would strip everything from the /* until the next
    matching */, so a chunk of your program goes missing.

    Whatcha talkin' 'bout willis?


    What were /you/ talking about? What was your point?


    Your inaccurate characterization that a chunk of the program
    went "missing". Nothing meaningful is missing (and the comment
    remains in the original source file).

    So what do you mean, exactly, when you claim that the output of
    the preprocessor causes a chunk of the program (which doesn't
    include whitespace or comments) to go missing?

    This is the example I gave elsewhere:

    ---------------------------
    There are actually other issues associated with /**/ comments; here
    someone forgot to terminate the first comment:

    puts("one"); /* comment 1
    puts("two"); /* comment 2 */
    puts("three"); /* comment 3 */
    ---------------------------

    After preprocessing you're left with this:

    puts("one");
    puts("three");

    That middle puts call is missing, and it's meant to be part of the program.

    The middle call is not "missing". It is "commented out." If
    that was not deliberate, you might have a bad time, but it's
    independent of the preprocessor.

    This can also be a consequence of an inadvertent /* sequence such as in
    'a = b/*p;'.

    Sounds like a bug.
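
For completeness, a tiny sketch of the spelling that avoids the problem; the function name is hypothetical. A space or a pair of parentheses between the division and the dereference is enough, because the tokenizer then never sees a slash immediately followed by a star.

```c
/* Divide b by the int that p points to.  The space after the slash
   keeps the two operators from being read as a comment opener. */
int divide_by_target(int b, const int *p)
{
    return b / *p;      /* equivalently: b / (*p) */
}
```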

    - Dan C.


  • From Dan Cross@3:633/10 to All on Fri Jun 5 02:54:15 2026
    In article <10vt97i$pube$1@kst.eternal-september.org>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <10vspuu$lkmu$3@kst.eternal-september.org>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    [...]
    One advantage of having a single program do the whole thing, is that
    error messages can mention the actual text of the line where a problem
    was detected, without any pre-processing applied.

    Typical preprocessors emit directives that tell the compiler about
    the current file name and line number, precisely so that diagnostic
    messages can refer to the original text.

    In fairness to Kuyper, however, the *text* from the original
    source file is lost. E.g.,

    term% cat n.c
    #include <stdio.h>
    #define FOO "hi"; // Note trailing `;`
    int
    main(void)
    {
    printf("%s\n", FOO);
    return 0;
    }
    term% clang -fkeep-system-includes -E n.c
    # 1 "n.c"
    # 1 "<built-in>" 1
    # 1 "<command line>" 1
    # 1 "<built-in>" 2
    # 1 "n.c" 2
    #include <stdio.h> /* clang -E -fkeep-system-includes */
    # 1 "n.c"
    # 2 "n.c" 2

    int
    main(void)
    {
    printf("%s\n", "hi";);
    return 0;
    }
    term%

    In this example, the preprocessor macro `FOO` has been lost, and
    only its expansion remains. The compiler has no information to
    give a useful diagnostic.

    Ah, but it does, as long as the original file is still there.

    Mm, yeah, I suppose, as long as the original is still available.

    $ gcc -c n.c
    n.c: In function 'main':
    n.c:2:17: error: expected ')' before ';' token
    2 | #define FOO "hi"; // Note trailing `;`
    | ^
    n.c:6:20: note: in expansion of macro 'FOO'
    6 | printf("%s\n", FOO);
    | ^~~
    n.c:6:11: note: to match this '('
    6 | printf("%s\n", FOO);
    | ^
    $

    The output of `gcc -E` doesn't include the name FOO, but it does include
    the line `# 3 "n.c"`, and that's enough information for the compiler to
    open the original source file and copy information from it into an error
    message.

    (This is perhaps straying slightly off-topic, since the standard
    only requires a diagnostic, but it's still interesting to see how
    actual compilers do things.)

    $ cat n.c
    #include <stdio.h>
    #define FOO "hi"; // Note trailing `;`
    int
    main(void)
    {
    printf("%s\n", FOO);
    return 0;
    }
    $ gcc -E n.c >| n-preprocessed.c
    $ grep FOO n-preprocessed.c
    $ tail n-preprocessed.c
    # 2 "n.c" 2


    # 3 "n.c"
    int
    main(void)
    {
    printf("%s\n", "hi";);
    return 0;
    }
    $ gcc -c n-preprocessed.c
    n.c: In function 'main':
    n.c:6:24: error: expected ')' before ';' token
        6 |     printf("%s\n", FOO);
          |           ~            ^
          |                        )
    $

    And if I rename n.c before compiling n-preprocessed.c, the error
    message doesn't include that line of code.

    I feel like there is a Stallman joke in there struggling to get
    out, but I can't quite get there.

    - Dan C.


  • From Dan Cross@3:633/10 to All on Fri Jun 5 03:02:07 2026
    In article <1BoUR.3$lmCb.1@fx22.iad>, Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    [snip]
    getch()
    {
        register int c, lastst;

        while ((c=getc1())=='/' && !instring)
        {
            if ((c=getc1())!='*')
            {
                pushback(c);
                return('/');
            }
            if (!skipcom)
                {putc('/',fout); putc('*', fout);}
            lastst=0;
            while ( (c = getc1()) != '\0')
            {
                if (lastst && c=='/')
                {
                    if (!skipcom)
                        putc('/', fout);
                    break;
                }
                if (c=='\n' || !skipcom)
                    putc(c, fout);
                lastst = (c=='*');
            }
            if (c=='\0')break;
        }
        return(c);
    }

    Yeah, that's from `cc.c`, right?

    No, it's from cpp.c

    $ ls /work/reference/collegetapes/sltape/v6cc/
    c0.c c00.c c01.c c02.c c03.c c04.c c05.c c1.h
    c10.c c11.c c12.c c13.c c2.h c20.c c21.c cc.c cpp.c

    Oh interesting. I don't have a `cpp.c` in my v6 archive.

    I wonder what else I'm missing.

    - Dan C.


  • From David Brown@3:633/10 to All on Fri Jun 5 09:29:19 2026
    On 04/06/2026 21:29, Bart wrote:
    On 04/06/2026 19:54, David Brown wrote:
    On 04/06/2026 17:46, Bart wrote:
    On 04/06/2026 15:27, David Brown wrote:
    On 04/06/2026 15:18, Bart wrote:

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more
    parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a
    purely subjective opinion.

    So, you're arguing 'more than needed' is a completely different
    thing from 'too many'.

    Of course they are different things - albeit related things, rather
    than /completely/ different. One is a question of fact, the other a
    question of opinion, and they do not always coincide.

    It is a fact that "a << (b + c)" has more parentheses than needed.
    But I think we are both of the opinion that it does not have "too
    many" parentheses - it has an appropriate number of parentheses.

    So saying 'too many' of something will be a subjective opinion? OK,
    so let's try compiling this bit of C:

       void F(int, int);

       int main() {
           F(1, 2, 3);
       }

    8 out of 9 compilers reported 'Too many arguments'.

    According to you, that's only their subjective opinion, not an
    objective fact?

    Again - /please/ stop trying to guess what people say or put words in
    their mouths. I can't remember ever seeing you do so accurately.

    This is what you actually said:

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more
    parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a
    purely subjective opinion. Even if it is true that this is "commonly
    agreed to" (and AFAIK you have no basis for that claim), that would
    still be a subjective opinion - no matter how common that opinion is.

    You're saying that:

    How can this be /so/ difficult for you?


    *  "more than needed" is objective

    No, I said that "(a*a) + (b*b) has more parentheses than needed in the context of most programming languages" is objective.

    *  "too many" is subjective

    No, I said that "(a*a) + (b*b) has too many parentheses" is subjective.

    The context is /critical/. There are plenty of situations where the
    words "more than needed" might turn up in a subjective phrase. There
    are plenty of situations where "too many" might turn up in an objective phrase.

    It is not those particular words that make the difference between
    "subjective" and "objective". "Subjective" means there is a subject -
    almost always a human subject - and the judgement or categorisation
    depends on that person or persons. "Objective" means there is no person involved, and the judgement or categorisation is independent of any person.

    A categorisation of an expression that depends on its meaning in C does
    not involve a person - the judgement is mechanical and based solely on
    the expression and the C standards. It is therefore objective. Any sufficiently intelligent and literate person will reach the same
    decision even if they have never used C or any other programming language.

    A categorisation of what people feel is too many parentheses in an
    expression is entirely dependent on that person. Some people might be
    happy with more, some people might prefer a minimum number allowed by
    the language while maintaining the same semantics. Some might prefer
    lots but be okay with fewer, or prefer fewer but understand why others
    prefer more. Some might draw a hard line and say that more than three nestings is too much, others might have no limits. Some will say it
    depends on the circumstances, drawing distinction between code that they
    write and code they have to read, or code that is generated
    automatically in some way. Clearly, this is all highly subjective.


    Even though both are about exactly the same thing: superfluous but
    harmless parentheses in an expression.

    So you are picking on my choice of words, apparently in order to win
    some stupid argument on the internet. Even though the same "too many"
    phrase used elsewhere can be objective, according to you.


    I don't care about the words - I care that you can make a distinction
    between what is factual and objective, and what is opinionated and
    subjective.

    My suspicion is that you actually have a real, serious problem in this
    area. Your programming has been so insular and isolated for so long,
    that you are perhaps genuinely unable to make such distinctions - at
    least in the context of programming. For you, programming revolves
    entirely around /you/ - you designed your language(s), you implemented
    it, you use it. Your language, and the programs you have written in it,
    are part of you and have no non-subjective existence - and languages and programs that are not yours have only limited existence and relevance to
    you. This makes it very difficult for you to distinguish between
    objective matters, such as a language's syntax, and subjective matters,
    such as coding style. For example, you appear to think that code
    written in an unclear style means the syntax is ambiguous, conflating subjective opinion with objective fact. You view the C standards as a
    set of guidelines, rather than a contract and specification, because in
    your own programming world your language descriptions /are/ a set of guidelines and rough notes that you can change at a whim as easily and
    often as you change code written in the language. In your programming
    world, everything is subjective because it all comes from your personal
    likes and dislikes, and everything seems objective because there are no
    other people to have opinions or thoughts.

    This looks like a pattern: people here seem to have remarkable trouble
    debating with me on actual ideas and resort instead to finding hidden
    significance in some choice of words I happened to use.


    For discussions to have any chance of being productive, they have to
    share a common language and understanding of terms and concepts.



    "Too many parentheses" is subjective, because they affect the ease of
    reading the code as a human reader.

    And 'more than needed' isn't that?!

    In the context it was used, that is correct. "More than needed" means
    that some could be removed without changing the semantics of the
    expression - its meaning as a C expression.


    Tim Rentsch I'm sure will prefer the latter because 99.9% of C
    programmers are machines, according to him.

    Please give a reference for him saying that. (I'll save you the bother,
    he has not made any remarks remotely like this in c.l.c. since I have
    been here.)


    Presumably, the same 99.9% will not use indentation, and will write
    their programs all on one line anyway, because it is still after all completely unambiguous according to the C standard!

    Don't presume - you make a fool out of yourself every time you do.



  • From Tim Rentsch@3:633/10 to All on Fri Jun 5 00:53:39 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:

    In article <10vsrpo$men2$2@dont-email.me>, Bart <bc@freeuk.com> wrote:

    On 04/06/2026 22:06, Keith Thompson wrote:

    Bart <bc@freeuk.com> writes:

    [snip]
    Tim Rentsch I'm sure will prefer the latter because 99.9% of C
    programmers are machines, according to him.

    Tim didn't say or imply that.

    So what was his 99.9% all about? Nobody has a clue, except they are
    certain that what I think it is is wrong!

    Have you thought about, I don't know, maybe asking him?

    At the risk of saying what may be obvious to everyone, Bart has
    shown that he has no interest in having a serious, constructive,
    useful, or productive conversation with anyone. His questions
    are all rhetorical; he hasn't asked me a straight question
    because he isn't really interested in what I would say. In
    short, Bart isn't looking for an answer, he's looking for an
    argument. My recommendation is just stop responding to him
    altogether. My response to him upthread was a sincere effort to
    provide a neutral and helpful answer to his question. Maybe my
    remarks were helpful to other people, and if they were that's
    good. Any further efforts to interact with Bart are not just a
    waste of time but actually counterproductive. What Bart needs is
    not help with understanding C but a good therapist. In any case
    I'm confident that whatever Bart's needs may be, no one responding
    to his postings here is in a position to provide them. Please
    consider these remarks before responding to him further.

  • From David Brown@3:633/10 to All on Fri Jun 5 10:34:00 2026
    On 04/06/2026 21:13, Lew Pitcher wrote:
    On Thu, 04 Jun 2026 21:04:50 +0200, David Brown wrote:

    On 04/06/2026 19:47, Janis Papanagnou wrote:
    On 2026-06-04 18:18, Scott Lurndal wrote:

    Indeed, and in the early days, the compiler itself would never
    have seen '/*' - the preprocessor (cpp) would have removed it
    from the source before the source reached the first
    pass of the compiler (c0).

    Curious; was the comment-handling at some point in history removed
    from the Cpp-processing? - If so, when was that? And I assume the
    semantics are still the same; is that correct?


    No, at least since the standardisation of the C language (including K&R
    "standard"), "preprocessing" has been an integral part of the C language
    and conversion of comments to space characters is done in phase 3 of the
    translation. But the C standards do not give an explicit distinction
    between "preprocessing" and "compiling" - just different translation
    phases. (They do not define a "compiler" at all.) It is not uncommon
    for implementations to separate translation into two or more programs,
    especially in the good old days when hosts had much less memory, but
    logically they are all one implementation. Distinguishing "the compiler
    itself" is somewhat artificial.

    In historic Unix (Version 7 and before), the preprocessor was implemented
    as a separate program ("cpp") from the compiler ("cc"). The compiler itself had no facility to handle preprocessor directives, and was, itself, often divided into two separate programs ("cc0" and "cc1"). All three phases ("cpp", "cc0" and "cc1") were managed by a program ("cc"), although the program for each phase could be invoked independently through manual execution.


    When you type "$(CC) main.c -o main" and get a program "main", there are usually a number of programs run in the process. Traditionally (and
    still the case for some compilers) there was a split between the "preprocessor" and the "compiler". But such a split is artificial in
    terms of the C implementation - as is having the compiler generate
    assembly and pass it to a separate assembler, and a separate linker. A
    C implementation translates C into a program suitable for running on the target - whether that is done using a single program or multiple
    programs is implementation detail.

    (In contrast, I have seen embedded C compilers where there was a single program that covered everything - preprocessing, compiling, assembling, linking, and also contained the standard headers and standard library as
    part of the monolithic tool rather than separate files.)

    What differs from today is that the preprocessor was an optional component, made available for a programmer's convenience.


    I am not sure what you mean. I can run code through a C pre-processor
    without compiling it today. I can write "manually pre-processed" C code
    and compile it today without a pre-processing stage. I have rarely had
    use of the former (perhaps debugging some macros), and never had need of
    the latter, but it is certainly possible.



  • From Janis Papanagnou@3:633/10 to All on Fri Jun 5 10:41:10 2026
    On 2026-06-05 01:49, Dan Cross wrote:
    [...]
    [...]

    [ ... (INT_MAX+1)*0 ]

    Furthermore, the expression above is obviously an integer
    constant expression as defined by sec 6.6 para 8. Section 6.6,
    para 4, reads in part, "Each constant expression shall evaluate
    to a constant that is in the range of representable values for
    its type." The expression, `(INT_MAX+1)*0` violates this
    constraint, and so therefore a diagnostic is mandated as per
    sec 5.1.1.3 para 1. That it appears in code that is not
    obviously called from `main` doesn't change that.

    I'm curious about that "violation"; a violation would require
    (at least) two sorts of logical preconditions. - The first is
    that all *sequentially* (literally) evaluated sub-expression
    values are representable as value - INT_MAX+1 certainly can't
    be represented in generated code that conforms to the abstract
    *mathematical* value - but is that necessary if _the whole_
    expression is (mathematically) just 0 (because of the final
    factor). And the second (related) is whether the order of the
    sub-expression evaluation is relevant; if we'd assume the
    expression evaluation to be considered from right to left then
    it would be irrelevant what's inside the parenthesis.

    From the standard quotes I cannot really recognize that these
    preconditions, how to determine UB/errors/violations, would be
    necessary.

    I'm no native speaker and I fear my question as formulated was
    hard to understand. It's basically the question of the standard
    implying (INT_MAX+1)*0 to be analyzed sequentially as written
    or whether it could as well analyze it from right to left and
    thus recognizing no problem, since from the mathematical view -
    but also practically - a concrete representable value of a here
    irrelevant sub-expression isn't necessary. Or another try of a
    (paraphrased) formulation; for the determination of constraint
    violations does the expression have strict (sort of) sequencing
    points _after each term_ (and each left-to-right sub-expression
    has to be well-defined) or can it be valued/analyzed as a whole
    not putting any preconditions about evaluation order etc. when
    determining the overall value?

    Janis

    PS: One yet non-considered question that was part of my original
    post was: "Is there any rationale from the _software designer_'s
    perspective?"

    [...]


  • From Bart@3:633/10 to All on Fri Jun 5 11:04:51 2026
    On 05/06/2026 08:53, Tim Rentsch wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:

    In article <10vsrpo$men2$2@dont-email.me>, Bart <bc@freeuk.com> wrote:

    On 04/06/2026 22:06, Keith Thompson wrote:

    Bart <bc@freeuk.com> writes:

    [snip]
    Tim Rentsch I'm sure will prefer the latter because 99.9% of C
    programmers are machines, according to him.

    Tim didn't say or imply that.

    So what was his 99.9% all about? Nobody has a clue, except they are
    certain that what I think it is is wrong!

    Have you thought about, I don't know, maybe asking him?

    Asking him straight questions is usually futile. You can probably guess
    this from the response below.

    Notice he hasn't tried to enlighten anyone about that 99.9%.

    That may just have been a throwaway line like when I say 'nobody likes
    X', but I would still dispute that, if it's about what I think it is,
    it's anything like a super-majority.


    At the risk of saying what may be obvious to everyone, Bart has
    shown that he has no interest in having a serious, constructive,
    useful, or productive conversation with anyone. His questions
    are all rhetorical; he hasn't asked me a straight question
    because he isn't really interested in what I would say. In
    short, Bart isn't looking for an answer, he's looking for an
    argument. My recommendation is just stop responding to him
    altogether. My response to him upthread was a sincere effort to
    provide a neutral and helpful answer to his question. Maybe my
    remarks were helpful to other people, and if they were that's
    good. Any further efforts to interact with Bart are not just a
    waste of time but actually counterproductive. What Bart needs is
    not help with understanding C but a good therapist. In any case
    I'm confident that whatever Bart's needs may be, no one responding
    to his postings here is in a position to provide them. Please
    consider these remarks before responding to him further.


  • From Bart@3:633/10 to All on Fri Jun 5 12:39:43 2026
    On 05/06/2026 08:29, David Brown wrote:
    On 04/06/2026 21:29, Bart wrote:

    You're saying that:

    How can this be /so/ difficult for you?


    *  "more than needed" is objective

    No, I said that "(a*a) + (b*b) has more parentheses than needed in the context of most programming languages" is objective.

    *  "too many" is subjective

    No, I said that "(a*a) + (b*b) has too many parentheses" is subjective.

    If anyone is interested (which I doubt; bart-bashing is much more fun),
    this is the original context:

    TR:
    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

    BC:
    Actual examples of too many parentheses?

    TR:
    The point of my comment is that either too many or too few is a
    subjective judgment, not an objective one.

    Here it is clear that 'too many' was just a paraphrase of 'unnecessary'.
    Here is my followup to TR:

    BC:
    My point was that it could be objective, at least for too many.

    For an infix syntax where * has higher priority than +, then it is a
    fact that the () in (a*a) + (b*b) are not necessary.

    So, assume a minimum number of () needed to properly parse an expression according to intent. Then:

    (1) TOO FEW: necessarily has to be subjective. It suggests a desire for
    more () than the minimum, but the exact number will vary.

    (2) TOO MANY, MORE THAN NEEDED, ETC: These can be objective if referring to
    any number of extra () above the minimum. This is the point I made
    above, the one I defended.

    (3) TOO MANY, MORE THAN NEEDED, ETC: These can also be used in a
    judgemental manner, and then are subjective. This is where a certain
    number of extra () are accepted for readability etc, but the exact level
    will vary.

    If this is the point people have been trying to make, then they've been
    doing it incredibly badly, and been unnecessarily unpleasant and insulting.

    My own view is that C syntax has too much of (3), but necessarily so
    because of the choices made in its operator levels.

    The syntaxes I work on tend to have more of (2); () is less often needed
    for readability because of more sensible design choices. And IMO less
    often needed for overrides too, for the same reasons.

    For example, where C has (*P).m or (*Q)[i], I'd write P^.m or Q^[i],
    since I chose a postfix rather than prefix dereference operator.

    In general, for the same programs, C will probably use at least 20% more parentheses.



    Tim Rentsch I'm sure will prefer the latter because 99.9% of C
    programmers are machines, according to him.

    Please give a reference for him saying that. (I'll save you the bother,
    he has not made any remarks remotely like this in c.l.c. since I have
    been here.)

    Find out what was the subject of the 99.9% (even if that was an
    exaggeration). Then we'll talk.

    No, he didn't use the word 'machines'; I paraphrased to suggest
    supernormal people who know everything and never make mistakes.

    You're going to argue about this now?


    Presumably, the same 99.9% will not use indentation, and will write
    their programs all on one line anyway, because it is still after all
    completely unambiguous according to the C standard!

    Don't presume - you make a fool out of yourself every time you do.

    And you proceed to do exactly the same; Bart must be wrong, but you
    don't about what!



  • From Tim Rentsch@3:633/10 to All on Fri Jun 5 05:34:20 2026
    I didn't read Bart's posting. Unfortunately it seems
    true that any continued interaction with his comments
    is counterproductive.

  • From Tim Rentsch@3:633/10 to All on Fri Jun 5 05:49:58 2026
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    [...]

    One advantage of having a single program do the whole thing, is
    that error messages can mention the actual text of the line where
    a problem was detected, without any pre-processing applied.

    Typical preprocessors emit directives that tell the compiler
    about the current file name and line number, precisely so that
    diagnostic messages can refer to the original text.

    For example:

    $ cat hello.c
    #include <stdio.h>
    int main(void) {
    printf("Hello world!\n");
    }
    $ gcc -E hello.c | tail
    extern int __uflow (FILE *);
    extern int __overflow (FILE *, int);
    # 983 "/usr/include/stdio.h" 3 4

    # 2 "hello.c" 2

    # 2 "hello.c"
    int main(void) {
    printf("Hello world!\n");
    }
    $

    The line `# 2 "hello.c"` is, according to the C standard, a
    "non-directive", which is a kind of directive. Executing a
    non-directive has undefined behavior,

    Since it is gcc that is generating the non-directives, for
    internal purposes, and gcc that is consuming them, it hardly
    seems worth worrying about whether their behavior is defined
    or not.

  • From Tim Rentsch@3:633/10 to All on Fri Jun 5 06:41:23 2026
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Note that in a context that requires a constant expression, overflow is
    a constraint violation. For example, a case label like:

    case (INT_MAX + 1) * 0:

    must be diagnosed at compile time.

    gcc disagrees with you.

    What makes you think so?

    [...]

    I'm skipping this and proceeding on to the original question.

    But taking a closer look at the standard, I'm not 100% sure that the
    language requires a diagnostic, though I think that's the intent.
    The relevant constraint is:

    Each constant expression shall evaluate to a constant that is
    in the range of representable values for its type.

    If I squint really hard, I can argue that the entire expression
    has to be a constant expression, but it doesn't say that its
    subexpressions are constant expressions -- and *if* INT_MAX +
    1 evaluates to INT_MIN in the current implementation, then
    (INT_MAX + 1) * 0 evaluates to 0 and therefore satisfies the
    constraint.

    My reasoning is as follows.

    To determine if the constraint is satisfied, the compiler must
    first evaluate the expression (INT_MAX + 1) * 0.

    To evaluate the expression (INT_MAX + 1) * 0, the compiler must
    first evaluate the sub-expression (INT_MAX + 1).

    Because the expression (INT_MAX + 1) overflows, the behavior is
    undefined, and the compiler is free to decide that the value of
    the sub-expression (INT_MAX + 1) is, let's say, 12.

    The compiler next evaluates the overall expression as 12*0, which
    is 0 (an int).

    This result of the overall expression satisfies the constraint,
    and so the compiler is not obliged to generate a diagnostic.

    Going back, when evaluating (INT_MAX + 1), the compiler could
    have decided to choose the value 3.14159e47. In that case the
    value of the overall expression would be 0.0. This value has
    type double, which does not satisfy the constraint that the
    result have integer type. Thus if the compiler had made this
    decision then a diagnostic would be required.

    Overall conclusion: whether a diagnostic is required depends on
    what behavior is chosen for the construct (INT_MAX + 1). The
    implementation could choose a behavior where the constraint is
    satisfied, or it could choose a behavior where the constraint is
    not satisfied.

    But INT_MAX + 1 could legally trap, for example, and I don't
    believe it was intended that a given expression can be a constant
    expression or not depending on the vagaries of the behavior of an
    instance of UB.

    I see no basis for this belief. My conclusions are based on what
    the C standard actually says, rather than guesses about some
    unstated "intentions". I think you would do well to reach your
    conclusions based more on the actual text of the C standard, and
    less on your interpretation of what the text was "intended" to
    mean.

  • From David Brown@3:633/10 to All on Fri Jun 5 15:42:59 2026
    On 05/06/2026 13:39, Bart wrote:
    On 05/06/2026 08:29, David Brown wrote:
    On 04/06/2026 21:29, Bart wrote:

    You're saying that:

    How can this be /so/ difficult for you?


    *  "more than needed" is objective

    No, I said that "(a*a) + (b*b) has more parentheses than needed in
    the context of most programming languages" is objective.

    *  "too many" is subjective

    No, I said that "(a*a) + (b*b) has too many parentheses" is subjective.

    If anyone is interested (which I doubt; bart-bashing is much more fun),
    this is the original context:


    I am writing in a detailed and repetitive manner to be sure there are no misunderstandings, not as "bart-bashing".

    TR:
    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.


    This is clearly about "too many" or "too few" as a subjective matter -
    i.e., in addition to the minimum required for the desired semantics.
    (The minimum requirements are objective, so the code has the correct C semantics - additional parentheses are about style and clarity, which
    are subjective.)

    BC:
    Actual examples of too many parentheses?

    I assume here we are again talking about "too many" beyond the necessary number. Coming from anyone else, I would happily assume they are
    talking about subjective opinions - "Can you give examples of real-world
    code where you think there are too many unnecessary parentheses,
    resulting in code that is harder to read than it would otherwise be?"
    Coming from you, it might also mean the nonsensical question "Can you
    give examples of code that objectively has too many unnecessary
    parentheses?".


    TR:
    The point of my comment is that either too many or too few is a
    subjective judgment, not an objective one.

    Here it is clear that 'too many' was just a paraphrase of 'unnecessary'.

    No, it is not. In the expression "a << (b + c)", there are unnecessary parentheses, but not - IMHO - too many parentheses. That is because "unnecessary" (in this context - and don't generalise from it) is an
    objective matter of whether or not the semantics of the expression are affected by the parentheses. "Too many" (in this context) is a
    subjective matter of clarity of code. In my opinion, the parentheses
    are helpful and there are therefore not too many of them - but as a
    matter of C semantics, they are objectively unnecessary.

    Again, I am unable to read Tim's mind, and I am not accountable for what
    he writes or how he writes it. But to my reading, it is quite clear
    that "too many" is /not/ a paraphrase of "unnecessary".

    Here is my followup to TR:

    BC:
    My point was that it could be objective, at least for too many.


    Yes, you wrote that. You are wrong. At least, you are wrong until
    someone exceeds the 63 levels of nesting that are required to be
    supported by conforming compilers, but I do not believe that is
    something you are considering.

    For an infix syntax where * has higher priority than +, then it is a
    fact that the () in (a*a) + (b*b) are not necessary.

    Agreed.


    So, assume a minimum number of () needed to properly parse an expression according to intent. Then:

    No, don't assume that. "Intent" implies reading the mind of the
    programmer. There is no such thing as "obvious intent" - there is the objective semantics of what the programmer writes, and the subjective
    ease with which people (including the programmer himself/herself) can
    read the code and understand the semantics of it. The former depends
    solely on the code written, the latter depends significantly on the
    people reading it.

    Let us rather assume a minimum number of parentheses so that removing
    any would change the semantics of the expression. That is an objective measure.


    (1) TOO FEW: necessarily has to be subjective. It suggests a desire for
    more () than the minimum, but the exact number will vary.


    Agreed. (And we would both share the opinion that "a << b + c" has too
    few parentheses because we would feel it is easier to read with more parentheses - while we would both think that "a * a + b * b" does not
    have too few.)

    (2) TOO MANY, MORE THAN NEEDED, ETC: These can be objective if referring to
    any number of extra () above the minimum. This is the point I made
    above, the one I defended.

    Nope.

    "a << (b + c)" has "more than needed" - that is objective.

    "a << (b + c)" does not have "too many" in an objective sense, because
    the extra parentheses have not affected any objective characteristic of
    the expression - the semantics are the same. Some people may
    subjectively feel there are "too many" because they think "a << b + c"
    is clearer - others will have different subjective opinions.


    That is the context of the phrases we have had, and how they have been used.
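
    As a minimal sketch of the objective side of this (the function names are hypothetical, chosen only for illustration): in C, + binds more tightly than <<, and * binds more tightly than +, so within each group below the plain and parenthesised forms compute the same value, and only the third shift grouping changes the semantics.

```c
/* + binds more tightly than <<, so these two parse identically. */
unsigned shift_plain(unsigned a, unsigned b, unsigned c) { return a << b + c; }
unsigned shift_paren(unsigned a, unsigned b, unsigned c) { return a << (b + c); }

/* These parentheses are necessary: removing them changes the result. */
unsigned shift_first(unsigned a, unsigned b, unsigned c) { return (a << b) + c; }

/* * binds more tightly than +, so these two also parse identically. */
int squares_plain(int a, int b) { return a * a + b * b; }
int squares_paren(int a, int b) { return (a * a) + (b * b); }
```

    Whether the parenthesised forms are clearer remains a matter of opinion. Some compilers warn about the unparenthesised shift (gcc with -Wparentheses, for example), which suggests many readers find the extra parentheses helpful there.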

    Terms like "too many" or "more than needed" can be used in different
    contexts, and have different meanings. If you have a bowl that can hold
    6 apples, and you try to put 10 apples in the bowl, that is objectively
    "too many". If you write "that expression has more parentheses than
    needed to make the meaning clear to readers", then that is a subjective
    claim - it does not say anything about the number of parentheses needed
    to express the semantics in C (that's objective), but talks about the subjective views of readers.

    You cannot take a phrase like these and say "this is always objective"
    or "this is always subjective" - the context is always critical.


    (3) TOO MANY, MORE THAN NEEDED, ETC: These can also be used in a
    judgemental manner, and then they are subjective. This is where a certain
    number of extra () are accepted for readability etc, but the exact level will vary.

    If this is the point people have been trying to make, then they've been doing it incredibly badly, and been unnecessarily unpleasant and insulting.


    I cannot speak for the intentions of others, but it has certainly been
    very frustrating trying to get you to understand the distinction between objective facts and subjective opinions, and trying to get you to stop re-writing other people's words and to stop taking partial quotations
    out of context and wildly and inaccurately generalising them.

    My own view is that C syntax has too much of (3), but necessarily so
    because of the choices made in its operator levels.

    That's a subjective opinion. I would agree with it, to at least some
    extent - some of the precedence order is not as I would have picked.
    But given that there are situations where I would include additional parentheses in C code despite agreeing with the precedence order, I
    don't think the C syntax rule choices are the issue. I don't believe I
    would use fewer parentheses even if << and >> had the same precedence
    level as * and /, or if the bitwise operators had higher precedence than equality and other relational operators.
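
    To illustrate the precedence order referred to here with a small sketch (hypothetical function names, not from any real code base): == binds more tightly than &, so a mask test needs parentheses to mean what most readers expect.

```c
/* Intended meaning: are all the bits selected by mask clear in x? */
int bits_clear(unsigned x, unsigned mask)
{
    return (x & mask) == 0;
}

/* Without parentheses this parses as x & (mask == 0),
   which is a different expression altogether. */
int bits_clear_unparenthesised(unsigned x, unsigned mask)
{
    return x & mask == 0;
}
```

    With x = 2 and mask = 1 the first function returns 1 and the second returns 0, so here the parentheses are objectively necessary rather than a matter of style.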


    Tim Rentsch I'm sure will prefer the latter because 99.9% of C
    programmers are machines, according to him.

    Please give a reference for him saying that. (I'll save you the
    bother, he has not made any remarks remotely like this in c.l.c. since
    I have been here.)

    Find out what was the subject of the 99.9% (even if that was an exaggeration). Then we'll talk.

    Again, I am not responsible for what Tim (or anyone else) writes. If
    you have asked him for clarification, and he has not given a
    satisfactory answer, there's little more to do.


    No, he didn't use the word 'machines'; I paraphrased to suggest
    supernormal people who know everything and never make mistakes.

    You're going to argue about this now?

    Normally there is nothing wrong with paraphrasing, though in this
    discussion it would make a lot more sense to be precise about
    quotations. However, wildly exaggerating what someone says is not "paraphrasing". It is misrepresenting them, and is dishonest when done intentionally and knowingly.



    Presumably, the same 99.9% will not use indentation, and will write
    their programs all on one line anyway, because it is still after all
    completely unambiguous according to the C standard!

    Don't presume - you make a fool out of yourself every time you do.

    And you proceed to do exactly the same; Bart must be wrong, but you
    don't say about what!


    I am not presuming - I was making a comment based on past history. It
    would be nice if it changed, because you stop trying to guess
    what people think or might say, and stop distorting what they write.
    Put a bit more effort into reading people's posts, and less effort into
    the paranoia, and I'm sure you'll feel the threads are more productive.



  • From Scott Lurndal@3:633/10 to All on Fri Jun 5 14:04:19 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <1BoUR.3$lmCb.1@fx22.iad>, Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    [snip]
    <snip>
    Yeah, that's from `cc.c`, right?

    No, it's from cpp.c

    $ ls /work/reference/collegetapes/sltape/v6cc/
    c0.c c00.c c01.c c02.c c03.c c04.c c05.c c1.h
    c10.c c11.c c12.c c13.c c2.h c20.c c21.c cc.c cpp.c

    Oh interesting. I don't have a `cpp.c` in my v6 archive.

    I wonder what else I'm missing.

    For your archive, cpp.c

    #
    # include <stdio.h>
    /* C command */

    # define SBSIZE 15000
    # define SYMSIZ 1500
    # define TOKLEN 16
    # define DROP (-2)
    # define SAME 0
    # define MAXINC 10
    char sbf[SBSIZE];
    # define CHSPACE 1000
    char ts[CHSPACE+50];
    # define EXPSIZE 500
    char *strdex(), *copy(), *calloc(), *token(), *coptok();
    char *tsa ts;
    char *tsp ts;
    char *fnames[MAXINC];
    # define LINELEN 512
    FILE *fin;
    FILE *fout;
    int instring;
    char direct[50];
    int nd 1;
    char *dirs[10] {direct, 0};
    char nfil[100];
    int pflag;
    int depth;
    int skipcom;
    FILE *fins[MAXINC];
    int ifno;
    char *lp;
    char *line;
    # define NPREDEF 20
    char *prespc[NPREDEF];
    char **predef prespc;
    char *punspc[NPREDEF];
    char **prund punspc;
    char **predp;
    int lineno[MAXINC];
    int exfail;
    struct symtab {
    char name[TOKLEN];
    char *value;
    } *symtab, *lookup();
    struct symtab *defloc;
    struct symtab *udfloc;
    struct symtab *incloc;
    struct symtab *ifloc;
    struct symtab *elsloc;
    struct symtab *eifloc;
    struct symtab *ifdloc;
    struct symtab *ifnloc;
    struct symtab *sysloc;
    struct symtab *lneloc;
    struct symtab *prdloc;
    int trulvl;
    int flslvl;
    char *stringbuf;

    mainpp(argc,argv)
    char *argv[];
    {
    int i;
    # ifdef tgp
    int ifbrk;
    # endif
    char ln[LINELEN];
    register int c;
    register char *rlp;
    char *sp;
    struct symtab stab[SYMSIZ];

    fin = stdin;
    fout = stdout;
    # ifdef unix
    fnames[ifno=0] = "";
    # endif
    # ifdef gcos
    fnames[ifno=0] = "s*";
    # endif
    # ifdef ibm
    fnames[ifno=0] = "";
    # endif
    for(i=1; i<argc; i++)
    {
    switch(argv[i][0])
    {
    case '-':
    switch(argv[i][1])
    {
    case 'P':
    pflag++;
    case 'E':
    continue;
    case 'D':
    if (predef>prespc+NPREDEF)
    {
    error("too many -D options, ignoring %s",argv[i]);
    continue;
    }
    *predef++ = argv[i]+2;
    continue;
    case 'U':
    if (prund>punspc+NPREDEF)
    {
    error("too many -U options, ignoring %s",argv[i]);
    continue;
    }
    *prund++ = argv[i]+2;
    continue;
    case 'I':
    if (nd>8)
    error("excessive -I file (%s) ignored",argv[i]);
    else
    dirs[nd++] = argv[i]+2;
    continue;
    case '\0': continue;
    default:
    error("unknown flag %s", argv[i]);
    continue;
    }
    default:
    if (fin==stdin)
    {
    fin = fopen(argv[i], "r");
    if (fin==NULL)
    {
    error("No source file %s",argv[i]);
    exit(8);
    }
    fnames[ifno]=argv[i];
    strcpy(direct, argv[i]);
    for(sp=direct; *sp; sp++);
    while (sp>direct && *sp != '/') sp--;
    # ifdef unix
    if (sp==direct)
    *sp++ = '.';
    # endif
    *sp=0; /* direct now has place where source file is */
    }
    else
    if (fout==stdout)
    {
    fout= fopen(argv[i], "w");
    if (fout==NULL)
    {
    error("Can't write %s", argv[i]);
    exit(8);
    }
    }
    else
    error("extraneous name %s", argv[i]);
    }
    }

    fins[ifno]=fin;
    exfail = 0;
    /* after user -I files here are the standard include libraries */
    # ifdef unix
    dirs[nd++] = "/usr/include";
    # endif
    # ifdef gcos
    dirs[nd++] = "cc";
    # endif
    # ifdef ibm
    dirs[nd++] = "stdio.";
    # endif
    /* dirs[nd++] = "/compool"; */
    dirs[nd++] = 0;
    symtab = stab;
    for (c=0; c<SYMSIZ; c++) {
    stab[c].name[0] = '\0';
    stab[c].value = 0;
    }
    insym(&defloc, "define");
    insym(&udfloc, "undef");
    insym(&incloc, "include");
    insym(&elsloc, "else");
    insym(&eifloc, "endif");
    insym(&ifdloc, "ifdef");
    insym(&ifnloc, "ifndef");
    insym(&ifloc, "if");
    # ifdef unix
    insym(&sysloc, "unix");
    # endif
    # ifdef gcos
    insym (&sysloc, "gcos");
    # endif
    # ifdef ibm
    insym (&sysloc, "ibm");
    # endif
    insym(&lneloc, "line");
    predp=predef;
    while (predp>prespc)
    if (sp=strdex(*--predp, '='))
    {
    *sp++=0;
    stsym(*predp, sp);
    }
    else
    insym(&prdloc, *predp);
    predp=prund;
    while (predp>punspc)
    {
    if (sp=strdex(*--predp, '='))
    *sp++=0;
    lookup(*predp, DROP);
    }
    stringbuf = sbf;
    trulvl = 0;
    flslvl = 0;
    line = ln;
    lineno[0] = 1;
    if (pflag==0) fprintf(fout, "# 1 \"%s\"\n", fnames[ifno]);
    while(getline()) {
    skipcom=0;
    if (ln[0] != '#' && flslvl==0)
    {
    # ifdef tgp
    ifbrk= checklen(line);
    # endif
    for (rlp = line; c = *rlp++;)
    putc(c, fout);
    # ifdef tgp
    if (ifbrk)
    fprintf(fout,"\n# %d",lineno[ifno]);
    # endif
    }
    putc('\n', fout);
    }
    # ifdef tgp
    checklen(line);
    # endif
    for(rlp=line; c = *rlp++;)
    putc(c,fout);
    }

    getline()
    {
    register int c, sc, state;
    struct symtab *np;
    char *namep, *filname, **dirp;
    int filok, inctype;

    lp = line;
    *lp = '\0';
    state = 0;
    if ((c=getch()) == '#')
    state = 1;
    while (c!='\n' && c!='\0') {
    if (letter(c)) {
    namep = lp;
    sch(c);
    while (letnum(c=getch()))
    sch(c);
    sch('\0');
    lp--;
    if (state==6)
    {
    lookup(namep, DROP);
    goto out;
    }
    if (state>3 && state <6) {
    if (flslvl==0 &&(state+!lookup(namep,-1)->name[0])==5)
    trulvl++;
    else
    flslvl++;
    out:
    while (c!='\n' && c!= '\0')
    c = getch();
    return(c);
    }
    if (state==3) /* include */
    if (*namep != '"' && *namep != '<')
    {
    error("Bad include syntax", 0);
    state=1;
    }
    if (state!=2 || flslvl==0)
    {
    pushback(c);
    np = lookup(namep, state);
    c = getch();
    }
    if (state==1) {
    if (np==defloc)
    skipcom = state = 2;
    else if (np==incloc)
    state = 3;
    else if (np==ifnloc)
    state = 4;
    else if (np==ifdloc)
    state = 5;
    else if (np==eifloc) {
    if (flslvl)
    --flslvl;
    else if (trulvl)
    --trulvl;
    else errback("If-less endif",0);
    goto out;
    }
    else if (np==elsloc) {
    if (flslvl)
    --flslvl? ++flslvl : ++trulvl;
    else if (trulvl)
    {++flslvl; --trulvl;}
    else
    errback("If-less else",0);
    goto out;
    }
    else if (np==udfloc) {
    state=6;
    }
    else if (np==ifloc) {
    /*
    if (flslvl ==0 && yyparse())
    */ error("IF not implemented, true assumed",0); if (1)
    trulvl++;
    else
    flslvl++;
    return('\n');
    }
    else if (np==lneloc)
    {
    if(pflag==0) fprintf(fout, "# ");
    lp=line;
    for(; c !='\n' && c != '\0'; c=getch())
    if (!pflag)
    sch(c);
    sch('\0');
    return(c);
    }
    else {
    errback("Undefined control",0);
    while (c!='\n' && c!='\0')
    c = getch();
    return(c);
    }
    } else if (state==2) {
    if (flslvl)
    goto out;
    np->value = stringbuf;
    if (c != '\n' && c != 0)
    {
    savch(c);
    while ((c=getch())!='\n' && c!='\0')
    {
    if (c== '\\')
    {
    c = getch();
    if (c=='\n')continue;
    savch('\\');
    }
    savch(c);
    }
    }
    savch('\0');
    return(1);
    }
    continue;
    } else if ((sc=c) == '\'' || sc== '"' || (state==3 && sc== '<')) {
    sch(sc);
    filname = lp;
    inctype = sc=='<';
    if (sc== '<')
    {
    /*
    fprintf(fout==stdout?stderr:stdout, "note: include <> obsolete, use \"\"\n");
    */
    sc= '>';
    }
    instring++;
    while ((c=getch())!=sc && c!='\n' && c!='\0') {
    sch(c);
    if (c=='\\')
    sch(getch());
    }
    instring = 0;
    if (flslvl)
    goto out;
    if (state==3) {
    if (flslvl)
    goto out;
    *lp = '\0';
    while ((c=getch())!='\n' && c!='\0');
    if (ifno+1 >=MAXINC)
    error("Unreasonable include nesting",0);
    filok=0;
    for(dirp=dirs+inctype; *dirp; dirp++)
    {
    if (filname[0]=='/' || **dirp=='\0')
    strcpy(nfil,filname);
    else
    {
    strcpy(nfil,*dirp);
    # ifdef unix
    strcat(nfil, "/");
    # endif
    # ifdef gcos
    strcat(nfil, "/");
    # endif
    # ifdef ibm
    strcat(nfil, ".");
    # endif
    strcat(nfil, filname);
    }
    if ( (fins[ifno+1]=fopen(nfil, "r"))!=NULL)
    {
    filok=1;
    fin = fins[++ifno];
    break;
    }
    }
    if (filok==0)
    errback("Can't find include file %s", filname);
    else
    {
    if (pflag==0) fprintf(fout, "\n# 1 \"%s\"", filname);
    lineno[ifno]=1;
    fnames[ifno] = copy(filname);
    }
    return(c);
    }
    }
    sch(sc=c);
    c = getch();
    if (isdigit(sc))
    {
    for (;isalpha(c) || isdigit(c); c=getch())
    sch(c);
    }
    }
    sch('\0');
    if (state>1)
    errback("Control syntax",0);
    return(c);
    }
    insym(sp, namep)
    struct symtab **sp;
    char *namep;
    {
    register struct symtab *np;
    *sp = np = lookup(namep, 1);
    np -> value = np -> name;
    }

    stsym(namep, valp)
    char *namep, *valp;
    {
    register struct symtab *np;

    np = lookup(namep, 1);
    np->value = valp;
    }

    error(s, x)
    char *s;
    {
    FILE *efout;
    efout = fout==stdout ? stderr : stdout;
    if (fnames[ifno][0])
    fprintf(efout,"%s: %d: ", fnames[ifno], lineno[ifno]);
    fprintf(efout, s, x);
    putc('\n',efout);
    exfail++;
    }
    errback(s,x)
    char *s;
    {
    lineno[ifno]--;
    error(s,x);
    lineno[ifno]++;
    }

    sch(c)
    {
    register char *rlp;

    rlp = lp;
    if (rlp==line+LINELEN-2)
    error("Line overflow", 0);
    *rlp++ = c;
    if (rlp>line+LINELEN-1)
    rlp = line+LINELEN-1;
    lp = rlp;
    }

    savch(c)
    {
    *stringbuf++ = c;
    if (stringbuf-sbf < SBSIZE)
    return;
    error("Too much defining", 0);
    exit(exfail);
    }

    getch()
    {
    register int c, lastst;

    while ((c=getc1())=='/' && !instring)
    {
    if ((c=getc1())!='*')
    {
    pushback(c);
    return('/');
    }
    if (!skipcom)
    {putc('/',fout); putc('*', fout);}
    lastst=0;
    while ( (c = getc1()) != '\0')
    {
    if (lastst && c=='/')
    {
    if (!skipcom)
    putc('/', fout);
    break;
    }
    if (c=='\n' || !skipcom)
    putc(c, fout);
    lastst = (c=='*');
    }
    if (c=='\0')break;
    }
    return(c);
    }
    char pushbuff[EXPSIZE];
    char *pushp pushbuff;
    pushback(c)
    {
    *++pushp = c;
    if (pushp>pushbuff+EXPSIZE) {
    error("too much backup", 0);
    exit(8);
    }
    }

    getc1()
    {
    register c;

    if (*pushp !=0)
    return(*pushp--);
    depth=0;
    if ((c = getc(fin)) == EOF && ifno>0) {
    fclose(fin);
    fin = fins[--ifno];
    if (pflag==0) fprintf(fout, "\n# %d \"%s\"\n",lineno[ifno], fnames[ifno]);
    c = getc1();
    if (c=='\n') lineno[ifno]--;
    }
    if (c==EOF)
    return(0);
    if (c=='\n' )
    lineno[ifno]++;
    return(c);
    }

    struct symtab *
    lookup(namep, enterf)
    char *namep;
    {
    register char *np, *snp;
    register struct symtab *sp;
    int i, c, around;
    np = namep;
    snp = np+TOKLEN;
    around = i = 0;
    while ( (c = *np++ ) && (np-snp)<0)
    {
    i =+ c;
    }
    i =% SYMSIZ;
    sp = &symtab[i];
    while (sp->name[0]) {
    if (sp->name[0] != DROP)
    {
    snp = sp->name;
    np = namep;
    while (*snp++ == *np)
    if (*np++ == '\0' || np==namep+TOKLEN) {
    if (enterf==DROP)
    {
    sp->name[0]= DROP;
    return(sp);
    }
    if (!enterf)
    subst(namep, sp);
    return(sp);
    }
    }
    if (++sp >= &symtab[SYMSIZ])
    if (around++)
    {
    error("too many defines", 0);
    exit(exfail);
    }
    else
    sp = symtab;
    }
    if (enterf>0) {
    snp = namep;
    for (np = &sp->name[0]; np < &sp->name[TOKLEN];)
    if (*np++ = *snp)
    snp++;
    }
    return(sp);
    }
    char revbuff[200], *bp;
    backsch(c)
    {
    if (bp-revbuff > 200)
    error("Excessive define looping", bp--);
    *bp++ = c;
    }

    subst(np, sp)
    char *np;
    struct symtab *sp;
    {
    register char *vp;
    int macflg;

    lp = np;
    bp = revbuff;
    if (depth++>100)
    {
    error("define recursion loop on %s", np);
    return;
    }
    if ((vp = sp->value) == 0)
    return;
    macflg= (*vp == '(');
    /* arrange that define unix unix still
    has no effect, avoiding rescanning */
    while (blank(*vp))
    vp++;
    if (strcmp(sp->name,vp) == SAME)
    {
    while (*vp)
    sch(*vp++);
    return;
    }
    if (macflg)
    expdef(vp);
    else
    while (*vp)
    backsch(*vp++);
    while (bp>revbuff)
    pushback(*--bp);
    }




    char *
    copy(as)
    char as[];
    {
    register char *otsp, *s;
    int i;

    otsp = tsp;
    s = as;
    while(*tsp++ = *s++);
    if (tsp >tsa+CHSPACE)
    {
    # ifdef unix
    tsp = tsa = i = calloc(CHSPACE+50,sizeof(char));
    if (i== NULL)
    # endif
    {
    error("no space for file names", 0);
    exit(8);
    }
    }
    return(otsp);
    }


    expdef(proto)
    char *proto;
    {
    char buffer[EXPSIZE], *parg[20], *pval[20], name[20], *cspace, *wp;
    char protcop[EXPSIZE], *pr;
    int narg, k, c;
    pr = protcop;
    while (*pr++ = *proto++)
    if (pr>=protcop+EXPSIZE){
    error("define prototype too big", 0);
    exit(8);
    }
    proto= protcop;
    for (narg=0; (parg[narg] = token(&proto)) != 0; narg++)
    ;
    /* now scan input */
    cspace = buffer;
    while ((c=getch()) == ' ');
    if (c != '(')
    {
    error("defined function requires arguments", 0);
    return;
    }
    pushback(c);
    for(k=0; pval[k] = coptok(&cspace, buffer+EXPSIZE); k++);
    if (k!=narg)
    {
    error("define argument mismatch");
    return;
    }
    while (c= *proto++)
    {
    if (!letter(c))
    backsch(c);
    else
    {
    wp = name;
    *wp++ = c;
    while (letnum(*proto))
    *wp++ = *proto++;
    *wp = 0;
    for (k=0; k<narg; k++)
    if(strcmp(name,parg[k]) == SAME)
    break;
    wp = k <narg ? pval[k] : name;
    while (*wp) backsch(*wp++);
    }
    }
    }

    char *
    token(cpp) char **cpp;
    {
    char *val;
    int stc;
    stc = **cpp;
    *(*cpp)++ = '\0';
    if (stc==')') return(0);
    while (**cpp == ' ') (*cpp)++;
    for (val = *cpp; (stc= **cpp) != ',' && stc!= ')'; (*cpp)++)
    {
    if (!letnum(stc) || (val == *cpp && !letter(stc)))
    {
    error("define prototype argument error");
    return(0);
    }
    }
    return(val);
    }

    char *
    coptok (cpp, clim) char **cpp, *clim;
    {
    char *val;
    int stc, stop,paren;
    paren = stop = 0;
    val = *cpp;
    if (getch() == ')')
    return(0);
    while (((stc = getch()) != ',' && stc != ')' ) || paren > 0 || stop >0)
    {
    if (stc == '\0')
    {
    error("non terminated macro call", 0);
    val = 0;
    break;
    }
    if (stop == 0 && (stc == '"' || stc == '\''))
    stop = stc;
    else if (stc==stop)
    stop=0;
    if ( stc == '\\')
    {
    stc = getch();
    if (stop>0 || (stc != ',' && stc != '\\'))
    *(*cpp)++ = '\\';
    *(*cpp)++ = stc;
    }
    else
    {
    *(*cpp)++ = stc;
    if (stop==0)
    {
    if (stc == '(')
    paren++;
    if (stc == ')')
    paren--;
    }
    }
    if (*cpp >= clim)
    {
    error("define argument too long",0);
    exit(8);
    }
    }
    *(*cpp)++ = 0;
    pushback(stc);
    return(val);
    }
    letter(c)
    {
    if (isalpha(c) || c == '_')
    return (1);
    else
    return(0);
    }
    letnum(c)
    {
    if (letter(c) || isdigit(c))
    return(1);
    else
    return(0);
    }


    blank(c)
    {
    return(c==' ' || c== '\t');
    }

    char *
    strdex(s,c)
    char *s;
    {
    while (*s)
    if (*s==c)
    return(s);
    else
    s++;
    return(0);
    }
    # ifdef tgp
    # define MAXOUT 80
    checklen(sln)
    char *sln;
    {
    /* for tgp: scans string sln, and puts in newlines for blanks,
    where it likes, but to make lines less than MAXOUT chars long */

    char *p, *s, *st;
    int stopc, back, ifdone, c;
    st=s=sln;
    ifdone=p=stopc=back=0;
    while (c= *s++)
    {
    if (c == '\\')
    back=2;
    if (back==0)
    {
    if (stopc== c)
    stopc=0;
    else
    if (c == '"' || c == '\'')
    stopc= c;
    }
    if (back>0)back--;
    if (s-st >MAXOUT && p != 0)
    {
    st=p;
    *p= '\n';
    ifdone=1;
    }

    if (stopc==0 && back==0)
    if (c==' ') p=s-1;;
    }
    return(ifdone);
    }
    # endif


    main(argc,argv) char *argv[]; {
    exit(mainpp (argc,argv) );
    }


  • From Bart@3:633/10 to All on Fri Jun 5 16:50:21 2026
    On 05/06/2026 14:42, David Brown wrote:
    On 05/06/2026 13:39, Bart wrote:

    "a << (b + c)" has "more than needed" - that is objective.

    "a << (b + c)" does not have "too many" in an objective sense, because


    OK. Suppose "too many" /is/ subjective; what actual difference does it
    make to anything?

    I cannot speak for the intentions of others, but it has certainly been
    very frustrating trying to get you to understand the distinction between objective facts and subjective opinions,

    Why is that even important? I asked:

    Actual examples of too many parentheses?


    The reply was:

    The point of my comment is that either too many or too few is a
    subjective judgment, not an objective one.

    I didn't introduce this objective/subjective business. It seems now more
    like a ploy to devalue any arguments of mine, and also to evade
    answering; I'm still waiting for those examples from TR!

    These were his prior comments:

    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

    So he obviously has his own tolerance level. I would also guess those 'writers' belong to that 0.1%.

    I would actually agree that parentheses can add clutter, but not that
    the answer is to not use them when they are optional.

    In C they are often added because many of us (we are a lot more than 0.1%) need
    them to more easily parse code. That doesn't mean we are stupid.

    I suggested that minimising parentheses because the result is still 'unambiguous' is equivalent to doing away with indentation for the same reason.

    People didn't like that. Yet indentation and extra parentheses /are/
    both redundant.



  • From Keith Thompson@3:633/10 to All on Fri Jun 5 10:49:28 2026
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
    On 2026-06-05 01:49, Dan Cross wrote:
    [...]
    [...]

    [ ... (INT_MAX+1)*0 ]

    Furthermore, the expression above is obviously an integer
    constant expression as defined by sec 6.6 para 8. Section 6.6,
    para 4, reads in part, "Each constant expression shall evaluate
    to a constant that is in the range of representable values for
    its type." The expression, `(INT_MAX+1)*0` violates this
    constraint, and so therefore a diagnostic is mandated as per
    sec 5.1.1.3 para 1. That it appears in code that is not
    obviously called from `main` doesn't change that.

    I'm curious about that "violation"; a violation would require
    (at least) two sorts of logical preconditions. - The first is
    that all *sequentially* (literally) evaluated sub-expression
    values are representable as value - INT_MAX+1 certainly can't
    be represented in generated code that conforms to the abstract
    *mathematical* value - but is that necessary if _the whole_
    expression is (mathematically) just 0 (because of the final
    factor). And the second (related) is whether the order of the
    sub-expression evaluation is relevant; if we'd assume the
    expression evaluation to be considered from right to left then
    it would be irrelevant what's inside the parenthesis.

    If the expression were evaluated right to left, it would still
    compute INT_MAX+1, which is UB.

    Let's look at an example where it's not in a context that requires a
    constant expression:

    int n;
    n = (INT_MAX+1)*0;

    In the abstract machine, the RHS is evaluated by adding INT_MAX
    and 1 (which overflows, UB) and then multiplying the result by 0.

    A compiler is allowed, but not required, to reduce the assignment to
    `n = 0;`. If it does so, then no overflow occurs at run time --
    but the definedness of the behavior is determined independent of
    any optimizations. The C standard does not require any particular
    behavior. It can set n to 0 because that's a valid consequence
    of UB.
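
    Since the overflow is undefined in the abstract machine regardless of what an optimizer does, code that might add past INT_MAX has to test before adding. A minimal sketch of such a check follows; add_would_overflow is a hypothetical helper name, not a standard function.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: reports whether a + b would overflow int.
   Every intermediate value here stays within the range of int,
   so the check itself never triggers undefined behavior. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* INT_MAX - b cannot overflow when b > 0 */
    if (b < 0)
        return a < INT_MIN - b;   /* INT_MIN - b cannot overflow when b < 0 */
    return false;                 /* adding 0 never overflows */
}
```

    A caller would test add_would_overflow(INT_MAX, 1) and take an error path rather than evaluating the sum.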

    Let's take an example where it's definitely in a context that
    requires an integer constant expression:

    switch (0) {
    case (INT_MAX+1)*0:
    break;
    }

    The wording in 6.6 (Constant expressions) is slightly vague.
    For example, I would assume that any subexpression of a constant
    expression must be a constant expression, but it doesn't actually
    say so.

    But since, in the abstract machine, (INT_MAX+1)*0 doesn't yield
    any defined value, I'd say it violates the constraint that "Each
    constant expression shall evaluate to a constant that is in the
    range of representable values for its type".

    The alternative would be for it to be a constant expression for
    implementations that are able to recognize that anything multiplied
    by zero is zero (analysis that compilers aren't required to perform),
    and not for others.

    On the other hand, "An implementation may accept other forms of
    constant expressions; however, it is implementation-defined whether
    they are an integer constant expression." That probably allows,
    but does not require, an implementation to treat (INT_MAX+1)*0 as
    a constant expression with the value 0.
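
    For contrast, here is a sketch of a case label whose operands and subexpressions are all representable in int, so none of the questions above arise. The function name is hypothetical; the label still evaluates to 0.

```c
#include <limits.h>

/* Hypothetical example: returns 1 when x is 0, otherwise 0. */
int is_zero(int x)
{
    switch (x) {
    case (INT_MAX - 1) * 0:   /* INT_MAX - 1 is in range, and so is the product */
        return 1;
    default:
        return 0;
    }
}
```

    Every conforming compiler has to accept this label as an integer constant expression, independent of how clever its constant folding is.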

    From the standard quotes I cannot really recognize that these
    preconditions, how to determine UB/errors/violations, would be
    necessary.

    I'm no native speaker and I fear my question as formulated was
    hard to understand. It's basically the question of the standard
    implying (INT_MAX+1)*0 to be analyzed sequentially as written
    or whether it could as well analyze it from right to left and
    thus recognizing no problem, since from the mathematical view -
    but also practically - a concrete representable value of a here
    irrelevant sub-expression isn't necessary. Or another try of a
    (paraphrased) formulation; for the determination of constraint
    violations does the expression have strict (sort of) sequencing
    points _after each term_ (and each left-to-right sub-expression
    has to be well-defined) or can it be valued/analyzed as a whole
    not putting any preconditions about evaluation order etc. when
    determining the overall value?

    PS: One yet non-considered question that was part of my original
    post was: "Is there any rationale from the _software designer_'s perspective?"

    From a programmer's perspective, it's good to have consistent
    rules rather than leaving the decision of whether an expression
    is a constant expression up to the undocumented vagaries of how
    clever a compiler happens to be.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Keith Thompson@3:633/10 to All on Fri Jun 5 11:01:24 2026
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [...]
    The line `# 2 "hello.c"` is, according to the C standard, a
    "non-directive", which is a kind of directive. Executing a
    non-directive has undefined behavior,

    Since it is gcc that is generating the non-directives, for
    internal purposes, and gcc that is consuming them, it hardly
    seems worth worrying about whether their behavior is defined
    or not.

    I wasn't worried. I just mentioned it in passing.

    You quoted most of the article, but snipped relevant context in
    the middle of a sentence.
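
    For comparison with the `# 2 "hello.c"` line markers discussed above, the standard way to write the same information is the #line directive, whose behavior is defined. A minimal sketch, with hypothetical function names:

```c
/* Hypothetical helper: returns the line number the compiler assigns
   to the return statement after #line has renumbered the source. */
int presumed_line(void)
{
#line 100 "hello.c"
    return __LINE__;   /* this source line is now numbered 100 */
}

/* The presumed file name set by #line stays in effect from here on. */
const char *presumed_file(void)
{
    return __FILE__;   /* now "hello.c" */
}
```

    The directive makes the line that follows it line 100 of a file presumed to be named hello.c, which is the effect a preprocessor's line markers are meant to convey to the later compiler passes.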

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Keith Thompson@3:633/10 to All on Fri Jun 5 11:09:34 2026
    Bart <bc@freeuk.com> writes:
    On 05/06/2026 08:29, David Brown wrote:
    On 04/06/2026 21:29, Bart wrote:
    [...]
    TR:
    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

    BC:
    Actual examples of too many parentheses?

    TR:
    The point of my comment is that either too many or too few is a
    subjective judgment, not an objective one.

    Here it is clear that 'too many' was just a paraphrase of
    'unnecessary'.

    No, it is clear that "too many" and "unnecessary" have two different
    meanings.

    I think you and I agree that the parentheses in `a << (b + c)`
    are *unnecessary* (in the specific sense that they do not affect
    the semantics of the expression), but they are not *too many*
    (in the sense that they are helpful to most human readers).
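    A minimal sketch of that distinction, assuming nothing beyond standard C
    (the function names are illustrative and do not come from the thread).
    Both spellings compute the same value because additive operators bind
    more tightly than shift operators; the pragma only silences the
    parentheses hint that GCC and Clang can emit for the bare form.

```c
#include <assert.h>

/* GCC and Clang may suggest parentheses here under -Wparentheses;
   the pragma silences that hint for this illustration. */
#pragma GCC diagnostic ignored "-Wparentheses"

/* Unparenthesized: parsed as a << (b + c), since + binds tighter than <<. */
static unsigned shift_bare(unsigned a, unsigned b, unsigned c)
{
    return a << b + c;
}

/* Parenthesized: same semantics, with the grouping spelled out. */
static unsigned shift_explicit(unsigned a, unsigned b, unsigned c)
{
    return a << (b + c);
}

/* Hypothetical self-check: the two spellings agree. */
static void check_shift_grouping(void)
{
    assert(shift_bare(1u, 2u, 3u) == 32u);
    assert(shift_explicit(1u, 2u, 3u) == 32u);
    /* Grouping the other way gives a different answer: (1 << 2) + 3 == 7. */
    assert(((1u << 2u) + 3u) == 7u);
}
```

    Calling check_shift_grouping() from any test driver exercises both
    forms. That compilers offer such a hint at all suggests the extra
    parentheses are widely seen as a help to readers, not to the parser.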

    The idea that "too many" and "unnecessary" mean the same thing
    is your own invention.

    [...]

    Tim Rentsch I'm sure will prefer the latter because 99.9% of C
    programmers are machines, according to him.

    Please give a reference for him saying that. (I'll save you the
    bother, he has not made any remarks remotely like this in
    c.l.c. since I have been here.)

    Find out what was the subject of the 99.9% (even if that was an exaggeration). Then we'll talk.

    Only Tim can clarify that point, and he's made it clear that he's
    not interested in doing so. Please don't complain to the rest of
    us about that.

    No, he didn't use the word 'machines'; I paraphrased to suggest
    supernormal people who know everything and never make mistakes.

    You're going to argue about this now?

    Bart, when you make ridiculous and/or false statements, people are going
    to argue with you. When you double down on such statements, people are
    going to continue to argue with you.

    Your use of the word "machines" was ridiculous and false.

    [...]

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Keith Thompson@3:633/10 to All on Fri Jun 5 11:24:52 2026
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    Note that in a context that requires a constant expression, overflow is
    a constraint violation. For example, a case label like:

    case (INT_MAX + 1) * 0:

    must be diagnosed at compile time.
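    A hedged sketch of the in-range counterpart: a case label needs an
    integer constant expression, so the translator has to evaluate it. The
    name is_zero_label is made up for illustration, and the overflowing label
    from the post is left in a comment because a conforming compiler is
    expected to diagnose it.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 when x matches the constant-expression case label, else 0. */
static int is_zero_label(int x)
{
    switch (x) {
    case (INT_MAX - 1) * 0:   /* in range: the label value is 0 */
        return 1;
    /* case (INT_MAX + 1) * 0:   overflows inside a required constant
                                 expression, so a diagnostic is expected */
    default:
        return 0;
    }
}

/* Hypothetical self-check for the label above. */
static void check_case_label(void)
{
    assert(is_zero_label(0) == 1);
    assert(is_zero_label(7) == 0);
}
```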

    gcc disagrees with you.

    What makes you think so?

    [...]

    I'm skipping this and proceeding on to the original question.

    Why?

    You made a statement, "gcc disagrees with you". I demonstrated,
    in text that you snipped, that gcc does in fact agree with me.
    You were wrong. I don't know the basis of your error, so I asked.
    Or maybe I'm missing something, and you had a valid point that I
    didn't understand.

    You're not required to answer my question, which I think was
    an extremely reasonable one, but quoting it and then explicitly
    refusing to answer it is pointlessly rude.

    I'd like to know whether you still think you were right. If so,
    I'd like to see your explanation. If not, an admission that you
    made a mistake would be appreciated. But I expect neither from you.

    [SNIP]

    I see no basis for this belief. My conclusions are based on what
    the C standard actually says, rather than guesses about some
    unstated "intentions". I think you would do well to reach your
    conclusions based more on the actual text of the C standard, and
    less on your interpretation of what the text was "intended" to
    mean.

    The actual text of the standard implies that 42 is not an expression.
    I rely on the obvious intent to conclude that it is.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Tim Rentsch@3:633/10 to All on Fri Jun 5 11:53:05 2026
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    [...]

    The line `# 2 "hello.c"` is, according to the C standard, a
    "non-directive", which is a kind of directive. Executing a
    non-directive has undefined behavior,

    Since it is gcc that is generating the non-directives, for
    internal purposes, and gcc that is consuming them, it hardly
    seems worth worrying about whether their behavior is defined
    or not.

    I wasn't worried. I just mentioned it in passing.

    You quoted most of the article, but snipped relevant context in
    the middle of a sentence.

    It wasn't relevant to what I wanted to say.

  • From Bart@3:633/10 to All on Fri Jun 5 20:29:16 2026
    On 05/06/2026 19:09, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 05/06/2026 08:29, David Brown wrote:
    On 04/06/2026 21:29, Bart wrote:
    [...]
    TR:
    Sadly the idea of writing in a way that is "most easily understood"
    has resulted in a race to the bottom, where writers are more and
    more encouraged to take the view that (some) readers are pretty
    much arbitrarily stupid, with the result that expressions become
    littered with scads of unnecessary parentheses that actually
    detract from ease of reading. Good writing is always a balance
    between too much and too little.

    BC:
    Actual examples of too many parentheses?

    TR:
    The point of my comment is that either too many or too few is a
    subjective judgment, not an objective one.

    Here it is clear that 'too many' was just a paraphrase of
    'unnecessary'.

    No, it is clear that "too many" and "unnecessary" have two different meanings.

    I was replying to a comment that used "unnecessary" and "too much".
    Presumably they are connected.

    Maybe I should have asked for examples of "too much parentheses"!

    (How would you have phrased it? Bear in mind you will have had the
    benefit of dozens of posts showing the pitfalls in this group of choosing
    words that people will seize upon mercilessly.)

    The idea that "too many" and "unnecessary" mean the same thing
    is your own invention.

    But "too much" and "unnecessary" are perfectly fine!




    [...]

    Tim Rentsch I'm sure will prefer the latter because 99.9% of C
    programmers are machines, according to him.

    Please give a reference for him saying that. (I'll save you the
    bother, he has not made any remarks remotely like this in
    c.l.c. since I have been here.)

    Find out what was the subject of the 99.9% (even if that was an
    exaggeration). Then we'll talk.

    Only Tim can clarify that point, and he's made it clear that he's
    not interested in doing so. Please don't complain to the rest of
    us about that.

    No, he didn't use the word 'machines'; I paraphrased to suggest
    supernormal people who know everything and never make mistakes.

    You're going to argue about this now?

    Bart, when you make ridiculous and/or false statements, people are going
    to argue with you. When you double down on such statements, people are
    going to continue to argue with you.

    Your use of the word "machines" was ridiculous and false.

    But this statement from Tim isn't ridiculous at all:

    "If someone really can't learn the rules of expression syntax for the
    language they are using, they should be advised to try a different
    language, or perhaps give up programming altogether. It's silly to
    worry about something that 999 people out of a 1000 (and the actual
    numbers are undoubtedly much higher) are able to navigate without
    difficulty."

    999 out of 1000? And he says 'much higher' so, what, 99999 out of 100000?

    If C programmers were really that perfect, then they probably /are/
    machines (ie. AI).

    But I'm curious: why has nobody but me picked up on this exaggeration?



  • From Chris M. Thomasson@3:633/10 to All on Fri Jun 5 14:27:06 2026
    On 6/5/2026 5:58 AM, Waldek Hebisch wrote:
    Chris M. Thomasson <chris.m.thomasson.1@gmail.com> wrote:
    On 6/4/2026 4:44 PM, Bart wrote:
    On 05/06/2026 00:09, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 22:06, Keith Thompson wrote:
    Bart <bc@freeuk.com> writes:
    On 04/06/2026 19:54, David Brown wrote:
    [...]
    Again - /please/ stop trying to guess what people say or put words
    in their mouths. I can't remember ever seeing you do so accurately.
    This is what you actually said:

    It is an objective fact, therefore, that "(a*a) + (b*b)" has more
    parentheses than needed in the context of most programming languages.

    "(a*a) + (b*b) has too many parentheses", on the other hand, is a
    purely subjective opinion. Even if it is true that this is "commonly
    agreed to" (and AFAIK you have no basis for that claim), that would
    still be a subjective opinion - no matter how common that opinion is.

    You're saying that:

    * "more than needed" is objective
    * "too many" is subjective
    Stop it. He's not saying that.

    That is EXACTLY what he's saying: "It is an OBJECTIVE fact .. has more
    ... than needed", and:

      "has too many ... is ... purely subjective".

    You're taking phrases out of context and making false claims that the
    full statement was far more general than it actually was.

    And this is exactly what other people are doing.

    Taken literally, your statement implies that you admit that that's
    what you're doing. Is that what you meant? If so, I suggest you
    *stop* making such false claims. If not, what did you actually mean?

    So I used TOO MANY instead of MORE THAN NEEDED to describe the exact
    same phenomenon.

    That's not the problem. There is an actual meaningful distinction
    here, between what's needed by the compiler and what's useful to
    improve clarity for human readers. I have found some of what you've
    written to be unclear about that distinction.

    Can we agree that the question of whether parentheses in a C
    expression are necessary to the compiler can be answered objectively?
    Can we agree that the question of whether extra parentheses are
    helpful to a human reader is at least partly subjective, and
    varies from case to case? Is there really anything else that we
    fundamentally disagree about?

    (1) Why are you all making such a big fucking deal of this?

    Why are you?

    I didn't start this business of something being subjective or objective,
    or suggesting that one turn of phrase to discuss the same thing was
    subjective and the other objective (implying that a subjective opinion
    had less worth). TR started that and several people backed him up.

    Myself I wouldn't even use those terms. My point was that some overuses
    of () for commonly known precedences are more overkill than others.

    If that's subjective then so be it; it is not some fundamental law of
    the universe. I would just call it common sense.

    > Why are you?

    Since you ask, I was defending my point of view then got sidetracked by
    this subjective/objective nonsense. I notice that TR has disappeared
    from this subthread.


    Wrt the number of ()'s? Might as well go to sleep with the following
    song playing in the background:

    (The Fate of Ophelia - Taylor Swift (Lyrics) Charlie Puth ft. Selena
    Gomez, the weekd, ariana grande)

    AFAICS outer parentheses there are excessive, inner ones look OK.


    That's fine. Btw, have you ever looked at some of the generated code
    from the chaos pp lib?

  • From Dan Cross@3:633/10 to All on Sat Jun 6 03:10:20 2026
    In article <10vt7b9$pi3s$1@kst.eternal-september.org>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <10vsnl7$lkmu$1@kst.eternal-september.org>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <865x3yd21n.fsf@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <86ik81cfk5.fsf_-_@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    [...]
    There's an important distinction to make here. Consider this
    program:

    #include <limits.h>

    int
    foo(){
        int zero = (INT_MAX+1)*0;
        return zero;
    }

    int
    main(){
        return 0;
    }

    This program does not transgress the bounds of undefined behavior.

    To clarify, the comments in my posting were meant to be read as
    saying the given text is the entire program, and that it is strictly
    conforming with respect to conforming hosted implementations.
    (Incidentally, given the rules for freestanding implementations, I'm
    not sure that it is even possible for any program to be strictly
    conforming with respect to conforming freestanding implementations.
    In any case my statements were meant only in the context of hosted
    implementations.)

    Ok.

    [snip]
    Perhaps you mean that this is irrelevant because `foo` is not
    invoked, but I see no reason why that need be the case in e.g.
    a freestanding environment.

    I explained the context of my previous statements above. Sorry for
    not saying that in the original message.

    In a hosted environment, I don't
    think anything explicitly prevents `foo` from being called after
    `main` returns (though I can't imagine that would happen in real
    life; it would be weird if it did).

    The semantics described in the ISO C standard don't admit that
    possibility.

    Could you please point to where it says this, in the C standard?

    I cannot find anything that says that arbitrary code cannot run
    after `main()` returns, and I don't see how that could possibly
    be true.

    N3220 5.1.2.4, Program semantics.

    It defines the *observable behavior* of a program, which consists of
    accesses to volatile objects, data written to files, and I/O dynamics of
    interactive devices.

    Yes, but it does so for strictly-conforming programs with no UB.

    It does so for programs in general, not just strictly conforming
    ones. If a program has undefined behavior, all bets are off,
    but for example a program that evaluates `printf("%d\n", INT_MAX)`
    is not strictly conforming, but it's fully subject to 5.1.2.4.
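    A small sketch of that case, with snprintf standing in for printf so the
    output can be inspected (format_int_max and its self-check are
    illustrative names, not from the thread). The text produced depends on
    the implementation-defined value of INT_MAX, yet nothing here is
    undefined.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Writes into buf what printf("%d\n", INT_MAX) would print. The digits
   vary between implementations, which is why such a program is not
   strictly conforming even though its behavior is fully defined. */
static int format_int_max(char *buf, size_t size)
{
    return snprintf(buf, size, "%d\n", INT_MAX);
}

/* Hypothetical self-check for the formatter above. */
static void check_format_int_max(void)
{
    char buf[64];
    int written = format_int_max(buf, sizeof buf);

    assert(written > 0 && (size_t)written < sizeof buf);
    assert(strtol(buf, NULL, 10) == INT_MAX);  /* round-trips on this implementation */
    assert(INT_MAX >= 32767);                  /* the minimum the standard guarantees */
}
```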

    To understand conformance, we have to jump over to section 4,
    which explicitly says that, 'Undefined behavior is otherwise
    indicated in this document by the words "undefined behavior" or
    by the omission of any explicit definition of behavior.' As it
    does not say that a program with an instance of undefined
    behavior in an integer constant expression that is not executed
    must otherwise behave in any given manner, what the program does
    is undefined. A constraint violation mandates a diagnostic, but
    beyond that, the standard is (AFAICT) silent.

    I don't think an integer constant expression can have undefined
    behavior. INT_MAX+1 and 1/0 are not constant expressions, because
    neither "evaluate(s) to a constant that is in the range of
    representable values for its type".

    I claim that an expression that looks like a constant expression
    *isn't* a constant-expression if it doesn't appear in a context
    that requires a constant-expression.

    That's a bold claim, but I think I see why you're saying that.

    The program in question, quoted above, has:

    int zero = (INT_MAX+1)*0;

    `(INT_MAX+1)*0` is not a constant expression, not because of the
    overflow, but because a constant expression is not required in
    that context. "constant-expression" is defined by a production in
    the grammar (it reduces to "conditional-expression"). Even in

    int n = 42;

    42 is not a constant expression, because the grammar doesn't
    call for a constant expression in that context -- even though it
    looks like one. Similarly, in `a + b * c`, `a + b` looks like an
    additive expression, but it isn't one. (Not a perfect analogy.)

    Right; I see what you mean. In this case, the
    `assignment-expression` production applies, not
    `constant-expression`.

    Undefined Behavior, in turn, is not defined as specific only to
    execution: the standard simply says that it is "behavior, upon
    use of a *nonportable or erroneous program construct*..." for
    which there are no requirements, and there are examples of
    things that are explicitly UB at translation time, such as
    improperly terminated lexemes and so forth.

    Yes, there are constructs that are explicitly UB at translation time.
    (I think that's unfortunate, and there are efforts to clear up some
    such cases in C2y.)

    It's unclear to me how it could be any other way. If UB was
    _only_ an issue at runtime, then how could a compiler take
    advantage of it to perform optimizations during translation?
    We know that compilers do this.

    Signed integer overflow is not one of those constructs.

    This I'm not sure I agree with. If the compiler detects signed
    integer overflow in (perhaps not relevant in _this_ example) an
    integer constant expression, I still don't see anything that
    makes that anything other than UB. It's a constraint violation,
    sure, but nothing says it is not also UB.

    Any undefined behavior from evaluating INT_MAX+1 happens during
    execution (barring constraint violations).

    I'm not sure the standard says that. The standard says this
    happens during _evaluation_, and that evaluation must be
    performed in accordance with the rules of the abstract
    machine. But it doesn't precisely specify _when_ evaluation
    takes place, and in particular, there are places in the standard
    that explicitly mention evaluation during translation. I still
    don't see anything that prohibits a compiler from evaluating
    that expression at compile time (indeed, it clearly does, as it
    generates a diagnostic about the overflow).

    I suppose that changes the matter: does the language merely
    leave that unspecified, in which case, this program is not
    strictly conforming, or does it say that it _cannot_ make any
    translation-time decisions about it? I cannot find a satisfying
    argument for the latter.
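    One place where translation-time evaluation is not optional is a context
    that requires an integer constant expression, such as an enumerator value
    or a static assertion. A small sketch under that assumption, using an
    in-range expression so that nothing here overflows (ZERO_AT_TRANSLATION
    and zero_at_runtime are illustrative names):

```c
#include <assert.h>
#include <limits.h>

/* An enumerator value must be an integer constant expression, so the
   translator has to compute it. */
enum { ZERO_AT_TRANSLATION = (INT_MAX - 1) * 0 };

/* C11 static assertion: also checked during translation. */
_Static_assert(ZERO_AT_TRANSLATION == 0, "enumerator folded to zero");

/* The same shape with a run-time operand: defined for any x > INT_MIN,
   since x - 1 cannot overflow there. A compiler may fold it or not. */
static int zero_at_runtime(int x)
{
    return (x - 1) * 0;
}

/* Hypothetical self-check comparing the two. */
static void check_translation_time_zero(void)
{
    assert(ZERO_AT_TRANSLATION == 0);
    assert(zero_at_runtime(5) == ZERO_AT_TRANSLATION);
}
```

    Whether an implementation is also allowed to evaluate an ordinary
    initializer early, and act on what it finds, is the open question in
    this subthread; the sketch only shows the cases where early evaluation
    is required.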

    Furthermore, the expression above is obviously an integer
    constant expression as defined by sec 6.6 para 8. Section 6.6,
    para 4, reads in part, "Each constant expression shall evaluate
    to a constant that is in the range of representable values for
    its type." The expression, `(INT_MAX+1)*0` violates this
    constraint, and so therefore a diagnostic is mandated as per
    sec 5.1.1.3 para 1. That it appears in code that is not
    obviously called from `main` doesn't change that.

    It satisfies the requirements for an integer constant expression in
    6.6p8, but it violates the constraint in 6.6p4. (I presume that an
    "integer constant expression" must be a "constant expression".)
    But since "constant-expression" is a grammatical production,
    it doesn't have to satisfy that constraint, and no diagnostic
    is required. (A warning is certainly permitted.)

    Fair point. Its grammatical position makes it an
    assignment-expression. I clearly misinterpreted that before.

    Similarly, this:
    int n = INT_MAX + 1;
    at block scope doesn't require a diagnostic, though of course it
    has undefined behavior -- but at file scope, the initializer is a
    constant expression, so that would be a constraint violation.
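    A hedged sketch of the two positions, kept in range so that it compiles
    quietly (all identifiers are made up for illustration). The overflowing
    file-scope form from the post stays in a comment.

```c
#include <assert.h>
#include <limits.h>

/* File scope: static storage duration, so the initializer must be a
   constant expression. This one stays in range. */
static int at_file_scope = INT_MAX - 1 + 1;

/* static int bad = INT_MAX + 1;   would overflow in a constant expression */

/* Block scope: the initializer is an ordinary expression evaluated each
   time the function runs; it overflows only when x == INT_MAX. */
static int successor(int x)
{
    int n = x + 1;
    return n;
}

/* Hypothetical self-check for both objects. */
static void check_initializer_positions(void)
{
    assert(at_file_scope == INT_MAX);
    assert(successor(41) == 42);
}
```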

    Right. The semantics of this are defined in sec 6.7.11 para 5.

    Moreover, sec 6.6 para 17 says that, "the semantic rules for
    evaluation of a constant expression are the same as for
    nonconstant expressions." This brings us back to 5.1.2.4,
    though I submit that para (4) is a stronger argument for what
    you and Tim are saying, as it reads in part, "An actual
    implementation is not required to evaluate part of an expression
    if it can deduce that its value is not used and that no needed
    side effects are produced (including any caused by calling a
    function or through volatile access to an object)." I interpret
    this to mean that, if the implementation can determine that
    there is no way that `foo` can be called, it does not _have_ to
    evaluate the above expression. However, it must satisfy the
    range constraint from section 6.6, so it likely will, and in any
    event, the standard does not say that it, "shall not" evaluate
    it, or when.

    Overflow in a constant expression is not undefined behavior. It's a
    constraint violation. But that doesn't apply here, because the
    initializer is not a constant expression. (Sorry if I'm repeating
    myself.)

    Where does it say that UB and constraint violations are mutually
    exclusive? I don't see any such statement in the standard. Am
    I missing it?

    The standard says that if a constraint is violated, a diagnostic
    must be emitted, regardless of whether or not the constraint
    violation is the result of something that is UB or not; that is, if
    a constraint violation occurs due to something that is UB, the
    implementation must still emit a diagnostic: UB is not an escape
    hatch from that requirement.

    It also says, 'If a "shall" or "shall not" requirement that
    appears outside of a constraint or runtime-constraint is
    violated, the behavior is undefined. Undefined behavior is
    otherwise indicated in this document by the words "undefined
    behavior" or by the omission of any explicit definition of
    behavior.' However, that does not preclude a constraint violation
    from also being undefined behavior; it just means that the words
    "shall" and "shall not" inside a constraint do not, a priori, say
    whether the behavior is defined.

    Once the compiler does that, if it does, and observes UB, the
    standard is silent on what requirements it imposes, which means
    the behavior is undefined. I see no reason it couldn't arrange
    to invoke `foo` at that point.

    Any UB in the program would occur during execution,

    I suppose; but it's not clear to me that UB is tied _only_ to
    execution time.

    The standard is explicit that there _are_ things that are
    evaluated at translation time, like the initializer for an
    object with storage class `constexpr`. It is not clear to me that
    a compiler is otherwise _prohibited_ from evaluating an
    expression during translation; indeed, one could imagine it
    doing so to perform constant folding, and I do not believe there
    exists any normative text defining it as such.

    I realize this is an extreme interpretation, and not one that is
    widely shared. Personally, I think it's rather silly.

    However, that is _a_ danger of the informality of the C
    specification; it does not define the semantics of the abstract
    machine in the formally precise way that, say, the SML spec
    defines that language's semantics. Rather, it informally
    specifies them in prose, and that prose is ambiguous.

    Probably much good would be done if C's semantics _were_
    rigorously defined, but they are not. Thus, they are open to
    radical interpretation, and as extreme as those may be, I do not
    see how the normative text of the standard explicitly
    _prohibits_ them.

    and in fact
    it *won't* occur during execution because foo() isn't called.
    A compiler can't generate code with arbitrary behavior just because
    it can't prove that there will be no UB. If it could, every signed
    or floating-point arithmetic operation with unknown operand values
    would grant the same permission.
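    A sketch of the point about unknown operands, under the assumption that
    the caller is expected to keep results in range (add and checked_add are
    illustrative names). Plain signed addition is defined for every call
    whose mathematical result fits in int, so a translator cannot treat the
    mere possibility of overflow as licence for arbitrary code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Plain signed addition: undefined only for calls that actually overflow. */
static int add(int a, int b)
{
    return a + b;
}

/* Overflow-checked variant: reports failure instead of overflowing.
   The range tests themselves never overflow. */
static bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Hypothetical self-check for both functions. */
static void check_add(void)
{
    int sum = 0;

    assert(add(2, 3) == 5);
    assert(checked_add(2, 3, &sum) && sum == 5);
    assert(!checked_add(INT_MAX, 1, &sum));   /* would overflow: rejected */
    assert(!checked_add(INT_MIN, -1, &sum));  /* would overflow: rejected */
}
```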

    But that's not the situation here. The situation is that the
    compiler can prove that something _is_ UB.

    Regardless, I think you highlighted an actual problem with the
    spec; I don't think that behavior is _explicitly_ prohibited,
    therefore, it is likely undefined, but at a minimum unspecified,
    whether it actually could happen. If the argument against that
    is that this renders the language essentially unusable, then
    my response is, "yeah, well, welcome to programming in C in the
    2020s." Most compilers would never be that extreme, but I see
    no evidence that it would be an invalid reading of the
    literal text of the standard if they did.

    So no, I do not see how execution according to the rules of the
    abstract machine is guaranteed here. I certainly see no
    way in which this can be regarded as a strictly conforming
    program.

    foo()'s behavior would be undefined if it were called. It *isn't*
    called, so there's no actual UB. The program does not violate any
    of the other requirements for strict conformance.

    I understand _what_ you're saying: despite the expression itself
    manifesting undefined behavior, in this case it's not UB because
    `foo` is never executed. What I'm saying is that I don't see
    anything in the standard that restricts UB to _only_ executed
    code. A reputable compiler obviously instruments `foo` with
    code to trap into ubsan; if it's not UB, since it's not
    executed, then why do so? Granted, that's not evidence of
    anything other than the behavior of those compilers, but still.

    It is clearly the _intent_ that this be a strictly conforming
    program. The C standard, as an imprecise, informal document,
    cannot guarantee it.

    If the usual "Hello, world" program prints "Hello, world" followed
    by "Goodbye", the implementation is non-conforming. If it formats
    my hard drive after printing "Goodbye", it's non-conforming and
    dangerous.

    Two separate things. My point earlier was that code can
    obviously run after `main` terminates. Moreover, I can't
    imagine what would _prevent_ a runtime system that invokes
    `main` from doing something like printing, "PROGRAM STOPPED"
    after `main` returned. C imposes no requirements here.

    Yes, it does. An OS can print "PROGRAM STOPPED", but not as part
    of the execution of the program. On my system, a shell prompt is
    printed after a program terminates, but not by the program. If I
    execute a "hello, world" program with its output redirected to a file
    (on a system that supports that), the resulting file cannot contain
    "PROGRAM STOPPED". The requirements in 5.1.2.4 specify both what
    the execution of a program must do and what it must not do.

    Files are a separate case. There's no guarantee that the
    standard output refers to a file; it may well refer to an
    "interactive device", the semantics of which are (necessarily)
    unspecified.

    Here's an example: consider an interactive user who uses a
    screen reader device. Suppose that user makes use of an
    implementation that includes runtime support for that device,
    and that precedes invocation of `main` with a command sequence
    causing the screen reader to (perhaps) change intonation; and
    succeeds return from main by outputting another command sequence
    that resets to the original state.

    I do not see how C could prohibit that, assuming that the
    implementation takes care to detect whether standard output
    really refers to the screen reader, and does not emit the control
    sequences if output is redirected to a file. Another user who
    runs that same program without a screen reader may see the
    standard text printed on the screen, without the control
    sequence sandwich.

    I don't think a conforming implementation can prohibit that kind
    of thing.

    Whether foo() has external linkage or internal
    linkage doesn't change that.

    I disagree. There's no possible way for the implementation to
    know whether a function with external linkage will be ultimately
    invoked or not; consider a system that supports loadable shared
    modules. Nothing prevents even this simple program from being
    compiled as a shared module, dynamically loaded, the loading
    program explicitly searching for and finding the symbol
    corresponding to the `foo` function, and invoking it.

    Remember that linking is translation phase 8. The compiler is not
    the entire implementation.

    Exactly my point. The compiler cannot know how `foo` might be
    used, or how the translated object might be exercised.
    I don't see how it could possibly know that, given that `foo`
    has external linkage.

    We were presented with a complete translation unit that included a
    function definition for "main". It's a complete program. There's no
    valid way for some other program to call foo. If an OS provided such
    a mechanism, it would be outside the scope of C.

    Given an excessively pedantic and literal reading of the text of
    the standard, I don't think an implementation is explicitly
    prohibited from evaluating the initializer at translation time,
    deducing that the behavior is undefined, and blaming it on the
    program, at which point, all bets are off.

    Hence, the compiler _must_ treat with UB as written, which is
    why `ubsan` inserts trapping code in `foo`.

    I don't know what "_must_ treat with UB" means.

    foo() has undefined behavior if it's called, so replacing its
    body with trapping code is valid. But (I'm reasonably sure that)
    an implementation cannot reject a program just because it can't
    prove that it has no undefined behavior during execution. It can
    reject it if it can prove that it *always* has undefined behavior
    during execution.

    What I'm saying is that, `foo` has undefined behavior _period_.
    That's manifest in an integer constant expression, whether it is
    executed at runtime or not. I believe that the standard forces
    the expression to be evaluated at translation time, via the
    "shall" mandate when checking the constraint on the range in sec
    6.6 para 4. Further, that evaluation must happen in accordance
    with the rules of the abstract machine, as per 5.1.2.4 para 17.
    The diagnostic is mandated, as is the translation-time
    evaluation. The expression itself manifestly exhibits UB,
    and so therefore the result of the rest of the translation is
    undefined.

    foo is a function. foo does not have undefined behavior; it has no
    behavior at all. A *call* to foo during execution has undefined
    behavior. (`foo;` is a statement-expression that does nothing;
    it does not have undefined behavior.)

    The _evaluation_ of that expression in `foo` has undefined
    behavior. The standard does not say that it _cannot_ be
    evaluated at translation time.

    [SNIP]

    I think the question of whether the initializer is a
    constant-expression or not has caused some not entirely relevant
    confusion.

    Here's another example that avoids that issue.

    #include <limits.h>

    int foo(void) {
        int zero;
        zero = INT_MAX;
        zero ++;
        zero *= 0;
        return zero;
    }

    int main(void) {
        return 0;
    }

    Given my grammatical argument above, I would say that this program
    has no constant expressions.

    Agreed, if by "constant expressions" you mean those mandated to
    use the `constant-expression` grammatical production.

    Whether that argument is correct or
    not, it certainly has no constant expressions that violate any
    constraint or that have undefined behavior. Evaluating `zero ++`
    (which doesn't even pretend to be a constant expression) would have
    run-time undefined behavior -- *if* foo() were ever called.

    Let me turn this around in two ways: suppose that the
    translation unit _only_ included `foo`. Could the compiler
    deduce that the behavior of `foo`, if called, is undefined? If
    not, why not?

    Second, suppose that `foo` _were_ called, could the compiler
    replace this with a program that was the equivalent of,
    `int main(void) {printf("check your nose"); abort();}`? If so
    why? If not, why not?

    And given this translation unit, I don't think there's any way to
    construct a multi-TU program that calls foo, so a compiler *can*
    determine that foo is never called (but there's no requirement to
    do so, or to make any use of that information).

    This is the crux of my point, as well. There's no requirement
    for the translator to _not_ evaluate the expression and become
    privy to UB.

    Would it be stupid if a compiler did that? Yes. Do existing
    compilers do so? No, not that I'm aware of. Would some dweeb
    nerd compiler douche who thinks this would make a compiler
    benchmark some microfraction of a percent faster take advantage
    of that? I absolutely think so, yes.

    - Dan C.


  • From Dan Cross@3:633/10 to All on Sat Jun 6 03:22:03 2026
    In article <86bjdpayv0.fsf@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [snip]
    But taking a closer look at the standard, I'm not 100% sure that the
    language requires a diagnostic, though I think that's the intent.
    The relevant constraint is:

    Each constant expression shall evaluate to a constant that is
    in the range of representable values for its type.

    If I squint really hard, I can argue that the entire expression
    has to be a constant expression, but it doesn't say that its
    subexpressions are constant expressions -- and *if* INT_MAX +
    1 evaluates to INT_MIN in the current implementation, then
    (INT_MAX + 1) * 0 evaluates to 0 and therefore satisfies the
    constraint.

    My reasoning is as follows.

    To determine if the constraint is satisfied, the compiler must
    first evaluate the expression (INT_MAX + 1) * 0.

    To evaluate the expression (INT_MAX + 1) * 0, the compiler must
    first evaluate the sub-expression (INT_MAX + 1).

    Because the expression (INT_MAX + 1) overflows, the behavior is
    undefined, and the compiler is free to decide that the value of
    the sub-expression (INT_MAX + 1) is, let's say, 12.

    The compiler next evaluates the overall expression as 12*0, which
    is 0 (an int).

    This result of the overall expression satisfies the constraint,
    and so the compiler is not obliged to generate a diagnostic.

    The text of the standard explicitly carves this out; or, rather,
    it attempts to. If the result of an expression is not
    representable in the target type, _regardless of whether that's
    due to UB or not_, a diagnostic is required.

    But as it happens, I think I can see how your interpretation may
    be valid: if, as a result of UB, the expression evaluates to "0"
    (or 12 or something similar) that _is_ representable, then
    there _is no constraint violation_ and so no diagnostic is
    required.

    I do not believe that that is the intent. But it _is_
    conformant with the text of the standard.
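
    For contrast, a minimal sketch (not from the thread) of the same
    shape of expression in unsigned arithmetic, where wraparound is
    defined and the question of what the overflowing sub-expression
    "evaluates to" does not arise. The name `wrapped_zero` is
    hypothetical, and `static_assert` with a message assumes C11 or later.

```c
#include <assert.h>
#include <limits.h>

/* Unsigned arithmetic wraps modulo UINT_MAX + 1, so this is a valid
   integer constant expression with value 0, checked at translation time. */
static_assert((UINT_MAX + 1u) * 0u == 0u, "unsigned wraparound is defined");

/* The same expression evaluated in an ordinary function body. */
unsigned wrapped_zero(void)
{
    return (UINT_MAX + 1u) * 0u;
}
```

    The signed version, `(INT_MAX + 1) * 0`, has no such guarantee,
    which is what the disagreement above turns on.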

    This is a problem with the C standard: it is insufficiently
    precise, as the semantics of the language are not formally
    defined.

    [snip]
    I see no basis for this belief. My conclusions are based on what
    the C standard actually says, rather than guesses about some
    unstated "intentions". I think you would do well to reach your
    conclusions based more on the actual text of the C standard, and
    less on your interpretation of what the text was "intended" to
    mean.

    The same could be said to you, as well. There exists a reading
    of the standard by which your `foo`-containing program is not
    strictly conforming. But that way lies madness; C is not a
    formally specified language. Given that as an objective fact,
    we must accept intent, consistency, and other "soft" aspects
    when considering its definition.

    That sort of sucks, but here we are.

    - Dan C.


  • From Dan Cross@3:633/10 to All on Sat Jun 6 03:44:26 2026
    In article <10vu703$11s5q$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 05/06/2026 08:53, Tim Rentsch wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:

    In article <10vsrpo$men2$2@dont-email.me>, Bart <bc@freeuk.com> wrote:

    On 04/06/2026 22:06, Keith Thompson wrote:

    Bart <bc@freeuk.com> writes:

    [snip]
    Tim Rentsch I'm sure will prefer the latter because 99.9% of C
    programmers are machines, according to him.

    Tim didn't say or imply that.

    So what was his 99.9% all about? Nobody has a clue, except they are
    certain that what I think it is is wrong!

    Have you thought about, I don't know, maybe asking him?

    Asking him straight questions is usually futile. You can probably guess
    this from the response below.

    I agree that that response was both unhelpful and hypocritical.

    Notice he hasn't tried to enlighten anyone about that 99.9%.

    I think my explanation was actually pretty close. YMMV.

    That may just have been a throwaway line like when I say 'nobody likes
    X', but I would still dispute that, if it's about what I think it is,
    it's anything like a super-majority.

    The point still stands. You should know your audience:
    comp.lang.c is a forum that prizes a certain kind of semantic
    precision. Perhaps your intent when you say things of the form,
    "X has too many parentheses" is to be informal; it will
    certainly not be taken that way here. And you _do_ have a track
    record of being wrong enough that you are unlikely to be
    afforded the benefit of the doubt.

    At the risk of saying what may be obvious to everyone, Bart has
    shown that he has no interest in having a serious, constructive,
    useful, or productive conversation with anyone. His questions
    are all rhetorical; he hasn't asked me a straight question
    because he isn't really interested in what I would say. In
    short, Bart isn't looking for an answer, he's looking for an
    argument. My recommendation is just stop responding to him
    altogether. My response to him upthread was a sincere effort to
    provide a neutral and helpful answer to his question. Maybe my
    remarks were helpful to other people, and if they were that's
    good. Any further efforts to interact with Bart are not just a
    waste of time but actually counterproductive. What Bart needs is
    not help with understanding C but a good therapist. In any case
    I'm confident that whatever Bart's needs may be, no one responding
    to his postings here is in a position to provide them. Please
    consider these remarks before responding to him further.

    Generally speaking, AFAIK, none of the regular posters here are
    qualified mental health professionals; as such, we should all
    refrain from making armchair psychological diagnoses, the
    occasional mildly off-color joke aside ("that's crazy!").

    Stick to C, Tim.

    - Dan C.


  • From Dan Cross@3:633/10 to All on Sat Jun 6 03:45:04 2026
    In article <86jysdb1yr.fsf@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    I didn't read Bart's posting. Unfortunately it seems
    true that any continued interaction with his comments
    is counterproductive.

    As is your response. I, for one, can conceive of no purpose to
    it other than to goad him. Do better.

    - Dan C.


  • From Dan Cross@3:633/10 to All on Sat Jun 6 03:49:30 2026
    In article <DHAUR.47540$0o1c.29921@fx08.iad>,
    Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <1BoUR.3$lmCb.1@fx22.iad>, Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    [snip]
    <snip>
    Yeah, that's from `cc.c`, right?

    No, it's from cpp.c

    $ ls /work/reference/collegetapes/sltape/v6cc/
    c0.c c00.c c01.c c02.c c03.c c04.c c05.c c1.h
    c10.c c11.c c12.c c13.c c2.h c20.c c21.c cc.c cpp.c

    Oh interesting. I don't have a `cpp.c` in my v6 archive.

    I wonder what else I'm missing.

    [snip]

    Thanks! This is an artifact definitely worth preserving. As
    far as I know, it's not in any of the extant V6 archives. I'll
    shoot you an email, if that's ok.

    - Dan C.


  • From Janis Papanagnou@3:633/10 to All on Sat Jun 6 07:39:20 2026
    On 2026-06-06 05:44, Dan Cross wrote:
    In article <10vu703$11s5q$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 05/06/2026 08:53, Tim Rentsch wrote:
    [...]
    [...]
    [...]

    Generally speaking, AFAIK, none of the regular posters here are
    qualified mental health professionals; as such, we should all
    refrain from making armchair psychological diagnoses, the
    occasional mildly off-color joke aside ("that's crazy!").

    Do we need to know about the particle physics mechanics of
    H -> He fusion or Einstein's E = m c^2 to understand that
    our sun is emitting energy, giving us light and warms us?

    Janis :-}


  • From Keith Thompson@3:633/10 to All on Fri Jun 5 23:50:49 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <10vt7b9$pi3s$1@kst.eternal-september.org>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <10vsnl7$lkmu$1@kst.eternal-september.org>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <865x3yd21n.fsf@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <86ik81cfk5.fsf_-_@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    [...]
    There's an important distinction to make here. Consider this
    program:

    #include <limits.h>

    int
    foo(){
        int zero = (INT_MAX+1)*0;
        return zero;
    }

    int
    main(){
        return 0;
    }

    This program does not transgress the bounds of undefined behavior.

    To clarify, the comments in my posting were meant to be read as
    saying the given text is the entire program, and that it is strictly
    conforming with respect to conforming hosted implementations.
    (Incidentally, given the rules for freestanding implementations, I'm
    not sure that it is even possible for any program to be strictly
    conforming with respect to conforming freestanding implementations.
    In any case my statements were meant only in the context of hosted
    implementations.)

    Ok.

    [snip]
    Perhaps you mean that this is irrelevant because `foo` is not
    invoked, but I see no reason why that need be the case in e.g.
    a freestanding environment.

    I explained the context of my previous statements above. Sorry for
    not saying that in the original message.

    In a hosted environment, I don't
    think anything explicitly prevents `foo` from being called after
    `main` returns (though I can't imagine that would happen in real
    life; it would be weird if it did).

    The semantics described in the ISO C standard don't admit that
    possibility.

    Could you please point to where it says this, in the C standard?

    I cannot find anything that says that arbitrary code cannot run
    after `main()` returns, and I don't see how that could possibly
    be true.

    N3220 5.1.2.4, Program semantics.

    It defines the *observable behavior* of a program, which consists of
    accesses to volatile objects, data written to files, and I/O dynamics of
    interactive devices.

    Yes, but it does so for strictly-conforming programs with no UB.

    It does so for programs in general, not just strictly conforming
    ones. If a program has undefined behavior, all bets are off,
    but for example a program that evaluates `printf("%d\n", INT_MAX)`
    is not strictly conforming, but it's fully subject to 5.1.2.4.

    To understand conformance, we have to jump over to section 4,
    which explicitly says that, 'Undefined behavior is otherwise
    indicated in this document by the words "undefined behavior" or
    by the omission of any explicit definition of behavior.' As it
    does not say that a program with an instance of undefined
    behavior in an integer constant expression that is not executed
    must otherwise behave in any given manner, what the program does
    is undefined. A constraint violation mandates a diagnostic, but
    beyond that, the standard is (AFAICT) silent.

    I don't think an integer constant expression can have undefined
    behavior. INT_MAX+1 and 1/0 are not constant expressions, because
    neither "evaluate(s) to a constant that is in the range of
    representable values for its type".

    I claim that an expression that looks like a constant expression
    *isn't* a constant-expression if it doesn't appear in a context
    that requires a constant-expression.

    That's a bold claim, but I think I see why you're saying that.

    The program in question, quoted above, has:

    int zero = (INT_MAX+1)*0;

    `(INT_MAX+1)*0` is not a constant expression, not because of the
    overflow, but because a constant expression is not required in
    that context. "constant-expression" is defined by a production in
    the grammar (it reduces to "conditional-expression"). Even in

    int n = 42;

    42 is not a constant expression, because the grammar doesn't
    call for a constant expression in that context -- even though it
    looks like one. Similarly, in `a + b * c`, `a + b` looks like an
    additive expression, but it isn't one. (Not a perfect analogy.)

    Right; I see what you mean. In this case, the
    `assignment-expression` production applies, not
    `constant-expression`.

    Undefined Behavior, in turn, is not defined as specific only to
    execution: the standard simply says that it is "behavior, upon
    use of a *nonportable or erroneous program construct*..." for
    which there are no requirements, and there are examples of
    things that are explicitly UB at translation time, such as
    improperly terminated lexemes and so forth.

    Yes, there are constructs that are explicitly UB at translation time.
    (I think that's unfortunate, and there are efforts to clear up some
    such cases in C2y.)

    It's unclear to me how it could be any other way. If UB was
    _only_ an issue at runtime, then how could a compiler take
    advantage of it to perform optimizations during translation?
    We know that compilers do this.

    There are instances of undefined behavior that depend on specific
    characteristics of a source file, not on run-time behavior.
    The first example I found (N3220) is in the description of
    translation phase 4, 5.1.1.2:

    If a character sequence that matches the syntax of a universal
    character name is produced by token concatenation (6.10.5.3), the
    behavior is undefined.

    That's something that can be detected during compilation. It would
    be far better if it were either well defined or a syntax rule
    violation. And in fact the latest C2y draft doesn't have that
    wording. There's an ongoing effort to clean up this kind of thing.

    That's not the kind of UB I'm talking about.

    Signed integer overflow is not one of those constructs.

    This I'm not sure I agree with. If the compiler detects signed
    integer overflow in (perhaps not relevant in _this_ example) an
    integer constant expression, I still don't see anything that
    makes that anything other than UB. It's a constraint violation,
    sure, but nothing says it is not also UB.

    An implementation can choose to successfully translate a program that
    violates a constraint. In my opinion, the resulting program has (or
    should be considered to have) undefined behavior, but the standard
    doesn't explicitly say so. My argument is based on the definition
    of "constraint": "restriction, either syntactic or semantic,
    by which the exposition of language elements is interpreted".
    If a constraint is violated, I argue that there is no basis for
    interpreting the exposition of language elements, and therefore no
    definition of the behavior.

    Other interpretations are possible.

    So if an overflow in an ICE has undefined behavior, it's merely
    an instance of this more general principle, which might even not
    be valid.

    An unambiguous case is:

    case INT_MAX+1:

    That's a constraint violation. The expression is required to be an
    ICE, but it doesn't "evaluate to a constant that is in the range
    of representable values for its type" (unless you want to argue
    that it can evaluate to INT_MIN for a particular implementation,
    but I really dislike the implications of that). If there's UB,
    it's because of the constraint violation. (In fact I'd expect most
    compilers to reject it, so there's no behavior at all.)

    On the other hand, this:

    int n = INT_MAX;
    n++;

    has undefined behavior and is not a constraint violation. A note on
    the definition of "undefined behavior" says:

    Possible undefined behavior ranges from ignoring the situation
    completely with unpredictable results, to behaving during
    translation or program execution in a documented manner
    characteristic of the environment (with or without the issuance of a
    diagnostic message), to terminating a translation or execution (with
    the issuance of a diagnostic message).

    So a compiler can reject it *if* it can prove that the undefined
    behavior will always occur. The standard is not 100% clear about
    whether it can be rejected if the code is never executed, or is
    executed conditionally, but I think that's not permitted, or at least
    it shouldn't be. Rejecting code because the compiler can't prove
    the behavior is undefined has some very unpleasant implications.
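
    As a minimal sketch (not from the thread) of how the overflowing
    `n++` shown earlier in this post can be kept inside defined
    behavior: test against `INT_MAX` before incrementing, so the
    overflowing operation is never evaluated. `checked_increment` is a
    hypothetical helper name.

```c
#include <limits.h>
#include <stdbool.h>

/* Increment *n only when the result is representable as an int.
   Returns false and leaves *n untouched when n++ would overflow. */
bool checked_increment(int *n)
{
    if (*n == INT_MAX) {
        return false;   /* the increment would be undefined behavior */
    }
    ++*n;
    return true;
}
```

    With this guard the undefined operation is simply never reached,
    so there is nothing for a compiler to reason about.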

    Any undefined behavior from evaluating INT_MAX+1 happens during
    execution (barring constraint violations).

    I'm not sure the standard says that. The standard says this
    happens during _evaluation_, and that evaluation must be
    performed in accordance with the rules of the abstract
    machine. But it doesn't precisely specify _when_ evaluation
    takes place, and in particular, there are places in the standard
    that explicitly mention evaluation during translation. I still
    don't see anything that prohibits a compiler from evaluating
    that expression at compile time (indeed, it clearly does, as it
    generates a diagnostic about the overflow).

    I suppose that changes the matter: does the language merely
    leave that unspecified, in which case, this program is not
    strictly conforming, or does it say that it _cannot_ make any
    translation-time decisions about it? I cannot find a satisfying
    argument for the latter.

    Ok, given:

    case INT_MAX+1:

    a compiler could issue the required diagnostic for the constraint
    violation as a non-fatal warning, then generate code that executes
    an ADD instruction with operands INT_MAX and 1. That would be
    conforming but silly. The compiler has to determine that INT_MAX+1
    overflows anyway so it can issue the diagnostic.

    Furthermore, the expression above is obviously an integer
    constant expression as defined by sec 6.6 para 8. Section 6.6,
    para 4, reads in part, "Each constant expression shall evaluate
    to a constant that is in the range of representable values for
    its type." The expression, `(INT_MAX+1)*0` violates this
    constraint, and so therefore a diagnostic is mandated as per
    sec 5.1.1.3 para 1. That it appears in code that is not
    obviously called from `main` doesn't change that.

    It satisfies the requirements for an integer constant expression in
    6.6p8, but it violates the constraint in 6.6p4. (I presume that an
    "integer constant expression" must be a "constant expression".)
    But since "constant-expression" is a grammatical production,
    it doesn't have to satisfy that constraint, and no diagnostic
    is required. (A warning is certainly permitted.)

    Fair point. Its grammatical position makes it an
    assignment-expression. I clearly misinterpreted that before.

    Similarly, this:
    int n = INT_MAX + 1;
    at block scope doesn't require a diagnostic, though of course it
    has undefined behavior -- but at file scope, the initializer is a
    constant expression, so that would be a constraint violation.
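
    To make the distinction concrete, here is a minimal sketch (not
    from the thread) that places in-range expressions in contexts that
    require a constant expression, next to a block-scope initializer
    that does not. The names `small`, `BIG`, and `classify` are
    hypothetical. The overflowing variants are left in comments so the
    block itself stays well-defined; the remarks on them restate the
    positions argued above rather than settle them.

```c
#include <limits.h>

/* File scope: the initializer of an object with static storage
   duration is required to be a constant expression. */
static int small = INT_MAX - 1;
/* static int bad = INT_MAX + 1;   argued above to be a constraint violation */

/* An enumerator's value uses the constant-expression production. */
enum { BIG = INT_MAX - 1 };

int classify(int x)
{
    /* Block scope: the grammar asks only for an assignment-expression. */
    int n = INT_MAX - 1;
    /* int m = INT_MAX + 1;        no diagnostic required; UB if evaluated */

    switch (x) {
    case INT_MAX - 1:              /* case labels require an integer constant expression */
        return 1;
    /* case INT_MAX + 1:           constraint violation, diagnostic required */
    default:
        return (n == small && BIG == n) ? 0 : -1;
    }
}
```

    Calling `classify(INT_MAX - 1)` takes the `case` branch, and any
    other argument falls through to `default`.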

    Right. The semantics of this are defined in sec 6.7.11 para 5.

    Moreover, sec 6.6 para 17 says that, "the semantic rules for
    evaluation of a constant expression are the same as for
    nonconstant expressions." This brings us back to 5.1.2.4,
    though I submit that para (4) is a stronger argument for what
    you and Tim are saying, as it reads in part, "An actual
    implementation is not required to evaluate part of an expression
    if it can deduce that its value is not used and that no needed
    side effects are produced (including any caused by calling a
    function or through volatile access to an object)." I interpret
    this to mean that, if the implementation can determine that
    there is no way that `foo` can be called, it does not _have_ to
    evaluate the above expression. However, it must satisfy the
    range constraint from section 6.6, so it likely will, and in any
    event, the standard does not say that it, "shall not" evaluate
    it, or when.

    Overflow in a constant expression is not undefined behavior. It's a
    constraint violation. But that doesn't apply here, because the
    initializer is not a constant expression. (Sorry if I'm repeating
    myself.)

    Where does it say that UB and constraint violations are mutually
    exclusive? I don't see any such statement in the standard. Am
    I missing it?

    It doesn't.

    As a practical matter, when I look at C code, if it violates a
    constraint, I typically don't care about its behavior. I want it
    to be rejected at compile time (unless it's deliberately taking
    advantage of a documented extension). I'll fix it rather than
    worrying about its behavior.

    (Unless the code has somehow gotten into production and it's my
    job to analyze how it misbehaves.)

    Yes, a program that violates a constraint can have run-time behavior if
    the compiler chooses not to reject it, and that behavior may be
    undefined.

    The standard says that if a constraint is violated, a diagnostic
    must be emitted, regardless of whether or not the constraint
    violation is the result of something that is UB or not; that is, if
    a constraint violation occurs due to something that is UB, the
    implementation must still emit a diagnostic: UB is not an escape
    hatch from that requirement.

    Right.

    It also says, 'If a "shall" or "shall not" requirement that
    appears outside of a constraint or runtime-constraint is
    violated, the behavior is undefined. Undefined behavior is
    otherwise indicated in this document by the words "undefined
    behavior" or by the omission of any explicit definition of
    behavior.' However, that does not preclude such behavior being
    undefined; it just means that the words "shall" and "shall not"
    in a constraint violation do not a priori describe behavior vis
    definition.

    Right.

    Once the compiler does that, if it does, and observes UB, the
    standard is silent on what requirements it imposes, which means
    the behavior is undefined. I see no reason it couldn't arrange
    to invoke `foo` at that point.

    Any UB in the program would occur during execution,

    I suppose; but it's not clear to me that UB is tied _only_ to
    execution time.

    The standard is explicit that there _are_ things that are
    evaluated at translation time, like the initializer for an
    object with storage class `constexpr`. It is not clear to me that
    a compiler is otherwise _prohibited_ from evaluating an
    expression during translation; indeed, one could imagine it
    doing so to perform constant folding, and I do not believe there
    exists any normative text defining it as such.

    Certainly a compiler can, but need not, evaluate any expression at
    compile time if it's able to:

    int n;
    n = 2 + 2;

    I'd be surprised to see an ADD instruction in the generated code, but
    a naive compiler could certainly generate one. For that matter, a
    perverse compiler could generate code that adds 3 and 1 or divides 28
    by 7. Anything that implements the required *observable behavior*
    (5.1.2.4 Program semantics) is acceptable. Executing an ADD
    instruction is not part of the observable behavior.
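
    A minimal sketch (not from the thread) of that point: three
    hypothetical functions that reach the value 4 by different routes.
    Under the observable-behavior rule in 5.1.2.4, often called the
    "as-if" rule, an implementation may emit any of these, or just the
    constant, for `n = 2 + 2;`, because no caller can tell them apart.

```c
/* Already folded: what an optimizing compiler would most likely emit. */
int four_folded(void)
{
    return 4;
}

/* The naive translation: an actual addition at run time. */
int four_added(void)
{
    int n;
    n = 2 + 2;
    return n;
}

/* The "perverse" translation mentioned above: divide 28 by 7. */
int four_divided(void)
{
    return 28 / 7;
}
```

    Which instructions are executed is not observable; only the
    resulting value is.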

    I realize this is an extreme interpretation, and not one that is
    widely shared. Personally, I think it's rather silly.

    However, that is _a_ danger of the informality of the C
    specification; it does not define the semantics of the abstract
    machine in the formally precise way that, say, the SML spec
    defines that language's semantics. Rather, it informally
    specifies them in prose, and that prose is ambiguous.

    There have been attempts to define C's semantics formally, but
    those attempts are not part of the standard. Fully defining C's
    semantics formally rather than in English would, I imagine,
    be a *lot* of work -- and fewer people would be able to understand
    the specification or work on it.

    Probably much good would be done if C's semantics _were_
    rigorously defined, but they are not. Thus, they are open to
    radical interpretation, and as extreme as those may be, I do not
    see how the normative text of the standard explicitly
    _prohibits_ them.

    and in fact
    it *won't* occur during execution because foo() isn't called.
    A compiler can't generate code with arbitrary behavior just because
    it can't prove that there will be no UB. If it could, every signed
    or floating-point arithmetic operation with unknown operand values
    would grant the same permission.

    But that's not the situation here. The situation is that the
    compiler can prove that something _is_ UB.

    In the program quoted at the top of this post, the UB occurs in
    a function foo() that's never called. A compiler can replace the
    body of foo() with a trap, and it can certainly warn about the UB,
    but I don't believe it can reject the entire program. A clever
    compiler could prove that the UB never occurs.

    A naive compiler that performs no optimizations would generate
    code for foo() that attempts to compute (INT_MAX+1)*0 step by
    step, without recognizing the overflow, and that code would never
    be executed.

    Regardless, I think you highlighted an actual problem with the
    spec; I don't think that behavior is _explicitly_ prohibited,
    therefore, it is likely undefined, but at a minimum unspecified,
    whether it actually could happen. If the argument against that
    is that this renders the language essentially unusable, then
    my response is, "yeah, well, welcome to programming in C in the
    2020s." Most compilers would never be that extreme, but I see
    no evidence that it would be an invalid reading of the
    literal text of the standard if they did.

    So no, I do not see how execution according to the rules of the
    abstract machine is not guaranteed, here. I certainly see no
    way in which this can be regarded as a strictly conforming
    program.

    foo()'s behavior would be undefined if it were called. It *isn't*
    called, so there's no actual UB. The program does not violate any
    of the other requirements for strict conformance.

    I understand _what_ you're saying: despite the expression itself
    manifesting undefined behavior, in this case it's not UB because
    `foo` is never executed. What I'm saying is that I don't see
    anything in the standard that restricts UB to _only_ executed
    code. A reputable compiler obviously instruments `foo` with
    code to trap into ubsan; if it's not UB, since it's not
    executed, then why do so? Granted, that's not evidence of
    anything other than the behavior of those compilers, but still.

    Probably the compiler generated the trap code because it didn't
    (yet?) know whether foo is ever called. If it were clever enough
    to prove that foo is never called, it could generate no code for
    it at all.

    The note on the definition of undefined behavior is a bit vague.
    It permits terminating a translation in response to UB, but that
    doesn't address exactly when it can do so. I believe it can do so
    only when it can prove that the UB always occurs, but that's not
    clearly stated.

    However, the behavior of the program as a whole is clearly defined.
    It returns a status of 0 from main and does nothing else.
    A conforming implementation *must* generate code that implements
    that behavior.

    Another argument (subject to interpretation of wording): Undefined
    behavior is "behavior, **upon use** of a nonportable or erroneous
    program construct or of erroneous data, for which this document
    imposes no requirements". The overflowing expression within foo()
    is never *used*, so there is no undefined behavior.

    To put it another way, undefined behavior is behavior. Something
    that never occurs is not behavior.

    It is clearly the _intent_ that this be a strictly conforming
    program. The C standard, as an imprecise, informal document,
    cannot guarantee it.

    If the usual "Hello, world" program prints "Hello, world" followed
    by "Goodbye", the implementation is non-conforming. If it formats
    my hard drive after printing "Goodbye", it's non-conforming and
    dangerous.

    Two separate things. My point earlier was that code can
    obviously run after `main` terminates. Moreover, I can't
    imagine what would _prevent_ a runtime system that invokes
    `main` from doing something like printing, "PROGRAM STOPPED"
    after `main` returned. C imposes no requirements here.

    Yes, it does. An OS can print "PROGRAM STOPPED", but not as part
    of the execution of the program. On my system, a shell prompt is
    printed after a program terminates, but not by the program. If I
    execute a "hello, world" program with its output redirected to a file
    (on a system that supports that), the resulting file cannot contain
    "PROGRAM STOPPED". The requirements in 5.1.2.4 specify both what
    the execution of a program must do and what it must not do.

    Files are a separate case. There's no guarantee that the
    standard output refers to a file; it may well refer to an
    "interactive device", the semantics of which are (necessarily)
    unspecified.

    The requirements for "observable behavior" cover both files and
    interactive devices.

    Here's an example: consider an interactive user who uses a
    screen reader device. Suppose that user makes use of an
    implementation that includes runtime support for that device,
    and that precedes invocation of `main` with a command sequence
    causing the screen reader to (perhaps) change intonation; and
    succeeds return from main by outputting another command sequence
    that resets to the original state.

    I do not see how C could prohibit that, assuming that the
    implementation takes care to detect whether standard output
    really refers to the screen reader, and does not emit the control
    sequences if output is redirected to a file. Another user who
    runs that same program without a screen reader may see the
    standard text printed on the screen, without the control
    sequence sandwich.

    I don't think a conforming implementation can prohibit that kind
    of thing.

    I agree. printf("hello, world\n") must write that string to standard
    output, which may be a file or an interactive device. Just what
    that means is unspecified or implementation-defined. It might be
    printed in EBCDIC or incised into clay tablets. Closing stdout,
    which occurs when main() terminates, might involve firing the tablet
    or emitting control sequences for a screen reader.

    Whether foo() has external linkage or internal
    linkage doesn't change that.

    I disagree. There's no possible way for the implementation to
    know whether a function with external linkage will be ultimately
    invoked or not; consider a system that supports loadable shared
    modules. Nothing prevents even this simple program from being
    compiled as a shared module, dynamically loaded, the loading
    program explicitly searching for and finding the symbol
    corresponding to the `foo` function, and invoking it.

    Remember that linking is translation phase 8. The compiler is not
    the entire implementation.

    Exactly my point. The compiler cannot know how `foo` might be
    used, or how the translated object might be exercised. I don't
    see how it could possibly know that, given that `foo` has
    external linkage.

    We were presented with a complete translation unit that included a
    function definition for "main". It's a complete program. There's no
    valid way for some other program to call foo. If an OS provided such
    a mechanism, it would be outside the scope of C.

    Given an excessively pedantic and literal reading of the text of
    the standard, I don't think an implementation is explicitly
    prohibited from evaluating the initializer at translation time,
    deducing that the behavior is undefined, and blaming it on the
    program, at which point, all bets are off.

    An implementation can certainly evaluate the initializer at
    translation time, deduce that the behavior would be undefined
    *if the initializer were evaluated*, and blame it on the program.
    That doesn't mean it can reject a strictly conforming program.

    Hence, the compiler _must_ treat with UB as written, which is
    why `ubsan` inserts trapping code in `foo`.

    I don't know what "_must_ treat with UB" means.

    foo() has undefined behavior if it's called, so replacing its
    body with trapping code is valid. But (I'm reasonably sure that)
    an implementation cannot reject a program just because it can't
    prove that it has no undefined behavior during execution. It can
    reject it if it can prove that it *always* has undefined behavior
    during execution.

    What I'm saying is that, `foo` has undefined behavior _period_.
    That's manifest in an integer constant expression, whether it is
    executed at runtime or not. I believe that the standard forces
    the expression to be evaluated at translation time, via the
    "shall" mandate when checking the constraint on the range in sec
    6.6 para 4. Further, that evaluation must happen in accordance
    with the rules of the abstract machine, as per 5.1.2.4 para 17.
    The diagnostic is mandated, as is the translation-time
    evaluation. The expression itself manifestly exhibits UB,
    and so therefore the result of the rest of the translation is
    undefined.

    foo is a function. foo does not have undefined behavior; it has no
    behavior at all. A *call* to foo during execution has undefined
    behavior. (`foo;` is a statement-expression that does nothing;
    it does not have undefined behavior.)

    The _evaluation_ of that expression in `foo` has undefined
    behavior. The standard does not say that it _cannot_ be
    evaluated at translation time.

    If a compiler sees a subexpression INT_MAX+1 it can attempt to
    evaluate it at compile time. But it can't just blindly add the
    values if overflow would cause a fatal trap, crashing the compiler.
    That would be a serious compiler bug. The behavior *of the compiler*
    is not undefined.
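
    As a sketch of what "can't just blindly add the values" could look
    like inside a constant folder: the helper below checks the range
    first and only then performs the addition, so the folder itself never
    executes a signed overflow. The name fold_add_int is hypothetical and
    is not taken from any real compiler.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical constant-folding helper: stores a + b in *out and returns
   true when the sum is representable as int; otherwise returns false and
   leaves *out untouched, so the caller can diagnose or defer to run time. */
bool fold_add_int(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;   /* the addition would overflow; do not perform it */
    *out = a + b;
    return true;
}
```

    A folder built this way can report "INT_MAX+1 is not representable"
    without anything undefined happening in the compiler's own execution.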

    [SNIP]

    I think the question of whether the initializer is a
    constant-expression or not has caused some not entirely relevant
    confusion.

    Here's another example that avoids that issue.

    #include <limits.h>

    int foo(void) {
        int zero;
        zero = INT_MAX;
        zero ++;
        zero *= 0;
        return zero;
    }

    int main(void) {
        return 0;
    }

    Given my grammatical argument above, I would say that this program
    has no constant expressions.

    Agreed, if by "constant expressions" you mean those mandated to
    use the `constant-expression` grammatical production.

    Yes, that's what I mean by it.

    Whether that argument is correct or
    not, it certainly has no constant expressions that violate any
    constraint or that have undefined behavior. Evaluating `zero ++`
    (which doesn't even pretend to be a constant expression) would have
    run-time undefined behavior -- *if* foo() were ever called.

    Let me turn this around in two ways: suppose that the
    translation unit _only_ included `foo`. Could the compiler
    deduce that the behavior of `foo`, if called, is undefined? If
    not, why not?

    Certainly.

    Second, suppose that `foo` _were_ called, could the compiler
    replace this with a program that was the equivalent of,
    `int main(void) {printf("check your nose"); abort();}`? If so
    why? If not, why not?

    Yes, if foo were called in every possible execution of the program,
    the program's behavior would be undefined. The compiler could also
    reject it.

    And given this translation unit, I don't think there's any way to
    construct a multi-TU program that calls foo, so a compiler *can*
    determine that foo is never called (but there's no requirement to
    do so, or to make any use of that information).

    This is the crux of my point, as well. There's no requirement
    for the translator to _not_ evaluate the expression and become
    privy to UB.

    I believe there is. The program is strictly conforming, which means,
    among other things, that it does not produce output depending on any
    undefined behavior. There is no undefined behavior because foo() is
    never called.

    A *strictly conforming program* shall use only those features of the
    language and library specified in this document. It shall not
    produce output dependent on any unspecified, undefined, or
    implementation-defined behavior, and shall not exceed any minimum
    implementation limit.

    ...

    A *conforming hosted implementation* shall accept any strictly
    conforming program.

    An implementation that rejects the program quoted at the top of this
    article is non-conforming.

    Would it be stupid if a compiler did that? Yes. Do existing
    compilers do so? No, not that I'm aware of. Would some dweeb
    nerd compiler douche who thinks this would make a compiler
    benchmark some microfraction of a percent faster take advantage
    of that? I absolutely think so, yes.

    And I'd submit a bug report.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Keith Thompson@3:633/10 to All on Fri Jun 5 23:56:52 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    [...]
    The text of the standard explicitly carves this out; or, rather,
    it attempts to. If the result of an expression is not
    representable in the target type, _regardless of whether that's
    due to UB or not_, a diagnostic is required.
    [...]

    How would an expression (appearing in a context that requires an
    integer constant expression) not "evaluate to a constant that is in
    the range of representable values for its type" other than by UB?
    I can't think of an example, but I'd be interested in seeing one.

    Note in particular that UINT_MAX+1U is well defined, not an overflow.
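
    A minimal sketch of that last point, assuming a C11 or later
    compiler: static_assert requires an integer constant expression, and
    UINT_MAX + 1U is accepted there because unsigned arithmetic wraps
    instead of overflowing. The function name next_unsigned is made up
    for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Accepted in a context that requires an integer constant expression:
   unsigned arithmetic is reduced modulo UINT_MAX + 1, so the result 0U
   is within the range of unsigned int. */
static_assert(UINT_MAX + 1U == 0U, "unsigned addition wraps to zero");

/* The same wraparound at run time; well defined for every argument. */
unsigned next_unsigned(unsigned x)
{
    return x + 1U;
}
```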

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Scott Lurndal@3:633/10 to All on Sat Jun 6 15:13:30 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <DHAUR.47540$0o1c.29921@fx08.iad>,
    Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <1BoUR.3$lmCb.1@fx22.iad>, Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    [snip]
    <snip>
    Yeah, that's from `cc.c`, right?

    No, it's from cpp.c

    $ ls /work/reference/collegetapes/sltape/v6cc/
    c0.c c00.c c01.c c02.c c03.c c04.c c05.c c1.h
    c10.c c11.c c12.c c13.c c2.h c20.c c21.c cc.c cpp.c

    Oh interesting. I don't have a `cpp.c` in my v6 archive.

    I wonder what else I'm missing.

    [snip]

    Thanks! This is an artifact definitely worth preserving. As
    far as I know, it's not in any of the extant V6 archives. I'll
    shoot you an email, if that's ok.

    - Dan C.


  • From Scott Lurndal@3:633/10 to All on Sat Jun 6 17:53:01 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <DHAUR.47540$0o1c.29921@fx08.iad>,
    Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <1BoUR.3$lmCb.1@fx22.iad>, Scott Lurndal <slp53@pacbell.net> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    [snip]
    <snip>
    Yeah, that's from `cc.c`, right?

    No, it's from cpp.c

    $ ls /work/reference/collegetapes/sltape/v6cc/
    c0.c c00.c c01.c c02.c c03.c c04.c c05.c c1.h
    c10.c c11.c c12.c c13.c c2.h c20.c c21.c cc.c cpp.c

    Oh interesting. I don't have a `cpp.c` in my v6 archive.

    I wonder what else I'm missing.

    [snip]

    Thanks! This is an artifact definitely worth preserving. As
    far as I know, it's not in any of the extant V6 archives. I'll
    shoot you an email, if that's ok.

    A version of cpp that was used with the portable C compiler (PCC)
    is here.

    It has a -C option to preserve comments in the processed output.

    https://github.com/IanHarvey/pcc/blob/master/cc/cpp/cpp.c

  • From Tim Rentsch@3:633/10 to All on Sat Jun 6 15:47:07 2026
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    I claim that an expression that looks like a constant expression
    *isn't* a constant-expression if it doesn't appear in a context
    that requires a constant-expression.

    Right. This question came up years ago in a Defect Report. The
    response from the Committee was basically the same as what you
    said: the 6.6 constraints for constant expressions apply only in
    situations where the C standard expressly requires a constant
    expression. (I don't have the DR in front of me; I'm summarizing
    based on memory, but am confident the actual wording is consistent
    with what I just said.)

  • From dave_thompson_2@3:633/10 to All on Sat Jun 6 19:02:11 2026
    On Mon, 1 Jun 2026 09:52:08 +0200, David Brown
    <david.brown@hesbynett.no> wrote:

    On 31/05/2026 19:11, Bart wrote:
    ...
    Actual examples of too many parentheses?

    Any source code written in LISP :-)

    (And for too few parentheses, any source code in Forth.)

    FORTH uses parentheses for stack diagrams -- a semi-standard type of
    comment/documentation -- and of course good code (using my subjective
    definition of good :-) ) always has sufficient documentation :-)

  • From Tim Rentsch@3:633/10 to All on Sat Jun 6 16:15:05 2026
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    PS: One yet non-considered question that was part of my original
    post was: "Is there any rationale from the _software designer_'s perspective?"

    I didn't respond to your original question because it was based on a
    misconception. Whether a given expression is a constant expression,
    in the sense of needing to satisfy the constraints of 6.6, depends
    not on the form of the expression but on the context in which it
    appears. The 6.6 constraints apply only in situations where the C
    standard expressly requires a constant expression. Other cases,
    such as a use like this

    int
    whatever(){
        int r = (int)(-1u/2) + 1;
        return r;
    }

    do not need to satisfy the 6.6 constraints, because the C standard
    doesn't require a constant expression in that context. (Note that
    the initializing expression for 'r' does overflow the range of int
    in implementations where UINT_MAX == INT_MAX*2+1.)
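
    A rough way to check, without executing the overflow, whether a given
    implementation is one where that initializer overflows: do the
    division in unsigned arithmetic and compare against INT_MAX. The
    helper name init_would_overflow is hypothetical.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 when (int)(-1u/2) + 1 would overflow int
   on this implementation, that is, when -1u/2 converts to exactly INT_MAX.
   Everything here is unsigned arithmetic or a comparison, so the check
   itself has no undefined behavior. */
int init_would_overflow(void)
{
    unsigned half = -1u / 2;            /* UINT_MAX / 2 */
    return half == (unsigned)INT_MAX;   /* INT_MAX + 1 is not representable */
}
```

    On implementations where int and unsigned int have the same width and
    no padding bits, this is expected to return 1.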

  • From Keith Thompson@3:633/10 to All on Sat Jun 6 16:36:14 2026
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    I claim that an expression that looks like a constant expression
    *isn't* a constant-expression if it doesn't appear in a context
    that requires a constant-expression.

    Right. This question came up years ago in a Defect Report. The
    response from the Committee was basically the same as what you
    said: the 6.6 constraints for constant expressions apply only in
    situations where the C standard expressly requires a constant
    expression. (I don't have the DR in front of me; I'm summarizing
    based on memory, but am confident the actual wording is consistent
    with what I just said.)

    C99 DR 261 looks similar to what you're talking about.

    https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_261.htm

    The Committee Response section says:

    In general, the interpretation of an expression for constantness
    is context sensitive. For any expression which contains only
    constants:

    - If the syntax or context only permits a constant expression, the
    constraints of 6.6#3 and 6.6#4 shall apply.
    - Otherwise, if the expression meets the requirements of 6.6
    (including any form accepted in accordance with 6.6#10), it is a
    constant expression.
    - Otherwise it is not a constant expression.

    That's close to what I claimed, but the second bullet point differs.
    My claim was that, given:

    n = 2+2;

    2+2 is not a constant expression because the grammar doesn't require
    a constant expression in that context. The Committee's opinion
    (at least at the time) was that it is a constant expression because
    it meets the requirements of 6.6.

    But I *think* it's a distinction without a difference. Calling 2+2
    a constant expression has no effect on the semantics, and does not
    require or forbid the implementation from, for example, generating
    an ADD instruction. The distinction would matter for an expression
    that has UB and/or does not yield a value of the type, but that
    falls through to the third bullet.

    I found another interesting tidbit, C90 DR 031, relevant to another
    point I made elsethread:

    https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_031.html

    case (INT_MAX*4)/4: is a constraint violation.
    When subclause 6.4 says on page 55, lines 11-12:
    Each constant expression shall evaluate to a constant that is in
    the range of representable values for its type.
    the Committee's judgement of the intent is that the
    ``representable'' requirement applies to each subexpression of a
    constant expression, as shown in the third example. A constant
    expression is meant as defined by the syntax rules.
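
    To make the context dependence concrete, here is a small sketch
    (identifiers made up for illustration) showing the same expression
    2 + 2 in contexts where the grammar requires a constant expression
    and in one where it does not. The overflowing case label from DR 031
    is left in a comment, since per the Committee response it is a
    constraint violation.

```c
enum { FOUR = 2 + 2 };     /* enumerator value: constant expression required */
int table[2 + 2];          /* file-scope array size: constant expression required */

int is_four(int n)
{
    switch (n) {
    case 2 + 2:            /* case label: constant expression required */
        return 1;
    /* case (INT_MAX*4)/4:    constraint violation per the DR 031 response */
    default:
        return 0;
    }
}

int sum(void)
{
    int n;
    n = 2 + 2;             /* ordinary assignment: no constant expression required */
    return n;
}
```

    Only the first three uses are subject to the 6.6 constraints; the
    assignment in sum() is the "n = 2+2;" case discussed above.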

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Tim Rentsch@3:633/10 to All on Sat Jun 6 16:43:53 2026
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    I claim that an expression that looks like a constant expression
    *isn't* a constant-expression if it doesn't appear in a context
    that requires a constant-expression.

    Right. This question came up years ago in a Defect Report. The
    response from the Committee was basically the same as what you
    said: the 6.6 constraints for constant expressions apply only in
    situations where the C standard expressly requires a constant
    expression. (I don't have the DR in front of me; I'm summarizing
    based on memory, but am confident the actual wording is consistent
    with what I just said.)

    C99 DR 261 looks similar to what you're talking about.

    https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_261.htm

    The Committee Response section says:

    In general, the interpretation of an expression for constantness
    is context sensitive. For any expression which contains only
    constants:

    - If the syntax or context only permits a constant expression, the
    constraints of 6.6#3 and 6.6#4 shall apply.
    - Otherwise, if the expression meets the requirements of 6.6
    (including any form accepted in accordance with 6.6#10), it is a
    constant expression.
    - Otherwise it is not a constant expression.

    That's close to what I claimed, but the second bullet point differs.
    My claim was that, given:

    n = 2+2;

    2+2 is not a constant expression because the grammar doesn't require
    a constant expression in that context. The Committee's opinion
    (at least at the time) was that it is a constant expression because
    it meets the requirements of 6.6.

    But I *think* it's a distinction without a difference. [...]

    Right. The key point is that the constraints need to be satisfied
    only in situations where the C standard expressly requires a
    constant expression. Whether a given expression is called a
    "constant expression" doesn't matter; all that does matter is
    whether the constraints need to be satisfied.

  • From Keith Thompson@3:633/10 to All on Sat Jun 6 17:41:34 2026
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [...]
    That's close to what I claimed, but the second bullet point differs.
    My claim was that, given:

    n = 2+2;

    2+2 is not a constant expression because the grammar doesn't require
    a constant expression in that context. The Committee's opinion
    (at least at the time) was that it is a constant expression because
    it meets the requirements of 6.6.

    But I *think* it's a distinction without a difference. [...]

    Right. The key point is that the constraints need to be satisfied
    only in situations where the C standard expressly requires a
    constant expression. Whether a given expression is called a
    "constant expression" doesn't matter; all that does matter is
    whether the constraints need to be satisfied.

    Well, it matters a little bit, at least to me, even though the
    distinction doesn't seem to affect the validity or semantics of
    any C code.

    A clear and unambiguous definition of what is or is not a "constant
    expression" would make the language just a bit easier to understand
    and explain. I'd even be satisfied with the definition given in
    the DR *if* it were clearly expressed in the standard.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Tim Rentsch@3:633/10 to All on Sat Jun 6 18:06:37 2026
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    cross@spitfire.i.gajendra.net (Dan Cross) writes:

    In article <865x3yd21n.fsf@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    cross@spitfire.i.gajendra.net (Dan Cross) writes:

    In article <86ik81cfk5.fsf_-_@linuxsc.com>,
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    [...]

    There's an important distinction to make here. Consider this
    program:

    #include <limits.h>

    int
    foo(){
        int zero = (INT_MAX+1)*0;
        return zero;
    }

    int
    main(){
        return 0;
    }

    This program does not transgress the bounds of undefined behavior.

    To clarify, the comments in my posting were meant to be read as
    saying the given text is the entire program, and that it is strictly
    conforming with respect to conforming hosted implementations.
    (Incidentally, given the rules for freestanding implementations, I'm
    not sure that it is even possible for any program to be strictly
    conforming with respect to conforming freestanding implementations.
    In any case my statements were meant only in the context of hosted
    implementations.)

    [...]

    foo() has undefined behavior if it's called, so replacing its
    body with trapping code is valid.

    Right.

    But (I'm reasonably sure that)
    an implementation cannot reject a program just because it can't
    prove that it has no undefined behavior during execution. [...]

    Right.

    In your example, `foo` clearly exhibits UB; I think your
    argument is whether that has a realized effect or not, since the
    UB is not invoked. I'm saying that in general a compiler cannot
    possibly know that when it compiles `foo`, and is free to assume
    the worst.

    foo() exhibits UB if and only if it's called during execution.

    Right.

  • From Dan Cross@3:633/10 to All on Sun Jun 7 13:37:35 2026
    In article <1100gbk$1lt8i$2@kst.eternal-september.org>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    [...]
    The text of the standard explicitly carves this out; or, rather,
    it attempts to. If the result of an expression is not
    representable in the target type, _regardless of whether that's
    due to UB or not_, a diagnostic is required.
    [...]

    How would an expression (appearing in a context that requires an
    integer constant expression) not "evaluate to a constant that is in
    the range of representable values for its type" other than by UB?

    It wouldn't. But because it's UB, it could evaluate to
    anything, including something that didn't violate the
    constraint.

    I can't think of an example, but I'd be interested in seeing one.

    In terms of a practical, working compiler? I doubt that one
    exists.

    Note in particular that UINT_MAX+1U is well defined, not an overflow.

    Yes.

    - Dan C.


  • From Keith Thompson@3:633/10 to All on Sun Jun 7 15:09:43 2026
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    In article <1100gbk$1lt8i$2@kst.eternal-september.org>,
    Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    cross@spitfire.i.gajendra.net (Dan Cross) writes:
    [...]
    The text of the standard explicitly carves this out; or, rather,
    it attempts to. If the result of an expression is not
    representable in the target type, _regardless of whether that's
    due to UB or not_, a diagnostic is required.
    [...]

    How would an expression (appearing in a context that requires an
    integer constant expression) not "evaluate to a constant that is in
    the range of representable values for its type" other than by UB?

    It wouldn't. But because it's UB, it could evaluate to
    anything, including something that didn't violate the
    constraint.

    I can't think of an example, but I'd be interested in seeing one.

    In terms of a practical, working compiler? I doubt that one
    exists.

    I actually meant in terms of the standard, not of any particular
    compiler.

    I can't think of an example, but maybe someone else can.

    [...]

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
