• Re: Parentheses

    From Keith Thompson@3:633/10 to All on Wed Jun 3 16:24:17 2026
    antispam@fricas.org (Waldek Hebisch) writes:
    Bart <bc@freeuk.com> wrote:
    Personally I don't have much use for CSTs for a normal compiler, but
    they might be useful for source-to-source translators, or programs that
    do source refactoring, where you want to preserve extras such as
    parentheses even if they're not strictly needed.

    (Injecting the right parentheses for examples like `(a + b) * c', which
    would have an AST like '(* (+ a b) c)', is surprisingly tricky. Easier to
    just follow the original source!)

    You probably mean some more complicated example. This one is
    easy:

    (10) -> parse("(a + b) * c")

    (10) (* (+ a b) c)

    (11) -> unparse(parse("(a + b) * c"))

    (11) "(a+b)*c"
    [snip]

    What tool are you using?

    [...]

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */

    --- PyGate Linux v1.5.15
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Bart@3:633/10 to All on Thu Jun 4 01:12:32 2026
    On 03/06/2026 23:30, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    Personally I don't have much use for CSTs for a normal compiler, but
    they might be useful for source-to-source translators, or programs that
    do source refactoring, where you want to preserve extras such as
    parentheses even if they're not strictly needed.

    (Injecting the right parentheses for examples like `(a + b) * c', which
    would have an AST like '(* (+ a b) c)', is surprisingly tricky. Easier to
    just follow the original source!)

    You probably mean some more complicated example. This one is
    easy:

    (10) -> parse("(a + b) * c")

    (10) (* (+ a b) c)

    (11) -> unparse(parse("(a + b) * c"))

    (11) "(a+b)*c"

    (12) -> parse("a + b * c")

    (12) (+ a (* b c))

    (13) -> unparse(parse("a + b * c"))

    (13) "a+b*c"


    You just need to track priorities of subexpressions to produce the
    above: '+' has lower priority than '*' so subexpression needs
    parentheses, '*' has higher priority, so there is no need for
    parentheses.

    I seem to remember one problem was with minus, for example original expr is:

    a - (b - c)

    The AST is (- a (- b c)), but a simplistic approach would generate, from either that or (- (- a b) c), the same output:

    a - b - c

    No parentheses, because the two "-" have the same precedence. The
    example might have been 'a + (b - c)'; same thing.

    It just seemed more trouble than it was worth.
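
    A minimal sketch of the simplistic approach described above, assuming a
    hypothetical `Node` type with one-letter leaves (the type and the names
    `prec`, `put` and `unparse_naive` are illustrative and come from no
    poster's code). A child is wrapped only when it binds more weakly than
    its parent, so the side on which the child sits is ignored.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical AST node: op == 0 marks a leaf holding a one-letter name. */
typedef struct Node {
    char op;                      /* '+', '-', '*', '/' or 0 for a leaf */
    char name;                    /* used only when op == 0 */
    const struct Node *lhs, *rhs;
} Node;

/* Binding strength; leaves bind tightest. */
static int prec(const Node *n)
{
    switch (n->op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default:            return 3;
    }
}

/* Append one character to a NUL-terminated buffer, asserting it fits. */
static void put(char *buf, size_t cap, char c)
{
    size_t len = strlen(buf);
    assert(len + 1 < cap);
    buf[len] = c;
    buf[len + 1] = '\0';
}

/* Precedence-only emitter: wraps a child only if it binds more weakly. */
static void unparse_naive(const Node *n, int parent_prec, char *buf, size_t cap)
{
    if (n->op == 0) {
        put(buf, cap, n->name);
        return;
    }
    int p = prec(n);
    int wrap = p < parent_prec;
    if (wrap) put(buf, cap, '(');
    unparse_naive(n->lhs, p, buf, cap);
    put(buf, cap, n->op);
    unparse_naive(n->rhs, p, buf, cap);  /* same rule on both sides: the flaw */
    if (wrap) put(buf, cap, ')');
}
```

    Called as `unparse_naive(root, 0, buf, sizeof buf)` on an empty buffer,
    this yields `a-b-c` for both `(- a (- b c))` and `(- (- a b) c)`, which
    is the collision described above.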



  • From Bart@3:633/10 to All on Thu Jun 4 11:37:30 2026
    On 04/06/2026 02:58, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:
    On 03/06/2026 23:30, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    Personally I don't have much use for CSTs for a normal compiler, but
    they might be useful for source-to-source translators, or programs that
    do source refactoring, where you want to preserve extras such as
    parentheses even if they're not strictly needed.

    (Injecting the right parentheses for examples like `(a + b) * c', which
    would have an AST like '(* (+ a b) c)', is surprisingly tricky. Easier to
    just follow the original source!)

    You probably mean some more complicated example. This one is
    easy:

    (10) -> parse("(a + b) * c")

    (10) (* (+ a b) c)

    (11) -> unparse(parse("(a + b) * c"))

    (11) "(a+b)*c"

    (12) -> parse("a + b * c")

    (12) (+ a (* b c))

    (13) -> unparse(parse("a + b * c"))

    (13) "a+b*c"


    You just need to track priorities of subexpressions to produce the
    above: '+' has lower priority than '*' so subexpression needs
    parentheses, '*' has higher priority, so there is no need for
    parentheses.

    I seem to remember one problem was with minus, for example original expr is:

    a - (b - c)

    The AST is (- a (- b c)), but a simplistic approach would generate, from
    either that or (- (- a b) c), the same output:

    a - b - c

    No parentheses, because the two "-" have the same precedence. The
    example might have been 'a + (b - c)'; same thing.

    It just seemed more trouble than it was worth.

    Well,

    (7) -> parse("a - (b - c)")

    (7) (- a (- b c))

    (8) -> unparse(parse("a - (b - c)"))

    (8) "a-(b-c)"

    (9) -> parse("a - b - c")

    (9) (- (- a b) c)

    (10) -> unparse(parse("a - b - c"))

    (10) "a-b-c"

    (11) -> parse("a + (b + c)")

    (11) (+ a (+ b c))

    (12) -> unparse(parse("a + (b + c)"))

    (12) "a+(b+c)"

    (13) -> parse("a + b + c")

    (13) (+ (+ a b) c)

    (14) -> unparse(parse("a + b + c"))

    (14) "a+b+c"

    Note: the approach builds string from the parse tree so that
    parsing gives back original tree. That is why (12) has
    parentheses: without them parsing gives different tree.

    Implementation is simple, but there are little subtleties, for
    example left and right arguments do not have symmetric role
    even for commutative operators.

    The code is not in C, but rough translation of part of 'unparse'
    handling arithmetic to C could look like:

    ...

    OK, I'll have a play with it, although you could have just posted
    pseudo-code.

    Note that I just said it was 'surprisingly tricky'. It doesn't only
    depend on precedence levels.
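
    One way the left/right asymmetry mentioned above might be handled is
    sketched below. This is not the elided code from the quoted post; it
    assumes a hypothetical `Node` type with one-letter leaves and treats
    all four operators as left-associative, so a right operand of equal
    precedence is wrapped while a left operand of equal precedence is not.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical AST node: op == 0 marks a leaf holding a one-letter name. */
typedef struct Node {
    char op;                      /* '+', '-', '*', '/' or 0 for a leaf */
    char name;                    /* used only when op == 0 */
    const struct Node *lhs, *rhs;
} Node;

/* Binding strength; leaves bind tightest. */
static int prec(const Node *n)
{
    switch (n->op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default:            return 3;
    }
}

/* Append one character to a NUL-terminated buffer, asserting it fits. */
static void put(char *buf, size_t cap, char c)
{
    size_t len = strlen(buf);
    assert(len + 1 < cap);
    buf[len] = c;
    buf[len + 1] = '\0';
}

/* Emit n; min_prec is the weakest binding allowed here without parentheses. */
static void unparse(const Node *n, int min_prec, char *buf, size_t cap)
{
    if (n->op == 0) {
        put(buf, cap, n->name);
        return;
    }
    int p = prec(n);
    int wrap = p < min_prec;
    if (wrap) put(buf, cap, '(');
    unparse(n->lhs, p, buf, cap);      /* left: equal precedence is fine */
    put(buf, cap, n->op);
    unparse(n->rhs, p + 1, buf, cap);  /* right: equal precedence is wrapped */
    if (wrap) put(buf, cap, ')');
}
```

    Starting from `unparse(root, 0, buf, sizeof buf)` with an empty buffer,
    `(- a (- b c))` comes out as `a-(b-c)` and `(- (- a b) c)` as `a-b-c`.
    The sketch also keeps the parentheses in `a+(b+c)`, so that reparsing
    the text would give back the same tree.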

    And it is still more elaborate than simply putting parentheses around
    every binary term.
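
    For comparison, the "parentheses around every binary term" fallback
    needs no precedence table at all. A minimal sketch, again assuming a
    hypothetical `Node` type with one-letter leaves:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical AST node: op == 0 marks a leaf holding a one-letter name. */
typedef struct Node {
    char op;                      /* binary operator character, or 0 for a leaf */
    char name;                    /* used only when op == 0 */
    const struct Node *lhs, *rhs;
} Node;

/* Append one character to a NUL-terminated buffer, asserting it fits. */
static void put(char *buf, size_t cap, char c)
{
    size_t len = strlen(buf);
    assert(len + 1 < cap);
    buf[len] = c;
    buf[len + 1] = '\0';
}

/* Wrap every binary node, regardless of precedence or operand side. */
static void unparse_full(const Node *n, char *buf, size_t cap)
{
    if (n->op == 0) {
        put(buf, cap, n->name);
        return;
    }
    put(buf, cap, '(');
    unparse_full(n->lhs, buf, cap);
    put(buf, cap, n->op);
    unparse_full(n->rhs, buf, cap);
    put(buf, cap, ')');
}
```

    The output is always unambiguous but noisy: `(- a (- b c))` becomes
    `(a-(b-c))` and `(* (+ a b) c)` becomes `((a+b)*c)`.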

    (My point had been that a CST rather than an AST would make this nearly as simple.)



  • From Dan Cross@3:633/10 to All on Thu Jun 4 10:51:37 2026
    In article <10vqftg$2d72$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 03/06/2026 23:30, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    Personally I don't have much use for CSTs for a normal compiler, but
    they might be useful for source-to-source translators, or programs that
    do source refactoring, where you want to preserve extras such as
    parentheses even if they're not strictly needed.

    (Injecting the right parentheses for examples like `(a + b) * c', which
    would have an AST like '(* (+ a b) c)', is surprisingly tricky. Easier to
    just follow the original source!)

    You probably mean some more complicated example. This one is
    easy:

    (10) -> parse("(a + b) * c")

    (10) (* (+ a b) c)

    (11) -> unparse(parse("(a + b) * c"))

    (11) "(a+b)*c"

    (12) -> parse("a + b * c")

    (12) (+ a (* b c))

    (13) -> unparse(parse("a + b * c"))

    (13) "a+b*c"


    You just need to track priorities of subexpressions to produce the
    above: '+' has lower priority than '*' so subexpression needs
    parentheses, '*' has higher priority, so there is no need for
    parentheses.

    I seem to remember one problem was with minus, for example original expr is:

    a - (b - c)

    The AST is (- a (- b c)), but a simplistic approach would generate, from
    either that or (- (- a b) c), the same output:

    a - b - c

    No parentheses, because the two "-" have the same precedence. The
    example might have been 'a + (b - c)'; same thing.

    It just seemed more trouble than it was worth.

    What? I don't understand what you're saying at all.

    Subtraction is not associative, and the two expressions,
    `a - b - c` and `a - (b - c)`, are not at all the same thing,
    either in C or in regular arithmetic. The former is
    `(a - b) - c`, and the subtraction distributes over the
    parenthesized subexpression, so the latter is equivalent to
    `a - b + c = (a - b) + c`.

    term% cat wha.c
    #include <stdio.h>
    int
    main(void)
    {
        int a = 5, b = 4, c = 3;
        printf("a - b - c = %d\n", a - b - c);
        printf("(a - b) - c = %d\n", (a - b) - c);
        printf("a - (b - c) = %d\n", a - (b - c));
        return 0;
    }
    term% make wha
    cc -O2 -pipe -o wha wha.c
    term% ./wha
    a - b - c = -2
    (a - b) - c = -2
    a - (b - c) = 4
    term%

    - Dan C.


  • From Bart@3:633/10 to All on Thu Jun 4 12:47:07 2026
    On 04/06/2026 11:51, Dan Cross wrote:
    In article <10vqftg$2d72$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 03/06/2026 23:30, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    Personally I don't have much use for CSTs for a normal compiler, but
    they might be useful for source-to-source translators, or programs that
    do source refactoring, where you want to preserve extras such as
    parentheses even if they're not strictly needed.

    (Injecting the right parentheses for examples like `(a + b) * c', which
    would have an AST like '(* (+ a b) c)', is surprisingly tricky. Easier to
    just follow the original source!)

    You probably mean some more complicated example. This one is
    easy:

    (10) -> parse("(a + b) * c")

    (10) (* (+ a b) c)

    (11) -> unparse(parse("(a + b) * c"))

    (11) "(a+b)*c"

    (12) -> parse("a + b * c")

    (12) (+ a (* b c))

    (13) -> unparse(parse("a + b * c"))

    (13) "a+b*c"


    You just need to track priorities of subexpressions to produce the
    above: '+' has lower priority than '*' so subexpression needs
    parentheses, '*' has higher priority, so there is no need for
    parentheses.

    I seem to remember one problem was with minus, for example original expr is:

    a - (b - c)

    The AST is (- a (- b c)), but a simplistic approach would generate, from
    either that or (- (- a b) c), the same output:

    a - b - c

    No parentheses, because the two "-" have the same precedence. The
    example might have been 'a + (b - c)'; same thing.

    It just seemed more trouble than it was worth.

    What? I don't understand what you're saying at all.

    This is about turning an AST (which has stripped parentheses) back into the
    original source text.

    I was responding to 'You just need to track priorities of
    subexpressions', with an example where the two operators had the same
    priority.

    It would be extra work to generate the parentheses needed to make the
    output correct. For my application, I decided not to bother.



  • From Janis Papanagnou@3:633/10 to All on Thu Jun 4 14:57:33 2026
    On 2026-06-04 13:47, Bart wrote:
    [...]

    This is about turning an AST (which has stripped parentheses) back into the
    original source text.

    This is, in the first place, an unnecessary and stupid task that
    generally cannot be fulfilled; since many homologous expressions
    map to the same unambiguous internal representations. (And those
    internal representations need no parentheses, they are regularly
    chosen to be inherently unambiguous.)

    Janis

    [...]


  • From Dan Cross@3:633/10 to All on Thu Jun 4 14:31:30 2026
    In article <10vrojq$bjmf$2@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 04/06/2026 11:51, Dan Cross wrote:
    In article <10vqftg$2d72$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
    On 03/06/2026 23:30, Waldek Hebisch wrote:
    Bart <bc@freeuk.com> wrote:

    Personally I don't have much use for CSTs for a normal compiler, but
    they might be useful for source-to-source translators, or programs that
    do source refactoring, where you want to preserve extras such as
    parentheses even if they're not strictly needed.

    (Injecting the right parentheses for examples like `(a + b) * c', which
    would have an AST like '(* (+ a b) c)', is surprisingly tricky. Easier to
    just follow the original source!)

    You probably mean some more complicated example. This one is
    easy:

    (10) -> parse("(a + b) * c")

    (10) (* (+ a b) c)

    (11) -> unparse(parse("(a + b) * c"))

    (11) "(a+b)*c"

    (12) -> parse("a + b * c")

    (12) (+ a (* b c))

    (13) -> unparse(parse("a + b * c"))

    (13) "a+b*c"


    You just need to track priorities of subexpressions to produce the
    above: '+' has lower priority than '*' so subexpression needs
    parentheses, '*' has higher priority, so there is no need for
    parentheses.

    I seem to remember one problem was with minus, for example original expr is:

    a - (b - c)

    The AST is (- a (- b c)), but a simplistic approach would generate, from
    either that or (- (- a b) c), the same output:

    a - b - c

    No parentheses, because the two "-" have the same precedence. The
    example might have been 'a + (b - c)'; same thing.

    It just seemed more trouble than it was worth.

    What? I don't understand what you're saying at all.

    This is about turning an AST (which has stripped parentheses) back into the
    original source text.

    I was responding to 'You just need to track priorities of
    subexpressions', with an example where the two operators had the same
    priority.

    It would be extra work to generate the parentheses needed to make the
    output correct. For my application, I decided not to bother.

    I remain mystified as to your point. The two ASTs presented
    as S-expressions above are different: `(- a (- b c))` is simply
    not the same as `(- (- a b) c)`. If a program that turns these
    "back into the original source text" produces `a - b - c` for
    both, then it is simply broken. `a - b - c` matches the AST
    `(- (- a b) c)`, but manifestly does not match `(- a (- b c))`.

    - Dan C.

