Bart <bc@freeuk.com> wrote:
Personally I don't have much use for CSTs for a normal compiler, but
they might be useful for source-to-source translators, or programs that
do source refactoring, where you want to preserve extras such as
parentheses even if they're not strictly needed.
(Injecting the right parentheses for examples like `(a + b) * c' which
would have an AST like '(* (+ a b) c)' is surprisingly tricky. Easier to
just follow the original source!)
You probably mean some more complicated example. This one is
easy:
(10) -> parse("(a + b) * c")
(10) (* (+ a b) c)
(11) -> unparse(parse("(a + b) * c"))
(11) "(a+b)*c"
(12) -> parse("a + b * c")
(12) (+ a (* b c))
(13) -> unparse(parse("a + b * c"))
(13) "a+b*c"
You just need to track priorities of subexpressions to produce the
above: in the first case the '+' subexpression has lower priority than
the enclosing '*', so it needs parentheses; in the second the '*'
subexpression has higher priority than the enclosing '+', so there is
no need for parentheses.
Bart <bc@freeuk.com> wrote:
On 03/06/2026 23:30, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Personally I don't have much use for CSTs for a normal compiler, but
they might be useful for source-to-source translators, or programs that
do source refactoring, where you want to preserve extras such as
parentheses even if they're not strictly needed.
(Injecting the right parentheses for examples like `(a + b) * c' which
would have an AST like '(* (+ a b) c)' is surprisingly tricky. Easier to
just follow the original source!)
You probably mean some more complicated example. This one is
easy:
(10) -> parse("(a + b) * c")
(10) (* (+ a b) c)
(11) -> unparse(parse("(a + b) * c"))
(11) "(a+b)*c"
(12) -> parse("a + b * c")
(12) (+ a (* b c))
(13) -> unparse(parse("a + b * c"))
(13) "a+b*c"
You just need to track priorities of subexpressions to produce the
above: '+' has lower priority than '*' so subexpression needs
parentheses, '*' has higher priority, so there is no need for
parentheses.
I seem to remember one problem was with minus, for example original expr is:
a - (b - c)
The AST is (- a (- b c)), but a simplistic approach would generate, from
either that or (- (- a b) c), the same output:
a - b - c
No parentheses because the two "-" have the same precedence. The
example might have been 'a + (b - c)'; same thing.
It just seemed more trouble than it was worth.
Well,
(7) -> parse("a - (b - c)")
(7) (- a (- b c))
(8) -> unparse(parse("a - (b - c)"))
(8) "a-(b-c)"
(9) -> parse("a - b - c")
(9) (- (- a b) c)
(10) -> unparse(parse("a - b - c"))
(10) "a-b-c"
(11) -> parse("a + (b + c)")
(11) (+ a (+ b c))
(12) -> unparse(parse("a + (b + c)"))
(12) "a+(b+c)"
(13) -> parse("a + b + c")
(13) (+ (+ a b) c)
(14) -> unparse(parse("a + b + c"))
(14) "a+b+c"
Note: the approach builds a string from the parse tree such that
parsing it gives back the original tree. That is why (12) has
parentheses: without them parsing gives a different tree.
The implementation is simple, but there are a few subtleties; for
example, the left and right arguments do not play symmetric roles,
even for commutative operators.
The code is not in C, but a rough translation to C of the part of
'unparse' that handles arithmetic could look like:
...
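The translated code is elided in this copy of the post. As a stand-in, and not the author's code, here is a minimal C sketch of the rule described above. It assumes that every binary operator is left-associative and that the tree uses a hypothetical 'struct node'. The names 'prec', 'put', 'emit' and 'unparse' are likewise made up for illustration. The left operand is wrapped only when it binds less tightly than its parent, while the right operand is also wrapped when it binds equally tightly, which is the asymmetry mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical AST node: op is '+', '-', '*' or '/' for a binary node,
   or 0 for a leaf, in which case name holds the identifier. */
struct node {
    char op;
    const char *name;
    const struct node *left, *right;
};

/* Binding strength; leaves bind tightest, so they are never wrapped. */
static int prec(const struct node *n)
{
    switch (n->op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default:            return 3;
    }
}

/* Append src to buf (capacity cap), truncating rather than overflowing. */
static void put(char *buf, size_t cap, const char *src)
{
    size_t len = strlen(buf);
    if (len + 1 < cap)
        strncat(buf, src, cap - len - 1);
}

/* Append the text of n. min_prec is the lowest precedence that may
   appear in this position without parentheses. */
static void emit(const struct node *n, int min_prec, char *buf, size_t cap)
{
    assert(n != NULL);
    if (n->op == 0) {
        put(buf, cap, n->name);
        return;
    }
    int p = prec(n);
    int wrap = p < min_prec;
    char op[2] = { n->op, '\0' };
    if (wrap) put(buf, cap, "(");
    emit(n->left, p, buf, cap);       /* equal precedence is fine on the left */
    put(buf, cap, op);
    emit(n->right, p + 1, buf, cap);  /* on the right it must bind tighter */
    if (wrap) put(buf, cap, ")");
}

/* Write the source text for n into buf and return buf. */
static char *unparse(const struct node *n, char *buf, size_t cap)
{
    assert(cap > 0);
    buf[0] = '\0';
    emit(n, 0, buf, cap);
    return buf;
}
```

With this sketch the tree (- a (- b c)) comes out as "a-(b-c)" and (- (- a b) c) as "a-b-c", and (+ a (+ b c)) keeps its parentheses as in (12). A right-associative operator would need the two 'emit' calls swapped in strictness.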
On 03/06/2026 23:30, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Personally I don't have much use for CSTs for a normal compiler, but
they might be useful for source-to-source translators, or programs that
do source refactoring, where you want to preserve extras such as
parentheses even if they're not strictly needed.
(Injecting the right parentheses for examples like `(a + b) * c' which
would have an AST like '(* (+ a b) c)' is surprisingly tricky. Easier to
just follow the original source!)
You probably mean some more complicated example. This one is
easy:
(10) -> parse("(a + b) * c")
(10) (* (+ a b) c)
(11) -> unparse(parse("(a + b) * c"))
(11) "(a+b)*c"
(12) -> parse("a + b * c")
(12) (+ a (* b c))
(13) -> unparse(parse("a + b * c"))
(13) "a+b*c"
You just need to track priorities of subexpressions to produce the
above: '+' has lower priority than '*' so subexpression needs
parentheses, '*' has higher priority, so there is no need for
parentheses.
I seem to remember one problem was with minus, for example original expr is:
a - (b - c)
The AST is (- a (- b c)), but a simplistic approach would generate, from
either that or (- (- a b) c), the same output:
a - b - c
No parentheses because the two "-" have the same precedence. The
example might have been 'a + (b - c)'; same thing.
It just seemed more trouble than it was worth.
In article <10vqftg$2d72$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 03/06/2026 23:30, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Personally I don't have much use for CSTs for a normal compiler, but
they might be useful for source-to-source translators, or programs that
do source refactoring, where you want to preserve extras such as
parentheses even if they're not strictly needed.
(Injecting the right parentheses for examples like `(a + b) * c' which
would have an AST like '(* (+ a b) c)' is surprisingly tricky. Easier to
just follow the original source!)
You probably mean some more complicated example. This one is
easy:
(10) -> parse("(a + b) * c")
(10) (* (+ a b) c)
(11) -> unparse(parse("(a + b) * c"))
(11) "(a+b)*c"
(12) -> parse("a + b * c")
(12) (+ a (* b c))
(13) -> unparse(parse("a + b * c"))
(13) "a+b*c"
You just need to track priorities of subexpressions to produce the
above: '+' has lower priority than '*' so subexpression needs
parentheses, '*' has higher priority, so there is no need for
parentheses.
I seem to remember one problem was with minus, for example original expr is:
a - (b - c)
The AST is (- a (- b c)), but a simplistic approach would generate, from
either that or (- (- a b) c), the same output:
a - b - c
No parentheses because the two "-" have the same precedence. The
example might have been 'a + (b - c)'; same thing.
It just seemed more trouble than it was worth.
What? I don't understand what you're saying at all.
[...]
This is about turning AST (which has stripped parentheses) back into the original source text.
[...]
On 04/06/2026 11:51, Dan Cross wrote:
In article <10vqftg$2d72$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 03/06/2026 23:30, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Personally I don't have much use for CSTs for a normal compiler, but
they might be useful for source-to-source translators, or programs that
do source refactoring, where you want to preserve extras such as
parentheses even if they're not strictly needed.
(Injecting the right parentheses for examples like `(a + b) * c' which
would have an AST like '(* (+ a b) c)' is surprisingly tricky. Easier to
just follow the original source!)
You probably mean some more complicated example. This one is
easy:
(10) -> parse("(a + b) * c")
(10) (* (+ a b) c)
(11) -> unparse(parse("(a + b) * c"))
(11) "(a+b)*c"
(12) -> parse("a + b * c")
(12) (+ a (* b c))
(13) -> unparse(parse("a + b * c"))
(13) "a+b*c"
You just need to track priorities of subexpressions to produce the
above: '+' has lower priority than '*' so subexpression needs
parentheses, '*' has higher priority, so there is no need for
parentheses.
I seem to remember one problem was with minus, for example original expr is:
a - (b - c)
The AST is (- a (- b c)), but a simplistic approach would generate, from
either that or (- (- a b) c), the same output:
a - b - c
No parentheses because the two "-" have the same precedence. The
example might have been 'a + (b - c)'; same thing.
It just seemed more trouble than it was worth.
What? I don't understand what you're saying at all.
This is about turning an AST (which has stripped parentheses) back into the
original source text.
I was responding to 'You just need to track priorities of
subexpressions', with an example where the two operators had the same
priorities.
It would be extra work to generate the parentheses needed to make the
output correct. For my application, I decided not to bother.