Overloading `child_nodes` in `config.yml` #1268

jemmaissroff · 2023-08-16T21:03:14Z

I’m looking at adding a flag to nodes, and realizing how much we’re overloading the term child_nodes in config.yml . We’re using child_nodes currently to mean params. A small, simple improvement would be renaming this. But overloading the term still leads to quite a bit of filtering and casing when actually generating the code.

I think it would be cleaner to separate out actual child nodes from other fields on a node (eg have fields on a node for child_nodes, locations, flags) to match how the templates are being used. We could define a params method which returns all of these when needed, but in most cases just use the filtered down nodes.

For example, something like this

  - name: WhileNode
    child_nodes:
      - name: keyword_loc
        type: location
      - name: predicate
        type: node
      - name: statements
        type: node?
        kind: StatementsNode
      - name: flags
        type: flags
        kind: LoopFlags

could become:

  - name: WhileNode
    locations:
      - name: keyword_loc
        type: location
    child_nodes:
      - name: predicate
        type: node
      - name: statements
        type: node?
        kind: StatementsNode
    flags:
      - name: flags
        type: flags
        kind: LoopFlags

depending on how we’d implement it, we could even get rid of the type on some of the less complicated fields.

The text was updated successfully, but these errors were encountered:

eregon · 2023-08-16T22:42:26Z

What about string and integer fields?

Also if it's separated by category it wouldn't be possible to choose the order of fields belonging to different categories.
For instance the current ordering for CallNode seems nice as it's in order of things appearing in the source:
https://github.com/ruby/yarp/blob/8a0a7a1d28b82df26a614378b2007076033ff5d0/config.yml#L585-L607

I think renaming child_nodes to fields would be an easy fix here with most of the advantages.

Regarding iterating child params, there could be a convenience method for that, that's indeed done a lot manually right now.

enebo · 2023-08-17T15:53:10Z

@jemmaissroff in JRuby for the current AST we differentiate between child nodes and fields in that we have a childNodes() which only returns the child nodes of the current node. Non-nodes are fields (and actually we also save the child nodes as fields) but childNodes() is necessary for implementing a proper visitor.

Ultimately though the only need to know child nodes from other fields (or types) is mostly to give the list of children to a visitor right? That must be done by looking for a type of node{,?,[]}. If we guaranteed child_nodes was only nodes I think that could allow type to be removed so long as cardinality specifier was still there somewhere. So:

- name: PostExecutionNode
    child_nodes:
      - name: statements
        type: node?
        kind: StatementsNode

could just become:

name: PostExecutionNode
    child_nodes:
      - name: statements
        kind: StatementsNode?

and so long as we are changing format we could go big (:smile:):

name: PostExecutionNode
    child_nodes:
      - child: statements: StatementsNode?

I guess this is maybe a good way to condense the size of config.yml.

The thing I don't like about the idea is the notion that the current child_node order represents the syntactical order of all the elements. This is mostly helpful in looking up nodes in this file as a reference. It is trivial from order to know which element is which on a node (although the names are pretty good too so this is maybe not that important).

kddnewton · 2023-08-17T19:51:29Z

I think order is not as helpful as you might think. For example on IfNode, it's in the right order only if it's in the block form. If it's in the modifier it's in the wrong order. Similarly for CallNode, depending on the kind of call it could be in the wrong order. (-foo vs foo.-@ would change this.) I think ultimately it's more beneficial for us to put them in the order that makes sense for us reading it.

The other implication for ordering is the order in the struct. Ideally I think we should sort these based on the most efficient packing mechanism to make sure they don't cause any holes in the struct. I think this isn't an issue now, and certainly isn't something we need to think about for a while, but I figured I'd mention it.

Overall, I'm very 👍 on this idea. Anything that makes it more readable.

enebo · 2023-08-17T20:33:09Z

@kddnewton ah yeah I just looked at how if works between regular and mod_if. I see what you mean. I could argue we could add even more nodes so then order did match syntax but I am not serious about that (although mod_if cannot have an else branch so we lose a byte serializing in that case by sharing the same node entry). I withdraw that reservation.

kddnewton · 2023-09-07T14:19:45Z

There are 2 things in this issue: renaming child_nodes and splitting up the fields. The first is done. I think we can leave the second one until we provide the option to drop certain fields from the API when we don't need them. (Sort of a "full" parse versus a "minimal" parse.) I think at that point we could split up the fields like:

  - name: CallNode
    fields:
      minimal:
        - name: receiver
          type: node?
        - name: arguments
          type: node?
          kind: ArgumentsNode
        - name: block
          type: node?
          kind: BlockNode
        - name: flags
          type: flags
          kind: CallNodeFlags
        - name: name
          type: string
      full:
        - name: call_operator_loc
          type: location?
        - name: message_loc
          type: location?
        - name: opening_loc
          type: location?
        - name: closing_loc
          type: location?

There might end up being some locations that we actually want to keep, so I think this delineation might be more useful. Closing for now, revisit later.

enebo · 2023-09-07T15:08:34Z

@kddnewton two things for your last comment design:

maybe use semantic for minimal and syntactic for full?
All nodes currently still have offsets of the node itself and we still need it for line calcs but it would be nice to make lines for semantic and offsets for syntactic. line would be a big savings for compile serialization because it is only one number and a smaller one (I think there are one or two nodes which require endLine as well but that is just a detail).

enebo · 2023-09-07T15:09:36Z

I think TR wants both offsets AND line so there likely is a wrinkle there.

eregon · 2023-09-12T10:12:48Z

I think it's enough to remove location/location? fields for serialization (#807).
We should be able to do it soon now, I think we have everything needed or very close to it.
If some location field turns out to be needed we can either expose it in another way (e.g. flag, other field), or until that's done have some extra YAML key to preserve it.
I think a flat list of fields is easier to handle for all consumers of this YAML config file.

kddnewton added the enhancement New feature or request label Aug 17, 2023

kddnewton mentioned this issue Aug 30, 2023

Rename child_nodes in config.yml to fields #1366

Merged

kddnewton closed this as completed Sep 7, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Overloading `child_nodes` in `config.yml` #1268

Overloading `child_nodes` in `config.yml` #1268

jemmaissroff commented Aug 16, 2023

eregon commented Aug 16, 2023

enebo commented Aug 17, 2023

kddnewton commented Aug 17, 2023

enebo commented Aug 17, 2023

kddnewton commented Sep 7, 2023

enebo commented Sep 7, 2023

enebo commented Sep 7, 2023

eregon commented Sep 12, 2023

Overloading child_nodes in config.yml #1268

Overloading child_nodes in config.yml #1268

Comments

jemmaissroff commented Aug 16, 2023

eregon commented Aug 16, 2023

enebo commented Aug 17, 2023

kddnewton commented Aug 17, 2023

enebo commented Aug 17, 2023

kddnewton commented Sep 7, 2023

enebo commented Sep 7, 2023

enebo commented Sep 7, 2023

eregon commented Sep 12, 2023

Overloading `child_nodes` in `config.yml` #1268

Overloading `child_nodes` in `config.yml` #1268