Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parser options for either name_locs or names #1407

Closed
jemmaissroff opened this issue Sep 5, 2023 · 3 comments
Closed

Parser options for either name_locs or names #1407

jemmaissroff opened this issue Sep 5, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@jemmaissroff
Copy link
Collaborator

Some nodes (eg ClassVariableAndWriteNode) have both a name field and a name_loc field. These are because it's helpful for some consumers (eg linters) to have the location, while it's helpful for other consumers (eg compilers) to have the name itself. Consumers need one, not both.

We could eventually expose a way for consumers to specify which field type they need. This would reduce overall memory impact of the parse tree by not storing all of these fields twice.

This isn't necessary for V1, and is most helpful if we decide to reduce memory since it'll introduce some otherwise unnecessary complexity.

@enebo
Copy link
Collaborator

enebo commented Sep 5, 2023

@jemmaissroff I had suggested the notion of a profile between the two use-cases back when we discussed how syntax tools want offsets but compilers want line number. The linting use case wants original source in memory and the compiler doesn't. There are a number of different preferences between the two use-cases. It feels like two tools which overlap 80% trying to consume the same format.

@eregon
Copy link
Member

eregon commented Sep 6, 2023

Related to #807, which aims to remove all location fields for serialization and Java nodes (but keep start+length on the node itself, I consider that a different concern and TruffleRuby needs those anyway).

name_loc is just one of the many location fields, none of them should be needed for compilation/execution.

Could maybe be useful to do the same for C structs too but that seems far more involved as e.g. yarp.c would then need to check some macro or so for whether to assign those location fields. OTOH that might speedup the parser a bit too. If done at templating time, then that specific parser (yarp.c) would never be able to give locations and so wouldn't be usable for the Ruby API (would need to generate 2 files from yarp.c, one tracking location fields and one not, by setting defining macro differently).
If it's if (set_location_fields) conditions then it would work for either case but that wouldn't reduce the size of the node structs at all then (so no real value) and could have some overhead.

A more immediate gain for shrinking C structs could be to change yp_location_t to be uint32_t start, length; (8 bytes) instead of 2 const char* (16 bytes). That requires extensive changes in yarp.c so I'm not volunteering but it's probably worth it footprint-wise.

@kddnewton
Copy link
Collaborator

Unfortunately a tool like rubocop needs both (I found out the hard way through https://github.com/kddnewton/parser-prism. I think since we're now not serialization the locations for JRuby/TruffleRuby and we could instead shrink the location struct, I think I'd like to close this one as not planned.

@kddnewton kddnewton closed this as not planned Won't fix, can't repro, duplicate, stale Dec 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants