Only keep semantic fields in Java, i.e. skip location fields #1450
f2d56f0 to 2a461db
@eregon We cannot toggle this yet. Regular expressions do not unescape properly. I also still use the non-escaped string value via location in one place (which is perhaps another problem). I will open an issue for the first one right now, and perhaps a second will be forthcoming.
I cannot link, so this is the first issue: #1452
@kddnewton Do you think #1452 could be fixed soon?
If you want to take a look, that would be great. Otherwise I'll add it to my list.
2a461db to 60d1041
I don't think I'll have time for that one. I updated the PR so we can mark individual location fields and still serialize those until they have a replacement.
@enebo I preserved the Do you also need
@eregon If you look at the other String nodes you will see that the ones with delimiters have it, but the ones coming from interpolated strings do not. So it is necessary until unescaped bytes for regexp are fixed. This is my current logic: boolean interpolated = node.opening_loc != null && source[node.opening_loc.startOffset] != '\'';
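For readers following along, the check quoted above can be sketched in isolation. This is only an illustration: `Location`, `StringNode`, and `isInterpolated` below are hypothetical stand-ins written for this example, not the generated YARP Java classes, and the source is assumed to be available as a byte array.

```java
import java.nio.charset.StandardCharsets;

final class InterpolatedCheck {
    // Hypothetical stand-in for a location: only the start offset is modeled.
    static final class Location {
        final int startOffset;
        Location(int startOffset) { this.startOffset = startOffset; }
    }

    // Hypothetical stand-in for a string node; opening_loc is null when
    // the node has no opening delimiter of its own.
    static final class StringNode {
        final Location opening_loc;
        StringNode(Location opening_loc) { this.opening_loc = opening_loc; }
    }

    // Same expression as in the comment above: true when the node has an
    // opening delimiter and that delimiter is not a single quote.
    static boolean isInterpolated(StringNode node, byte[] source) {
        return node.opening_loc != null
            && source[node.opening_loc.startOffset] != '\'';
    }

    public static void main(String[] args) {
        byte[] source = "'a' \"b\"".getBytes(StandardCharsets.US_ASCII);
        StringNode single = new StringNode(new Location(0)); // opens with '
        StringNode dbl = new StringNode(new Location(4));    // opens with "
        StringNode bare = new StringNode(null);              // no delimiter
        System.out.println(isInterpolated(single, source));
        System.out.println(isInterpolated(dbl, source));
        System.out.println(isInterpolated(bare, source));
    }
}
```

The sketch shows why `opening_loc` is still needed for this logic: the expression reads the source byte at the opening delimiter, which is only reachable through a location field.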
60d1041 to 7437498
@enebo OK, I kept StringNode#opening_loc for now.
@kddnewton Could you merge this?
This PR has two names for the same thing: semantic_fields and location_fields. Can you call them all semantic_fields?
@kddnewton The problem is: how should the boolean be named?
I would probably just call it
OK, I'll update to that tomorrow.
I think

// Serialize the AST represented by the given node to the given buffer.
YP_EXPORTED_FUNCTION void yp_serialize(yp_parser_t *parser, yp_node_t *node, yp_buffer_t *buffer, bool semantic);

is too unclear and we would need to explain this field in every public API in details like I think with

// Serialize the AST represented by the given node to the given buffer.
YP_EXPORTED_FUNCTION void yp_serialize(yp_parser_t *parser, yp_node_t *node, yp_buffer_t *buffer, bool only_semantics_fields);
7437498 to 7350b12
I have changed the approach: there is now an env var (YARP_SERIALIZE_ONLY_SEMANTICS_FIELDS) which is used at templating time and decides whether to serialize non-semantic/location fields or not.
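To make the "decided at templating time" idea concrete, here is a minimal sketch of a code generator that drops location fields before any serializer code is emitted, so the generated code has no runtime branch. It is not the project's ERB template: `SerializerTemplate`, `Field`, `generate`, the emitted `write(...)` statements, and the rule that any non-empty value of the variable enables the option are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;

final class SerializerTemplate {
    // Hypothetical field description: a name plus whether it is a location field.
    static final class Field {
        final String name;
        final boolean isLocation;
        Field(String name, boolean isLocation) {
            this.name = name;
            this.isLocation = isLocation;
        }
    }

    // Emits one serialization statement per field. Location fields are
    // skipped here, at generation time, so the emitted code never has to
    // test a flag when it runs.
    static String generate(List<Field> fields, boolean onlySemanticFields) {
        StringBuilder out = new StringBuilder();
        for (Field field : fields) {
            if (onlySemanticFields && field.isLocation) {
                continue;
            }
            out.append("write(node.").append(field.name).append(");\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumption: any non-empty value turns the option on.
        String env = System.getenv("YARP_SERIALIZE_ONLY_SEMANTICS_FIELDS");
        boolean onlySemanticFields = env != null && !env.isEmpty();
        List<Field> fields = Arrays.asList(
            new Field("unescaped", false),
            new Field("opening_loc", true));
        System.out.print(generate(fields, onlySemanticFields));
    }
}
```

The trade-off discussed later in this thread follows directly from this shape: the choice is baked into the generated output, so changing it means regenerating rather than passing a different option.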
7350b12 to eb8c843
* Add $YARP_SERIALIZE_ONLY_SEMANTICS_FIELDS to control whether to serialize location fields at templating time; this way there is no overhead for either case and nothing to check at runtime.
* Add a byte in the header to indicate whether location fields are included as expected.
* Fixes ruby#807
* Simplify the build-java CI job now that the FFI backend is available so JRuby can serialize.
* Support keeping some location fields which are still needed until there is a replacement.
eb8c843 to fc5cf2d
See #1532 (comment) for the savings on serialized size with this PR.
@@ -14436,6 +14436,7 @@ yp_serialize(yp_parser_t *parser, yp_node_t *node, yp_buffer_t *buffer) {
     yp_buffer_append_u8(buffer, YP_VERSION_MAJOR);
     yp_buffer_append_u8(buffer, YP_VERSION_MINOR);
     yp_buffer_append_u8(buffer, YP_VERSION_PATCH);
+    yp_buffer_append_u8(buffer, YP_SERIALIZE_ONLY_SEMANTICS_FIELDS ? 1 : 0);
YP_SERIALIZE_ONLY_SEMANTICS_FIELDS needs to be defined just for this usage. I think we should move yp_serialize or the part writing the header to serialize.c.erb, but that seemed out of scope of this PR.
@eregon I'm a bit nervous about the implications of having this be an environment variable. It means everything is dependent on how the gem/project was templated, as opposed to the options specified. This now means there could be a mismatch between the
In general I think almost all of this PR is really good, I'm just very nervous about making the runtime behavior tied to a build-time specification, especially considering how many different consumers there are. Would you consider moving this back to a runtime specification?
Yes the mismatch is possible but it is checked and errors properly in case of the mismatch with the added byte in the serialization header.
I believe we need build-time specification for #1493 anyway. I'm not very keen to make it a runtime specification because it makes the code more complex and might have some performance overhead, when there is AFAIK no value/use-case to specify this dynamically. If it turns out that it needs to be a runtime parameter, we can always change this fairly easily.
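As an illustration of the mismatch check described here, the sketch below reads the three version bytes followed by the flag byte, in the order the diff above writes them, and fails if the flag disagrees with what the loader was built to expect. It is an assumption-laden sketch, not the actual loader: `HeaderCheck` and `checkHeader` are hypothetical names, the buffer is assumed to be positioned at the version bytes (anything preceding them in the real header is not modeled), and version validation is omitted.

```java
import java.nio.ByteBuffer;

final class HeaderCheck {
    // Reads major, minor, patch, then the flag byte, and throws when the
    // serialized data was produced with a different setting than expected.
    static void checkHeader(ByteBuffer buffer, boolean expectOnlySemanticFields) {
        buffer.get(); // major version (not validated in this sketch)
        buffer.get(); // minor version
        buffer.get(); // patch version
        boolean onlySemanticFields = buffer.get() != 0;
        if (onlySemanticFields != expectOnlySemanticFields) {
            throw new IllegalStateException(
                "serialized data has onlySemanticFields=" + onlySemanticFields
                + " but the loader expects " + expectOnlySemanticFields);
        }
    }

    public static void main(String[] args) {
        // Hypothetical header bytes: version 0.9.0, flag set to 1.
        ByteBuffer matching = ByteBuffer.wrap(new byte[] {0, 9, 0, 1});
        checkHeader(matching, true); // passes silently
        System.out.println("header matches");
    }
}
```

With a check like this, a build-time mismatch between serializer and loader surfaces as an immediate error rather than as misread fields.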
Okay. We can go with this for now. I'm still nervous about this change because we've tied runtime behavior to build-time config, but hopefully I'm wrong.
Thank you for merging.
Only when YARP is used as the JRuby/TruffleRuby parser. When YARP is used as a gem, location fields are always included and hence the CI still passes on JRuby/TruffleRuby.
cc @enebo