-
-
Notifications
You must be signed in to change notification settings - Fork 800
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add JsonFactory.Feature.CHARSET_DETECTION
to disable charset detection (default to UTF-8)
#921
Conversation
Ok this looks nice and reasonable. But let me think about this for just a bit before merging. :) Thank you again! |
Oh, one idea/suggestion: although I don't have strong opinion but this just occurred to me and thought I'd share. |
@cowtowncoder i used a JsonFactory.Feature because it was easier in this case. it looks like JsonReadFeature also requires a JsonParser.Feature at the moment, which did not seem appropriate here, the other JsonParser.Features seem more like they operate to all formats (not just bytes). Do you think it should be a JsonParser.Feature as well, or should I figure out something to add a JsonReadFeature without a JsonParser.Feature? |
Good point. Unfortunately coupling for 2.x remains (untangling is not super easy), so let's just leave things the way they are as |
JsonFactory.Feature
to disable charset detection
ok thanks! |
Currently, we use a non-blocking parser to transform input JSON to a JsonNode, then send it downstream, and finally convert it to the desired request type when the route arguments are bound. This patch delays parsing until the conversion to the route type. It consists of these parts: - A change to JsonContentProcessor. The JsonContentProcessor has to handle certain scenarios (`application/x-json-stream`, or our support for streaming a json array as a Publisher) where the input contains multiple json tokens, Instead of forwarding the input bytes to an async parser, they are first split into individual json values, that are then forwarded or parsed in bulk. - A new JsonCounter class. This class handles splitting the input into its individual json nodes. It is basically a limited JSON parser that counts braces. It is already very optimized, but it also has a special optimization when the content type is `application/json`, in which case it assumes only a single json value in the input (this is technically a breaking change, but breaks no tests). I also wrote fuzz tests for JsonCounter, I will submit them in a separate PR to https://github.com/micronaut-projects/micronaut-fuzzing . - A benchmark. It's not run by the build tools, but it is good to have around. - An option in NettyHttpServerConfiguration to re-enable eager parsing to JsonNode. JsonCounter is still used, so it's not 100% the old behavior, but it should be more compatible in some respects because it doesn't change the conversion logic anymore. - A new LazyJsonNode class. It contains a ByteBuffer with the actual unparsed bytes. It's a bit complicated because it contains a reference counted buffer that has to be released after conversion, but also sometimes multiple converters are called for the same LazyJsonNode (once to JsonNode, once to a user-defined type). To solve this, when there is a conversion to JsonNode, it keeps that JsonNode around (and releases the buffer). If there are future conversions to a specific type, it will use the JsonNode as the source instead. - A new JsonSyntaxException class. JsonMappers can throw this to signal a syntax error in the JSON, which will be reported differently than a data binding error. - Changes to JsonMapper API: The asynchronous parser is deprecated. There is a new readValue method that takes a ByteBuffer. The jackson implementation has an optimization for netty ByteBuffers. - New converters for LazyJsonNode. Also removed some old superfluous converters for JsonNode. - A change to RequestLifecycle to show JSON syntax error messages like we did previously, instead of the opaque 400 that we usually send for conversion errors. - Some test changes to reflect changes in error messages. Because we now parse the input in bulk, in some cases jackson can decorate errors with short snippets of the failing input data. Is this an issue? There are a few potential incompatibilities with this change: - Non-standard JSON features when combined with JsonCounter (i.e. when streaming a JSON array, or when using `application/x-json-stream`) will break even when the JSON parser is configured to support them. e.g. if the user configured jackson to ignore comments, that may fail now. - When the input body is never bound, JSON syntax errors may not be caught. - JsonCounter only supports UTF-8, by design. This is permitted by the JSON standard, however Jackson also supports UTF-16 and UTF-32. imo this is a net benefit, to avoid potential parser differential vulnerabilities. I also made a jackson feature to match this, though JsonCounter is sufficient now: FasterXML/jackson-core#921 - JsonMapper implementations that do not use JsonSyntaxException yet will have less verbose HTTP errors until they switch to JsonSyntaxException.
JsonFactory.Feature
to disable charset detectionJsonFactory.Feature.CHARSET_DETECTION
to disable charset detection
JsonFactory.Feature.CHARSET_DETECTION
to disable charset detectionJsonFactory.Feature.CHARSET_DETECTION
to disable charset detection (default to UTF-8)
No description provided.