Optimize parsing 19 digit longs #865

marschall · 2022-12-15T16:11:44Z

Parsing 19 digit long values is significantly slower than parsing 18 digit long values as we are hitting a slow path that does a string allocation.

19 digit long values are quite common when a random number generator is used to generate long values.

We have to pay attention that we only use this fast path if the value is a valid long. This can be achieved by calling #inLongRange first.

In the following synthetic microbenchmark a throughput increase of about 33% with a drastic reduction in the allocation rate can be achieved.

@OutputTimeUnit(TimeUnit.SECONDS)
public class JacksonStdLongReadVanilla
{

    private static final JsonFactory JSON_FACTORY = new JsonFactory();

    private static final String JSON;

    static {
        StringBuilder json = new StringBuilder();
        Random random = new Random(0L);
        json.append('[');
        for (int i = 0; i < 1_000; i++) {
            if (i > 0) {
                json.append(',');
            }
            json.append(random.nextLong());
        }
        json.append(']');
        JSON = json.toString();
    }

    @Benchmark
    public void parseLong(Blackhole blackhole) throws IOException {
        try (JsonParser parser = JSON_FACTORY.createParser(JSON)) {
            parser.nextToken();

            JsonToken nextToken = parser.nextToken();
            while (nextToken == JsonToken.VALUE_NUMBER_INT) {
                blackhole.consume(parser.getLongValue());
                nextToken = parser.nextToken();
            }
        }
    }

}

If we have a look at the allocations before we see that it is dominated by byte[] allocations for String allocations

Resulting in many gcs

With the changes in this PR the byte[] and String allocations are gone

Resulting in no more gcs

Avoid slow path and allocation when parsing 19 digit longs.

pjfanning · 2022-12-15T17:25:54Z

src/main/java/com/fasterxml/jackson/core/io/NumberInput.java

@@ -152,6 +152,32 @@ public static long parseLong(char[] ch, int off, int len)
        long val = parseInt(ch, off, len1) * L_BILLION;
        return val + (long) parseInt(ch, off+len1, 9);
    }
+
+    /**
+     * Parses an unsigned long made up of exactly 19 digits.


this looks wrong - the code seems to handle signed numbers

This the same approach that com.fasterxml.jackson.core.util.TextBuffer#contentsAsInt(boolean) uses, it calls com.fasterxml.jackson.core.io.NumberInput#parseInt(char[], int, int) that parses an optionally singed value unsigned by ignoring the sign and later flips the sign if it had a minus sign.

pjfanning · 2022-12-15T17:29:18Z

src/main/java/com/fasterxml/jackson/core/io/NumberInput.java

+            num = (num * 10) + (c - '0');
+        }
+        return negative ? -num : num;
+    }


can you provide a jmh test project to prove this is faster? you can extend https://github.com/pjfanning/jackson-number-parse-bench if you like - you obviously have the test case, just would be handy to be able to run it independently

if this is really such a performance drag, shouldn't there be an openjdk issue - or something similar?

if this is really such a performance drag, shouldn't there be an openjdk issue - or something similar?

Outsiders can't create OpenJDK issues. And even if, the discussions alone would probably take years. The time investment is massive with no guaranteed outcome. To give you a concrete example openjdk/jdk#6275.

Even if everything would go smoothly we would still have to wait years until Jackson updates the minimum version to one that has the fixes. Jackson currently has a base of JDK 8 which was released 8 years ago. So we're looking at a best case of 10 years.

can you provide a jmh test project to prove this is faster? you can extend https://github.com/pjfanning/jackson-number-parse-bench if you like - you obviously have the test case, just would be handy to be able to run it independently

I saw there is also https://github.com/FasterXML/jackson-benchmarks let me know what you prefer.

Whichever suits you. I have a vague recollection that jackson-benchmarks is configured to run the benchmarks for a long time. There was something about the project that made me go off and create my own where I could reconfigure the settings more readily.

pjfanning · 2022-12-15T18:29:23Z

Another thing that I'm concerned with is that with a custom long parser, we would need to have test coverage for a wide range of possible values.

The FastDoubleParser that we allow use of is

optional (and not the default)
FastDoubleParser has good test coverage
it is admittedly still quite experimental and users who opt in to using it will need to do so with caution

marschall · 2022-12-16T10:43:05Z

Another thing that I'm concerned with is that with a custom long parser, we would need to have test coverage for a wide range of possible values.

NumberInput is already full of custom number parsing code, I refer you to the various #parseInt methods. There is also the custom BigDecimalParser. But I can extend the tests in NumberParsingTest.

cowtowncoder · 2022-12-17T03:42:49Z

Although jackson-benchmarks is useful for comparing performance of various Jackson backends over relatively simple POJO type, it's not as good for micro-benchmarks for aspects like this. So I think I'd go with https://github.com/pjfanning/jackson-number-parse-bench.

src/main/java/com/fasterxml/jackson/core/io/NumberInput.java

pjfanning · 2022-12-17T10:45:19Z

Could NumberInput.parseLong be changed to call parseLong19 where appropriate - it already uses parseInt for small nums. Would it be simpler just to have parseLong parse all numbers the new way (if the performance was not worse)?

cowtowncoder · 2022-12-20T05:12:49Z

Could NumberInput.parseLong be changed to call parseLong19 where appropriate - it already uses parseInt for small nums. Would it be simpler just to have parseLong parse all numbers the new way (if the performance was not worse)?

Right now method is quite particular expecting length of 10 - 18 characters and there'd be one more check (and branch).
Not sure it would affect optimizer that much; I don't have strong opinion where to modify it and add checks or keep separate method. Separate method would do well to follow the same pattern tho; decoding 2 ints, using int multiplications is faster than doing 19 long multiplications. I remember benchmarking that difference years ago and it was non-trivial.

marschall · 2022-12-20T20:26:17Z

Although jackson-benchmarks is useful for comparing performance of various Jackson backends over relatively simple POJO type, it's not as good for micro-benchmarks for aspects like this. So I think I'd go with https://github.com/pjfanning/jackson-number-parse-bench.

There you go pjfanning/jackson-number-parse-bench#2

I had to change build.gradle

repositories {
    mavenLocal()
}

jmh {
    includes = ['org.example.jackson.bench.LongParserBench']
    profilers = ['gc'] // or whatever you like best like jfr or async
}

On a Ryzen 7 PRO 5750GE and JDK 17.0.5 we go from

Benchmark                                                    Mode  Cnt      Score      Error   Units
LongParserBench.parseLong                                   thrpt   25  21860.758 ± 1076.882   ops/s
LongParserBench.parseLong:·gc.alloc.rate                    thrpt   25   1139.800 ±   56.144  MB/sec
LongParserBench.parseLong:·gc.alloc.rate.norm               thrpt   25  57410.238 ±    0.043    B/op
LongParserBench.parseLong:·gc.churn.G1_Eden_Space           thrpt   25   1136.527 ±   59.318  MB/sec
LongParserBench.parseLong:·gc.churn.G1_Eden_Space.norm      thrpt   25  57240.652 ±  766.795    B/op
LongParserBench.parseLong:·gc.churn.G1_Survivor_Space       thrpt   25      0.006 ±    0.002  MB/sec
LongParserBench.parseLong:·gc.churn.G1_Survivor_Space.norm  thrpt   25      0.303 ±    0.079    B/op
LongParserBench.parseLong:·gc.count                         thrpt   25    504.000             counts
LongParserBench.parseLong:·gc.time                          thrpt   25    334.000                 ms

to

Benchmark                                                    Mode  Cnt      Score      Error   Units
LongParserBench.parseLong                                   thrpt   25  31230.896 ± 1532.563   ops/s
LongParserBench.parseLong:·gc.alloc.rate                    thrpt   25     14.523 ±    0.713  MB/sec
LongParserBench.parseLong:·gc.alloc.rate.norm               thrpt   25    512.025 ±    0.029    B/op
LongParserBench.parseLong:·gc.churn.G1_Eden_Space           thrpt   25     15.785 ±   19.352  MB/sec
LongParserBench.parseLong:·gc.churn.G1_Eden_Space.norm      thrpt   25    544.799 ±  668.200    B/op
LongParserBench.parseLong:·gc.churn.G1_Survivor_Space       thrpt   25     ≈ 10⁻⁴             MB/sec
LongParserBench.parseLong:·gc.churn.G1_Survivor_Space.norm  thrpt   25      0.005 ±    0.011    B/op
LongParserBench.parseLong:·gc.count                         thrpt   25      7.000             counts
LongParserBench.parseLong:·gc.time                          thrpt   25     10.000                 ms

We go from 21860.758 ops/s to 31230.896 ops/s.
We go from an allocation rate of 1139 MB/sec to 14 MB/sec.
We go from 504 gcs taking 334 ms to 7 gc taking 10 ms.

I haven yet looked at JDK 8 results, it's possible the differences are greater there due to the lack of compact strings.

marschall · 2022-12-20T20:31:07Z

Another thing that I'm concerned with is that with a custom long parser, we would need to have test coverage for a wide range of possible values.

NumberInput is already full of custom number parsing code, I refer you to the various #parseInt methods. There is also the custom BigDecimalParser. But I can extend the tests in NumberParsingTest.

NumberParsingTest already covered a lot of cases, I expanded them a bit.

pjfanning · 2022-12-20T20:36:24Z

src/main/java/com/fasterxml/jackson/core/io/NumberInput.java

+     */
+    public static long parseLong19(char[] ch, int off, boolean negative)
+    {
+        // Note: caller must ensure length is [10, 18]


can you remove this comment line which doesn't match this code?

cowtowncoder · 2022-12-20T23:52:12Z

Thank you @marschall -- I'll merge this. May do minor tweaks but looks good & can go in 2.15.0.

marschall · 2022-12-21T10:42:29Z

Could NumberInput.parseLong be changed to call parseLong19 where appropriate - it already uses parseInt for small nums. Would it be simpler just to have parseLong parse all numbers the new way (if the performance was not worse)?

Right now method is quite particular expecting length of 10 - 18 characters and there'd be one more check (and branch). Not sure it would affect optimizer that much; I don't have strong opinion where to modify it and add checks or keep separate method. Separate method would do well to follow the same pattern tho; decoding 2 ints, using int multiplications is faster than doing 19 long multiplications. I remember benchmarking that difference years ago and it was non-trivial.

I had a look at this, it would require a bit more changes. Currently #parseInt only handles ints with length 9. ints with length 10 are parsed as long and then converted to int. Therefore we can't call #parseInt twice to parse long of length 19, only long of length 18.

If we we wanted to change this we would have to:

Introduce NumberInput#inIntRange
Change ParserBase#_parseIntValue() and ParserBase#_parseNumericValue(int) to call NumberInput#inIntRange for length 10 and based on this either call TextBuffer#contentsAsInt(boolean) or TextBuffer#contentsAsLong(boolean).
Change #parseInt to handle a length of up to 10

cowtowncoder · 2022-12-21T17:33:54Z

I think we are ok for now; if anyone really wants to optimize things further, a new PR would be fine.
There's fine line between handling 10/19 lengths optimally vs handling shorter ones -- for most use cases I suspect it is more important to have close(r) to optimal parsing for shorter numbers than longer. That is: adding more checks (-> more branching in code) for different cases slows down existing cases.

Optimize parsing 19 digit longs

c6d50fc

Avoid slow path and allocation when parsing 19 digit longs.

pjfanning reviewed Dec 15, 2022

View reviewed changes

cowtowncoder reviewed Dec 17, 2022

View reviewed changes

src/main/java/com/fasterxml/jackson/core/io/NumberInput.java Show resolved Hide resolved

Implement more number tests

d7501ee

marschall mentioned this pull request Dec 20, 2022

Add LongParserBench pjfanning/jackson-number-parse-bench#2

Merged

pjfanning reviewed Dec 20, 2022

View reviewed changes

cowtowncoder approved these changes Dec 20, 2022

View reviewed changes

cowtowncoder merged commit 3188b19 into FasterXML:2.15 Dec 20, 2022

cowtowncoder added a commit that referenced this pull request Dec 21, 2022

Add release notes wrt #865, minor post-merge tweaks

3fc89cd

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize parsing 19 digit longs #865

Optimize parsing 19 digit longs #865

marschall commented Dec 15, 2022 •

edited

Loading

pjfanning Dec 15, 2022 •

edited

Loading

marschall Dec 20, 2022

pjfanning Dec 15, 2022 •

edited

Loading

pjfanning Dec 15, 2022

marschall Dec 16, 2022

marschall Dec 16, 2022

pjfanning Dec 16, 2022

pjfanning commented Dec 15, 2022

marschall commented Dec 16, 2022

cowtowncoder commented Dec 17, 2022

pjfanning commented Dec 17, 2022

cowtowncoder commented Dec 20, 2022 •

edited

Loading

marschall commented Dec 20, 2022 •

edited

Loading

marschall commented Dec 20, 2022

pjfanning Dec 20, 2022

cowtowncoder commented Dec 20, 2022

marschall commented Dec 21, 2022

cowtowncoder commented Dec 21, 2022

Optimize parsing 19 digit longs #865

Optimize parsing 19 digit longs #865

Conversation

marschall commented Dec 15, 2022 • edited Loading

pjfanning Dec 15, 2022 • edited Loading

Choose a reason for hiding this comment

marschall Dec 20, 2022

Choose a reason for hiding this comment

pjfanning Dec 15, 2022 • edited Loading

Choose a reason for hiding this comment

pjfanning Dec 15, 2022

Choose a reason for hiding this comment

marschall Dec 16, 2022

Choose a reason for hiding this comment

marschall Dec 16, 2022

Choose a reason for hiding this comment

pjfanning Dec 16, 2022

Choose a reason for hiding this comment

pjfanning commented Dec 15, 2022

marschall commented Dec 16, 2022

cowtowncoder commented Dec 17, 2022

pjfanning commented Dec 17, 2022

cowtowncoder commented Dec 20, 2022 • edited Loading

marschall commented Dec 20, 2022 • edited Loading

marschall commented Dec 20, 2022

pjfanning Dec 20, 2022

Choose a reason for hiding this comment

cowtowncoder commented Dec 20, 2022

marschall commented Dec 21, 2022

cowtowncoder commented Dec 21, 2022

marschall commented Dec 15, 2022 •

edited

Loading

pjfanning Dec 15, 2022 •

edited

Loading

pjfanning Dec 15, 2022 •

edited

Loading

cowtowncoder commented Dec 20, 2022 •

edited

Loading

marschall commented Dec 20, 2022 •

edited

Loading