JDK-8277095 : Empty streams create too many objects #6275
Conversation
Streams are closeable, and a terminal operation may be invoked on a given stream only once. Thus, shouldn't the third line in both of the examples below throw?
I don't think that we can remove all the state from an empty stream, but we can likely make it smaller.
That would be fairly easy to solve by having two instances of the EmptyStream. The terminal operations would return the instance whose terminal operations throw IllegalStateException.
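A minimal sketch of that state machine (a hypothetical `MiniEmptyStream` class, not the real `java.util.stream.Stream` interface): once a terminal operation has run, any further terminal operation throws `IllegalStateException` with the same message the JDK uses.

```java
import java.util.function.Consumer;

// Hypothetical illustration of the "consumed empty stream" idea: a
// terminal operation on the open empty stream flips it into a consumed
// state, so a second terminal operation throws IllegalStateException.
public class MiniEmptyStream {
    private boolean consumed;

    // Terminal operation: does nothing (the stream is empty), but may run once.
    public void forEach(Consumer<Object> action) {
        if (consumed) {
            throw new IllegalStateException(
                    "stream has already been operated upon or closed");
        }
        consumed = true; // remember that a terminal operation was invoked
    }

    public static void main(String[] args) {
        MiniEmptyStream s = new MiniEmptyStream();
        s.forEach(System.out::println); // fine, prints nothing
        try {
            s.forEach(System.out::println); // second terminal operation
        } catch (IllegalStateException e) {
            System.out.println("second terminal op: " + e.getMessage());
        }
    }
}
```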
@@ -740,6 +740,7 @@ default boolean removeIf(Predicate<? super E> filter) {
      * @since 1.8
      */
     default Stream<E> stream() {
+        if (isEmpty()) return Stream.empty();
The net effect of this change might depend on your workload. If you call stream() on empty collections that have cheap isEmpty(), this change will likely improve performance and reduce waste. However, this same change might do the opposite if some of your collections aren't empty or have costly isEmpty(). It would be good to have benchmarks for different workloads.
Wouldn't this make streams no longer lazy if the collection is empty?
List<String> list = new ArrayList<>();
Stream<String> stream = list.stream();
list.addAll(List.of("one", "two", "three"));
stream.forEach(System.out::println); // prints one two three
(immutable collections could override stream() instead, since they don't have that problem)
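A sketch of that alternative, assuming a hypothetical unmodifiable list class (the name `ImmutableListView` is made up): because its contents can never change, returning `Stream.empty()` eagerly from an empty instance is observably equivalent to the lazy version.

```java
import java.util.AbstractList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical sketch: an unmodifiable list that overrides stream() to
// return Stream.empty() when it is empty. Safe because the backing
// contents can never change, so the loss of laziness is unobservable.
public class ImmutableListView<E> extends AbstractList<E> {
    private final Object[] elements;

    @SafeVarargs
    public ImmutableListView(E... elements) {
        this.elements = elements.clone(); // defensive copy; no mutators exposed
    }

    @Override public E get(int index) {
        @SuppressWarnings("unchecked")
        E e = (E) elements[index];
        return e;
    }

    @Override public int size() { return elements.length; }

    @Override public Stream<E> stream() {
        // No late-binding concern: an empty immutable list stays empty.
        return elements.length == 0 ? Stream.empty() : super.stream();
    }

    public static void main(String[] args) {
        List<String> empty = new ImmutableListView<>();
        List<String> full = new ImmutableListView<>("a", "b");
        System.out.println(empty.stream().count()); // 0
        System.out.println(full.stream().count());  // 2
    }
}
```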
The net effect of this change might depend on your workload. If you call stream() on empty collections that have cheap isEmpty(), this change will likely improve performance and reduce waste. However, this same change might do the opposite if some of your collections aren't empty or have costly isEmpty(). It would be good to have benchmarks for different workloads.
Yes, I also thought about the cost of isEmpty() on concurrent collections. There are four concurrent collections that have a linear time cost size() method: CLQ, CLD, LTQ and CHM. However, in each of these cases, the isEmpty() method has constant time cost. There might be collections defined outside the JDK where isEmpty() is costly.
However, I will extend the benchmark to include a few of those cases too, as well as different stream and collection sizes.
Thank you so much for your input.
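A rough illustration of the size()/isEmpty() cost difference on ConcurrentLinkedQueue (not a rigorous benchmark; absolute timings will vary by machine and JIT state):

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch of why the guard should use isEmpty() rather than size() == 0:
// ConcurrentLinkedQueue.size() traverses the whole linked list (O(n)),
// while isEmpty() only needs to inspect the head of the queue (O(1)).
public class IsEmptyVsSize {
    public static void main(String[] args) {
        ConcurrentLinkedQueue<Integer> queue = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < 1_000_000; i++) queue.add(i);

        long t0 = System.nanoTime();
        boolean empty = queue.isEmpty();          // constant time
        long t1 = System.nanoTime();
        boolean emptyViaSize = queue.size() == 0; // walks all nodes
        long t2 = System.nanoTime();

        System.out.printf("isEmpty(): %b in %d ns%n", empty, t1 - t0);
        System.out.printf("size()==0: %b in %d ns%n", emptyViaSize, t2 - t1);
    }
}
```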
Wouldn't this make streams no longer lazy if the collection is empty?
List<String> list = new ArrayList<>();
Stream<String> stream = list.stream();
list.addAll(List.of("one", "two", "three"));
stream.forEach(System.out::println); // prints one two three
I did not consider this case, thank you for bringing it up. I have always found this behaviour a bit strange and have never used it "in the real world". It is also not consistent between collections. Here is an example with four collections: ArrayList, CopyOnWriteArrayList, ConcurrentSkipListSet and ArrayBlockingQueue:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Supplier;
import java.util.stream.IntStream;
public class LazyStreamDemo {
    public static void main(String... args) {
        List<Supplier<Collection<String>>> suppliers =
                List.of(ArrayList::new, // fail-fast
                        CopyOnWriteArrayList::new, // snapshot
                        ConcurrentSkipListSet::new, // weakly-consistent
                        () -> new ArrayBlockingQueue<>(10) // weakly-consistent
                );
        for (Supplier<Collection<String>> supplier : suppliers) {
            Collection<String> c = supplier.get();
            System.out.println(c.getClass());
            IntStream stream = c.stream()
                    .sorted()
                    .filter(Objects::nonNull)
                    .mapToInt(String::length)
                    .sorted();
            c.addAll(List.of("one", "two", "three", "four", "five"));
            System.out.println("stream = " + Arrays.toString(stream.toArray()));
        }
    }
}
The output is:
class java.util.ArrayList
stream = [3, 3, 4, 4, 5]
class java.util.concurrent.CopyOnWriteArrayList
stream = []
class java.util.concurrent.ConcurrentSkipListSet
stream = []
class java.util.concurrent.ArrayBlockingQueue
stream = [3, 3, 4, 4, 5]
At least with the EmptyStream we would have the consistent output of always [].
@kabutz I agree that I wouldn't consider it clean code to use a stream like I put into the example. I only brought it up because it might break existing code, since I think streams are expected to be lazy. Interesting to see that they are in fact not lazy in all situations - I put that into my notes.
Edit: actually I think the implementation of Collection::stream could simply be changed to:
default Stream<E> stream() {
    var spliterator = spliterator();
    if ((spliterator.hasCharacteristics(Spliterator.IMMUTABLE)
            || spliterator.hasCharacteristics(Spliterator.CONCURRENT)) && isEmpty()) {
        return Stream.empty();
    }
    return StreamSupport.stream(spliterator, false);
}
Note that the spliterators of methods such as Collections::emptyList, List::of, List::copyOf, Set::of, ... currently don't have the IMMUTABLE characteristic though, so they should be updated.
The Javadoc for java.util.stream is clear though:
Traversal of the pipeline source does not begin until the terminal operation of the pipeline is executed.
and further on says:
Spliterators for mutable data sources have an additional challenge; timing of binding to the data, since the data could change between the time the spliterator is created and the time the stream pipeline is executed. Ideally, a spliterator for a stream would report a characteristic of IMMUTABLE or CONCURRENT; if not it should be late-binding.
which explains why the collections in java.util.concurrent
(whose spliterators have one of those characteristics) need not be late-binding.
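One subtlety worth noting about the characteristics check: `Spliterator.hasCharacteristics(int)` tests whether *all* of the given bits are reported, so passing `IMMUTABLE | CONCURRENT` in one call would only match a spliterator reporting both. A small standalone demonstration:

```java
import java.util.Spliterator;
import java.util.Spliterators;

// Demonstrates that hasCharacteristics(A | B) asks for ALL the given
// bits, so a spliterator that is only IMMUTABLE fails the combined test
// and each characteristic must be tested separately.
public class CharacteristicsDemo {
    public static void main(String[] args) {
        Spliterator<Object> s =
                Spliterators.spliterator(new Object[0], Spliterator.IMMUTABLE);

        System.out.println(s.hasCharacteristics(Spliterator.IMMUTABLE)); // true
        System.out.println(s.hasCharacteristics(
                Spliterator.IMMUTABLE | Spliterator.CONCURRENT)); // false

        // The intended "either characteristic" test:
        boolean immutableOrConcurrent =
                s.hasCharacteristics(Spliterator.IMMUTABLE)
                        || s.hasCharacteristics(Spliterator.CONCURRENT);
        System.out.println(immutableOrConcurrent); // true
    }
}
```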
Thanks @anthonyvdotbe I believe that would satisfy the lazy binding, and then we should increase the use of the IMMUTABLE characteristic where it makes sense.
Looks like a good idea to reduce the pipeline overhead!
I've added far more JMH tests: stream sizes of [0, 1, 10, 100], different types of streams from minimal to complex, five different collections (ArrayList, ConcurrentLinkedQueue, ConcurrentSkipListSet, CopyOnWriteArrayList, ConcurrentHashMap.newKeySet()), and lastly Stream.empty() (the new behavior) versus StreamSupport.stream(...) (the old behavior). With all this multiplication of test options, it will take a couple of days to run on my server. Will let you know if anything surprising pops up.
Thanks for your excellent suggestions. It seems that we cannot get away from creating some objects. I've got a new version in the works that makes EmptyStream instances each time, with their own state. The performance is still good: for a simple stream pipeline, it is roughly twice as fast for an empty stream, and there is no noticeable slow-down for non-empty streams.
Streams are still not created lazily in my updated version, but I'm hoping it is a step in the right direction. I'm running an extensive JMH suite overnight to compare object allocation rates and throughput for the streams.
Here are the maximum throughput numbers for the JMH benchmark for various lengths, collections, types of stream decoration, and whether it is the old-style stream creation or the new EmptyStream. We also see the speedup with the new system. There are some strange effects, with small differences once we get past length 0, most likely explained by noise in the benchmark. The speedup for empty streams ranges from about 2x for minimal stream decoration to about 9x for the more complex kind.
@@ -5444,7 +5444,9 @@ public static void parallelSetAll(double[] array, IntToDoubleFunction generator)
      * @since 1.8
      */
     public static <T> Stream<T> stream(T[] array, int startInclusive, int endExclusive) {
-        return StreamSupport.stream(spliterator(array, startInclusive, endExclusive), false);
+        var spliterator = spliterator(array, startInclusive, endExclusive);
+        if (startInclusive == endExclusive) return Stream.empty();
Can't we just add the if statement before the spliterator is computed?
Thanks @liach for the suggestion. The spliterator serves the purpose of checking the arguments. For example, if array is null, the method should throw a NullPointerException. Similarly, startInclusive and endExclusive have to be in the range of the array length. If we swapped the lines around, someone could call stream(null, -100, -100) and it would happily return an empty stream.
Furthermore, the empty streams can inherit the characteristics of a spliterator. In the code you pointed out, I don't do that, but I will change it. It makes it much easier to keep the empty streams consistent with their previous behaviour. One might argue that it is pointless to call distinct(), sorted() and unordered() on an empty stream and expect it to change, but it was easy enough to implement so that we can keep consistency, FWIW.
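The fail-fast argument checking described here can be observed directly with the existing Arrays.stream(T[], int, int) overload; nothing below is new API:

```java
import java.util.Arrays;

// Shows why the range check must run even when the range is empty:
// Arrays.stream(array, from, to) validates its arguments eagerly via
// the spliterator it creates, so bad arguments fail fast.
public class ArgumentCheckDemo {
    public static void main(String[] args) {
        String[] words = {"a", "b", "c"};

        // Empty range with valid arguments: an empty stream.
        System.out.println(Arrays.stream(words, 1, 1).count()); // 0

        try {
            Arrays.stream((String[]) null, 0, 0); // null array
        } catch (NullPointerException e) {
            System.out.println("NPE for null array");
        }

        try {
            Arrays.stream(words, -100, -100); // out-of-range indices
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("AIOOBE for bad range");
        }
    }
}
```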
I have a similar project at empty-streams. A couple of notes:
Reading the Javadoc again, I'm not sure this is allowed. The method Javadoc doesn't clearly say it, but the package Javadoc for intermediate operations says a new stream is returned.
I found the need for ALL the Spliterator characteristics, plus some others that I interleaved between the Spliterator bits. That way I could fit them into a single int. Parallel was too tricky though, so in that case I return a new stream built with the standard StreamSupport.stream(spliterator, true).
Similarly, when someone calls closeHandler(), I also return a new stream built with StreamSupport.
Yes, we need this for TreeSet and ConcurrentSkipListSet. I was hoping to avoid it, but there is no way around if we want to be completely correct.
Agreed. I fixed that.
Oh that's a good suggestion - thanks.
I now always return a new EmptyStream. Fortunately escape analysis gets rid of a lot of objects.
unordered() only does a state change if it is currently ORDERED; otherwise it is correct to return this. The logic is convoluted and weird though. I have a more thorough test case that I need to incorporate into my EmptyStreamTest, which tests all the JDK collections, including the wrapper and immutable collections.
Interesting idea. I'm not sure if that would work for me. I will have a look if I can also do this, but I doubt it. A lot of the methods, especially the primitive ones, are repeats of each other. It might be really nice to put that into a common superclass.
I tried to keep the characteristics identical to what we had before, and parallel seemed particularly challenging to get right. I might attempt this again at a later stage, but for now I want to keep it like it is. I don't think parallel() is the most common use case, except with very large collections perhaps. Even though I thought I was careful with the characteristics, I still have a few edge cases I need to iron out.
Yes, I also returned "this" initially to avoid object allocation, but fortunately escape analysis helps get rid of them if they are indeed unnecessary. My object allocation for the new empty streams is minimal. The one thing that is still hindering me is the lazy creation of streams.
Does anyone use this "feature"? I hope not, but unfortunately it's the current behavior and we probably have to support it :-(
…improved unordered() Added StreamSupport.empty for primitive spliterators and use that in Arrays.stream()
After looking at this, I decided I'd rather not short-circuit the methods, since they have checking code that I would have to duplicate.
@kabutz This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
Add another comment to keep this active.
Another comment
Still wondering whether this can ever join the JDK.
Satisfy the bot with a comment - we are on day 20 now :)
@kabutz This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the
I hope this PR isn't abandoned. It's the kind of optimization I would hope streams could do - even if this was initially only applied to arrays and immutable lists. One minor suggestion would be to move the implementations into a new package-visible EmptyStreams class to reduce churn in the already large Streams class.
 * @param spliterator a {@code Spliterator.OfInt} describing the stream elements
 * @return a new sequential empty {@code IntStream}
 */
public static IntStream emptyIntStream(Spliterator.OfInt spliterator) {
In the interest of not adding new public APIs, can these be removed in favour of, e.g., IntStream.empty()? It wouldn't take the spliterator, but perhaps that isn't necessary.
I agree it’s the “kind of” optimization that would be nice. “Kind of”. Personally I would be happier to see complexity like this added that would help a larger class of common streams. It’s a harder problem, and I know this is a case of “the best is the enemy of the good”, but I think a stream which has less content bulk than pipeline phases (according to some heuristic weighting) might be handled better by dumping the elements into an Object array and running each phase in sequence over that array, rather than composing a “net result of all phases” object and then running it over the few elements. Stream object creation can be reduced, perhaps, by building the stream around a small internal buffer that collects pipeline phases (and their lambdas) by side effect. The terminal operation runs this Rube-Goldberg contraption in an interpretive manner over the elements. An advantage would arise if the contraption were smaller and simpler than a fully-composed stream of today, and the optimizations lost by having an interpreter instead of a specialized object were insignificant due to the small bulk of the stream source. If such an optimization really works, it would automatically handle the zero-elements case, but would also cover lots of use cases (even in the JDK code) where streams are used for their notational convenience rather than their ability to process many elements efficiently.
I looked at this for a few days, several months ago. I solved enough problems to see that there is a long tail of difficulties in stream implementation! (Many are noted in this PR thread.) I noticed that some pipeline phases can expand the “bulk” (flatMap and friends). Bulk-reducing phases (filter) are not a problem for already-small streams. (These issues do not arise for truly empty streams.) For expansion there would have to be an option to inflate a previously-small stream to use the general paths. Another issue is avoiding running any lambdas until the terminal operation, which means capturing them in some orderly fashion. Again, if a bizarre pipeline structure shows up, inflation is an option. And for truly empty streams, some or all of the pipeline structure can simply be dropped on the floor, as this PR proposes.
In the end, I think the best leverage will come from a completely different set of techniques, from whatever allows us to do “loop customization”. By loop customization I mean the ability to compile the loop in the terminal operation separately for each “interestingly distinct” combination of pipeline components, in such a way that the loop can constant-fold their lambdas, the shape of the stream source, and anything else that affects loop code quality. That technique should apply well regardless of “bulk”, since the most complex object allocation would happen during specialization, which is a link-time operation, instead of every time a stream is created. Current state of the art uses mega-inlining, which has complex limitations and failure modes. (It utterly fails for parallel streams.) When we get to specialized generics, I hope to take another run at the problem, so that the type of each pipeline lambda “feeds” into a specialization syndrome that the JVM “sees” as distinct from other specializations, and can optimize separately. (Making streams be value classes could help also.) I guess all that might be “the impossible dream being the enemy of the best”. Anyway, FWIW…
Indeed, this was a nice experiment, but it would be even better to solve the issues in a more general way. It also turned into a rather complex change that modified a bit how streams currently work. Even though it took a couple of weeks of work, especially in creating the unit tests, I am 100% in favour of mothballing this. Until we solve it, there is an easy workaround that should also work in the future: just check ahead of time whether the collection is empty, and in that case, don't create a stream.
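The workaround can be wrapped in a small helper; the class and method names below (`Streams2.streamOf`) are made up for illustration and are not JDK API:

```java
import java.util.Collection;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper illustrating the workaround: skip pipeline
// construction entirely for empty collections. streamOf is a made-up
// name, not a JDK method.
public final class Streams2 {
    private Streams2() {}

    public static <T> Stream<T> streamOf(Collection<T> collection) {
        // Stream.empty() is far cheaper than building a full pipeline.
        return collection.isEmpty() ? Stream.empty() : collection.stream();
    }

    public static void main(String[] args) {
        System.out.println(streamOf(List.of()).count());        // 0
        System.out.println(streamOf(List.of(1, 2, 3)).count()); // 3
    }
}
```

Note that this trades away the late-binding behaviour discussed earlier in the thread: elements added to an initially empty collection after the call are not seen by the returned stream.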
Sounds reasonable. I've increasingly been seeing streams used because they are a convenient API, which made me wonder how well they deal with optimizations. I appreciate the extra insight.
This is a draft proposal for how we could improve stream performance for the case where the streams are empty. Empty collections are commonplace. If we iterate over them with an Iterator, we would have to create one small Iterator object (which could often be eliminated), and if it is empty we are done. However, with Streams we first have to build up the entire pipeline, until we realize that there is no work to do. With this example, we change Collection#stream() to first check if the collection is empty, and if it is, we simply return an EmptyStream. We also have EmptyIntStream, EmptyLongStream and EmptyDoubleStream. We have taken great care for these to have the same characteristics and behaviour as the streams returned by Stream.empty(), IntStream.empty(), etc.
Some of the JDK tests fail with this, due to ClassCastExceptions (our EmptyStream is not an AbstractPipeline) and AssertionError, since we can call some methods repeatedly on the stream without it failing. On the plus side, creating a complex stream on an empty stream gives us upwards of 50x increase in performance due to a much smaller object allocation rate. This PR includes the code for the change, unit tests and also a JMH benchmark to demonstrate the improvement.
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/6275/head:pull/6275
$ git checkout pull/6275
Update a local copy of the PR:
$ git checkout pull/6275
$ git pull https://git.openjdk.java.net/jdk pull/6275/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 6275
View PR using the GUI difftool:
$ git pr show -t 6275
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/6275.diff