Why does flatMap take a function that returns a Stream instead of a Collection?

Teddy Tsai

Why does the flatMap operation require a function which returns Stream instead of a function that returns a Collection? Any particular reason it forces the user to do the stream conversion manually?

Reading the source code example, I can see that this way compatibility can be extended to arrays, but wouldn't an overload of flatMap achieve the same result?

// Java 8 source code example:
Stream<String> words = lines.flatMap(line -> Stream.of(line.split(" +")));

What are the use cases where it's better to make the conversion to a stream explicit?

Example: why am I forced to do this

Map<String, List<String>> map = new HashMap<String, List<String>>();
List<String> flatList = map.entrySet().stream().flatMap(e -> e.getValue().stream()).collect(Collectors.toList());

instead of this?

Map<String, List<String>> map = new HashMap<String, List<String>>();
List<String> flatList = map.entrySet().stream().flatMap(Map.Entry::getValue).collect(Collectors.toList());
Alexander Ivanchenko

Why does the flatMap() operation require a function which returns Stream instead of a function that returns a Collection?

There are many reasons for that:

  • A Stream is a means of iteration: it does not store data. Its purpose is to iterate lazily over a source of data, which can be a String, an array, an I/O stream, etc.

  • Secondly, Stream operations are divided into two groups: terminal operations, which produce the result and terminate the execution of the stream pipeline (i.e. it's not possible to apply any operation after a terminal one), and intermediate operations, which transform the stream. Intermediate operations are always lazy. A stream takes elements from the source one by one and processes them lazily, i.e. operations occur only when needed. Don't confuse a stream with a chain of nested for-loops; they behave differently. Every intermediate operation produces a new stream.
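
To make the laziness point concrete, here is a minimal sketch. The class name LazyStreamDemo, its methods, and the visited list are made up for this illustration; only the Stream calls are standard API. It records which source elements an intermediate operation actually touches on a sequential stream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamDemo {

    // Hypothetical bookkeeping: which source elements were actually pulled through the pipeline.
    static final List<String> visited = new ArrayList<>();

    // Only intermediate operations are chained here, so nothing is executed yet.
    static int elementsTouchedBeforeTerminalOp() {
        visited.clear();
        Stream<String> pipeline = Stream.of("a", "b", "c")
                .peek(visited::add)
                .map(String::toUpperCase);
        return visited.size();
    }

    // The terminal operation triggers processing; limit(2) stops pulling after two elements.
    static List<String> firstTwoUpperCased() {
        visited.clear();
        return Stream.of("a", "b", "c", "d")
                .peek(visited::add)
                .map(String::toUpperCase)
                .limit(2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(elementsTouchedBeforeTerminalOp()); // 0
        System.out.println(firstTwoUpperCased());              // [A, B]
        System.out.println(visited);                           // [a, b]
    }
}
```

A nested for-loop would have visited every element; here "c" and "d" are never pulled from the source.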

Here's a quote from the API documentation:

Streams differ from collections in several ways:

  • No storage. A stream is not a data structure that stores elements; instead, it conveys elements from a source such as a data structure, an array, a generator function, or an I/O channel, through a pipeline of computational operations.

  • Laziness-seeking. Many stream operations, such as filtering, mapping, or duplicate removal, can be implemented lazily, exposing opportunities for optimization. For example, "find the first String with three consecutive vowels" need not examine all the input strings. Stream operations are divided into intermediate (Stream-producing) operations and terminal (value- or side-effect-producing) operations. Intermediate operations are always lazy.

  • Since Streams are internal iterators over a source of data that can be of a different nature (not necessarily a Collection), it's reasonable for flatMap() to expect data in a predictable, uniform shape: not an array, Collection, Iterable, etc., but another internal iterator, i.e. another Stream, so that it is obvious how to deal with it.

Any other option that you can come up with would be less intuitive. If flatMap() were implemented so that it expected a function producing a Collection, how would you deal with strings, arrays, I/O streams, or various implementations of Iterable? Dumping the data into a Collection first is not an option. The same issue would arise if we imagine that flatMap() required an Iterable: how would we produce an Iterable from a String? Streams are designed to be versatile.
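
As a rough illustration of that versatility, the sketch below flattens three different kinds of sources with the same flatMap() signature. The class FlatMapSources and its method names are hypothetical; the conversions used (Collection.stream(), Arrays.stream(), String.chars()) are standard JDK methods.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapSources {

    // Collection source: every List already knows how to expose itself as a Stream.
    static List<String> fromCollections(List<List<String>> lists) {
        return lists.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    // Array source: Arrays.stream() adapts an array without copying it into a Collection.
    static List<String> fromArrays(List<String[]> arrays) {
        return arrays.stream()
                .flatMap(Arrays::stream)
                .collect(Collectors.toList());
    }

    // String source: chars() yields an IntStream, which is boxed into a Stream<Character>.
    static List<Character> fromStrings(List<String> words) {
        return words.stream()
                .flatMap(word -> word.chars().mapToObj(c -> (char) c))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(fromCollections(Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c"))));
        System.out.println(fromArrays(Arrays.asList(new String[] {"a", "b"}, new String[] {"c"})));
        System.out.println(fromStrings(Arrays.asList("hi", "yo")));
    }
}
```

In each case the caller chooses the cheapest way to turn the source into a Stream, and flatMap() itself never needs to know what the source was.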

I suspect that your judgement regarding flatMap() is biased because you are not accustomed to it. Once you embrace the idea that a Stream is an internal iterator, the fact that the operation for flattening the data expects a function producing another iterator will feel more intuitive.
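
For the example in the question, the explicit conversion can be reduced to a single method reference. This is only a sketch: the class FlattenMapValues and the method flattenValues are names invented here, and iterating over map.values() instead of map.entrySet() is a simplification on my part, not something the question requires.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FlattenMapValues {

    // List::stream is the function that turns each value into another Stream.
    static List<String> flattenValues(Map<String, List<String>> map) {
        return map.values().stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // LinkedHashMap is used only to make the iteration order predictable.
        Map<String, List<String>> map = new LinkedHashMap<>();
        map.put("x", Arrays.asList("a", "b"));
        map.put("y", Arrays.asList("c"));
        System.out.println(flattenValues(map)); // [a, b, c]
    }
}
```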
