Which ExecutorService is best for blocking IO tasks?

Let's imagine that we have n independent blocking IO tasks, e.g. REST calls to another server, whose answers we then need to combine. Each task can take about 10 seconds to complete.

  1. We can process them sequentially and spend ~n*10 seconds in total:

    Task1Ans task1 = service1.doSomething();
    Task2Ans task2 = service2.doSomething();
    ...
    return result;
    
  2. Another strategy is to process them in parallel using CompletableFuture and spend ~10 seconds for all tasks:

    CompletableFuture<Task1Ans> task1Cs = CompletableFuture.supplyAsync(() -> service1.doSomething(), bestExecutor);
    CompletableFuture<Task2Ans> task2Cs = CompletableFuture.supplyAsync(() -> service2.doSomething(), bestExecutor);
    return CompletableFuture.allOf(task1Cs, task2Cs)
       .thenApply(nothing -> {
           ...
           // combine task1, task2 into result object
           return result;
       }).join();
    

The second approach has benefits, but I can't figure out which type of thread pool is best for this kind of task:

ExecutorService bestExecutor = Executors.newFixedThreadPool(30); // or Executors.newCachedThreadPool() or Executors.newWorkStealingPool()

My question is: which ExecutorService is best for processing n parallel blocking IO tasks?



Solution 1:[1]

On completely CPU-bound tasks you do not gain additional performance by running more threads than CPU cores. So in that scenario an 8-core/8-thread CPU needs only 8 threads to maximize performance, and loses performance with more. IO tasks usually do gain from a larger number of threads than CPU cores, as CPU time is available for other work while waiting for IO. But even when the CPU overhead of each thread is low, there are limits to scaling, as each thread consumes memory and incurs caching/context-switching costs.

Given that your tasks are IO-bound and you didn't provide any other constraints, you should probably just run a separate thread for each of your IO tasks. You can achieve this with either a fixed or a cached thread pool.

If the number of IO tasks is very large (thousands or more), you should limit the maximum size of your thread pool, because there is such a thing as too many threads.
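
For example, a minimal sketch of a bounded pool handling many blocking tasks (the pool size of 50, the 5,000 tasks, and the sleep-based stand-in for a remote call are all illustrative values, not recommendations):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class BoundedIoPoolSketch {
        public static void main(String[] args) throws Exception {
            // Cap the number of threads; excess tasks simply wait in the pool's queue.
            ExecutorService ioPool = Executors.newFixedThreadPool(50);

            List<Future<String>> results = new ArrayList<>();
            for (int i = 0; i < 5_000; i++) {
                final int id = i;
                results.add(ioPool.submit(() -> {
                    Thread.sleep(10); // stand-in for a blocking remote call
                    return "response-" + id;
                }));
            }

            for (Future<String> f : results) {
                f.get(); // block until each task has produced its answer
            }
            ioPool.shutdown();
        }
    }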

If your tasks are CPU-bound, you should limit the thread pool to an even smaller size. The number of cores can be fetched dynamically with:

int cores = Runtime.getRuntime().availableProcessors();
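
That value can then be used to size a pool for CPU-bound work, for example (the variable name here is just illustrative):

ExecutorService cpuBoundPool = Executors.newFixedThreadPool(cores); // one thread per core for CPU-bound work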

Also, just as your CPU has a scaling limit, your IO device usually has one too. You should not exceed that limit, but without measuring it is hard to say where the limit is.

Solution 2:[2]

Project Loom

Your situation is suited to using the new features being proposed for future versions of Java: virtual threads and structured concurrency. These are part of Project Loom.

Today’s Java threads are mapped one-to-one onto host operating system threads. When Java code blocks, the host thread blocks: the host OS thread sits idle, waiting for execution to resume. Host OS threads are heavyweight, costly in terms of both CPU and memory, so this idling is not optimal.

In contrast, virtual threads in Project Loom are mapped many-to-one onto host OS threads. When code in a virtual thread blocks, that task is “parked”, set aside to allow another virtual thread’s task some execution time. This parking of virtual threads is managed within the JVM, so it is highly optimized: very fast and very efficient in both CPU and memory. As a result, Java apps running on common hardware can support thousands, even millions, of virtual threads at a time.

In Loom, ExecutorService is AutoCloseable, so we can use try-with-resources to contain your entire batch of tasks in a try ( ExecutorService es = Executors.newVirtualThreadPerTaskExecutor() ) { … submit tasks … } block. Once the flow of control exits the try-with-resources block, you know your tasks are done. Access the Future object returned for each task you submitted. No need for CompletableFuture.
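
As a rough sketch, assuming a JDK where these Loom APIs are available (the slowCall method here is just a stand-in for your blocking service calls):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class LoomBatchSketch {
        public static void main(String[] args) throws Exception {
            String combined;
            try (ExecutorService es = Executors.newVirtualThreadPerTaskExecutor()) {
                Future<String> f1 = es.submit(() -> slowCall("service1"));
                Future<String> f2 = es.submit(() -> slowCall("service2"));
                // Leaving the try-with-resources block waits for the submitted tasks to finish.
                combined = f1.get() + " & " + f2.get();
            }
            System.out.println(combined);
        }

        // Stand-in for a slow, blocking remote call.
        private static String slowCall(String name) throws InterruptedException {
            Thread.sleep(1_000);
            return name + "-answer";
        }
    }

Each submitted task runs on its own virtual thread, so blocking inside slowCall does not tie up an OS thread.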

Experimental builds of a JDK with Loom technology are available now, based on early-access Java 19. While in the works for some years, Loom is still evolving and may change. Not yet ready for production, though I’d be tempted to try it in some non-critical capacity.

For more info, see the several articles, presentations, and interviews with members of the Project Loom team. These include Ron Pressler and Alan Bateman.

Solution 3:[3]

If I understand your question properly, for the behaviour above it matters less which ExecutorService you select and more how you call it.

E.g.

ExecutorService executorService=Executors.newCachedThreadPool();
executorService.invokeAll(..);

Now, invokeAll(..) will block until all of the supplied tasks have completed. So I feel that selecting any ExecutorService and calling invokeAll(..) will suit your requirement.
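
For example, a small sketch of that shape (slowCall is just a placeholder for the real blocking service calls):

    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class InvokeAllSketch {
        public static void main(String[] args) throws Exception {
            ExecutorService executorService = Executors.newCachedThreadPool();

            // Stand-ins for the blocking service calls from the question.
            List<Callable<String>> tasks = List.of(
                    () -> slowCall("service1"),
                    () -> slowCall("service2"));

            // invokeAll blocks until every supplied task has completed (or failed).
            List<Future<String>> futures = executorService.invokeAll(tasks);
            for (Future<String> f : futures) {
                System.out.println(f.get()); // the tasks are already done at this point
            }
            executorService.shutdown();
        }

        private static String slowCall(String name) throws InterruptedException {
            Thread.sleep(1_000); // simulate blocking IO
            return name + "-answer";
        }
    }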

Also, please have a look at this SE question, which discusses ExecutorCompletionService and invokeAll.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Talijanac
Solution 2
Solution 3 Ashish Patil