Deferred Result: Make Spring Boot Non-Blocking


Divyesh Kanzariya, Java Tutorials Spot

Non-blocking I/O is achievable in Spring using the deferred result. This article examines how to improve speed on a heavily trafficked website using Spring and how to test a deferred result.

Spring is fast but suffers from dedicating a single thread per request. The deferred result helps eradicate this issue.

Why is My Spring Boot Website Slow?

Spring is written in Java, which provides non-blocking facilities. However, Spring MVC dedicates a single thread to each request by default. No other work is performed on that thread for the life of the request.

This means that performance tuning often revolves around dedicating more RAM or CPU to the website. Each thread costs the JVM memory and scheduling overhead, and this becomes cumbersome under load no matter how well built a system is.

Executing a single request per thread is particularly problematic when requests to other services are made or heavy computation is performed. For instance, a program might make a call to a database system. The handling thread will wait until the response is received before continuing. A bottleneck occurs when thousands or millions of requests are processed at once.

When Can Non-Blocking I/O Help?

Non-blocking I/O addresses the problem outlined above by performing the slow part of a request on a thread pool. Computation-heavy and I/O-laden work completes in the background while the main thread works on other tasks. By not blocking, the computer is free to perform more work, including handling other requests in Spring.
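
The idea can be sketched with plain JDK classes, outside of Spring. This is only an illustration and not how Spring is implemented internally: slowLookup, lookupAsync, the pool size of 4 and the 200 ms delay are hypothetical stand-ins for a real database or service call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class BackgroundOffload {

    // Hypothetical stand-in for a slow database or remote call.
    static String slowLookup(String key) {
        try {
            Thread.sleep(200); // simulate I/O latency
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "value-for-" + key;
    }

    // Hands the slow call to the pool and returns a future immediately.
    static CompletableFuture<String> lookupAsync(String key, ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> slowLookup(key), pool);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            CompletableFuture<String> pending = lookupAsync("user-42", pool);
            // The calling thread is free to do other work here.
            System.out.println("caller keeps working while the lookup runs");
            // join() waits only at the point where the value is needed.
            System.out.println(pending.join());
        } finally {
            pool.shutdown();
        }
    }
}
```

The calling thread prints its message before the lookup finishes, which is the same effect a deferred result gives a request-handling thread.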

Java contains a variety of mechanisms to avoid blocking. Fork-join pools and thread pools are both available to the developer.

Fork-join pools are particularly useful. In this pool, idle threads take up (steal) queued work from busy threads.
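
As a rough sketch of how work reaches those idle threads, the example below splits an array sum into halves with the JDK's RecursiveTask. The class name ForkJoinSum, the threshold of 1,000 elements and the parallelism of 4 are assumptions chosen for illustration.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

final class ForkJoinSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // assumed cutoff for splitting
    private final long[] values;
    private final int from;
    private final int to;

    ForkJoinSum(long[] values, int from, int to) {
        this.values = values;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += values[i];
            }
            return sum;
        }
        int mid = from + (to - from) / 2;
        ForkJoinSum left = new ForkJoinSum(values, from, mid);
        ForkJoinSum right = new ForkJoinSum(values, mid, to);
        left.fork();                     // queue the left half; an idle worker may steal it
        long rightSum = right.compute(); // work on the right half in this thread
        return left.join() + rightSum;
    }

    // Sums the array on a dedicated pool and releases the pool afterwards.
    static long sum(long[] values) {
        ForkJoinPool pool = new ForkJoinPool(4);
        try {
            return pool.invoke(new ForkJoinSum(values, 0, values.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] values = new long[10_000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i + 1;
        }
        System.out.println(sum(values));
    }
}
```

Forking one half and computing the other in the current thread keeps the current worker busy while leaving the forked half available for stealing.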

How Can I Perform Non-Blocking I/O in Spring Boot?

Spring Boot allows for non-blocking I/O to be performed through the deferred result. Error or response objects are set in the deferred result which serves as the return value:

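public DeferredResult<String> testDeferred() {
    DeferredResult<String> deferred = new DeferredResult<>();
    // On timeout, report an error result instead of leaving the request hanging.
    deferred.onTimeout(() -> deferred.setErrorResult("timeout"));
    // The original snippet is cut off here; the thread body below is an assumed completion.
    new Thread(() -> deferred.setResult("done")).start();
    return deferred;
}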

A deferred object is thread-safe, so the return value can be set in the result from another thread with setResult. An error object (setErrorResult) and a function executed on timeout (onTimeout) may be configured as well.

How Can I Test Non-Blocking I/O in Spring?

Testing non-blocking I/O is possible through MockMvc or RestAssured. MockMvc handles asynchronous requests differently from synchronous calls:

// Fire the request and verify that asynchronous processing started.
ResultActions resultActions = this.mockMvc.perform(post("/test_path"));
MvcResult result = resultActions.andExpect(request().asyncStarted()).andReturn();
// Re-dispatch with the pending result, then assert on the response as usual.
result = this.mockMvc.perform(asyncDispatch(result)).andReturn();


After ensuring that the asynchronous behavior started execution and awaiting a return value, testing continues as before.


Spring and Spring Boot can perform non-blocking I/O calls. This allows the framework to wait for computation-heavy processes and asynchronous network requests to complete while other network calls are handled by the original thread.

A fork-join pool or thread pool ensures that only a certain number of threads are created to handle background tasks.
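
The bound on thread creation can be observed with a small JDK-only sketch. The helper name distinctWorkers and the numbers used (3 threads, 50 tasks) are hypothetical; the point is only that a fixed pool reuses its workers rather than creating one thread per task.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class BoundedPoolDemo {

    // Runs taskCount short tasks on a pool of poolSize threads and
    // returns how many distinct worker threads actually ran them.
    static int distinctWorkers(int poolSize, int taskCount) {
        Set<String> workerNames = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> workerNames.add(Thread.currentThread().getName()));
        }
        pool.shutdown(); // accept no new tasks; queued tasks still run
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return workerNames.size();
    }

    public static void main(String[] args) {
        int used = distinctWorkers(3, 50);
        // The count never exceeds the pool size, however many tasks are queued.
        System.out.println("worker threads used: " + used);
    }
}
```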


Running a Gevent StreamServer in a Thread for Maximum Control

There are times when serving requests needs to be controlled without forcibly killing a running server instance. It is possible to do this with Python’s gevent library. In this article, we will examine how to control the gevent StreamServer through an event.

All code is available on my GitHub through an actor system project which I intend to use in production (i.e., it will be completed).

Greenlet Threads

Gevent utilizes green threads through the gevent.greenlet package. A greenlet, gevent's green thread, offers an API similar to the Python asyncio library but with more control over scheduling.

The greenlet is scheduled by the program instead of the OS and works more akin to threading in the JVM. Greenlet threads run code concurrently but not in parallel, although this could be achieved by starting greenlets in a new process.

In this manner, it is possible to schedule Greenlet threads on a new operating system based thread or, more likely due to the GIL, in a new process to achieve a similar level of concurrency and even parallelism to Java. Of course, the cost of parallelism must be considered.


Gevent maintains a server through gevent.server.StreamServer. This server utilizes greenlet threads to concurrently handle requests.

A StreamServer takes an address and port and can optionally be given a pool for controlling the number of connections created:

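# Pool comes from gevent.pool, StreamServer from gevent.server.
pool = Pool(MAX_THREADS)
# The host was lost in the source; self.host is a placeholder for it.
server = StreamServer((self.host, self.port), handle_connect, spawn=pool)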

This concurrency allows for faster communication in the bi-directional way that troubles asyncio.

Gracefully Exiting a StreamServer

In instances where we want to gracefully exit a server while still running code, it is possible to use the gevent.event.Event class to communicate between threads and force the server to stop.

As a preliminary note, it is necessary to call the gevent.monkey.patch_all method in the native thread to allow for cross-thread communications.

from gevent import monkey
monkey.patch_all()  # patch as early as possible, before other imports


Instead of using the serve_forever method, we must utilize the start and stop methods. This must also be accompanied by the use of the Event class:

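self.evt = Event()  # set from a signal or another thread to request shutdown
pool = Pool(MAX_THREADS)
# The host was lost in the source; self.host is a placeholder for it.
server = StreamServer((self.host, self.port), handle_connect, spawn=pool)
# Newer gevent releases name this function gevent.signal_handler.
gevent.signal(signal.SIGQUIT, self.evt.set)
gevent.signal(signal.SIGTERM, self.evt.set)
gevent.signal(signal.SIGINT, self.evt.set)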

For good measure, termination signals are handled through gevent.signal so that the server can close and the thread can exit when the user terminates the program. This hopefully leaves no loose ends.

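The event can be set externally to force the server to stop with the preset 10 second timeout. In practice this means starting the server with start, waiting on the event, and then calling stop with a timeout of 10 seconds.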



In this article, we examined how a greenlet thread or server can be controlled using a separate thread for graceful termination without exiting a running program. An Event was used to achieve our goal.