FastAPI vs. Fastify vs. Spring Boot vs. Gin Benchmark

01.10.2022

In a previous article, I benchmarked FastAPI, Express.js, Flask, and Nest.js to verify FastAPI’s claim of being on par with Node.js. In this article, I pit the champion, FastAPI, against a new set of faster competitors. For each framework, I created an API endpoint that returns 100 rows of data from a PostgreSQL database as JSON.

The code for this benchmark can be found here:

https://github.com/travisluong/python-vs-nodejs-benchmark

Disclaimer: I am not a benchmarking expert. This was simply a random experiment I did out of curiosity. This is for entertainment purposes only.

Here are the results:

FastAPI + asyncpg + orjson + gunicorn

Running 10s test @ http://localhost:8000/orjson
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.29ms    0.93ms   10.28ms   55.43%
    Req/Sec     2.19k   568.66     3.25k    60.50%
  43575 requests in 10.01s, 333.28MB read
Requests/sec:   4355.30
Transfer/sec:     33.31MB

Fastify + pg

Running 10s test @ http://localhost:3000
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.62ms    1.49ms   20.94ms   87.98%
    Req/Sec     1.10k   165.28     1.31k    76.50%
  21860 requests in 10.01s, 172.30MB read
Requests/sec:   2184.83
Transfer/sec:     17.22MB

Spring Boot + jdbc

Running 10s test @ http://localhost:8080
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.37ms    1.95ms   73.33ms   98.92%
    Req/Sec     3.98k   361.02     5.78k    76.12%
  79653 requests in 10.10s, 609.25MB read
Requests/sec:   7886.63
Transfer/sec:     60.32MB

Spring Boot + JPA

Running 10s test @ http://localhost:8080/posts
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    15.52ms   17.42ms  134.96ms   90.44%
    Req/Sec   424.58    117.08    737.00     75.50%
  8473 requests in 10.03s, 55.25MB read
Requests/sec:    844.82
Transfer/sec:      5.51MB

Gin + database/sql + lib/pq

Running 10s test @ http://localhost:8080/loadtest
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.31ms    5.76ms   33.29ms   80.44%
    Req/Sec     1.49k   209.14     2.00k    68.50%
  29687 requests in 10.01s, 182.53MB read
Requests/sec:   2966.86
Transfer/sec:     18.24MB

The Rankings

These include benchmarks from part 1 of this article.

  1. Spring Boot + jdbc (7886 req/sec)
  2. Gin + pgx (7517 req/sec)
  3. Gin + pg + SetMaxOpenConns + SetMaxIdleConns (7388 req/sec)
  4. FastAPI + asyncpg + ujson + gunicorn 8w (4831 req/sec)
  5. Fastify + pg + cluster mode 8w (without logging) (4622 req/sec)
  6. FastAPI + asyncpg + ujson + gunicorn 4w (4401 req/sec)
  7. FastAPI + asyncpg + gunicorn 4w + orjson (4193 req/sec)
  8. Express.js + pg + cluster mode 8w (4145 req/sec)
  9. Fastify + pg + cluster mode 8w (3417 req/sec)
  10. Gin + database/sql + lib/pq (2966 req/sec)
  11. Fastify + pg (without logging) (2750 req/sec)
  12. Fastify + pg (2184 req/sec)
  13. Express.js + pg (1931 req/sec)
  14. FastAPI + asyncpg + uvicorn + orjson (1885 req/sec)
  15. FastAPI + asyncpg + uvicorn + ujson (1711 req/sec)
  16. Flask + psycopg2 + gunicorn 4w (1478 req/sec)
  17. Nest.js + Prisma (1184 req/sec)
  18. FastAPI + psycopg2 + gunicorn 4w (989 req/sec)
  19. FastAPI + asyncpg + gunicorn 4w (952 req/sec)
  20. Spring Boot + JPA (844 req/sec)
  21. FastAPI + psycopg2 + uvicorn + orjson (827 req/sec)
  22. Flask + psycopg2 + flask run (705 req/sec)
  23. FastAPI + SQLModel + gunicorn 4w (569 req/sec)
  24. Flask + psycopg2 + gunicorn 1w (536 req/sec)
  25. FastAPI + asyncpg + uvicorn (314 req/sec)
  26. FastAPI + psycopg2 + uvicorn (308 req/sec)
  27. FastAPI + databases + uvicorn (267 req/sec)
  28. FastAPI + SQLModel + uvicorn (182 req/sec)

Update 1/18/22

I realized there is another huge flaw in my benchmark: I was running Express.js and Fastify with a single process but FastAPI with 4 workers. Obviously, this isn’t a fair comparison, so I reran the tests with 8 workers each to fully utilize the CPU on my quad-core MacBook Pro.

If there’s a flaw in the code or benchmark, please suggest improvements in the comments.

Express.js + pg + cluster mode 8 workers

Running 10s test @ http://localhost:3000
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    16.52ms   30.81ms  187.78ms   85.78%
    Req/Sec     2.09k   569.71     4.07k    70.15%
  41873 requests in 10.10s, 332.72MB read
Requests/sec:   4145.64
Transfer/sec:     32.94MB

Fastify + pg + cluster mode 8 workers

Running 10s test @ http://localhost:3000
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    17.62ms   33.50ms  253.90ms   86.50%
    Req/Sec     1.72k   425.97     2.97k    70.50%
  34186 requests in 10.00s, 269.46MB read
Requests/sec:   3417.27
Transfer/sec:     26.94MB

FastAPI + asyncpg + gunicorn 8 workers + ujson

Running 10s test @ http://localhost:8000/ujson
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.07ms  542.26us    9.73ms   73.07%
    Req/Sec     2.43k   278.47     2.99k    78.00%
  48324 requests in 10.00s, 369.60MB read
Requests/sec:   4831.50
Transfer/sec:     36.95MB

Update 2/6/22

Fastify + pg (without logging)

Running 10s test @ http://localhost:3000
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.63ms    1.08ms  17.63ms   86.29%
    Req/Sec     1.39k   181.96     2.58k    82.09%
  27788 requests in 10.10s, 219.03MB read
Requests/sec:   2750.63
Transfer/sec:     21.68MB

Fastify + pg + cluster mode 8 workers (without logging)

Running 10s test @ http://localhost:3000
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    15.87ms   30.37ms 170.58ms   86.04%
    Req/Sec     2.32k   833.78     5.10k    72.50%
  46246 requests in 10.00s, 364.52MB read
Requests/sec:   4622.35
Transfer/sec:     36.43MB

Update 2/8/22

Gin + pgx

Running 10s test @ http://localhost:8080/loadtest
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.47ms    1.70ms  34.71ms   97.38%
    Req/Sec     3.78k   817.41     5.00k    74.75%
  75941 requests in 10.10s, 466.91MB read
Requests/sec:   7517.37
Transfer/sec:     46.22MB

Update 2/20/22

Thanks to user abenz for pointing out another flaw in the Go benchmark. I have updated the results.

Gin + pg

With pg.SetMaxOpenConns(10000) and pg.SetMaxIdleConns(5000)

Running 10s test @ http://localhost:8080/loadtest
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.36ms  586.06us  21.15ms   83.11%
    Req/Sec     3.71k   451.06     4.70k    72.28%
  74626 requests in 10.10s, 458.83MB read
Requests/sec:   7388.91
Transfer/sec:     45.43MB

Conclusion

I initially did these benchmarks to verify FastAPI’s claim of being on par with Node.js. Along the way, I also wanted to figure out why I wasn’t getting great performance out of FastAPI in a typical scenario involving a database request. Here are a few things I learned:

  1. FastAPI is not fast out of the box. You have to use the right database driver, such as asyncpg, to fully take advantage of FastAPI’s speed.
  2. Even with asyncpg, you still have to use a faster JSON library with FastAPI to push performance up to Node.js levels.
  3. Going from raw SQL queries straight to JSON is significantly faster than using an ORM, which makes sense, as you are skipping the object-mapping step.
  4. I’ve always heard that compiled languages were faster than interpreted languages, although I had never verified it for myself. Java and Go were indeed faster than comparable setups in interpreted languages.
  5. Node.js has a cluster module that lets you launch a cluster of Node.js processes to take advantage of multi-core systems.
  6. Logging affects performance.

A “maxed-out” FastAPI configuration vs. a “maxed-out” Express.js configuration seems to produce similar results. I’ve included a link to the code above. Let me know if there’s anything that can improve this benchmark.

Thanks for reading.