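When I first joined the company, the deployment process our team uses struck me as novel, so I studied it closely. Later, while working on a work-standardization task force, I even had to explain it to others, so I wrote up these notes.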
Blue-green deployment is a technique for deploying gracefully: it reduces service downtime, and ideally (almost) no errors occur while you deploy a new version. It is accomplished by running two identical production environments, called blue and green.
This can be done at the hardware level, the VM level, or the container level, but here I want to talk about blue-green deployment at the process level.
Blue-green deployment with Nginx is a common setup. Suppose we have a process in service. We call it blue, and it is bound to port 10000. Nginx has an upstream pointing to blue, so all requests to this server are delivered by Nginx to blue on port 10000.
Now let's run another process with the updated version on port 20000 and call it green. If you change the upstream to green in the Nginx configuration and reload it, Nginx will deliver all requests to green, so we can keep serving incoming requests in real time while changing versions.
However, a problem comes up when the server uses the keep-alive option. If the option is on, some clients keep their connections and continue to use them even after Nginx has changed the upstream. This means clients with old connections keep sending requests to the old socket bound to port 10000, but Nginx no longer has an upstream like that, so errors occur. I can't call this graceful.
We can achieve blue-green deployment at the process level in another way. The SO_REUSEPORT option makes it possible. This socket option allows multiple sockets to listen on the same IP address and port combination. The kernel then load-balances incoming connections across the sockets.
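To make the socket option concrete, here is a minimal Python sketch, assuming a platform where the `socket` module exposes `SO_REUSEPORT` (Linux, for example). The helper name `open_reuseport_listener` and the use of port 0 (letting the kernel pick a free port) are my own choices for illustration, not part of our actual deployment code.

```python
import socket


def open_reuseport_listener(host: str, port: int, backlog: int = 128) -> socket.socket:
    """Open a TCP listening socket with SO_REUSEPORT enabled.

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option must be set before bind(), and on every socket that shares the port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(backlog)
    return sock


if __name__ == "__main__":
    # "blue" takes a free port chosen by the kernel (port 0).
    blue = open_reuseport_listener("127.0.0.1", 0)
    port = blue.getsockname()[1]
    # "green" binds the very same address and port; without SO_REUSEPORT
    # this second bind() would fail with "Address already in use".
    green = open_reuseport_listener("127.0.0.1", port)
    print("both listeners share port", port)
    blue.close()
    green.close()
```

Here both sockets live in one process just to keep the sketch short. In a real deployment they would belong to two separate processes, the old version and the new one.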
To achieve graceful deployment using SO_REUSEPORT, we have to ensure one more thing: the workers must be separated into a connection listener and request handlers. This is supported by many WAS (web application server) frameworks, such as Undertow, Tomcat, Grizzly, and Jetty.
In the normal state, a single process, again called blue, serves all incoming connections and requests.
When we need to deploy a new version, we run a new process, called green, on the same port using the SO_REUSEPORT option. New incoming connections are then distributed across both processes by the kernel. In this situation, if we just stop blue's connection listener, all new connections will be established to green. But blue's workers are still alive with their sockets bound to the shared port (8000 here), so they can keep serving requests over the old connections without any errors. We can call this step the 'switch'.
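The switch can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: blue and green are plain sockets inside one process rather than two server processes, the port is picked by the kernel instead of being fixed, and the names `listen_reuseport` and `demo_switch` are hypothetical. It shows that closing blue's listening socket leaves blue's already accepted connection usable, while new connections land on green.

```python
import socket

HOST = "127.0.0.1"


def listen_reuseport(port: int) -> socket.socket:
    """Hypothetical helper: a listening socket that shares its port via SO_REUSEPORT."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind((HOST, port))
    s.listen(16)
    return s


def demo_switch() -> tuple:
    # Normal state: only blue listens, and a keep-alive client is connected to it.
    blue_listener = listen_reuseport(0)
    port = blue_listener.getsockname()[1]
    old_client = socket.create_connection((HOST, port))
    blue_conn, _ = blue_listener.accept()

    # Deploy: green starts listening on the same port.
    green_listener = listen_reuseport(port)

    # The switch: stop only blue's connection listener.
    blue_listener.close()

    # A new connection can now only be accepted by green.
    new_client = socket.create_connection((HOST, port))
    green_conn, _ = green_listener.accept()

    # The old connection is still served by blue.
    old_client.sendall(b"ping")
    blue_conn.recv(4)
    blue_conn.sendall(b"blue")
    old_reply = old_client.recv(4)

    # The new connection is served by green.
    new_client.sendall(b"ping")
    green_conn.recv(4)
    green_conn.sendall(b"green")
    new_reply = new_client.recv(5)

    for s in (old_client, blue_conn, new_client, green_conn, green_listener):
        s.close()
    return old_reply, new_reply


if __name__ == "__main__":
    print(demo_switch())
```

The sketch accepts the old connection before closing blue's listener on purpose. Connections that are still waiting in a listener's accept queue when it is closed are not handed over to the other listener, so a real server should drain that queue first.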
HTTP keep-alive usually has a timeout, which is 1 to 2 minutes by default in most libraries and frameworks. Within a few minutes, all of the old connections to blue will be closed and clients will establish new connections to green. Then we can stop the whole blue process. The deployment is perfectly graceful and takes only a few minutes.
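The final waiting step could look roughly like the following sketch. It assumes the blue process can report how many connections it still holds; the `active_connections` callback, the function name `wait_for_drain`, and the 180 second default are all hypothetical values chosen for illustration, not something a particular framework provides.

```python
import time
from typing import Callable


def wait_for_drain(
    active_connections: Callable[[], int],
    timeout_s: float = 180.0,
    poll_s: float = 1.0,
) -> bool:
    """Poll until blue has no open connections left or the timeout expires.

    Returns True if blue drained completely, False if connections remain.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if active_connections() == 0:
            return True
        time.sleep(poll_s)
    # One last check after the deadline has passed.
    return active_connections() == 0


if __name__ == "__main__":
    # Simulated connection counts: keep-alive timeouts close them one by one.
    remaining = iter([3, 2, 1, 0])
    drained = wait_for_drain(lambda: next(remaining), timeout_s=5.0, poll_s=0.01)
    print("safe to stop blue:", drained)
```

If the function returns False, some clients are still holding connections past the expected keep-alive timeout, and it is a judgment call whether to wait longer or stop blue anyway.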