Problem introduction

I was working on a project where I had to implement a maintenance mode. I did this with a feature flag, maintenance_mode=enabled, passed via a config file. When enabled, the flag redirects requests to the maintenance page. While the application is in maintenance mode, it displays information like “we are currently in maintenance mode, blah blah” and returns some status code.
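As a rough illustration of the flag check, here is a minimal Python sketch. It assumes the config file is a plain list of key=value lines; the function name is_maintenance_enabled and the "#" comment syntax are hypothetical and not taken from the actual project.

```python
def is_maintenance_enabled(config_text: str) -> bool:
    """Return True if the config text contains maintenance_mode=enabled."""
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments (the "#" comment syntax is an assumption).
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "maintenance_mode":
            return value.strip().lower() == "enabled"
    # Flag absent: treat the application as not being in maintenance.
    return False
```

The request handler would call this once at startup (or on config reload) and branch on the result.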

The dilemma is: what status code should we return here?

After going down various rabbit holes, reading a lot of expert opinions and going through almost all the HTTP status codes defined in the MDN docs, these were the options I shortlisted:

  • 503 status code

  • 200 status code

  • 404 status code

  • 302 status code

If I use 503, monitoring systems will start throwing alerts; if I use 200 or 404, it does not make 100% sense; and if I use 302, Google's bots might get unhappy and folks who idolise standards might get pissed.

Why, you ask?

Let me explain.

503 status code

503 literally means Service Unavailable, and any status code in the 5XX range is considered a server-side error. According to MDN, 503 Service Unavailable is a server error response which indicates that the server is not ready to handle the request.

But there is a caveat to using 503. If you have put an application into maintenance mode or used Cloudflare's maintenance page, you may have seen that it returns a 503 status code while the application is down. And as you may have noticed in the definition from MDN, 503 is a server error, so any application monitoring tool watching that endpoint or site will start triggering alerts even though we know that nothing was wrong and the action was intentional. Also, maintenance mode is not an error; it’s a feature that was enabled, not a bug.

200 status code

200 means that the request has succeeded. If we think about it differently, our request actually did succeed during maintenance mode: we sent a request and the server returned a resource.

So should we use 200 then?

This is debatable: while the request succeeded, it did not return the intended resource.

Now let’s move on to the next option.

404 status code

404 means that the server cannot find the requested resource, which in our case is also true: we requested one resource, but the server gave us a different one, the maintenance page.

So should we use 404?

Let’s move on to our final option.

302 status code

302 means the requested resource has been temporarily moved to another location.

What if you redirected users to a page containing the maintenance status?

What does Google think?

Google's guidance suggests using a 503 status code along with a Retry-After header, because the response during maintenance affects the website's SEO. It says that if this is not done, the website's SEO is impacted negatively.

So should we use 503 then?
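To make the 503 plus Retry-After option concrete, here is a minimal sketch written as a plain WSGI application using only the Python standard library. The names MAINTENANCE_MODE, RETRY_AFTER_SECONDS and maintenance_app are assumptions made for illustration, and the one-hour retry value is arbitrary.

```python
MAINTENANCE_MODE = True      # hypothetical flag, normally read from the config file
RETRY_AFTER_SECONDS = 3600   # assumed length of the maintenance window, in seconds


def maintenance_app(environ, start_response):
    """WSGI app that answers 503 with a Retry-After header during maintenance."""
    if MAINTENANCE_MODE:
        body = b"We are currently down for maintenance. Please try again later."
        headers = [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
            # Tells clients and crawlers how many seconds to wait before retrying.
            ("Retry-After", str(RETRY_AFTER_SECONDS)),
        ]
        start_response("503 Service Unavailable", headers)
        return [body]

    body = b"Hello from the application."
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the app can be served with wsgiref.simple_server.make_server("", 8000, maintenance_app).serve_forever() from the standard library.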

Conclusion

For me, 503 is an error and I will treat it just like any other server-side 5XX error. After reading about it and going down several rabbit holes, here is what I would do as an SRE:

  • If Google SEO is your top concern, go ahead and return 503, but also make sure there is an automated system to silence the alerts in case your APM triggers alerts based on the error code.

    Note: Silencing alerts is a bad practice, because if a real incident occurs you will be in trouble. Only do this if you are 100% sure of what you are doing.

    Let’s take a moment to look at this from the customer's perspective: almost all customers do not care whether the status code is 503, 200 or 302. They just want to know what happened and when the service will come back.

    The application would also have to set a custom header like X-Maintenance: True, and monitoring solutions would have to be configured to alert only when they see a 503 status code without our custom maintenance header. Also, I would add Retry-After to keep Google's crawlers happy.

  • If I were to use 302, I would redirect requests to another page that hosts the maintenance information and be done with it. This way the monitoring solution is happy, and so are the customers.
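The alerting rule from the first option (alert on a 503 only when the maintenance header is missing) could be sketched roughly like this. Real monitoring tools express such rules in their own configuration language, so the function should_alert below is a hypothetical stand-in for that logic, not any tool's API.

```python
def should_alert(status_code: int, headers: dict) -> bool:
    """Return True if a response should trigger a server-error alert."""
    # HTTP header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}

    is_server_error = 500 <= status_code <= 599
    # A 503 carrying "X-Maintenance: True" is a planned maintenance response.
    is_planned_maintenance = (
        status_code == 503
        and normalised.get("x-maintenance", "").lower() == "true"
    )
    return is_server_error and not is_planned_maintenance
```

Only the 503 is exempted here: a 500 that happens to carry the maintenance header still alerts, which keeps the silencing as narrow as possible.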

Final opinion

Now, I am not saying that either of the methods mentioned above is right or wrong. It’s up to the reader and the company to decide which process to follow, but as an SRE, the second one makes much more sense to me than the first, as Google SEO is a black box. I have seen a lot of genuine sites that are on page 2 or 3 of Google, whereas crappy sites with a dedicated SEO engineer are in the top 10.

This method also makes much more sense because, when the application sits behind a reverse proxy like Cloudflare's proxy or even a load balancer, the reverse proxy responds with a 503 status code if its backend is down or fails to respond to a liveness probe.

Using 302 makes sense here because you would then know when the service is actually unavailable: your reverse proxy would send a 503, which gives a clear signal that something is wrong and avoids confusion between a 503 from a maintenance window and a 503 from a server error.

Also, Google is not too strict about this stuff. As long as your website does not return a 410 Gone status, it will not drop your website from the index, and using either of them should be fine.
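A minimal sketch of the 302 approach, again as a plain WSGI application: while the flag is on, every request is redirected to a maintenance page, which itself answers 200, so any 503 seen by monitoring can only come from the proxy or a real failure. MAINTENANCE_MODE, MAINTENANCE_PATH and redirect_app are hypothetical names chosen for illustration.

```python
MAINTENANCE_MODE = True             # hypothetical flag, normally read from the config file
MAINTENANCE_PATH = "/maintenance"   # hypothetical location of the maintenance page


def _respond(start_response, status, body, extra_headers=()):
    """Send a plain-text response and return the WSGI body iterable."""
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    headers.extend(extra_headers)
    start_response(status, headers)
    return [body]


def redirect_app(environ, start_response):
    """WSGI app that redirects all traffic to the maintenance page with a 302."""
    path = environ.get("PATH_INFO", "/")
    if MAINTENANCE_MODE:
        if path == MAINTENANCE_PATH:
            # The maintenance page itself is a normal, successful response.
            return _respond(start_response, "200 OK",
                            b"We are currently in maintenance mode.")
        # Everything else is temporarily redirected to the maintenance page.
        return _respond(start_response, "302 Found", b"",
                        [("Location", MAINTENANCE_PATH)])
    return _respond(start_response, "200 OK", b"Hello from the application.")
```

The check for MAINTENANCE_PATH comes first so that the maintenance page does not redirect to itself.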