Recent Less Wrong downtime

by drpowell, 18th Jul 2011

Many users may have noticed that Less Wrong experienced about 6 hours of downtime on 16 July 2011.

CAUSE: The server came under an unusually heavy load and started up a new instance to load-balance the traffic. Unfortunately, a bug in the script that starts the new instance caused it to run an inconsistent mix of old and new code. The symptom seen by users was that any post with comments was inaccessible.
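For readers curious how a mixed-version instance might be caught before it serves traffic, here is a minimal sketch. It is not the fix that was actually deployed. It assumes a hypothetical release manifest that maps each file path to the SHA-256 digest of its expected contents, and the names `file_digest` and `find_mismatches` are made up for illustration.

```python
import hashlib
from typing import Dict, List


def file_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def find_mismatches(manifest: Dict[str, str], deployed: Dict[str, bytes]) -> List[str]:
    """Return the paths that are missing or differ from the release manifest.

    manifest: hypothetical mapping of path -> expected digest for one release.
    deployed: mapping of path -> file contents found on the new instance.
    """
    mismatched = []
    for path, expected in sorted(manifest.items()):
        content = deployed.get(path)
        if content is None or file_digest(content) != expected:
            mismatched.append(path)
    return mismatched


# Example: one file on the instance is still from the old release.
manifest = {
    "app/comments.py": file_digest(b"new comments code"),
    "app/posts.py": file_digest(b"new posts code"),
}
deployed = {
    "app/comments.py": b"old comments code",
    "app/posts.py": b"new posts code",
}
stale = find_mismatches(manifest, deployed)  # paths that should block startup
```

A start-up script could refuse to put the instance into rotation whenever the returned list is non-empty.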

RESPONSE: A hotfix was deployed as soon as the problem was detected. Unfortunately, it was a Saturday, so the response time was slower than we would like. We have since implemented a proper fix for the particular bug that caused this problem. We are also creating some extra monitoring probes so we'll be notified promptly of any similar problems in the future.
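As a rough illustration of what such a probe could look like, the sketch below requests a list of pages and reports any that do not return HTTP 200. It is an assumption-laden example, not the monitoring we are actually building: the function names are hypothetical, and the URLs to check (for instance, a post known to have comments) would have to be supplied by whoever runs it.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List, Tuple


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url, or 0 if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500.
        return err.code
    except OSError:
        # Covers URLError, connection failures, and timeouts.
        return 0


def failing_pages(
    urls: Iterable[str],
    fetch: Callable[[str], int] = http_status,
) -> List[Tuple[str, int]]:
    """Return (url, status) for every page that did not answer with 200."""
    failures = []
    for url in urls:
        status = fetch(url)
        if status != 200:
            failures.append((url, status))
    return failures
```

Run on a schedule, a non-empty result from `failing_pages` would be the trigger for sending a notification. The `fetch` parameter exists so the check can be exercised without network access.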

Apologies for the inconvenience.

3 comments

Thanks for writing this up!

Thanks for the response!

Thanks for letting us know :)