Office IT System Meltdown - The W1nners' Club

 

I witnessed this astounding IT meltdown around 2004 in a large academic organization.

An employee decided to send a broad solicitation about her need for a local apartment. She happened to discover and decided to use an all-employees@org.edu type of email address that included everyone. By “everyone,” I mean every employee in a 30,000-employee academic institution. Everyone from the CEO down received this lady’s apartment inquiry.

Of course, this kicked off the usual round of “why am I getting this” and “take me offa list” and “omg everyone stop replying” responses, each of which was itself a reply-all to all-employees@org.edu and so produced another 30,000 messages. The email system started to bog down as half a million messages flooded all the mailboxes.

IT Fail #1: The mistake wasn’t creating an all-employees@org.edu address, which is quite reasonable. It was granting unrestricted access to it, rather than configuring the mail server to check the sender and generate one “not the CEO = not authorized” reply.
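As a rough illustration of that sender check, here is a minimal Python sketch. The story does not say which mail server the organization ran, so everything here is an assumption made for illustration: the allowlist contents, the `Decision` type, and the `check_list_post` function are hypothetical names, not any real server's configuration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

ALL_EMPLOYEES = "all-employees@org.edu"

# Hypothetical allowlist: the only senders permitted to post to the list.
AUTHORIZED_SENDERS = {"ceo@org.edu", "it-announcements@org.edu"}


@dataclass
class Decision:
    deliver: bool                    # fan the message out to recipients?
    bounce_to: Optional[str] = None  # who gets the single rejection notice
    reason: str = ""


def check_list_post(sender: str, recipients: Iterable[str]) -> Decision:
    """Decide whether a message may be delivered to its recipients."""
    addressed_to_list = any(
        r.strip().lower() == ALL_EMPLOYEES for r in recipients
    )
    if not addressed_to_list:
        return Decision(deliver=True)  # ordinary mail is unaffected
    if sender.strip().lower() in AUTHORIZED_SENDERS:
        return Decision(deliver=True)
    # Unauthorized: one rejection to the sender instead of 30,000 copies.
    return Decision(
        deliver=False,
        bounce_to=sender,
        reason="not authorized to post to " + ALL_EMPLOYEES,
    )
```

With a check like this, the apartment inquiry would have produced exactly one message: the rejection back to its sender.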

That, however, wasn’t necessarily the real problem, as the incident might have simmered down once people stopped responding.

In a 30k organization, lots of people go on vacation, and some of them (let’s say 20 for argument’s sake) remembered to set their email to auto-respond about their absence. As a result, the auto-responders responded to the same recipients, including all-employees@org.edu. So, every “I don’t care about your apartment” message didn’t just generate 30,000 copies of itself, it also generated 30,000 x 20 = 600,000 new messages. Even the avalanche of apartment messages became drowned out by the volume of “I’ll be gone ’til November” auto-replies.
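The arithmetic for a single reply-all can be spelled out in a few lines of Python. The figures are the story's own (30,000 employees, and 20 auto-responders "for argument's sake"); the function name is made up for this sketch, and it counts only the first round of auto-replies.

```python
EMPLOYEES = 30_000      # mailboxes behind all-employees@org.edu
AUTO_RESPONDERS = 20    # the story's "for argument's sake" figure


def deliveries_per_reply_all(employees: int, auto_responders: int):
    """Count mailbox deliveries caused by one reply-all to the list."""
    # The reply-all itself lands in every mailbox.
    copies = employees
    # Each auto-responder answers it, also to the whole list.
    auto_replies = auto_responders * employees
    return copies, auto_replies


copies, auto_replies = deliveries_per_reply_all(EMPLOYEES, AUTO_RESPONDERS)
print(copies, auto_replies, copies + auto_replies)
```

So each complaint costs 30,000 deliveries on its own and another 600,000 in away messages, for 630,000 in total.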

That also wasn’t the real problem; it, too, might have died down all by itself.

The REAL problem was that the mail servers were quite diligent. The auto-responders didn’t just send one “I’m away” message: they sent an “I’m away” message in response to every incoming message… including the “I’m away” messages of the other auto-responders.
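For contrast, here is a minimal Python sketch of the guards an auto-responder can apply to avoid this kind of loop. It follows the conventions described in RFC 3834 (mark automatic replies with an `Auto-Submitted` header, never answer a message that carries one, and reply only to the sender), plus a once-per-sender rule. It is not a reconstruction of what the servers in the story did, and the function names and the in-memory `already_notified` set are assumptions made for illustration.

```python
from email.message import EmailMessage


def should_auto_reply(msg: EmailMessage, already_notified: set) -> bool:
    """Return True only if an away message is safe to send for msg."""
    # Never answer automatic mail, including other people's away messages.
    auto_submitted = (msg.get("Auto-Submitted") or "no").strip().lower()
    if auto_submitted != "no":
        return False
    # Skip mailing-list and bulk traffic.
    precedence = (msg.get("Precedence") or "").strip().lower()
    if precedence in {"bulk", "list", "junk"}:
        return False
    # Tell each sender at most once, not once per incoming message.
    sender = (msg.get("From") or "").strip().lower()
    if not sender or sender in already_notified:
        return False
    already_notified.add(sender)
    return True


def build_auto_reply(msg: EmailMessage, me: str, text: str) -> EmailMessage:
    """Build an away message addressed to the original sender only."""
    reply = EmailMessage()
    reply["From"] = me
    reply["To"] = str(msg["From"])  # the sender, not the whole list
    reply["Subject"] = "Auto: " + str(msg.get("Subject") or "")
    reply["Auto-Submitted"] = "auto-replied"  # lets other responders skip it
    reply.set_content(text)
    return reply
```

Any one of the three guards would have broken the chain here: the away messages would not have answered each other, would not have gone back to the list, and would not have repeated for every new message from the same sender.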

The auto-response avalanche converted the entire mail system into an Agent-Smith-like replication factory of away messages, as auto-responders incessantly informed not just every employee, but also each other, about employee status.

The email systems melted down. Everything went offline and a 30k-wide enterprise suddenly had no email for about 24 hours.

That, however, is not the end of the story.

The IT staff busied themselves with mucking these millions of messages out of the mailboxes and deactivating the auto-responders. They brought the email system back online, and their first order of business was to send out an email explaining the cause of the problem, etc. And they addressed the notification email to all-employees@org.edu.

IT Fail #2: Before they sent their email message, they had disabled most of the auto-responders – but they missed at least one.

More specifically: they missed at least two.

 

(Story by: sfsdfd Source: Reddit)

 
