XMission Outage 11/11
Hello Everyone,
XMission extends a heartfelt apology to our customers for any inconvenience caused by yesterday’s outage. We are truly sorry, and appreciate your patience.
This is a courtesy post for anyone who is not already on the XMission Announcement email list.
XMission Outage
XMission experienced a serious outage while we were performing standard UPS maintenance today. The outage affected all services and started at approximately 2:00 p.m. on Tuesday, November 11th. Network services were partially restored for many customers by about 2:30 p.m., but other services required far more attention and took much longer.
Details
About 40% of our data center, including our server room, suffered a power outage when a technician flipped a mislabeled breaker during standard maintenance on one of our three UPS units. Although the power outage was momentary, servers and routers often respond very poorly to losing power and sometimes take extensive work to come back up. Unfortunately, such was the case today with many systems.
Seriously Affected Systems
- An important router, which some connections and servers rely on, required extensive attention from our network administrators.
- DNS (Domain Name System) was intermittent for some customers for over an hour.
- Email services were down for over 5 hours.
- Web hosting suffered the longest outage because our NetApp storage appliance, which houses all customer files and web sites, lost multiple hard drives. As a result, we are currently restoring files to our new NetApp 2020 from our November 9th backup, which will take many more hours to complete. We recently purchased this new NetApp and were merely days away from getting it online.
Conclusion
Today’s outage was exacerbated by multiple systems responding poorly to losing power. In spite of the holiday, our systems administrators were on site within minutes and continue to work tirelessly to restore all services. In the end, we should have performed this maintenance on a day when our systems administrators were on site because problems can arise no matter how carefully you proceed.

I was surprised at the bad service provided by XMission in this outage. XMission is usually excellent. When I called XMission to find out what was going on, I was greeted by an automated system that directed me to leave a message for the technician. But the mailbox wouldn’t accept any more messages.
XMission’s blog was itself offline. XMission’s chat was offline. There was no way to receive news from XMission.
Our DNS records, hosted with XMission, disappeared from the internet, making our sites apparently unavailable to the world. I was surprised that XMission didn’t have DNS replicated anywhere else in the world.
My suggestions:
1) Create a means of communicating with customers that sits outside of XMission’s systems, in case a catastrophic event takes them down.
2) Replicate DNS somewhere else in the world.
Having said that, it was the first time in 10 years that I’ve had any problems with XMission.