Server outage

Started by peterp, September 10, 2013, 01:41:13 PM


peterp

We had a problem with the server being down from 10.26 am this morning until 12.54 pm  :cry:

Sorry for any inconvenience but it wasn't within our power to get it back any sooner.

All is well now.  :thumb:

Regards   Peter
You only need two tools in life - WD-40 and Duct Tape. If it doesn't move and should, use the WD-40. If it shouldn't move and does, use the Duct Tape.

cam

So that's what that button does  :evilone:

peterp

The button that you should have pushed is this one.

cam

Ohh, not this one then?  :smile01:

Macka

Phew, I thought it was me.. 

cam

Ahhh, a guilty conscience.  Told ya someone would surface Peter P.  What shall we do with him?  :evilone:

Macka

I have the recent post set as my homepage and every time I opened the system it was telling me the URL was unable to be read.. 

Who would have thought that a "Mustang Site" would break down..    :bolt:

peterp

Hi Macka,

It has come to light that the possible cause is that one of our forum members has been watching tooo many "I bought a Jeep" ads.  :grin:

LOL. peterp

cam

Quote from: Macka on September 10, 2013, 04:50:35 PM
Who would have thought that a "Mustang Site" would break down..    :bolt:

I'm not touching that one with a barge pole..... for fear of death  :lol:


Quote from: peterp on September 10, 2013, 04:54:43 PM
Hi Macka,

It has come to light that the possible cause is that one of our forum members has been watching tooo many "I bought a Jeep" ads.  :grin:

LOL. peterp

Well that person needs help, Macca is there something you're not telling us  :grin:


malscar

Quote from: CPU on September 10, 2013, 05:17:38 PM
Macca is there something you're not telling us  :grin:

Are you sure it is a jeep?



http://youtu.be/QrYNlykirtk

Herman

I think you guys have been touching too many buttons  :lol:  :grin: :lmao:
Have now converted the other half into doing some of the Concours washin, cleaning & polishing stuff!!!!!!!!

cam

Quote from: malscar on September 10, 2013, 05:31:50 PM
Are you sure it is a jeep?



:lmao: :lmao: :lmao: :lmao:


Quote from: Herman on September 10, 2013, 05:36:10 PM
I think you guys have been touching too many buttons  :lol:  :grin: :lmao:

:grin: :grin:

peterp

Another outage; see below for a report from the people who look after the servers.
Unfortunately, when this happens it corrupts the forum database if someone is posting at the time,
so I have to do a manual database repair to correct it.

Incident 1
Date: Tuesday 10th September
Time: Morning
VPS Downtime: VPSs located on the Windows 2012 Cluster on SAN01 were offline for a period of 3-4 hours. During this time our mail server was also offline, causing difficulties in responding to and receiving tickets, which, coupled with a very heavy phone load, meant some users were unable to communicate with our staff.
Cause: The central licensing server for the SAN provided incorrect activation expiry details, which saw one of our SANs inexplicably fail to renew its license automatically. This took some time to sort out as it involved overseas contact. This issue has been resolved permanently and will not reoccur.
Future Preventative Action: Our immediate action involved ensuring with the SAN provider that this could not happen again and modifying our license/activation structure accordingly in conjunction with them.

Incident 2
Date: Wednesday 11th September
Time: 11:40pm-12:15am & 1:30am
VPS Downtime: VPSs located on the Windows 2012 Cluster (across all SANs) were offline for a short period of time ranging from 5 minutes to 35 minutes. All VPSs located on the Windows 2012 Cluster were restarted as a result.
Cause: The virtualization system began throwing errors on some nodes; this snowballed and caused issues for a number of VPSs that were in a hung state. We needed to restart the virtualization system to restore all services quickly.
Future Preventative Action: Because of the initial problem we installed the latest hotfix provided by Microsoft; installing the patches also caused a recurrence of the issue. The hotfix had been scheduled for installation later in September, and hadn't been done earlier due to the relative stability of the virtualization system since the last issues caused by this problem. As a result of the recurrence we have installed the hotfix across all nodes, which according to Microsoft should prevent this problem recurring.

General Information - Virtualization
We have recognised for some time that the virtualization system under 2012 has not been as stable as our 2008 cluster, and we believe we adopted the technology on a broad scale too early. MS has given assurances that as of the latest hotfix there are no continuing known issues that should cause the same ongoing problem. Even so, we are very soon shifting to a model offered by our Australian competitors for our standard VPS products. This will involve individual servers rather than clustered failover (as the issues with the virtualization system as well as other problems have all been related to either the SAN or the clustering system). These servers will also include 100% RAID 10 local SSD storage, so in effect will also offer faster disk access. We expect to be offering this within 30 days and the launch will coincide with a new look website. We will still be offering failover clustering as a separate product. Existing clients will not be moved unless they request a move after the new systems are online and our internal services integration is completed. We will be assessing our 2012 cluster now that the new hotfix has been installed to determine its stability before making any further decisions which would affect existing VPSs.

Regards  peterp

cam


peterp


cam

Well everything else is stuffed so may as well.  Better than owning a BMW  :evilone: