Author Topic: Server outage (Read 14110 times)

peterp · « **on:** September 10, 2013, 01:41:13 pm »

We had a problem with the server being down from 10.26 am this morning until 12.54 pm.

Sorry for any inconvenience but it wasn't within our power to get it back any sooner.

All is well now.

Regards Peter

cam · « **Reply #1 on:** September 10, 2013, 02:12:24 pm »

So that's what that button does

peterp · « **Reply #2 on:** September 10, 2013, 02:55:34 pm »

The button that you should have pushed is this one.

cam · « **Reply #3 on:** September 10, 2013, 02:58:48 pm »

Ohh, not this one then?

Macka · « **Reply #4 on:** September 10, 2013, 04:32:03 pm »

Phew, I thought it was me..

cam · « **Reply #5 on:** September 10, 2013, 04:34:29 pm »

Ahhh, a guilty conscience. Told ya someone would surface Peter P. What shall we do with him?

Macka · « **Reply #6 on:** September 10, 2013, 04:50:35 pm »

I have the recent post set as my homepage and every time I opened the system it was telling me the URL was unable to be read..

Who would have thought that a "Mustang Site" would break down..

peterp · « **Reply #7 on:** September 10, 2013, 04:54:43 pm »

Hi Macka,

It has come to light that the possible cause is that one of our forum members has been watching too many "I bought a Jeep" ads.

LOL. peterp

cam · « **Reply #8 on:** September 10, 2013, 05:17:38 pm »

Quote from: Macka on September 10, 2013, 04:50:35 pm

Who would have thought that a "Mustang Site" would break down..

I'm not touching that one with a barge pole..... for fear of death

Quote from: peterp on September 10, 2013, 04:54:43 pm

Hi Macka,

It has come to light that the possible cause is that one of our forum members has been watching too many "I bought a Jeep" ads.

LOL. peterp

Well that person needs help, Macca is there something you're not telling us

malscar · « **Reply #9 on:** September 10, 2013, 05:31:50 pm »

Quote from: CPU on September 10, 2013, 05:17:38 pm

Macca is there something you're not telling us

Are you sure it is a jeep?

http://youtu.be/QrYNlykirtk

Herman · « **Reply #10 on:** September 10, 2013, 05:36:10 pm »

I think you guys have been touching too many buttons

cam · « **Reply #11 on:** September 10, 2013, 05:37:34 pm »

Quote from: malscar on September 10, 2013, 05:31:50 pm

Are you sure it is a jeep?

Quote from: Herman on September 10, 2013, 05:36:10 pm

I think you guys have been touching too many buttons

peterp · « **Reply #12 on:** September 12, 2013, 08:02:06 am »

Another outage, see below for a report from the people that look after the servers.
Unfortunately when this happens it corrupts the forum database if someone is posting at the time.
So I have to manually do a database repair to correct it.

Incident 1
Date: Tuesday 10th September
Time: Morning
VPS Downtime: VPSs located on the Windows 2012 Cluster on SAN01 were offline for a period of 3-4 hours. During this time our mail server was also offline, causing difficulties in responding to and receiving tickets, which, coupled with a very heavy phone load, meant some users were unable to communicate with our staff.
Cause: The central licensing server for the SAN provided incorrect activation expiry details, which saw one of our SANs inexplicably fail to renew its license automatically. This took some time to sort out as it involved overseas contact. This issue has been resolved permanently and will not recur.
Future Preventative Action: Our immediate action involved ensuring with the SAN provider that this could not happen again and modifying our license/activation structure accordingly in conjunction with them.

Incident 2
Date: Wednesday 11th September
Time: 11:40pm-12:15am & 1:30am
VPS Downtime: VPSs located on the Windows 2012 Cluster (across all SANs) were offline for a short period of time ranging from 5 minutes to 35 minutes. All VPSs located on the Windows 2012 Cluster were restarted as a result.
Cause: The virtualization system began throwing errors on some nodes. This snowballed and caused issues for a number of VPSs that were in a hung state. We needed to restart the virtualization system to restore all services quickly.
Future Preventative Action: In response to the initial problem we installed the latest hotfix provided by Microsoft, which also caused a recurrence of the issue while the patches were being installed. The hotfix had been scheduled for installation later in September, and hadn't been applied earlier due to the relative stability of the virtualization system since the last issues caused by this problem. As a result of the recurrence we have installed the hotfix across all nodes, which according to Microsoft should prevent this problem recurring.

General Information - Virtualization
We have identified for some time that the virtualization system under 2012 has not been as stable as our 2008 cluster, and we believe we adopted the technology on a broad scale too early. MS has given assurances that as of the latest hotfix there are no continuing known issues that should cause the same ongoing problem. Even so, we are very soon shifting to a model offered by our Australian competitors for our standard VPS products. This will involve individual servers rather than clustered failover (as the issues with the virtualization system as well as other problems have all been related to either the SAN or the clustering system). These servers will also include 100% RAID 10 local SSD storage, so in effect will also offer faster disk access. We expect to be offering this within 30 days and the launch will coincide with a new look website. We will still be offering failover clustering as a separate product. Existing clients will not be moved unless they request a move after the new systems are online and our internal services integration is completed. We will be assessing our 2012 cluster now that the new hotfix has been installed to determine its stability before making any further decisions which would affect existing VPSs.

Regards peterp

cam · « **Reply #13 on:** September 12, 2013, 08:54:35 am »

"I bought a Jeep"

peterp · « **Reply #14 on:** September 12, 2013, 09:28:44 am »

You bought a Jeep !!

cam · « **Reply #15 on:** September 12, 2013, 09:30:33 am »

Well everything else is stuffed so may as well. Better than owning a BMW
