Best of Breed solutions for disaster recovery and business continuity have four key components:
- High Availability – Best of breed requires services that have high availability. A service may go down, but it will recover quickly enough that employees and clients are not significantly impacted, if they notice at all. High availability helps mask or minimize the effects of a failure and makes it less of an issue for those who consume that IT service.
- Fault Tolerance – Best of breed requires that a system have high fault tolerance, meaning the entire system will not fail even if a critical IT service component is compromised. This is achieved through redundant hardware and software, so that a secondary system immediately picks up the slack.
- Continuous Operation – Planned downtime is a common occurrence for many if not all IT services. Recent innovations in virtualization allow IT teams to perform maintenance on systems without downtime. IT Services that meet this requirement are considered to be in continuous operation.
- Continuous Availability – This is the ultimate goal of DR and BC teams. This means that the service achieves 100% availability by avoiding both planned and unplanned downtime. This is achieved through a combination of Disaster Recovery and business continuity solutions such as those defined in Janco’s Disaster Recovery and Business Continuity Template.
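The four levels above differ mainly in how much planned and unplanned downtime they tolerate. As a rough illustration, the sketch below computes availability as uptime divided by total time for a period. The function name and the sample downtime figures are assumptions made for this example; they are not taken from Janco's template.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def availability_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Return availability for a period as a percentage (uptime / total time)."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime_minutes must be between 0 and total_minutes")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


if __name__ == "__main__":
    # Hypothetical figures: 4 hours of planned maintenance and
    # 1 hour of unplanned outage over one year.
    planned, unplanned = 240, 60
    print(f"{availability_percent(MINUTES_PER_YEAR, planned + unplanned):.3f}%")
    # Continuous availability means both numbers are zero.
    print(f"{availability_percent(MINUTES_PER_YEAR, 0):.3f}%")
```

Counting planned and unplanned downtime together is what separates continuous availability from high availability, which is usually measured against unplanned outages only.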
Disaster Recovery: What are the major misconceptions when a disaster occurs with IT systems? Can your systems support your company’s day-to-day operations?
The major misconception is that a backup recovery plan is all that you need. At Janco Associates, we have found that it is not enough and that most companies are really not prepared. Files can be restored, but that does no good if the company does not have facilities for its staff.
Victor Janulaitis, the CEO of Janco Associates, was responsible for the creation of the Disaster Recovery Plan that Merrill Lynch implemented on 9/11. Merrill Lynch lost less than one minute of transactions with the plan that was created under his direction.
Disaster Recovery Audit
A core process that he identified was a Disaster Plan Audit. This Disaster Recovery / Business Continuity Audit program identifies control objectives that are met by the audit program. The 13-page audit program covers 36 specific items. Included are references to specific Janco products that directly address the areas the audit covers. This program can be used as a standalone audit program or in concert with the following Janco offerings:
- Disaster Recovery / Business Continuity Template
- Security Manual Template
- Security Audit Program Template
- Business and IT Impact Questionnaire
- IT Service Management for Service Oriented Architecture
- Metrics for the Internet and Information Technology
Here is a great video that another company has produced that describes what some of the major misconceptions are in disaster recovery and business continuity planning. These thoughts are the same as Janco’s and the video is well worth watching.
This is a great video on physical security as well as software security. It is a great primer that all CIOs and data center managers should consider. Not only does it address physical security issues and disaster recovery, it also addresses how Google implements and disposes of new server devices.
“Disaster Recovery and business continuity are all about being ready for everything. The question that every IT manager and CIO has to answer every day is what they should complete today if they knew a business interruption was going to happen in the next 12 hours.”
Cascading problems are not things that most companies want to talk about, but they are disaster recovery and business continuity risks
One client, who wants to remain nameless, thought on a Friday evening that they had a hardware problem. The weekend staff connected the device to the network to diagnose the issue, and a virus was released. They then transmitted that virus to one of their largest suppliers. When all was said and done, they had spent well over $500,000 to isolate the virus, restore the files, and make their supplier whole. They were lucky that this happened over the weekend, so it did not impact as many people. Interestingly, if the problem had surfaced a few hours earlier, their regular staff would have diagnosed it offline and it would not have gotten away from them.
Disasters and events that impact business continuity vary widely in more than duration. As you design your plan, consider the probability of threats that are:
- Chronicled — events that have occurred (Power outages, earthquakes, hurricanes)
- Human — events likely from carelessness, malicious intent, fatigue, or lack of training
- Geographical — events likely as a result of the location of your business (floods, storms, lightning strikes, earthquakes, typhoons, tsunamis)
- Localized — events due to system malfunctions (assembly line failures, computer crashes, sprinkler activations, chemical spills)
- Planned — scheduled events (software upgrades, system tests, facility moves) that go awry
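One common way to weigh these threats is a simple risk register that ranks each threat by probability times impact. The sketch below is only an illustration of that general technique, not a method the text prescribes: the category names come from the list above, while the `Threat` class, the probabilities, and the impact scores are all made-up assumptions.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    category: str              # one of the five categories listed above
    annual_probability: float  # assumed likelihood per year, 0.0 to 1.0
    impact: int                # assumed severity, 1 (minor) to 5 (severe)

    @property
    def score(self) -> float:
        """Risk score: probability multiplied by impact."""
        return self.annual_probability * self.impact


def rank_threats(threats: list[Threat]) -> list[Threat]:
    """Return threats ordered from highest to lowest risk score."""
    return sorted(threats, key=lambda t: t.score, reverse=True)


# Hypothetical sample data, for illustration only.
THREATS = [
    Threat("Power outage", "Chronicled", 0.5, 3),
    Threat("Operator error", "Human", 0.3, 4),
    Threat("Flood", "Geographical", 0.05, 5),
    Threat("Computer crash", "Localized", 0.4, 2),
    Threat("Software upgrade goes awry", "Planned", 0.2, 3),
]

if __name__ == "__main__":
    for threat in rank_threats(THREATS):
        print(f"{threat.score:4.2f}  {threat.category:<12}  {threat.name}")
```

A ranking like this only helps decide where to spend planning effort first; low-probability, high-impact events such as the flood in the sample data still need a plan.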
Janco’s own list of the top 10 disasters that CIOs and business managers need to plan for is:
- Weather related events like floods, tornadoes, hurricanes, forest/brush fires, and sand storms
- Facility fires
- Water pipe breaks in facility
- Fiber or communications lines are cut – loss of network
- Power failures – outage or sporadic service
- Human error, such as a redundant system failure that goes unnoticed and hinders the recovery operation
- Security breach – hacking and/or malicious code
- Data corruption and loss – not only from physical device or network failure but also from application and user error
- Cascading system failure
Janco believes that a prepared and well-rehearsed team addresses the issues associated with major and minor business interruptions much more quickly than companies that have no plan and no preparedness.
Information Management Magazine and Insurance Networking News both report significant growth in the number of IT jobs available in the health care field. Much of this is due to the requirement that all medical records be mechanized as electronic health records (EHR) and to new compliance requirements under the Affordable Care Act (aka Obamacare).
It is estimated that health care IT spending will increase by up to 25% in the next two years. Spending on health care software last year was close to $7 billion and is expected to grow by over $1 billion in the next year. Much of that spending will be by “small practice” physicians and “small hospitals”. The question is how protected they will be from business interruptions and security attacks.
Do you have any comments on this?
Disaster Recovery plans that depend on outsourcers face significant additional risk
What if you were in Florida, the hurricane season was in full swing, and your provider decided to go out of business? Would you have the time to move to a new provider and test your solution before you needed to execute your plan?
For example, earlier in the year Google decided to close its Message Continuity service. Google gave most clients a reasonable timescale to find an alternative supplier, allowing existing Message Continuity contracts to run until they expired. What if that was the solution you had selected for communicating with your staff? Would you be able to implement a new one in time?
Another example was the news that Doyenz, the US-based supplier of rCloud, a service which offers disaster recovery for physical and virtual servers, had decided to pull the plug on its UK operations. Clients were given not weeks or months but days to respond and to find a new supplier.
CIOs and IT managers all need to consider all of the possibilities and have alternative solutions in place and tested.
FEMA conference videos which discuss tools and services available in the disaster and business continuity processes.
We are looking for people who can help us find typos and poor grammar on our sites. We are offering incentives like major discounts on our products or free copies of selected products.
When a security breach or business interruption occurs, the life cycle from start to end is the same. First and foremost, you must be prepared and have a plan in place. That plan must include a way to know that the event or incident has occurred. Then you react to what has happened and get back to normal operations as quickly as possible.
After everything is back to the way it should be, there should always be a post-event analysis to find out what worked, what did not, and what could be done better.
6 Ways to Utilize Social Media Before a Disaster Strikes
by Adam Crowe
When creating a disaster recovery plan, include social media. Simple things like having a predefined hashtag (#companynameBC) will make the recovery process easier and provide a quick way to get back in business. In addition, use sites like youtube.com to host instructions on what to do and how to do it during the recovery process.
Business Continuity Services Video
This business continuity video is a good overview of what IBM thinks about the topic.