The theory and principles of good business continuity exist. As the world changes, they may change too. However, organizations have and will always have a comprehensive body of knowledge available to help them to continue to operate normally even in the face of adversity. So the question “Does it work?” is not meant in the general sense, but in the specific one – as in “Does your particular business continuity planning and management work for you?” It’s a question that one company in the food sector found to be a longstanding frustration, until they put in place a way of answering it.
Commercial enterprises know that the best way to maintain market leadership is to attack yourself. It’s the same in IT security if you want to maximize your resistance against hackers. A niche industry has grown up around penetration testing – or ‘pentesting’ for short. Providers in this sector offer automated or manual tests to see if they can ethically hack your computer systems and network. Business self-preservation is a strong motivation for pentesting. Such tests may also be a necessary part of a certification process for being allowed to handle confidential customer or financial data, for example. Some practitioners divide test operations into white-box and black-box testing. But is it really that clear-cut?
Literature buffs among you should recognise this paraphrase of Samuel Taylor Coleridge’s poem, ‘The Rime of the Ancient Mariner’. Besides having to put up with an albatross hung round his neck, the Ancient Mariner despaired of a lack of drinking water while becalmed at sea (“Water, water, everywhere…”). Given today’s oceans of data, CIOs might feel much the same way. They have to battle to fulfil legal requirements and assist business continuity by enabling management to pick out single data objects from terabytes of storage. AHIMA (American Health Information Management Association) produced a model for the healthcare sector to tackle the problem. It’s a model that might be adapted for other industries too.
As you bring virtualisation into your IT infrastructure, you may have noticed a few security-related aspects that weren’t present in a purely physical ‘one-app-one-server’ environment. First of all, the virtual administrator (you or whoever) and the system hypervisor have significant new power over your population of servers. Secondly, ‘things’ exist at the virtualisation level that conventional monitoring at the physical level cannot detect. Thirdly, files can skip blithely from one machine to another. In fact, the machines themselves have, logically speaking, become files. These things are reasons for implementing virtualisation in the first place – but they are also security weaknesses.
Disaster recovery planners are often recommended to take a holistic view of their IT organisation. They should work to deal with potential outcomes, rather than possible causes. That certainly helps businesses achieve greater overall DR effectiveness and cost-efficiency. However, there’s no denying that a number of practical details must also be respected. Otherwise, the best-aligned DR plan may never get off the ground. The old rhyme says: “For want of a nail, a shoe was lost…” and finally the whole kingdom too. Here are a few such ‘nails’ that disaster recovery planning can take into account to get those mission-critical apps up and running again after an incident.
There’s no doubt that virtualisation has been a boon to many enterprises. Being able to rationalise the use of servers by spreading storage and applications evenly over a total pool of hardware resources leads to higher cost-efficiency, as well as improved disaster recovery and business continuity. Yet in practical terms, businesses are often still tied to one vendor for any effective storage strategy. To break free of that constraint, software-defined storage (SDS) lets IT departments mix and match the physical storage devices as they want. And there are further benefits too.
Your data backups are there to help you recover information, applications and files if required, hopefully both effectively and efficiently. But they and any archiving you do may also be there for external parties to use as a result of e-discovery. That’s the retrieval of electronically stored information (ESI) for use in legal proceedings involving your organisation. The US has led the way in this field, defining ESI as any information that is “created, stored, or best used with any kind of computer technology”. Now in Australia, all court dealings above a certain size must be conducted completely digitally. But is e-discovery good news or bad news for legal rulings and ultimately business continuity?
Virtualisation is a business continuity answer to the vulnerabilities and foibles of physical servers. By spreading applications virtually and horizontally across vertical stacks of computing power, service can be ensured even if one stack goes down, because the same application elsewhere picks up the slack. In principle, that’s fine – as long as IT administrators remember they’re dealing with virtual machines and manage them correctly. War stories grow daily of catastrophes or near misses concerning faulty perceptions and handling of virtualisation. The following can help you conserve business continuity and avoid the need for disaster recovery.
The main challenges in properly implementing business continuity management in an organisation can be expressed in four words: engagement, understanding, appropriateness and assumptions. In other words: senior management needs to be involved and committed to BCM; business continuity managers need to understand the essentials about IT operations; BCM processes need to link business objectives to operational realities; and any assumptions in BC planning need to be closely scrutinized. If this sounds like IT governance, you’re right. IT governance gives some good hints about how to make business continuity a practical, valued reality.
Historically, vendor solutions for disaster recovery have been created for on-site use for individual enterprises. The client company concerned was the sole owner of the user data involved, and disaster recovery could be implemented without having to worry about anybody else. The cloud computing model changes that situation. It’s possible to use cloud services to have your own dedicated servers and instances of applications, or to share physical space but still have your own application (as in multi-instance setups). However, multi-tenancy (perhaps the defining feature of cloud architectures) makes the application of disaster recovery solutions rather more delicate.
Agile project methodologies have their roots in the software industry, but the overall principle of staying close to market requirements can be applied in any sector. When risk management becomes difficult because of uncertainties like the weather or the economy, short agile cycles encourage a focus on objectives. This may make more sense than detailed planning that tries to put everything in place for the mid to long term. Efficiency and business continuity can be improved, on condition that communications remain open and productive with all stakeholders. So with these advantages, why don’t all organisations and projects jump on the agile bandwagon?
The ‘not invented here’ syndrome was something that forward-looking corporations set out to beat about 20 years ago. If a different product or service could be more cost-effectively bought in rather than being designed and manufactured in-house, then it was bought in. The challenge was to overcome misplaced pride and internal turf wars, where being asked to give up control over development could be construed as an attack on credibility, status or both. Some departments resisted by refusing to work with something that was ‘not invented here’. Now, Disaster Recovery as a Service (DRaaS) may be plagued with a similar issue, where companies cannot look outside what they already have – but for a different reason.
Traditional data backup happens once every so often – once an hour, once a day, once a week, for example, depending on the recovery requirements associated with the data. It’s typically the recovery point objective or RPO that determines the frequency of the backup. If you cannot afford to lose more than the last 30 minutes’ worth of data, then your RPO will be 30 minutes and backups will happen at least every half an hour. Continuous replication, on the other hand, changes the model by backing up your data every time you make a change. But what does that do to RPO, disk space requirements and network capacity (assuming you’re backing up to storage in a different physical location)?
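The link between RPO and backup frequency described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names are hypothetical, backups are treated as instantaneous, and the worst-case data loss is taken to be one full backup interval.

```python
import math
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Return True if a periodic backup schedule satisfies the RPO.

    Assumption: the worst case is a failure just before the next backup,
    so the data at risk equals one full backup interval.
    """
    return backup_interval <= rpo


def min_backups_per_day(rpo: timedelta) -> int:
    """Smallest number of evenly spaced backups per day that meets the RPO."""
    return math.ceil(timedelta(days=1) / rpo)


rpo = timedelta(minutes=30)
print(meets_rpo(timedelta(minutes=30), rpo))  # half-hourly backups
print(meets_rpo(timedelta(hours=1), rpo))     # hourly backups
print(min_backups_per_day(rpo))
```

With a 30-minute RPO, half-hourly backups pass the check and hourly backups do not. In practice a backup also takes time to complete, so the interval would usually need to be somewhat shorter than the RPO itself.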
Ensuring employee safety by rapidly disseminating the right information and keeping communication lines open in a time of crisis are both priorities for businesses. Traditional solutions for this have relied on the manual ‘call tree’ or ‘phone tree’. Key employees are contacted first to inform them of whatever situation or crisis has arisen, with remaining staff to be contacted as soon as possible afterwards. However, even for smaller organisations of, for example, 100 people, the manual call tree rapidly demonstrates its limitations. For larger enterprises, there is no doubt – a better solution is required.
If you’ve already experienced a distributed denial of service attack, you may have simply seen it as an attempt to cripple a company or organisation by blocking connections to its servers. Indeed, that’s what DDoS is designed to do. Hackers use a multitude of computers, some without the real computer owner’s knowledge, to generate more traffic than a server can cope with. Legitimate users are unable to connect to the server or experience very poor performance (slow connections). However, DDoS often indicates more than one stand-alone cyber aggression. Organisations experiencing this kind of attack should be on the lookout for other risks too.