Making your Business Unstoppable
Angela Osorio
HPS Solution Manager
Today’s business is about information availability
Evolution of Business Continuity
Decade: ‘80s; ‘90s; ‘00s
Business Focus: Traditional; Dot.com; e-Business
Requirements: Restore, Recover; High Availability; 24 x 7, Scalable
Driven by: Regulation; e-Commerce; Competition
Magnified by: Dependence on Computers; Absence of “Bricks & Mortar”
Disaster Recovery: Hardware; Hardware, Data; Hardware, Data, Applications
Expectation: Days/Hours; Minutes/Seconds; Minutes/Seconds
Decision: Optional; Mandatory
Changing Concept of Business Continuity
 Availability
 Accessibility
 Quality
Drivers of Data and Information Flow
[Diagram: information flow Yesterday vs. Today. Labels: Company, Customer, ASP, ISP, SSP, DIST, MFG, Credit, SC]
Risks to information availability
The Failure Event Spectrum
Component
Administrative Intervention
Building Level Incident
Metropolitan Area Event
Regional Event
Global Event
Causes of DOWNTIME
1 Planned maintenance
2 Application failure
3 Operator error
4 Operating system failure
5 Hardware failure
6 Power outage
7 Natural disaster
Source: Gartner Group
Financial cost of downtime is relative to who feels the pain
Industry         Application             Average cost per hour of downtime (US$)
Financial        Brokerage operations    $7,840,000
Financial        Credit card sales       $3,160,000
Media            Pay-per-view            $183,000
Retail           Home shopping (TV)      $137,000
Retail           Catalog sales           $109,000
Transportation   Airline reservations    $108,000
Entertainment    Tele-ticket sales       $83,000
Shipping         Package shipping        $34,000
Financial        ATM fees                $18,000
Source: Contingency Planning Research, 2000
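To make the hourly figures listed above concrete, the sketch below scales them to an outage of a given length. The function name and the linear scaling are assumptions for illustration only; real losses may not grow linearly with time.

```python
# Average cost per hour of downtime in US$, copied from the figures above
# (Contingency Planning Research, 2000).
COST_PER_HOUR_USD = {
    "Brokerage operations": 7_840_000,
    "Credit card sales": 3_160_000,
    "Pay-per-view": 183_000,
    "Home shopping (TV)": 137_000,
    "Catalog sales": 109_000,
    "Airline reservations": 108_000,
    "Tele-ticket sales": 83_000,
    "Package shipping": 34_000,
    "ATM fees": 18_000,
}


def outage_cost(application: str, minutes: float) -> float:
    """Estimate the cost of an outage, assuming loss grows linearly with time."""
    return COST_PER_HOUR_USD[application] * minutes / 60


if __name__ == "__main__":
    # Hypothetical 30-minute outage for each application.
    for app in COST_PER_HOUR_USD:
        print(f"{app}: ${outage_cost(app, 30):,.0f}")
```

Under this assumption, half an hour without ATM service costs about $9,000, while the same half hour in brokerage operations costs close to $4 million.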
Disasters are defined by you
One person’s inconvenience may be another’s disaster
 Which systems are critical to your business?
– Those which are customer facing are usually more
important
 What happens if data becomes unavailable?
– Is it merely inconvenient or aggravating?
– Is it life or death?
More disastrous results
 Loss of customer service satisfaction
 Cost and time of rebuilding lost data
 Possible fines and penalties imposed by regulatory agencies
 Idle time of employees
 Fines and penalties imposed for not meeting contracted
delivery times or SLAs
 Movement of your customers to your competitor
High Availability and Disaster Tolerance
 High Availability tends to be:
– Transaction-centric
– Transaction integrity-focused
– Local
– Recovery time focused
– Very short time horizon
 Disaster Tolerance tends to be:
– Data-centric
– Data integrity-focused
– Geographical
– Recovery point focused
– Longer time horizon
Protect your business…
Protect your information
The stakes are high!
“Nearly half the companies that lose their data through disaster never re-open, and 90% are out of business within two years.”
Source: University of Texas Center for Research on Information Systems
Site goes down
Shares down
30 pts.
$4B in stock
value lost
Anticipated problems driving need for High Availability and Disaster Tolerance
What types of problems does/will your plan anticipate?
Network failure 86.9%
Hardware component failure 84.8%
Natural disasters 84.4%
Operating system fault/failure 77.6%
Software viruses 75.5%
Application failure 70.9%
Malicious physical and computer security breaches (external) 68.4%
Malicious physical and computer security breaches (internal) 59.1%
Acts of man (war, terrorism, etc.) 57.8%
Service provider failure 56.1%
Accidental employee-initiated outages 55.3%
Attack on company Web site 53.6%
[Chart: each problem is also broken down by company size, Under $20M in Revenue and Over $20M in Revenue. Bar labels as extracted: 89.5, 87.6, 78.9, 77.9, 89.3, 90.9, 77.9, 76, 83.2, 69.4, 71.6, 69.4, 67.4, 68.6, 56.8, 59.5, 56.8, 60.3, 60, 53.7, 47.4, 61.2, 56.8, 52.9]
CIO Insight study on Disaster Recovery – November 2001
Events that actually forced companies to declare a disaster
Power Outage
Hardware Failure
Fire
Flood
Earthquake
Hurricane
Software Error
Bombing
Snow/Wind Storm
Network Failure
Contamination
Burst Pipe
Forced Evacuation
HVAC Failure
Delayed Relocation
Riot
DR Testing went wrong
Source: Disaster Recovery Journal
High Availability & Disaster Tolerance
It’s about data and keeping it available
What Is Your Specific Situation?
 Questions to ask yourself
– What is your business?
– What is your application?
– What is your environment (flood zone, earthquake)?
– What risks are you willing to take?
– What’s happened in the past?
– What if your critical systems were lost?
Evaluating RPO and RTO
 Recovery point objective
– How fresh is your data?
 Not all data needs to be recovered to the same point
 Recovery time objective
– How soon after an event do you need to be running?
 Not all applications need to come up at the same time
The quicker your required recovery time and the more
thorough and accurate your recovery point, the more
robust a solution is required
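The RPO and RTO questions above can be sketched as a simple check of candidate solutions against per-application objectives. The class names, field names and all numbers here are hypothetical, chosen only to show that different applications can tolerate different recovery points and recovery times.

```python
from dataclasses import dataclass


@dataclass
class Objective:
    rpo_minutes: float  # oldest acceptable age of the recovered data
    rto_minutes: float  # longest acceptable time until the application runs again


@dataclass
class Solution:
    name: str
    data_loss_minutes: float  # worst-case data lost when recovering with this solution
    restart_minutes: float    # worst-case time to be running again


def meets(solution: Solution, objective: Objective) -> bool:
    """A solution qualifies only if it satisfies both the RPO and the RTO."""
    return (solution.data_loss_minutes <= objective.rpo_minutes
            and solution.restart_minutes <= objective.rto_minutes)


# Hypothetical figures, for illustration only.
tape = Solution("Nightly tape backup", data_loss_minutes=24 * 60, restart_minutes=48 * 60)
replica = Solution("Synchronous replication", data_loss_minutes=0, restart_minutes=5)

payroll = Objective(rpo_minutes=24 * 60, rto_minutes=72 * 60)
orders = Objective(rpo_minutes=1, rto_minutes=15)

if __name__ == "__main__":
    for app_name, objective in (("payroll", payroll), ("orders", orders)):
        for solution in (tape, replica):
            print(app_name, solution.name, meets(solution, objective))
```

With these assumed numbers, tape backup is enough for payroll but not for order entry, which illustrates why the tighter the objectives, the more robust the solution has to be.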
Rules Of Thumb
[Diagram: applications placed by environment, from Less Forgiving to More Forgiving, against Disaster Tolerance Methodology]
Applications: Defense, Emergency 911, Data Warehousing, Tech Pubs, eCommerce, Discrete Mfg, Healthcare, Financial Transactions, Payroll, Telecommunications, Accounting
Disaster Tolerance Methodology: Backup and drive tape across town; Campus-Wide Clusters
High Availability & Disaster Tolerant responses are a balance of three aspects
Technology
Services
Procedures and discipline
Find the balance of three aspects
[Diagram: Services, Technology, Procedures & Discipline]
Techniques to eliminate system downtime
Technology
 Data protection
 Remote log shipping
 Data Replication Manager
 Stand-Alone systems
 Distributed & Networked systems
 Rolling Upgrades
 Redundancy, Hot Swap components, RAID
 Availability clusters
 Data mirroring, SMART
 Dual host/redundancy
 Shared Data clusters
 FDDI, ATM switching
 Campus-Wide Clusters
 Reliable Transaction Router
Services
 Insurance
 Assets Recovery
 Cold-site, Mobile recovery
 Business Protection Service
 Disaster recovery hot-site
Procedures & Discipline
 Plan
 Question
 Exercise
 Document procedures
 Eliminate single points of failure
 Provide shared, direct access to storage
 Minimize environmental risks
 Practice!
Nominal Justifiable Cost of Plan
 Does cost of recovery exceed the losses?
[Graph: Money vs. Time to recover, with Cost and Loss curves; marked: Acceptable downtime, Maximum cost of plan]
Evaluate Alternatives
 Does your plan make financial sense?
[Graph: Cost vs. Loss reduction (savings) for Plan I, Plan II, Plan III and Plan IV; marked: Acceptable downtime, Maximum cost of plan]
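The comparison of plan cost against loss reduction can be sketched as follows. This is a minimal illustration under stated assumptions: loss grows linearly with downtime, each plan has a single fixed cost, and every number and name (`Plan`, `evaluate`, the hourly loss) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    cost: float            # total cost of the plan (hypothetical)
    recovery_hours: float  # time to recover with this plan in place


def evaluate(plans, loss_per_hour, unprotected_hours, acceptable_hours):
    """Keep plans that recover within the acceptable downtime and cost no
    more than the loss they avoid; return them sorted by net benefit."""
    baseline_loss = unprotected_hours * loss_per_hour
    kept = []
    for plan in plans:
        savings = baseline_loss - plan.recovery_hours * loss_per_hour
        if plan.recovery_hours <= acceptable_hours and plan.cost <= savings:
            kept.append((plan.name, savings, savings - plan.cost))
    return sorted(kept, key=lambda row: row[2], reverse=True)


# Hypothetical alternatives, loosely mirroring Plan I through Plan IV.
plans = [
    Plan("Plan I", cost=10_000, recovery_hours=72),
    Plan("Plan II", cost=50_000, recovery_hours=24),
    Plan("Plan III", cost=200_000, recovery_hours=4),
    Plan("Plan IV", cost=2_000_000, recovery_hours=0.5),
]

result = evaluate(plans, loss_per_hour=10_000, unprotected_hours=120, acceptable_hours=24)

if __name__ == "__main__":
    for name, savings, net in result:
        print(f"{name}: savings ${savings:,.0f}, net benefit ${net:,.0f}")
```

With these assumed figures, Plan I is rejected because it recovers too slowly and Plan IV because it costs more than the loss it prevents, leaving Plan III and Plan II as the alternatives that make financial sense.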
E-business… putting all of your “eggs-in-a-basket”
[Graph: Risk Level vs. Dependency on Technology]
Tools to Make Your
Business Unstoppable
Preventing a Disaster
 You Need:
– copy of applications
– copy of application data
 current: no, or predictable degree of, data loss
 consistent: write ordering across related replicas
– systems to restart and run applications
– reestablished client communications
 Spectrum of recovery techniques
– trade off cost, recovery time, data currency
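The "current" and "consistent" requirements above can be illustrated with a small sketch of a replica that applies writes strictly in the order the primary issued them. This is not how any particular replication product works; the class, its sequence-number scheme and the sample writes are assumptions made for illustration.

```python
class OrderedReplica:
    """Replica that applies writes in sequence order, so its state always
    matches some earlier state of the primary (consistent), even if it is
    behind by a known number of writes (current to a predictable degree)."""

    def __init__(self):
        self.data = {}
        self.next_seq = 1   # sequence number of the next write to apply
        self.pending = {}   # writes that arrived ahead of their turn

    def receive(self, seq, key, value):
        """Buffer the write, then apply every write that is now in order."""
        self.pending[seq] = (key, value)
        while self.next_seq in self.pending:
            k, v = self.pending.pop(self.next_seq)
            self.data[k] = v
            self.next_seq += 1

    def lag(self, primary_last_seq):
        """Number of primary writes not yet applied on the replica."""
        return primary_last_seq - (self.next_seq - 1)


if __name__ == "__main__":
    replica = OrderedReplica()
    replica.receive(1, "order-17", "created")
    replica.receive(3, "order-17", "shipped")   # arrives early, held back
    print(replica.data, replica.lag(3))
    replica.receive(2, "order-17", "paid")      # gap filled, 2 and 3 applied
    print(replica.data, replica.lag(3))
```

Holding back write 3 until write 2 arrives means the replica never shows an order as shipped before it was paid; the price is a measurable lag, which is the data currency side of the trade-off.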
AVAILABILITY…
open all night long
“High availability is as important to eCommerce as breathing is to humans. Our Compaq servers stay highly available to customers, giving us an advantage for eCommerce.”
Kal Raman
Chief Information Officer
Drugstore.com, Inc.
Making online
healthy and
beautiful
SECURITY…
solving a devilish problem
“At the Vatican... security was our first criterion in choosing a partner; our second critical factor was availability; another was high performance.”
Stefano Pasquini
IT Planner
Internet Office of the Holy See
God knows what
else you need…
Professional
Services
Business Continuity Methodologies
[Diagram: applications, from Synchronous to Asynchronous, mapped to technologies]
Application: Defense, Emergency 911, Data Warehousing, Tech Pubs, eCommerce, Healthcare, Financial Transactions, Payroll, Telecommunications, Accounting, Discrete Mfg
Technology: Simple Backup & Remote Storage Site, Data Protection Technologies, Remote Log Shipping, SANworks Data Replication Manager, Campus-Wide Clusters, Reliable Transaction Router