Disaster Recovery for HR Applications

Anirudh Khanna, Pacific Gas and Electric

HR applications enable companies to store employee information, manage standard HR functions, and execute critical HR activities such as processing payroll and administering benefits. Features of HR applications include an employee self-service portal, payroll, workforce management, recruitment and hiring, benefits administration, and talent management. These capabilities are often delivered through individual modules that form a unified suite of HR tools. HR applications are critical and should be designed with high availability and disaster recovery capabilities.

High availability (HA) is the ability of an application to withstand planned and unplanned outages (a system upgrade is an example of a planned outage) and to keep business-critical processing running continuously.

Disaster recovery (DR) involves a set of policies, tools, and procedures for returning a system, application, or data center to full operation after a catastrophic interruption. It includes procedures for copying and storing an installed system’s essential data in a secure location and for recovering that data to restore normal operation.

Disaster recovery relies upon replicating data and computer processing in an off-premises location unaffected by the disaster. When servers go down because of a natural disaster, equipment failure, or cyber-attack, a business must recover lost data from a second location where the data is backed up. Ideally, an organization can transfer its computer processing to that remote location to continue operations.

Importance of Disaster Recovery

A disaster is an unexpected problem resulting in a slowdown, interruption, or network outage in an IT system. Outages come in many forms, including the following examples:

An earthquake or fire,
Technology failures,
System incompatibilities,
Simple human error, and
Intentional unauthorized access by third parties.

These disasters disrupt business operations, cause customer service problems, and result in revenue loss. A disaster recovery plan helps organizations respond promptly to disruptive events and provides key benefits such as:

Ensures business continuity: A disaster can be detrimental to all aspects of the business and is often costly. It also interrupts normal business operations, because the team’s productivity drops when access to the required tools is limited. A disaster recovery plan enables a quick restart from backup systems and data so that operations can continue as scheduled.
Enhances system security: Integrating data protection, back-up, and restore processes into a disaster recovery plan limits the impact of ransomware, malware, or other security risks for businesses. For example, cloud back-up services typically include built-in security features that can restrict suspicious activity before it impacts the business.
Improves customer retention: If a disaster occurs, customers question the reliability of an organization’s security practices and services. The longer a disaster impacts a business, the greater the customer frustration. A good disaster recovery plan mitigates this risk by training employees to handle customer inquiries. Customers gain confidence when they observe that the business is well-prepared to handle any disaster.
Reduces recovery costs: Depending on its severity, a disaster causes loss of income and productivity. A robust disaster recovery plan avoids unnecessary losses as systems return to normal soon after the incident. For example, cloud storage solutions are cost-effective data backup methods. You can manage, monitor, and maintain data while the business operates as usual.

Components of Disaster Recovery

Disaster recovery focuses on getting applications up and running within minutes of an outage. Organizations address the following three components:

1. Prevention: To reduce the likelihood of a technology-related disaster, businesses need a plan to ensure that all critical systems are as reliable and secure as possible. Because humans cannot control natural disasters, prevention only applies to network problems, security risks, and human errors. The right tools and techniques must be set up to prevent disaster. For example, system-testing software that automatically checks all new configuration files before applying them can prevent configuration mistakes and failures.
2. Anticipation: Anticipation includes predicting future disasters, knowing the consequences, and planning appropriate disaster recovery procedures. It is challenging to predict what can happen, but you can develop a disaster recovery solution with knowledge from previous situations and analysis. For example, backing up all critical business data to the cloud in anticipation of future hardware failure of on-premises devices is a pragmatic approach to data management.
3. Mitigation: Mitigation is how a business responds after a disaster. A mitigation strategy aims to reduce the negative impact on routine business procedures. All key stakeholders know what to do during a disaster, including the following steps:

Critical Elements of a Disaster Recovery Plan

Disaster recovery team: This team of subject matter experts will be responsible for creating, implementing, and managing the disaster recovery plan. This plan should define each team member’s role and responsibilities. In a disaster, the recovery team should know how to communicate with each other, employees, vendors, and customers.
Risk analysis: Assess the potential hazards that put your organization at risk. Depending on the type of event, strategize what measures and resources will be needed to resume business. For example, in a cyber-attack, what data protection measures will the recovery team have in place to respond?
Mission-critical application identification: A good disaster recovery plan includes documentation of which systems, applications, data, and other resources are most critical for business continuity, as well as the necessary steps to recover data.
Back-ups: Determine which data and systems need back-up, who should perform back-ups, and how back-ups will be implemented. Include a recovery point objective (RPO), which states the maximum amount of data, measured as time, that the organization can afford to lose and therefore sets the minimum frequency of back-ups, and a recovery time objective (RTO), which defines the maximum downtime allowed after a disaster. These metrics constrain the choice of IT strategy, processes, and procedures that make up an organization’s disaster recovery plan. The downtime an organization can handle and how frequently it backs up its data will inform the disaster recovery strategy.
Operational readiness testing: The DR team should continually test and update its strategy to address ever-evolving threats and business needs. Regular testing confirms that the company is ready to face worst-case scenarios. In planning how to respond to a cyber-attack, for example, it’s essential that organizations continually test and optimize their security and data protection strategies and have protective measures in place to detect potential security breaches.
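
To make the RPO and RTO discussion above concrete, here is a minimal Python sketch that checks whether a back-up schedule and an estimated restore time satisfy a pair of objectives. The names (RecoveryObjectives, meets_objectives) and the payroll figures are illustrative assumptions, not values from any particular HR system.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Hypothetical container for the two metrics discussed above."""
    rpo: timedelta  # maximum tolerable data loss, expressed as time
    rto: timedelta  # maximum tolerable downtime after a disaster


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    # If a failure happens just before the next back-up runs,
    # one full interval of changes is lost.
    return backup_interval


def meets_objectives(objectives: RecoveryObjectives,
                     backup_interval: timedelta,
                     estimated_restore_time: timedelta) -> bool:
    # Both conditions must hold: back-ups are frequent enough for the RPO,
    # and the restore is fast enough for the RTO.
    return (worst_case_data_loss(backup_interval) <= objectives.rpo
            and estimated_restore_time <= objectives.rto)


# Assumed example figures for a payroll module.
payroll = RecoveryObjectives(rpo=timedelta(hours=4), rto=timedelta(hours=8))

nightly_ok = meets_objectives(payroll, timedelta(hours=24), timedelta(hours=6))
hourly_ok = meets_objectives(payroll, timedelta(hours=1), timedelta(hours=6))

print(nightly_ok, hourly_ok)  # prints: False True
```

In this sketch a nightly back-up fails a four-hour RPO even though the restore time is acceptable, which illustrates why the two metrics have to be evaluated together.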

Steps to Build a Disaster Recovery Team

Whether creating a disaster recovery strategy from scratch or improving an existing plan, assembling the right collaborative team of experts is a critical first step. It starts with tapping IT specialists and other key individuals to provide leadership over the following key areas in the event of a disaster:

Crisis management: This leadership role commences recovery plans, coordinates efforts throughout recovery, and resolves emerging problems or delays.
Business continuity: The expert overseeing this ensures that the recovery plan aligns with the company’s business needs based on the business impact analysis.
Impact assessment and recovery: The team responsible for this recovery area has technical expertise in IT infrastructure, including servers, storage, databases, and networks.
IT applications: This role monitors which application activities should be implemented based on a restorative plan. Tasks include application integrations, settings, configuration, and data consistency.

While not necessarily part of the IT department, the following roles should also be assigned to any disaster recovery plan:

Executive management: The executive team will need to approve the strategy, policies, and budget related to the disaster recovery plan, plus provide input if obstacles arise.
Critical business units: A representative from each business unit will ideally provide feedback on disaster recovery planning to address their concerns.

Types of Disaster Recovery

Businesses can choose from a variety of disaster recovery methods or combine several:

  - Back-up: This is the simplest type of disaster recovery and entails storing data off-site or on a removable drive. However, just backing up data provides minimal business continuity help, as the IT infrastructure is not backed up.
  - Cold Site: In this type of disaster recovery, an organization sets up basic infrastructure in a second, rarely used facility that provides a place for employees to work after a natural disaster or fire. It can help with business continuity because business operations can continue. However, it does not provide a way to protect or recover important data, so a cold site must be combined with other disaster recovery methods.
  - Hot Site: A hot site always maintains up-to-date copies of data. Hot sites are more expensive and more time-consuming to set up than cold sites, but they dramatically reduce downtime.
  - Disaster Recovery as a Service (DRaaS): In the event of a disaster or cyber-attack, a DRaaS provider moves an organization’s computer processing to its cloud infrastructure, allowing a business to continue operations seamlessly from the vendor’s location, even if an organization’s servers are down. DRaaS plans are available through either subscription or pay-per-use models. There are pros and cons to choosing a local DRaaS provider: latency will be lower after transferring to DRaaS servers closer to an organization’s location, but in the event of a widespread natural disaster, a DRaaS nearby may be affected by the same tragic event.
  - Back Up as a Service: Similar to backing up data at a remote location, with Back Up as a Service, a third-party provider backs up an organization’s data, but not its IT infrastructure.
  - Datacenter disaster recovery: The physical elements of a data center can protect data and contribute to faster disaster recovery in certain types of disasters. For instance, fire suppression tools will help data and computer equipment survive a fire. A backup power source will help businesses sail through power outages without grinding operations to a halt. Of course, none of these physical disaster recovery tools will help in the event of a cyber-attack.
  - Virtualization: Organizations can back up certain operations and data or even a working replica of an organization’s computing environment on off-site virtual machines unaffected by physical disasters. Virtualization as part of a disaster recovery plan also allows businesses to automate some disaster recovery processes, bringing everything back online faster. For virtualization to be an effective disaster recovery tool, frequent replication of data and workloads is essential, as is good communication within the IT team about how many virtual machines are operating within an organization.
  - Point-in-time copies: Point-in-time copies, also known as point-in-time snapshots, make a copy of the entire database at a given time. This backup can restore data, but only if the copy is stored off-site or on a virtual machine unaffected by the disaster.
  - Instant recovery: Instant recovery is similar to point-in-time copies, except that instead of copying a database, instant recovery takes a snapshot of an entire virtual machine.
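
As a rough illustration of how point-in-time copies are used during recovery, the Python sketch below picks the most recent snapshot that was taken before an incident and is stored somewhere the incident did not reach. The Snapshot type, the location labels, and the timestamps are assumptions made for this example only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical record describing one point-in-time copy."""
    taken_at: datetime
    location: str  # assumed labels such as "on-site" or "off-site"


def pick_restore_point(snapshots: Iterable[Snapshot],
                       incident_time: datetime,
                       affected_locations: Set[str]) -> Optional[Snapshot]:
    # Keep only copies made before the incident and stored in a
    # location the disaster did not affect.
    usable = [s for s in snapshots
              if s.taken_at <= incident_time
              and s.location not in affected_locations]
    # The newest usable copy minimizes data loss; None means nothing survives.
    return max(usable, key=lambda s: s.taken_at, default=None)


# Assumed example data: the on-site copy is newer but is lost with the site.
snapshots = [
    Snapshot(datetime(2024, 1, 1, 1, 0), "off-site"),
    Snapshot(datetime(2024, 1, 1, 2, 0), "on-site"),
    Snapshot(datetime(2024, 1, 1, 3, 0), "off-site"),  # taken after the incident
]
incident = datetime(2024, 1, 1, 2, 30)

chosen = pick_restore_point(snapshots, incident, {"on-site"})
data_loss = incident - chosen.taken_at if chosen else None
print(chosen, data_loss)
```

Here the 02:00 on-site copy is the newest one before the incident, but it is unusable because it sits in the affected location, so the restore falls back to the 01:00 off-site copy and the data written in between is lost.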


Anirudh Khanna, Pacific Gas and Electric

Senior Data Protection Lead at Pacific Gas & Electric

Anirudh Khanna works for Pacific Gas and Electric (PG&E) as a Senior Data Protection Lead. He has been leading the Data Protection, Disaster Recovery and Cyber Recovery team in PG&E for the last 6 years.

Anirudh obtained his bachelor’s in computer science engineering in 2008, and he has 15 years of progressive professional experience in technology and its practical applications. He has worked with several Fortune 500 companies such as Viacom, Electronic Arts, Aegis Media and Tata Consultancy Services.

His experience cuts across various areas in IT Infrastructure management, Data Protection, VMware, Cloud computing, and Cyber recovery. He has led multiple end-to-end implementations to achieve substantial cost savings and optimize business processes.

Anirudh frequently discusses technology and its applications in various forums, especially IT Infrastructure management in large enterprises.

He can be reached at [email protected].
