The Overlooked Importance of Vendor Risk Management and Business Continuity

Last week’s massive computer outage caused by CrowdStrike's update (Channel File 291) on Microsoft systems serves as a stark reminder of the critical need for comprehensive Vendor Risk Management (VRM) and Business Continuity/Disaster Recovery (BC/DR) processes. The incident, which resulted in widespread Blue Screen of Death (BSOD) issues across Windows systems globally, disrupted operations across multiple sectors and negatively impacted thousands of lives, likely including your own.

To effectively manage vendor risks, organizations should implement the following controls in alignment with well-established industry standards and frameworks:

1. ISO/IEC 27001:2022:
   - A.15.1: Establish and maintain documented policies and procedures to manage the risks associated with supplier relationships.
   - A.15.2: Ensure information security requirements are incorporated into supplier agreements.
   - A.15.3: Regularly monitor and evaluate supplier performance against agreed-upon information security requirements.
2. AICPA SOC 2:
   - CC9.2: Assess and manage risks associated with vendors and business partners.
   - CC8.1: Implement changes to infrastructure, data, software, and procedures to meet objectives.
3. ISO/IEC 42001:2023 (Clauses):
   - 6.1.2: Perform regular risk assessments to identify potential disruptions and their impact on AI systems.
   - 6.1.3: Develop and implement treatment plans to mitigate identified risks and ensure continuity of AI operations.

Organizations must also ensure business continuity and disaster recovery plans are comprehensive and tested regularly to mitigate the impact of such incidents. Your controls should include:

1. ISO/IEC 27001:2022:
   - A.17.1: Maintain information security at an appropriate level during disruptions.
   - A.17.2: Develop and implement ICT continuity plans.
   - A.12.3: Maintain and regularly test backup copies of information, software, and systems.
   - A.16.1: Establish incident response procedures.
2. AICPA SOC 2:
   - CC7.5: Develop activities to recover from security incidents.
   - A1.3: Test recovery plan procedures periodically.
3. ISO/IEC 42001:2023 (Clauses):
   - 6.1.2: Perform regular risk assessments.
   - 6.1.3: Implement risk treatment plans.
4. ISO 22301:2019 (Clauses):
   - 8.2: Implement systematic processes for analyzing business impact and assessing risks of disruption.
   - 8.3: Identify and select business continuity strategies.
   - 8.4: Provide plans and procedures to manage disruptions.
   - 8.5: Maintain a program of exercising and testing business continuity strategies.

To discuss more, or for help getting started, please reach out! A-LIGN

#iso42001 #iso27001 #BCDR #TheBusinessofCompliance #ComplianceAlignedtoYou
Disaster Recovery Project Management
Explore top LinkedIn content from expert professionals.
Summary
Disaster recovery project management is the process of planning, organizing, and executing steps that help organizations quickly restore vital systems and data after an unexpected event like a natural disaster, cyberattack, or hardware failure. It ensures business operations can resume swiftly with minimal disruption and data loss, often following a clearly defined disaster recovery (DR) plan.
- Set clear priorities: Conduct a business impact analysis to identify the most essential systems and services that need to be restored first after a disruption (a minimal prioritization sketch follows this list).
- Test your plan: Regularly practice disaster recovery procedures, including failover and restoration drills, to make sure your strategy works when it matters most.
- Understand vendor risks: Review and monitor the disaster recovery plans of any third-party partners to be confident they can support your business in a crisis.
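To make the first takeaway concrete, here is a minimal Python sketch of how a business impact analysis might be captured as data and turned into a recovery order. The system names, downtime costs, and RTO/RPO figures are hypothetical placeholders, not taken from any of the posts below.

```python
from dataclasses import dataclass

@dataclass
class SystemProfile:
    name: str
    downtime_cost_per_hour: float  # estimated cost of an outage (USD/hour)
    rto_hours: float               # Recovery Time Objective
    rpo_minutes: float             # Recovery Point Objective

# Hypothetical output of a business impact analysis (BIA)
bia = [
    SystemProfile("order-capture", downtime_cost_per_hour=50_000, rto_hours=1,  rpo_minutes=5),
    SystemProfile("erp-core",      downtime_cost_per_hour=20_000, rto_hours=4,  rpo_minutes=30),
    SystemProfile("payroll",       downtime_cost_per_hour=2_000,  rto_hours=24, rpo_minutes=240),
    SystemProfile("intranet-wiki", downtime_cost_per_hour=100,    rto_hours=72, rpo_minutes=1_440),
]

# Restore the tightest-RTO, highest-impact systems first
recovery_order = sorted(bia, key=lambda s: (s.rto_hours, -s.downtime_cost_per_hour))

for rank, system in enumerate(recovery_order, start=1):
    print(f"{rank}. {system.name}: RTO {system.rto_hours}h, RPO {system.rpo_minutes}min, "
          f"~${system.downtime_cost_per_hour:,.0f}/hour at risk")
```

Ordering by RTO first and downtime cost second is just one possible policy; the point is that a BIA produces data you can act on during a drill rather than a document that sits on a shelf.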
-
DR (Disaster Recovery): The Minibar of RISE 🍹

CFO: “Extra for DR? I thought RISE was all-inclusive!”
Me: “It is like a resort. Wi-Fi and breakfast included; minibar (DR) is extra. 🍹🧾”

What DR is (plain English): A standby SAP system in another data center that takes over if your main one goes 💥. Like insurance, you grumble about the premium… until the day you are glad you paid it.

What RISE includes by default:
· High Availability (HA): resilient setup in one region so routine failures do not bring you down.
· Backups: taken regularly (about 30 days kept, stored offsite).
· Uptime target: about 99.7% for production.
· No RTO/RPO promise: unless you buy DR, there is no commitment on
  o RTO = Recovery Time Objective (how long you can be down)
  o RPO = Recovery Point Objective (how much data you can lose)

What paid DR adds:
· A second site (different region) that has been designed, monitored, and tested.
· Typical RTO ≈ 12 hours (becomes ≈ 4 hours if you pay more) and RPO of 0–30 minutes.
· Runbooks + annual DR drills so the plan works under pressure.
· Yes, it costs. Also, yes, it reduces existential risk.

When DR is probably overkill:
· Downtime = inconvenience, not lost revenue or compliance trouble.
· You can live with a restore-from-backup timeline.
· Early-stage, low-risk teams where cash > continuity (for now).

When DR is non-negotiable:
· Pharma/Life Sciences: validation, batch records, inspections → compliance + patient safety.
· Banking/Financials: downtime = real-time money, reputation, regulator heat.
· 24×7 core ops: plants, supply chains, e-commerce, order capture.

Key questions for SAP leaders:
1. RTO: How long can we be down before it hurts badly?
2. RPO: How much data can we lose without breaking laws/contracts?
3. Math: Hourly outage cost vs annual DR cost (a rough break-even sketch follows this post).
4. Obligations: Regulators, customer SLAs, audits.
5. Pick your fit: No DR / Warm DR / Hot DR.

Action: Align CFO–CIO–Audit, test, document, sleep better 😴.

Bottom line: DR is a “wasteful” parachute, annoying on the ground, priceless in freefall. If hours offline risk revenue, regulators, or patients, it is a lifeline. If not, own the risk—consciously.

CFOs and SAP leaders—what is your RTO/RPO and why?

#SAP #RISEwithSAP #S4HANA #DisasterRecovery #CFO #CIO #ITStrategy #Pharma #Banking #Cloud #Leadership #Hiring
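As a rough illustration of question 3 above, here is a small Python sketch of the outage-cost vs DR-cost math. Every figure (hourly outage cost, outage frequency, downtime assumptions, DR price) is a made-up placeholder; substitute your own numbers.

```python
# Rough break-even math for paid DR: expected annual outage cost without DR
# versus the annual DR subscription plus the smaller residual outage cost with DR.
# All numbers below are hypothetical placeholders.

hourly_outage_cost = 25_000          # revenue/productivity lost per hour of downtime (USD)
expected_outages_per_year = 0.5      # e.g. one significant outage every two years
hours_down_without_dr = 48           # restore-from-backup timeline, no RTO commitment
hours_down_with_dr = 12              # typical committed RTO with paid DR
annual_dr_cost = 150_000             # DR subscription / second-site cost (USD/year)

cost_without_dr = expected_outages_per_year * hours_down_without_dr * hourly_outage_cost
cost_with_dr = expected_outages_per_year * hours_down_with_dr * hourly_outage_cost + annual_dr_cost

print(f"Expected annual cost without DR: ${cost_without_dr:,.0f}")
print(f"Expected annual cost with DR:    ${cost_with_dr:,.0f}")
print("Paid DR pays for itself" if cost_with_dr < cost_without_dr
      else "On these numbers, consciously owning the risk may be cheaper")
```

The same structure works for RPO: multiply the hours of data you would lose by a per-hour cost of lost transactions, and compare an RPO of 0–30 minutes against the gap to your last backup.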
-
For the last few months I have been involved in creating disaster recovery strategies for our systems and their implementation. Here are some learnings from designing real systems!

1. Cost was always one of the most important discussions, so you can't skip it in your system design interview.
2. The recovery strategy should always work when we need it, so it has to be reliable.
3. The solution should be ready within a week or two for one service, like MongoDB, Cassandra or Redis. It cannot take a long time because this is critical for us.
4. The technologies we used were Terraform to create the infra from the backup EBS snapshot, Ansible to automate the installation and startup processes, and Jenkins jobs that run both Terraform and Ansible so recovery happens in a single click.
5. The final outcome was a set of jobs that can recover a completely deleted database in less than 10 minutes.

Real systems teach you tons of things. Execution speed with a balance of stability in the solution is really what we needed!

Follow Arpit Adlakha for more!
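The post describes Terraform and Ansible driven by Jenkins; since those configs aren't shown, here is a rough Python (boto3) sketch of the core recovery step: restoring a volume from the most recent EBS snapshot and attaching it to a replacement instance. The region, tag name, instance ID, and device name are hypothetical assumptions, not details from the post.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

# 1. Find the most recent snapshot tagged for this service (tag scheme is hypothetical)
snapshots = ec2.describe_snapshots(
    OwnerIds=["self"],
    Filters=[{"Name": "tag:service", "Values": ["mongodb"]}],
)["Snapshots"]
latest = max(snapshots, key=lambda s: s["StartTime"])

# 2. Create a new volume from that snapshot in the target availability zone
volume = ec2.create_volume(
    SnapshotId=latest["SnapshotId"],
    AvailabilityZone="us-east-1a",
    VolumeType="gp3",
)
ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])

# 3. Attach it to the replacement instance (instance ID and device are placeholders)
ec2.attach_volume(
    VolumeId=volume["VolumeId"],
    InstanceId="i-0123456789abcdef0",
    Device="/dev/sdf",
)
print(f"Restored {latest['SnapshotId']} as {volume['VolumeId']}")
```

In the setup described above, a Jenkins job would wrap a step like this together with the Ansible play that installs and starts the database; this sketch covers only the volume restore.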
-
🌩️ Ensuring Business Resilience: Cloud Disaster Recovery Strategies 🌩️

In today's rapidly evolving digital landscape, organizations must be prepared for any unforeseen disruptions that could impact their operations. Cloud disaster recovery strategies play a pivotal role in safeguarding critical business data and ensuring minimal downtime in the face of unexpected incidents.

Two essential metrics shape effective disaster recovery plans: Recovery Time Objective (RTO) and Recovery Point Objective (RPO). RTO defines the acceptable duration of downtime, while RPO determines the maximum tolerable amount of data loss.

By leveraging cloud technologies, businesses can optimize their disaster recovery strategies and achieve greater resilience. Here are some key considerations:

1️⃣ Cloud-Based Replication: Replicating data and infrastructure to the cloud provides organizations with off-site backups and real-time synchronization. This approach significantly reduces RPO, allowing for minimal data loss during recovery.

2️⃣ Scalable Infrastructure: Cloud platforms offer elastic scalability, enabling organizations to provision additional resources on demand. This flexibility ensures rapid recovery and meets the defined RTO by quickly ramping up the necessary infrastructure.

3️⃣ Automated Backup and Testing: Implementing automated backup mechanisms simplifies the process of capturing and storing data. Regular testing of the recovery process helps identify any potential gaps, ensuring a smoother restoration in case of an actual disaster.

4️⃣ Geographical Redundancy: Deploying disaster recovery environments across multiple geographically diverse regions enhances resilience. By spreading infrastructure across different locations, organizations can minimize the impact of localized incidents and achieve higher availability.

5️⃣ Monitoring and Alerting: Proactive monitoring and real-time alerting systems are crucial for identifying potential issues and initiating recovery procedures promptly. Continuous monitoring helps organizations meet their RTO goals and mitigate risks effectively.

Embracing cloud disaster recovery strategies empowers businesses to protect critical assets and maintain continuity during unexpected disruptions. It enables organizations to recover swiftly, minimize data loss, and ensure uninterrupted service delivery to customers.

Let's strive for resilience and embrace cloud-based solutions that enable us to navigate any storm, ensuring our businesses stay operational and thrive in the face of adversity. 💪🌐

#cloudcomputing #disasterrecoveryplan #businessresilience #rto #rpo #cloudsolutions

PC: Govardhana Miriyala Kannaiah
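Points 3️⃣ and 4️⃣ above are the most directly automatable. As a minimal illustration, here is a Python (boto3) sketch that snapshots a volume and copies the snapshot to a second region for geographic redundancy. The regions, volume ID, and descriptions are hypothetical, and a production setup would more likely use a managed service such as AWS Backup than a hand-rolled script.

```python
import boto3

SOURCE_REGION = "us-east-1"              # assumed primary region
DR_REGION = "eu-west-1"                  # assumed secondary (DR) region
VOLUME_ID = "vol-0123456789abcdef0"      # placeholder volume

source_ec2 = boto3.client("ec2", region_name=SOURCE_REGION)
dr_ec2 = boto3.client("ec2", region_name=DR_REGION)

# 1. Snapshot the volume in the primary region (automated backup)
snapshot = source_ec2.create_snapshot(
    VolumeId=VOLUME_ID,
    Description="nightly DR snapshot",
)
source_ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot["SnapshotId"]])

# 2. Copy the completed snapshot into the DR region (geographic redundancy)
copy = dr_ec2.copy_snapshot(
    SourceRegion=SOURCE_REGION,
    SourceSnapshotId=snapshot["SnapshotId"],
    Description="cross-region DR copy",
)
print(f"Snapshot {snapshot['SnapshotId']} copied to {DR_REGION} as {copy['SnapshotId']}")
```

Scheduling a job like this, and periodically restoring from the copied snapshot, is what turns "we have backups" into a measurable RPO and RTO.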
-
Disaster Recovery: It (literally) hit home during Black Hat!

During the week of Black Hat in August, a fast-moving storm produced 6 tornadoes in Maryland. I wish I had been there to help my husband over the 3 days he had to live (and work) without power, but I got home just in time to help him with the massive cleanup from the storm. It has taken us two weeks of clearing trees to even see the grass again. The winds were so fierce that broken tree branches were driven into the earth, and we have had to dig them up to move them. Luckily, no one was hurt in our area, but our little town is still recovering.

Since I was a kid in a small Texas town, there was a plan. When we saw the storm coming we would head to the hallway bathroom and put a mattress over us. Sounds crazy, but that was our plan, and our family knew the drill.

Through our personal disaster, I have been working on Business Continuity and Disaster Recovery Plans for our customers. We cannot predict nor stop events that can cripple our homes or businesses, sometimes for weeks, so what can we do to survive through them? Have a plan!

Why does your organization need a Business Continuity Plan (BCP)? For many it is a compliance requirement. I encourage everyone: if you are taking the time to develop a BCP for compliance, design it as if your business is going to use it. Identify the most critical processes for your business by conducting Business Impact Analyses (BIAs) with each business unit in the company. These help organizations prioritize response to keep the business afloat until operations are restored. This effort also sets Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs) for restoration. Prioritization of services is key: every system cannot be restored in 4 hours.

Then comes your Disaster Recovery (DR) Plan. What does this look like? Have you tested it? Can you fail over to an alternate site successfully or restore critical systems in 4 hours? If you rely on third-party partners to run the most critical components of your business, do you understand their DR Plan? In a year of unprecedented weather events and continued large-scale cyberattacks, this is a great time to have those conversations.

From IT support teams carrying servers above their heads through knee-deep water to companies having to shift to writing paper checks to keep employees paid, I have seen the power of a plan keep organizations viable through the worst of times. If you need any help designing a plan, don’t hesitate to reach out. I am always ready to help.

#businesscontinuity #disasterrecovery #businessresiliency #planning #restoration #compliance #cisos

Photo: Our backyard after 4 trips to the dump.
-
Lived through enough disasters to know this truth: Production is where optimism goes to die.

Deployments WILL break. Systems WILL crash. You NEED to have a Disaster Recovery plan prepped.

Most organizations spend $$ on fancy tech stacks but don’t realize how critical DR really is until something goes wrong. And that’s where the trouble starts.

Here are a few pain points I see decision-makers miss:

👉 𝗕𝗮𝗰𝗸𝘂𝗽𝘀 ≠ 𝗗𝗶𝘀𝗮𝘀𝘁𝗲𝗿 𝗥𝗲𝗰𝗼𝘃𝗲𝗿𝘆. Sure, you’ve got backups—but what about your Recovery Point Objective (RPO)? How much data are you actually okay losing? Or your Recovery Time Objective (RTO)—how long can you afford to be down?

👉 "𝗦𝗲𝘁 𝗜𝘁 𝗮𝗻𝗱 𝗙𝗼𝗿𝗴𝗲𝘁 𝗜𝘁” 𝗗𝗥 𝗣𝗹𝗮𝗻𝘀. The app changes, infrastructure evolves, but you’re running on a DR plan you wrote two years ago?

👉 𝗜𝗱𝗹𝗲 𝗕𝗮𝗰𝗸𝘂𝗽 𝗘𝗻𝘃𝗶𝗿𝗼𝗻𝗺𝗲𝗻𝘁𝘀. Most teams have “hot spares” (idle infrastructure) sitting around waiting for the next big disaster.

Disasters aren’t IF, they’re WHEN.

Build DR testing into your CI/CD pipeline. If you’re shipping code daily, your recovery strategy should be just as active. Turn those idle backups into active DevOps workspaces. Load test them, stress test them, break them before production does.

Stop relying on manual backups or failovers. Tools like AWS Backup, Route 53, and Elastic Load Balancers exist for a reason. Automate your snapshots, automate your failovers, automate 𝗲𝘃𝗲𝗿𝘆𝘁𝗵𝗶𝗻𝗴.

Don’t wait for a disaster to test your DR strategy. Test it now, fail fast, and fix faster (a minimal restore-drill sketch follows this post).

What about you—what’s your top DR strategy tip? 💬

#DisasterRecovery #CloudComputing #DevOps #Infrastructure

Zelar - Secure and innovate your cloud-native journey. Follow me for insights on DevOps and tech innovation.
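In the spirit of “test it now,” here is a rough Python (boto3) sketch of a restore drill that could run from a CI/CD job: it restores the latest automated RDS snapshot into a throwaway instance and checks the elapsed time against an RTO target. The database identifiers, instance class, region, and RTO value are hypothetical, and a real drill would also validate the restored data and tear the drill instance down afterward.

```python
import time
import boto3

rds = boto3.client("rds", region_name="us-east-1")  # assumed region

SOURCE_DB = "orders-prod"          # placeholder production instance
DRILL_DB = "orders-dr-drill"       # throwaway instance created by the drill
RTO_TARGET_SECONDS = 4 * 3600      # hypothetical 4-hour RTO

# 1. Find the most recent automated snapshot of the production database
snapshots = rds.describe_db_snapshots(
    DBInstanceIdentifier=SOURCE_DB, SnapshotType="automated"
)["DBSnapshots"]
latest = max(snapshots, key=lambda s: s["SnapshotCreateTime"])

# 2. Restore it into a temporary instance and time the restore
start = time.monotonic()
rds.restore_db_instance_from_db_snapshot(
    DBInstanceIdentifier=DRILL_DB,
    DBSnapshotIdentifier=latest["DBSnapshotIdentifier"],
    DBInstanceClass="db.t3.medium",
)
rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier=DRILL_DB)
elapsed = time.monotonic() - start

# 3. Fail the pipeline if the restore blows the RTO budget
print(f"Restore of {latest['DBSnapshotIdentifier']} took {elapsed / 60:.1f} minutes")
if elapsed > RTO_TARGET_SECONDS:
    raise SystemExit("DR drill exceeded the RTO target")

# (A real drill would now run data checks and delete the drill instance.)
```

Running a drill like this on a schedule turns the RPO/RTO questions in the post above from assumptions into numbers you have actually measured.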