IT Disaster Recovery Plans

Explore top LinkedIn content from expert professionals.

  • View profile for Kelly Hood

    EVP & Cybersecurity Engineer @ Optic Cyber Solutions | Cybersecurity Translator | Compliance Therapist | Making sense of CMMC & CSF | CISSP, CMMC Lead CCA & CCP, CDPSE

    8,046 followers

    As I’ve been digging into the #CybersecurityFramework 2.0 and helping clients navigate the changes, I’ve found several areas where the new additions feel pretty significant. If you’re already using the #CSF and trying to figure out where to focus first, take note of these new Categories:

    ◾ The POLICY (GV.PO) Category was created to encompass ALL cybersecurity policies and guidance. On one hand it might seem like a "well, of course" moment to consolidate all cybersecurity policies into one place; on the other hand, policies were previously sprinkled throughout the CSF and tied to specific actions like Asset Management or Incident Response. Now it's all in one area, which makes a ton of sense and simplifies things, but also means we've got to remember that this one Category covers everything!

    ◾ Another significant addition is the PLATFORM SECURITY (PR.PS) Category, which largely pulls together key topics from the previous Information Protection Processes & Procedures (PR.IP) and Protective Technology (PR.PT), focusing on security protections around broader platform types (hardware, software, virtual, etc.). If you’re looking for things like configuration management, maintenance, and SDLC, you’ll now find them here.

    ◾ The TECHNOLOGY INFRASTRUCTURE RESILIENCE (PR.IR) Category also pulls largely from the previous Information Protection Processes & Procedures (PR.IP) and Protective Technology (PR.PT), but adds key aspects from Data Security (PR.DS). This new Category highlights the need for managing an organization’s security architecture and includes security protections around networks as well as your environment to ensure resource capacity, resilience, etc.

    So, what does all this mean for your organization? Whether you're just starting out or looking to refine your existing cybersecurity strategies, CSF 2.0 offers a more streamlined framework to bolster your cyber resilience. Remember, staying ahead in cybersecurity is a continuous journey of adaptation and improvement. Embrace these changes as an opportunity to review and enhance your cybersecurity posture, leveraging the expanded resources and guidance provided by #NIST!

    Have you seen the updated mapping NIST released from v1.1 to v2.0? Check it out here to get started and “directly download all the Informative References for CSF 2.0” 👇 https://lnkd.in/e3F6hn9Y
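    For readers re-tagging an existing v1.1 profile, the consolidation described above can be captured as a simple lookup table. The snippet below is a minimal sketch based only on the mappings named in this post; it is not NIST's official informative-reference mapping (that is what the link above provides), so verify against the downloadable references before relying on it.

    ```python
    # Minimal sketch (not NIST's official mapping): which CSF v1.1 Categories feed
    # the new v2.0 Categories called out in the post above.
    CSF_V2_SOURCES = {
        "GV.PO": [],                           # new Category: all policy content now lives here
        "PR.PS": ["PR.IP", "PR.PT"],           # Platform Security pulls from these v1.1 Categories
        "PR.IR": ["PR.IP", "PR.PT", "PR.DS"],  # Technology Infrastructure Resilience
    }

    def v2_targets(v1_category: str) -> list[str]:
        """Return the v2.0 Categories that absorbed content from a given v1.1 Category."""
        return [new for new, old in CSF_V2_SOURCES.items() if v1_category in old]

    if __name__ == "__main__":
        print(v2_targets("PR.IP"))  # ['PR.PS', 'PR.IR']
    ```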

  • View profile for Andrew King

    CISO | Chief Information Security Officer | Incident Commander | Cyber Security SME | Global IT Executive | Executes strategies to strengthen security, build high-performing teams, and mitigate risk

    5,889 followers

    After spending the past year leading ransomware incident response, I wanted to share some insights that you should be thinking about in relation to your organization.

    1. Leadership clarity is non-negotiable. Multiple executives giving competing directions doesn't just create confusion - it directly impacts your bottom line. Every minute of misaligned leadership translated into increased recovery costs and extended downtime.
    2. Trust your IR experts. Yes, you know your environment inside and out. But incident response is their expertise. When you hire specialists, let them specialize. I've seen firsthand how second-guessing IR teams can derail recovery efforts.
    3. Master the time paradox. Your success hinges on rapid containment while simultaneously extending threat actor negotiations. If your leadership and IR partnership aren't solid (points 1 & 2), this delicate balance falls apart.
    4. Global password resets are deceptively complex. Every human account, service account, API key, and automated process needs rotation. Without robust asset management and IAM programs, this becomes a nightmare. You will discover dependencies that you didn't even know existed. (A minimal rotation sketch follows after this list.)
    5. Visibility isn't just nice-to-have - it's survival. Modern security tools that provide comprehensive visibility across your environment aren't a luxury. This week reinforced that every blind spot extends your recovery time exponentially.
    6. Data gaps become permanent mysteries. Without proper logging and monitoring, you might never uncover the initial access vector. It's sobering to realize that lack of visibility today means questions that can never be answered tomorrow.
    7. Backup investment is incident insurance. Organizations regularly lose millions that could have been prevented with proper backup strategies. If you think good backups are expensive, wait until you see the cost of not having them.
    8. Protect your team from burnout. Bring in additional help immediately - don't wait. Your core team needs to be there for the rebuild after the incident, and running them into the ground during response isn't worth it. Spending money on staff augmentation isn't just about handling the immediate crisis - it's about maintaining the institutional knowledge and expertise you'll need for recovery.

    Remember: the incident ends, but your team's journey continues long after. #Cybersecurity #IncidentResponse #CISO #RansomwareResponse #SecurityLeadership
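    Point 4 above is where many teams stall. As a hedged illustration only, here is a minimal Python sketch of one slice of that rotation problem: deactivating and replacing AWS IAM access keys with boto3. The account names are hypothetical placeholders, and a real post-incident rotation also has to cover human credentials, secrets stores, and the automated processes the post mentions.

    ```python
    import boto3

    iam = boto3.client("iam")

    def rotate_access_keys(user_name: str) -> None:
        """Create a fresh access key for an IAM user, then deactivate the old ones.

        Sketch only: a real rotation must push the new secret to every dependent
        service before deactivating the old key, or things break. AWS also limits
        users to two access keys, so an old key may need deleting before creating.
        """
        old_keys = iam.list_access_keys(UserName=user_name)["AccessKeyMetadata"]
        new_key = iam.create_access_key(UserName=user_name)["AccessKey"]
        # ... distribute new_key["AccessKeyId"] / new_key["SecretAccessKey"] to consumers here ...
        for key in old_keys:
            iam.update_access_key(
                UserName=user_name,
                AccessKeyId=key["AccessKeyId"],
                Status="Inactive",  # deactivate first; delete only after confirming nothing broke
            )

    # Hypothetical service accounts discovered via asset management / IAM inventory
    for account in ["svc-backup", "svc-etl"]:
        rotate_access_keys(account)
    ```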

  • View profile for Arpit Adlakha

    AI and Software, Staff Software Engineer @Thoughtspot | LinkedIn Top Voice 2025

    76,411 followers

    For the last few months I have been involved in creating disaster recovery strategies for our systems and implementing them. Here are some learnings from designing real systems!

    1. Cost was always one of the most important discussions, so you can't skip it in your system design interview.
    2. The recovery strategy should always work when we need it, so it has to be reliable.
    3. The solution should be ready within a week or two for one service, like MongoDB, Cassandra, or Redis. It cannot take a long time because this is critical for us.
    4. The technologies we used were Terraform to create the infrastructure from the backup EBS snapshot, Ansible to automate installation and process startup, and Jenkins jobs to run both Terraform and Ansible so recovery happens in a single click (a rough sketch of the restore step follows below).
    5. The final outcome was jobs that can recover a completely deleted database in less than 10 minutes.

    Real systems teach you tons of things. Execution speed balanced with stability in the solution is really what we needed!

    Follow Arpit Adlakha for more!
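    The post's pipeline is built on Terraform, Ansible, and Jenkins; as a rough, hedged sketch of the core restore step only (not the author's actual code), here is the same idea in Python with boto3: find the newest matching EBS snapshot, create a volume from it, and attach it to a replacement instance. The tag key, region, and device name are placeholder assumptions.

    ```python
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")  # placeholder region

    def restore_latest_snapshot(tag_value: str, instance_id: str, az: str) -> str:
        """Create a volume from the newest matching EBS snapshot and attach it.

        Sketch only: the pipeline described in the post wraps this kind of step
        in Terraform/Ansible and triggers it from a one-click Jenkins job.
        """
        snaps = ec2.describe_snapshots(
            Filters=[{"Name": "tag:Backup", "Values": [tag_value]}],  # hypothetical tag
            OwnerIds=["self"],
        )["Snapshots"]
        latest = max(snaps, key=lambda s: s["StartTime"])

        volume_id = ec2.create_volume(
            SnapshotId=latest["SnapshotId"],
            AvailabilityZone=az,
            VolumeType="gp3",
        )["VolumeId"]

        ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
        ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device="/dev/sdf")
        return volume_id
    ```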

  • View profile for Lalit Chandra Trivedi

    Railway Consultant || Ex GM Railways ( Secy to Government of India’s grade ) || Chairman Rail Division India ( IMechE) || Empaneled Arbitrator - DFCC and IRCON || IEM at MSTC and Uranium Corp of India

    38,229 followers

    Navigating the Aftermath: Managing an AI-Powered Railway Post-Cyber Attack

    As artificial intelligence (AI) becomes the backbone of modern railway systems—optimizing routes, predicting maintenance, and enhancing safety—cyber threats have grown exponentially. A single attack can paralyze operations, disrupt schedules, and compromise passenger safety. Over the past five years, cyber incidents targeting railways have surged by over 220%, with cases like remote hijacking via radio frequencies in Poland (2023) and ticketing disruptions in Ukraine (2025) serving as stark reminders. Here’s a practical framework for managing an AI-driven railway system after a cyber attack.

    1️⃣ Immediate Containment – Isolate and Assess
    Once an intrusion is detected, the first step is to contain it. In AI-managed railways, this means isolating compromised systems—dispatch algorithms, predictive maintenance modules, or signaling networks—from the rest.
    Activate a Rapid Response Team: Bring together cybersecurity experts, AI engineers, and railway operations specialists to identify attack vectors—whether phishing, ransomware, or signaling manipulation.
    Eradicate the Threat: Reset credentials, patch vulnerabilities, and enforce multi-factor authentication (MFA). For AI systems, encrypt models during storage and transmission to prevent theft or tampering.
    The 2023 Polish incident, where 20 trains were halted via radio interference, proved how swift isolation minimizes damage.

    2️⃣ Recovery & Restoration – Rebuild with Resilience
    Containment alone isn’t enough; recovery demands validating both physical assets and AI model integrity.
    System Integrity Checks: Apply frameworks such as NIST CSF 2.0 to verify that automated safety functions are uncompromised before resuming operations.
    Data Recovery: Restore from secure, encrypted backups; implement zero-trust access policies.
    Business Continuity: Test disaster-recovery plans regularly, ensuring seamless switchovers to manual operations when required.
    Post-incident analysis should be mandatory—review logs, trace root causes, and update security policies, as seen in U.S. freight rail guidelines.

    3️⃣ Long-Term Prevention – Fortify the Future
    True resilience lies in learning from the breach and preventing recurrences.
    Secure-by-Design: Embed cybersecurity throughout the AI lifecycle, from data collection to deployment.
    Continuous Monitoring: Use AI itself for real-time threat detection and anomaly analysis, ensuring human oversight in decision loops (a toy example is sketched below).
    Collaborate & Comply: Follow rail-specific cybersecurity standards and share threat intelligence across the ecosystem.

    AI can be both the target and the shield—its predictive power can detect attacks faster than humans ever could, provided its training data and parameters remain uncompromised.

    #CyberSecurity #AIRailway #InfrastructureManagement #Resilience #RailSafety #AIinTransport #CriticalInfrastructure
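    To make the "Continuous Monitoring" point concrete, here is a deliberately toy sketch of anomaly detection over hypothetical signalling telemetry using scikit-learn's IsolationForest. The feature names and values are invented for illustration; a railway-grade detector would be trained and validated very differently, and flagged events still go to a human operator, as the post recommends.

    ```python
    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.default_rng(0)

    # Hypothetical telemetry features: [message_rate, command_latency_ms, auth_failures_per_min]
    normal_traffic = rng.normal(loc=[100, 20, 0.1], scale=[10, 3, 0.05], size=(1000, 3))

    detector = IsolationForest(contamination=0.01, random_state=0).fit(normal_traffic)

    def review_batch(batch: np.ndarray) -> list[int]:
        """Return indices flagged as anomalous; a human operator reviews before any action."""
        flags = detector.predict(batch)  # -1 = anomaly, 1 = normal
        return [i for i, f in enumerate(flags) if f == -1]

    suspicious = np.array([[400.0, 250.0, 5.0]])  # e.g., a burst of high-latency, unauthenticated commands
    print(review_batch(suspicious))               # expected: [0]
    ```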

  • View profile for Dianna Booher

    Hall-of-Fame Speaker. Bestselling Author. Leadership Communication & Executive Presence Expert. Book Writing & Publishing Coach. Global Gurus Top 30 Communication Experts, Marshall Goldsmith's Top 100 Coaches

    12,481 followers

    How do you prevent mayhem when crises occur that affect you and your team? Bridges collapse. Criminals mow down innocent victims. CEOs have heart attacks. Contagious diseases spread. Layoffs happen. Such crises create havoc as misinformation and fear run rampant through an organization or team.

    So what’s your part in calming the hysteria among your team? Communication. Communication that’s current, consistent, and complete.

    When I consult on handling crisis communication, I often get this question from bosses: “But how can I tell people what’s going on when we haven’t yet investigated and don’t have the facts?” That’s never an excuse for delayed communication. Be mindful that when people don’t have the facts, they tend to make them up. In a communication void, people pass on what they think, fear, or imagine. Noise.

    Keep these communication tips in mind to be part of the solution, not the noise:
    ▶ Tell what you know as soon as you know it.
    ▶ State what information you don’t have and tell people what you’re investigating.
    ▶ Stifle the urge to comment on or add to rumors, fears, and guesses.
    ▶ Communicate concern specifically to those directly affected.
    ▶ Offer tangible support when you can (time, money, acts of kindness).
    ▶ Communicate kudos to those working behind the scenes.

    Accurate, speedy communication creates relationships and cultures that build trust and encourage loyalty.

    Have you been affected by a crisis? Was it handled well or poorly? Were there outlandish rumors that circulated?

    #CrisisCommunication #LeadershipCommunication #BusinessCommunication #ProfessionalCommunication #DiannaBooher #BooherResearch

  • View profile for Kavya Wadhwa

    Bridging Nations for Nuclear Energy | LinkedIn Top Voice Global | Climate Diplomacy | Nuclear Energy, Technology, Security, and Policy

    7,905 followers

    In the event of a radiation emergency at a nuclear power plant, comprehensive emergency plans are crucial to mitigate risks, protect workers and the public, and regain control swiftly. These plans are meticulously crafted with three primary objectives.

    1. Minimizing Radiation Exposure: The foremost objective is to limit radiation exposure to levels as low as reasonably achievable (ALARA) and prevent exposures surpassing established safety limits. This involves swift and effective measures to contain and control the release of radioactivity within the plant and its vicinity. Evacuation, sheltering, and distribution of protective measures such as iodine tablets may be implemented to safeguard individuals from potential harm.

    2. Incident Understanding and Consequence Assessment: Gathering accurate information about the incident is paramount. Emergency response teams are equipped to assess the causes of the situation, employing monitoring systems and specialized equipment. This information aids in understanding the extent of the incident and evaluating potential consequences. Communication channels, both internal and external, play a critical role in disseminating information to relevant authorities, the public, and international organizations, fostering transparency and cooperation.

    3. Swift Restoration of Control: The ultimate goal is to bring the emergency situation under control as expeditiously as possible. Emergency response teams, often comprising highly trained personnel, utilize specialized equipment and protocols to stabilize the plant, contain the release, and mitigate further risks. Simultaneously, ongoing monitoring helps track the effectiveness of implemented measures. Learning from past incidents, these plans are dynamic and subject to continuous improvement, incorporating the latest technologies and best practices.

    Key Components:
    - Early Warning Systems: Rapid detection of anomalies triggers immediate response actions.
    - Evacuation and Sheltering Protocols: Defined procedures for relocating personnel and the public to safe areas.
    - Communication Strategies: Timely and transparent dissemination of information to relevant stakeholders.
    - Training and Drills: Regular exercises to ensure the readiness and effectiveness of response teams.
    - Continuous Improvement: The dynamic nature of nuclear technology necessitates ongoing review and enhancement of emergency plans. Regular drills, feedback analysis, and incorporation of lessons learned from global incidents contribute to the adaptability and resilience of these plans.

    In essence, the emergency response framework for nuclear power plants is a multifaceted system designed to prioritize safety, communication, and the swift restoration of control. While the probability of such events is low, meticulous planning and preparedness are paramount for ensuring the well-being of both workers and the public in the unlikely occurrence of a nuclear power plant emergency.

  • View profile for Omkar Sawant

    Helping Startups Grow @Google | Ex-Microsoft | IIIT-B | Data Analytics | AI & ML | Cloud Computing | DevOps

    15,002 followers

    It's a common fear for anyone in tech: the dreaded "oopsie" that wipes out your data. 😱 What if a tiny hiccup became a huge catastrophe? We've all been there, panicking after accidentally deleting a file or a critical piece of data. Now, imagine that on a massive, organizational scale. It's enough to make you lose sleep. 😴

    Did you know that 75% of businesses lose some or all of their data from downtime or a natural disaster? 📈 That's a staggering number, and it highlights a major problem: data disaster recovery isn't just a "nice-to-have," it's a "must-have."

    The Problem: The 'Oh-Crap' Moment
    👉 Disaster recovery is a lot like a fire drill. You practice it to be prepared, but you're always a little worried that the practice itself might cause a problem. For organizations, running a real disaster recovery test for their data warehouse could mean risking up to 15 minutes of data loss. 🤯
    👉 Imagine telling your CEO, "We practiced for the worst, and in doing so, we accidentally caused a small-scale disaster." It's a lose-lose situation that keeps teams from properly testing their systems and ensuring they're truly ready for an unplanned outage.

    The Solution: A Gentle Nudge, Not a Hard Shove
    👉 This is where the new soft failover feature for BigQuery Managed Disaster Recovery comes in. Instead of a "hard failover" that aggressively switches your system, soft failover is like a gentle, controlled transition.
    👉 It waits until all your data has been fully replicated to the secondary region before it promotes the datasets and compute. It's a safe, confident way to simulate a disaster without the risk of data loss. ✅

    The Benefits: Sleep Soundly at Night
    👉 Zero Data Loss: You can run disaster recovery drills without the fear of losing valuable information. This ensures your data is always safe, even when you're testing your defenses.
    👉 Boosted Confidence: Teams can confidently perform simulations to meet compliance requirements and prove their readiness.
    👉 Greater Control: This feature gives you the power to manage your disaster recovery process with precision, knowing that the system won't transition until everything is perfectly aligned.

    In a world where data is everything, having a reliable way to protect it is non-negotiable. This feature isn't just about technology; it's about giving teams the confidence they need to focus on innovation instead of worrying about a potential disaster. Disaster recovery should be a smooth, stress-free process, not a gamble. BigQuery's new soft failover feature is a big step towards that reality, making it easier for businesses to test their readiness and protect their most valuable asset: their data. It's about being prepared, not paranoid. Stay safe out there! 🛡️

    #DataAnalytics #BigQuery #DisasterRecovery #CloudComputing #GoogleCloud #DataProtection #TechTrends
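    The post does not show the exact BigQuery commands, so the sketch below deliberately avoids naming real API calls; the helper functions are hypothetical placeholders. It only illustrates the control flow that makes a failover "soft": promotion is blocked until replication has fully caught up, whereas a hard failover promotes immediately and accepts whatever lag remains (the "up to 15 minutes" risk mentioned above).

    ```python
    import time

    # Hypothetical helpers standing in for the real BigQuery reservation/replication
    # operations; this sketch only illustrates the ordering guarantee of a soft failover.

    def replication_lag_seconds(dataset: str) -> float:
        """Placeholder: how far the secondary region is behind the primary."""
        raise NotImplementedError

    def promote_secondary(dataset: str) -> None:
        """Placeholder: promote the secondary region's datasets and compute."""
        raise NotImplementedError

    def soft_failover(dataset: str, poll_interval: float = 10.0) -> None:
        """Promote the secondary only once it is fully caught up, i.e. zero data loss.

        A hard failover would call promote_secondary() immediately and accept
        whatever replication lag remains at that moment.
        """
        while replication_lag_seconds(dataset) > 0:
            time.sleep(poll_interval)
        promote_secondary(dataset)
    ```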

  • View profile for Dr Fatemeh Rezazadeh

    Energy & Infrastructure Executive | Capital Structuring & Strategic Advisory | Board Advisor | Executing Cross-Border M&A Transactions & Investment Strategy

    3,700 followers

    There was enough power, but there wasn’t enough resilience.

    Last week’s Heathrow shutdown wasn’t just a power outage—it was an exposure. A transformer fire at the North Hyde substation took out electricity to the world’s second-busiest airport. The ripple effects were felt across global aviation, supply chains, and headlines.

    John Pettigrew, CEO of National Grid, says the other two substations serving Heathrow had enough capacity to keep the airport running. So why the closure? Because operational resilience isn’t just about capacity—it’s about design, systems, decision-making, and time. Heathrow’s CEO explained that they had to shut down thousands of systems and methodically reboot them to ensure safety. Backup generators existed—but only to cover critical safety systems, not full operations. Switching to alternate substations wasn’t instantaneous; reconfiguring and restoring took hours.

    This is a classic example of design resilience vs. lived resilience. We often assume that having backup available is enough. But in complex systems—airports, hospitals, data centers—it’s how quickly and safely that backup can be activated that defines true resilience.

    Other major airports have made resilience a priority:
    - JFK, New York – 110 MW gas-fired CHP plant enabling full microgrid operation during outages.
    - Frankfurt Airport – Redundant grid feeds, on-site gas turbine generation, and UPS systems.
    - Amsterdam Schiphol – Integrated energy management system with diesel and battery backup for essential systems.
    - Changi Airport, Singapore – Multiple grid connections, standby diesel generation, and automated switchgear.
    - Incheon International, South Korea – Dual-feed substations, backup diesel generators, and smart grid control.

    These airports understand that resilience isn’t a luxury—it’s a license to operate.

    This is the future of energy for critical infrastructure:
    - Decentralized
    - Redundant
    - Fast-switching
    - Integrated with grid and on-site systems.

    If Heathrow—despite being served by three substations—could still go dark for nearly 24 hours, the question isn’t who to blame. It’s what to build differently. Are we designing our infrastructure for availability, or for agility? Are we investing in energy systems that can recover, or just survive?

    Let’s make sure this isn’t just a red flag—it’s a redirection.

    #EnergyResilience #InfrastructureLeadership #FutureOfPower #CriticalInfrastructure #Heathrow #GridSecurity #Digitalisation #Electrification

  • View profile for Ismail Orhan, CISSO, CTFI, CCII

    CISO @ASEE | Cybersecurity Leader of the Year 2025 🏆 | HBR Contributor | Published Author | Thought Leader | International Keynote Speaker

    19,078 followers

    🔐 Incident Response is not just a procedure; it is a discipline.

    🌍 My journey:
    🎖️ Armed Forces — active roles in cyber defense operations
    ✈️ US Air Force CyberPatriot — a distinguished graduate and award winner
    🛡️ Defense industry — projects built on discipline and decision-making cycles
    🏦 Private sector Head of Cyber Security — leading critical payment systems security

    ✨ All converge on one point: End-to-End Incident Management.

    📑 The framework I share today blends the methodology of NIST SP 800-61 Rev. 3 with military decision doctrines (OODA, MDMP, AAR).

    ⚡ Speed + Discipline + Authority = Successful Response

    ⏱️ Every minute counts. Preparation, detection, decision, execution, recovery, and lessons learned… Just like on the battlefield 🪖, this cycle must operate seamlessly in cyberspace 🌐 as well.

    📘 This approach goes beyond the technical—it integrates corporate risk, regulatory requirements, and reputation management.

    👉 Without a truly command-centered Incident Response culture, resilience remains an illusion.

    #IncidentResponse #CyberSecurity #MilitaryDiscipline #DefenseIndustry #CyberPatriot #NIST #OODA #MDMP #AAR #Leadership #QuantumSecurity

  • View profile for AD E.

    GRC Visionary | Cybersecurity & Data Privacy | AI Governance | Pioneering AI-Driven Risk Management and Compliance Excellence

    10,140 followers

    You’re the newly hired Compliance Lead at a fast-growing tech startup. Two weeks into your role, you discover that the company has no formal incident response plan in place, even though it recently experienced a ransomware attack. Leadership is concerned but doesn’t know where to begin, and employees are confused about their roles during an incident. Your CEO asks you to draft a basic Incident Response Framework and outline the top 3 immediate steps the company should take to prepare for future incidents.

    - What would your first draft framework include? (Hint: Think of NIST’s Incident Response Lifecycle – preparation, detection, analysis, containment, eradication, and recovery.)
    - How would you ensure team alignment across IT, legal, and operations? (Hint: Consider regular tabletop exercises, clear role definitions, and a central incident communication channel.)
    - What tools or processes would you recommend to track and report incidents effectively? (Hint: Look at tools like Splunk for monitoring, Jira for tracking, and SOAR platforms for automation.)
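    One way to ground the tracking question: encode the lifecycle phases from the first hint as an explicit incident record that can be reported on. The Python sketch below is illustrative only; the field names are assumptions, and in practice the record would live in Jira or a SOAR platform, as the third hint suggests.

    ```python
    from dataclasses import dataclass, field
    from datetime import datetime
    from enum import Enum

    class Phase(Enum):
        # NIST-style incident response lifecycle phases from the hint above,
        # plus the post-incident review step
        PREPARATION = "preparation"
        DETECTION = "detection"
        ANALYSIS = "analysis"
        CONTAINMENT = "containment"
        ERADICATION = "eradication"
        RECOVERY = "recovery"
        LESSONS_LEARNED = "lessons_learned"

    @dataclass
    class Incident:
        """Illustrative incident record; real tracking would use a ticketing/SOAR tool."""
        title: str
        severity: str
        phase: Phase = Phase.DETECTION
        owner: str = "unassigned"
        timeline: list[tuple[datetime, str]] = field(default_factory=list)

        def advance(self, phase: Phase, note: str) -> None:
            """Move to the next lifecycle phase and keep an auditable timeline entry."""
            self.phase = phase
            self.timeline.append((datetime.now(), f"{phase.value}: {note}"))

    inc = Incident(title="Ransomware on file server", severity="high", owner="IT on-call")
    inc.advance(Phase.CONTAINMENT, "isolated host, disabled VPN account")
    ```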
