One mistake I see in cloud migrations over and over: teams lift-and-shift apps into cloud VMs and then wonder why costs and security get worse. Here's what actually matters.

In many cases, people do just a lift-and-shift. They pick up an application, figure out a way to run it in a VM, move it to the cloud, and forget it. But unless they do a business impact assessment and a technology architecture review, they’re probably going to end up with higher costs, worse security, and an application that may not meet current business needs.

I’ve done it myself. When I was CIO at Microsoft, my goal was to get out of our data centers and move as much as I could to Microsoft Azure. Some apps we re-architected. Others we just parked in VMs and said, “We’ll get to it next year.” That wasn’t optimal, but at the time it let us vacate data centers and retire old equipment.

The problem isn’t that lift-and-shift is generally wrong. The problem is leaders treating it as the default strategy instead of a short-term fix with a plan behind it.

Here’s how I’ve learned to approach it:

1. Inventory and prioritize your apps. Not everything is mission-critical, but not everything should be punted either.
2. Do a business impact assessment and a technology architecture review. Without these, you risk higher costs and worse outcomes.
3. Use specialists, not generalists. Moving apps is like redoing plumbing. You can ask me to figure it out, but it’ll take me three times as long, and it won’t look good. And if the work isn’t done properly, it can lead to disaster. It’s worth having people who know the craft.
4. Allow temporary “parking” in VMs, but set a clear timeline to revisit, and then force a decision to terminate or optimize for cloud.

Short-term convenience without discipline creates long-term debt. Costs rise, risks increase, and you may be burdened with apps that don’t do what the business needs.
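One way to keep "parking" honest is to record a review deadline for every parked app and check it regularly. The sketch below is only an illustration of that idea: the `App` record, the disposition labels, and the sample inventory are all hypothetical, not something taken from the post.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class App:
    name: str
    disposition: str                  # hypothetical labels: "rearchitect", "parked", "retire"
    review_by: Optional[date] = None  # deadline to revisit a parked app


def parked_without_deadline(apps: List[App]) -> List[str]:
    """Parked apps that nobody has committed to revisiting."""
    return [a.name for a in apps if a.disposition == "parked" and a.review_by is None]


def overdue_reviews(apps: List[App], today: date) -> List[str]:
    """Parked apps whose review deadline has passed: optimize or terminate."""
    return [
        a.name
        for a in apps
        if a.disposition == "parked" and a.review_by is not None and a.review_by < today
    ]


# Made-up inventory for illustration only.
inventory = [
    App("billing", "rearchitect"),
    App("hr-portal", "parked", review_by=date(2024, 6, 30)),
    App("legacy-reports", "parked"),
]

print(parked_without_deadline(inventory))            # ['legacy-reports']
print(overdue_reviews(inventory, date(2024, 9, 1)))  # ['hr-portal']
```

Anything the first function returns has no plan behind it, and anything the second returns is past the point where a decision was promised.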
Troubleshooting Common Cloud Migration Issues
Summary
Troubleshooting common cloud migration issues means identifying and solving problems that occur when moving applications and data from traditional servers to the cloud. These challenges can range from technical hiccups to security concerns and cost surprises, especially if migrations aren’t planned carefully.
- Assess risk early: Build a detailed risk register and plan for possible failures before starting your migration to avoid unexpected outages.
- Monitor performance: Regularly check resource usage, latency, and costs after migration to spot any issues before they impact users or your budget.
- Review configurations: Double-check permissions, settings, and dependencies to prevent security gaps and broken deployments in your new cloud setup.
🚨 NEW ARTICLE ALERT: How We Managed 40+ Infrastructure Risks During a Cloud Migration
(And why planning for failure saved the entire project.)

Have you ever led a project where a single outage could bring everything to a halt? Where shipping, invoicing, and customer portals were all riding on fragile legacy systems?

This edition of The PM Playbook breaks down how we migrated core systems to the cloud without causing chaos. With 600 employees and a live production environment, we didn’t have the luxury of “figuring it out later.”

Here’s what we were up against:
➝ A 90-day timeline with zero margin for error
➝ Legacy systems with undocumented dependencies
➝ Vendors, data risks, and real-time operations under pressure

Here’s how we managed the risk:
✅ Created a living risk register with 40+ tracked scenarios
✅ Simulated outages with a Red Team before go-live
✅ Designed rollback paths for every migration step

What you’ll learn:
→ How to make risk planning the core of your migration strategy
→ Why real-time simulations beat assumptions every time
→ How to coordinate vendors around failure planning
→ How to deliver under pressure without losing control

We’re also including:
🧠 The risk categories you need to track during cloud migrations
📊 How we resolved live issues in under 2 hours
🚀 Lessons you can apply to any system transition under pressure

If you’ve ever lost sleep over infrastructure risks, this one’s for you.

👉 READ THE FULL ARTICLE NOW and drop a comment: What’s the smartest move you’ve made to manage infrastructure risk?

2 Disgruntled PMs Podcast
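The post does not show how its rollback paths were built, so the following is only a minimal sketch of the general idea, assuming each migration step is paired with an undo action. The `Step` type, `run_migration`, and the sample step names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    apply: Callable[[], None]
    rollback: Callable[[], None]


def run_migration(steps: List[Step]) -> bool:
    """Apply steps in order; on failure, undo completed steps in reverse order."""
    completed: List[Step] = []
    for step in steps:
        try:
            step.apply()
        except Exception as exc:
            print(f"{step.name} failed ({exc}); rolling back")
            for done in reversed(completed):
                done.rollback()
            return False
        completed.append(step)
    return True


# Illustration: the third step fails, so the first two are undone.
log: List[str] = []


def fail_cutover() -> None:
    raise RuntimeError("health check failed")


steps = [
    Step("copy-data", lambda: log.append("copy"), lambda: log.append("undo copy")),
    Step("switch-dns", lambda: log.append("dns"), lambda: log.append("undo dns")),
    Step("cutover", fail_cutover, lambda: log.append("undo cutover")),
]

ok = run_migration(steps)
print(ok, log)  # False ['copy', 'dns', 'undo dns', 'undo copy']
```

In this sketch the failed step's own rollback is not called, on the assumption that a step which raises has left nothing behind. A step that can fail halfway would need to clean up after itself.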
10 Cloud DevOps troubleshooting scenarios you can't skip (and their resolution strategies)

1. Diagnosing High Latency in a Cloud-Native Application (Performance)
→ Check your cloud provider's monitoring dashboard or Grafana metrics
→ Analyze API Gateway latency (if it's part of your app)
→ Inspect database queries and response times
Note: Begin with metric analysis before log investigation

2. Kubernetes Pod in CrashLoopBackOff
→ Run kubectl logs <pod> for error messages
→ Use kubectl describe pod to check events
→ Validate environment variables, image version, and resource limits
Note: Misconfigurations and missing dependencies are common causes

3. Broken CI/CD Pipeline
→ Review pipeline logs (GitHub Actions, Jenkins, etc.)
→ Validate secrets, tokens, and environment variables
→ Check for failed dependencies or syntax errors
Note: Testing workflows locally helps catch silent failures

4. Publicly Exposed Storage Bucket (e.g., S3, GCS)
→ Audit bucket permissions and IAM policies
→ Block public access and review access control lists
→ Enable encryption, and turn on logging for monitoring
Note: Always follow least-privilege access principles

5. Terraform Apply Failure
→ Review error messages for plan/apply mismatches
→ Check state file locks, syntax errors, or version conflicts
→ Validate changes before applying
Note: Always run terraform plan to preview updates

6. Failed Kubernetes (e.g., EKS, AKS, or GKE) Deployment
→ Validate Helm chart values and image tags
→ Check node availability, taints, and resource limits
→ Use kubectl get events for insights
Note: Misconfigured YAML is a frequent root cause

7. Unexpected Cloud Cost Spike
→ Use the billing dashboard and cost explorer
→ Identify idle or over-provisioned resources (compute, volumes, load balancers)
→ Review autoscaling settings and storage tiers
Note: Set alerts and budgets to catch anomalies early

8. Broken Blue-Green Deployment
→ Verify routing in the load balancer or DNS
→ Check application health on the green environment
→ Ensure environment variables and secrets match
Note: Always test green thoroughly before rerouting traffic

There are way more real-world scenarios than what I’ve shared here (plus, I’ve hit the character limit on LinkedIn 😅), so I’m putting together a list of Cloud DevOps troubleshooting cases I’ve come across in today’s newsletter. Subscribe here to get it in your inbox when it’s live: https://lnkd.in/dBNJPv9U

If you found this helpful, follow me (Vishakha Sadhwani) for more Cloud & DevOps insights through my newsletter, and feel free to share it so others can learn too!
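For the exposed-bucket scenario above, a first-pass audit can be sketched as a scan of a bucket policy for `Allow` statements with a wildcard principal. This assumes an AWS-style JSON policy document already loaded into a dict. The function names and the sample policy are hypothetical, and a hit is only a candidate, since a `Condition` on the statement may still narrow access.

```python
from typing import Any, Dict, List


def _is_wildcard(principal: Any) -> bool:
    """True for "*" or for a principal map such as {"AWS": "*"} or {"AWS": ["*"]}."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        for value in principal.values():
            values = value if isinstance(value, list) else [value]
            if "*" in values:
                return True
    return False


def public_statements(policy: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return Allow statements open to any principal (candidates for manual review)."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    return [
        s for s in statements
        if s.get("Effect") == "Allow" and _is_wildcard(s.get("Principal"))
    ]


# Made-up policy for illustration only.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicRead",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
        {
            "Sid": "TeamWrite",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
    ],
}

print([s["Sid"] for s in public_statements(policy)])  # ['PublicRead']
```

This only reads the policy document. Access control lists and account-level public access settings are separate controls and would need their own checks.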