It is a Friday night cutover. Your team has waited six months for this weekend. The new cloud environment is provisioned, the DNS records are ready, and the old data centre is meant to go quiet by Monday morning.
At 02:00, the database sync stalls. Three terabytes have moved. Eight terabytes have not.
By Saturday afternoon, the operations team is on calls. Customer logins are failing. Half the traffic still points at the legacy environment, and the other half is hitting a half-migrated database.
This is the version of cloud migration nobody puts in a case study. It happens far more often than vendors admit. The aim here is to show you the failure modes, then show you the top strategies for minimising downtime during cloud migration so the weekend cutover does not become a quarter-long recovery.
What a Failed Cloud Migration Actually Looks Like
The scene above is composite, but the numbers behind it are not.
Gartner research, widely cited in the IT press, shows that 83% of data migration projects fail outright or finish late and over budget. That is not a fringe statistic. It is the base rate.
Downtime turns the same project into a revenue event. ITIC's 2024 Hourly Cost of Downtime survey found that 97% of large enterprises lose at least $100,000 per hour of outage, and 41% lose between $1 million and $5 million per hour.
For UK mid-market businesses, the figure runs in the hundreds of pounds per minute even before brand damage. The cutover is the visible failure. The hidden one is the eight weeks before, where nobody pressure-tested the rollback plan.
We have seen the same pattern across retail, fintech, and healthcare clients: nobody put the source database into read-only mode during the final sync, and every late write became a reconciliation ticket on Monday.
The downtime itself rarely lasts a single weekend. It bleeds into the following week as your team patches data drift, reconciles half-finished transactions, and explains to customers why their order history is incomplete.
The good news: every one of those failures is preventable. The bad news: most teams do not learn that until the second migration.
Why Most Cloud Migrations Bleed Downtime
Nine times out of ten, the problem is not the cloud platform. It is the assumptions baked into the migration plan before anyone touched a console.
Undocumented dependencies
Most production systems have at least one batch job, cron task, or third-party integration that nobody on the migration team remembers.
We have seen migrations stall because a finance reconciliation script ran against a hardcoded internal IP that no longer existed in the new VPC. The cure is a full dependency audit, not a guess.
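One small part of that audit can be automated. The sketch below is a minimal illustration, not a complete audit tool: it scans script or config text for hardcoded private IPv4 addresses using Python's standard `re` and `ipaddress` modules. The function name `find_hardcoded_private_ips` and the sample script are hypothetical.

```python
import ipaddress
import re

# Loose pattern for dotted-quad candidates; ipaddress does the real validation.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_hardcoded_private_ips(text):
    """Return (line_number, address) pairs for private IPv4 literals in text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for candidate in IPV4_CANDIDATE.findall(line):
            try:
                address = ipaddress.ip_address(candidate)
            except ValueError:
                continue  # not a valid IPv4 address, e.g. 999.1.1.1
            # is_private is True for RFC 1918 ranges such as 10.0.0.0/8.
            if address.is_private:
                hits.append((line_number, candidate))
    return hits


# Hypothetical reconciliation script with one hardcoded internal address.
sample_script = """\
# nightly finance reconciliation
DB_HOST = "10.20.30.40"
API_URL = "https://example.com/api"
"""

for line_number, address in find_hardcoded_private_ips(sample_script):
    print(f"line {line_number}: hardcoded private address {address}")
```

A scan like this only finds literal addresses. Hostnames, environment variables, and third-party callbacks still need a manual review.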
An untested rollback plan
Teams write a rollback plan and never run it. When the cutover stalls at 02:00, the rollback is the first thing to fail.
McKinsey's 2021 cloud migration study reported that 75% of cloud migrations ran over budget and 38% over schedule. Untested rollbacks are a major driver.
An oversized cutover window
If your plan is to move everything in one weekend, the plan is the risk.
Atlassian's own migration guidance makes the same point for SaaS transitions: break the migration into smaller batches, and migrate attachments and stale data weeks before the final cutover.
No read-only mode on the source system
Without read-only mode, users keep writing to the legacy system while the final sync is running. Every write is a data drift event that the cutover has to reconcile.
Most teams skip this because the business does not want to lose any transaction window. The cost is downtime later.
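To make the drift concrete, here is a minimal sketch that counts the writes a cutover would have to reconcile. It assumes you can export a write log with a timestamp per write. The `late_writes` function, the record layout, and the dates are all hypothetical.

```python
from datetime import datetime


def late_writes(write_log, sync_started_at):
    """Return writes that landed on the source after the final sync began."""
    return [write for write in write_log if write["written_at"] > sync_started_at]


# Hypothetical write log: the final sync started at 02:00.
sync_started_at = datetime(2024, 1, 6, 2, 0)
write_log = [
    {"table": "orders", "id": 101, "written_at": datetime(2024, 1, 6, 1, 45)},
    {"table": "orders", "id": 102, "written_at": datetime(2024, 1, 6, 2, 10)},
    {"table": "payments", "id": 77, "written_at": datetime(2024, 1, 6, 3, 30)},
]

drift = late_writes(write_log, sync_started_at)
print(f"{len(drift)} writes to reconcile")
```

With the source in read-only mode for the final sync, that list is empty by construction.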
These four causes account for the bulk of failed cloud migrations we are called in to recover. The next question is what good preparation actually looks like.
What Good Preparation Looks Like Versus What Most Teams Actually Do
Most teams treat preparation as a planning document. Good preparation is a sequence of dress rehearsals on real data.
A well-run cloud migration runs a full dry-run cutover against a staging environment that mirrors production, including the same data volume and the same dependencies. Not a subset. The full thing.
We have seen migrations that worked on a 100 GB test set fall over on the real 8 TB database because the index build took 14 hours, not two.
The same goes for monitoring. New Relic's database migration guidance recommends instrumenting both source and target throughout the cutover so the team can see drift in real time.
Most teams set up monitoring on the new environment only. That is half a picture, and the half you cannot see is the one that breaks the cutover.
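A basic two-sided check can be as simple as comparing row counts on both systems. The sketch below uses two in-memory SQLite databases as stand-ins for the source and target; a real migration would use the drivers for its own databases. The function names and table names are hypothetical.

```python
import sqlite3


def table_row_counts(connection, tables):
    """Return {table: row_count} for each named table."""
    counts = {}
    for table in tables:
        # Table names come from a trusted, hardcoded list in this sketch.
        row = connection.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
        counts[table] = row[0]
    return counts


def row_count_drift(source, target, tables):
    """Return {table: source_count - target_count} for tables that differ."""
    source_counts = table_row_counts(source, tables)
    target_counts = table_row_counts(target, tables)
    return {
        table: source_counts[table] - target_counts[table]
        for table in tables
        if source_counts[table] != target_counts[table]
    }


# Stand-in databases: the target is missing one order.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (source, target):
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
    db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
source.executemany("INSERT INTO orders (id) VALUES (?)", [(1,), (2,), (3,)])
target.executemany("INSERT INTO orders (id) VALUES (?)", [(1,), (2,)])
source.execute("INSERT INTO customers (id) VALUES (1)")
target.execute("INSERT INTO customers (id) VALUES (1)")

print(row_count_drift(source, target, ["orders", "customers"]))
```

Row counts catch missing rows, not changed ones. Checksums per table or per batch are the usual next step.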
Good preparation also includes a written communications plan, agreed with the business before cutover weekend.
Who tells customers, when, and what is the fallback message if the cutover slips by twelve hours? If this is decided on the Saturday morning of the cutover, you have already lost.
The pattern that separates well-run migrations from troubled ones is rehearsal, not strategy. The strategy can be borrowed. The rehearsal cannot.
The Decision Points Where Cloud Migrations Go Off Track
There are three decisions that, once made badly, are very hard to recover from. Most projects do not realise they have made them.
Lift-and-shift versus re-platforming
Lift-and-shift is cheaper on day one and almost always more expensive on day 365.
Gartner has reported that 80% of organisations without a formal cloud migration strategy overspend their budgets by 20 to 50%. A lift-and-shift that runs the same monolith on cloud VMs inherits all of the old cost problems, plus a new bill.
Big-bang versus phased
Big-bang cutovers compress every risk into one weekend. Phased cutovers spread risk across weeks but require parallel running, which the finance team often refuses to fund.
The right answer depends on the business. The wrong answer is to default to big-bang because it is easier to project-plan.
Internal team versus partner
Internal teams know the systems. Partners know the migration patterns. The combination wins, but only if the partner is named on the cutover runbook, not just on the contract.
We have seen partners disappear from Slack on the Saturday of the cutover because their statement of work ended at the go-live boundary.
These three decisions determine whether the strategies in the next section can work. Get them right, and the strategies land. Get them wrong, and no amount of tooling will save the cutover.
Top Strategies for Minimising Downtime During Cloud Migration
These are the strategies that consistently deliver clean cutovers. None of them is new. All of them are skipped at least once on most projects.
Run a full dress rehearsal in a staging environment
Spin up a staging environment that matches production in size, not just structure. Run the full cutover sequence end to end, including the rollback. Time every step.
We aim for at least two full rehearsals before the production cutover, and we will not sign off go-live until the rehearsal completes inside the agreed window.
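Timing every step is easy to automate. The following is a minimal sketch of a rehearsal harness, assuming each runbook step can be wrapped in a Python callable. The step names, the placeholder `time.sleep` calls, and the five-second window are hypothetical stand-ins for real work and a real agreed window.

```python
import time


def run_rehearsal(steps, window_seconds):
    """Run each (name, action) step in order and time it.

    Returns (timings, total_seconds, within_window).
    """
    timings = []
    for name, action in steps:
        started = time.perf_counter()
        action()
        timings.append((name, time.perf_counter() - started))
    total = sum(duration for _, duration in timings)
    return timings, total, total <= window_seconds


# Hypothetical runbook steps; the sleeps stand in for the real work.
steps = [
    ("final sync", lambda: time.sleep(0.02)),
    ("traffic switch", lambda: time.sleep(0.01)),
    ("rollback", lambda: time.sleep(0.01)),
]

timings, total, within_window = run_rehearsal(steps, window_seconds=5.0)
for name, duration in timings:
    print(f"{name}: {duration:.2f}s")
print("inside agreed window" if within_window else "over the agreed window")
```

Keeping the per-step timings from each rehearsal gives you a baseline to compare against on the night.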
Move data in waves, not in one cutover
Pre-migrate cold data, archives, and attachments weeks before the cutover.
Atlassian's migration documentation puts this clearly: attachment migration is usually the longest single phase, and moving it ahead of time cuts the final cutover window dramatically. The same logic applies to historical transaction data, log archives, and BI extracts.
Use parallel running with read replicas or master-master sync
Run the legacy and cloud databases in parallel, with continuous replication between them.
New Relic's database migration guidance describes three options: master/read-replica switch, master/master bidirectional sync, and offline copy. Master/master is the most complex and the lowest-downtime choice, but it requires careful conflict resolution.
Choose the strategy that matches your tolerance for both downtime and data conflict.
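To show what conflict resolution means in practice, here is a minimal sketch of one common policy, last-write-wins. This is our illustration, not something the New Relic guidance prescribes, and real replication tools implement conflict handling themselves. The row layout and the integer `updated_at` values are hypothetical.

```python
def merge_last_write_wins(legacy_rows, cloud_rows):
    """Merge two {key: row} maps, keeping the row with the later updated_at.

    Returns (merged_rows, conflicting_keys). On a timestamp tie the legacy
    row is kept.
    """
    merged = dict(legacy_rows)
    conflicts = []
    for key, cloud_row in cloud_rows.items():
        legacy_row = merged.get(key)
        if legacy_row is None:
            merged[key] = cloud_row  # only exists on the cloud side
            continue
        if legacy_row["value"] != cloud_row["value"]:
            conflicts.append(key)  # both sides changed the same record
        if cloud_row["updated_at"] > legacy_row["updated_at"]:
            merged[key] = cloud_row
    return merged, conflicts


# Hypothetical rows written on both sides during parallel running.
legacy = {
    "order-1": {"value": "shipped", "updated_at": 100},
    "order-2": {"value": "pending", "updated_at": 120},
}
cloud = {
    "order-1": {"value": "cancelled", "updated_at": 110},
    "order-2": {"value": "pending", "updated_at": 120},
    "order-3": {"value": "new", "updated_at": 130},
}

merged, conflicts = merge_last_write_wins(legacy, cloud)
print(merged["order-1"]["value"], conflicts)
```

Last-write-wins silently discards the losing write, which is why the conflict list matters: someone has to decide whether "shipped" or "cancelled" was the right answer.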
Switch traffic with DNS or blue-green deployment
Blue-green deployment keeps two production environments live and switches traffic between them at the load balancer. If the new environment misbehaves, traffic is switched back inside a minute.
DNS-based switching is simpler but slower, because TTL caching can leave some users on the old environment for hours. For low-downtime cutovers, blue-green at the load balancer is the cleanest answer.
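The switch-and-revert logic behind blue-green can be sketched in a few lines. This is a toy model under stated assumptions, not a load balancer integration: the `BlueGreenSwitch` class and the `is_healthy` callback are hypothetical, and a real setup would call your load balancer's own API.

```python
class BlueGreenSwitch:
    """Toy model of a load balancer pointing at one of two environments."""

    def __init__(self, active="blue"):
        self.active = active
        self.previous = None

    def cut_over(self, new_environment, is_healthy):
        """Point traffic at new_environment; revert if the health check fails."""
        self.previous, self.active = self.active, new_environment
        if not is_healthy(new_environment):
            self.roll_back()
            return False
        return True

    def roll_back(self):
        """Send traffic back to the environment that was live before the switch."""
        if self.previous is not None:
            self.active, self.previous = self.previous, None


# A cutover where the new environment fails its health check.
switch = BlueGreenSwitch(active="blue")
succeeded = switch.cut_over("green", is_healthy=lambda environment: False)
print(succeeded, switch.active)
```

The old environment stays untouched until the new one has proven itself, which is what makes the revert fast.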
Pre-migrate users, attachments, and stale data
User and group migration can usually happen with zero downtime, weeks before the data cutover. This removes a whole class of access problems from the cutover weekend.
The same applies to stale data: filter out anything not updated in 90 days and migrate it on its own schedule.
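The 90-day filter is straightforward to express in code. The sketch below splits records into an early wave and a final wave by last-updated date. The `split_into_waves` function, the record layout, and the dates are hypothetical.

```python
from datetime import datetime, timedelta


def split_into_waves(records, cutover_date, stale_after_days=90):
    """Split records into an early wave (stale) and a final wave (recent)."""
    cutoff = cutover_date - timedelta(days=stale_after_days)
    early_wave = [r for r in records if r["last_updated"] < cutoff]
    final_wave = [r for r in records if r["last_updated"] >= cutoff]
    return early_wave, final_wave


# Hypothetical records with a last-updated timestamp.
records = [
    {"id": 1, "last_updated": datetime(2023, 3, 1)},
    {"id": 2, "last_updated": datetime(2024, 5, 20)},
    {"id": 3, "last_updated": datetime(2024, 1, 15)},
]

early_wave, final_wave = split_into_waves(records, cutover_date=datetime(2024, 6, 1))
print([r["id"] for r in early_wave], [r["id"] for r in final_wave])
```

Anything in the early wave that is touched again before the cutover has to be re-synced, so the split is worth re-running close to the weekend.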
Put the source system into read-only mode during the final sync
This is the single most-skipped technique in cloud migration.
Without read-only mode, users keep writing to the legacy system while the final sync is running, and the cutover has to reconcile every write that lands after the sync started.
Read-only mode is the difference between a one-hour cutover and a six-hour one. The business pushback is real, but the alternative is worse.
Together, these strategies cover the difference between a migration that ships clean and one that bleeds into Monday. The next question is who you trust to run them.
How to Evaluate a Migration Partner to Reduce Risk
The partner you choose owns more of the risk than the platform you choose. The best help in avoiding downtime during a cloud migration comes from partners who have run the migration before, not just sold one.
Ask the partner to walk you through a previous migration in your industry. Not a sales deck. The runbook.
If they cannot show you a runbook with named owners, timing windows, and rollback steps, they have not run enough migrations to be safe.
Confirm certifications. For UK businesses handling regulated data, ISO 27001 and Cyber Essentials are baseline.
For healthcare and fintech workloads, the partner should also walk you through their data residency and DPIA process.
Confirm IP ownership in the statement of work. The migration runbook, infrastructure-as-code repositories, and configuration documentation all need to transfer to you at handover.
We have seen clients arrive at year two of cloud operations to find that the partner still owns the Terraform repository.
Confirm the NDA is signed before any production data, architecture diagrams, or customer schemas are shared. This is a basic professional standard. If the partner pushes back, choose a different partner.
Confirm the rollback plan is inside the statement of work, not a best-effort add-on. A partner who treats rollback as out-of-scope is a partner who has never had a cutover fail.
The fee for including rollback is far smaller than the cost of needing it without a plan. With a partner chosen and qualified, the last gate before cutover weekend is the pre-commit checklist.
A Pre-Commit Checklist You Should Sign Off Before the Weekend
Sign off every item below before the cutover begins. If any item is unchecked, postpone the cutover. The cost of a postponement is far smaller than the cost of a failed weekend.
Full staging-environment dress rehearsal completed within the agreed window.
Rollback plan tested end to end, with timing documented.
Source system read-only mode tested.
Business communications plan agreed.
Stale data, attachments, and user accounts pre-migrated.
Monitoring active on both source and target environments, with named on-call engineers.
Cutover runbook circulated to every named owner with timing windows and decision gates.
Customer communications drafted and approved for three scenarios: clean cutover, twelve-hour slip, rollback to source.
Partner and internal team on a shared Slack or Teams channel for the full cutover window.
DNS TTL reduced 48 hours in advance so a traffic switch lands cleanly.
Post-cutover validation script ready to run inside the first hour, covering critical user journeys.
If your team can sign off every line above, the weekend cutover is in good shape. If any line is unchecked, the weekend will cost more than the postponement would have.
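The all-or-nothing rule can be encoded as a simple gate. This is a minimal sketch, assuming the checklist is tracked as item names mapped to a signed-off flag. The `go_no_go` function and the abbreviated item names are hypothetical.

```python
def go_no_go(checklist):
    """Return ("GO", []) if every item is signed off, else ("POSTPONE", open_items)."""
    open_items = [item for item, signed_off in checklist.items() if not signed_off]
    decision = "GO" if not open_items else "POSTPONE"
    return decision, open_items


# Abbreviated, hypothetical sign-off state on the Friday of the cutover.
checklist = {
    "dress rehearsal inside window": True,
    "rollback tested end to end": False,
    "read-only mode tested": True,
    "DNS TTL reduced": True,
}

decision, open_items = go_no_go(checklist)
print(decision, open_items)
```

There is deliberately no partial-credit path: one open item is enough to postpone.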
Conclusion
The headline takeaway is short. Cloud migrations fail because of the eight weeks before the cutover, not the cutover itself.
The top strategies for minimising downtime during cloud migration are not exotic. They are dress rehearsals, phased data moves, parallel running, read-only mode, and a partner who treats rollback as in-scope.
If you are planning a cloud migration and want a partner who runs the rehearsals before quoting the cutover window, our IT infrastructure team has done this across regulated and high-traffic UK workloads. We will show you the runbook before you sign anything.
FAQs
What is the best way to minimise downtime during cloud migration?
The single most reliable technique is to run a full dress rehearsal in a staging environment that matches production in size, then put the source system into read-only mode for the final sync. Together these two steps remove most of the data drift and reconciliation work. Phased data moves and blue-green traffic switching reinforce the result.
How long does a cloud migration usually take?
End-to-end timelines depend on data volume and integration complexity. A standard UK mid-market migration runs 12 to 24 weeks from discovery to go-live. The final cutover window itself is typically one weekend for well-prepared phased migrations, longer for big-bang cutovers.
What is the difference between lift-and-shift and re-platforming?
Lift-and-shift moves existing servers to cloud VMs without changing the application. Re-platforming refactors the application to use managed cloud services such as managed databases, object storage, and serverless functions. Re-platforming costs more on day one and far less on day 365.
Can you migrate to the cloud with zero downtime?
True zero downtime is possible, but only with master-master database replication, blue-green traffic switching, and a fully decoupled application architecture. For most UK mid-market businesses the practical target is near-zero downtime: a cutover window measured in minutes rather than hours.
Who is responsible if a cloud migration causes downtime?
Responsibility sits with whoever signed off the cutover runbook. In practice that is usually the CTO or Head of IT internally and the partner's named delivery lead externally. The statement of work should make this explicit, including the rollback obligation.