How often should cloud server snapshots be backed up? How many versions should be retained for optimal performance?
Time : 2026-05-17 10:30:16
Edit : Jtti

  Taking a snapshot is a small task, but not a trivial one. Many website owners, new IT staff, and even some veterans get two things wrong: snapshots taken too often send storage costs soaring, while snapshots taken too rarely leave you with regrets when something breaks. Opinions on how many versions to retain vary just as widely: some keep three days' worth, some keep a month's worth, and some never think about snapshots at all until the bill arrives.

  I. First, Understand: What Exactly Do Snapshots Protect?

  Before discussing frequency and the number of versions retained, one thing needs to be clear: snapshots are not primarily a defense against hardware failure. Cloud providers' distributed storage already keeps multiple replicas, so data loss from a failed drive is extremely rare. What snapshots really protect against is logical damage: human error, a program that deletes a database, a configuration change that takes the system down, and ransomware encryption.

  Think back: how many times has your server suffered a "physical hard drive failure" in the past year? Most likely, not even once. But have you ever run into situations like "accidentally running `rm -rf`", "forgetting the WHERE clause on a database update", or "a plugin wiping out all the website's data"? Most operations staff have been through this at least once or twice. Snapshots are the remedy for these scenarios.

  Once you understand this premise, you'll realize that the frequency of snapshots and the retention strategy are essentially a quantification of how many changes you can tolerate losing.

  II. Three Core Dimensions Determining Snapshot Frequency

  1. Data Change Rate: How many times a day does your business change?

  This is the most direct indicator. Different business types have vastly different data change frequencies.

  High-frequency change businesses: E-commerce orders, community forums, CRM systems, online customer service—new data is written every minute. If a day's or even a few hours' worth of data is lost, you might lose hundreds of orders and thousands of user messages, resulting in direct financial losses.

  Medium-frequency change businesses: Corporate websites, company blogs, content-based CMS. A few articles are updated daily, with occasional configuration changes. Losing a day's worth of data means losing a few new articles and some access statistics; the impact is controllable but not zero.

  Low-frequency change businesses: Personal static blogs, test environments, backup servers. Data changes only a few times a week or even a month. Losing a week's worth of data at most means losing a few personal notes, with virtually no business impact.

  2. Business Importance: How much would losing a day's worth of data cost you?

  The rate of data change is an objective indicator, while business importance is a subjective but more crucial dimension. Ask yourself this question: If all the data from the past 24 hours were lost, how much money would you lose? How much user trust? How much workload?

  That number makes the snapshot frequency easy to judge. If you earn 1000 yuan a day, spending a few yuan on storage for a daily snapshot to protect that 1000 yuan is an easy decision. If you only earn 10 yuan a day, daily snapshots may cost more than they protect. That does not mean taking no snapshots at all; it means reducing the frequency or adopting a lighter backup method.

  What you really need to be wary of are those businesses that you "think are unimportant, but whose loss would actually be a huge problem." For example, a company's internal knowledge base system might seem harmless under normal circumstances, but if it lost six months' worth of accumulated operational documentation and project records, the entire team would be furious. In such a scenario, even if no one is directly paying for the system, you should still give it a relatively reasonable snapshot frequency—at least once every three days.

  3. Recovery Time Objective (RTO) and Recovery Point Objective (RPO): Two Professional Metrics to Help You Make Rational Decisions

  If you've ever worked with operations and maintenance system design, you've definitely heard of RTO and RPO. Applying them to snapshot frequency is much more scientific than relying on gut feeling.

  RPO: The maximum amount of data loss you can tolerate, measured as a span of time. It sets the upper limit on the interval between two snapshots.

  RTO: The maximum time you can accept from the occurrence of a failure to business recovery.

  Snapshot frequency directly determines RPO—hourly snapshots can lose a maximum of one hour of data; daily snapshots can lose a maximum of 24 hours of data. Therefore, first determine your RPO, and the snapshot frequency will follow.

  For example, an online education platform experiences virtually no users from 10 PM to 8 AM the next morning, with peak hours from 9 AM to 9 PM. Operations analysis revealed that most errors occurred during daytime change windows. Therefore, they implemented a "full snapshot every day at 2 AM," resulting in a 24-hour Recovery Point Objective (RPO). If business requirements dictate an RPO of no more than 12 hours, then two snapshots per day are necessary—one at noon and one at midnight.
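  The step from RPO to snapshot frequency can be sketched in a few lines of Python. This is a minimal illustration, not any provider's API: the function names `snapshots_per_day` and `daily_schedule` are hypothetical, and the sketch assumes snapshots are spaced evenly across the day.

```python
import math


def snapshots_per_day(rpo_hours: float) -> int:
    """Fewest evenly spaced daily snapshots whose interval stays within the RPO."""
    if rpo_hours <= 0:
        raise ValueError("rpo_hours must be positive")
    # In this sketch an RPO of 24 hours or more still yields one snapshot per day.
    return max(1, math.ceil(24 / rpo_hours))


def daily_schedule(rpo_hours: float, first_hour: int = 0) -> list[str]:
    """Clock times (HH:MM) for evenly spaced snapshots, starting at first_hour."""
    count = snapshots_per_day(rpo_hours)
    step_minutes = 24 * 60 / count
    times = []
    for i in range(count):
        minute_of_day = round(first_hour * 60 + i * step_minutes) % (24 * 60)
        times.append(f"{minute_of_day // 60:02d}:{minute_of_day % 60:02d}")
    return times


print(daily_schedule(24, first_hour=2))  # a single snapshot at 02:00
print(daily_schedule(12))                # two snapshots, 12 hours apart
```

  With a 12-hour RPO the sketch yields midnight and noon, matching the example above. In practice you would shift `first_hour` so that snapshots fall outside peak traffic.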

  Recovery Time Objective (RTO) significantly influences your choice between "snapshot rollback" and "rebuild from backup." Snapshot rollback typically takes only a few minutes, but without snapshots, you can only redeploy the environment from scripts and restore data, potentially resulting in an RTO of several hours or even a day. From this perspective, snapshots are a powerful tool for reducing RTO.

  III. How Many Versions to Retain? A Classic Cloud Variant of the "3-2-1" Strategy

  Having discussed frequency, let's talk about quantity. How many snapshots should be retained? My advice is: based on recovery needs, not on intuition.

  In the field of local backup, there's a classic principle called "3-2-1": 3 copies of the data, on 2 different media, with 1 copy stored off-site. In cloud snapshot scenarios, cloud providers largely take care of the media and off-site issues (multi-availability zone storage), so we can simplify it to a "multi-version rolling retention" strategy.

  1. Short-term retention: Daily snapshots of the last 3-7 days

  This is the recovery point you're most likely to use. Most incidents like "accidental database deletion" or "configuration crashes" are discovered within hours of occurring, at most a day later. Retaining daily snapshots of the last 3-7 days can basically cover 95% of recovery needs.

  The specific number of days to retain depends on your business complexity and operation frequency. For a static blog maintained by one person, 3 days is enough; for a collaborative operations backend where someone uploads images and modifies product information daily, 7 days is more reliable.

  2. Medium-term retention: Weekly snapshots of the past 4 weeks

  Some problems are not immediately apparent. For example, malicious code might lurk in the system for two weeks before activating, or you might discover that a configuration change a week ago caused a performance degradation. In these cases, weekly snapshots are needed for location and recovery.

  Retaining weekly snapshots from the most recent four weeks (e.g., the one taken at midnight every Monday) provides a backtracking window without consuming too much space. Four weekly versions plus seven daily versions come to eleven versions in total, which keeps storage costs manageable.

  3. Long-term retention: One snapshot per month, retaining for 3-6 months or even a year

  Annual audits, compliance checks, or needs like "what the website looked like at a certain point last year" require monthly snapshots. These types of recovery are rare, but if needed, their absence is extremely troublesome.

  Retaining monthly snapshots for 3-6 months is usually sufficient. If you have compliance requirements (e.g., the financial or healthcare industries need to retain data state for six months to a year), you can extend this to 12 months. However, note that cloud snapshots are charged based on actual space usage; monthly snapshots, due to accumulated data changes, may consume significantly more incremental space than daily snapshots. A more economical approach is to perform a separate monthly archive backup (e.g., exporting to object storage) instead of simply keeping snapshots.
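  The three retention tiers can be combined into a single pruning rule. The sketch below is one possible way to express it, under stated assumptions: one snapshot is taken per day, the weekly version is the Monday snapshot, and the monthly version is the snapshot taken on the 1st of the month. The function name `snapshots_to_keep` and its parameters are hypothetical and do not correspond to any cloud vendor's API.

```python
from datetime import date, timedelta


def snapshots_to_keep(snapshot_dates, today, daily=7, weekly=4, monthly=6):
    """Return the snapshot dates retained under a daily/weekly/monthly rolling policy."""
    dates = sorted(set(snapshot_dates), reverse=True)  # newest first
    keep = set()

    # Short-term tier: every snapshot from the last `daily` days.
    cutoff = today - timedelta(days=daily)
    keep.update(d for d in dates if d > cutoff)

    # Medium-term tier: the Monday snapshot of the most recent `weekly` weeks.
    mondays = [d for d in dates if d.weekday() == 0]
    keep.update(mondays[:weekly])

    # Long-term tier: the 1st-of-month snapshot of the most recent `monthly` months.
    month_starts = [d for d in dates if d.day == 1]
    keep.update(month_starts[:monthly])

    return keep


# Example: 400 consecutive daily snapshots ending on 2026-05-17.
today = date(2026, 5, 17)
history = [today - timedelta(days=n) for n in range(400)]
kept = snapshots_to_keep(history, today)
expired = sorted(set(history) - kept)
print(f"keep {len(kept)} snapshots, delete {len(expired)}")
```

  Everything not in the returned set is a candidate for deletion. A Monday snapshot that falls inside the daily window is counted only once, so the real total can be slightly lower than the sum of the three tiers.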

  IV. The Cost-Benefit Balance: Not All Disks Need Frequent Snapshots

  Many people mistakenly believe that "all cloud disks should be treated the same." In reality, you can use different strategies for different mounted disks to further reduce costs.

  System Disk: The operating system and software environment change very infrequently, remaining largely unchanged except for major version upgrades or security patches. System disk snapshots can be taken once a week or even every two weeks, retaining only three versions. You can even manually take a snapshot only before major changes, relying entirely on data disk backups for daily operations.

  Data Disk: Databases and website files are the core. The snapshot strategy for data disks should be determined according to the business type and RPO mentioned above, requiring high frequency and multiple retained versions.

  Temporary Disks/Cache Disks: Some cloud servers mount temporary disks (tmp, cache). This type of data is lost upon restart and does not require snapshots.

  By treating them separately, storage costs can drop noticeably, in some setups by as much as half. Another cost-saving tip: make use of cloud vendors' "incremental snapshot" mechanism. Most cloud snapshots are incremental: the first snapshot is a full copy, and each later snapshot stores only the data blocks that changed since the previous one. This means that when you take a snapshot daily, you are not charged for the entire disk every day, only for the changed parts. Therefore, in many scenarios, the cost difference between taking snapshots daily and every three days isn't as significant as you might imagine.
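  A rough calculation shows why the gap between daily and three-day snapshots can be small under incremental billing. The model below is a simplified assumption, not a vendor's billing formula: storage is one full copy plus one increment per retained snapshot, and `rewrite_share` is a hypothetical parameter for the fraction of changes within one interval that hit blocks already changed earlier in that interval (those blocks are stored once). All figures are made up for illustration.

```python
def estimated_storage_gb(disk_used_gb, daily_change_gb, interval_days,
                         window_days, rewrite_share=0.0):
    """Rough storage held by one full snapshot plus incremental ones.

    rewrite_share: assumed fraction of the changes inside one interval that
    rewrite blocks already changed in that same interval (stored only once).
    """
    if interval_days < 1 or not 0.0 <= rewrite_share < 1.0:
        raise ValueError("interval_days must be >= 1 and 0 <= rewrite_share < 1")
    increments = window_days // interval_days
    changed_per_interval = daily_change_gb * interval_days * (1.0 - rewrite_share)
    return disk_used_gb + increments * changed_per_interval


# 100 GB in use, 2 GB of changed blocks per day, 30-day retention window.
daily = estimated_storage_gb(100, 2, interval_days=1, window_days=30)
every_three_days = estimated_storage_gb(100, 2, interval_days=3, window_days=30,
                                        rewrite_share=0.25)
print(daily, every_three_days)  # 160.0 145.0
```

  In this made-up case, taking a third as many snapshots reduces stored data from 160 GB to 145 GB, a saving of under 10 percent. Multiply by your provider's per-GB price to turn the estimate into a cost.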

  V. Differentiated Strategies for Special Scenarios

  The above describes the general situation. The following special scenarios require separate treatment.

  1. Database Servers: Ensure Data Consistency Before Snapshotting

  Taking a snapshot directly of a running database gives you a "crash-consistent" copy: a transaction may be frozen before it has fully committed, so the database has to run crash recovery and roll it back after a restore. For databases like MySQL and PostgreSQL, it's recommended to briefly lock tables before the snapshot or to use the cloud vendor's database backup function. A better approach is to use a managed cloud database (RDS), whose automatic backup and PITR (Point-in-Time Recovery) are better suited to databases than disk snapshots.
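  The "lock, snapshot, unlock" sequence can be sketched as a small wrapper that guarantees the lock is released even if the snapshot call fails. This is a hedged outline only: `freeze`, `thaw`, and `take_snapshot` are hypothetical callables you would supply yourself. For MySQL, for instance, `freeze` might issue `FLUSH TABLES WITH READ LOCK` and `thaw` might issue `UNLOCK TABLES` on the same connection, while `take_snapshot` would call your cloud vendor's snapshot API.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(freeze, thaw):
    """Hold the database in a quiesced state for the duration of the block."""
    freeze()
    try:
        yield
    finally:
        # Always release the lock, even when the snapshot call raises.
        thaw()


def consistent_snapshot(freeze, thaw, take_snapshot):
    """Take a snapshot while writes are paused and return its result."""
    with quiesced(freeze, thaw):
        return take_snapshot()


# Demonstration with stand-in callables that only record the call order.
events = []
snapshot_id = consistent_snapshot(
    freeze=lambda: events.append("freeze"),
    thaw=lambda: events.append("thaw"),
    take_snapshot=lambda: events.append("snapshot") or "snap-demo",
)
print(snapshot_id, events)  # snap-demo ['freeze', 'snapshot', 'thaw']
```

  Keep the locked window as short as possible, since writes are blocked while the lock is held.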

  2. Environments with Frequent Code Deployments

  If you release code several times a day, any snapshot of the code is out of date almost as soon as it is taken. In these scenarios, "changes" primarily involve code files, and the code itself should be managed using a version control system (Git), not snapshots. Snapshots should primarily protect databases and uploaded files. In terms of strategy, the frequency can be reduced to once a day, or even just once at the end of each release cycle.

  3. High Security Requirements (Ransomware Protection)

  Ransomware encrypts your data, and it can also destroy online snapshots if they are writable or if the attacker has permission to delete them. In this case, it is recommended to enable the cloud vendor's "snapshot locking" or "WORM (Write Once, Read Many)" feature to prevent snapshots from being deleted or overwritten. At the same time, retain at least one offline copy: export the snapshot to another cloud region or to a low-frequency access tier of object storage.

  Snapshot frequency and the number of copies retained cannot be derived from a single mathematical formula; they come from balancing your understanding of business risk against cost. The principle comes down to one thing: within the storage budget you are willing to pay, make sure snapshots cover the maximum window of data loss you can accept.

  Don't blindly insist on "a snapshot every day," and don't be greedy about "keeping a hundred versions." Take a piece of paper and write down your business's RPO (the maximum number of hours of data you can afford to lose), then work out the corresponding snapshot frequency and number of versions. Then set up an automated policy so that snapshots are taken and expired on schedule without manual work. Snapshots are like fire extinguishers: most of the time they seem to just take up space, but if one is missing when you really need it, the loss is more than just money.
