Point-in-Time Recovery for MongoDB on Kubernetes

Running MongoDB in Kubernetes with the Percona Operator is not only simple; by design, it also provides a highly available MongoDB cluster suitable for mission-critical applications. In the latest 1.8.0 version, we added a feature that implements point-in-time recovery (PITR). It allows users to store the oplog in an external S3 bucket and to recover to a specific date and time when needed. The main value of this approach is a significantly lower Recovery Time Objective (RTO) and Recovery Point Objective (RPO).

In this blog post, we will look into how we delivered this feature and review some architectural decisions.

Internals

For full backups and PITR, the Operator relies on Percona Backup for MongoDB (PBM), which by design supports storing operation logs (oplogs) on S3-compatible storage. We run PBM as a sidecar container in replica set Pods, including Config Server Pods. So each replica set Pod has two containers from the very beginning: Percona Server for MongoDB (PSMDB) and Percona Backup for MongoDB.

While PBM is a great tool, it comes with some limitations that we needed to keep in mind when implementing the PITR feature.

One Bucket

If PITR is enabled, PBM stores backups on S3 storage in a chained mode: oplogs are stored right after the full backup and depend on it. PBM stores metadata about the backups in the MongoDB cluster itself and creates a copy on S3 to maintain full visibility of the state of backups and operation logs.

When a user wants to recover to a specific date and time, PBM figures out which full backup to use, recovers from it, and applies the oplogs.
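
The selection step can be sketched in a few lines of Python. This is not PBM's actual implementation, only an illustration of the chained model under one assumption: a full backup can be rolled forward to the target time only if a single contiguous oplog range covers everything from the backup's completion up to that target. The `Snapshot`, `OplogRange`, and `pick_base_backup` names are hypothetical, and the sample values are copied from the `pbm list` output shown later in this post.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class Snapshot(NamedTuple):
    name: str            # backup name as PBM prints it
    completed: datetime  # time the full backup finished


class OplogRange(NamedTuple):
    start: datetime      # beginning of a contiguous oplog chain
    end: datetime        # end of that chain


def pick_base_backup(
    target: datetime,
    snapshots: List[Snapshot],
    ranges: List[OplogRange],
) -> Optional[Snapshot]:
    """Return the newest full backup that can be rolled forward to `target`.

    A backup qualifies when one contiguous oplog range covers the span from
    the backup's completion time to the target. Returns None otherwise.
    """
    candidates = [
        snap
        for snap in snapshots
        for rng in ranges
        if rng.start <= snap.completed <= target <= rng.end
    ]
    return max(candidates, key=lambda s: s.completed, default=None)


# Sample data taken from the `pbm list` output in this post.
SNAPSHOTS = [
    Snapshot("2020-12-10T12:19:10Z", datetime(2020, 12, 10, 12, 23, 50)),
    Snapshot("2020-12-14T10:44:44Z", datetime(2020, 12, 14, 10, 49, 3)),
    Snapshot("2020-12-14T14:26:20Z", datetime(2020, 12, 14, 14, 34, 39)),
    Snapshot("2020-12-17T16:46:59Z", datetime(2020, 12, 17, 16, 51, 7)),
]
RANGES = [
    OplogRange(datetime(2020, 12, 14, 14, 26, 40), datetime(2020, 12, 16, 17, 27, 26)),
    OplogRange(datetime(2020, 12, 17, 16, 47, 20), datetime(2020, 12, 17, 16, 57, 55)),
]

base = pick_base_backup(datetime(2020, 12, 15, 9, 0, 0), SNAPSHOTS, RANGES)
print(base.name if base else "no usable backup")
```

A target that falls into a gap between oplog ranges (for example, noon on 2020-12-17 in the sample data) yields `None`, which is why an unbroken chain from a full backup matters.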

If the user decides to use multiple S3 buckets to store backups, it means that oplogs are also scattered across these buckets. This complicates the recovery process because PBM only knows about the last S3 bucket used to store the full backup.

To simplify things and avoid these split-brain situations with multiple buckets, we made the following design decisions:

- Do not allow PITR to be enabled when the user specifies multiple buckets in the `backup.storages` section. This should cover most cases. We throw an error if the user tries that:

"msg":"Point-in-time recovery can be enabled only if one bucket is used in spec.backup.storages"

- There are still cases where users can end up with multiple buckets (e.g., by disabling PITR and enabling it again with another bucket). That is why, to recover from a backup, we ask the user to specify the `backupName` (the `psmdb-backup` Custom Resource name) in the `recover.yaml` manifest. From this CR we get the storage, and PBM fetches the oplogs that follow the full backup.
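
The two decisions above can be sketched in Python. This is an illustration, not the Operator's code (the Operator is not written in Python). The `validate_pitr` and `build_restore` helpers are hypothetical, the `pitr.enabled` key is an assumption about the `spec.backup` layout, and apart from `backupName` the restore fields (`apiVersion`, `kind`, `clusterName`, `pitr.type`, `pitr.date`) are assumptions based on the Operator's restore example, so check them against the version you run.

```python
import json

PITR_MULTI_BUCKET_ERROR = (
    "Point-in-time recovery can be enabled only if one bucket is used "
    "in spec.backup.storages"
)


def validate_pitr(backup_spec: dict) -> None:
    """Reject PITR when more than one storage is configured.

    `backup_spec` mirrors the spec.backup section of the cluster manifest;
    the `pitr.enabled` key is an assumption.
    """
    pitr_enabled = backup_spec.get("pitr", {}).get("enabled", False)
    storages = backup_spec.get("storages", {})
    if pitr_enabled and len(storages) > 1:
        raise ValueError(PITR_MULTI_BUCKET_ERROR)


def build_restore(name: str, cluster_name: str, backup_name: str, date: str) -> dict:
    """Build a restore manifest that names the full backup explicitly.

    Naming the psmdb-backup object tells the Operator which storage to use,
    so the oplogs that follow that backup can be fetched from the same bucket.
    """
    return {
        "apiVersion": "psmdb.percona.com/v1",   # assumption
        "kind": "PerconaServerMongoDBRestore",  # assumption
        "metadata": {"name": name},
        "spec": {
            "clusterName": cluster_name,
            "backupName": backup_name,
            "pitr": {"type": "date", "date": date},  # assumption
        },
    }


# One storage with PITR enabled passes validation.
validate_pitr({"pitr": {"enabled": True}, "storages": {"s3-main": {}}})

# kubectl accepts JSON as well as YAML, so the manifest can be applied as-is.
manifest = build_restore("restore1", "my-cluster-name", "backup2", "2021-05-05 19:30:00")
print(json.dumps(manifest, indent=2))
```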

The obvious question is: why can’t the Operator handle the logic and somehow store metadata from multiple buckets?

There are several answers here:

- Bucket configurations can change during a cluster’s lifecycle and keeping all this data is possible, but the data may become obsolete over time. Also, our Operator is stateless and we want to keep it that way.
- We don’t want to bring this complexity into the Operator and are assessing the feasibility of adding this functionality into PBM instead (K8SPSMDB-460).

Full Backup Needed

As mentioned before, oplogs require a full backup. Without one, PBM will not start uploading oplogs, and the Operator will throw the following error:

"msg":"Point-in-time recovery will work only with full backup. Please create one manually or wait for scheduled backup to be created (if configured)."

There are two cases when this can happen:

- User enables PITR for the cluster
- User recovers from backup

In this release, we decided not to create the full backup automatically, but to leave it to the user or the backup schedule. We might introduce a flag in future releases that would allow users to configure this behavior, but for now we decided that the current primitives are enough to automate full backup creation.

10 Minutes RPO

Right now, PBM uploads oplogs to the S3 bucket every 10 minutes. This interval is hardcoded and not configurable for now. For the user, this means that the Recovery Point Objective (RPO) can be as much as ten minutes.

This is going to be improved in upcoming releases of Percona Backup for MongoDB and is tracked in the PBM-543 JIRA issue. Once it is there, the user will be able to control the period between oplog uploads with the `spec.backup.pitr.timeBetweenUploads` option in `cr.yaml`.
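
To make the ten-minute figure concrete, here is a simplified model in Python. It assumes that uploads happen exactly on schedule and that each upload captures everything written up to that moment; real uploads take time and can be delayed. The function names are hypothetical.

```python
from datetime import datetime, timedelta

UPLOAD_INTERVAL = timedelta(minutes=10)  # fixed in this release, per the text


def latest_recoverable(
    failure: datetime,
    first_upload: datetime,
    interval: timedelta = UPLOAD_INTERVAL,
) -> datetime:
    """Return the last upload boundary at or before the failure time."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    if failure < first_upload:
        raise ValueError("failure happened before the first upload")
    completed_uploads = (failure - first_upload) // interval  # whole intervals
    return first_upload + completed_uploads * interval


def data_loss_window(
    failure: datetime,
    first_upload: datetime,
    interval: timedelta = UPLOAD_INTERVAL,
) -> timedelta:
    """Return how much recent history is not yet on S3 at failure time."""
    return failure - latest_recoverable(failure, first_upload, interval)


first = datetime(2021, 5, 5, 12, 0, 0)
failure = datetime(2021, 5, 5, 12, 27, 30)
print(latest_recoverable(failure, first))  # 2021-05-05 12:20:00
print(data_loss_window(failure, first))    # 0:07:30
```

In this model the loss window is always shorter than the upload interval, so a shorter configurable interval translates directly into a lower worst-case RPO.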

Which Backups do I Have?

So the user has full backups and PITR enabled. PBM has a nice feature that shows all the backups and oplog (PITR) time frames:

$ pbm list

Backup snapshots:
     2020-12-10T12:19:10Z [complete: 2020-12-10T12:23:50]
     2020-12-14T10:44:44Z [complete: 2020-12-14T10:49:03]
     2020-12-14T14:26:20Z [complete: 2020-12-14T14:34:39]
     2020-12-17T16:46:59Z [complete: 2020-12-17T16:51:07]
PITR <on>:
     2020-12-14T14:26:40 - 2020-12-16T17:27:26
     2020-12-17T16:47:20 - 2020-12-17T16:57:55

In the Operator, however, the user can see full backup details but cannot yet see the oplog information without going into the backup container manually:

$ kubectl get psmdb-backup backup2 -o yaml
…
status:
  completed: "2021-05-05T19:27:36Z"
  destination: "2021-05-05T19:27:11Z"
  lastTransition: "2021-05-05T19:27:36Z"
  pbmName: "2021-05-05T19:27:11Z"
  s3:
    bucket: my-bucket
    credentialsSecret: s3-secret
    endpointUrl: https://storage.googleapis.com
    region: us-central-1

The obvious idea is to store this information in the `psmdb-backup` Custom Resource, but to do that we need to keep it updated. Updating hundreds of these objects all the time in a reconcile loop might put pressure on the Operator and even the Kubernetes API. We are still assessing different options here.
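
Until this is visible through the Operator, the time frames can be read from the `pbm list` output directly. The sketch below parses text in the format shown above and checks whether a given timestamp falls inside an oplog range. It assumes the output layout matches the sample in this post, which may differ between PBM versions; the function names are hypothetical.

```python
import re
from datetime import datetime
from typing import List, Tuple

TS = r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}"
RANGE_RE = re.compile(rf"^\s*({TS})\s+-\s+({TS})\s*$")

SAMPLE_OUTPUT = """\
Backup snapshots:
     2020-12-10T12:19:10Z [complete: 2020-12-10T12:23:50]
     2020-12-14T10:44:44Z [complete: 2020-12-14T10:49:03]
     2020-12-14T14:26:20Z [complete: 2020-12-14T14:34:39]
     2020-12-17T16:46:59Z [complete: 2020-12-17T16:51:07]
PITR <on>:
     2020-12-14T14:26:40 - 2020-12-16T17:27:26
     2020-12-17T16:47:20 - 2020-12-17T16:57:55
"""


def parse_pitr_ranges(output: str) -> List[Tuple[datetime, datetime]]:
    """Extract the (start, end) oplog time frames listed under the PITR header."""
    ranges = []
    in_pitr_section = False
    for line in output.splitlines():
        if line.startswith("PITR"):
            in_pitr_section = True
            continue
        if not in_pitr_section:
            continue
        match = RANGE_RE.match(line)
        if match:
            start, end = (
                datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
                for value in match.groups()
            )
            ranges.append((start, end))
    return ranges


def is_recoverable(target: datetime, ranges: List[Tuple[datetime, datetime]]) -> bool:
    """Return True when the target time lies inside one of the oplog ranges."""
    return any(start <= target <= end for start, end in ranges)


pitr_ranges = parse_pitr_ranges(SAMPLE_OUTPUT)
print(is_recoverable(datetime(2020, 12, 15, 9, 0, 0), pitr_ranges))   # True
print(is_recoverable(datetime(2020, 12, 17, 12, 0, 0), pitr_ranges))  # False
```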

Conclusion

Point-in-time recovery is an important feature for Percona Operator for MongoDB as it reduces both RTO and RPO. The feature has been present in PBM for some time already and has been battle-tested in multiple production deployments. With the Operator, we want to reduce the manual burden to a minimum and automate day-2 operations as much as possible. Here is a quick summary of what is coming in future releases of the Operator related to PITR:

- Reduce RPO even more with a configurable oplog upload period (PBM-543, K8SPSMDB-388)
- Take a full backup automatically if PITR is enabled (K8SPSMDB-460)
- Give users visibility into available oplog time frames (K8SPSMDB-461)

Our roadmap is publicly available, and we would be curious to learn more about your ideas. If you are willing to contribute, a good starting point would be CONTRIBUTING.md in our GitHub repository. It has all the details about how to contribute code, submit new ideas, and report a bug. A good place to ask questions is our Community Forum, where anyone can freely share their thoughts and suggestions regarding Percona software.
