In the interest of full disclosure, TOC-ers, I am getting paid for this blog post.
In fact, this is 1 of 3 posts sponsored by the Disaster Recovery company Zerto.
And, yes, I know what you’re thinking: I’ve sold out. Nice try, but “selling out” is when someone makes an endorsement without actually using or vouching for the product. My current and previous employers both use(d) Zerto for disaster recovery, and I have used the product firsthand.
I unequivocally vouch for the product. I would not be making these posts if that were not the case.
Furthermore, I would have posted about this regardless, so really, I am not selling out; I am getting paid for what I would have done anyway.
Would you turn the money down? I don’t think so.
And, on a serious note, I should hope by now, dear reader, that I have built enough trust with you that you know I would never steer you astray. This is a win-win-win: a win for Zerto, a win for me, and (I think Zerto would agree) most importantly, a win for you.
What is Continuous Data Protection?
Let’s level set. The (very) brief history of disaster recovery and/or backups has 3 major phases I can think of, although I am sure I am missing some in between:
- “Unstructured” – WE COPY THING TO OTHER THING AND KEEP THE OTHER THING UNTIL WE NEED TO RESTORE THE THING . . . WE HOPE.
- Full/Incremental/Differential – Does anyone do this anymore? At least in the way I am describing here? And if you’ve been around long enough, you know what I am talking about.
- Point-in-Time Restores, aka “versioning” where you can restore from a certain time, but the effectiveness of this is dependent upon how often these point-in-time backups are taken.
That last one still has one problem. Your RPO is dependent upon the frequency with which the “backups” are “taken”. Usually this is limited by a certain number per day at a certain interval, or limited by space.
And let’s say you do these point-in-time backups every 3 minutes. That’s great, but in the enterprise, 3 minutes can be a freaking eternity. I won’t get into the details, but in online retail, 3 minutes of lost data could cost tens or hundreds of thousands of dollars.
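To put rough numbers on that, here is a minimal Python sketch of the arithmetic. The $50,000-per-minute revenue figure and the function name are made up for illustration, so plug in your own numbers. The worst case is a failure that hits just before the next backup would have run, so the data at risk is everything written during one full interval.

```python
def worst_case_loss(backup_interval_s: float, revenue_per_minute: float) -> float:
    """Revenue at risk if a failure hits just before the next backup runs."""
    return backup_interval_s / 60 * revenue_per_minute


# Hypothetical figure: an online retailer taking $50,000 per minute at peak.
REVENUE_PER_MINUTE = 50_000

for label, interval_s in [
    ("hourly", 3600),
    ("every 3 minutes", 180),
    ("every 5 seconds", 5),
]:
    loss = worst_case_loss(interval_s, REVENUE_PER_MINUTE)
    print(f"{label:>16}: up to ${loss:,.0f} at risk")
```

The point is not the exact dollar amount; it is that the exposure scales linearly with the backup interval, so shrinking the interval from minutes to seconds shrinks the exposure by the same factor.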
Enter Continuous Data Protection. Under the hood, very simply, every change is captured as it happens rather than on a schedule. All of the usual features that go with that (deduplication, synchronization, journaling) are there, but with all of the infrastructure provided by both Kubernetes and Zerto, we can leverage what one could call “real time backups” . . . continuously. As a result, your RPOs can be measured in seconds.
I have a plan to demo this in Part 2, but you can get the idea just from the checkpoints (that word means what you think it means) while Zerto runs. Here’s a screenshot taken from the Data Protection as Code: Introducing Zerto for Kubernetes session at ZertoCon:
Notice that checkpoint IDs 14 and 15 are only one second apart (14 was tagged during the demo).
This is because every change to the system, anywhere, is “backed up” every time in real time.
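If the journal-plus-checkpoints idea is new to you, the following toy Python sketch may help. To be clear, this is not how Zerto implements it; the `Journal` class, its methods, and the timestamps are all hypothetical, and the sketch assumes writes arrive in time order. It only illustrates the general technique: record every write as it happens, mark points in time, and rebuild state by replaying writes up to the chosen mark.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Toy write journal: every change is recorded as it happens."""
    entries: list = field(default_factory=list)      # (timestamp, key, value)
    checkpoints: dict = field(default_factory=dict)  # checkpoint id -> timestamp

    def write(self, ts: int, key: str, value: str) -> None:
        # Append-only: nothing is overwritten, so history is preserved.
        self.entries.append((ts, key, value))

    def checkpoint(self, checkpoint_id: int, ts: int) -> None:
        # A checkpoint is just a named point in time.
        self.checkpoints[checkpoint_id] = ts

    def restore(self, checkpoint_id: int) -> dict:
        """Rebuild state by replaying every write up to the checkpoint."""
        cutoff = self.checkpoints[checkpoint_id]
        state = {}
        for ts, key, value in self.entries:
            if ts <= cutoff:
                state[key] = value
        return state


journal = Journal()
journal.write(100, "cart", "1 item")
journal.checkpoint(14, 100)
journal.write(101, "cart", "2 items")
journal.checkpoint(15, 101)
journal.write(102, "cart", "corrupted")  # the bad change we want to undo

print(journal.restore(14))
print(journal.restore(15))
```

Restoring to checkpoint 15 gets back everything up to one second before the bad write, which is the whole appeal of checkpoints that are seconds apart.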
Wait. What About App-level Restores?
Furthermore, what about “multi-tier apps”? . . . (Do we still call them that?)
You know what I mean: websites connected to databases and what not. Don’t we need to ensure that we don’t have a mismatch in restorations of one or the other?
For example, even though there may be no changes on the website, there may be database changes that happen outside of the Web Application’s purview. Or vice versa.
Sometimes this can cause issues where the database knows of a change, but the Web Application does not.
Zerto’s Continuous Data Protection handles all of this through its VPG (Virtual Protection Group). We’ll see this in action in Part 2.
In other words, each checkpoint keeps metadata about the state and data of all portions of the Kubernetes components at both the app and “infrastructure layer” and can restore them at any point you desire.
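Here is a small Python sketch of why restoring the group to a single point in time matters. It is a toy model under my own assumptions (the `changes` timeline, component names, and helper functions are all hypothetical) and says nothing about how a VPG works internally. It simply contrasts restoring each tier from its own backup, taken at a different time, with restoring every tier to the same instant.

```python
# Timeline of changes: (timestamp, component, key, value).
# At t=20 the database schema and the web app are upgraded together.
changes = [
    (10, "db", "schema_version", "1"),
    (10, "web", "expects_schema", "1"),
    (20, "db", "schema_version", "2"),
    (20, "web", "expects_schema", "2"),
]


def state_at(component: str, ts: int) -> dict:
    """State of one component after replaying its changes up to ts."""
    return {k: v for t, c, k, v in changes if c == component and t <= ts}


def restore_group(ts: int) -> dict:
    """Restore every component in the group to the same instant."""
    return {c: state_at(c, ts) for c in ("db", "web")}


def is_consistent(group: dict) -> bool:
    """The web tier must expect the schema version the database has."""
    return group["db"]["schema_version"] == group["web"]["expects_schema"]


# Independent backups: the database was last backed up at t=20, the web tier at t=15.
mismatched = {"db": state_at("db", 20), "web": state_at("web", 15)}
print("independent restores consistent?", is_consistent(mismatched))

# Group restore: both tiers rewound to the same point in time.
print("group restore at t=15 consistent?", is_consistent(restore_group(15)))
print("group restore at t=20 consistent?", is_consistent(restore_group(20)))
```

In the mismatched case the database is on schema 2 while the web tier still expects schema 1, which is exactly the kind of mismatch described above.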
Some Advice on Vendor Engagement and Zerto’s Data Protection as Code
Whenever we have a vendor come onsite, I don’t have a lot of time or patience to talk about obvious things. In other words, I am kind of a jerk to vendors . . . er, a tough customer. Therefore, I usually do my homework, and I have some advice for you if you like to make vendors squirm . . . er, ask the right questions:
Before a vendor comes out to your site (or calls you on Zoom since WFH is still a thing in April 2021), read up on their marketing slicks. Know the feature base that each vendor offers, at least at a high level, and draw a Venn Diagram of what features each vendor has that are in common.
Then, when they arrive and the pleasantries are over, say this, word-for-word:
“I would like to know what features your product offers that your competitors don’t, and I would like you to show me the thing your product provides that I don’t know I need yet.”
Sometimes I will prep them for it and ask before they are on-site. Either way, you can now quickly cut through the features that are commoditized.
Let’s take a look at the list Zerto has and see what sets it apart (in my opinion):
- Continuous Data Protection – Continuous Data Protection has been around for a while. In other contexts it’s called (Near) Continuous Data Protection (and to that point, can one really have true Continuous Data Protection?). CDP means that there’s no schedule, or Full/Incremental/Differential nonsense. The data is backed up immediately with every change and the full stack is kept synchronized.
- Application-Centric – One approach to restoring Kubernetes Applications is to simply back up all the application’s persistent data and/or its database. Then, upon restore, restore the persistent data and re-roll out the application anew. The problem is that this can be rife with inconsistencies. And, for the record, I did not say that was the best method. Alternatively, you can provide backups/restores at every level of the stack. The best definition I can think of for this is from Zerto:
“Application-centric protection: Beyond protecting the persistent data, protect, move and recover complex applications as one consistent entity, including all associated Kubernetes objects and metadata.”
- Data Protection as Code – This is my personal favorite. If you have so far been lukewarm on everything, Data Protection as Code should wake you up. If you are a regular reader of my blog . . . (‘sup readers?) . . . you are very well aware that I am a huge Infrastructure as Code Engineer. If you can give me an API and/or method for coding out pipelines . . . ** Chef’s Kiss **
I can imagine how useful this would be in conjunction with Kubernetes Deployments and ReplicaSets: DR restores anywhere, in addition to what’s built into Kubernetes? Yes, please.
Join me for the other posts in this series. You can expect me to get more hands-on in the next Zerto on Kubernetes post.
If you are interested in accessing this for yourself with some hands-on labs, you can try them here.
We’re also going to talk about what Zerto has to offer for migrations.
See you then!
Hit me up on Twitter @RussianLitGuy or email me at bryansullins@thinkingoutcloud.org. I would love to hear from you.