Cybersecurity Webinar: Why Patching is Critical

The topic of the June 2021 Global Data Vault webinar was patching. Topics covered include what to patch, why patching is so essential, when and how often it should be done, how it applies to your SLA, and much more.

Present for the chat were Trista Perot, Director of Marketing for Global Data Vault; Kelly Culwell, Product Marketing; and Tom May, CIO.

You can watch the video of the webinar here or, if you prefer, read the transcript below. We encourage you to check out our other webinars and sign up for the next one here so that you can have your cybersecurity and disaster recovery questions answered. Please also subscribe to the Global Data Vault channel on YouTube!

 

Kelly Culwell:
Tom and I have done our share of patching over the years and are very familiar with it, so we’ll be hearing personal stories and real-world scenarios today. The first reason why patch management is critical is cybersecurity, something that’s top of everybody’s mind. We all worry about it but what you might not understand is that every piece of hardware or software requires patches–routers, switches, servers, hypervisors, anything that basically provides technology is going to need some type of a patch or firmware update because they’re going to be at risk. Patching is often a proactive measure against attacks and should be performed regularly.

According to a recent study by Fortinet, 90% of attacks used vulnerabilities that had been known for three or more years, and 60% used vulnerabilities known for 10 years. There was a virus a few years ago that used a vulnerability that had been patched by Microsoft for about three or four years, and so many people had not applied the patch that the virus spread rampantly around the world.

[Image: why patching is critical]

Kelly:
Patching is not a popular topic. It’s not something people really want to talk about or look forward to doing, but as we’ve already identified, it is necessary. Tom, what do you think about operating systems and applications, using automated or manual patching tools? Do you download the patches automatically? Do you apply them automatically? What do you think is the most time-consuming method when it comes to patching?

Tom May:
Every day, patching becomes much more time-consuming. Security is at the forefront of everything in this world now. Gone are the days of quarterly or monthly patching, and now we have to be sure we are using the right patch and methodologies. We have to be aware of accountability, the audit, the vulnerability, and we need to know whether it might break something. Cybersecurity is at the forefront of an organization. That’s why we have CISOs and CSOs. Now we really have to make sure that security is present and patching happens weekly. There’s more and more to do – it’s like a full-time gig for an organization to make sure you’re patched.

Kelly:
I remember times when we would actually have to download the patches manually–before higher-end tools that did it for you, you’d go into the server and download all the patches manually. One thing I want to recommend is that you do not turn on automatic updates for your servers. You might think that would save you time, but it really puts you at more risk and causes more problems than it solves. If you want to download the patches ahead of time, there are ways to do that which can save you time when your actual patch window begins. We talk about patch stacking and being able to apply all the necessary patches at the same time versus letting them pile up. What do you think about that? Obviously, it’s going to be easier to apply them more frequently versus waiting until you have 110 of them.

Tom:
Patch stacking is dangerous. When you look at a long patch window when patching once a month, you have Patch Tuesday by Microsoft, you have your vendor, you stack all those in there. You might have to do cyclical reboots four, five, six, or seven times. Let’s prevent the issue by having a little more planned downtime. I’ve seen a patch stack for even a week take four or five hours, honestly. It’s insane.

Kelly:
I know we’ll talk about testing in a little bit, but when you are in a phase where you’re applying so many patches and one of them breaks, it’s really hard to know which one of them broke and why. That adds to the complexity of the process. You mentioned clustering, which helps with keeping services up and running while patching, but there are other ways to do it, especially with hypervisors and virtual machines, that could introduce their own complexity but could also help with downtime and preventing issues. What are your experiences with that?

Tom:
Clustering is really challenging. It comes down to the operating system that you’re clustering, for example, Microsoft itself, let’s say in a Hyper-V environment, they’ve actually taken great strides to be able to handle the patch level. The theory with the cluster is all nodes have to be on the same patch level. How do you maintain this? With clusters, you’re able to fail a node and keep service delivery up. When we start patching, we kind of break that a little bit. Microsoft has done a great job of allowing us to patch through and inject patches across the cluster and maintain uptime. But as you move into different systems of clustering, Citrix systems, and whatnot, it becomes a little more complex. Remembering that, when we try to patch things, people often use third-party tools. If you’re a Microsoft shop, are you running SCOM or are you running a third-party patching tool?

At Global Data Vault, we use third-party tools because they go across multiple operating systems, and they give the compliance reporting in there. It gives the patch “labbing”. It gives us auto approvals. It fits into change management. That’s all great. But do they work well with clusters?  Well, I’ll be the first one to tell you, no, we need different methodologies for our clusters. It’s very hard. I can use the reporting and compliance from my third-party tools, but clustering–I love that it’s in there, but wow. What a whole other world. I know that my customer likes clustering because it’s always on – it’s that always-on scenario. That’s clustering. Good and bad.

Kelly:
We know about automated versus manual, the good and bad of those. If you’re manual, the complexity falls on the person applying the patch and their experience with doing that. How many different systems are they doing at a single time? Then you look at automated patching, which saves the man-hours, but you must have the appropriate tools in place. You have to have the systems to be able to do that, to monitor the patches, and indicate there was a system pause at this reboot or during this patch or whatever, and be able to give you that information. So there are pros, cons, and complexities with both.

Tom:
About staging patches, many people think we’ll go ahead and stage the patch. So the concept here is through some method, patches are downloaded and ready for install at reboot. Well, what happens when you have that firestorm come through and you have to reboot that server to fix an issue? Now you’re into a patch cycle to come up. You could see four hours of downtime. So again, it’s always your risk analysis, change management, and timing.

Kelly:
You mentioned Patch Tuesday, and anybody who’s been in IT for any length of time knows what Patch Tuesday is. Other vendors release patches as needed, but when we talk about downtime, there’s a good downtime and a bad downtime, right? Good downtime is the one that’s planned through change management.

Downtime that’s prepared and planned for does not count as an outage because it is maintenance. Bad downtime is anything that happens outside of a plan, change management or a maintenance window, regardless of what it is. It is always difficult and challenging to schedule downtime, though it is easier for smaller organizations, right? It gets harder the bigger you get–the more servers, the more lines of business or other applications that are impacted, the more people you have to contact to coordinate testing and all of that. Monthly processing schedules or busy periods must be considered. I know that you used to work for a firm that did a lot of processing at times. How did you guys handle that?

Tom:
Well, you really have to look at when it makes sense to do it. And the answer is always … never, but what you have to do is look at utilization times on your system. So whatever your methodology, monthly, weekly, whatnot, you have to look at your operational loads. I use Global Data Vault as an example versus traditional IT. With traditional IT, when do we patch? Nights and weekends. Well, if I’m a service provider for backups, when am I getting backups? Nights and weekends. Therefore, I have an inverse patching schedule to avoid taking servers offline during peak times. I like to counter with what’s more important–a good, known, safe backup we can deliver that could be slightly older, or nothing at all? That’s really what we’re talking about with patching. So at Global Data Vault, certain systems are patched during the day now.

Let’s say you’re on a high availability system, when would we take that one down? We look at our SLA delivery. If a Cloud Connect customer has a four- or eight-hour SLA, well, I can affect that on a schedule. But if I’m into a tight one-hour SLA, I have to interface with the customer to determine when it makes sense for that backup to not run, because it must be shut off while we patch. They may say two o’clock in the morning. We schedule that.

Does that affect my SLA? No, because planned maintenance is not a failure of an SLA. That’s incorporated in there. When we talk about four nines and five nines, now we work around the methodology of planned versus unplanned. So if I just shut you off, wow, I failed your SLA. If I bring you up, that’s a different story. And then honestly, in the backup business, it’s the deliverable of the backup that goes into there. You have to know your business and deliverables to be able to plan and schedule your downtime, but you must hold this as the truth, and then I’ll end here: what is better,  a system that is down and will come back up, or a system that is permanently crushed because of security vulnerabilities? You don’t want to be in the second camp. Err on the side of caution.
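To put numbers on the "four nines and five nines" point, here is a minimal sketch of the downtime budget arithmetic. It assumes a one-year measurement period and, as Tom describes, that planned maintenance is excluded from the calculation; the function names are hypothetical, and real SLA contracts define both details themselves.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_pct: float) -> float:
    """Unplanned-downtime budget per year for an availability target such as 99.99."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


def sla_met(availability_pct: float, outages: list[tuple[float, bool]]) -> bool:
    """Check a year of outages against the budget.

    Each outage is (minutes, planned). Planned maintenance is skipped,
    which is an assumption taken from the discussion above.
    """
    unplanned = sum(minutes for minutes, planned in outages if not planned)
    return unplanned <= allowed_downtime_minutes(availability_pct)


if __name__ == "__main__":
    for target in (99.99, 99.999):
        print(f"{target}% allows about {allowed_downtime_minutes(target):.2f} minutes per year")
    # A three-hour planned patch window plus 30 unplanned minutes
    print(sla_met(99.99, [(180, True), (30, False)]))
```

Under these assumptions, four nines leaves a little under an hour of unplanned downtime per year and five nines leaves roughly five minutes, which is why the planned-versus-unplanned distinction matters so much.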

[Image: patching and SLA]

Kelly:
From my personal experience, if something happens during your change management window, you have that amount of time to fix it or back out of the change, and like you were saying, it really depends on the customers. It depends on SLAs. It depends on when their systems can be down. How many times have you seen a bad patch? Let’s say Microsoft releases a patch and maybe not a zero-day, it’s not critical, but something was wrong with the code and you apply it.

Tom:
Oh, every time I hear .NET Framework, I just want to curl up in the corner. Put it on fast? Wait a little bit and be vulnerable? What’s the right thing, you know?

Kelly:
That’s why it’s always good to test those patches. Veeam gives you the ability to turn on your backups and do all kinds of testing in a “lab” environment, which eliminates the need for duplicated environments.

[Image: patch testing]

Tom:
I love to speak to data labs, which eliminates the headaches of having to maintain production-like copies of your data. How cool is it that you can bring it all up on the actual backup or replica of a production machine?

Kelly:
To your point, you know, with people who do have labs, it’s really common that they’ve got the lab, but it’s running different versions of products. You can’t really test something on different products and expect things not to break in the production environment. That’s why it’s important to have end-user validation.

Tom:
At Global Data Vault, we take a sample subset of production machines and put them in what we call a test lab, and patch them. We ask for end-user feedback. Great. Looks good. Now let me go to my next lab, which is a small subset of one or two of each type of machine in the environment. These are prod machines: we release the patch to them and wait for user feedback. If we get bad feedback, we roll it back. If we get green lights, we approve it and it goes for mass deployment when we have our installation window. Labs are critical. Not just data labs, but your patch lab methodologies.
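The lab-by-lab approval flow Tom describes can be sketched as a small state machine. This is only an illustration: the stage names and the `next_stage` function are hypothetical, and the real process involves human feedback and change management rather than a boolean flag.

```python
from enum import Enum


class Stage(Enum):
    TEST_LAB = "test lab"                # sample subset of machines, patched first
    PILOT = "pilot"                      # one or two of each kind of prod machine
    MASS_DEPLOYMENT = "mass deployment"  # approved for the installation window
    ROLLED_BACK = "rolled back"          # bad feedback, patch withdrawn


def next_stage(current: Stage, feedback_ok: bool) -> Stage:
    """Advance a patch one stage on good feedback, roll it back on bad feedback."""
    if current in (Stage.MASS_DEPLOYMENT, Stage.ROLLED_BACK):
        return current  # terminal stages: nothing further to decide here
    if not feedback_ok:
        return Stage.ROLLED_BACK
    return Stage.PILOT if current is Stage.TEST_LAB else Stage.MASS_DEPLOYMENT


if __name__ == "__main__":
    stage = Stage.TEST_LAB
    for feedback in (True, True):  # green lights from both labs
        stage = next_stage(stage, feedback)
    print(stage.value)
```

The point of the structure is that a patch can never reach mass deployment without passing through both labs, and any bad feedback stops it.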

Kelly:
People might ask, “How can end-users get access to Veeam Data Labs, which used to be called SureBackup?” When you power those VMs on, aren’t they going to conflict with the production environment? I mean, how can you have both systems on at the same time?

Tom:
Veeam uses masqueraded IP addresses to allow access. They “containerize” data and systems and can allow certain systems within the containers to communicate, but not communicate with the production network.

Fear of the Patch

Kelly:
One of my favorite features for sure. As IT folks, we know that rolling back from patches is difficult, or can be difficult. With the appropriate systems and software, it’s pretty easy to do.

You talk about deployment across time zones. If you’re doing something on the east coast versus the west coast, depending on where your systems are located, you could have cross impacts. It could cause bigger issues, and then always vendor support, right? Going back to your customer-side experience with your patches at a large organization, what was the fear of the patch there?

Tom:
It was so absolutely scary. I attended a live event for a major software provider, and the chief developer, the guy in charge of it, got up there and said, to paraphrase, you are not going to be successful in your uptime if you put the latest and greatest out there immediately. He spoke of the fear of the patch, the fear of the new version. We like to give it a month to work out the kinks.

Using a hypervisor snapshot as a contingency is not a good option. Admins forget about snapshots, which can cause issues months later. Being the Veeam aficionados we are, having a Veeam backup prior to starting work gives us an easy, elegant failback option.

We do weekly patching here at Global Data Vault to avoid patch stacking, and we do it with the labs. So at any point, if you did our scorecard at 30 days, which is kind of our metric, we look amazingly compliant because we are vigilant, yet we’re running a little bit behind, say a week or two. But you also have out-of-band patches, where we don’t care what it will break. So in-band and out-of-band fall into there too.

Kelly:
We tend to call those zero-day patches. What’s next? How often to patch, right? It really comes down to how often you test, how you’re able to test implementation of those patches, versus how critical is it for me to apply those, say for tiered systems. It’s not a one-size-fits-all. It’s more of let’s look at this as an organization, figure it out.

Tom:
Oh, absolutely. And tiered systems. I mean, that is knowing your business. It’s about communication with patching and certain systems you need to communicate when they will be down. You might think of file servers, no big deal. Imagine if you’re in a legal world where they need production data, they need some sort of image, et cetera. We communicate when a system will be down so they are aware well in advance. They better plan on doing their work a little earlier that day.

At Global Data Vault on Wednesdays from noon to three Central, you’re going to see a ripple of a patch go across. That’s my design, the lowest utilized time in our environment, not for tier-one systems, but for all the rest, that is the load. How odd is that? Wednesday from noon to three Central, the lowest utilized time.
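Picking a patch window from utilization data, as described above, amounts to finding the quietest contiguous block of hours. A minimal sketch follows; the hourly load figures are invented purely for illustration (they are not Global Data Vault's numbers), and `quietest_window` is a hypothetical helper.

```python
def quietest_window(hourly_load: list[float], window_hours: int = 3) -> tuple[int, float]:
    """Return (start_hour, average_load) for the contiguous window with the lowest load.

    hourly_load holds one value per hour of the day, index 0 = midnight.
    Windows that wrap past the end of the list are not considered.
    """
    if not 1 <= window_hours <= len(hourly_load):
        raise ValueError("window_hours must fit inside the load series")
    starts = range(len(hourly_load) - window_hours + 1)
    best = min(starts, key=lambda s: sum(hourly_load[s:s + window_hours]))
    return best, sum(hourly_load[best:best + window_hours]) / window_hours


if __name__ == "__main__":
    # Made-up profile for a backup provider: busy overnight, quiet around midday
    load = [90] * 6 + [60] * 6 + [20, 15, 25] + [50] * 3 + [85] * 6
    start, avg = quietest_window(load)
    print(f"Quietest 3-hour window starts at {start}:00 (average load {avg:.1f})")
```

With a profile shaped like a backup provider's, the quiet window lands in the middle of the day, the inverse of a traditional nights-and-weekends schedule.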

Kelly:
You might wonder how to identify these. How do I know what’s tiered? How do I know what’s impacted? If you watched our last webinar about how to create a disaster recovery plan for your business, it talks about business impact analysis and exactly what touches what, and that’s part of what change management helps with too, by making sure that when you’re doing something to a system, the appropriate people for that system are notified. Because like you’re saying, a file system may not be that important to some people, but there are some people whose whole job depends on it. So it’s really important to them.

Kelly:
On to the tools, there’s a whole list. You can use System Center, Windows Server Update Services, there are all kinds of tools out there. My comment on this is to find something that works for your business and figure it out. Tom, what do you think about that?

Tom:
I agree with you, but I want to think past the execution of the patch. It’s not just those labs that we talked about, but it’s the compliance reporting. How are my systems faring with available patches? What’s my vulnerability? You know, I want to have that insight, so I know on a monthly basis, I’m at 100% compliance based on whatever patch. I need some more analytics behind it to make sure that I’m okay. This is before I get to vulnerability testing and all that other good stuff, but just a what’s the state of the union.

Kelly:
I think this is a type of dashboard that you can expect. Tom, can you walk us through this image below?

[Image: sample patching dashboard]

Tom:
Absolutely. Take a look up top of the sample. We have 48 failed systems. So out of our total 399 systems, 48 have some sort of bad deployment, hung patch, hung feature updates, or something. Hey, these need my attention. I’m looking at 74 that require rebooting. Wow, what does that mean? Well, 74 systems need to get some more patches in there. Are they stacked? Did one work, did two work? But I can tell by looking at the dials that 317 out of 399 are looking good. If I were to issue those reboots, maybe I can go ahead and raise that number. As I look across, I can look at my patching: are there 509 available? How many are out there? All right, 429. Missing patches by severity. How many critical unknowns?

I can start gauging my response time to create an out-of-band catch-up as I call it. I executed my weekly patching, but I just couldn’t get ahead. Maybe there are targeted systems that require planned maintenance outside of the regular patch window. That’s really important on this screen. And then what I like from here is the latest security news. In this case, Microsoft Edge had several high vulnerabilities on May 30th. That’s important to note, a zero-day warning vulnerability.

This is a real-time look by operating system, by version, missing patches based on the release time. Look at that, greater than 90 days is sitting at 32. I really need to find out why. It’s probably those hung systems. That’s critical, 60 to 90, critical! 30 to 60, great. Zero to 30? Well, here’s the good news. You know, if I’m on a monthly patch cycle, I have 41 systems in compliance. It helps me gauge what I’m doing in here. If I was on a quarterly patch, guess what? I’m probably compliant. How vigilant will you be? So you have to read it through the appropriate lens, but it’s some more great data, and you can just drill in and drill down. And these systems, they’re wonderful.
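The dashboard arithmetic is simple enough to sketch. The counts below (399 total, 317 healthy, 48 failed, 74 needing a reboot) are the ones quoted in the walkthrough; the helper names are hypothetical, and the rule that a system is compliant when its oldest missing patch is no older than the patch cycle is an assumption used to illustrate "reading it through the appropriate lens".

```python
def share(count: int, total: int) -> float:
    """Percentage of the fleet, rounded to one decimal place."""
    return round(100 * count / total, 1)


def is_compliant(oldest_missing_patch_days: int, cycle_days: int) -> bool:
    """Assumed rule: compliant if the oldest missing patch is within the cycle length."""
    return oldest_missing_patch_days <= cycle_days


if __name__ == "__main__":
    total = 399
    for label, count in (("healthy", 317), ("failed", 48), ("needs reboot", 74)):
        print(f"{label}: {count} of {total} ({share(count, total)}%)")

    # The same 45-day-old missing patch, judged against two different cycles
    print("monthly cycle:", is_compliant(45, cycle_days=30))
    print("quarterly cycle:", is_compliant(45, cycle_days=90))
```

The second part shows why a quarterly shop can look compliant with exactly the same patch backlog that puts a monthly shop out of compliance.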

Kelly:
Yep. So we can skim through this. We’ve already talked about it, right? Identification and deployment: figure out what you need to patch and where you need to patch it. Test it. Veeam’s On-Demand Sandbox, which is now called Data Labs, is a great way to test it. And then from there, you roll it to production.

Tom:
It’s a great system. You’ve got to have your rollback. Use Veeam. It’s wonderful.

Kelly:
Obviously we’ve talked about change management to understand that and then have good backups. What are your thoughts?

Tom:
I think you hit everything. I mean, I think we’ve all been around long enough to know you have a backup before starting a change management window. Those are all the bullet points there. You know, Veeam Data Labs all day long. That’s what you’re going to do. Your rollback is Veeam. This is exactly what you need.

Kelly:
Do you have good backups? With Veeam Backup & Replication, you could go into that machine before you’re going to patch it and just do a manual backup right then. It won’t take very long, and it is an actual backup. It’s not a hypervisor snapshot, so you’re guaranteed to have better protection because–not that it could happen–but what happens if you have a hypervisor issue during a patch?

Tom:
I saw one where they had an unsigned driver. They applied a patch to .NET Framework and rebooted. The patch was fine. The unsigned driver went sideways. Could not get it back. You know what? We brought it back to life because we were able to go back.

Kelly:
I know that you’ve already talked about this a bit, but just run us through this image.

[Image: patching policy]

Tom:
Sure. Vulnerability scans–you need some sort of software SOC, some group out there in your organization, finding out where you are vulnerable. I’m not just talking about penetration testing. Tell me what’s vulnerable, where. Let it worm through and find your switch firmware, your blade interconnect module firmware, your OSes. You need a regular scan to know where you are vulnerable. Required patching? Listen, you’ve got to have a patch methodology, get over the fear. You’re going to have to change the culture of your organization to say, there will be some downtime, but you’re moving the ball towards more security. Hypervigilance is essential. Absolutely. Again, do I want it off for four hours or for eternity? Keep your pace in there. Be hypervigilant. And of course, the 3-2-1 rule, meaning three copies, two different media, one off-site. You need to make sure if this all goes bad, you can bring it back up.
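The 3-2-1 rule is easy to express as a check. This is a minimal sketch with hypothetical names (`BackupCopy`, `satisfies_3_2_1`); it only models the three conditions Tom lists and says nothing about whether any copy is actually restorable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    media: str     # e.g. "disk", "tape", "cloud object storage"
    offsite: bool  # True if the copy lives outside the primary site


def satisfies_3_2_1(copies: list[BackupCopy]) -> bool:
    """Three copies, on at least two different media, with at least one off-site."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    copies = [
        BackupCopy("disk", offsite=False),
        BackupCopy("disk", offsite=False),
        BackupCopy("cloud object storage", offsite=True),
    ]
    print(satisfies_3_2_1(copies))
```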

Kelly:
Global Data Vault is a Veeam Services Provider, fully-managed service for backup as a service, disaster recovery as a service, and Office 365 backup. We work with VMware, Hyper-V, Acropolis (Nutanix), and physical servers and endpoints. 

 

Trista Perot:
Awesome conversation. I have a few frequently asked questions on patching:

Does Global Data Vault send out maintenance notices when patching?

Tom:
We do typically send out maintenance notices for out-of-band patches, for those emergencies. We have a standard patching window, Wednesdays from 12:00 PM to 3:00 PM Central. Because those are regular and expected, we do not usually communicate those in our systems.

Should patching be a part of your DR Plan?

Tom:
Yes, absolutely. The best way to save a DR Plan is to not have enacted it, so to speak. So step one, patch and be secure before you have to fix what went wrong. So absolutely. Yeah. That’s paramount.

Kelly:
Going back to the slipstreaming thing, make sure that you have multiple copies of builds with different patch levels. Make sure you have all that stuff ready and have patches because you don’t know your internet capabilities. You don’t know your resources. If you’re going to have to apply patches, make sure that you can do those without having to download them from the internet. 

Do you prefer Windows-based patch systems or a third party?

Kelly:
I hate saying it depends, but it depends. I mean, is your organization Microsoft-based? Do you have a wide deployment of Microsoft System Center Operations Manager? Do you have all kinds of Microsoft certified guys? Is your environment, you know, highly Microsoft or highly homogeneous in its operating system deployments? Do you have a mix of VMware and Hyper-V? Do you have Linux servers? Do you have Oracle? Do you have Sun servers? You know, there are all kinds of different things that might not work with Microsoft-based tools and could work better with third-party tools that encapsulate more of those types of things. So I hate to say it depends, but it’s really dependent upon your organization and the structure and the way your organization is laid out.

What happens if the patch causes issues? Does Global Data Vault have rollback options?

Tom:
Absolutely, so long as you are being protected, you can roll back to your last-known good backup. So if you are on a standard SLA of four or eight hours, there you go. If you’re on a one hour, there you go. But Kelly hit something really important when he spoke about the change management side, which is, what is my rollback plan? Maybe you should take a manual backup before you begin this process. And it’s quick. So let’s say we add a good 20 minutes or 30 minutes onto our maintenance window. We perform a manual backup for a point in time, right? When things are taken out of service, we do what we need to do. It all goes south. We roll it back. There’s actually zero downtime in there, so in your change window, add that manual backup. Then you have what we call the point of no return. I’ve traversed two hours. If I’m not back after two hours, I must enact the rollback plan and put that backup into production, come out of maintenance, and figure it out later. So yes, we absolutely do. And it touches on the first question which was, should it be a part of the DR Plan? Absolutely. It all encompasses that rollback.
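The "point of no return" rule can be written down as a small decision function. This is a sketch only: the two-hour default comes from Tom's example, while the function name, parameters, and the idea of passing in a `service_restored` flag are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def must_roll_back(
    window_start: datetime,
    now: datetime,
    service_restored: bool,
    point_of_no_return: timedelta = timedelta(hours=2),
) -> bool:
    """True once the point of no return has passed and the service is still not back."""
    elapsed = now - window_start
    return (not service_restored) and elapsed >= point_of_no_return


if __name__ == "__main__":
    start = datetime(2021, 6, 16, 12, 0)  # hypothetical Wednesday noon window
    print(must_roll_back(start, start + timedelta(hours=1), service_restored=False))
    print(must_roll_back(start, start + timedelta(hours=2), service_restored=False))
```

Deciding the threshold before the window opens is the point: when it is reached, the pre-change manual backup goes into production and troubleshooting happens afterward.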

Kelly:
At Global Data Vault we also have our Enhanced Data Protection recovery option. We’ve got an additional layer of protection there.

Tom:
Absolutely. We can bring you right back from the brink.

