End of Year IT Maintenance Windows and Change Management
It’s that time of year again when, depending on your industry, things either get really busy or really slow. How does your company handle IT maintenance during peak or holiday seasons?
In my 20-odd years of experience in the IT industry, I have been fortunate to learn many things. Having started in a large enterprise financial institution, I often had to learn not only new technologies but also processes and procedures, and quickly. As I moved through the ranks from server support to server implementation, project management, and even helping build the first VMware infrastructure for the organization, one of the things I learned was the role and importance of change management.
What is change management?
Enterprise change management (ECM), or just change management (CM), is a set of processes and tools that assist with…well, changing something. CM provides the tools, resources, and risk reduction needed to implement change successfully within an environment. Different types exist for project management, software versioning, and infrastructure changes, and I dealt mostly with the last of these. Basically, each line of business (LOB; think large, internal department) had its own ECM group to handle changes within the LOB, and that group would then meet with the other ECM groups to review cross-functional changes. At first, we saw this as a major inconvenience that got in the way of simply getting things done, as we had to fill out a request with some very specific data and allow a certain number of business days of lead time, depending on the type of change. I could submit an emergency change request for critical needs, but those drew an interrogation that bordered on the third degree.
The ECM groups ensured several things:
- All impacted or cross-impacted systems are identified, appropriate approvals occur, and resources are available
- Areas of risk are identified
- An implementation plan exists and is valid
- A backout plan exists and is valid
- A validation of success/completion plan exists and is valid
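As a rough illustration of that checklist, here is a minimal Python sketch of how a change request could be screened for missing pieces before review. The `ChangeRequest` record, its field names, and the `missing_items` helper are all hypothetical; they are not taken from any real ECM tool, and an actual review also judges whether each plan is valid, which a simple emptiness check cannot do.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChangeRequest:
    """Hypothetical change request record; field names are illustrative only."""
    summary: str
    impacted_systems: List[str] = field(default_factory=list)
    approvals: List[str] = field(default_factory=list)
    risks: List[str] = field(default_factory=list)
    implementation_plan: str = ""
    backout_plan: str = ""
    validation_plan: str = ""


def missing_items(cr: ChangeRequest) -> List[str]:
    """Return the checklist items that are still empty on a request."""
    checks = {
        "impacted systems": cr.impacted_systems,
        "approvals": cr.approvals,
        "risk assessment": cr.risks,
        "implementation plan": cr.implementation_plan.strip(),
        "backout plan": cr.backout_plan.strip(),
        "validation plan": cr.validation_plan.strip(),
    }
    # Empty lists and empty strings are both falsy, so one test covers all fields.
    return [name for name, value in checks.items() if not value]


# Example: a request that has an implementation plan but nothing else planned.
request = ChangeRequest(
    summary="Patch and reboot file server",
    impacted_systems=["fileserver01"],
    approvals=["server support manager"],
    implementation_plan="Apply patches, reboot, confirm services start.",
)
print(missing_items(request))
# → ['risk assessment', 'backout plan', 'validation plan']
```

A request would only go forward to the review meeting once `missing_items` returns an empty list.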
IT maintenance windows and general practices
Many IT administrators plan downtime for overnight hours, weekends, and even holidays, generally expecting those times to have the least impact on users. It makes sense for smaller companies, as I was able to experience later while working for an IT consulting firm. For those with change management programs in place, it is quite a bit more delicate.
My LOB within the bank supported infrastructure for the rest of the bank and for groups that provided core services. I worked with production systems (those that were active and in use) before moving into an implementation group that worked with pre-production systems in more of a project-based function. Change management was, for obvious reasons, more restrictive for the production systems, but we still had to follow CM procedures for pre-production systems, as they touched data center racks, networks, power, etc. upon implementation.
Back when physical servers were more prevalent than virtual machines, the “riskiest” operation was often the reboot, as this is when all the moving parts stopped and started back up again. It was also the time when operating system files that had not been used since the last reboot were needed. If a virus or malware had corrupted those files, you would get a blue screen of death. I always remember watching, fingers crossed, for a server to come back on the network. A virtual machine is not tied to any specific physical hardware, so it is possible to see the console of the VM and troubleshoot more easily, and it is easier to roll back changes and recover.
IT maintenance peak holiday considerations
During the last few weeks of a calendar year, many employees take time off to use vacation, travel, or spend time with family for the holidays. Not only does this reduce the number of staff available to monitor the change management process, but it can also limit the people available to implement changes, respond to issues, or validate that systems function properly afterward. From a business perspective, critical processing, sales, orders, or fiscal calculations could also occur during these periods and should not be interrupted for anything not critical to maintaining reliable operations.
For production systems, the only thing we used to be able to do was emergency maintenance—critical security updates, hardware failures, etc. Otherwise, we spent time updating runbooks, training, or doing other tasks we may not have had time for previously. With pre-production, all project changes stopped, but we were usually still able to perform tasks that did not require any cabling, power, or network changes.
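The rules described above can be sketched as a simple gate. This is a minimal Python example under stated assumptions: the freeze dates (December 15 through January 2) are hypothetical placeholders, since the actual window varied, and the function and parameter names are invented for illustration.

```python
from datetime import date

# Hypothetical freeze boundaries as (month, day); real dates vary by organization.
FREEZE_START = (12, 15)
FREEZE_END = (1, 2)


def in_freeze(day: date) -> bool:
    """True if the day falls inside the year-end freeze, which wraps past December 31."""
    month_day = (day.month, day.day)
    return month_day >= FREEZE_START or month_day <= FREEZE_END


def change_allowed(day: date, environment: str,
                   emergency: bool = False,
                   touches_physical: bool = False) -> bool:
    """Apply the holiday rules; outside the freeze, normal CM review still applies."""
    if not in_freeze(day):
        return True
    if environment == "production":
        # Only emergency maintenance (critical security updates, hardware failures).
        return emergency
    # Pre-production: allowed only without cabling, power, or network changes.
    return not touches_physical


print(change_allowed(date(2024, 12, 24), "production"))                  # → False
print(change_allowed(date(2024, 12, 24), "production", emergency=True))  # → True
print(change_allowed(date(2024, 12, 24), "pre-production"))              # → True
```

Comparing `(month, day)` tuples keeps the check independent of the year, so the same window applies every December without being redefined.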
Verifying and validating backups to reduce risk
Some change management processes require validation of backups before making changes to a system. Testing backups can prove difficult for some, but thankfully Veeam® has solutions that reduce the risk related to reboots and system changes. Instant VM Recovery® and SureBackup®, both part of Veeam Backup & Replication™, are features that allow you to power on a virtual machine directly from a backup file. This gives administrators the ability to test a backup before starting a maintenance window.
As an additional layer of protection and validation, the backup copies you send to Global Data Vault are not only offsite from your primary backups, but we also keep ten restore points in a hidden repository as part of what we call Enhanced Data Protection.
Does your company have change control or change management restrictions at certain times of the year? What steps do you have to take to have a change approved? Leave a comment below and let us know!
– Kelly Culwell is a 20-year IT professional, marketing consultant, and freelance technical content creator.