Tiered Infrastructure Maintenance Standards™ (TIMS)
For Mission-Critical Environments
Lee Technologies’ Tiered Infrastructure Maintenance
Standards (TIMS) For Mission-Critical Environments
Lee Technologies has created a system of infrastructure maintenance
practices and procedures for mission-critical facilities. These Tiered Infrastructure
Maintenance Standards (TIMS) offer a systematic approach to achieving synergy
between maintenance activity levels and the level of reliability expected of the
facility.
The goal of Tiered Infrastructure Maintenance Standards is to provide
organizations a means to evaluate their maintenance programs, understand their
level of risk and allocate their resources effectively.
Success, however, relies on management developing a maintenance
philosophy. This philosophy must align with the organization’s overall performance
goals and must be enforced and managed throughout every aspect of the
maintenance organization.
Every facility and organization is unique, and the TIMS will need to be
adapted to the specific environment. However, TIMS provides essential guidelines
and rationale for all those involved in the operation, administration and
management of mission-critical environments.
TIMS-1 Run to Fail
This level of service reflects the old adage, "If it isn’t broken, don’t fix it."
Maintenance at this level is essentially reactive. When a problem develops, a
vendor is called to perform the repair. When redundant systems are present,
there may be little or no impact to the critical load for an isolated event.
However, a lack of preventive maintenance often results in overall system
weakness and overloading. This creates a domino scenario, where multiple weak
links overwhelm and defeat system redundancies.
Operating at this level implies that the cost of an outage is low compared to
the cost of higher-level maintenance. Unfortunately, when budgets are tight,
deferring maintenance is often perceived as a way to cut costs. It is in essence a
form of gambling similar to forgoing insurance. Statistically, any perceived short-
term savings in maintenance costs will likely be overshadowed in the long run by
costly outages and expensive repairs.
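
The trade-off described above can be sketched as a simple expected-cost comparison. This is only an illustration under stated assumptions: the function name, the one-year horizon and every dollar figure and probability below are hypothetical placeholders, not values taken from TIMS or from any real facility.

```python
def expected_annual_cost(maintenance_cost: float,
                         outage_probability: float,
                         outage_cost: float) -> float:
    """Maintenance spend plus the probability-weighted cost of one outage."""
    if not 0.0 <= outage_probability <= 1.0:
        raise ValueError("outage_probability must be between 0 and 1")
    return maintenance_cost + outage_probability * outage_cost


# Placeholder inputs only; substitute your own facility's estimates.
run_to_fail = expected_annual_cost(
    maintenance_cost=0.0, outage_probability=0.5, outage_cost=400_000.0)
preventive = expected_annual_cost(
    maintenance_cost=50_000.0, outage_probability=0.125, outage_cost=400_000.0)

print(f"Run to fail: {run_to_fail:,.0f}")
print(f"Preventive:  {preventive:,.0f}")
```

With these placeholder numbers the run-to-fail option looks cheaper only if the outage term is ignored. Whether that holds for a real facility depends entirely on its own outage probability and outage cost estimates.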
TIMS-3 Structured Maintenance
Categorizing Your Maintenance Program
Few maintenance programs fall neatly into a single category. More often,
there will be elements of two or more Maintenance Service Tiers present. One
common example would be a program that embraces Structured Maintenance
on the electrical side of the house, but might be unstructured on the mechanical
side. Another example would be a Tier III-designed facility being run without a
formal change management program. In cases such as these, the weakest link
principle applies: Your overall service level is only as high as the lowest level of
maintenance being performed in any area of your facility.
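
As a minimal sketch, the weakest link principle amounts to taking the minimum tier across all areas of the facility. The `Tier` enum, the `overall_tier` function and the area names below are illustrative assumptions, not part of the TIMS definition.

```python
from enum import IntEnum


class Tier(IntEnum):
    """TIMS maintenance tiers, ordered from lowest to highest."""
    TIMS_1 = 1
    TIMS_2 = 2
    TIMS_3 = 3
    TIMS_4 = 4


def overall_tier(area_tiers: dict[str, Tier]) -> Tier:
    """Return the lowest tier found in any area (weakest link rule)."""
    if not area_tiers:
        raise ValueError("at least one area is required")
    return min(area_tiers.values())


# Hypothetical assessment: area names and ratings are illustrative only.
assessment = {
    "electrical": Tier.TIMS_3,
    "mechanical": Tier.TIMS_1,
}
print(overall_tier(assessment).name)
```

In this hypothetical assessment, structured maintenance on the electrical side does not raise the overall rating, because the mechanical side is still run to fail.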
Without a detailed understanding of high-level maintenance procedures,
evaluating an organization’s maintenance program can be difficult, particularly in
cases where maintenance is managed internally. It is generally preferable to hire an
independent consultant to perform a Mission-Critical Infrastructure Site
Assessment and Risk Analysis.
Certified technicians will provide a comprehensive walkthrough of the facility,
identifying potential points of failure and creating a roadmap for improvement.
The consultant and internal staff will discuss known or suspected reliability
problems and capacity/load planning, and review up-to-date facility drawings,
specifications, operation documentation and maintenance records.
The results of such an audit may prove pivotal. The investment in the audit
will be returned many times over by identifying critical steps that will increase the
facility’s reliability.
Organizations with high-tier data centers typically allocate the appropriate
budget levels for maintenance. These organizations acknowledge the risks of an
outage and have invested heavily to add layers of protection. However, this can
lead to a sense of invulnerability, and the resulting complacency often creates a
lack of attention to detail in the maintenance category. In order to fully realize
and maintain the benefits of the original capital investment, time, energy and
resources must be allocated to developing a high-tier TIMS program.
The day-to-day demands placed on internal IT and facilities staff often
preclude the ability to develop high-level maintenance. Even the most
sophisticated organizations may lack the capabilities (skill sets, impartiality,
expertise, resources) to internally develop and manage a TIMS-2, TIMS-3 or
TIMS-4 program.
Considerations
1) Scope: What specific actions need to be taken to achieve the desired TIMS
tier?
2) Budget: Does your budget allow you to meet your chosen goals?
3) Skills: Do you have the internal skills to manage and perform the activities
required?
4) Impact: What is the impact on your business operation to implement the
plan, and what are the risks?
Conclusion
Now more than ever, the public and private sectors recognize the need to
maximize uptime in mission-critical facilities. Increased attention to disaster
recovery and business continuity has brought a much-needed focus
on system reliability, including the tiered classification approach to mission-critical
center design and construction.
However, it is time to strengthen the final link in the chain with a systematic
approach to the ongoing maintenance of the facility and its infrastructure. To be
effective, it must be realistic and provide management with a dollars-and-cents
justification to ensure a long-term commitment.
When evaluating the entire scope of the mission-critical enterprise, the
effectiveness of the maintenance program is one of the key components that
must be factored in to determine the true measure of sustained reliability. The
tremendous variability in how maintenance is implemented can make it difficult
to judge what constitutes the proper level of service in a given situation. Defining
maintenance levels is a tool for achieving such an understanding. Matching
high-reliability systems with high maintenance service levels will allow organizations to
achieve the highest levels of reliability and uptime.
The Tiered Infrastructure Maintenance Standards offer a systematic
approach to achieving synergy between maintenance activity levels and the level of
reliability expected of the facility. Applying these principles to your maintenance
program will go a long way toward attaining uptime and overall continuity goals.
SERVICE TIMS-1 TIMS-2 TIMS-3 TIMS-4
Maintenance logs X X X
Maintenance QA process X X
Glossary of Terms
Critical Systems Vendor Response: The contractual SLA for critical system
vendor response time in the event of a facility emergency.
Infrared Scan: Periodic thermal imaging of critical electrical components
such as circuit breakers. The results of this non-invasive testing can pinpoint hot
spots, which indicate poor connections or failing components, before
they become a service-affecting issue.
Integrated System Testing: Integrated System Testing (IST) verifies that all of
the mechanical, electrical, control and safety systems work as designed, during
normal operations as well as during multiple system failures. IST provides
baseline information on the operation of the facility during all anticipated modes
of operation and can pinpoint any weaknesses that may not be discovered
during the normal commissioning process.
Load Bank Testing: The process of placing a resistive load on the output of an
electrical system in order to test its operation at various stages of loading. This is
employed to validate the operation of partially loaded systems, to test
equipment prior to placing it in service and to create high heat loads to load test
HVAC systems.
Onsite Facilities Staff: Dedicated on-site facilities staff that focuses on the
site's critical systems. This group performs daily walkthroughs, manages vendors
and performs some service work itself. The facilities staff is
responsible for creating and maintaining all of the site documentation, including
MOPs, SOPs and emergency procedures. This staff may or may not provide
24x7 coverage, depending on the level of service required.
Training Program: A formal and comprehensive staff training program that
defines various levels of qualification along with a rigorous testing and
certification process. This is used in conjunction with a matrix that identifies
specific maintenance tasks and the qualification level required to perform
each.
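
One possible way to represent such a matrix is a lookup from task to required qualification level. The task names, the numeric levels and the `is_qualified` helper below are hypothetical and serve only to illustrate the idea. The source does not define specific tasks or levels.

```python
# Hypothetical task names and level numbers; the source does not define them.
REQUIRED_LEVEL = {
    "daily walkthrough": 1,
    "infrared scan": 2,
    "load bank testing": 3,
}


def is_qualified(staff_level: int, task: str) -> bool:
    """True if the staff member's level meets the task's required level."""
    if task not in REQUIRED_LEVEL:
        raise KeyError(f"task not in qualification matrix: {task}")
    return staff_level >= REQUIRED_LEVEL[task]


# A level-2 technician may perform an infrared scan but not load bank testing.
print(is_qualified(2, "infrared scan"))
print(is_qualified(2, "load bank testing"))
```

Keeping the matrix as data instead of hard-coding it in logic makes it easier to review and update as the training program evolves.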