
Serving Servers – Get to Know the Service Operation Center

The University Data Center is housed in a nondescript building that sits just off campus. However, what happens inside the building is anything but ordinary. Mike Korby, the Service Operation Center manager, leads a tenured team with specialized skill sets that oversees the Data Center’s day-to-day operations. Mike shared with us all that entails.

What are the main functions of the Service Operation Center?

The Service Operation Center (SOC) is the 24/7/365 Data Center operation team. From the command center, we handle system monitoring, incident triage, assessment, equipment repair and installation, and everything and anything that goes on within this mission-critical IT facility.

The SOC has a decisive operating plan for maintaining the operational integrity of all University services. Our approach requires disciplined, around-the-clock execution of our processes to deliver on our priorities, knowing service is our strategy.

So, what’s inside?

The University Data Center is made up of four main sections:

The four main sections of the University Data Center.
Quest: The University’s high-performance computing cluster is the largest single system in the Data Center, occupying the most space and using over 90 percent of the facility’s total power. It is one of the driving forces that make Northwestern an R1 institution.
Network transport: This is the nerve center that connects our infrastructure and allows us to talk to the outside world. It serves as one of the core routers for the University.
Enterprise equipment: This includes virtual servers, enterprise storage, Research Data Storage Service (RDSS), and the Oracle enterprise environments, which include PeopleSoft, myHR, CAESAR, and others.
Co-location: This houses University school administrative and sponsored research equipment. While this equipment is not fully supported by the SOC, we work with researchers to ensure they have the support they need to meet their equipment specifications.
All in all, 80 percent of the space in the Data Center is dedicated to research, and 90 percent of our power and cooling serves that 80 percent!

How does the SOC respond to unexpected service interruptions?

One of the areas where we are most successful is the procedural development and execution of the major incident management process, which is used when there is an unexpected service interruption. The most severe incident is called a Priority 1 (P1) incident.
James Panegasser in the Data Center
Our team is generally the first to notice any service disruption because we constantly monitor SolarWinds, the primary tool to indicate the health of our physical and virtual devices and switches. This includes assets within the Data Center, network equipment across all our campuses, and even many cloud applications.

When a connection is broken, the SOC engages the appropriate technical team for further discussion and analysis. We also escalate the issue to Northwestern IT senior leadership and P1 stakeholders to ensure just-in-time communications are distributed, and we maintain the process at a regular cadence. University Police are also engaged to confirm that the service outage has no life/safety implications.

We encourage all service owners to use the Teams Incident channel whenever something looks wrong. It is vital to get the process rolling to manage a response to the community.
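For readers who think in code, the response steps described above can be sketched roughly as follows. This is only an illustration: the `Incident` class, the `respond` function, and the rule that non-P1 incidents stop at the technical team are assumptions made for the example, not a description of the SOC's actual tooling or procedures.

```python
from dataclasses import dataclass, field

# Hypothetical step labels paraphrased from the interview text.
ESCALATION_STEPS = [
    "engage the appropriate technical team",
    "escalate to Northwestern IT senior leadership and P1 stakeholders",
    "engage University Police to confirm no life/safety implications",
]


@dataclass
class Incident:
    service: str
    priority: int  # 1 is the most severe (P1)
    actions: list = field(default_factory=list)


def respond(incident: Incident) -> Incident:
    """Record the response steps for an incident (illustrative only)."""
    # Every incident starts with the technical team.
    incident.actions.append(ESCALATION_STEPS[0])
    # Assumption: only P1 incidents trigger the wider escalation.
    if incident.priority == 1:
        incident.actions.extend(ESCALATION_STEPS[1:])
    return incident


if __name__ == "__main__":
    for step in respond(Incident("example-service", priority=1)).actions:
        print(step)
```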

What is the response when there is a utility power outage? 

In the case of a total utility power failure, the Uninterruptible Power Supply (UPS) units provide immediate battery-conditioned ride-through power to the Data Center until we transfer to the emergency generator source. The switchover occurs in milliseconds, and the Data Center remains on emergency power until utility power is restored.
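The failover sequence described above can be pictured as a simple priority rule: use utility power when it is available, the generator when it is ready, and UPS battery power to bridge the gap. The sketch below is a simplified model with hypothetical names (`Source`, `select_source`). Real transfer logic lives in the electrical equipment and is considerably more involved.

```python
from enum import Enum


class Source(Enum):
    UTILITY = "utility"
    UPS_BATTERY = "ups_battery"
    GENERATOR = "generator"


def select_source(utility_ok: bool, generator_ready: bool) -> Source:
    """Pick the power source for the current moment (simplified model)."""
    if utility_ok:
        return Source.UTILITY
    if generator_ready:
        return Source.GENERATOR
    # Utility is down and the generator is not yet carrying the load,
    # so the UPS batteries provide ride-through power.
    return Source.UPS_BATTERY


if __name__ == "__main__":
    # Hypothetical timeline: normal, outage begins, generator online, utility restored.
    timeline = [(True, False), (False, False), (False, True), (True, True)]
    for utility_ok, generator_ready in timeline:
        print(select_source(utility_ok, generator_ready).value)
```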

Do you have a favorite project?

We just finished a multi-year effort with the Research Computing Infrastructure team to upgrade the Data Center’s server cooling system to a more power-efficient liquid-cooled design. It’s the future of hydrocooling. The power saved is applied to Quest (tripling the density for high-performance computing) and other Data Center equipment. Northwestern is the first in the Big Ten and one of the first higher education institutions in the United States to have this technology.

A side benefit is that the area is much quieter. Before the transition, the din of the Data Center could reach nearly 82 decibels, equivalent to the sound you would hear standing five feet away from a freight train!

How can the SOC be reached?

The SOC can be reached any time—day or night—by calling 847-467-2222, by emailing ci-soc@northwestern.edu, or by simply dropping a message onto the 24/7 SOC Teams channel. We are ready to assist!

Meet the Team

When not enjoying the harmonious hum of servers, members of the team can be found enjoying time by the pond outside of Norris University Center, getting a bird’s-eye view of campus from the Data Center rooftop, or strolling through the gardens near Howes Chapel.

The SOC team

From left to right: James Panegasser, senior data center technician; Ed Riveron, data center technician; Mike Korby, manager, Service Operation Center; and Brian Williamson, data center technician. Not pictured: Dan Daley, senior data center technician; Artur Bak, Miguel Bello, Connor Edling, and Jarry Hakl, data center technicians.