Problem Isolation - Gestalt Analysis of Unstructured Logs (GAUL)

Project Description

Computer systems fail for many reasons, including software and firmware bugs, hardware errors, and misconfigurations. Today, businesses are increasingly dependent on powerful and sophisticated computer systems, whose sophistication increases the number and complexity of their failure modes. Even a small amount of downtime can lead to costly losses in business productivity and to recovery expenses. For example, the downtime cost for a bank approaches $75k per minute. While system downtime may not be completely avoidable, the speed of problem isolation and recovery is becoming critical for minimizing these business costs. Our vision is to create an automated tool that expedites the process of diagnosing problems in complex systems such as IBM's flagship storage controllers, thereby reducing system downtime and associated costs.

We have invented a novel, domain-agnostic, fuzzy matching algorithm, called GAUL, that automatically learns from historical failures to identify repeat instances as they occur in the field. GAUL achieves unprecedented simplicity and robustness by marrying techniques from text mining and problem isolation for the first time. Using techniques akin to those of search engines, which lend it the ability to handle fuzzy and incomplete data, GAUL provides valuable diagnosis without the human-based training, domain-specific knowledge, or continuous maintenance costs that plague the prior art in this field.
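
As a rough illustration of the search-engine analogy, the sketch below ranks historical failure logs by their textual similarity to a new log. It is not the GAUL algorithm itself, whose details are given in the publication listed under Selected Publications; it swaps in a generic TF-IDF weighting with cosine similarity, a standard text-mining technique. The function names (tokenize, build_index, weight, rank), the problem labels, and the sample log lines are all hypothetical and chosen only for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep only alphabetic runs, so digits in timestamps
    # and counters are dropped (a deliberate simplification).
    return re.findall(r"[a-z_]+", text.lower())


def weight(counts, idf):
    # Turn raw term counts into a unit-length TF-IDF vector.
    # Terms never seen in the historical logs are ignored.
    vec = {t: c * idf[t] for t, c in counts.items() if t in idf}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def build_index(history):
    # history: dict mapping a (hypothetical) problem label to its log text.
    docs = {label: Counter(tokenize(text)) for label, text in history.items()}
    n = len(docs)
    df = Counter()
    for counts in docs.values():
        df.update(counts.keys())
    # Smoothed inverse document frequency: rare terms weigh more.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = {label: weight(counts, idf) for label, counts in docs.items()}
    return idf, vectors


def rank(new_log, idf, vectors):
    # Cosine similarity between the new log and each historical failure.
    q = weight(Counter(tokenize(new_log)), idf)
    scores = {
        label: sum(w * vec.get(t, 0.0) for t, w in q.items())
        for label, vec in vectors.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical historical failures; real logs would be far larger and noisier.
history = {
    "cache_battery_failure": (
        "warning cache battery voltage low\n"
        "error cache battery failed write cache disabled"
    ),
    "fibre_link_flap": (
        "info fibre port link down\n"
        "info fibre port link up\n"
        "warning fibre port link down repeated"
    ),
}

idf, vectors = build_index(history)
ranking = rank("error write cache disabled after battery voltage low", idf, vectors)
for label, score in ranking:
    print(f"{label}: {score:.3f}")
```

Because the score depends only on overlapping terms, a new log that is incomplete or worded slightly differently can still rank the matching historical failure first, and adding a newly diagnosed failure only requires rebuilding the index from the enlarged history.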

Selected Publications

  • Pin Zhou, Binny Gill, Wendy Belluomini, and Avani Wildaniy, "GAUL: Gestalt Analysis of Unstructured Logs for Problem Troubleshooting--Learning from History", in Proceedings of the 7th Proactive Problem Prediction, Avoidance and Diagnosis Conference (IBM Academy P3AD), April 2009.

Selected Patents

  • AUTOMATED SYSTEM PROBLEM DIAGNOSING

Product Impact

GAUL has been available for use since June 2009. It has hundreds of users across the globe, covering L1/L2 support, PFE, and development. GAUL is listed as part of the official Megamouth RAS tools, and we are in the process of migrating ownership to the PFE organization.