Diagnosis of Distributed Systems
Based on Abnormal Symptom Histories
Sunggu Lee: Department of Electronic and Electrical Engineering,
Pohang University of Science and Technology (POSTECH), San 31 Hyoja Dong,
Pohang 790-784, Korea.
(TEL) +82-54-279-2236 (FAX) +82-54-279-5940 (E-Mail) slee@postech.ac.kr
(URL) http://cal.postech.ac.kr/slee1/lsg1.htm
Seung Gu Kim: Same affiliation as above.
(TEL) +82-54-279-5936 (E-Mail) kimsg@postech.ac.kr
Abstract
The main idea of this presentation is to use observations of abnormal
node/program behaviors and their times of occurrence to diagnose the state of
the target system. Just as a
variety of simple and complex symptoms, combined with information on their
times of occurrence, are carefully analyzed to diagnose a human patient's
current health, a variety of abnormal node/program behaviors (symptoms) can be
monitored and summarized into state messages that can be sent to neighboring
nodes. Each node can then use the state messages received from neighboring
nodes to maintain a state history for those nodes. This information can be used
to diagnose and isolate faulty nodes.
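
The flow described above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the names `StateMessage`, `Node`, `window`, and `threshold`, the symptom labels, and the simple rule "a neighbor is suspect once enough symptoms accumulate within a recent time window" are all assumptions made for illustration only.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StateMessage:
    """Summary of abnormal behaviors (symptoms) observed up to `timestamp`."""
    sender: str
    timestamp: float
    symptoms: tuple


@dataclass
class Node:
    name: str
    window: float = 10.0   # assumed: how far back (in seconds) history is kept
    threshold: int = 3     # assumed: symptom count that makes a neighbor suspect
    pending: list = field(default_factory=list)
    history: dict = field(default_factory=lambda: defaultdict(deque))

    def observe(self, symptom: str) -> None:
        """Record one abnormal behavior seen since the last state message."""
        self.pending.append(symptom)

    def summarize(self, now: float) -> StateMessage:
        """Pack the pending symptoms into a state message and reset them."""
        msg = StateMessage(self.name, now, tuple(self.pending))
        self.pending.clear()
        return msg

    def receive(self, msg: StateMessage) -> None:
        """Append a neighbor's message to its history; drop stale entries."""
        hist = self.history[msg.sender]
        hist.append(msg)
        while hist and msg.timestamp - hist[0].timestamp > self.window:
            hist.popleft()

    def suspects(self) -> set:
        """Neighbors whose recent history holds at least `threshold` symptoms."""
        return {
            sender
            for sender, hist in self.history.items()
            if sum(len(m.symptoms) for m in hist) >= self.threshold
        }
```

In this sketch, a node calls `observe` as symptoms occur, periodically sends `summarize(now)` to its neighbors, and each neighbor feeds the result to `receive`; `suspects()` then returns the neighbors a node would consider isolating. A real diagnosis rule would weigh symptom types and their timing rather than just counting them.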
Short Biography
Sunggu Lee: Sunggu Lee is a
Professor in the Department of Electronic and Electrical Engineering at Pohang
University of Science and Technology. Prior to this appointment, he was an
Assistant Professor in the Department of Electrical Engineering at the
University of Delaware in Newark, Delaware, U.S.A. From June 1997 to June 1998, he spent one year as a Visiting
Scientist at the IBM T. J. Watson Research Center. Sunggu Lee received the B.S.E.E. degree with highest
distinction from the University of Kansas, Lawrence, in 1985 and the M.S.E. and
Ph.D. degrees from the University of Michigan, Ann Arbor, in 1987 and 1990,
respectively. His current research interests are in mobile ad-hoc networks
(routing, time synchronization, real-time communication), cluster and grid
computing (middleware, task scheduling), and fault-tolerant computing
(fault-tolerant communication, checkpoint/restart).
Seung Gu Kim: Seung Gu Kim is a second-year master's student in the
Department of Electronic and Electrical Engineering at POSTECH. He graduated
with a B.S. from Sogang University in 2004. His research interests are in
fault-tolerant computing, with an emphasis on system-level fault diagnosis.