Mining Web Logs to Debug Distant Connectivity Problems
- Emre Kiciman
- Dave Maltz
- Moises Goldszmidt
- John Platt
ACM SIGCOMM Workshop on Mining Network Data (MineNet-06)
Published by Association for Computing Machinery, Inc.
Content providers base their business on their ability to receive and answer requests from clients distributed across the Internet. Since disruptions in the flow of these requests directly translate into lost revenue, there is tremendous incentive to diagnose why some requests fail and prod the responsible parties into corrective action. However, a content provider has only limited visibility into the state of the Internet outside its domain. Instead, it must mine failure diagnoses from available information sources to infer what is going wrong and who is responsible. Our ultimate goal is to help Internet content providers resolve reliability problems in the wide-area network that are affecting end-user-perceived reliability. We describe two algorithms that represent our first steps towards enabling content providers to extract actionable debugging information from content provider logs, and we present the results of applying the algorithms to a week’s worth of logs from a large content provider, during which time it handled over 1 billion requests originating from over 10 thousand ASes.
Copyright © 2004 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM's Digital Library - http://www.acm.org/dl/.