Declarative Distributed Debugging
Kuang Chen, Gunho Lee, Byung-Gon Chun
Motivation
• Debugging distributed systems is hard
• Logs are scattered and numerous
• Log content can vary
– Application-specific logs: aka printf statements
– Tracing: X-Trace
• Needed: a more formal and efficient way to use
logs in debugging.
State of the Art
• Centralized log processing
➢ Master “controller” node collects logs for processing
➢ Techniques for filtering/aggregation before transmission to the controller
– Log sizes and the number of nodes can be huge
– Debugging means looking for needles in a haystack
– Many logging mechanisms are ad hoc (e.g. printf
statements) and focus on how instead of what.
New Idea
• Towards more efficient and formal distributed
debugging
– Step one: declarative querying of logs
• Queries are sent out. Logs do not move
– Specify queries and dataflow declaratively
– Process queries in a distributed manner
• “Declarative” - a separation of what from how
– What: a language for specifying distributed debugging
queries and data specification for tracing the causality
of events
– How: recursive query engine (e.g. P2) and tracing
infrastructure (e.g. X-Trace)
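A minimal sketch of "queries are sent out, logs do not move", in Python rather than a declarative language. It is not P2 or OverLog: `LogRecord`, `Node`, and `distributed_query` are hypothetical names, and a plain predicate stands in for a declarative query. Each node evaluates the query against its own local log and returns only the matching records.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class LogRecord:
    """One log line (hypothetical schema, assumed for illustration)."""
    node: str
    timestamp: float
    level: str
    message: str


# Stand-in for a declarative query: a predicate saying WHAT to match.
Query = Callable[[LogRecord], bool]


class Node:
    """A machine that keeps its own log locally."""

    def __init__(self, name: str, records: Iterable[LogRecord]) -> None:
        self.name = name
        self._records = list(records)

    def run_query(self, query: Query) -> List[LogRecord]:
        # The query runs where the data lives; only matches leave the node.
        return [r for r in self._records if query(r)]


def distributed_query(nodes: Iterable[Node], query: Query) -> List[LogRecord]:
    """Send the query to every node and merge the (small) results by time."""
    matches: List[LogRecord] = []
    for node in nodes:
        matches.extend(node.run_query(query))
    return sorted(matches, key=lambda r: r.timestamp)


if __name__ == "__main__":
    cluster = [
        Node("a", [LogRecord("a", 1.0, "INFO", "start"),
                   LogRecord("a", 3.0, "ERROR", "disk full")]),
        Node("b", [LogRecord("b", 2.0, "ERROR", "timeout")]),
    ]
    for rec in distributed_query(cluster, lambda r: r.level == "ERROR"):
        print(rec.timestamp, rec.node, rec.message)
```

In this sketch only the two ERROR records cross node boundaries; the INFO record never leaves node "a". A real system would replace the loop over nodes with network dispatch.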
Risks
• Query language - (OverLog, NDLog, DPLog …coincidence?)
– Sufficient for expressing logging and tracing queries?
• Implementation
– P2 has limited in-memory persistence
– Need datacenter specific (potentially real-time) X-Trace data
• e.g. Hadoop X-Trace data
• Success metrics
– Working prototype
• e.g. creates causality graph of debugging use case
– Show that declarative query specification is concise and easy
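To make the "causality graph" success metric concrete, here is a rough Python sketch. It assumes each trace event carries an operation id and the id of the operation that caused it. This record shape is a simplification chosen for illustration, not the actual X-Trace format, and all function names are hypothetical. The reachability step is the kind of recursive query (transitive closure) that a recursive query engine would evaluate.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple

# (op_id, parent_op_id or None, label): hypothetical event shape.
Event = Tuple[str, Optional[str], str]


def build_causality_graph(events: Iterable[Event]) -> Dict[str, Set[str]]:
    """Map each operation to the set of operations it directly caused."""
    children: Dict[str, Set[str]] = defaultdict(set)
    for op_id, parent_id, _label in events:
        if parent_id is not None:
            children[parent_id].add(op_id)
    return children


def caused_by(graph: Dict[str, Set[str]], start: str) -> Set[str]:
    """All operations transitively caused by `start` (transitive closure)."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        current = stack.pop()
        for nxt in graph.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


if __name__ == "__main__":
    events = [
        ("op1", None, "client request"),
        ("op2", "op1", "master schedules task"),
        ("op3", "op2", "worker reads block"),
        ("op4", "op2", "worker writes output"),
        ("op5", None, "unrelated heartbeat"),
    ]
    graph = build_causality_graph(events)
    print(sorted(caused_by(graph, "op1")))
```

With the sample events, everything downstream of the client request is reachable from "op1", while the unrelated heartbeat "op5" is not.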
The Plan
Due   | Task                                         | Description                                     | Deliver
10/12 | Literature review; understand related system | Read papers; play with P2 and X-Trace           | Initial Proposal
10/19 | Design brainstorming                         | Find debugging use cases                        | Requirement document
10/26 | Initial design; base implementation          | Integrate X-Trace (or its data) into P2         | Base system; Revised Proposal
11/02 | Further design                               | Implement basic use case; define advanced use case | Design document
11/09 | Basic query                                  | Deployment!                                     | Running example
11/16 | Advanced query; design refinement            | Implement advanced use case                     | Advanced example
11/23 | Analysis                                     | Evaluate results; make poster                   | Project Poster
11/30 | Summarize                                    | Discussion; write report                        | Final Report

DPL: Distributed Processing of Logs (for Debugging