Department of Computer Sciences Dynamic Shape Analysis via Degree Metrics Maria Jump & Kathryn S. McKinley Department of Computer Sciences The University of Texas at Austin {mjump,[email protected] Software is Dynamic Software is always changing New expectations New algorithms New applications New users New notions of what is cool Limited only by human ambition First Law of Software: Software is a gas! It expands to fit the container it is in! [Nathan Myhrvold, Former CTO, Microsoft] 20-Jun-2009 Department of Computer Sciences Jump & McKinley Software Complexity Property of a system that is directly proportional to the difficulty one has in comprehending the system at a level and detail necessary to make changes without introducing instability or functional regressions [Peter Rosser, Microsoft Software Engineer] Developed by large teams Few, if any, understand all parts Need tools to help 20-Jun-2009 Department of Computer Sciences Jump & McKinley Program Analysis Program = Code + Data Software complexity leads to heap complexity 20-Jun-2009 Department of Computer Sciences Jump & McKinley Heap Complexity Exacerbated by modern languages Objects are smaller and more numerous Most objects allocated on the heap Objects encode program state Heap contains semantic, memory, and concurrency bugs Program analysis is incomplete without heap analysis 20-Jun-2009 Department of Computer Sciences Jump & McKinley An Opportunity Automatic memory management examines live data on a regular basis Understand heap usage Detect heap-based bugs Optimize program based on heap usage Opportunity to perform heap analysis dynamically 20-Jun-2009 Department of Computer Sciences Jump & McKinley Shape Analysis Characterize the “shape” of heap-allocated pointer-based data structures and verify shape-preserving properties 20-Jun-2009 Department of Computer Sciences Jump & McKinley Shape Analysis Static Dynamic Conservatively characterizes Discover shape of current all “possible” shapes at every program point Only works on small programs Program = Code 20-Jun-2009 Department of Computer Sciences + data structure Generate assertions for verification Monitor shape during entire execution Data Jump & McKinley Shape Analysis HomeBrewed Custom Library Library % of Heap 100 LinkedHashMap: 99.3% 80 60 BinaryTree: 35.8% HashMap: 59.5% 40 20 OctTree: 98% 20-Jun-2009 Department of Computer Sciences Custom: 33% Library: 58% averag e x a la n pm d lus e a r ch luin d e x jy th o n fo p e clip s e ch a r t b lo a t a n tlr ja c k m p e g a u d io ja v a c r a y tr a ce je s s co m p r e ss 0 Jump & McKinley ShapeUp Desired goals Discover shape of current data structure Discover dynamic invariants Monitor expected shape of current data structure during execution Heap summarization graph [POPL ‘07] Class field-wise graph (CFWG) Identifies recursive data structures (RDS) Summarize dynamic degree invariants 20-Jun-2009 Department of Computer Sciences Jump & McKinley Class Field-Wise Graph (CFWG) Heap CFWG Home-brewed data structure from SPECjbb2000 (simplified) 20-Jun-2009 Department of Computer Sciences Jump & McKinley Class Field-Wise Graph (CFWG) Heap CFWG Home-brewed data structure from SPECjbb2000 (simplified) 20-Jun-2009 Department of Computer Sciences Jump & McKinley Class Field-Wise Graph (CFWG) Heap CFWG Degree Metrics Home-brewed data structure from SPECjbb2000 (simplified) 20-Jun-2009 Department of Computer Sciences Jump & McKinley Degree Metrics Degree metrics per instance: Out-degree: # of outgoing references In-degree: # of incoming references Only count edges between objects of same class Use word in header to track in-degree of backbone objects Summarize by class per data structure instance in a degree profile 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=1 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=1 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=1 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=2 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=2 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=2 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=3 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=3 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=3 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=4 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=5 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=6 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=6 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=6 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=7 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=8 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=9 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=9 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=9 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=10 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=10 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=10 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=11 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=12 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=13 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=13 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=13 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=14 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile n=15 CFWG 20-Jun-2009 Department of Computer Sciences Jump & McKinley Calculating Degree Profile constant n CFWG range 20-Jun-2009 Complete Binary Tree Department of Computer Sciences Jump & McKinley Calculating Degree Profile constant CFWG range left 20-Jun-2009 Complete Binary Tree Department of Computer Sciences right Jump & McKinley Interpreting Degree Profile Monitoring degree profile shows errors: Error: constant violation Warning: range violation [min-,max+] Indicate problem with RDS during development and testing Monitor shape after deployment for dynamically introduced errors What kind of errors can be detected? 20-Jun-2009 Department of Computer Sciences Jump & McKinley Implementation and Methodology Jikes RVM SPECjvm98, DaCapo, SPECjbb2000 Heap composition Showed degree metric are stable by class Overheads <8% total time <1% space overhead [used bits in header] Microbenchmarks Single RDS in isolation Random error injection 20-Jun-2009 Department of Computer Sciences Singly-linked list Doubly-linked list Binary Tree Binary Tree w/ PP Jump & McKinley Looking for errors TRAINING Ran microbenchmark 100 times Merged together dynamic invariants ERROR DETECTION Created RDS with 100,000 nodes Errors: 1, 2, 3, 4, 5, 10, 50, and 100 Detected dynamic invariant violations 20-Jun-2009 Department of Computer Sciences Jump & McKinley Doubly-linked List =0 =1 =2 in 0 2 n-2 out 0 2 n-2 circle disconnect 20-Jun-2009 Department of Computer Sciences cyclic skip Jump & McKinley Doubly-linked List 1 2 =0 =1 =2 in 0 2 n-2 out 0 2 n-2 3 4 5 10 50 100 100 % D e te ct io n 90 80 70 60 50 40 30 20 10 20-Jun-2009 Department of Computer Sciences Ra nd o m S kip C y c lic D is co n ne ct C irc le 0 Jump & McKinley Binary Tree =0 =1 =2 Complete Binary Tree in out 1 n-1 0 [50.0,50.2] 0 [49.8,50.0] Full Binary Tree in out 1 n-1 0 [50.0,50.1] 0 [49.9,50.0] Random Binary Tree in out 1 n-1 0 [33.6,35.7] [28.7,35.1] [32.3,35.6] link 20-Jun-2009 Department of Computer Sciences Jump & McKinley Binary Tree =0 =1 =2 Complete Binary Tree in out 1 n-1 0 [50.0,50.2] 0 [49.8,50.0] Full Binary Tree in 1 2 3 4 5 10 50 out 100 100 n-1 0 [50.0,50.1] 0 [49.9,50.0] Random Binary Tree 90 80 in 70 60 out 50 1 n-1 0 [33.6,35.7] [28.7,35.1] [32.3,35.6] 40 30 20 10 B TR a n dom B T F u ll 0 B T C o m p le te % D e te ct ion 1 20-Jun-2009 Department of Computer Sciences Jump & McKinley Binary Tree w/ PP =1 =2 =3 Complete Binary Tree w/ PP in [50.0,50.0] 2 [49.9,50.0] out [50.0,50.0] 2 [49.9,50.0] link Full Binary Tree w/ PP in [50.0,50.1] 1 [49.9,50.0] out [50.0,50.1] 1 [49.9,50.0] Random Binary Tree w/ PP in [33.8,35.2] [30.0,32.6] [33.6,34.8] out [33.8,35.2] [30.0,32.6] [33.6,34.8] disconnect 20-Jun-2009 Department of Computer Sciences Jump & McKinley 20-Jun-2009 Department of Computer Sciences R an d o m B TP R a n d o m R an d o m 50 B TP F u ll R an d o m 10 B TP C o m p le te D isc on n e ct 5 B TP R a n d o m D isc on n e ct 4 B TP F u ll 3 D isc on n e ct 2 B TP C o m p le te L ink 1 B TP R a n d o m B TP F u ll L in k L ink B TP C o m p le te % D e te c tio n Microbenchmark: Binary Tree w/ PP 100 100 90 80 70 60 50 40 30 20 10 0 Jump & McKinley ShapeUp’s Contributions Performs dynamic heap analysis to determine shape of current data structure Summarizes degree metrics using a class field-wise graph with low overheads Shows whole-heap degree metrics are not stable, but class degree metrics are Introduces degree profiles for RDS Regular RDS have more invariants than random RDS Degree profiles can detect some shape errors Thank You! 20-Jun-2009 Department of Computer Sciences Jump & McKinley Thank You! 20-Jun-2009 Department of Computer Sciences Jump & McKinley Space Overhead jess # of types bm+VM Eclipse Geomean 1744 3365 avg 318 667 max 319 775 # of avg edges max 844 4090 861 7585 1142 0.094% 0.167% 0.233% 0.233% Increased Alloc % 20-Jun-2009 Department of Computer Sciences 19% 2.7X 1747 334 346 904 Jump & McKinley Normalized Total Time Time Overhead Heap Size Relative to Minimum 20-Jun-2009 Department of Computer Sciences Jump & McKinley Singly-linked List =0 =1 in 1 n-1 out 1 n-1 1 Error head to tail cyclic := creates a cycle from tail to random node 90 80 Detection % circle := attaches 100 70 60 50 40 30 20 10 20-Jun-2009 Department of Computer Sciences SL L C y c lic SL L C ir c le 0 Jump & McKinley Microbenchmark: HashMap =0 =1 =2 in [31.9,51.6] [48.4,68.1] {0} out [31.9,51.6] [48.4,68.1] {0} 1 2 3 4 5 10 50 100 100 90 link := creates a connection between buckets from a null ptr to a random entry 80 70 60 50 40 30 20 10 H M Lin k 0 20-Jun-2009 Department of Computer Sciences Jump & McKinley Data Structures in Benchmarks Benchmark RDS Node Raytrace OctNode H 78.0 Jack LinkedHashMap$LinkedHashEntry L 44.2 RuntimeNfaState H 9.4 Antlr Object[] H 76.2 Bloat HashMap$HashEntry L 56.6 CallMethodExpr H 3.7 LinkedHashmap$LinkedHashEntry L 59.0 AND_AND_Expression H 0.4 HashMap$HashEntry L 51.4 PropertyList H 4.4 Jython Pyframe H 94.6 Luindex LinkedHashMap$LinkedHashEntry L 99.3 Lusearch WeakHashMap$WeakBucket L 47.5 HitDoc H 2.0 ChildIterator H 34.6 Eclipse Fop Xalan 20-Jun-2009 Department of Computer Sciences L or H? % Jump & McKinley

Descargar
# Cork: Dynamic Memory Leak Detection with Garbage …