US20120203992A1 - Parallel, single-pass compaction in a region-based garbage collector - Google Patents
- Publication number
- US20120203992A1 (U.S. application Ser. No. 13/023,437)
- Authority
- US
- United States
- Prior art keywords
- regions
- region
- compaction
- move
- computer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
- G06F12/023—Free address space management
- G06F12/0253—Garbage collection, i.e. reclamation of unreferenced memory
- G06F12/0269—Incremental or concurrent garbage collection, e.g. in real-time systems
- the invention has been developed in response to the present state of the art and, in particular, in response to the problems and needs in the art that have not yet been fully solved by currently available apparatus and methods. Accordingly, the invention has been developed to provide apparatus and methods to implement parallel, single-pass compaction in a region-based garbage collector. The features and advantages of the invention will become more fully apparent from the following description and appended claims, or may be learned by practice of the invention as set forth hereinafter.
- a method to implement parallel, single-pass compaction in a garbage collector includes conducting a planning phase for multiple regions to be compacted. During the planning phase, the method determines new locations for data entities in the multiple regions. The method then performs a move phase for the multiple regions to move the data entities to their new locations. During the move phase, the method initiates multiple compaction threads to move the data entities to their new locations. While executing, the compaction threads dynamically build a dependency graph of the regions being compacted. The dependency graph guarantees that no data entity is moved to its new location until all data entities that it overwrites have been moved to their new locations.
- FIG. 1 is a high-level block diagram showing one example of a computer system suitable for use with various embodiments of the invention
- FIG. 2 is a high-level block diagram showing one example of an object-oriented managed runtime, in this example the Java Virtual Machine, incorporating a compaction module in accordance with the invention
- FIGS. 3A through 3D show a general compaction process on a region
- FIG. 4 shows one example of a compaction module and various related data structures used to create a dependency graph
- FIG. 5 shows one embodiment of a method for implementing parallel, single-pass compaction on regions of an object heap
- FIG. 6 shows various conventions for use in association with FIGS. 7 through 16;
- FIGS. 7 through 16 show a simple specific example of operation of the compaction module using a pair of compaction threads (Thread 1 and Thread 2 ), a work stack (Work Stack), and four regions (Regions A, B, C, and D).
- the present invention may be embodied as an apparatus, system, method, or computer program product.
- the present invention may take the form of a hardware embodiment, a software embodiment (including firmware, resident software, microcode, etc.) configured to operate hardware, or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “module” or “system.”
- the present invention may take the form of a computer-usable storage medium embodied in any tangible medium of expression having computer-usable program code stored therein.
- the computer-usable or computer-readable storage medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable storage medium may include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CDROM), an optical storage device, or a magnetic storage device.
- a computer-usable or computer-readable storage medium may be any medium that can contain, store, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
- Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, C++, or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages.
- Computer program code for implementing the invention may also be written in a low-level programming language such as assembly language.
- the present invention may be described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus, systems, and computer program products according to various embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, may be implemented by computer program instructions or code.
- the computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
- the computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.
- the computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
- Referring to FIG. 1, one example of a computer system 100 is illustrated.
- the computer system 100 is presented to show one example of an environment where an apparatus and method in accordance with the invention may be implemented.
- the computer system 100 is presented only by way of example and is not intended to be limiting. Indeed, the apparatus and methods disclosed herein may be applicable to a wide variety of different computer systems in addition to the computer system 100 shown. The apparatus and methods disclosed herein may also potentially be distributed across multiple computer systems 100 .
- the computer system 100 includes at least one processor 102 and may include more than one processor.
- the processor 102 includes one or more registers 104 storing data describing the state of the processor 102 and facilitating execution of software systems.
- the registers 104 may be internal to the processor 102 or may be stored in a memory 106 .
- the memory 106 stores operational and executable data that is operated upon by the processor 102 .
- the memory 106 may be accessed by the processor 102 by means of a memory controller 108 .
- the memory 106 may include volatile memory (e.g., RAM) as well as non-volatile memory (e.g., ROM, EPROM, EEPROM, hard disks, flash memory, etc.).
- the processor 102 may be coupled to additional devices supporting execution of software and interaction with users.
- the processor 102 may be coupled to one or more input devices 110 , such as a mouse, keyboard, touch screen, microphone, or the like.
- the processor 102 may also be coupled to one or more output devices such as a display device 112 , speaker, or the like.
- the processor 102 may communicate with one or more other computer systems by means of a network 114 , such as a LAN, WAN, or the Internet. Communication over the network 114 may be facilitated by a network adapter 116 .
- Referring to FIG. 2, one example of an object-oriented managed runtime, in this example the Java Runtime Environment, is illustrated.
- the Java Runtime Environment is presented to show one example of a runtime environment in which various embodiments of the invention may operate. Nevertheless, the compaction techniques disclosed herein are not limited to the Java Runtime Environment but may operate or be adapted to operate in any object-oriented managed runtime that uses an object heap to store objects.
- Other non-limiting examples of runtime environments in which embodiments of the invention might operate include the Microsoft Common Language Runtime (CLR) and Smalltalk runtime.
- a Java Virtual Machine 202 may be configured to operate on a specific platform, which may include an underlying hardware and operating system architecture 204 , 206 .
- the Java Virtual Machine 202 receives program code 200 , compiled to an intermediate form referred to as “bytecode” 200 .
- the Java Virtual Machine 202 translates this bytecode 200 into native operating system calls and machine instructions for execution on the underlying platform 204 , 206 .
- the bytecode 200 is compiled once to operate on all Java Virtual Machines 202 .
- a Java Virtual Machine 202, by contrast, may be tailored to the underlying hardware and software platform 204, 206. In this way, the Java bytecode 200 may be considered platform independent.
- the primary memory resource is a garbage-collected object heap 210 .
- the object heap 210 provides memory for objects, each of which is an instance of a class.
- a garbage collection module 208 or subsystem 208 , is provided in the Java Runtime Environment to reclaim memory occupied by objects, or classes associated with objects, that are no longer in use by a program.
- the garbage collection module 208 frees a programmer from worrying about releasing objects that are no longer needed, which would otherwise require significant design effort from the programmer.
- the garbage collection module 208 may, in certain embodiments, be configured to operate in an incremental manner. That is, the garbage collection module 208 may be configured to perform garbage collection processes on a portion of the object heap 210 at a time, as opposed to the entire object heap 210 . In order to accomplish this, the object heap 210 may be divided into a number of regions (e.g., hundreds or even thousands of regions). The garbage collection module 208 may then perform garbage collection on a subset of these regions at any particular time. This incremental garbage collection process minimizes, as much as possible, disruption of the main program.
- the garbage collection module 208 includes a compaction module 212 to perform compaction algorithms on the object heap 210 .
- FIGS. 3A through 3D provide an example of a general compaction process on a region 300 of an object heap 210 .
- a region 300 may initially contain one or more objects 302 .
- the garbage collection module 208 may traverse a program graph, starting with root objects, to discover all live (reachable) objects 302 a in the region 300 .
- the garbage collection module 208 marks each live object 302 a that it encounters during the traversal, indicating that the object 302 a is in use. All remaining objects 302 b are considered dead (unreachable).
- the garbage collection module 208 then frees the memory occupied by the dead objects 302 b , effectively clearing the dead objects 302 b from memory, as shown in FIG. 3C .
- the compaction module 212 may then compact the remaining live objects 302 a by moving the objects toward the low address of the region 300 to create a large contiguous space 304 , as shown in FIG. 3D . New objects may then be allocated within the large contiguous space 304 .
- certain objects 302 a may be moved within regions 300 , while other objects 302 a may be moved from one region 300 to another.
- care must be taken to ensure that live objects 302 a are not overwritten by other live objects 302 a .
- the compaction module 212 builds a dependency graph of the regions 300 being compacted to ensure that no live object 302 a is moved to its new location until all live objects 302 a that it overwrites have been moved to their new locations.
- the compaction module 212 may be configured to discover dependencies and generate the dependency graph dynamically as objects 302 a are moved to their new locations. The manner in which dependencies are discovered and used to build a dependency graph will be discussed in more detail hereafter.
- the compaction module 212 includes a planning module 400 and a move module 402 .
- the planning module 400 may conduct the planning phase of the compaction process, wherein new locations of live objects are determined.
- the planning module 400 may conduct the planning phase in the manner described in the Abuaiadh or Kermany articles discussed above.
- the move module 402 may conduct the move phase of the compaction process, wherein live objects are moved to their new locations.
- the move module 402 may initiate multiple compaction threads, each of which may operate in parallel, to compact the regions 300 .
- the planning module 400 and move module 402 may interact with various data structures 406 .
- the planning module 400 may store the new addresses of live objects in an address table 408 .
- the compaction threads initiated by the move module 402 may look up new addresses in the address table 408 in order to move live objects to their new locations.
- the compaction threads may interact with a work stack 410 , and region properties 412 associated with each region 300 .
- the work stack 410 and region properties 412 together create the dependency graph 424 discussed herein.
- the work stack 410 contains a list of regions 300 to be compacted. When looking for work to perform, the compaction threads pop regions from the work stack 410 .
- a work stack monitor 404 controls access to the work stack 410 , ensuring that only one compaction thread can access the work stack 410 at any particular time.
- a compaction thread will wait on the work stack monitor 404 if there is no work available. When a compaction thread adds new work to the work stack 410 , or it is the last thread running and finds there is no work on the work stack 410 , the compaction thread will notify all compaction threads that are waiting on the work stack monitor 404 . This will allow the waiting threads to either pop work from the work stack 410 or, if no work is available, exit the work stack monitor 404 and return from the work stack 410 (since this means that the compaction process is complete).
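The monitor protocol described above can be sketched as follows. This is a minimal, illustrative Java sketch, not the patented implementation; the class name `WorkStack` and method name `popOrWait` are assumptions, and regions are represented generically.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class WorkStack<R> {
    private final Deque<R> regions = new ArrayDeque<>();
    private final int threadCount;   // number of participating compaction threads
    private int waiting = 0;         // count of threads waiting on this monitor
    private boolean isDone = false;  // set by the last running thread

    public WorkStack(int threadCount) { this.threadCount = threadCount; }

    /** Push a region and wake any threads waiting for work. */
    public synchronized void push(R region) {
        regions.push(region);
        notifyAll();
    }

    /** Pop a region to compact; wait if none is available; return null once
     *  the last running thread finds no work (compaction is complete). */
    public synchronized R popOrWait() {
        while (true) {
            if (isDone) return null;
            if (!regions.isEmpty()) return regions.pop();
            if (waiting == threadCount - 1) { // this is the last running thread
                isDone = true;
                notifyAll();                  // release every waiting thread
                return null;
            }
            waiting++;
            try { wait(); } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            waiting--;
        }
    }

    public static void main(String[] args) {
        WorkStack<String> ws = new WorkStack<>(1);
        ws.push("Region A");
        System.out.println(ws.popOrWait()); // prints Region A
        System.out.println(ws.popOrWait()); // prints null
    }
}
```

Note that a thread returning from `popOrWait` with `null` corresponds to exiting the work stack monitor and returning from the work stack in the text.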
- the work stack 410 includes a pointer to a list of regions 300 it contains, a count of waiting threads, and an “isDone” flag.
- One thread may initially populate the work stack 410 with all the regions 300 that are participating in the compaction process.
- the work stack 410 may be built as a side-effect of the planning algorithm in order to eliminate the need for this additional serialization. In either case, the operation is low-cost since it blindly pushes all regions 300 onto the work stack 410 , without attempting to infer dependency therebetween.
- When a compaction thread finishes processing a region 300 , it either discards the region (if processing is complete) or adds the region to another region's blockedRegionList 422 .
- a region 300 will be pushed back onto the work stack 410 after being removed from a now-completed region's blockedRegionList 422 . Since objects from a region 300 may be moved to several different regions 300 , it is possible that a region 300 will end up moving through the work stack 410 and a blockedRegionList 422 multiple times during a single compaction cycle.
- a region 300 is a logical unit of compaction work and may include the illustrated properties 412 .
- the lowAddress property 414 is the base address of the region 300 . This property 414 is constant during compaction.
- the highAddress property 416 is the address of the first byte after the region 300 , which is also constant during compaction.
- the nextToMove property 418 is the address of the next object to be moved from the region 300 . This property 418 starts at the lowAddress 414 and becomes equal to the highAddress 416 when all of the region's objects have been moved.
- the nextRegionInList property 420 points to the next region 300 in the same list as a region 300 (the list may include either the work stack 410 or another region's blockedRegionList).
- the blockedRegionList property 422 points to a list of regions 300 which are blocked on a region 300 and cannot be processed until compaction of the region 300 is complete.
- the compaction of a region 300 generally progresses from lowAddress 414 to highAddress 416 and continually updates the nextToMove pointer 418 to point at the next object to be moved. Initially, the nextToMove pointer 418 points at the lowAddress 414 and eventually reaches the highAddress 416 when the compaction of the region 300 is complete.
- a region 300 becomes dynamically dependent on a target region 300 when the destination address of nextToMove plus nextToMove's instance size is greater than the nextToMove pointer of the target region containing the destination address. When this situation occurs, the region's compaction is suspended and the region 300 is added to the blockedRegionList 422 of the target region 300 .
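The region properties and the blocking condition just described can be sketched as follows. This is an illustrative Java sketch; the field names follow the text, while `isComplete` and `wouldOverwriteUnprocessed` are assumed helper names, and addresses are modeled as plain `long` values.

```java
public class Region {
    final long lowAddress;     // base address of the region; constant during compaction
    final long highAddress;    // first byte after the region; constant during compaction
    long nextToMove;           // address of the next object to be moved
    Region nextRegionInList;   // next region in the work stack or a blockedRegionList
    Region blockedRegionList;  // head of the list of regions blocked on this region

    Region(long lowAddress, long highAddress) {
        this.lowAddress = lowAddress;
        this.highAddress = highAddress;
        this.nextToMove = lowAddress;   // compaction progresses from the low end
    }

    /** True when every object has been moved out of this region. */
    boolean isComplete() { return nextToMove == highAddress; }

    /** The blocking test: moving an object of objectSize bytes to
     *  newTargetAddress would overwrite data not yet moved out of target. */
    static boolean wouldOverwriteUnprocessed(Region target, long newTargetAddress, long objectSize) {
        return target.nextToMove < newTargetAddress + objectSize;
    }

    public static void main(String[] args) {
        Region target = new Region(0, 100);
        target.nextToMove = 40;       // bytes below 40 are already evacuated
        System.out.println(wouldOverwriteUnprocessed(target, 30, 8)); // prints false
        System.out.println(wouldOverwriteUnprocessed(target, 36, 8)); // prints true
    }
}
```

When the test returns true, the moving region is suspended and appended to the target region's blockedRegionList, as described above.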
- Referring to FIG. 5, a method 500 for implementing parallel, single-pass compaction in a garbage collector is illustrated. Such a method 500 may be executed in parallel by each of the compaction threads after the planning phase has been completed.
- When the method 500 begins, no region 300 is known to be blocked on any other region 300 , each region 300 has its nextToMove pointer 418 set to the first object it contains, and all regions 300 to be compacted are contained in the monitor-protected work stack 410 .
- a compaction thread initially acquires 502 the work stack monitor 404 , which provides access to the work stack 410 .
- the compaction thread determines 504 whether the work stack 410 is empty. If the work stack 410 is not empty, the compaction thread pops 506 a region from the work stack 410 and releases 506 the work stack monitor 404 .
- the compaction thread looks up 508 the region's nextToMove pointer 418 to find the next object to be moved from the region 300 . Upon finding this object, the compaction thread determines 508 the object size (objectSize) and resolves 508 the new address of the object (this returns the newTargetAddress).
- the compaction thread finds 510 the region (targetRegion) containing the newTargetAddress. Upon finding 510 the targetRegion, the compaction thread determines 512 whether the nextToMove pointer of the targetRegion is less than the objectSize added to the newTargetAddress. This step 512 essentially determines 512 whether moving the object will overwrite data that has yet to be processed in the targetRegion.
- If the object will overwrite unprocessed data in the targetRegion, the compaction thread enters 514 the work stack monitor 404 and attaches 514 the current region to the targetRegion's blockedRegionList. If the object will not overwrite unprocessed data in the targetRegion, the compaction thread moves 516 the object to the newTargetAddress in the targetRegion and updates 516 the nextToMove pointer of the current region 300 to the next object in the region 300 . The compaction thread then determines 518 whether the object that was moved was the last object in the region 300 . If not, the compaction thread repeats steps 508, 510, 512, 514, 516 for the next object in the region 300 .
- When the compaction thread determines that an object is the last object in a region 300 , the compaction thread enters 520 the work stack monitor 404 and pushes 520 each region found in the region's blockedRegionList onto the work stack 410 .
- This step 520 essentially unblocks 520 any regions that are blocked on the current region 300 , since processing of the current region 300 is complete.
- Once the regions in the blockedRegionList are pushed onto the work stack 410 , the compaction thread notifies 520 any waiting compaction threads to wake up the threads and make them aware that new work is on the work stack 410 .
- the compaction thread determines 504 whether the work stack 410 is empty. If the work stack 410 is not empty, the compaction thread pops 506 the next region from the work stack 410 and processes the region in the manner previously described. If the work stack 410 is empty, the compaction thread determines 522 whether it is the last compaction thread not waiting on the work stack monitor 404 . If it is the last compaction thread not waiting, the compaction thread sets 528 the “isDone” flag to true (indicating that the compaction process has been completed for all participating regions 300 ) and notifies 528 all waiting compaction threads. The compaction thread then exits 530 the work stack monitor 404 and returns 530 from the work stack 410 . Any waiting compaction threads that are notified 528 will wake up and see that the isDone flag is set, allowing them to exit the work stack monitor 404 and return from the work stack 410 .
- If a compaction thread is not the last thread not waiting on the work stack monitor 404 , the compaction thread simply waits 524 on the work stack monitor 404 .
- Upon waking, the compaction thread determines 526 whether the isDone flag is set. If the isDone flag is not set, the compaction thread looks 504 for work on the work stack 410 in the manner previously described. If the isDone flag is set, however, the compaction thread exits 530 the work stack monitor 404 and returns 530 from the work stack 410 .
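The move-phase loop of method 500 can be sketched, single-threaded and with planning results precomputed, roughly as follows. This is an illustrative simplification, not the claimed method: the work stack is an ordinary deque, the blockedRegionList is a plain list, objects are (oldAddr, size, newAddr) triples, and same-region moves are assumed to use an overlap-safe copy rather than the blocking test.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class MovePhase {

    /** A live object with its planned destination (from the planning phase). */
    static class Obj {
        final int oldAddr, size, newAddr;
        Obj(int oldAddr, int size, int newAddr) {
            this.oldAddr = oldAddr; this.size = size; this.newAddr = newAddr;
        }
    }

    static class Region {
        final int low, high;              // [low, high) address range
        final List<Obj> objs;             // live objects in address order
        int next = 0;                     // index of the next object to move
        final List<Region> blocked = new ArrayList<>(); // simplified blockedRegionList

        Region(int low, int high, List<Obj> objs) {
            this.low = low; this.high = high; this.objs = objs;
        }
        boolean complete() { return next == objs.size(); }
        /** Address of the next un-moved object (the nextToMove pointer). */
        int frontier() { return complete() ? high : objs.get(next).oldAddr; }
    }

    static Region regionOf(List<Region> regions, int addr) {
        for (Region r : regions)
            if (addr >= r.low && addr < r.high) return r;
        throw new IllegalArgumentException("address outside all regions");
    }

    /** Compact all regions; returns destinations in the order objects moved. */
    static List<Integer> run(List<Region> regions) {
        Deque<Region> work = new ArrayDeque<>(regions); // blindly push all regions
        List<Integer> moved = new ArrayList<>();
        while (!work.isEmpty()) {
            Region r = work.pop();
            boolean suspended = false;
            while (!r.complete()) {
                Obj o = r.objs.get(r.next);
                Region target = regionOf(regions, o.newAddr);
                // Block if the destination range reaches past the target's
                // nextToMove pointer (data there has not yet been moved out).
                if (target != r && target.frontier() < o.newAddr + o.size) {
                    target.blocked.add(r);  // suspend r until target completes
                    suspended = true;
                    break;
                }
                moved.add(o.newAddr);       // "copy" the object to newAddr
                r.next++;
            }
            if (!suspended) {               // r complete: unblock its dependents
                work.addAll(r.blocked);
                r.blocked.clear();
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        // Region A's object at 80 must move to 0 before Region B's object
        // (destined for 75..85) can be copied into A.
        Region a = new Region(0, 100, List.of(new Obj(80, 10, 0)));
        Region b = new Region(100, 200, List.of(new Obj(150, 10, 75)));
        System.out.println(run(List.of(b, a))); // prints [0, 75]
    }
}
```

Even though Region B is popped first in this example, it suspends itself onto Region A's blocked list and is only resumed, and its object moved, after Region A has been fully evacuated.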
- Referring to FIG. 6, various conventions are shown for use in association with FIGS. 7 through 16 .
- the boxes shown in FIG. 6 represent regions 300 .
- FIG. 6 shows boxes representing Regions A, B, and C.
- the arrows originating from the bottom of a box represent the nextRegionInList.
- Region B is the next region in the same list as Region A.
- Arrows originating from the right of a region represent a blockedRegionList.
- Region C is blocked until the processing of region A is complete.
- Referring to FIGS. 7 through 16 , a simple example of the compaction process described hereinabove is illustrated.
- This very simple example assumes a pair of compaction threads (Thread 1 and Thread 2 ), a work stack (Work Stack), and four regions (Regions A, B, C, and D).
- a system might have hundreds or even thousands of regions.
- this example is intended to describe the core elements of the novel compaction process on a simpler scale.
- FIG. 7 shows the state of the system prior to beginning the compaction process.
- Regions A, B, C, and D are in the work stack 410 , with no regions blocked on any other regions. Both threads are idle but ready to perform work.
- As shown in FIG. 8, both threads pop a region from the work stack 410 and begin compacting the regions. Thus, Thread 1 begins to process Region A and Thread 2 begins to process Region B.
- Thread 2 discovers that Region D depends on Region B being completed. As a result, Region D is added to Region B's blockedRegionList. This creates a multi-level dependency where Region D is indirectly blocked on Region C. Upon adding Region D to Region B's blockedRegionList, Thread 2 finds the work stack 410 empty and thus waits on the work stack monitor 404 , as shown in FIG. 11 .
- When Thread 1 finishes processing Region C and discards it from the system, it pushes both of Region C's blocked regions (Regions A and B) onto the work stack 410 , as shown in FIG. 12. Threads 1 and 2 then pop these regions from the work stack 410 and begin compacting them, as shown in FIG. 13. Thread 1 finishes processing Region A and discards it from the system. Likewise, Thread 2 finishes processing Region B and discards it from the system. Upon completing Region B, Thread 2 pushes its blocked region (Region D) onto the work stack 410 , as shown in FIG. 14. Thread 1 then pops Region D from the work stack 410 and finishes processing it. Thread 2 waits for more work to appear in the work stack 410 .
- Once Thread 1 processes Region D and discards it from the system, Thread 1 will realize that it is the last thread still running (not waiting). In response, Thread 1 sets the isDone flag, notifies Thread 2 , and exits the compaction process. Thread 2 wakes up, sees that the isDone flag is set, and exits the compaction process.
- the compaction process could also be applied to data entities having different granularities.
- the compaction process could be applied to small groups of contiguous objects, thereby reducing the need to explicitly track the new address of every object individually.
- the compaction process could consider the size of the group, instead of a single object, to determine if it will fit at its new location.
- the nextToMove pointer would also require less frequent updates as it would point to the base of an object group instead of an individual object.
- the above-described compaction process could be modified such that regions in a blockedRegionList don't require compaction to be completed on the target region before being pushed back onto the work stack 410 . Because regions may depend on particular parts of the target region being processed, rather than the entire target region being processed, these regions could be pushed back onto the work stack 410 as the target region's compaction progresses, rather than after the compaction of the target region has entirely completed.
- the blockedRegionList could be sorted by blocking address to make checking the list efficient.
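One way to realize such a sorted list is to key each blocked region by the address its pending move needs evacuated. The following is an illustrative Java sketch of this variation, using a TreeMap in place of a sorted linked list; the class and method names are assumptions, and regions are represented by plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class BlockedRegionIndex {
    // blocking address (newTargetAddress + objectSize) -> regions waiting on it
    private final TreeMap<Long, List<String>> blocked = new TreeMap<>();

    /** Record that region is blocked until the target's nextToMove
     *  pointer reaches blockingAddress. */
    public void block(String region, long blockingAddress) {
        blocked.computeIfAbsent(blockingAddress, k -> new ArrayList<>()).add(region);
    }

    /** Called as the target region's nextToMove pointer advances: returns the
     *  regions whose blocking address has now been evacuated, in address order. */
    public List<String> release(long nextToMove) {
        List<String> ready = new ArrayList<>();
        NavigableMap<Long, List<String>> head = blocked.headMap(nextToMove, true);
        for (List<String> regions : head.values()) ready.addAll(regions);
        head.clear();   // clearing the view removes the entries from the map
        return ready;
    }

    public static void main(String[] args) {
        BlockedRegionIndex idx = new BlockedRegionIndex();
        idx.block("Region A", 40);
        idx.block("Region B", 90);
        System.out.println(idx.release(50)); // prints [Region A]
        System.out.println(idx.release(95)); // prints [Region B]
    }
}
```

Because only a prefix of the sorted structure is examined on each advance of nextToMove, checking for newly unblocked regions stays cheap even with many waiters.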
- each block in the flowcharts or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
- the functions noted in a block may occur out of the order noted in the Figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. Some blocks may be deleted or other blocks may be added depending on the functionality involved.
Abstract
A method to implement parallel, single-pass compaction in a garbage collector is described. In one embodiment, such a method includes conducting a planning phase for multiple regions to be compacted. During the planning phase, the method determines new locations for data entities in the multiple regions. The method then performs a move phase for the multiple regions to move the data entities to their new locations. During the move phase, the method initiates multiple compaction threads to move the data entities to their new locations. While executing, the compaction threads dynamically build a dependency graph of the regions being compacted. The dependency graph guarantees that no data entity is moved to its new location until all data entities that it overwrites have been moved to their new locations. A corresponding computer program product and apparatus are also disclosed and claimed herein.
Description
- This invention relates to memory management, and more specifically to apparatus and methods for compacting regions in memory.
- Any memory management system that does not move objects may be affected by memory fragmentation. One important mechanism for resolving fragmentation is compaction. In an article entitled “An Efficient Parallel Heap Compaction Algorithm,” published in 2004, the authors (Abuaiadh et al.) disclosed a compactor that was state of the art at the time and is currently used in many Java Virtual Machines. The compactor (hereinafter the “Abuaiadh compactor”) parallelizes very well, but requires two passes over the heap, both of which write to the heap. The first pass moves objects and the second pass fixes up references between objects, based on their new locations.
- More recently, in an article entitled "The Compressor: Concurrent, Incremental, and Parallel Compaction," published in 2006, the authors (Kermany et al.) described a novel compactor which improves on the Abuaiadh compactor. The compactor (hereinafter the "Kermany compactor") is a "stop-the-world" compactor that divides compaction into two phases, referred to herein as the "planning" phase and the "move" phase. During the planning phase, the new location of each object or group of objects is determined and recorded in a concise manner in an offset table. Planning is based only on the mark vector; it does not require a pass to read or write to the heap. Once planning is complete, the data in the mark vector and offset table may be combined to determine the destination of any object. This allows the move phase to move objects to their predetermined destinations and fix up their references to other objects at the same time. The move phase is the only phase of the compactor which reads or writes to the heap; planning and external fix-up do not need to read or write to heap memory. The Kermany compactor also includes concurrent and incremental aspects, which are not discussed here.
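The planning computation described above can be pictured with a small sketch. The following Java fragment is an illustration only: the class name, the block size, and the word-granularity mark vector are assumptions made for this example, not details taken from the Kermany paper or from this disclosure. It shows how an object's destination can be resolved from the mark vector and a per-block offset table alone, with no pass over the heap.

```java
// Illustrative Kermany-style planning: one offset-table entry per fixed-size
// block records the post-compaction address of the block's first live word;
// a new address is that offset plus the live words preceding the object
// within its block. All names and BLOCK_WORDS are assumptions.
public class OffsetTable {
    static final int BLOCK_WORDS = 8;   // words covered by one offset entry (assumed)
    final boolean[] mark;               // one mark bit per heap word
    final int[] blockOffset;            // new address of each block's first live word

    OffsetTable(boolean[] mark) {
        this.mark = mark;
        this.blockOffset = new int[(mark.length + BLOCK_WORDS - 1) / BLOCK_WORDS];
        int next = 0;                   // next free post-compaction word
        for (int b = 0; b < blockOffset.length; b++) {
            blockOffset[b] = next;
            int end = Math.min((b + 1) * BLOCK_WORDS, mark.length);
            for (int w = b * BLOCK_WORDS; w < end; w++)
                if (mark[w]) next++;
        }
    }

    /** New address of the live word at oldAddr: block offset plus live words before it in the block. */
    int newAddress(int oldAddr) {
        int block = oldAddr / BLOCK_WORDS;
        int addr = blockOffset[block];
        for (int w = block * BLOCK_WORDS; w < oldAddr; w++)
            if (mark[w]) addr++;
        return addr;
    }
}
```

Because the table lets any object's destination be computed on demand, the move phase can relocate objects and fix up their references in the same pass, which is the property the rest of this disclosure builds on.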
- Although the Kermany compactor parallelizes the move phase, the parallelization of movement relies on the use of memory mapping, using the operating system's virtual memory subsystem, to ensure that a compactor thread never copies over objects which haven't been copied yet. To accomplish this, the Kermany compactor allocates a second block of memory addresses which map onto the same physical memory as the object heap. As the Kermany compactor compacts the heap, it allocates new “to-virtual-space” pages in which to move objects. As “from-virtual-space” pages are completely evacuated, they are unmapped. Using this technique, the Kermany compactor does not need to worry about moving an object into a page which has not yet been fully evacuated.
- However, the Kermany compactor's heavy use of virtual memory for mapping and unmapping is expensive. It typically requires system calls and can be prohibitively slow. It also requires additional address space. The authors acknowledge that this requirement “may create some problem for large heaps on a 32-bit machine.” Also, the Kermany compactor relies on the use of relatively small pages so that virtual memory can be managed at a fine granularity. Hardware and software systems are increasingly moving towards larger page sizes. On large 64-bit systems, for example, 16 GB pages are now available. Such large pages make this mapping/unmapping strategy impractical.
- In view of the foregoing, what are needed are apparatus and methods to parallelize object movement in a single pass over an object heap. Ideally, such apparatus and methods will provide mechanisms to ensure that live objects are not overwritten by other live objects, without requiring use of virtual memory.
- The invention has been developed in response to the present state of the art and, in particular, in response to the problems and needs in the art that have not yet been fully solved by currently available apparatus and methods. Accordingly, the invention has been developed to provide apparatus and methods to implement parallel, single-pass compaction in a region-based garbage collector. The features and advantages of the invention will become more fully apparent from the following description and appended claims, or may be learned by practice of the invention as set forth hereinafter.
- Consistent with the foregoing, a method to implement parallel, single-pass compaction in a garbage collector is disclosed herein. In one embodiment, such a method includes conducting a planning phase for multiple regions to be compacted. During the planning phase, the method determines new locations for data entities in the multiple regions. The method then performs a move phase for the multiple regions to move the data entities to their new locations. During the move phase, the method initiates multiple compaction threads to move the data entities to their new locations. While executing, the compaction threads dynamically build a dependency graph of the regions being compacted. The dependency graph guarantees that no data entity is moved to its new location until all data entities that it overwrites have been moved to their new locations.
- A corresponding computer program product and apparatus are also disclosed and claimed herein.
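The guarantee stated above reduces, for one candidate move, to a comparison against the target region's evacuation frontier (the address below which all data has already been moved out). A minimal sketch of that predicate follows; the method and parameter names are invented for illustration and are not taken from the claims.

```java
// Illustrative safety predicate: a data entity may be moved only if every
// byte it would overwrite in the target region has already been evacuated,
// i.e., the copy ends at or before the target's evacuation frontier.
public class MoveGuard {
    static boolean canMove(long destination, long entitySize, long targetFrontier) {
        // Safe when the last byte written lies below the first unevacuated byte.
        return destination + entitySize <= targetFrontier;
    }
}
```

A thread that finds the predicate false suspends the source region and records the dependency instead of copying, which corresponds to adding an edge to the dependency graph described in this disclosure.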
- In order that the advantages of the invention will be readily understood, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments of the invention and are not therefore to be considered limiting of its scope, the invention will be described and explained with additional specificity and detail through use of the accompanying drawings, in which:
- FIG. 1 is a high-level block diagram showing one example of a computer system suitable for use with various embodiments of the invention;
- FIG. 2 is a high-level block diagram showing one example of an object-oriented managed runtime, in this example the Java Virtual Machine, incorporating a compaction module in accordance with the invention;
- FIGS. 3A through 3D show a general compaction process on a region;
- FIG. 4 shows one example of a compaction module and various related data structures used to create a dependency graph;
- FIG. 5 shows one embodiment of a method for implementing parallel, single-pass compaction on regions of an object heap;
- FIG. 6 shows various conventions for use in association with FIGS. 7 through 16; and
- FIGS. 7 through 16 show a simple specific example of operation of the compaction module using a pair of compaction threads (Thread 1 and Thread 2), a work stack (Work Stack), and four regions (Regions A, B, C, and D).
- It will be readily understood that the components of the present invention, as generally described and illustrated in the Figures herein, could be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of the embodiments of the invention, as represented in the Figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of certain examples of presently contemplated embodiments in accordance with the invention. The presently described embodiments will be best understood by reference to the drawings, wherein like parts are designated by like numerals throughout.
- As will be appreciated by one skilled in the art, the present invention may be embodied as an apparatus, system, method, or computer program product. The present invention may take the form of a hardware embodiment, a software embodiment (including firmware, resident software, microcode, etc.) configured to operate hardware, or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "module" or "system." Furthermore, the present invention may take the form of a computer-usable storage medium embodied in any tangible medium of expression having computer-usable program code stored therein.
- Any combination of one or more computer-usable or computer-readable storage medium(s) may be utilized to store the computer program product. The computer-usable or computer-readable storage medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable storage medium may include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CDROM), an optical storage device, or a magnetic storage device. In the context of this document, a computer-usable or computer-readable storage medium may be any medium that can contain, store, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
- Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, C++, or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. Computer program code for implementing the invention may also be written in a low-level programming language such as assembly language.
- The present invention may be described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus, systems, and computer program products according to various embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, may be implemented by computer program instructions or code. The computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
- The computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks. The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
- Referring to FIG. 1, one example of a computer system 100 is illustrated. The computer system 100 is presented to show one example of an environment where an apparatus and method in accordance with the invention may be implemented. The computer system 100 is presented only by way of example and is not intended to be limiting. Indeed, the apparatus and methods disclosed herein may be applicable to a wide variety of different computer systems in addition to the computer system 100 shown. The apparatus and methods disclosed herein may also potentially be distributed across multiple computer systems 100.
- The computer system 100 includes at least one processor 102 and may include more than one processor. The processor 102 includes one or more registers 104 storing data describing the state of the processor 102 and facilitating execution of software systems. The registers 104 may be internal to the processor 102 or may be stored in a memory 106. The memory 106 stores operational and executable data that is operated upon by the processor 102. The memory 106 may be accessed by the processor 102 by means of a memory controller 108. The memory 106 may include volatile memory (e.g., RAM) as well as non-volatile memory (e.g., ROM, EPROM, EEPROM, hard disks, flash memory, etc.).
- The processor 102 may be coupled to additional devices supporting execution of software and interaction with users. For example, the processor 102 may be coupled to one or more input devices 110, such as a mouse, keyboard, touch screen, microphone, or the like. The processor 102 may also be coupled to one or more output devices such as a display device 112, speaker, or the like. The processor 102 may communicate with one or more other computer systems by means of a network 114, such as a LAN, WAN, or the Internet. Communication over the network 114 may be facilitated by a network adapter 116.
- Referring to
FIG. 2, one example of an object-oriented managed runtime, in this example the Java Runtime Environment, is illustrated. The Java Runtime Environment is presented to show one example of a runtime environment in which various embodiments of the invention may operate. Nevertheless, the compaction techniques disclosed herein are not limited to the Java Runtime Environment but may operate or be adapted to operate in any object-oriented managed runtime that uses an object heap to store objects. Other non-limiting examples of runtime environments in which embodiments of the invention might operate include the Microsoft Common Language Runtime (CLR) and the Smalltalk runtime. Thus, although particular reference is made herein to the Java Runtime Environment, the principles taught herein are not limited to the Java Runtime Environment but may also be applicable to other runtime environments.
- As shown in FIG. 2, in the Java Runtime Environment, a Java Virtual Machine 202 may be configured to operate on a specific platform, which may include an underlying hardware and operating system architecture. The Java Virtual Machine 202 receives program code 200, compiled to an intermediate form referred to as "bytecode" 200. The Java Virtual Machine 202 translates this bytecode 200 into native operating system calls and machine instructions for execution on the underlying platform. Rather than compiling the bytecode 200 for a specific hardware and software platform, the bytecode 200 is compiled once to operate on all Java Virtual Machines 202. A Java Virtual Machine 202, by contrast, may be tailored to the underlying hardware and software platform.
- In the Java Runtime Environment, the primary memory resource is a garbage-collected object heap 210. The object heap 210 provides memory for objects, each of which is an instance of a class. A garbage collection module 208, or subsystem 208, is provided in the Java Runtime Environment to reclaim memory occupied by objects, or classes associated with objects, that are no longer in use by a program. Among other benefits, the garbage collection module 208 frees a programmer from worrying about releasing objects that are no longer needed, which would otherwise require significant design effort from the programmer.
- In order to reduce pause times, the garbage collection module 208 may, in certain embodiments, be configured to operate in an incremental manner. That is, the garbage collection module 208 may be configured to perform garbage collection processes on a portion of the object heap 210 at a time, as opposed to the entire object heap 210. In order to accomplish this, the object heap 210 may be divided into a number of regions (e.g., hundreds or even thousands of regions). The garbage collection module 208 may then perform garbage collection on a subset of these regions at any particular time. This incremental garbage collection process minimizes, as much as possible, disruption of the main program.
- As memory is reclaimed from objects that are no longer reachable, holes or gaps may be left in various regions of the object heap 210. If the remaining live objects are not moved, the holes or gaps may lead to memory fragmentation. This may impair program performance as allocating objects within these gaps or holes is typically more costly than allocating objects in large contiguous spaces. To reduce this problem, compaction algorithms may be used to group live objects together and create large contiguous spaces for allocating new objects. Thus, in certain embodiments, the garbage collection module 208 includes a compaction module 212 to perform compaction algorithms on the object heap 210.
-
FIGS. 3A through 3D provide an example of a general compaction process on a region 300 of an object heap 210. As shown in FIG. 3A, a region 300 may initially contain one or more objects 302. During a garbage collection process, the garbage collection module 208 may traverse a program graph, starting with root objects, to discover all live objects 302 a (objects referenced by other live objects) in the region 300. As shown in FIG. 3B, the garbage collection module 208 marks each live object 302 a that it encounters during the traversal, indicating that the object 302 a is in use. All remaining objects 302 b are considered dead (unreachable). The garbage collection module 208 then frees the memory occupied by the dead objects 302 b, effectively clearing the dead objects 302 b from memory, as shown in FIG. 3C. The compaction module 212 may then compact the remaining live objects 302 a by moving the objects toward the low address of the region 300 to create a large contiguous space 304, as shown in FIG. 3D. New objects may then be allocated within the large contiguous space 304.
- During the compaction process, certain objects 302 a may be moved within regions 300, while other objects 302 a may be moved from one region 300 to another. When moving objects 302 a, care must be taken to ensure that live objects 302 a are not overwritten by other live objects 302 a. To accomplish this, the compaction module 212 builds a dependency graph of the regions 300 being compacted to ensure that no live object 302 a is moved to its new location until all live objects 302 a that it overwrites have been moved to their new locations. Rather than explicitly generating this dependency graph, which would require single-threaded execution time and either extra storage or extra time to discover dependencies, the compaction module 212 may be configured to discover dependencies and generate the dependency graph dynamically as objects 302 a are moved to their new locations. The manner in which dependencies are discovered and used to build a dependency graph will be discussed in more detail hereafter.
- Referring to
FIG. 4, in certain embodiments, the compaction module 212 includes a planning module 400 and a move module 402. The planning module 400 may conduct the planning phase of the compaction process, wherein new locations of live objects are determined. For example, the planning module 400 may conduct the planning phase in the manner described in the Abuaiadh or Kermany articles previously described. The move module 402 may conduct the move phase of the compaction process, wherein live objects are moved to their new locations. To implement the move phase, the move module 402 may initiate multiple compaction threads, each of which may operate in parallel, to compact the regions 300.
- During the compaction process, the planning module 400 and move module 402 may interact with various data structures 406. For example, upon determining the new addresses of live objects, the planning module 400 may store these addresses in a new address table 408. The compaction threads initiated by the move module 402 may look up new addresses in the address table 408 in order to move live objects to their new locations. While moving live objects to their new locations, the compaction threads may interact with a work stack 410, and region properties 412 associated with each region 300. In the illustrated embodiment, the work stack 410 and region properties 412 together create the dependency graph 424 discussed herein.
- The
work stack 410 contains a list of regions 300 to be compacted. When looking for work to perform, the compaction threads pop regions from the work stack 410. A work stack monitor 404 controls access to the work stack 410, ensuring that only one compaction thread can access the work stack 410 at any particular time. A compaction thread will wait on the work stack monitor 404 if there is no work available. When a compaction thread adds new work to the work stack 410, or it is the last thread running and finds there is no work on the work stack 410, the compaction thread will notify all compaction threads that are waiting on the work stack monitor 404. This will allow the waiting threads to either pop work from the work stack 410 or, if no work is available, exit the work stack monitor 404 and return from the work stack 410 (since this means that the compaction process is complete).
- In certain embodiments, the work stack 410 includes a pointer to a list of regions 300 it contains, a count of waiting threads, and an "isDone" flag. One thread may initially populate the work stack 410 with all the regions 300 that are participating in the compaction process. Depending on the implementation of the planning phase, the work stack 410 may be built as a side-effect of the planning algorithm in order to eliminate the need for this additional serialization. In either case, the operation is low-cost since it blindly pushes all regions 300 onto the work stack 410, without attempting to infer dependency therebetween. Once a compaction thread finishes processing a region 300, it either discards the region (if processing is complete) or adds the region to another region's blockedRegionList 422. A region 300 will be pushed back onto the work stack 410 after being removed from a now-completed region's blockedRegionList 422. Since objects from a region 300 may be moved to several different regions 300, it is possible that a region 300 will end up moving through the work stack 410 and a blockedRegionList 422 multiple times during a single compaction cycle.
- A
region 300 is a logical unit of compaction work and may include the illustrated properties 412. The lowAddress property 414 is the base address of the region 300. This property 414 is constant during compaction. The highAddress property 416 is the address of the first byte after the region 300, which is also constant during compaction. The nextToMove property 418 is the address of the next object to be moved from the region 300. This property 418 starts at the lowAddress 414 and becomes equal to the highAddress 416 when all of the region's objects have been moved. The nextRegionInList property 420 points to the next region 300 in the same list as a region 300 (the list may include either the work stack 410 or another region's blockedRegionList). The blockedRegionList property 422 points to a list of regions 300 which are blocked on a region 300 and cannot be processed until compaction of the region 300 is complete.
- The compaction of a region 300 generally progresses from lowAddress 414 to highAddress 416 and continually updates the nextToMove pointer 418 to point at the next object to be moved. Initially, the nextToMove pointer 418 points at the lowAddress 414 and eventually progresses to the highAddress 416 when the compaction of the region 300 is completed. A region 300 becomes dynamically dependent on a target region 300 when the destination address of nextToMove plus nextToMove's instance size is greater than the nextToMove pointer of the target region containing the destination address. When this situation occurs, the region's compaction is suspended and the region 300 is added to the blockedRegionList 422 of the target region 300.
- Referring to
FIG. 5, one embodiment of a method 500 for implementing parallel, single-pass compaction in a garbage collector is illustrated. Such a method 500 may be executed in parallel by each of the compaction threads after the planning phase has been completed. When the method 500 begins, no region 300 is known to be blocked on any other region 300, each region 300 has its nextToMove pointer 418 set to the first object it contains, and all regions 300 to be compacted are contained in the monitor-protected work stack 410.
- As illustrated, a compaction thread initially acquires 502 the work stack monitor 404, which provides access to the work stack 410. The compaction thread then determines 504 whether the work stack 410 is empty. If the work stack 410 is not empty, the compaction thread pops 506 a region from the work stack 410 and releases 506 the work stack monitor 404. The compaction thread then looks up 508 the region's nextToMove pointer 418 to find the next object to be moved from the region 300. Upon finding this object, the method determines 508 the object size (objectSize) and resolves 508 the new address of the object (this returns the newTargetAddress). The compaction thread then finds 510 the region (targetRegion) containing the newTargetAddress. Upon finding 510 the targetRegion, the compaction thread determines 512 whether the nextToMove pointer of the targetRegion is less than the objectSize added to the newTargetAddress. This step 512 essentially determines whether moving the object will overwrite data that has yet to be processed in the targetRegion.
- If moving the object will overwrite unprocessed data in the targetRegion, the object cannot be moved since it may overwrite live objects. In such a case, the compaction thread enters 514 the work stack monitor 404 and attaches 514 the current region to the targetRegion's blockedRegionList. If the object will not overwrite unprocessed data in the targetRegion, the compaction thread moves 516 the object to the newTargetAddress in the targetRegion and updates 516 the nextToMove pointer of the current region 300 to the next object in the region 300. The compaction thread then determines 518 whether the object that was moved was the last object in the region 300. If not, the compaction thread repeats the preceding steps for the next object in the region 300.
- If, at step 518, the compaction thread determines that an object is the last object in a region 300, the compaction thread enters 520 the work stack monitor 404 and pushes 520 each region found in the region's blockedRegionList onto the work stack 410. This step 520 essentially unblocks any regions that are blocked on the current region 300, since processing of the current region 300 is complete. Once regions in the blockedRegionList are pushed onto the work stack 410, the compaction thread notifies 520 any waiting compaction threads to wake up the threads and make them aware that new work is on the work stack 410.
- At this point, the compaction thread determines 504 whether the work stack 410 is empty. If the work stack 410 is not empty, the compaction thread pops 506 the next region from the work stack 410 and processes the region in the manner previously described. If the work stack 410 is empty, the compaction thread determines 522 whether it is the last compaction thread not waiting on the work stack monitor 404. If it is the last compaction thread not waiting, the compaction thread sets 528 the "isDone" flag to true (indicating that the compaction process has been completed for all participating regions 300) and notifies 528 all waiting compaction threads. The compaction thread then exits 530 the work stack monitor 404 and returns 530 from the work stack 410. Any waiting compaction threads that are notified 528 will wake up and see that the isDone flag is set, allowing them to exit the work stack monitor 404 and return from the work stack 410.
- If, at step 522, a compaction thread is not the last thread not waiting on the work stack monitor 404, the compaction thread simply waits 524 on the work stack monitor 404. When the compaction thread wakes up, it determines 526 whether the isDone flag is set. If the isDone flag is not set, the compaction thread looks 504 for work on the work stack 410 in the manner previously described. If the isDone flag is set, however, the compaction thread exits 530 the work stack monitor 404 and returns 530 from the work stack 410.
- Referring to
FIG. 6, various conventions are shown for use in association with FIGS. 7 through 16. The boxes shown in FIG. 6 represent regions 300. For example, FIG. 6 shows boxes representing Regions A, B, and C. The arrows originating from the bottom of a box represent the nextRegionInList. For example, Region B is the next region in the same list as Region A. Arrows originating from the right of a region represent a blockedRegionList. For example, Region C is blocked until the processing of Region A is complete.
- Referring generally to FIGS. 7 through 16, a simple example of the compaction process described hereinabove is illustrated. This very simple example assumes a pair of compaction threads (Thread 1 and Thread 2), a work stack (Work Stack), and four regions (Regions A, B, C, and D). In reality, a system might have hundreds or even thousands of regions. Thus, this example is intended to describe the core elements of the novel compaction process on a simpler scale.
-
FIG. 7 shows the state of the system prior to beginning the compaction process. As shown in FIG. 7, prior to beginning the compaction process, Regions A, B, C, and D are in the work stack 410, with no regions blocked on any other regions. Both threads are idle but ready to perform work. As shown in FIG. 8, both threads pop a region from the work stack 410 and begin compacting the regions. Thus Thread 1 begins to process Region A and Thread 2 begins to process Region B.
- Assume that Thread 1 discovers that Region A depends on Region C being completed and Thread 2 discovers that Region B also depends on Region C being completed. As a result, both Regions are added to Region C's blockedRegionList, as shown in FIG. 9. Both Threads then pop the remaining regions (Regions C and D) from the work stack 410, thereby leaving the work stack 410 empty, as shown in FIG. 10.
- Assume that Thread 2 discovers that Region D depends on Region B being completed. As a result, Region D is added to Region B's blockedRegionList. This creates a multi-level dependency where Region D is indirectly blocked on Region C. Upon adding Region D to Region B's blockedRegionList, Thread 2 finds the work stack 410 empty and thus waits on the work stack monitor 404, as shown in FIG. 11.
- When Thread 1 finishes processing Region C, and discards it from the system, it pushes both of its blocked regions (Regions A and B) onto the work stack 410, as shown in FIG. 12. Threads 1 and 2 then pop Regions A and B from the work stack 410 and begin compacting them, as shown in FIG. 13. Thread 1 finishes processing Region A and discards it from the system. Likewise, Thread 2 finishes processing Region B and discards it from the system. Upon completing Region B, Thread 2 pushes its blocked region (Region D) onto the work stack 410, as shown in FIG. 14. Thread 1 then pops Region D from the work stack 410 and finishes processing it. Thread 2 waits for more work to appear in the work stack 410. Once Thread 1 processes Region D and discards it from the system, Thread 1 will realize that it is the last thread still running (not waiting). In response, Thread 1 sets the isDone flag, notifies Thread 2, and exits the compaction process. Thread 2 wakes up, sees that the isDone flag is set, and exits the compaction process.
- Although the above-described compaction process has focused primarily on moving individual objects, the compaction process could also be applied to data entities having different granularities. For example, the compaction process could be applied to small groups of contiguous objects, thereby reducing the need to explicitly track the new address of every object individually. In such an embodiment, the compaction process could consider the size of the group, instead of a single object, to determine if it will fit at its new location. The nextToMove pointer would also require less frequent updates as it would point to the base of an object group instead of an individual object.
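The walkthrough of FIGS. 7 through 16 can be condensed into a small single-threaded simulation of the move loop. This sketch is illustrative only: it assumes one pending cross-region move per region and abstract sizes instead of real objects, so it is far simpler than an actual heap, but it reproduces the block, unblock, and completion behavior described above.

```java
// Illustrative single-threaded sketch of the move loop: pop a region, block
// it on its target if the move would overwrite unevacuated data, otherwise
// complete it and release the regions blocked on it. Data layout is invented.
import java.util.*;

public class MovePhase {
    static class Region {
        final String name;
        long nextToMove;                          // evacuation frontier of this region
        Move pending;                             // next cross-region move, or null
        final List<Region> blockedRegionList = new ArrayList<>();
        Region(String name) { this.name = name; }
    }

    static class Move {
        final Region target; final long dest, size;
        Move(Region target, long dest, long size) {
            this.target = target; this.dest = dest; this.size = size;
        }
    }

    /** Runs the move loop to completion; returns the order in which regions finish. */
    static List<String> run(Deque<Region> workStack) {
        List<String> done = new ArrayList<>();
        while (!workStack.isEmpty()) {
            Region r = workStack.pop();
            Move m = r.pending;
            if (m != null && m.dest + m.size > m.target.nextToMove) {
                m.target.blockedRegionList.add(r); // would overwrite unevacuated data: suspend
                continue;
            }
            if (m != null) { r.nextToMove += m.size; r.pending = null; } // entity "moved"
            done.add(r.name);                      // region complete
            r.nextToMove = Long.MAX_VALUE;         // region is now fully evacuated
            for (Region b : r.blockedRegionList) workStack.push(b); // release blocked regions
            r.blockedRegionList.clear();
        }
        return done;
    }
}
```

In a run with Region A whose only move targets unevacuated space in Region C, A is suspended onto C's blockedRegionList, C completes first, and A is then re-pushed and finished, mirroring the dependency resolution shown in the figures.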
- In other embodiments, the above-described compaction process could be modified such that regions in a blockedRegionList do not require compaction to be completed on the target region before being pushed back onto the work stack 410. Because regions may depend on particular parts of the target region being processed, rather than on the entire target region being processed, these regions could be pushed back onto the work stack 410 as the target region's compaction progresses, rather than after the compaction of the target region has entirely completed. In certain embodiments, the blockedRegionList could be sorted by blocking address to make checking the list efficient.
- The flowcharts and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer-usable media according to various embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in a block may occur out of the order noted in the Figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. Some blocks may be deleted or other blocks may be added depending on the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, may be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Claims (17)
1-8. (canceled)
9. A computer program product to implement parallel, single-pass compaction in a garbage collector, the computer program product comprising a computer-usable storage medium having computer-usable program code embodied therein, the computer-usable program code comprising:
computer-usable program code to conduct a planning phase for a plurality of regions to be compacted, wherein conducting the planning phase comprises determining new locations for data entities in the plurality of regions;
computer-usable program code to perform a move phase for the plurality of regions to move the data entities to their new locations, wherein performing the move phase comprises:
initiating a plurality of compaction threads to move the data entities to their new locations; and
dynamically building a dependency graph of the regions being compacted as the compaction threads move the data entities to their new locations, wherein the dependency graph guarantees that no data entity is moved to its new location until all data entities that it overwrites have been moved to their new locations.
10. The computer program product of claim 9, wherein the plurality of compaction threads are configured to operate in parallel.
11. The computer program product of claim 9, wherein the data entities are at least one of objects and groups of objects.
12. The computer program product of claim 9, wherein performing the move phase comprises performing the move phase in a single pass over the plurality of regions.
13. The computer program product of claim 9, wherein the compaction threads are configured to pop regions from a work stack to determine which regions need to be compacted.
14. The computer program product of claim 13, wherein each region includes a blocked regions list indicating the regions that are blocked on the region.
15. The computer program product of claim 14, further comprising computer-usable program code to, upon completing a region, push all regions in the region's blocked regions list onto the work stack.
16. The computer program product of claim 14, further comprising computer-usable program code to, upon encountering a first data entity that cannot be moved because a second data entity that it overwrites has not yet been moved, place the region associated with the first data entity in the blocked regions list of the region associated with the second data entity.
17. An apparatus to implement parallel, single-pass compaction in a garbage collector, the apparatus comprising:
a planning module to conduct a planning phase for a plurality of regions to be compacted, wherein conducting the planning phase comprises determining new locations for data entities in the plurality of regions; and
a move module to perform a move phase for the plurality of regions to move the data entities to their new locations, wherein performing the move phase comprises:
initiating a plurality of compaction threads to move the data entities to their new locations; and
dynamically building a dependency graph of the regions being compacted as the compaction threads move the data entities to their new locations, wherein the dependency graph guarantees that no data entity is moved to its new location until all data entities that it overwrites have been moved to their new locations.
18. The apparatus of claim 17, wherein the plurality of compaction threads are configured to operate in parallel.
19. The apparatus of claim 17, wherein the data entities are at least one of objects and groups of objects.
20. The apparatus of claim 17, wherein performing the move phase comprises performing the move phase in a single pass over the plurality of regions.
21. The apparatus of claim 17, wherein the compaction threads are configured to pop regions from a work stack to determine which regions need to be compacted.
22. The apparatus of claim 21, wherein each region includes a blocked regions list indicating which regions are blocked on the region.
23. The apparatus of claim 22, wherein the move module is further configured to, upon completing a region, push all regions in the region's blocked regions list onto the work stack.
24. The apparatus of claim 22, wherein the move module is further configured to, upon encountering a first data entity that cannot be moved because a second data entity that it overwrites has not yet been moved, place the region associated with the first data entity in the blocked regions list of the region associated with the second data entity.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/023,437 US20120203992A1 (en) | 2011-02-08 | 2011-02-08 | Parallel, single-pass compaction in a region-based garbage collector |
US13/431,971 US8769230B2 (en) | 2011-02-08 | 2012-03-28 | Parallel, single-pass compaction in a region-based garbage collector |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/023,437 US20120203992A1 (en) | 2011-02-08 | 2011-02-08 | Parallel, single-pass compaction in a region-based garbage collector |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/431,971 Continuation US8769230B2 (en) | 2011-02-08 | 2012-03-28 | Parallel, single-pass compaction in a region-based garbage collector |
Publications (1)
Publication Number | Publication Date |
---|---|
US20120203992A1 true US20120203992A1 (en) | 2012-08-09 |
Family
ID=46601475
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/023,437 Abandoned US20120203992A1 (en) | 2011-02-08 | 2011-02-08 | Parallel, single-pass compaction in a region-based garbage collector |
US13/431,971 Expired - Fee Related US8769230B2 (en) | 2011-02-08 | 2012-03-28 | Parallel, single-pass compaction in a region-based garbage collector |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/431,971 Expired - Fee Related US8769230B2 (en) | 2011-02-08 | 2012-03-28 | Parallel, single-pass compaction in a region-based garbage collector |
Country Status (1)
Country | Link |
---|---|
US (2) | US20120203992A1 (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140250277A1 (en) * | 2013-03-04 | 2014-09-04 | Kabushiki Kaisha Toshiba | Memory system |
US9229858B2 (en) | 2013-10-08 | 2016-01-05 | Red Hat, Inc. | Concurrent garbage collector thread |
US10740230B2 (en) | 2016-10-20 | 2020-08-11 | International Business Machines Corporation | Heap contraction for increasing memory density in cloud environment |
CN108345470B (en) * | 2017-01-24 | 2021-10-08 | 阿里巴巴集团控股有限公司 | Data processing and storing method and device and electronic equipment |
US10621057B2 (en) * | 2017-08-16 | 2020-04-14 | Intelliflash By Ddn, Inc. | Intelligent redundant array of independent disks with resilvering beyond bandwidth of a single drive |
US10838639B2 (en) | 2017-08-16 | 2020-11-17 | Intelliflash By Ddn, Inc. | Intelligent redundant array of independent disks |
US10474572B2 (en) | 2017-08-16 | 2019-11-12 | HGST, Inc. | Intelligent redundant array of independent disks with high performance recompaction |
US11221947B2 (en) | 2018-07-27 | 2022-01-11 | Red Hat, Inc. | Concurrent garbage collection with minimal graph traversal |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7480782B2 (en) * | 2006-06-14 | 2009-01-20 | Sun Microsystems, Inc. | Reference-updating using per-chunk referenced-address ranges in a compacting garbage collector |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7340494B1 (en) | 2004-03-12 | 2008-03-04 | Sun Microsystems, Inc. | Garbage-first garbage collection |
CN101046755B (en) | 2006-03-28 | 2011-06-15 | 郭明南 | System and method of computer automatic memory management |
- 2011-02-08 US US13/023,437 patent/US20120203992A1/en not_active Abandoned
- 2012-03-28 US US13/431,971 patent/US8769230B2/en not_active Expired - Fee Related
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7480782B2 (en) * | 2006-06-14 | 2009-01-20 | Sun Microsystems, Inc. | Reference-updating using per-chunk referenced-address ranges in a compacting garbage collector |
Also Published As
Publication number | Publication date |
---|---|
US8769230B2 (en) | 2014-07-01 |
US20120203994A1 (en) | 2012-08-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8769230B2 (en) | Parallel, single-pass compaction in a region-based garbage collector | |
US8489653B2 (en) | Incremental class unloading in a region-based garbage collector | |
US8423589B2 (en) | Copy collector with efficient abort-on-copy transition to mark collector | |
US11573894B2 (en) | Tracking garbage collection states of references | |
US8447793B2 (en) | Efficient remembered set for region-based garbage collectors | |
EP4341818A1 (en) | Write barrier for remembered set maintenance in generational z garbage collector | |
US10733095B2 (en) | Performing garbage collection on an object array using array chunk references | |
US11474832B2 (en) | Intelligently determining a virtual machine configuration during runtime based on garbage collection characteristics | |
EP3807771A1 (en) | Arena-based memory management | |
CN109923527B (en) | Variable type builder | |
US11789863B2 (en) | On-the-fly remembered set data structure adaptation | |
US12019541B2 (en) | Lazy compaction in garbage collection | |
US11513954B2 (en) | Consolidated and concurrent remapping and identification for colorless roots | |
WO2022245954A1 (en) | Write barrier for remembered set maintenance in generational z garbage collector | |
WO2022245659A1 (en) | Colorless roots implementation in z garbage collector | |
WO2022245749A1 (en) | Snapshot at the beginning marking in z garbage collector |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW YORK Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BURKA, PETER W.;DISHER, JEFFREY M.;MAIER, DARYL J.;AND OTHERS;REEL/FRAME:025771/0167 Effective date: 20110208 |
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE |