A fledgling software developer with the right attitude?

Photo credit: Mdf
Note: This text was originally posted on my earlier blog at http://www.adaptivelearningonline.net
A fledgling software developer with the right attitude?
Over the past few blog posts we have covered some basics of garbage collection in Java. Garbage collection is a key strength of the JVM, since we do not have to worry about releasing object memory. It reduces memory leaks and other problems associated with improper memory management code that often creeps in when we program against hard deadlines :-)
However, for large programs we very often have to tweak the default garbage collection mechanism to improve performance. Here's a nice article that explains garbage collection in much more detail with good examples. This page contains several links to memory management in the Java Hotspot VM. Here's another page that explains how to fine tune the Java garbage collector in Java 1.5.
It is also a good idea to keep up with the latest in technology, so here's a page that describes enhancements to the Java VM in version 1.6, that influence the garbage collector.
I hope you enjoyed this series. Over the next few months I hope to cover various aspects of core Java through such mini series posts. I hope you find them informative. As always your comments and suggestions are very welcome.
You can discuss this post in our learning forum.
A Java program can request the underlying JVM to perform garbage collection by calling the gc() method of the System class.
Note that System.gc() is a request to the underlying JVM. Garbage collection may not always happen. Hence we cannot depend on this method, however we can use it to optimize performance, with the understanding that it may not work as desired on all JVM's.
01 /** This program creates instances of Bag objects in a loop which will run 100000 times. 02 * Whenever the program is in a loop that is a multiple of 1000, it requests the JVM 03 * to start the garbage collector. We will see that the JVM may not always fulfill the 04 * request. 05 */ 06 public class GarbageCollectionDemo { 07 public static void main(String args[]) { 08 //loop 10000 times 09 for(int i=0;i<10000;i++) { 10 //if this loop is a multiple of 1000, then request for garbage collection 11 if(i%1000 == 0) { 12 System.out.println("Requesting for garbage collection"); 13 System.gc(); 14 } 15 System.out.println("Creating bag " + i); 16 Bag b = new Bag(String.valueOf(i)); 17 } 18 } 19 } 20 21 /** A placeholder object that is created in GarbageCollectionDemo 22 * 23 */ 24 class Bag { 25 private String id; 26 public Bag(String id) { 27 this.id = id; 28 } 29 30 /**We override the finalize method and put a print statement 31 * which will tell us when the object was garbage collected. 32 */ 33 public void finalize() { 34 System.out.println("Garbage collecting bag " + id); 35 } 36 } |
Java2html |
Here is the output of the program. Since the output was very large, I have truncated most parts (which have been shown as dots ...). As is shown, we request for garbage collection after creating 999 objects [line 11]. The JVM complies and garbage collects all unused objects. Also note that the garbage collector again starts reclaiming objects on line 32, even though the program has not specifically asked for it to do so. It may have done so either because it ran short of resources (but that seems unlikely because the garbage collector had just freed resources some time back), or it was run in the normal course of the garbage collection algorithm (this seems more likely).
Now this JVM is really well behaved, it fulfills all our garbage collection requests, but do not take this behavior for granted. All JVM's may not be as well mannered.
02 Requesting for garbage collection 03 Creating bag 0 04 Creating bag 1 05 Creating bag 2 06 Creating bag 3 07 Creating bag 4 08 Creating bag 5 09 ... 10 Creating bag 999 11 Requesting for garbage collection 12 Garbage collecting bag 999 13 Garbage collecting bag 998 14 Garbage collecting bag 997 15 Garbage collecting bag 996 16 Garbage collecting bag 995 17 ... 18 Garbage collecting bag 5 19 Garbage collecting bag 4 20 Garbage collecting bag 3 21 Garbage collecting bag 2 22 Garbage collecting bag 1 23 Garbage collecting bag 0 24 Creating bag 1000 25 Creating bag 1001 26 Creating bag 1002 27 Creating bag 1003 28 ... 29 Creating bag 1452 30 Creating bag 1453 31 Creating bag 1454 32 Garbage collecting bag 1454 33 Garbage collecting bag 1001 34 ... 35 Garbage collecting bag 1228 36 Garbage collecting bag 1226 37 Garbage collecting bag 1227 38 Creating bag 1455 39 Creating bag 1456 40 ... 41 Creating bag 1998 42 Creating bag 1999 43 Requesting for garbage collection 44 Garbage collecting bag 1999 45 Garbage collecting bag 1998 46 ... 47 Garbage collecting bag 1456 48 Garbage collecting bag 1455 49 Creating bag 2000 50 ... 51 Creating bag 2813 52 Creating bag 2814 53 Garbage collecting bag 2814 54 Garbage collecting bag 2000 55 Garbage collecting bag 2813 56 ... 57 Garbage collecting bag 2413 58 Garbage collecting bag 2401 59 ... 60 Garbage collecting bag 2407 61 Creating bag 2815 62 Creating bag 2816 63 Creating bag 2817 64 ... 65 ... 66 ... 67 Creating bag 9872 68 ... 69 Creating bag 9999 |
Java2html |
Adaptive algorithms do not use any fixed algorithm for garbage collection. They monitor the heap and choose an algorithm that is most effective for the current usage pattern. Such algorithms may also divide the heap into sub heaps and use a different algorithm for every sub heap, depending on the usage.
Introduction:
Generational algorithms make use of a property found in most Java programs. Several studies have revealed a pattern in the way objects are used. It has been found that objects that get created early in program execution usually live longer than objects that get created later. It is usually the youngest objects that are garbage collected first.
Look at the diagram below. The blue area in the diagram is a typical distribution for the lifetimes of objects. The X axis represents the bytes allocated by the JVM, and the Y axis represents th number of surviving bytes (live objects). The sharp peak at the left represents objects that can be reclaimed (i.e., have "died") shortly after being allocated. Iterator objects, for example, are often alive for the duration of a single loop.
Image source: The above image has been taken from the document "Tuning garbage collection with the 5.0 Java[tm] virtual machine"
If you notice, the distribution stretches out to the the right. This is because some objects live longer. Typically these are objects that have been created when the program started, and they live for the duration of the program. The lump observed after the first drop represents those objects that are created for some intermediate process. Some applications have very different looking distributions, but a surprisingly large number possess this general shape. The diagram above shows that most objects have a very short life span. Generational algorithms take advantage of this fact to optimize garbage collection in the JVM.
Mechanism:
Generational algorithms divide the heap into several sub heaps. As objects get created, they are put in the sub heap that represents the youngest generation. When this area becomes full, the garbage collector is run and all unused objects in that sub heap get garbage collected. Objects that survive a few garbage collection attempts are promoted to a sub heap representing an older generation. The garbage collector runs most frequently in the younger heaps. Each progressively older generation of objects get garbage collected less often.
Generational algorithms are very efficient because they make use of certain well known properties of Java programs. Sun's Hotspot VM uses a modified form of generational algorithms. However, please note that this algorithm will not work efficiently with programs that make non-standard use of memory. We may configure the algorithm to work appropriately with such programs if we understand their memory usage well. If not, it is best to keep the default implementation.
References:
Copying algorithms are tracing algorithms that divide the heap into multiple regions; one object region in which objects are created, and one or more free space regions (which are ignored, ie. objects are never created in these regions). When the object space becomes full, all live objects are moved to the free space. Now the free space becomes the object space, and the old object space becomes the free space. Since objects are placed contiguously in the new region, holes (fragments) are not formed.
The most commonly used copying algorithm is known as stop and copy, which uses one object region and one free space region. When the object region becomes full, the (Java) program is paused until live objects are copied into the free space. The program is restarted after the copying phase.
Copying algorithms have an advantage over compacting algorithms in that the address of object references does not change when the objects are moved.
Discuss this post in the learning forum.
Commercial Links
Hello Friends, I may not be able to post any new blogs this week. However, I should be able resume normal posting again by the weekend.
Please bear with the lack of new learning material for a few days :-) See ya'll soon.
From some time I have been thinking about publishing a series of podcasts/screencasts to help people learning Java. I would like to structure them as discussions (with professional Java developers) instead of lectures. These discussions will be structured with the aim to educate.
Each session will be focussed on a particular Java concept or library. Things that would ideally constitute a session are:
These sessions will be augmented blog posts containing screencasts, code samples, exercises, and links to other learning resources.
What do you guys think of such a series? Will it be helpful to developers? Do you have any suggestions?
Tracing
algorithms result in a fragmented heap, because objects that are
garbage collected in the sweep phase, usually occupy arbitrary
positions in the heap, resulting in holes, wherever the old objects
were garbage collected. Compacting algorithms have a modified sweep
phase, in which all live objects are copied into one end of the heap.
After they are copied, the rest of the heap is freed. This places all
live objects contiguously, and the heap is defragmented.
Advantages:
The memory is defragmented in the sweep phase
Disadvantages:
The [Java] program has to be paused when the garbage collector starts.
References have to be updated when the live objects are moved.
Discuss this post in the learning forum.
Commercial Links