Saturday, June 30, 2007

The right attitude for software development

A fledgling software developer with the right attitude?

Photo credit: Mdf

Note: This text was originally posted on my earlier blog at

Friday, June 29, 2007

Concluding the series on Java garbage collection

Over the past few blog posts we have covered some basics of garbage collection in Java. Garbage collection is a key strength of the JVM, since we do not have to worry about releasing object memory. It reduces memory leaks and other problems associated with improper memory management code that often creeps in when we program against hard deadlines :-)

However, for large programs we often have to tweak the default garbage collection mechanism to improve performance. Here's a nice article that explains garbage collection in much more detail with good examples. This page contains several links to memory management in the Java HotSpot VM. Here's another page that explains how to fine-tune the Java garbage collector in Java 1.5.

It is also a good idea to keep up with the latest in technology, so here's a page that describes the enhancements to the Java VM in version 1.6 that influence the garbage collector.

I hope you enjoyed this series. Over the next few months I hope to cover various aspects of core Java through such mini series posts. I hope you find them informative. As always your comments and suggestions are very welcome.

You can discuss this post in our learning forum.

Here are the comments from the original post

AUTHOR: Sanket Daru
DATE: 06/29/2007 05:13:19 AM
Dear Sir,
Indeed the mini-series on JVM Garbage Collection was very informative. At the conclusion, I have one question though.

Well this might be out of scope of the mini-series, but I want some guide. I was working with XML parsing in Java and started off using DOM parsing. Now DOM has its limitations, mainly due to its HEAVY footprint and it ends up giving me "out of memory" errors... I tried several other means but nothing seemed to work out...

Eventually I left the matter because I was just trying it out for experimentation. I had decided to use SAX due to its light footprint and event-based modeling.

Now my question is, can we tweak memory management of JVM and Garbage Collector programmatically? Such a solution will be really helpful.

Looking forward to more of such informative mini-series...

DATE: 06/29/2007 06:52:47 AM
Hi Sanket,

As of now I do not believe there is a way to "programmatically" tweak garbage collection. You can tweak the JVM when you start it using:
-Xms -Xmx for heap size.
Using various other options, you can modify the generation sizes and ask the JVM to use a particular garbage collection algorithm.

However, all these things have to be done when the JVM is started.

Increasing the heap size will allow you to hold a larger DOM tree in memory, but tweaking the GC will only help you with performance, not with being able to hold more objects in memory.
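
To check what heap limits a JVM was actually started with, a program can ask the Runtime class. The following is only a minimal sketch; the class name HeapInfo and the helper toMegabytes are my own names for illustration, while maxMemory(), totalMemory() and freeMemory() are standard methods of java.lang.Runtime.

```java
public class HeapInfo {
    /** Converts a byte count to whole megabytes (hypothetical helper). */
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory(): the most memory the JVM will attempt to use (influenced by -Xmx)
        System.out.println("Max heap:   " + toMegabytes(rt.maxMemory()) + " MB");
        // totalMemory(): the memory currently reserved by the JVM
        System.out.println("Total heap: " + toMegabytes(rt.totalMemory()) + " MB");
        // freeMemory(): the part of totalMemory() that is currently unused
        System.out.println("Free heap:  " + toMegabytes(rt.freeMemory()) + " MB");
    }
}
```

Running it as "java -Xms64m -Xmx256m HeapInfo" and then again with different values should show the reported numbers changing with the options. Note that this only reads the settings; it does not change them.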

One potential solution might be to use serialization. You could hold only enough information in the nodes that your entire tree can live in memory, and when a node is visited, retrieve its details from a serialized object. Once you navigate away from that node, release the memory (updating the serialized object if any changes were made).

I am not sure if this is a standard way of dealing with this problem, or if an API already exists to achieve it. But it seems like a plausible solution.

Tuesday, June 26, 2007

Garbage collection example

A Java program can request the underlying JVM to perform garbage collection by calling the gc() method of the System class.

Note that System.gc() is only a request to the underlying JVM; garbage collection may not always happen. Hence we cannot depend on this method. We can still use it to optimize performance, with the understanding that it may not work as desired on all JVMs.


Study the program shown below. It illustrates how we can invoke the garbage collector from a Java program.

/**
 * This program creates instances of Bag objects in a loop that runs 10000 times.
 * Whenever the loop counter is a multiple of 1000, it requests the JVM
 * to start the garbage collector. We will see that the JVM may not always fulfill the
 * request.
 */
public class GarbageCollectionDemo {
  public static void main(String[] args) {
    // loop 10000 times
    for (int i = 0; i < 10000; i++) {
      // if the loop counter is a multiple of 1000, request garbage collection
      if (i % 1000 == 0) {
        System.out.println("Requesting for garbage collection");
        System.gc();
      }
      System.out.println("Creating bag " + i);
      // the reference goes out of scope at the end of each iteration,
      // so the Bag becomes eligible for garbage collection
      Bag b = new Bag(String.valueOf(i));
    }
  }
}

/** A placeholder object that is created in GarbageCollectionDemo. */
class Bag {
  private String id;

  public Bag(String id) {
    this.id = id;
  }

  /**
   * We override the finalize method and put a print statement
   * which will tell us when the object was garbage collected.
   */
  @Override
  public void finalize() {
    System.out.println("Garbage collecting bag " + id);
  }
}

Here is the output of the program. Since the output was very large, I have truncated most of it (the omitted parts are shown as dots ...). As shown, we request garbage collection after creating 1000 objects, bags 0 to 999 [line 11]. The JVM complies and garbage collects all unused objects. Also note that the garbage collector again starts reclaiming objects on line 32, even though the program has not specifically asked it to do so. It may have done so either because it ran short of resources (but that seems unlikely because the garbage collector had just freed resources some time back), or because it was run in the normal course of the garbage collection algorithm (this seems more likely).

Now this JVM is really well behaved; it fulfills all our garbage collection requests. But do not take this behavior for granted. Not all JVMs may be as well mannered.

02 Requesting for garbage collection
03 Creating bag 0
04 Creating bag 1
05 Creating bag 2
06 Creating bag 3
07 Creating bag 4
08 Creating bag 5
09 ...
10 Creating bag 999
11 Requesting for garbage collection
12 Garbage collecting bag 999
13 Garbage collecting bag 998
14 Garbage collecting bag 997
15 Garbage collecting bag 996
16 Garbage collecting bag 995
17 ...
18 Garbage collecting bag 5
19 Garbage collecting bag 4
20 Garbage collecting bag 3
21 Garbage collecting bag 2
22 Garbage collecting bag 1
23 Garbage collecting bag 0
24 Creating bag 1000
25 Creating bag 1001
26 Creating bag 1002
27 Creating bag 1003
28 ...
29 Creating bag 1452
30 Creating bag 1453
31 Creating bag 1454
32 Garbage collecting bag 1454
33 Garbage collecting bag 1001
34 ...
35 Garbage collecting bag 1228
36 Garbage collecting bag 1226
37 Garbage collecting bag 1227
38 Creating bag 1455
39 Creating bag 1456
40 ...
41 Creating bag 1998
42 Creating bag 1999
43 Requesting for garbage collection
44 Garbage collecting bag 1999
45 Garbage collecting bag 1998
46 ...
47 Garbage collecting bag 1456
48 Garbage collecting bag 1455
49 Creating bag 2000
50 ...
51 Creating bag 2813
52 Creating bag 2814
53 Garbage collecting bag 2814
54 Garbage collecting bag 2000
55 Garbage collecting bag 2813
56 ...
57 Garbage collecting bag 2413
58 Garbage collecting bag 2401
59 ...
60 Garbage collecting bag 2407
61 Creating bag 2815
62 Creating bag 2816
63 Creating bag 2817
64 ...
65 ...
66 ...
67 Creating bag 9872
68 ...
69 Creating bag 9999


Saturday, June 23, 2007

Adaptive algorithms

Adaptive algorithms do not use any fixed algorithm for garbage collection. They monitor the heap and choose an algorithm that is most effective for the current usage pattern. Such algorithms may also divide the heap into sub heaps and use a different algorithm for every sub heap, depending on the usage.
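
To make the idea a little more concrete, here is a toy sketch of the "monitor and choose" step. Everything in it is an assumption made for illustration: the class AdaptiveChooser, the two strategy names, and the 50% threshold are mine and do not describe how any real JVM decides.

```java
public class AdaptiveChooser {
    /** The two candidate strategies in this toy example. */
    enum Strategy { COPYING, MARK_COMPACT }

    /**
     * Hypothetical rule: look at how much of the heap is still live and pick
     * a strategy. The 0.5 threshold is an arbitrary assumption.
     */
    static Strategy choose(long liveBytes, long heapBytes) {
        if (heapBytes <= 0) {
            throw new IllegalArgumentException("heapBytes must be positive");
        }
        double liveRatio = (double) liveBytes / heapBytes;
        // Few survivors: copying only touches live objects, so it is cheap.
        // Many survivors: copying would move most of the heap, so compact in place.
        return liveRatio < 0.5 ? Strategy.COPYING : Strategy.MARK_COMPACT;
    }

    public static void main(String[] args) {
        System.out.println("10% live: " + choose(10, 100));
        System.out.println("90% live: " + choose(90, 100));
    }
}
```

A real adaptive collector would base its decision on many more measurements, but the shape is the same: observe the heap, then switch strategy.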


Friday, June 22, 2007

Generational algorithms


Generational algorithms make use of a property found in most Java programs. Several studies have revealed a pattern in the way objects are used: most objects die young, while objects that have already survived for some time tend to go on living much longer. It is usually the youngest objects that are garbage collected first.

Look at the diagram below. The blue area in the diagram is a typical distribution for the lifetimes of objects. The X axis represents object lifetimes, measured in bytes allocated, and the Y axis represents the total bytes in objects with the corresponding lifetime. The sharp peak at the left represents objects that can be reclaimed (i.e., have "died") shortly after being allocated. Iterator objects, for example, are often alive for the duration of a single loop.

Image source: The above image has been taken from the document "Tuning garbage collection with the 5.0 Java[tm] virtual machine"

If you notice, the distribution stretches out to the right. This is because some objects live longer. Typically these are objects that have been created when the program started, and they live for the duration of the program. The lump observed after the first drop represents those objects that are created for some intermediate process. Some applications have very different looking distributions, but a surprisingly large number possess this general shape. The diagram above shows that most objects have a very short life span. Generational algorithms take advantage of this fact to optimize garbage collection in the JVM.


Generational algorithms divide the heap into several sub heaps. As objects get created, they are put in the sub heap that represents the youngest generation. When this area becomes full, the garbage collector is run and all unused objects in that sub heap get garbage collected. Objects that survive a few garbage collection attempts are promoted to a sub heap representing an older generation. The garbage collector runs most frequently in the younger heaps. Each progressively older generation of objects gets garbage collected less often.
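
The promotion scheme described above can be sketched with a toy simulation. This is only an illustration under my own assumptions: the class GenerationalSketch, the young sub heap capacity of 4, and the promotion age of 2 are made up, and a simple "live" flag stands in for real reachability analysis.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class GenerationalSketch {
    /** A toy object; the 'live' flag stands in for "still reachable". */
    static class Obj {
        final String name;
        boolean live = true;
        int survived = 0; // number of young collections survived so far
        Obj(String name) { this.name = name; }
    }

    static final int YOUNG_CAPACITY = 4; // assumed size of the young sub heap
    static final int PROMOTION_AGE = 2;  // assumed survivals needed for promotion

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    int youngCollections = 0;

    /** New objects always start in the young sub heap. */
    Obj allocate(String name) {
        if (young.size() >= YOUNG_CAPACITY) {
            collectYoung(); // young sub heap is full: collect it first
        }
        Obj o = new Obj(name);
        young.add(o);
        return o;
    }

    /** Reclaims dead young objects and promotes those that survived long enough. */
    void collectYoung() {
        youngCollections++;
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!o.live) {
                it.remove();                       // unused object: reclaimed
            } else if (++o.survived >= PROMOTION_AGE) {
                it.remove();
                old.add(o);                        // survivor: promoted to the older generation
            }
        }
    }

    /** One long-lived object followed by ten short-lived temporaries. */
    static GenerationalSketch demo() {
        GenerationalSketch heap = new GenerationalSketch();
        heap.allocate("config");                   // stays live for the whole run
        for (int i = 0; i < 10; i++) {
            heap.allocate("temp-" + i).live = false; // dies right after creation
        }
        return heap;
    }

    public static void main(String[] args) {
        GenerationalSketch heap = demo();
        System.out.println("Young collections: " + heap.youngCollections);
        System.out.println("Objects in old generation: " + heap.old.size());
    }
}
```

Only the young list is ever scanned here, which is the point of the technique: the long-lived object is examined a couple of times, gets promoted, and is then left alone while the temporaries keep being reclaimed.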

Generational algorithms are very efficient because they make use of certain well known properties of Java programs. Sun's Hotspot VM uses a modified form of generational algorithms. However, please note that this algorithm will not work efficiently with programs that make non-standard use of memory. We may configure the algorithm to work appropriately with such programs if we understand their memory usage well. If not, it is best to keep the default implementation.


  1. Tuning garbage collection with the 5.0 Java[tm] virtual machine


Monday, June 18, 2007


Attended (and partly managed/mismanaged) BlogCampPune on Saturday. I am happy we organized an event for bloggers in Pune. This medium of expression and conversation has a lot of potential.

People often point out that individual blogs (those who rant about toothpaste and the like) are doing more harm than good by cluttering the internet. But I think otherwise. There are real gems written by people who have some meaningful stuff to share. But besides sharing, a blog is a medium for individual expression. It's like having a conversation in the ether... you are generally talking to anyone who is interested... and anyone who is interested responds back. Writing also has the potential to guide a person towards personal development. Blogging also enables grassroots journalism. So, all in all, I think blogging is an excellent medium, and I'm glad BlogCampPune happened.

Overall the unconference was pretty good. I admit, wi-fi sucked, as did the presentations by some people who did not realize that one does not do blatant advertising in an unconference. Next time... please sell your wares by empowering people and not by boring them :-)

Among the sessions, I enjoyed the talk by Melody, and the discussion on old vs. new media. Some interesting thoughts there.

CNBC interviewed me, but I have no clue when they will air the 2 minute snippet.

Copying algorithms

Copying algorithms are tracing algorithms that divide the heap into multiple regions: one object region in which objects are created, and one or more free space regions (which are ignored, i.e. objects are never created in these regions). When the object space becomes full, all live objects are moved to the free space. Now the free space becomes the object space, and the old object space becomes the free space. Since objects are placed contiguously in the new region, holes (fragments) are not formed.

The most commonly used copying algorithm is known as stop and copy, which uses one object region and one free space region. When the object region becomes full, the (Java) program is paused until live objects are copied into the free space. The program is restarted after the copying phase.
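
Here is a small sketch of stop and copy on a toy heap, in case it helps to see the steps in code. It is not how a real JVM is implemented: the class StopAndCopySketch, the Cell type with its two reference fields, and the use of array indexes as "addresses" are all assumptions made for illustration. A forwarding address is recorded for each copied cell so that it is copied only once.

```java
import java.util.Arrays;

public class StopAndCopySketch {
    static final int NIL = -1; // stands for a null reference in this toy heap

    /** A toy heap cell: a label plus up to two references (slot indexes). */
    static class Cell {
        final String label;
        int left = NIL;
        int right = NIL;
        int forward = NIL; // new address, set once the cell has been copied
        Cell(String label) { this.label = label; }
    }

    Cell[] objectSpace; // region in which objects are created
    Cell[] freeSpace;   // region that stays empty until a collection
    int next = 0;       // next unused slot in objectSpace
    private int free;   // next unused slot in freeSpace during a collection

    StopAndCopySketch(int size) {
        objectSpace = new Cell[size];
        freeSpace = new Cell[size];
    }

    int allocate(String label) {
        if (next == objectSpace.length) {
            throw new IllegalStateException("object space full; collect first");
        }
        objectSpace[next] = new Cell(label);
        return next++;
    }

    /** Copies one cell into the free space (at most once) and returns its new address. */
    private int copy(int addr) {
        if (addr == NIL) return NIL;
        Cell c = objectSpace[addr];
        if (c.forward == NIL) {
            c.forward = free;
            Cell moved = new Cell(c.label);
            moved.left = c.left;   // still old addresses; fixed up while scanning
            moved.right = c.right;
            freeSpace[free++] = moved;
        }
        return c.forward;
    }

    /** Copies everything reachable from the roots and returns the roots' new addresses. */
    int[] collect(int[] roots) {
        free = 0;
        int[] newRoots = new int[roots.length];
        for (int i = 0; i < roots.length; i++) newRoots[i] = copy(roots[i]);
        // Scan the copied cells; copying what they refer to extends the scan.
        for (int scan = 0; scan < free; scan++) {
            Cell c = freeSpace[scan];
            c.left = copy(c.left);
            c.right = copy(c.right);
        }
        // Swap the roles of the two regions and clear the old object space.
        Cell[] tmp = objectSpace;
        objectSpace = freeSpace;
        freeSpace = tmp;
        Arrays.fill(freeSpace, null);
        next = free;
        return newRoots;
    }

    public static void main(String[] args) {
        StopAndCopySketch heap = new StopAndCopySketch(8);
        heap.allocate("garbage1");
        int a = heap.allocate("a");
        heap.allocate("garbage2");
        int b = heap.allocate("b");
        int c = heap.allocate("c");
        heap.objectSpace[a].left = b;
        heap.objectSpace[b].right = c;
        heap.objectSpace[c].left = a; // a cycle, to show each cell is copied once
        int[] roots = heap.collect(new int[] { a });
        System.out.println("Root is now at slot " + roots[0] + "; live objects: " + heap.next);
    }
}
```

In the sketch the "program" is simply not running while collect() executes, which plays the role of the pause. The two garbage cells are never visited at all; only the three reachable cells are touched.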

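Like compacting algorithms, copying algorithms change the addresses of objects when they are moved, so references to those objects have to be updated. Their advantage over compacting algorithms is that only live objects are visited and copied, in a single pass, so the cost of a collection depends on the amount of live data rather than on the size of the heap.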

Discuss this post in the learning forum.


Monday, June 11, 2007

May not be able to post this week

Hello Friends, I may not be able to post any new blogs this week. However, I should be able to resume normal posting again by the weekend.

Please bear with the lack of new learning material for a few days :-) See y'all soon.


Thursday, June 07, 2007

Java discussions and screencasts for learning

For some time I have been thinking about publishing a series of podcasts/screencasts to help people learning Java. I would like to structure them as discussions (with professional Java developers) instead of lectures. These discussions will be structured with the aim of educating.

Each session will be focussed on a particular Java concept or library. Things that would ideally constitute a session are:

  1. Brief explanation of the concept/library (say the Java IO library)
  2. Places where it can be applied
  3. Special situations to take care of
  4. Pain points
  5. Best practices

These sessions will be augmented with blog posts containing screencasts, code samples, exercises, and links to other learning resources.

What do you guys think of such a series? Will it be helpful to developers? Do you have any suggestions?

Here are the comments from the original post

AUTHOR: Sanket Daru
DATE: 06/07/2007 08:58:40 AM
Great Idea. Go for it.

It's only through a developer's interaction with other developers that any value will be added to his knowledge base. This to-be followed methodology of yours will go a long way in ensuring that the value-addition takes place.
DATE: 06/07/2007 11:45:40 AM
Thanks for your thoughts Sanket.

I would like to start this series with the perspective of a student who has learned Java sometime back. A discussion where he or she talks about what they found difficult to understand, and how they finally managed to grasp it. Their methodology for learning, and some tips for those who are starting now. It will be very valuable for other learners to know where pitfalls exist and how to overcome them.

Since you are the first person to post a comment on this topic and you are also a very good learner :-) I would like to start this series with a discussion with you. Would you like to participate?

AUTHOR: anonymous
DATE: 06/08/2007 05:23:39 AM
gr8 idea. Looking forward to it.
AUTHOR: Badal Burathoki
DATE: 06/11/2007 06:34:05 AM
sir, your idea to publishing a series of podcasts/screencasts will really help us to learn and the section 'Special situations to take care of' will be interesting one.
AUTHOR: kishore hariharan
DATE: 06/13/2007 09:38:45 AM
its an interesting thought to which i would like to add my own..what i found after being into development in java for the past 9 months or so is that there is a clear distinction between concepts and their implementations..concepts more or less are well understood through appropriate real world analogies..but its their code implementation that pose a challenge..the implementation part could be one other area which could be looked into..which would enable better programming practices..its just a suggestion..what do you say prof.??

kishore hariharan
student 2006 batch SCIT
AUTHOR: Sanket Daru
DATE: 06/15/2007 03:50:22 PM
Dear Sir,
Sorry didn't reply to your comment. I usually go through the blogs with the help of RSS reader and hence miss out on the comments.

Well, you know that I am always very eager to participate in any form of learning and if I can be of any help to you in this process, it will be my pleasure!

I do agree with Kishore that implementation is one area where many students fail. After clearing the base thoroughly, the implementation is the one key area which needs to be addressed.

In any case, I will be very glad to offer any help possible from my side. And now I will keep in mind to visit the blog often to check out the comments.

DATE: 06/15/2007 05:16:36 PM
Hi Sanket,

Thanks for offering to help.

DATE: 06/16/2007 08:34:30 AM
Hi Sanket,

I too have a lot of trouble keeping track of my comments and people's responses. Some blogs allow you to subscribe to the comment feed or support email notification when someone adds a comment to the post on which you have commented. However, this feature is not available on all blogs.

A simple process that I follow is to bookmark all the posts that I comment on and then check the posts after a few days to see if someone has written anything in response to my comment. This process has worked out well for me.


AUTHOR: Sanket Daru
DATE: 06/23/2007 11:31:50 AM
Dear Sir,
Will try out the tracking method suggested by you. But I guess, keeping a track of the bookmarks itself will become a headache if one is used to commenting a lot!!! Nevertheless, a good idea.

Subscribing to the comments feed doesn't make much sense, that is what I feel, because more often than not, the comments are mere one-liners, not informative and hardly worth reading! Still, some blogs have really interesting comment feeds as well. So again, it's a matter of choice and availability!

Sanket Daru.
AUTHOR: A.K.Purandare
DATE: 06/24/2007 04:18:46 AM
You are very precise in your description and plan. I believe you may be somewhere sharing your thoughts on other fields as well. Like providing solutions/help to Indian University students, discussing various problems and solutions related to online coaching and learning etc. Will be delighted to know. Thanks.
DATE: 06/25/2007 05:39:29 AM

I try to share my thoughts and the results of my online learning experiments with everyone through my blog. Even though I do not formally collaborate with or mentor any students or institute, I will be glad to share whatever I know through this blog.

For online learning there are some people who are real pioneers. They have an awesome vision of how learning should happen and how the Internet and New Media can prove to be helpful resources.

Some people whose blogs I regularly visit to get more insight on learning are:
Stephen Downes:
Cristopher Sessums:


Tuesday, June 05, 2007

Compacting algorithms

Mark and sweep algorithms result in a fragmented heap, because objects that are garbage collected in the sweep phase usually occupy arbitrary positions in the heap, resulting in holes wherever the old objects were garbage collected. Compacting algorithms have a modified sweep phase, in which all live objects are copied into one end of the heap. After they are copied, the rest of the heap is freed. This places all live objects contiguously, and the heap is defragmented.


Advantages

  • The memory is defragmented in the sweep phase.

Disadvantages

  • The [Java] program has to be paused when the garbage collector starts.

  • References have to be updated when the live objects are moved.
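
The compaction step and the reference updates it requires can be sketched on a toy heap. As before, this is only an illustration under my own assumptions: the class MarkCompactSketch, the Cell type with a single reference field, and array indexes used as addresses are invented for the example and do not reflect a real JVM.

```java
public class MarkCompactSketch {
    static final int NIL = -1; // stands for a null reference in this toy heap

    /** A toy heap cell: a label, one reference (a slot index) and a mark bit. */
    static class Cell {
        final String label;
        int ref = NIL;
        boolean marked;
        Cell(String label) { this.label = label; }
    }

    final Cell[] heap;
    int next = 0; // next unused slot; slots 0..next-1 are occupied

    MarkCompactSketch(int size) { heap = new Cell[size]; }

    int allocate(String label) {
        if (next == heap.length) {
            throw new IllegalStateException("heap full; collect first");
        }
        heap[next] = new Cell(label);
        return next++;
    }

    /** Mark phase: follows the chain of references starting at addr. */
    private void mark(int addr) {
        while (addr != NIL && !heap[addr].marked) {
            heap[addr].marked = true;
            addr = heap[addr].ref;
        }
    }

    /** Marks from the roots, slides live cells to one end, returns the roots' new addresses. */
    int[] collect(int[] roots) {
        for (int root : roots) mark(root);

        // Pass 1: work out where each live cell will end up.
        int[] newAddr = new int[heap.length];
        int free = 0;
        for (int i = 0; i < next; i++) {
            newAddr[i] = heap[i].marked ? free++ : NIL;
        }

        // Pass 2: update references and move live cells; everything else is freed.
        for (int i = 0; i < next; i++) {
            Cell c = heap[i];
            heap[i] = null;
            if (c.marked) {
                if (c.ref != NIL) c.ref = newAddr[c.ref];
                c.marked = false;
                heap[newAddr[i]] = c;
            }
        }

        int[] newRoots = new int[roots.length];
        for (int i = 0; i < roots.length; i++) {
            newRoots[i] = roots[i] == NIL ? NIL : newAddr[roots[i]];
        }
        next = free;
        return newRoots;
    }

    public static void main(String[] args) {
        MarkCompactSketch heap = new MarkCompactSketch(6);
        heap.allocate("garbage1");
        int a = heap.allocate("a");
        heap.allocate("garbage2");
        int b = heap.allocate("b");
        heap.heap[a].ref = b;
        int[] roots = heap.collect(new int[] { a });
        System.out.println("Root is now at slot " + roots[0] + "; live objects: " + heap.next);
    }
}
```

After the collection the live cells sit next to each other at the start of the array and the rest of it is empty, so new objects can again be allocated contiguously.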

Discuss this post in the learning forum.
