Skip to main content


Showing posts from May, 2007

Tracing algorithms

Before we understand what tracing algorithms are, we must first understand the meaning of the phrase “root set of references” . Essentially the root set of references are those references that are always accessible to an executing Java program. They constitute local variables, method arguments, class variables, and instance variables. When triggered, tracing algorithms mark all objects on the heap that can be traced from the root set of references. These are objects that are directly or indirectly referenced by the root set. Let's consider these to be live objects. This forms the mark phase of the tracing algorithm. The mark phase is followed by a sweep phase, in which all objects that were not marked in the mark phase are garbage collected. Because of it's two phases, this type of tracing algorithm is also known as “mark and sweep” algorithm. Advantages: Can detect and handle cycles Disadvantages: The heap is fragmented, ater the sweep phase. Having a fra

Reference counting algorithms

Early JVM's used reference counting to determine objects no longer in use. In reference counting, an objects reference count is initially set to one when it is instantiated and it's reference is assigned to a variable. Every time a reference of that object is assigned to another variable, the count is incremented by one. When a variable holding the reference goes out of scope, the count of the object is decremented by one. An object is eligible for garbage collection when it's reference count becomes zero. Let's see how reference counting algorithms work, with a simple animation. Right click in the region below and select 'play' to start the animation. Advantages: It is very simple to implement. The JVM does not need to be paused when the garbage collector is running. Disadvantages: There is an extra overhead since the reference count of objects needs to be updated very often.

Garbage collection in Java

I am starting a series of posts on the Java garbage collector and various garbage collection algorithms. Over the 10 days or so, we will examine the Java garbage collector and garbage collection algorithms. Let's begin with a brief introduction. An executing program occupies memory. Memory is taken as new objects are created. This memory should be returned when these objects are no longer needed. If it is not, then the computer will soon run out of resources, forcing the program to a halt. In C++ we have to explicitly return an object's memory back to the system when the object is destroyed. This is also the source for many bugs in C++. Buggy code that does not return memory can very soon exhaust system resources. To eliminate such errors, and make your program more robust, Java has adopted the philosophy of automatic memory reclamation. The JVM determines when an object is no longer needed (Eg: after the variable goes out of scope), and automatically reclaims memory occupied b

Garbage collection example

A Java program can request the underlying JVM to perform garbage collection by calling the gc() method of the System class. Note that System.gc() is a request to the underlying JVM. Garbage collection may not always happen. Hence we cannot depend on this method, however we can use it to optimize performance, with the understanding that it may not work as desired on all JVM's.   Study the program shown below. It ilustrates how we can invoke the garbage collector from a Java program. 01   /** This program creates instances of Bag objects in a loop which will run 100000 times. 02     * Whenever the program is in a loop that is a multiple of 1000, it requests the JVM 03     * to start the garbage collector. We will see that the JVM may not always fulfill the 04     * request. 05     */ 06   public class  GarbageCollectionDemo  { 07      public static  void  main ( String args []) { 08        //loop 10000 times 09        for ( int  i= 0 ;i< 1

Adaptive algorithms

Adaptive algorithms do not use any fixed algorithm for garbage collection. They monitor the heap and choose an algorithm that is most effective for the current usage pattern. Such algorithms may also divide the heap into sub heaps and use a different algorithm for every sub heap, depending on the usage. Note: This text was originally posted on my earlier blog at

Generational algorithms

Introduction: Generational algorithms make use of a property found in most Java programs. Several studies have revealed a pattern in the way objects are used. It has been found that objects that get created early in program execution usually live longer than objects that get created later. It is usually the youngest objects that are garbage collected first.   Look at the diagram below. The blue area in the diagram is a typical distribution for the lifetimes of objects. The X axis represents the bytes allocated by the JVM, and the Y axis represents the number of surviving bytes (live objects). The sharp peak at the left represents objects that can be reclaimed (i.e., have "died") shortly after being allocated. Iterator objects, for example, are often alive for the duration of a single loop.   If you notice, the distribution stretches out to the the right. This is because some objects live longer. Typically these are objects that have been created when the program st

Copying algorithms

Copying algorithms are tracing algorithms that divide the heap into multiple regions; one object region in which objects are created, and one or more free space regions [which are ignored, ie. objects are never created in these regions]. When the object space becomes full, all live objects are moved to the free space. Now the free space becomes the object space, and the old object space becomes the free space. Since objects are placed contiguously in the new region, holes [fragments] are not formed. The most commonly used copying algorithm is known as stop and copy, which uses one object region and one free space region. When the object region becomes full, the [Java] program is paused until live objects are copied into the free space. The program is restarted after the copying phase.   Copying algorithms have an advantage over compacting algorithms in that the address of object references does not change when the objects are moved. Discuss this post in the learning forum . <

Deciding what to log

In this section we will discuss what to log and the various levels at which we can generate log statements. Even though I have mentioned that we should generate log statements when control enters and exits a method, this is a rather controversial point. Especially generating log statements on exiting; because a method could have multiple exit routes. What do you think? I would say, if in doubt - do NOT generate logs when exiting a method. Logs that show control has entered a method do have value in my personal opinion, however you can leave those out as well if you do not see much value in them. Let me first draw your attention to the output from last page. 0 [main] DEBUG LoggerTest - Entering main 453 [main] DEBUG LoggerTest - Exiting main What is DEBUG? As I have explained, we should log several events from our program as it executes. But all events are not equally important. Hence, Log4J supports six levels for logging, where each level is appropriate for a

Log4j's configuration file

Log4J configuration file # Set root logger level to DEBUG and its only appender to A1. log4j.rootLogger=DEBUG, A1 # A1 is set to be a ConsoleAppender. log4j.appender.A1=org.apache.log4j.ConsoleAppender # A1 uses PatternLayout. log4j.appender.A1.layout=org.apache.log4j.PatternLayout log4j.appender.A1.layout.ConversionPattern=%-4r [%t] %-5p %c - %m%n Program 01   import  org.apache.log4j.*; 02   public class  LoggerTest  { 03   04      static  String category = LoggerTest. class .getName () ; 05      static  Logger logger = Logger.getLogger ( category ) ; 06      static  { 07        PropertyConfigurator.configure ( "" ) ; 08      } 09   10      public static  void  main ( String args []) { 11        logger.debug ( "Entering main" ) ; 12        System.out.println ( "Hello World" ) ; 13        try  { 14          Thread.sleep ( 450 ) ; 15        }  catch ( Exception e ) { 16          System.out.println (

Initializing Log4j

  In the next post I will describe a log4j configuration file.  Discuss this post in the learning forum . Note: This text was originally posted on my earlier blog at     Commercial Links Buy programming books from our powered storefront . Earn as you browse. Sign up for Algoco .  

Log4j an introduction

Log4J is a logging library that helps us create log statements that are way more powerful than System.out.println("..."). Log4J messages can be formatted with a timestamp, source, and many other things. The Log4J system is initialzed using a property file, which determines the destination, and formatting of the log messages. Log messages can be selectively turned on/off by modifying the property file.   Here is Log4J's documentation page. It has a very good introduction and a wiki .  Discuss this post in the learning forum . Note: This text was originally posted on my earlier blog at     Commercial Links Buy programming books from our powered storefront . Earn as you browse. Sign up for Algoco .  

Generating log statements from your application

In most software applications, control passes through complex code while users use the system. For large applications it is very important to keep track of where control is passing through in the application, if any exceptions are being thrown and what actions are being invoked by the user. Keepingtrack of these things help in maintaining an audit log and also in debugging the system if it fails at any point. In this section we will understand why we should not use System.out.println() and we will also understand how to use a very good open source alternative called Log4J . When you write small programs and run them, how do you keep track of where your program is passing control? How do you debug your program if it does'nt work properly? Most likely the answer is: System.out.println("some message"); This is the most common way to debug and track a small program, but it is also a flawed approach. It is cumbersome to print the source of the message. It is cumbersome to prin

Resources for coding conventions

Yesterday, I posted about simple coding conventions. Following good coding conventions is like keeping your car keys in the same place everyday. If you always keep them in the same place, you as well as other's in your house will be able to find them easily. However, if you keep them in some random place, everyone is going to have a hrd time finding them, resulting in unproductive time. Staying with the car analogy, imagine what would happen if different cars had their brake pedal in different places. The reason why we are easily able to drive different cars with ease is because we where to find the things we need to drive the car. It's important that you follow some coding convention. It could be a homegrown convention for your company. It's ok, but, other people may have a hard time understanding your code. Not just other people, but your own developers who have recently come from other companies will also have to learn your coding conventions. To elliminate such problems

Commenting your code

Comments are an integral part of any program, even though they do not contribute to the logic. Appropriate comments add to the maintainability of a software. I have heard developers complain about not remembering the logic of some code they wrote a few months back. Can you imagine how difficult it can be to understand programs written by others, when we sometimes find it hard to understand our own code. It is a nightmare to maintain programs that are not appropriately commented. Java classes should contain comments at various levels. There are two types of comments; implementation comments and documentation comments. Implementation comments usually explain design desicisions, or a particularly intricate peice of code. If you find the need to make a lot of implementation comments, then it may signal overly complex code. Documentation comments usually describe the API of a program, they are meant for developers who are going to use your classes. All classes, methods and variables

Online, peer-taught, outcome-based, and instructor-facilitated teaching

From some time I have been thinking of the right way to teach a programming course. Something I have noticed from the way I learn, as well as by having taught Java to college students and corporate participants, is that programming is not learned well if it is detached from practice and discussion. In fact, practice and discussion are the most important aspects of learning how to program. Here are some numbers from the National Training Laboratories, Maine, USA, that point to how retention rates differ for different learning approaches. Research shows that discussion, practice, and teaching are the most effective ways to learn. My own learning experience also validates this finding. When I simply read about a programming technology, I might forget it in a few days, but when I practice it, the learning remains with me for longer, and when I teach something, I retain it for much longer, and even more importantly, I also grasp the fundamentals in a better way. All of these concepts ca