
The Java Classloader #2

Now that we understand the importance of dynamic linking, assume that you are running a dynamically linked C++ program which needs to use a class that is in an external library. The operating system's loader locates the external library using an environment variable, such as PATH on Windows (Linux uses LD_LIBRARY_PATH along with its standard library directories).

Java programs do not use the PATH environment variable to locate classes; they use the CLASSPATH variable instead. You are probably wondering why Java does not use PATH. Why did it invent another environment variable? One reason is that PATH already contains entries needed by other executables. Sharing PATH could slow things down, because both the operating system and the Java runtime would have to process entries that are irrelevant to them.

OK, so if that convinces you, let's have a look at what the CLASSPATH contains. The CLASSPATH contains a list of directories, zip files or jar files. The directories contain class files that will be needed by the Java runtime. (Note: Core Java classes like the String class are found automatically by the runtime. We do not need to add entries in the CLASSPATH to locate them.) Zip files and jar files are also allowed because, if you think about it, they are nothing but archived and compressed directories.

Consider a scenario where we are running a Java-based Student Registration application. During execution, some class tries to instantiate the Student class. Since a Java program is not distributed as one large executable file, the JVM needs to locate and load the Student class. The JVM iterates through the entries in the CLASSPATH, in order, and searches each one for the class file. If the class cannot be found, the JVM throws a ClassNotFoundException (or a NoClassDefFoundError, when the missing class is referenced directly in code rather than loaded by name). If a class file exists in a directory or jar file not listed in the CLASSPATH, then as far as the JVM is concerned, it does not exist. The responsibility of locating and loading classes is assigned to the Java Classloader.
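To see this from inside a running program, here is a minimal sketch that lists the entries the JVM is actually using and tries to load two classes by name. The class name ClasspathProbe and the missing class com.example.missing.NoSuchStudent are made up for illustration; the java.class.path system property and Class.forName are standard.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

final class ClasspathProbe {

    // Splits the JVM's effective class path into its individual entries.
    // The separator is ';' on Windows and ':' on Unix-like systems.
    static List<String> classpathEntries() {
        String classpath = System.getProperty("java.class.path", "");
        return Arrays.asList(classpath.split(File.pathSeparator));
    }

    // Returns true if a class with this fully qualified name can be loaded.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            // No classpath entry (and no core library) contains this class.
            return false;
        }
    }

    public static void main(String[] args) {
        for (String entry : classpathEntries()) {
            System.out.println("classpath entry: " + entry);
        }
        // A core class: found without any CLASSPATH entry.
        System.out.println(isLoadable("java.lang.String"));
        // A hypothetical class that is assumed not to exist anywhere.
        System.out.println(isLoadable("com.example.missing.NoSuchStudent"));
    }
}
```

Run it with different CLASSPATH settings and the list of entries changes accordingly, while java.lang.String stays loadable in every case.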

There are a few holes in this approach. Can you find them? What if an external library we are using also has a Student class? Which class will the Classloader load: our Student class or the one from the external library? Is there an unambiguous way to determine the right class? We asked the JVM to load a class called Student, and it has no way of knowing which of the multiple Student classes in the CLASSPATH is the right one. Clearly we need a unique attribute that differentiates all the Student classes. This attribute is called a namespace, which is represented in Java as a package name. Every Java class exists in a namespace, which is specified using the package keyword. Even classes that do not declare a package belong to the default package.

The code below shows how we can create a Student class which belongs to the 'edu.scit.studentreg' package.

package edu.scit.studentreg;

public class Student {

}

The use of packages differentiates this class from another class which is also called Student but belongs to the namespace 'com.oracle.studentreg'.

package com.oracle.studentreg;

public class Student {

}

Now, when we want to instantiate a Student class, we use the fully qualified name of the class.

edu.scit.studentreg.Student s1 = new edu.scit.studentreg.Student();

or

com.oracle.studentreg.Student s2 = new com.oracle.studentreg.Student();
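The same idea can be tried without writing two Student classes, because the JDK itself ships classes that share a simple name. The sketch below uses java.net.Proxy and java.lang.reflect.Proxy purely as stand-ins for the two Student classes; the class name SameSimpleName is made up for illustration.

```java
final class SameSimpleName {

    // Shows the short name next to the fully qualified name of a class.
    static String describe(Class<?> type) {
        return type.getSimpleName() + " -> " + type.getName();
    }

    public static void main(String[] args) {
        // Two unrelated classes that are both called Proxy.
        System.out.println(describe(java.net.Proxy.class));
        // prints "Proxy -> java.net.Proxy"
        System.out.println(describe(java.lang.reflect.Proxy.class));
        // prints "Proxy -> java.lang.reflect.Proxy"
    }
}
```

The simple names collide, but the fully qualified names do not, which is exactly what lets both classes coexist.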

Because we have used the fully qualified class name, the Classloader knows exactly which Student class we are referring to. But wait... a Student class is compiled into a file called Student.class. We do not use the package name in the name of the file. How does the Classloader know which class file is the right one? Even though the class file contains the package name, searching for Student.class in every directory and subdirectory in the CLASSPATH, and opening each one to check its package name, would be very time consuming. We need a better way of organizing class files, so the JVM can locate them quickly.

The convention used is to match the directory hierarchy in which a class file is placed with the package name of the class. Let us understand this with the example we used above. Where should we put the class edu.scit.studentreg.Student? First of all, we choose a base directory in which we will put this and many other classes. If we decide to put our classes in c:\scit\classes, then we have to follow the convention relative to c:\scit\classes. We must create a directory called 'edu' in c:\scit\classes, then a directory called 'scit' in 'edu', and a directory called 'studentreg' in 'scit'. The file Student.class is placed inside the 'studentreg' directory.

What do you think should go in the CLASSPATH: c:\scit\classes or c:\scit\classes\edu\scit\studentreg? We put c:\scit\classes, the base directory, because the package name supplies the rest of the path.

When the Classloader needs to locate the class edu.scit.studentreg.Student, it looks at the first entry in the classpath. Suppose it is c:\scit\classes. The Classloader now zeroes in on the appropriate location based on the package name of the class. First it looks for a directory edu in c:\scit\classes. If it finds the directory, it looks for scit inside edu and studentreg inside scit. Once in the studentreg directory, it looks for a file called Student.class. If the file is found, it is loaded; otherwise the next entry in the classpath is searched. If the file is not found in any of the entries specified in the classpath, a ClassNotFoundException is thrown.
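The lookup just described can be sketched in a few lines. This is only an illustration of the convention, not the JVM's actual implementation: it handles directory entries only (no jar or zip files), and the names DirectoryLookupSketch, toRelativePath and locate are made up for this example.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

final class DirectoryLookupSketch {

    // edu.scit.studentreg.Student -> edu/scit/studentreg/Student.class
    static String toRelativePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Walks the classpath entries in order and returns the first class file found.
    static Optional<Path> locate(String classpath, String className) {
        String relative = toRelativePath(className);
        for (String entry : classpath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue; // simplification: this sketch skips empty entries
            }
            Path candidate = Paths.get(entry).resolve(relative);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        // The real Classloader would throw ClassNotFoundException at this point.
        return Optional.empty();
    }

    public static void main(String[] args) {
        String classpath = System.getProperty("java.class.path", "");
        System.out.println(toRelativePath("edu.scit.studentreg.Student"));
        System.out.println(locate(classpath, "edu.scit.studentreg.Student"));
    }
}
```

Note that the first matching entry wins, which is why the order of the entries in the classpath matters when two entries contain a class with the same fully qualified name.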

This is how the Classloader locates and loads class files.

Even if you have understood how the Classloader locates class files, I suggest that you practice a couple of times to understand the concept better. To get the most out of these exercises, I strongly recommend that you work with Notepad and the command line.

  1. Create any class, associate it with the package example.code, and compile it. Ensure that the Java source file as well as the class file exist in the proper directory hierarchy. Add the appropriate directory to the CLASSPATH and run the program from a totally different directory.
  2. Change the location of the class file and run the program.
  3. Remove the directory from the CLASSPATH, and run the program from the parent directory of the directory which you had put in your CLASSPATH. Does the program run? If not, try removing all CLASSPATH entries and run the program again.
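While working through these exercises, it can help to ask a class where it was loaded from. The sketch below is one possible way to do that, assuming the standard getProtectionDomain and CodeSource APIs; the class name WhereLoadedFrom and the placeholder text it returns for core classes are made up for illustration.

```java
import java.security.CodeSource;

final class WhereLoadedFrom {

    // Returns the directory or jar a class was loaded from, if the JVM reports one.
    static String originOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Core classes such as String typically report no code source.
            return "(no location reported, probably a core class)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Usually the classpath entry that contains this class.
        System.out.println(originOf(WhereLoadedFrom.class));
        // A core class, found without any CLASSPATH entry.
        System.out.println(originOf(String.class));
    }
}
```

Adding a similar call to your own example.code class shows which CLASSPATH entry actually supplied it after each change.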

Please note: while removing CLASSPATH entries, do not remove them from the environment of the operating system. Each command prompt starts with the CLASSPATH that is set for that user, but we can modify the CLASSPATH of just that command prompt without affecting the OS's environment (on Windows, for example, typing set CLASSPATH=c:\scit\classes changes it only for the current command prompt).

In the next post, we will look at some more nuances of the Classloader.



Notes: This text was originally posted on my earlier blog at http://www.adaptivelearningonline.net
