Skip to main content

Catching up with Java 5

Java 5 (a.k.a Tiger) has been around from a while. But there are still many developer's (including myself) who do not know about and use all it's features.

So, in an effort to educate myself and help others, I have decided to spend some time everyday reading Java 1.5 Tiger A Developer's Notebook, and share my findings with others on this blog.

Something I found out today (I know this should have happened long back, but such is the profession of programming :-) ), is that since Java 1.5 there is support for Unicode 4 which supports a supplemantary character set, that goes beyond 16 bits. An interesting implication is that a the char data type may no longer be able to hold all characters, because those in the supplementary range can now take upto 21 bits.

This means that a string that contains certain characters may have to encode them as 2 char data types. Such a pair of characters that represents one codepoint is known as a surrogate pair. Now a string with n codepoints may no longer be n characters long, because some code points will be encoded using one character, while some will use a surrogate pair.

A few questions have come to my mind about parsing such strings. How do I determine which codepoint appears in the middle of the String?

I came across this article that explains support for unicode 4 in Java. I will read it and share any interesting findings on this blog.

Meanwhile for a more general explanation of unicode, I strongly recommend this excellent article by Joel Spolsky: The absolute minimum every software developer absolutely, positively must know about unicode and character sets (no excuses!)

  • Discuss this post in the learning forum.
  • Check out my learning journal. I am learning JSF at the moment. Do you want to join an experiment in forming an adhoc virtual study group?
Note: This text was originally posted on my earlier blog at http://www.adaptivelearningonline.net
Here are the comments from the original post

-----
COMMENT:
AUTHOR: Manjari
URL: http://simplymanjari.blogspot.com/
DATE: 08/17/2007 05:33:37 AM
Thanks for the link to Joel's article on Unicode. I discovered I was blissfully ignorant in that context.
-----
COMMENT:
AUTHOR: Parag
DATE: 08/18/2007 06:31:12 PM
You are very welcome Manjari, and thanks for the comment :-)

--
Regards
Parag

Comments

Popular posts from this blog

Inheritance vs. composition depending on how much is same and how much differs

I am reading the excellent Django book right now. In the 4th chapter on Django templates , there is an example of includes and inheritance in Django templates. Without going into details about Django templates, the include is very similar to composition where we can include the text of another template for evaluation. Inheritance in Django templates works in a way similar to object inheritance. Django templates can specify certain blocks which can be redefined in subtemplates. The subtemplates use the rest of the parent template as is. Now we have all learned that inheritance is used when we have a is-a relationship between classes, and composition is used when we have a contains-a relationship. This is absolutely right, but while reading about Django templates, I just realized another pattern in these relationships. This is really simple and perhaps many of you may have already have had this insight... We use inheritance when we want to allow reuse of the bulk of one object in other ...

Running your own one person company

Recently there was a post on PuneTech on mom's re-entering the IT work force after a break. Two of the biggest concerns mentioned were : Coping with vast advances (changes) in the IT landscape Balancing work and family responsibilities Since I have been running a one person company for a good amount of time, I suggested that as an option. In this post I will discuss various aspects of running a one person company. Advantages: You have full control of your time. You can choose to spend as much or as little time as you would like. There is also a good chance that you will be able to decide when you want to spend that time. You get to work on something that you enjoy doing. Tremendous work satisfaction. You have the option of working from home. Disadvantages: It can take a little while for the work to get set, so you may not be able to see revenues for some time. It takes a huge amount of discipline to work without a boss, and without deadlines. You will not get the benefits (insuranc...

Commenting your code

Comments are an integral part of any program, even though they do not contribute to the logic. Appropriate comments add to the maintainability of a software. I have heard developers complain about not remembering the logic of some code they wrote a few months back. Can you imagine how difficult it can be to understand programs written by others, when we sometimes find it hard to understand our own code. It is a nightmare to maintain programs that are not appropriately commented. Java classes should contain comments at various levels. There are two types of comments; implementation comments and documentation comments. Implementation comments usually explain design desicisions, or a particularly intricate peice of code. If you find the need to make a lot of implementation comments, then it may signal overly complex code. Documentation comments usually describe the API of a program, they are meant for developers who are going to use your classes. All classes, methods and variables ...