Skip to main content

Catching up with Java 5

Java 5 (a.k.a Tiger) has been around from a while. But there are still many developer's (including myself) who do not know about and use all it's features.

So, in an effort to educate myself and help others, I have decided to spend some time everyday reading Java 1.5 Tiger A Developer's Notebook, and share my findings with others on this blog.

Something I found out today (I know this should have happened long back, but such is the profession of programming :-) ), is that since Java 1.5 there is support for Unicode 4 which supports a supplemantary character set, that goes beyond 16 bits. An interesting implication is that a the char data type may no longer be able to hold all characters, because those in the supplementary range can now take upto 21 bits.

This means that a string that contains certain characters may have to encode them as 2 char data types. Such a pair of characters that represents one codepoint is known as a surrogate pair. Now a string with n codepoints may no longer be n characters long, because some code points will be encoded using one character, while some will use a surrogate pair.

A few questions have come to my mind about parsing such strings. How do I determine which codepoint appears in the middle of the String?

I came across this article that explains support for unicode 4 in Java. I will read it and share any interesting findings on this blog.

Meanwhile for a more general explanation of unicode, I strongly recommend this excellent article by Joel Spolsky: The absolute minimum every software developer absolutely, positively must know about unicode and character sets (no excuses!)

  • Discuss this post in the learning forum.
  • Check out my learning journal. I am learning JSF at the moment. Do you want to join an experiment in forming an adhoc virtual study group?
Note: This text was originally posted on my earlier blog at http://www.adaptivelearningonline.net
Here are the comments from the original post

-----
COMMENT:
AUTHOR: Manjari
URL: http://simplymanjari.blogspot.com/
DATE: 08/17/2007 05:33:37 AM
Thanks for the link to Joel's article on Unicode. I discovered I was blissfully ignorant in that context.
-----
COMMENT:
AUTHOR: Parag
DATE: 08/18/2007 06:31:12 PM
You are very welcome Manjari, and thanks for the comment :-)

--
Regards
Parag

Comments

Popular posts from this blog

Commenting your code

Comments are an integral part of any program, even though they do not contribute to the logic. Appropriate comments add to the maintainability of a software. I have heard developers complain about not remembering the logic of some code they wrote a few months back. Can you imagine how difficult it can be to understand programs written by others, when we sometimes find it hard to understand our own code. It is a nightmare to maintain programs that are not appropriately commented. Java classes should contain comments at various levels. There are two types of comments; implementation comments and documentation comments. Implementation comments usually explain design desicisions, or a particularly intricate peice of code. If you find the need to make a lot of implementation comments, then it may signal overly complex code. Documentation comments usually describe the API of a program, they are meant for developers who are going to use your classes. All classes, methods and variables ...

Inheritance vs. composition depending on how much is same and how much differs

I am reading the excellent Django book right now. In the 4th chapter on Django templates , there is an example of includes and inheritance in Django templates. Without going into details about Django templates, the include is very similar to composition where we can include the text of another template for evaluation. Inheritance in Django templates works in a way similar to object inheritance. Django templates can specify certain blocks which can be redefined in subtemplates. The subtemplates use the rest of the parent template as is. Now we have all learned that inheritance is used when we have a is-a relationship between classes, and composition is used when we have a contains-a relationship. This is absolutely right, but while reading about Django templates, I just realized another pattern in these relationships. This is really simple and perhaps many of you may have already have had this insight... We use inheritance when we want to allow reuse of the bulk of one object in other ...

Planning a User Guide - Part 3/5 - Co-ordinate the Team

Photo by  Helloquence  on  Unsplash This is the third post in a series of five posts on how to plan a user guide. In the first post , I wrote about how to conduct an audience analysis and the second post discussed how to define the overall scope of the manual. Once the overall scope of the user guide is defined, the next step is to coordinate the team that will work on creating the manual. A typical team will consist of the following roles. Many of these roles will be fulfilled by freelancers since they are one-off or intermittent work engagements. At the end of the article, I have provided a list of websites where you can find good freelancers. Creative Artist You'll need to work with a creative artist to design the cover page and any other images for the user guide. Most small to mid-sized companies don't have a dedicated creative artist on their rolls. But that's not a problem. There are several freelancing websites where you can work with great creative ...