Coffea Works

Jakarta Commons

First Acquaintance

by Christian Gross

The Apache group has grown from providing a single Web Server into a group providing a host of different solutions. Most of the solutions tend to be server-based; for example, Xerces is an XML processor, and Jakarta Tomcat is a Java-based Web Server. The Jakarta Commons is another solution and its mission is to provide reusable Java components. This article will bring you up to speed on how to download, compile, inspect, and integrate the Jakarta Commons components.

Reusable components are the holy grail of development, much like code reuse in Object-Oriented (OO) development. Often, developers will think of an architecture, develop some classes, and call them reusable. Then, when the classes are actually used, they aren't as reusable as originally designed. The Jakarta Commons components are different because only those components that are truly reusable get to be part of the Commons.

For example, let's say that I develop some component and release it for use by other developers. If I want my component to be part of the Jakarta Commons, the component will in all probability be refused, not because of the technical implementation, but because the component is new and has not yet proven its value. So the objective for me would be to create a community of users and potentially developers.

If, in the course of time, a community develops, then we could request that the component be included in the Jakarta Commons group. If accepted, the first step is to enter the Jakarta Commons sandbox. The sandbox is a place where components are tested for their viability. Also, the ability of the component to integrate into the Jakarta Commons community will be tested. If the Jakarta Commons community likes the component and it is proven valuable, then the component will be promoted. This promotion could lead to the project becoming a full-fledged Jakarta Commons component, or even a full-fledged, top-level project on its own.

Where to Find Components
Figure 1 is a snapshot of the Jakarta Commons Web site.


A Snapshot of the Jakarta Commons Web site


On the Jakarta Commons Web site, the menu in the left-hand toolbar contains the links to different projects, contributors, binaries, daily snapshots, and pointers to everything that we'd want to know about the Commons. Near the bottom of the menu is a set of links to other Apache-hosted Commons projects -- it would be worth your time to watch the progress of these projects, which are in startup mode at the time of writing this article.

When one is first confronted with the Jakarta Commons Web site, all of the information can be a bit daunting, and we might just want to quickly find an answer without having to spend a large amount of time figuring out what the ideal component might be. The rest of the article will focus on understanding the Jakarta Commons and how the components are presented to the developer community - this will help you to understand how simple it is to integrate a Jakarta Commons component into your development environment.

The first step when using the Jakarta Commons components is to figure out which components are available by reading the descriptions of the components on the Web site. The components can be researched by clicking on the Components link on the Jakarta Commons home page -- this will result in an HTML page similar to the snapshot shown in Figure 2.


A Snapshot of the Commons Web site


On the HTML page the individual components are listed in the menu bar, and described with a short text in the main page. For initial research, the table provides enough information to get an idea of what the component can do. For more information, a link in the table can be clicked. The Web browser is then redirected to the home page of the component; see the snapshot shown in Figure 3.


A Snapshot of the Home Page of a Component


The home page of the component provides detailed descriptions of what the component does, who the project maintainers are, where to find the sources in the CVS repository, and other information. The information is provided for all Commons and Commons Sandbox components. Note that some Sandbox components may not be as well documented or defined. The Sandbox components might also not contain current information, since the component might have changed focus. Even so, you should still be able to get the gist of the component.

Using The Components
Using the Jakarta Commons components effectively is more complicated than is apparent. The problem with the Jakarta Commons components is that very often you will be using them without even knowing about it. Take for example the Jakarta Tomcat project. In the directory [Tomcat-Installation]/server/lib, there are a number of jar files used by the Tomcat application.

commons-beanutils.jar is one of them. The identifier commons indicates that the jar is from the Jakarta Commons. The identifier beanutils indicates that the jar is from the BeanUtils project. From the filename of the jar it is not apparent which version of the BeanUtils component is being distributed. The version information is stored in the manifest file, which can be extracted using the following jar command:

jar xf commons-beanutils.jar META-INF/MANIFEST.MF


The MANIFEST.MF file will contain information similar to the following listing:

Manifest-Version: 1.0
Created-By: Apache Ant 1.5.1
Extension-Name: org.apache.commons.beanutils
Specification-Title: Jakarta Commons Beanutils
Specification-Vendor: Apache Software Foundation
Specification-Version: 1.6
Implementation-Title: org.apache.commons.beanutils
Implementation-Vendor: Apache Software Foundation
Implementation-Version: 1.6


There are two parts to the manifest file -- Specification and Implementation. The Specification entries identify the version of the specification (the interfaces) that the component implements. The Implementation entries identify the version of the component implementation. For many Jakarta Commons components the Specification and Implementation versions are the same number.
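
The same two values can also be read programmatically instead of extracting the manifest by hand. The following is a minimal sketch using the standard java.util.jar classes; the class name ManifestVersions and the default path commons-beanutils.jar are assumptions made for illustration and are not part of the Commons.

```java
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, not part of the Jakarta Commons.
final class ManifestVersions {

    /** Formats the two version attributes of a manifest; missing ones show as "unknown". */
    static String describe(Manifest manifest) {
        Attributes main = manifest.getMainAttributes();
        String spec = main.getValue(Attributes.Name.SPECIFICATION_VERSION);
        String impl = main.getValue(Attributes.Name.IMPLEMENTATION_VERSION);
        return "specification=" + (spec == null ? "unknown" : spec)
             + ", implementation=" + (impl == null ? "unknown" : impl);
    }

    /** Opens a jar and describes its manifest; a jar without a manifest yields "no manifest". */
    static String describe(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest();
            return manifest == null ? "no manifest" : describe(manifest);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default path; pass the real jar location as the first argument.
        String path = args.length > 0 ? args[0] : "commons-beanutils.jar";
        System.out.println(path + ": " + describe(path));
    }
}
```

Running the class against each copy of a jar found in an installation is a quick way to see whether the copies carry the same version.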

The version and manifest file are important for the developer and the administrator because the Jakarta Commons components are used in many projects. In the individual projects the versions used might not be the latest stable version. This can be a problem in that an application could be using multiple versions of the same component. The developer could upgrade the components so that there is only one version of the component. However, that could have ramifications: an application might break because the newer component changed, or because old behavior that the application relied on was fixed. Therefore the developer has to be extremely careful of the context when choosing which version of a Jakarta Commons project to use.
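
When several copies of a component are present, it helps to know which copy a running application actually loaded. The following is a rough sketch using only standard Java reflection; the class name LoadedFrom is an assumption for illustration, and the reported version is only available when the jar's manifest declares one.

```java
import java.security.CodeSource;

// Hypothetical helper, not part of the Jakarta Commons.
final class LoadedFrom {

    /** Reports where a class was loaded from and the version its package declares, if any. */
    static String describe(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "unknown location"
                : source.getLocation().toString();
        Package pkg = type.getPackage();
        String version = (pkg == null) ? null : pkg.getImplementationVersion();
        return type.getName() + " <- " + location
             + " (implementation version: " + (version == null ? "not declared" : version) + ")";
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults to this class so the sketch runs without any Commons jar on the classpath.
        String name = args.length > 0 ? args[0] : LoadedFrom.class.getName();
        System.out.println(describe(Class.forName(name)));
    }
}
```

With commons-beanutils.jar on the classpath, passing org.apache.commons.beanutils.BeanUtils as the argument should print the location of the jar that supplied the class.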

Developing With Components
Once the developer has decided to use a Jakarta Commons component, it has to be integrated into a development environment. Because the Jakarta Commons components are packaged as jar files, the exact integration depends on the development environment (for example, NetBeans, JBuilder, or Eclipse). The point is that integrating a jar file is not that difficult, but there are other needs as well. For example, a developer will want some documentation in the form of JavaDocs. Here is where things can become tricky because a jar file does not contain the JavaDocs. Therefore, it's important for the developer to know which version of the component they are using, since the JavaDocs are specific to the version of the component.

Most components have multiple versions, and the simplest case for the developer is to integrate the latest version of the component, but that may not always be possible. When choosing a component the developer has three choices -- the latest stable, the latest beta, or a copy from the CVS repository. Though the easiest choice is the latest stable version, the best choice would be the copy from the CVS repository. The advantage of the CVS repository is that you can choose the version, compile in custom changes, and generate a distribution that includes the full JavaDocs. There are releases that include the JavaDocs, but the developer needs to search for those distributions, and old versions might not be available.

Regardless of the distribution you choose, it is recommended that you have access to the component sources. The advantage of having access to the sources is that the integrated development environment (IDE) can browse the sources. When problems arise, or when the developer does not entirely understand how a component works, the sources can be browsed and researched. In fact, when writing my book, Applied Software Engineering with Jakarta Commons, from Charles River Media (ISBN 1584502460), I learned how the individual components worked by browsing the sources.

Going back to the snapshot of the home page for the Jakarta Commons, you may have noticed the Download menu item and the Binaries menu sub-item link. Clicking on the link generates an HTML page that references a number of Jakarta binary files that can be downloaded. Included in the list are the Jakarta Commons components, but the user has to find them in the list. Each of the links will reference a mirror from which the files can be downloaded. Take note of the mirror used, as that is the reference used a bit later in this article.

Another way to download the different Jakarta Commons components is to use the wget command as shown in the following listing.

wget --recursive --level=4 --accept='*[0-9].[0-9]*.tar.gz' \
--include=/mirror/apache/dist/jakarta/commons \
http://$server/mirror/apache/dist/jakarta/commons


The wget command is a traditional UNIX command line utility that can be used to download a Web site for offline viewing. wget is available on all the major platforms -- Windows, OS X, Linux, FreeBSD, and Solaris -- and it's very easy to install too. The command line options used will force wget to iterate over the individual Commons directories and download all of the available Commons component distributions.

The command line option --accept restricts what is mirrored - only the released Commons components are downloaded. The *[0-9].[0-9]*.tar.gz is a shell-style wildcard pattern (not a regular expression) that selects all files whose names contain a version identifier and end with the extension .tar.gz. The pattern happens to work because of the naming convention used by the Commons components.
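
To get a feel for which file names such a pattern selects, the pattern can be tried out in isolation. The following is a small sketch that applies the same pattern text through Java's glob matcher; Java's glob syntax is close to, but not guaranteed to be identical with, the matching wget performs, and the class name ReleasePattern and the unversioned sample name are assumptions for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical helper, not part of wget or the Jakarta Commons.
final class ReleasePattern {

    // Same pattern text as the --accept option, interpreted by Java's glob syntax.
    private static final PathMatcher RELEASE =
            FileSystems.getDefault().getPathMatcher("glob:*[0-9].[0-9]*.tar.gz");

    /** Returns true when the file name contains digit.digit and ends with .tar.gz. */
    static boolean isRelease(String fileName) {
        return RELEASE.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        String[] names = {
            "commons-beanutils-1.6.1.tar.gz",
            "commons-beanutils-1.6.1-src.tar.gz",
            "commons-beanutils-current.tar.gz", // made-up unversioned name
            "commons-beanutils-1.6.1.zip"
        };
        for (String name : names) {
            System.out.println(name + " -> " + isRelease(name));
        }
    }
}
```

Both the binary and the source archive names satisfy the pattern, which is why the download picks up both kinds of archive.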

The command line option --include ensures that only files belonging to the Commons are downloaded. The variable $server is the identifier, referenced earlier as the mirror, of the server containing the Jakarta binary files.

When the wget command has executed, a number of files will have been downloaded, and each file represents a Jakarta Commons component version. Also, note that both the binary and source archives are downloaded - so, for example, both the binary distribution commons-beanutils-1.6.1.tar.gz and its corresponding source distribution commons-beanutils-1.6.1-src.tar.gz will have been downloaded.

The binary distribution commons-beanutils-1.6.1.tar.gz is the archive that contains the jar of the component. When the binary archive is expanded the Component sub-directory will contain one sub-directory and three items. Note that, at the minimum, the Jakarta Commons components will have the following data:
  • commons-beanutils.jar - Represents the BeanUtils jar file, and all binary Commons components will contain a jar file that has a similar naming convention to the archive.
  • Docs - A sub-directory that contains the JavaDocs that the developer can use to figure out what the packages, classes, interfaces, properties, and methods are.
  • LICENSE.txt - A file that contains the terms of the Apache license.
  • RELEASE-NOTES.txt - A file that contains the release notes for the component. The developer is urged to read the release notes so that they are aware of potential limitations and changes.

With the binary distribution the developer can integrate the jar file and JavaDocs into their development process. If the developer cares about the implementation details or wants to alter the components, then the sources can be of great interest.

Building the Commons Components
When the commons-beanutils-1.6.1-src.tar.gz source archive is expanded, you'll find the jar file missing. Instead, you'll find an added src source directory, the xdocs raw documentation directory, and the project build files. The project files for most (if not all) Commons components are based on the Maven build tool. It is not necessary to use Maven because an Ant build file will be supplied as well.

Here things might get a bit tricky because one may not know how to compile the components. There are three ways to get this done - using Ant or Maven, or integrating the sources into your IDE.

Integrating the sources directly into your IDE is useful when you want to modify the component sources to do something different. It is not a problem to integrate the sources because many components are very simple and straightforward. The only problem, if you do decide to integrate the sources into your own program, is that there might be problems regarding code synchronization. By integrating the sources you are branching the component and that is problematic. After having made changes to the sources it would be a good idea to give back the changes to the appropriate Commons project. If the changes are rejected or you do not wish to distribute the changes then you will need to perform code merging on a regular basis. The exact details of code merging are beyond the scope of this article, but you can e-mail me to get some pointers.

The other option is to use the Maven maven.xml file or the Ant build.xml file to build the component. Many of you might have used the Ant build tool, but may not be as well-acquainted with the Maven build tool. First, the Maven build tool is not just a build tool, but is considered to be a project management tool. Maven is a solution created to manage modular projects that are all built using Maven. It can be considered a higher-level abstraction of the Ant build tool. Essentially, everything that Maven can do, Ant can do as well; it is just that Ant requires some extra steps.

This then raises the question of whether to use Maven or Ant. For building a Commons component the simplest tool to use is Maven. However, when integrating the build process of a Commons component into your development process, it's simpler to use Ant. Maven is a good build tool as long as you accept and use the Maven guidelines. It is an abstraction of the project build sequence and has its own way of building, testing, and generating JavaDocs. Having said that, my personal preference is Ant, but it'd be best to inspect both Ant and Maven and then decide which tool to use.

To build a Jakarta Commons component the developer needs to download and install either Ant or Maven. Having installed one of these and added the application to the PATH, it's possible to build a Commons component. To do so, open a console and change the working directory to the root directory of the Commons component. Then, in the console, execute the command ant or maven. In either case, the component will compile and there should be no error messages. If there are error messages, the most likely cause is that the build.properties configuration file, located in the directory of the component, needs to be properly defined.

When the classes are compiled a jar will not be created. The classes will typically be compiled into the target sub-directory. When executing the application the classes can be referenced directly by adding the target directory to the Java CLASSPATH. To build a jar file using Ant, build the dist target. After the build has completed, the dist sub-directory will hold all of the files that make up the component distribution. The dist sub-directory will include the jar file and the JavaDocs.

If you are planning on using multiple Jakarta Commons components and have made them a part of your overall strategy, the CVS sources should be used. CVS is a version control program used to download the sources from the Apache Web site. Note that some projects are starting to use the CVS replacement -- Subversion. The Jakarta Commons does not use Subversion just yet, but it may do so in the future. The Jakarta Commons components are stored in the jakarta-commons CVS repository, and the Jakarta Commons Sandbox components are stored in jakarta-commons-sandbox. Note that when downloading the sources for the first time they will be the latest (and greatest), meaning that the components might not work properly.

One of the advantages of using CVS is the ability to retrieve and build the sources of a specific Commons component version. To view all of the available revisions of a Commons component, open a console and change the directory to a CVS downloaded component. Then, within the directory, execute the following command:

cvs log build.xml


The command will generate some text similar to the following:

RCS file: /home/cvspublic/jakarta-commons/beanutils/build.xml,v
Working file: build.xml
head: 1.58
branch:
locks: strict
access list:
symbolic names:
STRUTS_1_1_RC1: 1.50
BEANUTILS_1_6_1: 1.49
BEANUTILS_1_6_1_RC1: 1.48
BEANUTILS_1_6: 1.44
STRUTS_1_1_B3: 1.40
BEANUTILS_1_5: 1.39
BEANUTILS_1_4_1: 1.37
BEANUTILS_1_4: 1.35
STRUTS_1_1_B2: 1.34
BEANUTILS_1_3: 1.32
STRUTS_1_1_B1: 1.30
jwsdp_10_ea2_01: 1.29
BEANUTILS_1_2: 1.20
BEANUTILS_1_1: 1.16
BEANUTILS_1_0: 1.14
keyword substitution: kv
total revisions: 58; selected revisions: 58


In the section entitled symbolic names you'll find a number of tags that define the various releases of the BeanUtils component. Included in this list of releases are the standard version numbers, and releases for different frameworks such as Struts or the Java Web Services Developer Pack (jwsdp). The version numbers 1.XX are CVS revision numbers of the build.xml file. It is important to note that a release of the component for a framework such as Struts might not coincide with a formal release. This is why it is extremely important to use the right version when developing applications that run in the context of another application framework. You should use the version that is distributed with the application framework. To download the sources associated with a tag, look at the CVS documentation for the -r option of the checkout and update commands.
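
If the tag list needs to be processed by a script, for example to check whether a given release tag exists, the symbolic names section can be picked out of the log text. The following is a minimal sketch that assumes the log layout shown in the listing above; the class name CvsTags is made up for illustration, and the parser is not part of CVS.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of CVS or the Jakarta Commons.
final class CvsTags {

    /** Collects "TAG: revision" pairs listed after the "symbolic names:" line. */
    static Map<String, String> parse(List<String> logLines) {
        Map<String, String> tags = new LinkedHashMap<>();
        boolean inSection = false;
        for (String raw : logLines) {
            String line = raw.trim();
            if (line.equals("symbolic names:")) {
                inSection = true;
            } else if (inSection) {
                int colon = line.indexOf(':');
                // Tag names contain no spaces; a header such as
                // "keyword substitution: kv" does, so it ends the section.
                if (colon <= 0 || line.substring(0, colon).contains(" ")) {
                    break;
                }
                tags.put(line.substring(0, colon), line.substring(colon + 1).trim());
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        // A shortened copy of the log listing, used as sample input.
        List<String> sample = Arrays.asList(
            "symbolic names:",
            "BEANUTILS_1_6_1: 1.49",
            "STRUTS_1_1_B3: 1.40",
            "keyword substitution: kv");
        parse(sample).forEach((tag, rev) -> System.out.println(tag + " -> " + rev));
    }
}
```

The map keeps the tags in the order in which the log lists them, so the most recent tags come first.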

Why No Details?
So far we have not embarked on a discussion of individual Jakarta Commons components. All that we have seen is how to download, compile, inspect, and integrate the components. The reason is that the components are so numerous and change so often that it is more important to understand how the Jakarta Commons infrastructure works. By understanding the Jakarta Commons infrastructure you are ready and capable of dealing with problems or bugs as and when they arise. Note that issues could occur in the development cycle or in the deployment phase.

Not outlined, but equally important, is that anybody using the Jakarta Commons components needs to subscribe to the Jakarta Commons mailing lists. Design decisions and new components are discussed on the mailing lists, and following them ensures that you are not taken by surprise when big changes happen. Finally, after having read this article, the next step would be to head over to the Jakarta Commons Web site and use CVS to download the sources. Good luck! If you have any further questions, contact me.

