Monday, November 24, 2008

OSGi: a Dynamic Component System for Java

I have known about OSGi since I finally adopted Eclipse as my Java IDE four years ago. Recently, two things have re-ignited my strong interest in OSGi.

One is GridSphere. GridSphere is a portal framework implementing JSR-168, running on Tomcat. I am evaluating some portlets running in GridSphere. I found out that deploying either GridSphere itself or a portlet in GridSphere requires restarting the JVM. Clearly, this is not a desirable trait in enterprise software. The selling point of a portal framework is that the portlets (i.e., the components) can be developed asynchronously and in a distributed fashion, and still work in harmony in one portal. Let's say we have a portal that contains 25 portlets, and each portlet releases two major updates and two security patches every year. If we need to restart the portal every time we update a portlet, then we need to restart the portal 100 times a year (25 portlets × 4 releases). This is not good news for the administrator. Clearly we need better support for components when architecting server-side software. At the least, we should allow an individual component to be deployed and undeployed dynamically without affecting the rest of the system. By the way, 10 years ago in an interview with Huawei, I was asked how to patch software while it is still running.

The other is SpringSource dm server, which uses OSGi. It seems to me OSGi is the answer to the architectural challenges in any complex, server-side, enterprise, or component-based software.

A further look at the OSGi website reveals that OSGi is behind many web application servers and J2EE servers: IBM WebSphere, SpringSource Application Server, Oracle (formerly BEA) WebLogic, Sun's GlassFish, and Red Hat's JBoss. And not surprisingly, the top two showcases for adopting OSGi are Eclipse and Spring.

Adopting OSGi is claimed to bring many benefits, including dynamic updating of bundles, security, support for dependencies and versioning, a simple API and a small footprint. To achieve those, OSGi has the module and service concepts deep in its design.

In OSGi, sharing between modules, such as importing packages from other modules and exporting a module's own packages for other modules to use, must be declared explicitly. By default, nothing is shared.

There is a service registry in OSGi. A service can be implemented using any POJO. In addition, the OSGi Alliance publishes the Compendium specifications, which define a large number of standard services, from a Log Service to a Measurement and State specification.

In terms of implementations, the two most popular ones are Apache Felix and Eclipse Equinox. My next step will be to do some hands-on exercises with one of them.

Friday, November 21, 2008

Tweak the Delegation Model of Java Class Loading

The Java class loader architecture [1, 2] affords a Java developer a tremendous amount of flexibility in the way that an application is assembled and extended. Basically, class loaders form a hierarchy whose root is the bootstrap class loader, which is the parent of sun.misc.Launcher$ExtClassLoader, which in turn is the parent of sun.misc.Launcher$AppClassLoader. When a class loader is asked to load a class, it first delegates the request to its parent. This is why it is called the delegation model.
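The parent chain can be inspected directly from code. The following is a minimal sketch; the class name LoaderChain is made up for illustration. The bootstrap loader is represented by null in the API, which the sketch labels "bootstrap", and on Java 9 and later the printed loader class names differ from the sun.misc.Launcher ones.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library: lists a loader and all its ancestors.
class LoaderChain {
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        // Walk from the given loader up through its parents
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        // A null parent stands for the bootstrap class loader
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        for (String name : chain(LoaderChain.class.getClassLoader())) {
            System.out.println(name);
        }
    }
}
```

The last entry printed is always the bootstrap loader; the entries before it depend on the JVM version and on how the program was launched.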

But occasionally we do need to tweak this delegation model to suit our requirements. A good example is the implementation of the Java Servlet Specification. For instance, in Tomcat 6, the web application class loader attempts to load its classes itself before delegating the request to its parent, the common class loader.

The delegation model is implemented in java.lang.ClassLoader's protected method loadClass(String name, boolean resolve):
protected synchronized Class loadClass(String name,
        boolean resolve) throws ClassNotFoundException {
    // First, check if the class has already been loaded
    Class c = findLoadedClass(name);
    if (c == null) {
        try {
            if (parent != null) {
                c = parent.loadClass(name, false);
            } else {
                c = findBootstrapClass0(name);
            }
        } catch (ClassNotFoundException e) {
            // If still not found, then invoke findClass
            // in order to find the class.
            c = findClass(name);
        }
    }
    if (resolve) {
        resolveClass(c);
    }
    return c;
}
The following program shows how to override the protected method loadClass to tweak the delegation model. In the program, the classes inside the package "somewhere" are loaded from somewhere else, even if classes with the same name exist in the application classpath. For all other classes, the delegation model is still respected.
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.lang.reflect.Method;

import somewhere.Greet;

public class ClassLoaderTest {
    public static class MyClassLoader extends ClassLoader {
        @Override
        protected synchronized Class loadClass(
                String name, boolean resolve)
                throws ClassNotFoundException {
            Class c = findLoadedClass(name);
            if (c == null) {
                // Classes in "somewhere" bypass the parent;
                // everything else is delegated as usual
                if (name != null && name.startsWith("somewhere."))
                    c = findClass(name);
                else
                    c = this.getParent().loadClass(name);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }

        @Override
        protected Class findClass(String name)
                throws ClassNotFoundException {
            byte[] buf = loadFromSomewhere(name);
            return defineClass(name, buf, 0, buf.length);
        }

        private byte[] loadFromSomewhere(String name)
                throws ClassNotFoundException {
            String className = name.substring("somewhere.".length());
            FileInputStream fis = null;
            try {
                fis = new FileInputStream("./class/somewhere/" + className
                        + ".class");
                // Read the whole file, whatever its size
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int c;
                while ((c = fis.read(buffer)) != -1) {
                    out.write(buffer, 0, c);
                }
                return out.toByteArray();
            } catch (FileNotFoundException e) {
                throw new ClassNotFoundException(e.getMessage());
            } catch (IOException e) {
                throw new ClassNotFoundException(e.getMessage());
            } finally {
                // Always release the file handle
                if (fis != null) {
                    try {
                        fis.close();
                    } catch (IOException ignored) {
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        MyClassLoader myClassLoader = new MyClassLoader();
        Class clazz = myClassLoader.loadClass("somewhere.Greet");
        Object object = clazz.newInstance();
        Method method = clazz.getMethod("hello");
        // Greet from somewhere, print out "hello world"
        method.invoke(object);

        Greet greet = new Greet();
        // local Greet, print out "good morning"
        greet.hello();
        // java.lang.ClassCastException:
        // somewhere.Greet cannot be cast to somewhere.Greet
        greet = (Greet) object;
    }

}
In fact, a loaded class in a JVM is identified by its fully qualified name together with its defining class loader. Consequently, each class loader in the JVM can be said to define its own namespace.

Therefore, by leveraging the Java class loader architecture, components inside a Java process are able to have their own, separate code (class) spaces. This partly forms the foundation that allows an individual component to be plugged in and hot swapped without affecting the other components in the same process.

Wednesday, November 19, 2008

Let Java SSL Trust All Certificates without Violating Security Manager

By default, Java SSL does not trust self-signed certificates. Wikibooks:Programming reveals a way to allow connections to a secure HTTP server that uses a self-signed certificate. The magic looks like this:

// Create a trust manager that does not validate certificate chains
TrustManager[] trustAllCerts = new TrustManager[]{
    new X509TrustManager() {
        public java.security.cert.X509Certificate[] getAcceptedIssuers() {
            return null;
        }

        public void checkClientTrusted(
                java.security.cert.X509Certificate[] certs, String authType) {
            // do nothing
        }

        public void checkServerTrusted(
                java.security.cert.X509Certificate[] certs, String authType) {
            // do nothing
        }
    }
};

// Install the all-trusting trust manager
SSLContext sc = null;
try {
    sc = SSLContext.getInstance("SSL");
    sc.init(null, trustAllCerts, new java.security.SecureRandom());
} catch (GeneralSecurityException gse) {
    throw new IllegalStateException(gse.getMessage());
}
HttpsURLConnection.setDefaultSSLSocketFactory(
        sc.getSocketFactory());

However, HttpsURLConnection.setDefaultSSLSocketFactory(...) will throw a SecurityException (a RuntimeException) if a security manager exists and its checkSetFactory method does not allow a socket factory to be specified. The thrown SecurityException looks like

Exception in thread "main" java.security.AccessControlException: access denied (java.lang.RuntimePermission setFactory)
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:323)
at java.security.AccessController.checkPermission(AccessController.java:546)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
at java.lang.SecurityManager.checkSetFactory(SecurityManager.java:1612)
at javax.net.ssl.HttpsURLConnection.setDefaultSSLSocketFactory(HttpsURLConnection.java:308)
at SecurityManagerTest.main(SecurityManagerTest.java:50)

A workaround to avoid such a SecurityException is as below:

URL url = new URL("https://engage.ac.uk");
HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();
conn.setSSLSocketFactory(sc.getSocketFactory());
conn.getInputStream();

The trick is to use the instance method setSSLSocketFactory instead of the static method setDefaultSSLSocketFactory. The former does not throw a SecurityException.

Note: you need to use conn.getInputStream() instead of url.openStream(); otherwise the customised socket factory won't be used.

Of course, to allow the connection to the secure web site, the following permission should be added to the Java security policy file:

permission java.net.SocketPermission "engage.ac.uk:443", "connect";

Use Security Manager to Control Java Web Application Behavior

In a Java web application server like Tomcat, the Tomcat container and multiple web applications run in the same JVM. It is thus very important to cut off unexpected interactions between applications, and between applications and Tomcat. Ideally, the behavior of one web application should not adversely affect the behavior of other applications or of Tomcat.

One kind of unexpected interaction is caused by modifying shared classes and objects, for instance, changing system properties or changing the behavior of system classes.

Tomcat 5.5's class loaders are organised as:
      Bootstrap
          |
       System
          |
       Common
      /      \
 Catalina   Shared
             /   \
        Webapp1  Webapp2 ...
Tomcat 6.0's class loaders are organised as:
      Bootstrap
          |
       System
          |
       Common
       /     \
  Webapp1   Webapp2 ...

In both cases, web applications and Tomcat share the classes managed by class loaders Bootstrap, System and Common.

Let's say one web application uses the following code to tell the Java runtime to use the XSLT implementation shipped with the JDK.

System.setProperty("javax.xml.transform.TransformerFactory", "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl");

Other web applications may choose other XSLT implementations in a similar way. If a web application's correct behavior depends on a particular XSLT implementation, then we may have a problem, because System.setProperty("javax.xml.transform.TransformerFactory", ...) changes the XSLT implementation JVM-wide.
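One way for a web application to pick an XSLT implementation without touching the JVM-wide property is to name the factory class when creating the factory. This is only a sketch, assuming Java 6 or later, where TransformerFactory.newInstance(String, ClassLoader) is available; the class name LocalTransformerFactory is hypothetical.

```java
import javax.xml.transform.TransformerFactory;

// Hypothetical helper: selects the JDK's built-in XSLT implementation for one caller only.
class LocalTransformerFactory {
    static final String JDK_XSLTC =
            "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl";

    static TransformerFactory create() {
        // The factory class is named explicitly, so no system property is read or written
        return TransformerFactory.newInstance(
                JDK_XSLTC, LocalTransformerFactory.class.getClassLoader());
    }

    public static void main(String[] args) {
        TransformerFactory factory = create();
        System.out.println("factory class: " + factory.getClass().getName());
        // The JVM-wide property is left as it was
        System.out.println("system property: "
                + System.getProperty("javax.xml.transform.TransformerFactory"));
    }
}
```

Because the choice is local to the caller, other web applications in the same JVM are not affected by it.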

Therefore it is advisable to use Java security manager to control the permissions granted to web application code.

Note $CATALINA_HOME/bin/startup.sh does not start up a Tomcat with the security manager. To enable the security manager, use $CATALINA_HOME/bin/startup.sh -security. It will append "-Djava.security.manager -Djava.security.policy=..." to the JVM arguments.

A simple Java program testing security manager and policy file

public class SecurityManagerTest {
    public static void main(String[] args) throws Exception {
        System.setProperty("greeting", "hello world!");
        System.out.println(System.getProperty("greeting"));
    }
}

Run this program and you will see "hello world!" printed out.

Prepare a policy file called lab.policy:

grant {
permission java.util.PropertyPermission "*", "read";
};

It says any code can only read system properties, but not write.

By the way, the policy file can be created using policytool.

Then run the program like this:

java -Djava.security.manager -Djava.security.policy=lab.policy SecurityManagerTest

You will see an exception pop up:

Exception in thread "main" java.security.AccessControlException: access denied (java.util.PropertyPermission greeting write)
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:323)
at java.security.AccessController.checkPermission(AccessController.java:546)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:532)
at java.lang.System.setProperty(System.java:727)
at SecurityManagerTest.main(SecurityManagerTest.java:11)

So the security manager is working. It is important to give the right location for the policy file, because when the security manager is enabled, no permission is granted by default: every permission is denied except those defined in the policy file. If the policy file cannot be found, the code is given no permission at all.

Friday, November 14, 2008

Update RHEL5

See here (within intranet, may require authentication) for how to register RHEL4 for up2date and RHEL5 for yum.

After that, you can use yum update for a complete update.

To install a certain package: yum install package_name

To find out which package provides a given file or feature: yum provides name (yum info package_name shows what a package does)

To find packages containing some keyword: yum list | grep keyword

To check already installed packages that contain some keyword: rpm -qa | grep keyword

Tuesday, November 11, 2008

Make a Java-enabled Virtual Machine Using Ubuntu JeOS

As suggested in VMware's best practices for building a virtual appliance, Ubuntu JeOS is used as the operating system. In around 100MB, it provides a "just enough" OS. After installation, its VMware virtual disk file is about 380MB.

By default, Ubuntu does not set a root password; root privileges are exercised through the "sudo" command. To enable root login: sudo passwd root.

Also, see how to prepare a virtual appliance from the Ubuntu community.

Adding a new virtual disk
  • To add a new virtual disk, use the "Add Hardware" command in the VMware web interface.
  • The virtual machine needs to be restarted to detect the new virtual disk.
  • Let's say the newly added hard drive is /dev/sdb. Use the following command to partition it: fdisk /dev/sdb. To create a single new partition spanning the whole of /dev/sdb: n ENTER p ENTER 1 ENTER (default) ENTER (default) ENTER w ENTER.
  • To format the new partition: mkfs.ext3 /dev/sdb1.
  • To mount it on an existing location, say /data: mount /dev/sdb1 /data.
See how to format a new hard disk on the Ubuntu forums.

Making a Java appliance

The next step is to make it a Java appliance, with the JDK, Ant and Tomcat installed, running some Java application like Grimoires.

Ubuntu JeOS does not pre-install an OpenSSH server. Use the following command to install it: apt-get install openssh-server.

Then install
  • JDK 1.6.0_10
  • Apache Ant 1.7.1
  • Apache Tomcat 5.5.27
  • Grimoires 2.0.0
The used disk space shown by "df" inside the virtual machine is about 675MB.

Setting up NAT

Assume the virtual machine has been configured to support NAT. Add the following line to /etc/vmware/vmnet8/nat/nat.conf, under the [incomingtcp] section:

6660 = 172.16.59.132:8080

where 172.16.59.132 is the IP address of the virtual machine.

According to the VMware Server user guide, clicking "Refresh Network List" in the Virtual Infrastructure web interface should pick up the modified network configuration. But it did not work in my case. I had to restart VMware to enforce the new NAT configuration: service vmware restart. (Better to shut down the VM before restarting VMware Server!)

After restarting VMware, make sure port 6660 is open in the firewall of the host machine, then access http://hostname_of_the_host:6660/grimoires from another machine.

It seems that VMware Server runs a separate daemon to handle the address translation when the VM is set to use NAT, which prevents configuring the NAT through iptables.

Cloning the Virtual Machine

This is a little bit tricky.

I used the conventional way to clone it: copy the vmdk and vmx files; then add the cloned virtual machine to the inventory ("Add Virtual Machine to Inventory"); when asked whether you copied it or moved it, answer that you copied it.

The clone was able to start up. But there was no eth0. The routing table was empty (shown by route). There was an eth1 with the correct MAC address but without IP information (shown by ifconfig). Not surprisingly, TCP/IP was not working.

My judgement was that there was no reason for the virtual network card itself to go wrong, so it had to be a configuration issue.

Prompted by a post about adding a second network card to Ubuntu, I used dmesg | grep eth to check the kernel messages from boot. There I found an interesting message: "udev: rename eth0 to eth1". After a little googling and investigation, I solved the problem!

udev is the device manager for the Linux 2.6 kernel series. In /etc/udev/rules.d/70-persistent-net.rules, there were two lines:
SUBSYSTEM=="net", ...... ATTR{address}=="00:0c:29:07:6f:37", ...... NAME="eth0"
SUBSYSTEM=="net", ...... ATTR{address}=="00:0c:29:cf:c8:64", ...... NAME="eth1"

The MAC address in the first line is that of the original VM, while the MAC address in the second line is the correct MAC address of the clone. Clearly, the first line has no effect, because there is no such device in the system, and the second line names the clone's network card eth1.

Looking at /etc/network/interfaces, it only defines lo and eth0. eth0 is defined like this:
auto eth0
iface eth0 inet dhcp

No wonder only eth1 was visible and it did not support TCP/IP. After commenting out the first line and changing eth1 to eth0 in the second line of /etc/udev/rules.d/70-persistent-net.rules, and restarting the system, TCP/IP worked!

Summary

This is a convenient way to prepare a small-footprint virtual machine for running Java. It can be used to deliver Java software, or as a test or evaluation environment for a Java product.

Thursday, November 06, 2008

OpenSSL

Recently I used the openssl utility to generate key pairs and certificates. For example,
  • To generate a private key: openssl genrsa -out ca.key 2048
  • To create a self-signed certificate: openssl req -new -key ca.key -x509 -days 365 -out ca.crt
  • To create a certificate signing request: openssl req -new -key temp.key -out temp.csr
  • To create a certificate from a certificate signing request: openssl x509 -req -in temp.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out temp.crt
  • To display a certificate: openssl x509 -text -in temp.crt
  • To display the content of a pkcs12 formatted certificate (the displayed private key and certificate are in PEM format, which can be used in the above commands): openssl pkcs12 -in old_uk_escience.p12 -out old.txt
  • To convert from pkcs12 format to PEM format: openssl pkcs12 -in cred.p12 -out cert.pem -nodes -clcerts -nokeys, openssl pkcs12 -in cred.p12 -out key.pem -nodes -nocerts
  • To create pkcs12 format certificate using PEM format private key and certificate: openssl pkcs12 -in temp.crt -inkey temp.key -out temp.p12 -export

Self-Decrypting HTML Page

This is something worth doing. Let's call it a self-decrypting HTML page.

It is a publicly accessible, encrypted HTML page that can be hosted on any website but can only be decrypted by authorized persons who know the key. JavaScript can be used to perform encryption and decryption locally in the browser. Thus what is transmitted over the net and stored on the public website is only the encrypted information, which protects its confidentiality. A symmetric-key algorithm is preferred.

A self-decrypting HTML page generator is needed to convert the to-be-protected information to a self-decrypting HTML page.
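The generator side could be sketched in Java as below. This is only an illustration under stated assumptions: the class name PageSealer and the salt/IV/ciphertext layout are made up, and AES-GCM with a PBKDF2-derived key (Java 8 or later) is swapped in as the symmetric cipher. The matching browser-side JavaScript that decrypts the payload is not shown.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical generator: turns page content into a Base64 payload of salt + IV + ciphertext.
class PageSealer {
    private static final int SALT_LEN = 16, IV_LEN = 12, TAG_BITS = 128;
    private static final int ITERATIONS = 100_000, KEY_BITS = 128; // assumed parameters

    private static SecretKeySpec deriveKey(char[] passphrase, byte[] salt)
            throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(passphrase, salt, ITERATIONS, KEY_BITS);
        byte[] key = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec).getEncoded();
        return new SecretKeySpec(key, "AES");
    }

    /** Encrypts the page content; only the returned payload needs to be published. */
    static String seal(String html, char[] passphrase) {
        try {
            SecureRandom random = new SecureRandom();
            byte[] salt = new byte[SALT_LEN];
            byte[] iv = new byte[IV_LEN];
            random.nextBytes(salt);
            random.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, deriveKey(passphrase, salt),
                    new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(html.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[SALT_LEN + IV_LEN + ct.length];
            System.arraycopy(salt, 0, out, 0, SALT_LEN);
            System.arraycopy(iv, 0, out, SALT_LEN, IV_LEN);
            System.arraycopy(ct, 0, out, SALT_LEN + IV_LEN, ct.length);
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    /** Decrypts a payload produced by seal; fails if the passphrase is wrong. */
    static String open(String sealed, char[] passphrase) {
        try {
            byte[] in = Base64.getDecoder().decode(sealed);
            byte[] salt = Arrays.copyOfRange(in, 0, SALT_LEN);
            byte[] iv = Arrays.copyOfRange(in, SALT_LEN, SALT_LEN + IV_LEN);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, deriveKey(passphrase, salt),
                    new GCMParameterSpec(TAG_BITS, iv));
            byte[] pt = cipher.doFinal(in, SALT_LEN + IV_LEN, in.length - SALT_LEN - IV_LEN);
            return new String(pt, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }

    public static void main(String[] args) {
        String payload = seal("<p>confidential</p>", "secret key".toCharArray());
        System.out.println(payload);
        System.out.println(open(payload, "secret key".toCharArray()));
    }
}
```

A fresh salt and IV are generated for every page, so sealing the same content twice yields different payloads.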

Note that United States Patent 7003800 has been filed on "Self-decrypting web site pages". It seems to describe a method pertinent to a web site.

A self-decrypting email utility performs a similar task, using RC4 as the cipher. Here is the JavaScript code for RC4.

Compared with another kind of web-based confidential information system, in which the information is hosted on an HTTPS server and protected by some authentication method, a self-decrypting HTML page does not require an authentication method and thus needs no user management. All that is needed to access the confidential information is the URL of the self-decrypting HTML page and the key.

Will Google App Engine be a good platform for this? Will it be possible to use Google AdSense to generate revenue from this application?

Tuesday, November 04, 2008

Handle Chinese in Java

(This is one of my old Google Notes.)

In Java, Reader and Writer are used to handle character (char) streams, and InputStream and OutputStream are used to handle byte streams. InputStreamReader and OutputStreamWriter are bridges between byte streams and character streams.

Characters can be represented in bytes using many different encoding schemes, such as ASCII, GB2312 and UTF-8 (Unicode Transformation Format). Characters in Java, whether char or String, are always Unicode.

A Charset is a named mapping between sequences of sixteen-bit Unicode code units, which is the character representation in Java, and sequences of bytes. A Charset knows how to convert a byte sequence to a (Unicode) char sequence, and vice versa, following the standard it implements.

Not surprisingly, both InputStreamReader and OutputStreamWriter can be configured to use a specified Charset, but there is no concept of Charset in Reader and Writer.

The following Java program demonstrates how to handle Chinese in Java.

import java.io.BufferedWriter;
import java.io.OutputStreamWriter;

public class ChineseTest {
    public static void main(String[] args) throws Exception {
        String chinese = "\u4eca\u65e5\u83dc\u6839\u8c2d"; // 1
        System.out.println(chinese); // 2
        BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(System.out, "GB2312")); // 3
        bw.write(chinese); // 4
        bw.close(); // 5
    }
}

Code comments:
  1. This is the Unicode escape form of 今日菜根谭, generated by "native2ascii -encoding GB2312 c.txt", where c.txt contains 今日菜根谭 encoded in GB2312. The utility native2ascii converts a file with native-encoded characters (characters which are non-Latin 1 and non-Unicode) to one with Unicode-encoded characters.
  2. This may fail to print 今日菜根谭 correctly, because it uses the platform's default Charset.
  3. Create a Writer using the GB2312 Charset.
  4. Now we can print out 今日菜根谭.
  5. Close the writer, which flushes the buffered output. Required.
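The role of a Charset as a mapping between chars and bytes can also be seen without any streams. A minimal sketch follows; the class name CharsetRoundTrip is made up, and GB2312 is only used when the running JVM reports it as supported.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: encodes the same string with different charsets and decodes it back.
class CharsetRoundTrip {
    static final String TEXT = "\u4eca\u65e5\u83dc\u6839\u8c2d";

    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset); // chars -> bytes
    }

    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset); // bytes -> chars
    }

    public static void main(String[] args) {
        byte[] utf8 = encode(TEXT, StandardCharsets.UTF_8);
        System.out.println("UTF-8 bytes: " + utf8.length);
        if (Charset.isSupported("GB2312")) {
            Charset gb2312 = Charset.forName("GB2312");
            byte[] gb = encode(TEXT, gb2312);
            System.out.println("GB2312 bytes: " + gb.length);
            System.out.println("round trip ok: " + TEXT.equals(decode(gb, gb2312)));
        }
    }
}
```

The same five characters occupy 15 bytes in UTF-8 and 10 bytes in GB2312, while the Java String is identical in both cases.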

Virtual DOM

(This is one of my old Google Notes. Virtual DOM is a small piece of software I implemented.)

Complying with the W3C DOM interface, Virtual DOM is capable of representing in memory XML data that is too large to be represented using other DOM implementations such as Xerces. Virtual DOM aims to allow off-the-shelf XPath engines such as Jaxen to process an XML DOM representation that would otherwise be too large to hold in memory. To support this goal, Virtual DOM must be told how to load a DOM element, and it then uses SoftReference to cache the loaded element. Put simply, Virtual DOM elements can be garbage collected when Java heap space becomes scarce, and later reloaded on demand.

The problem of Xerces DOM implementation

Each node has references to its parent, children, and siblings. So as long as there is a single reference to any node of the DOM tree, no part of the tree can be garbage collected.

Implementation of Virtual DOM

Virtual DOM adopts a forest model, where there is a virtual document and a virtual root element. The virtual root element has a number of child elements, each representing a different DOM document. These children are implemented as CollectableElement, which extends CollectableNode.

Each Virtual DOM node, called CollectableNode, has a reference to DocumentCache, which in turn has a SoftReference to the cached owning document. DocumentCache also has enough information to load and reload the owning document whenever necessary.

Each Virtual DOM node has information specifying how to locate this node from the root node of the owning document.

Each Virtual DOM node also has a SoftReference to its corresponding concrete DOM node, for convenience.

The CollectableElement that is the direct child of the virtual root element has an index indicating which child it is in terms of the virtual root element. Thus it is able to retrieve its next sibling.

All exported DOM references should be of Collectable* instead of the default DOM ones to prevent DOM node references from being held externally.
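The caching idea behind DocumentCache can be illustrated with a small generic class. This is a sketch, not the actual Virtual DOM code: the name SoftCache, the Supplier-based loader, and the evict method (which merely simulates the garbage collector clearing the reference) are all assumptions.

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

// Hypothetical cache: holds its value softly and reloads it on demand.
class SoftCache<T> {
    private final Supplier<T> loader; // knows how to (re)load the value
    private SoftReference<T> ref = new SoftReference<>(null);
    private int loads;

    SoftCache(Supplier<T> loader) {
        this.loader = loader;
    }

    synchronized T get() {
        T value = ref.get();
        if (value == null) {
            // Never loaded, or cleared by the garbage collector: load again
            value = loader.get();
            ref = new SoftReference<>(value);
            loads++;
        }
        return value;
    }

    /** Simulates the garbage collector clearing the soft reference. */
    synchronized void evict() {
        ref.clear();
    }

    synchronized int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        SoftCache<String> cache = new SoftCache<>(() -> "parsed document");
        System.out.println(cache.get());
        cache.evict();
        System.out.println(cache.get());
        System.out.println("loads: " + cache.loadCount());
    }
}
```

Callers must go through get() each time rather than keeping the returned object around, which is the same reason the exported references are Collectable* wrappers rather than concrete DOM nodes.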

Test

With a 64 MB heap, using a revised Jaxen 1.1 (one of Jaxen 1.1's methods unnecessarily retains object references), Virtual DOM is able to deal with at least 250,000 instances of a certain XML document. By comparison, Xerces DOM runs out of memory at 39,000 instances.

(It is called VirtualDOM in my Eclipse projects.)

JSPWiki ACL Filter

I am working on the OMII-UK website. JSPWiki has been adopted as both the web content management system and the wiki of the website. JSPWiki has been customised with an OMII-UK template and with authentication and authorization modules. There are many users, so in-page ACLs are adopted to protect some pages from unauthorised editing. For instance, the following ACL says that only members of the staff group can view and edit the page containing it:

[{ALLOW edit StaffGroup}]

Because members of the staff group can edit this page, they can also edit the ACL, which is nothing more than JSPWiki markup. This is a potential security flaw: any member of the staff group can edit the ACL, e.g., by mistake, and thus violate the intended access control of this page. Ideally, the ACL, though residing in a page, should be treated differently from the rest of the page source.

Thus an ACL Filter is introduced to allow only users with AllPermission to create, edit or delete an in-page ACL. For instance, in the above example, any member of the staff group can edit the page, but only users with AllPermission can change the ACL to something other than the above ACL.

Essentially, the ACL Filter separates two concerns, content and access control over content, which were originally mixed up in the wiki markup. With the ACL Filter, the two are treated differently: anyone authorised by the ACLs can edit content, but only certain super users can edit access control.
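The check performed by such a filter can be sketched in plain Java as follows. This is not the contributed JSPWiki code and does not use the JSPWiki API: the class AclGuard, its regular expression for [{ALLOW ...}] markup, and the boolean standing in for AllPermission are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical check: a save is allowed if the ACL markup is unchanged or the user is a super user.
class AclGuard {
    // Matches in-page ACL markup such as [{ALLOW edit StaffGroup}]
    private static final Pattern ACL = Pattern.compile("\\[\\{\\s*ALLOW\\s+[^}]*\\}\\]");

    /** Returns the ACL declarations of a page in order, with whitespace normalised. */
    static List<String> extractAcls(String pageText) {
        List<String> acls = new ArrayList<>();
        Matcher matcher = ACL.matcher(pageText);
        while (matcher.find()) {
            acls.add(matcher.group().replaceAll("\\s+", " "));
        }
        return acls;
    }

    /** Content edits are always fine; ACL changes need the super-user flag. */
    static boolean maySave(String oldText, String newText, boolean hasAllPermission) {
        return hasAllPermission || extractAcls(oldText).equals(extractAcls(newText));
    }

    public static void main(String[] args) {
        String before = "[{ALLOW edit StaffGroup}]\nOld content";
        String contentEdit = "[{ALLOW edit StaffGroup}]\nNew content";
        String aclEdit = "[{ALLOW edit Everyone}]\nOld content";
        System.out.println(maySave(before, contentEdit, false));
        System.out.println(maySave(before, aclEdit, false));
        System.out.println(maySave(before, aclEdit, true));
    }
}
```

Comparing the extracted ACLs of the old and new page text keeps the access-control concern separate from ordinary content edits.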

This work has been contributed back to the JSPWiki community. See here.

Spring JUnit Test and Rollback DB Transaction

In Spring, doing unit test with JUnit4 can be as simple as this:

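import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
import org.springframework.transaction.annotation.Transactional;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration
public class HibernateDaoTest {
    @Autowired
    protected RepositoryHibernateDao repositoryHibernateDao;

    @Test
    @Transactional
    public void submitProject() {
        // Illustrative test values; setName and setSubmitUser are assumed
        // to exist alongside the getters used below
        String name = "test-project";
        String owner = "test-user";
        Project newp = new Project();
        newp.setName(name);
        newp.setSubmitUser(owner);
        newp.setStatus("PENDING");
        int pid = repositoryHibernateDao.save(newp);
        assertTrue(pid > 0);
        Project p = repositoryHibernateDao.getProjectDetail(pid);
        // JUnit expects the expected value first, then the actual one
        assertEquals(name, p.getName());
        assertEquals(owner, p.getSubmitUser());
        repositoryHibernateDao.delete(p);
    }
}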

With the annotation @Transactional, when the test is finished, the database is supposed to roll back to the state before the test. But this does not happen in my Spring/Hibernate/MySQL setting.

It turns out that in order to support transactions, the MySQL table must be an InnoDB table.

Usually the tables are MyISAM ones, which are non-transactional. The MyISAM engine provides high-speed storage and retrieval, as well as fulltext searching capabilities. It is supported in all MySQL configurations, and is the default storage engine unless you have configured MySQL to use a different one by default.

On the other hand, the InnoDB and BDB storage engines provide transaction-safe tables. InnoDB is included by default in all MySQL 5.0 binary distributions. Here is a description of how to convert a MyISAM table to InnoDB. Basically, what needs to be done is: ALTER TABLE ... ENGINE=INNODB

Monday, November 03, 2008

Map of Downloads


I have used the Google Maps API to display on a map where the downloads of OMII-UK software come from. See the downloads for Grimoires 2.0.0. Clicking on a balloon pops up more geographic information about the download. For instance, I know that someone in Changsha, Hunan, China has downloaded Grimoires 2.0.0. All the other downloads are from the UK, France and Germany.

Hibernate Performance

(This is one of my old Google Notes.)

I used YourKit to benchmark the performance of Hibernate some time ago.

In my database, I have two tables: Project and Release. Project is associated with Release in a one-to-many relationship. In my setting, a SQL statement using JDBC costs from several milliseconds to ~30 milliseconds; serving a JSP costs from 1 second to 3 seconds in the first run, then from tens of milliseconds to ~250 milliseconds in later runs.

In Hibernate,
  • Getting a single project (as well as its releases) costs 234ms in the first run, and 14ms on average (excluding the first run);
  • Getting all projects costs 547ms in the first run, and 96ms on average;
  • Getting some partial information of all projects costs 282ms in the first run, and 15ms on average.
The first run is much slower because Hibernate does bytecode instrumentation to generate proxy objects on the first touch of the ORM, which is quite expensive. However, the later runs approximate JDBC's performance.

Some Hibernate tips:
  • Chapter 19 of the Hibernate reference describes how to improve its performance.
  • Use a set instead of a list. A list uses the id as index, and thus creates a large data structure with lots of empty cells.
  • Associated objects are initialized lazily. They need to be touched to be brought into memory.

Sunday, November 02, 2008

Dependency Resolution

(This is one of my old Google Notes.)

In a software architecture, one component (the caller) relies on services provided by another component (the callee). Before the caller can use the service of the callee, the caller must hold an instance of the callee. There are several ways to achieve this.

First, the caller can create an instance of the callee. Sometimes only one instance of the callee should exist in the whole system, because it is unique or expensive to create, e.g., the database manager. In that case, the singleton pattern can be used. But the singleton pattern has some shortcomings. First, because the callee has to implement the singleton pattern and the callers have to invoke it in the proper way, the code for resolving dependencies is scattered around the whole system. Second, it is hard to replace a callee component with another one that has the same interface.

Second, the caller can go to a "service locator", which uses, for instance, a JNDI directory to locate callees. Though the service locator provides a central place for managing all service providers (callees), the caller still needs to know how to talk to the service locator.

Third, and this is the state of the art, there is dependency injection, which is implemented in many frameworks, such as Spring and Google Guice. Compared to the above two approaches, the advantages of dependency injection are obvious. Both caller and callee can be POJOs (Plain Old Java Objects), with no special control logic implemented. And there is a single control point. In Spring, that is the Spring XML configuration file. Thus the dependency is resolved in a declarative way! Nowadays we generally prefer a declarative way of doing things to a programmatic one.
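A framework-free sketch of the third approach is given below. All names (InjectionDemo, DatabaseManager, ReportService) are invented for illustration, and the wiring is done by hand in main, standing in for what a container such as Spring would do from its configuration.

```java
// Hypothetical example of constructor injection with plain Java objects.
class InjectionDemo {
    /** The callee's interface; callers depend only on this. */
    interface DatabaseManager {
        String lookup(String key);
    }

    /** One possible callee; it knows nothing about singletons or locators. */
    static class InMemoryDatabaseManager implements DatabaseManager {
        public String lookup(String key) {
            return "value-of-" + key;
        }
    }

    /** The caller; it receives its dependency instead of creating or locating it. */
    static class ReportService {
        private final DatabaseManager db;

        ReportService(DatabaseManager db) {
            this.db = db;
        }

        String report(String key) {
            return key + "=" + db.lookup(key);
        }
    }

    public static void main(String[] args) {
        // The single control point: all wiring happens here
        DatabaseManager db = new InMemoryDatabaseManager();
        ReportService service = new ReportService(db);
        System.out.println(service.report("project"));
    }
}
```

Replacing the callee with another implementation of the same interface only requires changing the wiring code, not the caller.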

In a service-oriented architecture, a service workflow is a multi-component system. Any component service can be a caller, a callee, or both. There is some similarity between dependency injection and how a workflow is composed: both have a central control point for resolving dependencies in a declarative way. In a workflow, that is the workflow description file, e.g., written in BPEL. On the other hand, the service registry in a service-oriented architecture shares a similar idea with the service locator pattern.