Monday, 1 February 2010

JMockit & JavaAgent - Enterprise integration woes

We've been using JMockit as the preferred mocking utility at a large Bank that I'm consulting at. While generally the product and its non-intrusive mocking features are commendable, some of the under-the-hood mechanisms have caused us a nightmare with our continuous integration. After upgrading to latest version of JMockit, version 0.996.0, we had ALL our unit tests failing with the following spurious error
java.lang.NoClassDefFoundError: org.junit.runner.Runner
 at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59)
 at org.junit.internal.requests.ClassRequest.getRunner(ClassRequest.java:24)
 at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.(JUnit4TestReference.java:29)
 at org.eclipse.jdt.internal.junit4.runner.JUnit4TestClassReference.(JUnit4TestClassReference.java:25)
 at org.eclipse.jdt.internal.junit4.runner.JUnit4TestLoader.createTest(JUnit4TestLoader.java:40)
 at org.eclipse.jdt.internal.junit4.runner.JUnit4TestLoader.loadTests(JUnit4TestLoader.java:30)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:452)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
While my colleague who had been advocating the use of the utility asked me for a way to change the order of libraries in the classpath, we smelt rot and started debugging under the hood. After quite a bit of digging around, we found that the actual error was caused by the failure to load the JMockit agent dynamically. The following code is the sequence of calls which JMockit makes to load the agent.
 public class Startup {

   // other code omitted 
   public static void verifyInitialization()
   {
      if (instrumentation == null) {
         new AgentInitialization().initializeAccordingToJDKVersion();
      }
   }
}

public final class AgentInitialization
{
   public void initializeAccordingToJDKVersion()
   {
      String jarFilePath = discoverPathToJarFile();
      if (Startup.jdk6OrLater) {
         new JDK6AgentLoader(jarFilePath).loadAgent();
      }
      else if ("1.5".equals(Startup.javaSpecVersion)) {
         throw new IllegalStateException(
            "JMockit has not been initialized. Check that your Java 5 VM has been started " +
            "with the -javaagent:" + jarFilePath + " command line option.");
      }
      else {
         throw new IllegalStateException("JMockit requires a Java 5 VM or later.");
      }
   }
   private String discoverPathToJarFile()
   {
      CodeSource codeSource = AgentInitialization.class.getProtectionDomain().getCodeSource();
      if (codeSource == null) {
         return findPathToJarFileFromClasspath();
      }

      URI jarFileURI; // URI is needed to deal with spaces and non-ASCII characters
      try {
         jarFileURI = codeSource.getLocation().toURI();
      }
      catch (URISyntaxException e) {
         throw new RuntimeException(e);
      }

      return new File(jarFileURI).getPath();
   }

   private String findPathToJarFileFromClasspath()
   {
      String[] classPath = System.getProperty("java.class.path").split(File.pathSeparator);
      for (String cpEntry : classPath) {
         if (cpEntry.matches(".*jmockit[-.\\d]*.jar")) {
            return cpEntry;
         }
      }
      return null;
   }
}
While this approach is non-intrusive and works out of the box in most cases, in large enterprise organisations, you are often limited extensively and work in a confined environment. My current environment loads files from a network file system and the following lines from the discoverPathToJarFile() cause the class initialization error
 jarFileURI = codeSource.getLocation().toURI(); 
returns
 CodeSource
file://remoterepository/libraries/jmockit/0.996.0/install/common/lib/jmockit.jar 
and
 return new File(jarFileURI).getPath(); 
returns
 IllegalArgumentException: URI has an authority component 
We had been using JMockit for a while now since 0.991.0 version and the agent loading had always been a problem. In our original usage, we had to do two things. We had to create our own version of the JDK6AgentLoader as this class has package level scope and create our own JUnit4 runner which loaded the agent from a hard-coded location, as below
public final class MyJMockitRunner extends BlockJUnit4ClassRunner
{
   static
   {
      MyJDK6AgentLoader.loadAgent(hardCodedPathtoAgentJar);
   }

   public JMockit(Class testClass) throws InitializationError
   {
      super(testClass);
   }
}

Why the problem now?
What changed majorly between the two versions was, Shadowing the org.junit.runner.Runner class which in the modified version loaded the java agent before initialization of the class.

public abstract class Runner implements Describable
{
   static
   {
      Startup.initializeIfNeeded();
   }

   public abstract Description getDescription();

   public abstract void run(RunNotifier notifier);

   public int testCount()
   {
      return getDescription().testCount();
   }
}
So even though we are trying to load the agent in our custom Runner, the loading of the Runner class itself preceding the sequence and aggressively loads the agent in its own strategy. While this is a possible approach, I would highly criticize the usage of class shadowing for the following reasons
  1. Open source API generally provide certain basic guarantees, and is generally tested extensively by the commnuity in different enviroments to ensure they work in the way they are designed. By shadowing a public class, you override this guarantee and for most common developers, the fact that they are using a modified class is oblivious. This can lead to late night debugging sessions and nightmares when things go wrong.
  2. In this particular case, while only the tests that used JMockit should have failed, thousands of test cases across multiple projects failed at once. As a programmer, I always believe to lazily load resources and only when they are needed. But there is a necessity for mocking or not, the agent is going to be loaded whenever a Unit test is run. This is fundamentally wrong.
  3. Any workarounds that had been applied to the library fail miserably with this aggressive approach to force your library to behave in a specific way.
  4. While I agree that the owner cannot foresee all the environments his library would be used, trying to perform gimmicks under the hood on an open library could have been avoided.

Solution
While I've criticized the usage of shadowing a class, unfortunately the only solution that worked for us was to shadow the JMockit class that loaded the agent as below.

public final class AgentInitialization
{
   public void initializeAccordingToJDKVersion()
   {
 MyJDK6AgentLoader.loadAgent(hardCodedPathtoAgentJar);
   }
}

Better approach
There are a couple of suggestions for the JMockit API to overcome the shortcomings described above
  1. Please do remove the shadowed JUnit class and make the agent loading explicit. This is already the case when you use the JMockit.class as the runner.
  2. While the auto detection of the agent is brilliant, there could be some hooking mechanism which allowed the developers to extend it with their own strategies.
Last but not least, there definitely was an obvious solution to the problem, which was to explicitly specify the agent path in the VM arguments. The issue we had in that was there were too many OS, at least a few IDE, a shared continuous integration server and many many ant build scripts that need to be updated. It would have been a far more involved effort to get all this modified and maintained than applying a code fix which worked "under the hood"

Sunday, 17 January 2010

Google Appengine and Maven - another hack for datanucleus plugin

After researching for mavenizing an Appengine pet project, most of the links that Google search returned were really old when first version of AppEngine was released and did not match the expectations. Also unfortunately there is really no maven plugin that supports the latest version as Google publishes them or a plugin that allows you to point to a local installation, so that you could update at the speed of Google releasing updates for AppEngine (as of this writing, there were atleast 3 updates in the shorter period of 2 weeks)

Thanks to Salient Point, this was the best article to configure a maven project and it allowed to configure any AppEngine version with the simplicity of changing a couple of variables in a script file and building the maven project. However the only caveat that needed a few hours of debugging was the maven-datanucleus-plugin, as again the versions of the Jar files provided by datanucleus repository and the ones that are published by Google were slightly different and it was more simpler to use the same philosophy of having all the Jars provided by Google rather than downloading them from a different repository.

On this note, the following are the only deviation of configurations from the original post 1) Define the local directory to which the AppEngine Jars have been downloaded, the preferred way is to use profiles such that we could specify a default location in each platform (windows / mac)

        <profile>
            <id>gwt-dev-windows</id>
            <properties>
                <platform>windows</platform>
                <appengine.sdk.root>C:\Java\appengine-java-sdk-${appengineVersion}</appengine.sdk.root>
            </properties>
            <activation>
                <activeByDefault>false</activeByDefault>
                <os>
                    <family>windows</family>
                </os>
            </activation>
        </profile>
2) Specify a antrun task that performs the datanucleus enhancement for entity objects rather than the datanucleus plugin
            <plugin>
                <artifactId>maven-antrun-plugin</artifactId>
                <executions>
                    <execution>
                        <id>datanucleus-enhance</id>
                        <phase>process-classes</phase>
                        <configuration>
                            <tasks>
                                <property name="appengine.tools.classpath"
                                    location="${appengine.sdk.root}/lib/appengine-tools-api.jar"/>
                                <property name="dependency_classpath" refid="maven.dependency.classpath"/>

                                <taskdef name="enhance" classname="com.google.appengine.tools.enhancer.EnhancerTask"
                                    classpath="${appengine.tools.classpath}"/>
                                <enhance failonerror="true" api="JPA">
                                    <classpath>
                                        <pathelement path="${appengine.tools.classpath}"/>
                                        <pathelement path="${dependency_classpath}"/>
                                        <pathelement path="target/classes"/>
                                    </classpath>
                                    <fileset dir="target/classes" includes="**/*.class"/>
                                </enhance>
                            </tasks>
                        </configuration>
                        <goals>
                            <goal>run</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
Alas it works like magic for us and we are able to upgrade the AppEngine versions as and when they are released by Google.

Also to note is the link on configuring a multi-project setup that is documented on the maven-gwt-plugin documentation that helps to organize your code into multiple projects (separating the core API / UI / service implementation and AppEngine deployment). Refer to our sample project setup that is hosted at http://code.google.com/p/scrumsp

Tuesday, 15 December 2009

Builder Pattern Abused

Everyone has read Joshua Bloch's builder pattern, which he has also described at his "Effective Java Reloaded" sessions at Javaone. However in this post, I am going to describe a misuse of this pattern, when used in conjunction with inheritance. 
When building a relational model, the biggest challenge is to maintain the integrity of inheritance and reduce the visibility of the properties completely. However if you need builders for this model, then you either have to create a class hierarchy of builders similar to the model itself or make all accesses protected (ugly #1) so that builders could appropriately initialize them. Making the properties less visible has at least two disadvantages though (1) you cannot make them final and ensure construction safe and (2) you'll end up copying the builders to more than one class (ugly #2) I recently came across a horrendous implementation of the hierarchy of builders where the author could not solve the above issues elegantly. So he had resorted to creating copy constructors (ugly #3) for the builders and also duplicating the methods for each of the builders. The following code snippet exactly reproduces the code, in an example class hierarchy and its builders.
abstract class Shape {
    private String name;
    private Dimension dimension;

    public enum Dimension {
        TWO_D, THREE_D;
    }

    public static abstract class ShapeBuilder {
        private String name;
        private Dimension dimension;

        protected ShapeBuilder copyOf(ShapeBuilder builder) {
            ShapeBuilder copy = createBuilder();
            copy.name = builder.name;
            copy.dimension = builder.dimension;
            return copy;
        }
        public ShapeBuilder name(String name) {
            ShapeBuilder bldr = copyOf(this);
            bldr.name = name;
            return bldr;
        }
        public ShapeBuilder dimension(Dimension dimension) {
            ShapeBuilder bldr = copyOf(this);
            bldr.dimension = dimension;
            return bldr;
        }
        public Shape build() {
            Shape s = createInstance();
            s.name = this.name;
            s.dimension = this.dimension;
            return s;
        }
        abstract Shape createInstance();
        abstract ShapeBuilder createBuilder();
    }
    // omitting getters for clarity
}
abstract class TwoDShape extends Shape {
    private double area;

    public static abstract class TwoDShapeBuilder extends ShapeBuilder {
        private double area;

        protected TwoDShapeBuilder copyOf(TwoDShapeBuilder builder) {
            TwoDShapeBuilder copy = (TwoDShapeBuilder) super.copyOf(builder);
            copy.area = this.area;
            return copy;
        }
        public TwoDShapeBuilder name(String name) {
            return (TwoDShapeBuilder) super.name(name);
        }
        public TwoDShapeBuilder dimension(Dimension dimension) {
            return (TwoDShapeBuilder) super.dimension(dimension);
        }
        public TwoDShapeBuilder area(double area) {
            TwoDShapeBuilder bldr = copyOf(this);
            bldr.area = area;
            return bldr;
        }
        public TwoDShape build() {
            TwoDShape s = (TwoDShape) super.build();
            s.area = this.area;
            return s;
        }
    }
}
class Circle extends TwoDShape {
    private double radius;

    public static class CircleBuilder extends TwoDShapeBuilder {
        private double radius;

        protected CircleBuilder copyOf(CircleBuilder builder) {
            CircleBuilder copy = (CircleBuilder) super.copyOf(builder);
            copy.radius = this.radius;
            return copy;
        }
        public CircleBuilder name(String name) {
            return (CircleBuilder) super.name(name);
        }
        public CircleBuilder dimension(Dimension dimension) {
            return (CircleBuilder) super.dimension(dimension);
        }
        public CircleBuilder area(double area) {
            return (CircleBuilder) super.area(area);
        }
        public CircleBuilder radius(double radius) {
            CircleBuilder bldr = copyOf(this);
            bldr.radius = radius;
            return bldr;
        }
        public Circle build() {
            Circle s = (Circle) super.build();
            s.radius = this.radius;
            return s;
        }
        @Override
        protected Shape createInstance() {
            return new Circle();
        }
        @Override
        protected ShapeBuilder createBuilder() {
            return new CircleBuilder();
        }
    }
}
abstract class ThreeDShape extends Shape {
    private double volume;
    // builder similar to 2DShape
}
class Cube extends ThreeDShape {
    private int side;
    // builder similar to Circle
}
Now extrapolate this class into an enterprise relational model consisting of 10-15 objects with nested hierarchies and you can imagine the amount of stupid code that's been written. If you haven't solved this problem yourself before, please stop here and think for a solution. The biggest issue here is, when you create a CircleBuilder and call the name() method, you are again expecting a CircleBuilder back and not a ShapeBuilder. Thankfully there is a generic implementation (partly under documented) which helps to solve problems of this nature. The modified code is given below
abstract class Shape {
    private String name;
    private Dimension dimension;

    public enum Dimension {
        TWO_D, THREE_D;
    }

    public static abstract class ShapeBuilder<S extends Shape, SB extends ShapeBuilder<S, SB>> {
        private String name;
        private Dimension dimension;

        @SuppressWarnings("unchecked")
        public SB name(String name) {
            this.name = name;
            return (SB) this;
        }
        @SuppressWarnings("unchecked")
        public SB dimension(Dimension dimension) {
            this.dimension = dimension;
            return (SB) this;
        }
        public T build() {
            T result = createInstance();
            result.name = this.name;
            result.dimension = this.dimension;
            return result;
        }
        abstract T createInstance();
    }
}
abstract class TwoDShape extends Shape {
    private double area;

    public static abstract class TwoDShapeBuilder<S extends TwoDShape, SB extends TwoDShapeBuilder<S, SB>>
            extends ShapeBuilder<T, SB> {
        private double area;

        @SuppressWarnings("unchecked")
        public SB area(double area) {
            this.area = area;
            return (SB) this;
        }
        public T build() {
            T result = (T) super.build();
            result.area = this.area;
            return result;
        }
    }
}
class Circle extends TwoDShape {
    private double radius;

    public static class CircleBuilder extends
            TwoDShapeBuilder<Circle, CircleBuilder> {
        private double radius;

        public CircleBuilder radius(double radius) {
            this.radius = radius;
            return this;
        }
        public Circle build() {
            Circle s = (Circle) super.build();
            s.radius = this.radius;
            return s;
        }
        @Override
        protected Circle createInstance() {
            return new Circle();
        }
    }
}
The only thing ugly about this code is that the builders themselves have to be cast to the generic implementation because of the type erasure. However this would be solved by the new "Type Inference" language feature that will be introduced in Java 7.