Thursday, December 31, 2009

Accessing POSIX system calls from Groovy with JNA

One of the many virtues of the Groovy programming language is largely stylistic. Its syntax, typing and general feature set preserve the feel of earlier dynamically typed languages while remaining close enough to Java to support easy interaction with the hefty Java/JVM ecosystem. This gives me first-class access to existing Java code while still allowing me to think like a Perl programmer.

For those of us who came up writing as much Perl, Python and Lisp/Scheme as we did C++ and Java, this is a non-trivial benefit. If you're used to working with both static and dynamic types, you come to associate certain kinds of problems with each: some you naturally think about in terms of dynamic types, while others call for a strong type system. Often the right tool for the job is the right tool because of the workman using it. [1]

The comparison between Groovy and the other dynamic languages is hardly flawless. Some of the discrepancies are due to the platform independence required by the JVM. Take something as simple as getting the PID of the current process on a POSIX-compliant OS. In Perl I have $$ (or $PID with "use English"). In Python I have os.getpid(). In Groovy I got... nuthin'.

Fortunately the JNA library for Java provides access to native libs directly from Java code (no JNI coding required). It seems like access to libc via JNA plus some combination of Groovy's metaprogramming capabilities should help us out. A rough sketch of my requirements would look something like this:

  • Assume a POSIX-compliant OS and ready availability of libc. At least initially we'll constrain ourselves to POSIX system calls.
  • Any solution should stay as close to the POSIX APIs as possible. We can add layers or encapsulate results in objects later. [2] Changing error cases to throw exceptions rather than checking return values in code is an acceptable change (although I admit this is a fairly arbitrary choice).
  • There are a lot of system calls defined in POSIX. No, really; check it out if you don't believe me. I'd like to avoid having to define a function in an interface (or anywhere else) before using it. This may introduce a bit of a performance penalty [3] but I'm okay with that for now.
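
For contrast, here is a minimal sketch of the conventional JNA interface mapping that the last requirement tries to avoid: every function has to be declared before it can be called. The `CLibrary` interface name and the `@Grab` version are assumptions made for illustration, and `Native.load` is the JNA 5.x spelling (older releases use `Native.loadLibrary`).

```groovy
// Sketch only: assumes a POSIX system and JNA 5.x.
// The @Grab version is an arbitrary choice, not something the post specifies.
@Grab('net.java.dev.jna:jna:5.13.0')
import com.sun.jna.Library
import com.sun.jna.Native

// Hypothetical interface: each libc function must be declared up front.
interface CLibrary extends Library {
    int getpid()
    int getppid()
}

def libc = Native.load('c', CLibrary)
def pid = libc.getpid()
println "PID: ${pid}"
```

Calling anything not listed in the interface fails, which is why the approaches below look functions up by name at call time instead.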

A simple initial approach would use methodMissing along the lines of the recommendations offered by the JNA guys for the use of their lib in dynamic languages. In our case we have to wrap the JNA NativeLibrary in a Groovy proxy in order to make it available for metaprogramming work.


  import com.sun.jna.NativeLibrary

  /* Create the lib first so that it's in scope
   * when the closure is declared */
  def libc = NativeLibrary.getInstance("c")

  /* No need to register this meta-class; other
   * instances of NativeLibrary may wish
   * to do things differently. */
  ExpandoMetaClass emc = new ExpandoMetaClass(Proxy,false)
  emc.methodMissing = {

   String fname, fargs ->
   println "Invoking method name ${fname}, args: ${fargs}"
   def f = libc.getFunction(fname)

   synchronized (libc) {

    def rv = f.invokeInt(fargs)
    if (rv == -1) {

     def errnoptr = libc.getGlobalVariableAddress("errno")
     def errno = errnoptr.getInt(0)
     def errstr = libc.getFunction("strerror").invokeString([errno] as Object[],false)
     throw new LibcException(errno,errstr)
    }
    else { return rv }
   }
  }
  emc.initialize()

  def libcproxy = new Proxy().wrap(libc)
  libcproxy.setMetaClass(emc)

  def pid = libcproxy.getpid()
  println "PID: ${pid}"
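
The snippet above throws a `LibcException` that isn't shown in this post. A minimal sketch of what such a class might look like, assuming it only needs to carry the errno value and the strerror text (the class in the published code may well differ):

```groovy
// Hypothetical exception type; the name matches the snippets in this post,
// but the fields and message format are assumptions.
class LibcException extends RuntimeException {

    // The errno value reported by the failed call
    final int errno

    LibcException(int errno, String message) {
        super(message + " (errno " + errno + ")")
        this.errno = errno
    }
}

def e = new LibcException(2, "No such file or directory")
println e.message
```

Making it an unchecked exception keeps call sites free of mandatory try/catch blocks, which seems closer in spirit to the Perl and Python equivalents.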


This seems to work well enough, but upon closer examination there are a few problems:

  • We have to tie up methodMissing for the lib in order to make the lib work. One can imagine that individual apps may wish to do other things with methodMissing.
  • The class we use to make the JNA NativeLibrary into a GroovyObject (groovy.util.Proxy) provides easy access to that NativeLibrary object. This allows other callers to make syscalls using this NativeLibrary while we're in methodMissing. This in turn could easily confuse our error handling code via its reliance on the global "errno" variable.

So we make a few tweaks and wind up with a better proxy using invokeMethod:


import com.sun.jna.NativeLibrary

class GroovyLibc extends GroovyObjectSupport {

 private libc = NativeLibrary.getInstance("c")

 /* Complete hack to cover the fact that the private
  * access control modifier for properties is
  * apparently completely ignored now. Details
  * can be found at
  * http://jira.codehaus.org/browse/GROOVY-1875 */
 public Object getProperty(String name) {

  switch (name) {
   case "libc": throw new MissingPropertyException("Property ${name} unknown")
   default: return super.getProperty(name)
  }
 }

 public Object invokeMethod(String name, Object args) {

  println "Invoking method name ${name}, args: ${args}"
  def f = libc.getFunction(name)
  if (f == null) {

   /* MissingMethodException has no message-only constructor */
   throw new MissingMethodException(name, GroovyLibc, args as Object[])
  }

  synchronized (libc) {

   def rv = f.invokeInt(args)
   if (rv == -1) {

    def errnoptr = libc.getGlobalVariableAddress("errno")
    def errno = errnoptr.getInt(0)
    def errstr = libc.getFunction("strerror").invokeString([errno] as Object[],false)
    throw new LibcException(errno,errstr)
   }
   else { return rv }
  }
 }
}


Both concerns are now addressed. As a final optimization we note that as of version 3.2.0 JNA offers direct support for throwing an exception when a syscall returns an error (according to a defined calling convention). We can make use of this support to clean up our code a bit:


import com.sun.jna.Function
import com.sun.jna.LastErrorException
import com.sun.jna.NativeLibrary

class BetterGroovyLibc extends GroovyObjectSupport {

 private libc = NativeLibrary.getInstance("c")

 /* Complete hack to cover the fact that the private
  * access control modifier for properties is
  * apparently completely ignored now. Details
  * can be found at
  * http://jira.codehaus.org/browse/GROOVY-1875 */
 public Object getProperty(String name) {

  switch (name) {
   case "libc": throw new MissingPropertyException("Property ${name} unknown")
   default: return super.getProperty(name)
  }
 }

 public Object invokeMethod(String name, Object args) {

  println "Invoking method name ${name}, args: ${args}"
  try {

   def f = libc.getFunction(name,Function.THROW_LAST_ERROR)
   if (f == null) {

    /* MissingMethodException has no message-only constructor */
    throw new MissingMethodException(name, BetterGroovyLibc, args as Object[])
   }

   return f.invokeInt(args)
  }
  catch (UnsatisfiedLinkError ule) {

   /* JNA signals an unknown symbol with UnsatisfiedLinkError */
   throw new MissingMethodException(name, BetterGroovyLibc, args as Object[])
  }
  catch (LastErrorException lee) {

   def errno = lee.errorCode
   def errstr = libc.getFunction("strerror").invokeString([errno] as Object[],false)
   throw new LibcException(errno,errstr)
  }
 }
}
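
To see the `THROW_LAST_ERROR` flag in isolation, the following sketch calls access(2) on a path that is assumed not to exist and catches the resulting `LastErrorException`. The path and the `@Grab` version are illustrative assumptions; `0` is the value of `F_OK` on Linux and macOS.

```groovy
// Sketch only: assumes a POSIX system and JNA 5.x.
@Grab('net.java.dev.jna:jna:5.13.0')
import com.sun.jna.Function
import com.sun.jna.LastErrorException
import com.sun.jna.NativeLibrary

def libc = NativeLibrary.getInstance('c')

// With this flag JNA throws when errno is non-zero after the call,
// so there is no return value to inspect by hand.
def access = libc.getFunction('access', Function.THROW_LAST_ERROR)

Integer errno = null
try {
    // '/no/such/path' is assumed to be missing; 0 is F_OK (existence check)
    access.invokeInt(['/no/such/path', 0] as Object[])
} catch (LastErrorException lee) {
    errno = lee.errorCode
    println "access failed, errno ${errno}"
}
```

Because the error code travels inside the exception, there is no separate read of the global errno that another caller could disturb, which is what made the synchronized block necessary in the earlier versions.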


Complete code (along with a few sample unit tests to verify that it works) can be found on GitHub.

[1] Ports of existing dynamic languages (e.g. Jython and JRuby) enable thinking in terms of dynamic types but don't integrate as cleanly with existing Java code.

[2] The jna-posix project has done some excellent work in a similar vein. The project started as part of the JRuby core but was later spun off into a standalone lib. The problem is that it wraps syscall results in objects rather than following conventional semantics. POSIX.lstat(String) returns a FileStat object rather than the more conventional lstat(String,stat struct). There are good arguments for this approach; it's much friendlier to object-oriented systems and it does avoid calling methods for side effects only. But if you're used to the conventional POSIX system calls a new syntax and/or object hierarchy just gets in the way. Like everything else in this article this is at least in part a matter of taste.

[3] For example this constraint explicitly prevents us from using the direct mapping features for native functions available in JNA.
