Thursday, August 23, 2007

FindBugs

Inspired by Josh Bloch and Bill Pugh's talk at Google, Java Puzzlers, Episode VI, I decided to run FindBugs over some core Java libraries we wrote and used at a previous employer. Here are some of the findings:

Commons: 675 classes, 505 bugs (98 bad practice, 27 correctness, 96 malicious code vulnerability, 14 multithreaded correctness, 207 performance, 63 dodgy)

Messaging: 31 classes, 21 bugs (9 bad practice, 8 malicious code vulnerability, 3 performance, 1 dodgy)

Services: 239 classes, 78 bugs (5 bad practice, 4 correctness, 35 malicious code vulnerability, 1 multithreaded correctness, 26 performance, 7 dodgy)

Content management: 637 classes, 577 bugs (38 bad practice, 14 correctness, 72 malicious code vulnerability, 3 multithreaded correctness, 382 performance, 68 dodgy)

Example bugs include:

Bad attempt to compute absolute value of signed 32-bit hashcode:

indexPrimary +=  Math.abs(this.getStoragePrimary()
  .hashCode()) + SEP;
Why? If the hash code is equal to Integer.MIN_VALUE, Math.abs returns Integer.MIN_VALUE itself, so the result is still negative.
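
Here is a minimal sketch of the edge case and one possible workaround. The class and method names are hypothetical and not taken from the analysed libraries; masking the sign bit is just one option, and it does not return the absolute value for negative inputs, only a non-negative number.

```java
// Hypothetical demo class; the names below are not from the analysed libraries.
class AbsHashDemo {
    // Clears the sign bit instead of calling Math.abs, so the result is never negative.
    static int nonNegativeHash(Object o) {
        return o.hashCode() & Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // +2^31 does not fit in an int, so Math.abs hands MIN_VALUE back unchanged.
        System.out.println(Math.abs(Integer.MIN_VALUE) == Integer.MIN_VALUE); // true
        // An Integer's hashCode() is its own value, which makes the edge case easy to hit.
        Integer worstCase = Integer.valueOf(Integer.MIN_VALUE);
        System.out.println(Math.abs(worstCase.hashCode()) < 0); // true
        System.out.println(nonNegativeHash(worstCase));          // 0
    }
}
```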

Impossible cast (ouch!):

ArrayList list=new ArrayList();
setChoices((DateDatum[])list.toArray());
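
The cast is impossible because the no-argument toArray() of an ArrayList returns an Object[], which can never be a DateDatum[] at runtime. Below is a small sketch of the failure and the usual remedy, passing a typed array to toArray. It uses String as a stand-in for DateDatum, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;

// Hypothetical demo class; String stands in for the original DateDatum type.
class ToArrayDemo {
    // The typed overload allocates an array of the requested component type.
    static String[] toStringArray(ArrayList<String> list) {
        return list.toArray(new String[0]);
    }

    // Returns true when the untyped cast blows up, which it does for ArrayList.
    static boolean untypedCastFails(ArrayList<String> list) {
        try {
            // toArray() yields an Object[], so this cast throws at runtime.
            String[] ignored = (String[]) list.toArray();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<String>();
        list.add("2007-08-23");
        System.out.println(untypedCastFails(list));       // true
        System.out.println(toStringArray(list).length);   // 1
    }
}
```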

Possible null reference dereference for internalConnetion:

try {
  if(internalConnetion == null) {
    throw new TransactionManagerException("...");
  }
  ...
  } catch(Exception e) {
    throw new TransactionManagerException(
      e.getMessage(), e);
  } finally {
    if (internalConnetion.getDepth() < 1) {
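Why? If internalConnetion is null, the exception thrown in the try block is rethrown by the catch block, but the finally block still runs and dereferences the null reference, so a NullPointerException replaces the intended exception.

Nullcheck of value previously dereferenced:
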
if (accountId.equals(internalAccount)) {
  permissions.add(new AllPermission());
  return permissions;
}
if (accountId == null) {
  return permissions;
}
This last comparison for null is redundant: if accountId were null, the earlier accountId.equals(...) call would already have thrown a NullPointerException.

Method invokes inefficient Boolean constructor:

return new Boolean(false);
Boolean objects are immutable, so there's no need to create a new instance; use Boolean.valueOf(...) instead.

Method invokes inefficient new String(String) constructor:

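String path = new String("");
A string literal is already a String object, so the constructor only creates a needless copy; use the literal "" directly.

Method concatenates strings using + in a loop:
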
for(int i = 0; i < (hash.length / 2); i++) {
  rtnValue += Integer.toHexString(x);
}
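
Each += in the loop copies the whole string built so far, so the usual fix is a single StringBuilder. The sketch below is only an illustration: the original snippet does not show where x comes from, so the method name and the choice to append one hex value per byte are my assumptions.

```java
// Hypothetical demo class; the per-byte conversion is an assumption, not the original logic.
class HexJoinDemo {
    // Appends into one growing buffer instead of creating a new String per iteration.
    static String toHex(byte[] hash) {
        StringBuilder sb = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            // Mask to 0..255 so negative bytes do not sign-extend to eight hex digits.
            // Like the original, this does not zero-pad single-digit values.
            sb.append(Integer.toHexString(b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex(new byte[] {0x0f, (byte) 0xa0})); // fa0
    }
}
```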

Inefficient use of keySet iterator instead of entrySet iterator:

Set keySet = headers.keySet();
Iterator iterator = keySet.iterator();
while(iterator.hasNext()) {
  String key = (String) iterator.next();
  String value = (String) headers.get(key);
}
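
Iterating over entrySet() hands you the key and the value together, which saves one map lookup per element. A small sketch follows, assuming headers maps String to String; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class; assumes the headers map holds String keys and values.
class EntrySetDemo {
    // Walks the entries once, with no extra get(key) lookup per element.
    static String describe(Map<String, String> headers) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : headers.entrySet()) {
            sb.append(entry.getKey()).append('=').append(entry.getValue()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output is predictable.
        Map<String, String> headers = new LinkedHashMap<String, String>();
        headers.put("Host", "example.com");
        headers.put("Accept", "text/html");
        System.out.println(describe(headers)); // Host=example.com;Accept=text/html;
    }
}
```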

May expose internal representation by incorporating reference to mutable object:

public void setMethods(Hashtable methods) {
  this.methods = methods;
  FunctionsContainer.getLogger().info("Loaded " +
    methods.size()+" methods");
}

This code stores a reference to an externally mutable methods object into the internal representation of the object. Storing a copy of the object would have been much safer.
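
A defensive copy could look roughly like this. The holder class, the generic types and the methodCount() helper are hypothetical; note that the copy is shallow, so the values themselves are still shared with the caller.

```java
import java.util.Hashtable;

// Hypothetical stand-in for the original class; only the setter pattern matters here.
class FunctionsHolder {
    private Hashtable<String, Object> methods = new Hashtable<String, Object>();

    public void setMethods(Hashtable<String, Object> methods) {
        // Copy the table so later changes by the caller do not leak into this object.
        this.methods = new Hashtable<String, Object>(methods);
    }

    // Helper added for the demo, to observe the internal state.
    public int methodCount() {
        return methods.size();
    }

    public static void main(String[] args) {
        Hashtable<String, Object> external = new Hashtable<String, Object>();
        external.put("sum", new Object());
        FunctionsHolder holder = new FunctionsHolder();
        holder.setMethods(external);
        external.put("avg", new Object());          // mutate the caller's table afterwards
        System.out.println(holder.methodCount());   // 1
    }
}
```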

It's always a good idea to run static analysis on your code once you're done with it. It won't render it bug free, but it certainly helps. Oh, and while I'm at it, go get yourself a copy of the Java Puzzlers book; it helps you avoid some very dark corners you might not have been aware of.

Sunday, August 19, 2007

xkcd

A webcomic of romance, sarcasm, math, and language by Randall Munroe. Check it.

Friday, August 17, 2007

parkour


I didn't know that parkour was so popular until I saw the video above. This is a short definition from Wikipedia:

Parkour (sometimes abbreviated to PK) or l'art du déplacement (English: the art of displacement) is recreational activity of French origin, the aim of which is to move from point A to point B as efficiently and quickly as possible, using principally the abilities of the human body.

Astonishing eh? Now my traceurs, summon your spidery powers and displace yourselves!

Tuesday, August 14, 2007

cURL

If you are not already using it, I suggest you start using cURL. cURL (Client for URLs, or "see URL") comes in two flavours: a command line tool for getting and sending files using URL syntax, and a library, libcurl, for use by other programs. It supports about a dozen protocols (FTP, FTPS, HTTP, HTTPS, SCP, SFTP, TFTP, TELNET, DICT, FILE, LDAP), cookies, proxy tunneling, transfer resume, authentication (Basic, Digest, NTLM, Negotiate, Kerberos...), SSL certificates, HTTP uploads, a progress meter, speed limits, you name it. Here are some examples:

Upload a file as multipart/form-data plus extra params to a URL: curl -F upload=@localfilename -F press=OK [URL]

Use an agent of your choice: curl -A "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)" [URL]

Get the last 500 bytes of a document: curl -r -500 http://www.get.this/

FTP upload through a proxy: curl --proxytunnel -x proxy:port -T localfile ftp.upload.com

Read and write cookies from a Netscape cookie file: curl -b cookies.txt -c cookies.txt www.example.com

Download resume: curl -C - -o file http://www.server.com/

cURL is used in Flash Player 9, Mac OS X, F-Secure and BOINC, among others. Interfaces exist for most major languages as well as some more obscure ones. Unfortunately, the javacurl interface supports only a small subset of cURL's features and is not well tested.

Friday, August 10, 2007

cloning adventure

Once I wanted to create clones of tree nodes, naming the clones of a node, say A, as A-Copy-1, A-Copy-2, etc. My first thought was to use the parent's child-count property and start counting copies from that value onwards. However, if the nodes are A, A-Copy-1 and A-Copy-2 and A-Copy-1 gets deleted, the child count becomes 2, and using it produces a name that one of the existing children already has. Then I thought I would simply iterate through the names A-Copy-i, for i = 1 to child-count, and create a clone whenever a name was free. That would reuse the slots left by previous deletions but would make it impossible to tell the new nodes from the old ones.

This was easily fixed by finding the maximum copy count of A and creating clones after that: if sibling nodes A-Copy-4 and A-Copy-6 are left in the tree, any new copies are named A-Copy-7, A-Copy-8, etc. That way you always get a nice continuous run of cloned nodes in the tree. Even cloning an A-Copy-i node works quite naturally: the clone becomes A-Copy-i-Copy-j.

I decided to use regular expressions to look for <node name>-Copy-<copy count> sibling nodes. That gave the extra benefit of having the copy count available through a capturing group. To force any metacharacters in the node name to be treated as ordinary characters, I preceded every single character in the name with a backslash:

StringBuffer escapedNodeName = 
  new StringBuffer(nodeName.length()*2);
for (int i = 0; i < nodeName.length(); i++) {
  escapedNodeName.append('\\').append(nodeName.charAt(i));
}
RE exp = new RE(escapedNodeName.toString() + "-Copy-([1-9]\\d*$)");

This failed miserably! Can you guess why? Escaping a metacharacter is fine, but "escaping" ordinary characters is another kettle of fish. When this code was used on a tree with digits in the node names, the escaped digits became backreferences and the match failed. I hastily changed the escaping code to enclose the name in a quote (\Q and \E). No sooner had I done this than I realised that even that had a flaw of its own: what if the node name had a \E in it? Proper quoting required some more effort:

StringBuilder escapedNodeName = 
  new StringBuilder(nodeName.length() * 2);
escapedNodeName.append("\\Q");
int slashEIndex = 0;
int current = 0;
while ((slashEIndex = nodeName.indexOf("\\E", current)) != -1) {
  escapedNodeName.append(nodeName.substring(current, 
                                            slashEIndex));
  current = slashEIndex + 2;
  escapedNodeName.append("\\E\\\\E\\Q");
}
escapedNodeName.append(nodeName.substring(current, 
                                          nodeName.length()));
escapedNodeName.append("\\E");

Thankfully, this has been available as the java.util.regex.Pattern.quote(String) method since Java 1.5.
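
Putting the pieces together, the naming scheme can be sketched with java.util.regex as below. This is not the original code, which used a different RE class; the class name, the method name and the idea of passing sibling names as a list are assumptions made for illustration. It also assumes copy counts fit in an int.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not the original tree code.
class CopyNamer {
    // Returns "<nodeName>-Copy-<max existing copy count + 1>".
    static String nextCopyName(String nodeName, List<String> siblingNames) {
        // Pattern.quote treats the whole node name literally, even if it contains \E.
        Pattern p = Pattern.compile(Pattern.quote(nodeName) + "-Copy-([1-9]\\d*)");
        int max = 0;
        for (String sibling : siblingNames) {
            Matcher m = p.matcher(sibling);
            // matches() requires the entire name to fit, so A-Copy-1-Copy-2 is not a copy of A.
            if (m.matches()) {
                max = Math.max(max, Integer.parseInt(m.group(1)));
            }
        }
        return nodeName + "-Copy-" + (max + 1);
    }

    public static void main(String[] args) {
        List<String> siblings = Arrays.asList("A", "A-Copy-4", "A-Copy-6");
        System.out.println(nextCopyName("A", siblings)); // A-Copy-7
    }
}
```

Because the copy count is read from the capturing group, gaps left by deleted copies are never reused, which is the behaviour described above.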