Friday, April 19, 2013

About test coverage

I was given a link to yet another article preaching about how 100% code coverage is the only goal (the article is preaching, not Tatu, BTW).

There is an issue I have with how the 100% coverage advocates present their argument, but we'll get to that later.

The case for 100% coverage is put quite simply:

Anything less than 100% coverage means that there are some lines in your code that are untested, and are you seriously willing to let code be released without being tested?

Now if your code is 100 lines long and you have 95% coverage, 5 untested lines don't seem too bad. In fact, looking at the code coverage report you see those 5 lines, and you make a judgement call that they are an acceptable risk. No harm done, right?

When the code grows to 1,000 lines, 95% coverage gives you less of that warm and fuzzy feeling, but with a quick scan of the coverage report you can visually check the 50 untested lines, and you make a judgement call that they are an acceptable risk. No harm done, right?

With a 10,000+ LOC code base at 95% coverage, and your boss wanting to know if it is OK to push this code into production now, the 500+ uncovered lines that you need to review are much more than a headache…
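
The arithmetic behind that progression is easy to check. What follows is only a minimal sketch: the class and method names (CoverageGap, uncoveredLines) are made up for illustration, and it assumes coverage is reported as a percentage of lines.

```java
public class CoverageGap {
    /**
     * Number of lines not exercised by tests, given the total line count
     * and a line-coverage percentage (for example 95.0 for 95%).
     */
    public static long uncoveredLines(long totalLines, double coveragePercent) {
        // Rounding guards against floating point error in the multiplication.
        return Math.round(totalLines * (100.0 - coveragePercent) / 100.0);
    }

    public static void main(String[] args) {
        long[] sizes = {100, 1000, 10000};
        for (long size : sizes) {
            System.out.println(size + " LOC at 95% coverage leaves "
                    + uncoveredLines(size, 95.0) + " lines to review");
        }
    }
}
```

The percentage never moves, but the number of lines somebody has to eyeball grows in step with the code base.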

Are you a convert now? Do you believe in 100% coverage? It's a nice argument. But there is a flaw… in some cases that is!

When you are dealing with a dynamically typed language, especially some of the more expressive ones, the only tests you can have are the tests you write yourself. In those languages, the 100% coverage zealots have me sold (at least for code that needs to be maintained and is longer than 100 lines or so!)

But in different languages we can, and do, have additional tests that are provided by tooling:

  • Syntax tests that verify that every line of code is syntactically correct
  • Scoping tests (in languages that force declaring names before first use within a scope) that verify that each line only accesses those names within the correct scope for the line.
  • Type tests (in statically typed languages) that verify that each line only uses values in ways their declared types allow.

And that's not all… most of the code I write runs on the JVM, and there are more tests I can add to the mix when on the JVM:

  • Static analysis tools, such as Checkstyle, PMD and FindBugs, provide an additional set of tests that I can run automatically on my code, looking for common problems and possible mistakes. In fact I've found and fixed bugs with FindBugs in code that had 100% coverage already.
  • Annotations in code can be used not only to document the code contracts, but also to help FindBugs catch bugs. Specifically I am referring to the @CheckForNull and @NonNull annotations. I apply these annotations to the production code, and the toolchain I use then applies tests to that code for free.
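
To make the annotation point concrete, here is a small sketch. The two annotations are declared locally as stand-ins so that the file compiles on its own; in a real project they would come from the annotations jar of your analysis tool. The NullContractSketch class and its methods are invented purely for illustration.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.HashMap;
import java.util.Map;

public class NullContractSketch {
    // Local stand-ins (an assumption of this sketch) for the real annotations.
    @Documented @Retention(RetentionPolicy.CLASS) @interface CheckForNull {}
    @Documented @Retention(RetentionPolicy.CLASS) @interface NonNull {}

    private final Map<String, String> emails = new HashMap<String, String>();

    public void register(@NonNull String user, @NonNull String email) {
        emails.put(user, email);
    }

    /** The contract says: callers must check the result before using it. */
    @CheckForNull
    public String findEmail(@NonNull String user) {
        return emails.get(user);
    }

    /** The contract says: this never returns null. */
    @NonNull
    public String emailDomain(@NonNull String user) {
        String email = findEmail(user);
        // A static analyser that understands @CheckForNull can flag the
        // dereference below if this null check is ever removed.
        if (email == null) {
            return "";
        }
        return email.substring(email.indexOf('@') + 1);
    }
}
```

The annotations cost one word each, and they document the contract whether or not a unit test ever exercises the null path.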

So when I am writing Java code, every line I write already has at least five tests covering it and I haven't even started adding unit tests into the mix!

Now I am not arguing that the above tests are enough for your code on their own… but when you look at my unit test coverage at 83.7% and ask whether I am "happy to ship with 1,630 untested lines of code", I will answer that those 1,630 lines of code are tested. They may not be our best tests, but we have tests on them.

Show me a real code base with 100% coverage, and I will show you a good number of crappy tests helping that code base get to 100% coverage...

On the other hand, if you ask me whether I am happy to ship that Ruby on Rails / Node.js / etc. application into production with 99.5% coverage, I'll say no way are we shipping that code with 50 untested LOC.

Monday, February 04, 2013

Note to self: Updated usejava BASH function for MacOSX

For use when you have JDKs from multiple providers (Apple & Oracle) and want to be able to switch between them in each terminal session.

usejava ()
{
    # Apple JDKs are installed as <version>.jdk, Oracle JDKs as jdk<version>.jdk
    local sel=$1.jdk
    if [ -x "/Library/Java/JavaVirtualMachines/jdk$sel/Contents/Home/bin/java" -a ! -x "/Library/Java/JavaVirtualMachines/$1/Contents/Home/bin/java" ]
    then
        sel=jdk$sel
    fi
    # Use the system location when the selected JDK is installed there
    local base=/Library/Java/JavaVirtualMachines
    if [ -x "/System/Library/Java/JavaVirtualMachines/$sel/Contents/Home/bin/java" ]
    then
        base=/System/Library/Java/JavaVirtualMachines
    fi

    if [ -z "$1" -o ! -x "$base/$sel/Contents/Home/bin/java" ]
    then
        # No usable version given: list the versions that can be selected
        local prefix="Syntax: usejava "
        local i
        for i in /Library/Java/JavaVirtualMachines/* /System/Library/Java/JavaVirtualMachines/*
        do
            if [ -x "$i/Contents/Home/bin/java" ]
            then
                /bin/echo -n "$prefix$(basename "$i" | sed -e "s/^jdk//;s/\.jdk$//;")"
                prefix=" | "
            fi
        done
        /bin/echo ""
    else
        if [ -z "$JAVA_HOME" ]
        then
            export PATH="$base/$sel/Contents/Home/bin:$PATH"
        else
            # Swap the old JDK's bin directory for the new one wherever it appears in PATH
            export PATH="$(echo "$PATH" | sed -e "s:$JAVA_HOME/bin:$base/$sel/Contents/Home/bin:g")"
        fi
        export JAVA_HOME="$base/$sel/Contents/Home"
        # Show the active Java version in the terminal window title
        echo -n -e "\033]0;$(java -version 2>&1 | sed -e "s/.*\"\(.*\)\".*/Java \1/;q")\007"
    fi
}

There is additional fun to be had, given that most Java-based launchers that try to fix JAVA_HOME when it is not set will guess the Apple JVM path… so the following Java program can help:

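public class FixJavaHome {
  public static void main(String[] args) {
    String javaHome = System.getProperty("java.home");
    // When java.home points at the JDK's jre subdirectory, strip it so JAVA_HOME is the JDK root
    if (javaHome.endsWith("/jre")) {
      javaHome = javaHome.substring(0, javaHome.length() - "/jre".length());
    }
    // Print a shell command that the calling shell can eval
    System.out.println("export JAVA_HOME=\"" + javaHome + '\"');
  }
}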

Install like so

mkdir -p ~/bin/FixJavaHome && cd ~/bin/FixJavaHome && cat > FixJavaHome.java <<'EOF'
public class FixJavaHome {
  public static void main(String[] args) {
    String javaHome = System.getProperty("java.home");
    if (javaHome.endsWith("/jre")) {
      javaHome = javaHome.substring(0, javaHome.length() - "/jre".length());
    }
    System.out.println("export JAVA_HOME=\"" + javaHome + '\"');
  }
}
EOF
javac FixJavaHome.java
cd -

If you add the following to your ~/.bash_profile

eval "$(java -cp ~/bin/FixJavaHome/ FixJavaHome)"
echo -n -e "\033]0;$(java -version 2>&1 | sed -e "s/.*\"\(.*\)\".*/Java \1/;q")\007"

Then your JAVA_HOME should be set up from the start, as well as your Terminal window title

Sunday, December 23, 2012

iOS training course

http://j.mp/TfdoxL a link to a free online iOS app development training course that looks worth checking out

Thursday, October 25, 2012

Note to self: Mac Keyboard and ‘fancy’ characters

Special Symbols and Characters on the regular Mac Keyboard

These are for the British Keyboard layout 

Currency Symbols

$ dollar Shift+4
¢ cents Option+4
£ pound Shift+3
¥ yen Option+y
€ Euro Option+2

Trademark and Copyright Symbols

© copyright Option+g
® registered Option+r
™ trademark Option+Shift+2

Apple Symbol

apple Option+Shift+K

Math and Greek Character Symbols

± plus-or-minus Option+Shift+Equals
µ micro Option+h
π pi Option+l (as in L is for llama)
√ square root Option+t
÷ divided by Option+/ (slash is key to the left of right-hand shift key)
· middle dot Option+Shift+9
≈ almost equal Option+x
≠ not equal Option+=
∞ infinity Option+Shift+T
≤ less than or equal Option+, (comma)
≥ greater than or equal Option+. (period)
Å Angstrom sign Option+Shift+K, Shift+A
∑ summation sign Option+Shift+F
° degree sign Option+Shift+0 (zero)
∂ partial differential Option+d
∫ integral Option+Shift+D
Ω Omega Option+Shift+Z

Copyediting, typesetting, and miscellaneous symbols

‡ double dagger Option+Shift+semicolon
¶ pilcrow sign Option+7
§ section sign Option+5
• bullet sign Option+Shift+8

Punctuation marks

‘ left single quotation mark Option+]
’ right single quotation mark Option+Shift+]
“ left double quotation mark Option+[
” right double quotation mark Option+Shift+[
« left pointing double angle quotation mark Option+\ (backslash)
» right pointing double angle quotation mark Option+Shift+\ (backslash)
‹ single left pointing angle quotation mark Option+Shift+3
› single right pointing angle quotation mark Option+Shift+4
¡ inverted exclamation mark Option+1
¿ inverted question mark Option+Shift+?
… ellipsis Option+; (semicolon)

Tuesday, October 09, 2012

The Cross-Build Injection Attack Fallacy

This is a repost of my post on the CloudBees Developers blog

TL;DR Source control injection attacks are a bigger worry than build tool injection attacks, and if you cannot trust your local filesystem, then you cannot trust anything.

A few exchanges on Twitter have prompted me to write a fuller blog post on the subject of Cross-Build Injection (XBI) Attacks.

The idea of XBI is that you trick the developer and replace parts of their code with your code, thereby getting your code to be trusted by the developer.

I do not object to the theory of XBI. But let's get real for a minute. Ultimately all the XBI attacks rely on a compromised local file system.

I am not saying that you cannot apply these attacks to remote systems and then have those affect developers with un-compromised local file systems.

I am saying that even once you have fixed every remote vector, you are still at the mercy of your local file system's integrity.

Take, for example, this attack vector, with Maven as the victim build tool. How does the attack work? Well, it replaces a good artifact in the Maven local repository with a bad version… and bad things happen.

For this attack to work for real you need to have your local file system compromised. Is that attack specific to Maven? Nope. You can get your $ANT_HOME/lib folder contents compromised just as easily (i.e. if your local file system cannot be trusted to hold your local repository, it cannot be trusted to hold your build tool). The same applies to Gradle, Make, MSBuild, etc.

How do we prevent the attack? Well, for quite some time the central repository has only been publishing artifacts with GPG signatures. So we could verify the GPG signature before each and every build… but those signatures are stored on the file system too, so we cannot trust them… and our GPG checking code is stored on the file system also… so we cannot trust that! Never mind that such checks would slow every build down, increasing the risk of the developer being knocked out of “the zone”.
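
To see why such a check is only as trustworthy as the file system it runs from, consider a sketch of about the simplest integrity check there is. To stay self-contained it swaps GPG signature verification for a plain SHA-256 digest comparison, and the ArtifactDigestSketch class, its method names, and the idea of passing in an expected digest are all assumptions made for illustration, not anything Maven itself does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ArtifactDigestSketch {
    /** Hex-encoded SHA-256 digest of the given bytes. */
    public static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                // Mask to an unsigned value so each byte prints as two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a standard algorithm name, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    /**
     * True if the artifact on disk matches the expected digest.
     * The artifact, the expected digest and this class itself are all
     * read from the same local file system.
     */
    public static boolean matches(Path artifact, String expectedSha256Hex) {
        try {
            byte[] content = Files.readAllBytes(artifact);
            return sha256Hex(content).equalsIgnoreCase(expectedSha256Hex);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Anyone who can rewrite the artifact can just as easily rewrite the expected digest, or the class doing the comparison, so the check adds build time without adding trust.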

The reality of “the zone” is often lost on people. Working memory is only able to retain information for a couple of seconds at a time, and therefore any interruption can be fatal to a problem-solving process. Software development is one continuous problem-solving process after another. If you add 5 seconds to every build, then that is 5 seconds of temptation for the developer to check their email / reddit / stackoverflow / etc. And then they will have to rebuild the context of the problem they were solving. In some cases, this can correspond to up to 45-50 minutes of zero productivity for the developer (I cannot find the link, but my personal experience confirms this).

Good developers who recognise this problem will seek to reduce build time to the minimum… and so will turn off any GPG or other integrity checks. If you ask them why, they will probably respond with something like:

Well if I cannot trust the local filesystem, sure I cannot trust the SCM or the signature checks to even run in the first place. I'm reclaiming those 5 seconds on every build and being more productive.

What is the solution? Simple. Don't do the checks until you are making the release build! Better yet do the release builds from a continuous integration server such as Jenkins. You can lock that down, have it do the checks for you, and have it sign the resultant artifacts… but just be sure that you trust its filesystem and your source control system too!