Jake Goulding

Rust on an Arduino Uno, Part 6

2016-05-27T11:11:11-04:00

We can’t yet compile the stock version of libcore, so in the meantime we have our own version with the essentials. Because we’ve directly added this code to our project, each recompile takes a while. It’d be really nice if we could use Cargo like a Real Rust Project would, allowing us to compile our modified libcore once and reuse it.

Create a new library crate (cargo new avr-core) and move all of the hacked-up core files that we created before into the src directory:

clone.rs
cmp.rs
intrinsics.rs
marker.rs
ops.rs
option.rs

Additionally, create a lib.rs with the top-level core items:

all the feature flags
the prelude
module references
the eh_personality and panic handlers.

Now we can create a binary crate that will use our AVR-compatible libcore. After cargo new --bin blink, add a path to the core library:

[dependencies.avr-core]
path = "../avr-core"

We can remove a bunch of junk from our main.rs and just import the interesting core items:

extern crate avr_core;

use avr_core::prelude::*;
use avr_core::intrinsics::{volatile_load, volatile_store};
use avr_core::marker::PhantomData;

Now when we compile, we only need to rebuild our own code, not all of libcore! Much better.

Let’s continue improving our code. The last thing we did was to hook up an interrupt handler for the timers, but we had to add a bunch of assembly to make the handler behave in the proper way. As suggested in the previous post, there’s a much better way to do it.

Rust allows us to declare extern functions with a calling convention. A calling convention describes where the arguments are located, where the return value should be placed, and what registers a function is allowed to change.

There are two special calling conventions for AVR code: avr-interrupt and avr-non-blocking-interrupt. They are basically the same, except that the latter immediately re-enables interrupt handling when it starts. With the former, you don’t have to worry about one interrupt happening while you are handling another.

That means we can rewrite our interrupt handler much more easily:

#[no_mangle]
pub unsafe extern "avr-interrupt" fn _ivr_timer1_compare_a() {
    let prev_value = volatile_load(PORTB);
    volatile_store(PORTB, prev_value ^ PINB5);
}

Now that we are using Cargo, it would be nice if we didn’t have to directly call avr-gcc ourselves. We can accomplish this with a target file. This is a JSON configuration file that can enhance the Rust compiler’s knowledge about how to compile a piece of code.

There are many fields that are required (check the repository for the full reference), but the important part is that we can tell the compiler to use avr-gcc as our linker:

  "linker": "avr-gcc",
  "pre-link-args": ["-mmcu=atmega328p", "-nostartfiles", "../interrupt_vector.S"],
  "exe-suffix": ".elf",
  "post-link-args": ["-Wl,--no-gc-sections"],

And we can use this JSON target file when compiling:

cargo build --release --target=./arduino.json

This will create our ELF file, automatically linking to our interrupt vector definition, and ready to be processed with avr-objcopy and uploaded to the board. We are getting closer and closer to an enjoyable development experience!

As before, the complete source is available on GitHub.

Rust on an Arduino Uno, Part 5

2016-05-19T13:57:04-04:00

Previously, we wrote some code that allowed us to sleep by waiting for a number of cycles to pass. However, we had to peek at the disassembly to know how many cycles we were spending and adapt our source code to match. While it got us started, it’s not a very elegant solution.

The Arduino Uno uses an ATmega328P processor. One of the features of this processor is a set of 3 built-in timers that can trigger interrupts at certain periods. Interrupts are special bits of code that take over control of the processor when something important happens. These are often time-critical things that need to be handled quickly.

What would be ideal is if we could rely on the timer feature to implement our sleep method. To get started, we are going to need the ability to specify the interrupt vector.

The interrupt vector is a table of 26 instructions that must be placed at a specific section in memory. Each element in the table corresponds to a specific interrupt, and should consist of one instruction that jumps to the appropriate interrupt handler.

To do this, we need to write a little bit of assembly:

ivr:
        jmp _ivr_reset
        jmp _ivr_irq0
        jmp _ivr_irq1
        ;; continues for all the rest

In order to use this, we need to include it when linking all of our code together. We also have to disable the existing interrupt vector that would be added. This is done via the -nostartfiles flag:

avr-gcc -mmcu=atmega328p interrupt_vector.S hello.o -nostartfiles -o hello.elf

If you compile right now, you will get a whole bunch of errors of the form:

temp_file_name.o: In function `ivr':
(.text+0x0): undefined reference to `_ivr_reset'

Our interrupt vector is trying to jump to a bunch of symbols that we haven’t yet defined. We could do the simple thing and define a bunch of _ivr_* methods in Rust (and I did, to start with), but that’s rather annoying. Instead, we can use weak linking to define a kind of “fallback” symbol. We will have one simple handler that just returns from the interrupt, and set each handler to use that unless it is defined:

_ivr_undefined:
        reti

.weak _ivr_irq0
.set  _ivr_irq0, _ivr_undefined
;;; And so on

The only outlier is _ivr_reset which we define to point to our main method, avoiding extraneous indirection. At this point, we should be compiling again, but not using the interrupts yet. Let’s change that.

Following this guide, we can see all the details of setting up the timer. At a high level it’s:

Register an interrupt handler.
Disable interrupts.
Set a bunch of values as determined by the datasheet and math.
Enable interrupts.

We will copy all of the values and registers from this article to set up timer 0, but with a 1kHz rate instead of 2kHz. This is a nicer match for our sleep_ms method, which waits for a number of milliseconds.
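As a rough sanity check on the “datasheet and math” step, here is a small host-side sketch (not code that runs on the board) of how a compare value for a given tick rate can be derived. It assumes the Uno’s 16MHz clock, a timer that counts from zero up to the compare value and then resets, and a prescaler of 64. The prescaler and the name compare_value are illustrative assumptions, so check them against the guide and the datasheet.

```rust
/// Compare-match value for an 8-bit timer that clears itself on a match.
///
/// The timer counts from 0 up to the compare value, so one interrupt fires
/// every `prescaler * (compare + 1)` clock cycles.
/// Returns `None` when no 8-bit value gives exactly the requested rate.
fn compare_value(clock_hz: u32, prescaler: u32, rate_hz: u32) -> Option<u8> {
    let cycles_per_tick = prescaler.checked_mul(rate_hz)?;
    if cycles_per_tick == 0 || clock_hz % cycles_per_tick != 0 {
        return None;
    }
    // Number of timer counts between two interrupts.
    let counts = clock_hz / cycles_per_tick;
    let compare = counts.checked_sub(1)?;
    if compare <= 255 {
        Some(compare as u8)
    } else {
        None
    }
}
```

Under those assumptions, compare_value(16_000_000, 64, 1_000) gives Some(249) and the 2kHz rate gives Some(124). A prescaler of 8 at 1kHz gives None, because the value no longer fits in 8 bits.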

Let’s use a little bit of nice Rust for a change. When we disable interrupts, we really want to make sure we enable them again! In a language like Rust, we can use a (misleadingly labeled) pattern known as Resource Acquisition Is Initialization (RAII). We will create a struct that disables interrupts when it is created and enables them when the struct is dropped. This means we can never forget to re-enable interrupts as the compiler will ensure things are restored!

struct DisableInterrupts(PhantomData<()>);
impl DisableInterrupts {
    fn new() -> DisableInterrupts {
        unsafe { asm!("CLI") }
        DisableInterrupts(PhantomData)
    }
}

impl Drop for DisableInterrupts {
    fn drop(&mut self) {
        unsafe { asm!("SEI") }
    }
}

We can bundle this into a nice wrapper:

fn without_interrupts<F, T>(f: F) -> T
    where F: FnOnce() -> T
{
    let _disabled = DisableInterrupts::new();
    f()
}

And use it like so:

without_interrupts(|| {
    volatile_store(TCCR0A, 0);
});
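To see the guard pattern in action without a board, here is a host-side sketch that swaps the CLI and SEI instructions for a plain boolean. The InterruptFlag and Disabled types are hypothetical stand-ins invented for this illustration; only the shape of without_interrupts mirrors the code above.

```rust
use std::cell::Cell;

/// Stand-in for the global interrupt enable bit; `true` means enabled.
struct InterruptFlag {
    enabled: Cell<bool>,
}

impl InterruptFlag {
    fn new() -> Self {
        InterruptFlag { enabled: Cell::new(true) }
    }

    fn is_enabled(&self) -> bool {
        self.enabled.get()
    }
}

/// Guard that keeps the simulated interrupts off while it is alive.
struct Disabled<'a> {
    flag: &'a InterruptFlag,
}

impl<'a> Disabled<'a> {
    fn new(flag: &'a InterruptFlag) -> Self {
        flag.enabled.set(false); // plays the role of CLI
        Disabled { flag }
    }
}

impl<'a> Drop for Disabled<'a> {
    fn drop(&mut self) {
        self.flag.enabled.set(true); // plays the role of SEI
    }
}

fn without_interrupts<F, T>(flag: &InterruptFlag, f: F) -> T
where
    F: FnOnce() -> T,
{
    let _disabled = Disabled::new(flag);
    f() // the guard is dropped only after the closure has returned
}
```

Calling without_interrupts(&flag, || flag.is_enabled()) returns false, and flag.is_enabled() is true again afterwards. Like the real version, the guard re-enables unconditionally when dropped, so nesting two guards would turn interrupts back on too early.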

To define the interrupt handler, we simply create a method that matches the expected name from our assembly file. The method simply increments a global variable each time it is triggered:

static mut N_MS_ELAPSED: u8 = 0;

#[no_mangle]
pub unsafe extern fn _ivr_timer0_compare_a() {
    N_MS_ELAPSED += 1;
}

And re-implement our sleep_ms function to:

fn sleep_ms(duration_ms: u8) {
    unsafe {
        volatile_store(&mut N_MS_ELAPSED, 0);
        while volatile_load(&mut N_MS_ELAPSED) < duration_ms {
            // spin
        }
    }
}

Compile this and load it onto the board, and we are greeted with the sight of nothing blinking. It’s time to dig into more disassembly. Here’s what _ivr_timer0_compare_a looks like:

lds     r24, 0x0000
inc     r24
sts     0x0000, r24
ret

Checking the instruction set manual and the datasheet, we will notice a few problems:

We use ret (Return from Subroutine) instead of reti (Return from Interrupt).
We do not save and restore the Status register.
We do not save and restore the r24 register.

Let’s modify our handler with a bit more assembly to address all three issues:

#[no_mangle]
pub unsafe extern fn _ivr_timer0_compare_a() {
    asm!{
        "PUSH R24
         IN R24, 0x3F
         PUSH R24"
    };

    N_MS_ELAPSED += 1;

    asm!{
        "POP R24
         OUT 0x3F, R24
         POP R24
         RETI"
    };
}

That’s certainly a bit longer, but it compiles and works again! And it will continue to work, so long as the compiler always decides to use r24 for the incremented value, something we have no control over. As you might guess, there’s a better solution.

Rust on an Arduino Uno, Part 4

2016-05-12T13:04:43-04:00

When we left off, we were blinking the LED. Let’s take a brief detour and document how to get a working Rust compiler. This is mostly a way for me to document what I’ve been doing so I can find it again!

We are going to start by getting a local version of LLVM that supports targeting AVR. After cloning the repository, we will need to set up for a build. Note that the upstream avr-rust-support branch sometimes lags compared to avr-support, so you will probably want to merge the two branches to get any updates.

cd avr-llvm
git checkout avr-support
git merge origin/avr-rust-support
mkdir -p debug/build
cd debug/build

We will then configure LLVM. This particular configuration I have here is based off the current Rust build and is specific to OS X (see the C_FLAGS and CXX_FLAGS). If you are using a different platform, you’ll need to poke at the Rust build process to see the appropriate flags.

Last updated: 2016-11-06

cmake ../.. \
  -DCMAKE_BUILD_TYPE=Debug \
  -DLLVM_TARGETS_TO_BUILD="X86;AVR" \
  -DLLVM_INCLUDE_EXAMPLES=OFF \
  -DLLVM_INCLUDE_TESTS=OFF \
  -DLLVM_INCLUDE_DOCS=OFF \
  -DLLVM_ENABLE_ZLIB=OFF \
  -DWITH_POLLY=OFF \
  -DLLVM_ENABLE_TERMINFO=OFF \
  -DLLVM_INSTALL_UTILS=ON \
  -DCMAKE_C_FLAGS="-ffunction-sections -fdata-sections -m64 -fPIC -stdlib=libc++" \
  -DCMAKE_CXX_FLAGS="-ffunction-sections -fdata-sections -m64 -fPIC -stdlib=libc++" \
  -DCMAKE_INSTALL_PREFIX=..

Then it’s just a matter of building and installing. Since it created normal Makefiles for me, I passed an extra make flag to build in parallel. The LLVM build is pretty fast this way!

cmake --build . -- -j7
cmake --build . --target install

Then we need to build Rust with this custom LLVM. After cloning the repository, set up the structure:

cd avr-rust
git checkout avr-support
mkdir -p debug
cd debug/

AVR-LLVM is based on a very new version of LLVM, so we need to use the in-progress Rust build system called “rustbuild”. Using in-development build systems with in-development compilers, what could go wrong?

Note that it’s very important to use an absolute path to your LLVM directory.

../configure \
  --enable-rustbuild \
  --enable-debug \
  --disable-docs \
  --enable-debug-assertions \
  --disable-jemalloc \
  --llvm-root=/absolute/path/to/avr-llvm/debug

Then we build!

make -j7

4 or more hours later, you will have a fully-built compiler. However, you can usually get up-and-running earlier by using the stage 1 compiler, located in debug/build/*/stage1. This will be available pretty quickly, before the entire build is complete.

We then add this build as a rustup toolchain and use it as the override compiler in a directory:

rustup toolchain link avr /path/to/rust/debug/build/*/stage1
rustup override set avr

Note that this will only produce a cross-compiler; none of the libraries that make things actually work. That’s still coming!

Rust on an Arduino Uno, Part 3

2016-01-24T13:11:12-05:00

Now that we can turn an LED on, let’s see if we can do something more exciting: make the LED blink. Surprisingly, this is more difficult than you might expect!

Blinking boils down to “turn the light on, wait a while, turn the light off, wait a while” and repeat forever. We already know how to turn the light on and off, as well as repeating forever. The trick lies in “wait a while”.

In a conventional Rust application, we’d probably call something like std::thread::sleep, but we don’t have access to libstd on an Arduino as that library is too high-level. We will have to implement it ourselves!

It’s easy enough: all we have to do is loop a bunch of times. If the Arduino processor runs at 16MHz, we can waste 16000 cycles to take one millisecond. We will execute a nop instruction to waste the time:

fn sleep_ms(duration_ms: u16) {
    const FREQUENCY_HZ: u32 = 16_000_000;
    const CYCLES_PER_MS: u16 = (FREQUENCY_HZ / 1000) as u16;

    for _ in 0..duration_ms {
        for _ in 0..CYCLES_PER_MS {
            unsafe { asm!("nop"); }
        }
    }
}

Just compile this and away we go! Or not…

error: failed to resolve. Maybe a missing `extern crate iter`? [E0433]
     for _ in 0..duration_ms {
     ^~~~

error: unresolved name `iter::IntoIterator::into_iter` [E0425]

Right, we haven’t actually defined any of the Iterator logic; that’s in libcore which we don’t have yet. Let’s skip that and do something a little more C-like. We can just loop and increment integers and compare them:

fn sleep_ms(duration_ms: u16) {
    const FREQUENCY_HZ: u32 = 16_000_000;
    const CYCLES_PER_MS: u16 = 16_000;

    let mut outer = 0;
    while outer < duration_ms {
        let mut inner = 0;
        while inner < CYCLES_PER_MS {
            unsafe { asm!("nop"); }
            inner += 1;
        }
        outer += 1;
    }
}

And… that fails too:

error: binary operation `/` cannot be applied to type `u32` [E0369]
     const CYCLES_PER_MS: u16 = (FREQUENCY_HZ / 1000) as u16;
                                 ^~~~~~~~~~~~

Ok, no division, even if it is just a constant and should be computed at compile time. Well, we can hard code it for the moment…

error: binary operation `<` cannot be applied to type `_` [E0369]
     while outer < duration_ms {
           ^~~~~

error: binary assignment operation `+=` cannot be applied to type `u16` [E0368]
         outer += 1;
         ^~~~~

OK, wow, no addition or comparison either. There’s no way around this – we really need libcore or else we are stuck with a pretty primitive environment. Since we know we have issues compiling all of libcore, let’s try a smaller part, just enough to compile this example.

Previously, we had copied in some small snippets from libcore, but let’s replace those excerpts with the complete files and drag in a few more. After some trial-and-error, this small set compiles:

clone
cmp
intrinsics
marker
ops
option

With it compiling, let’s actually call sleep_ms in our main and load the program onto the board:

loop {
    sleep_ms(500);
    volatile_store(PORTB, 0xFF); // Everything is on
    sleep_ms(500);
    volatile_store(PORTB, 0x00); // Everything is off
}

A video of the blinking LED.

Look at that nice, steady blinking. Blinking at a rate that is nothing like 500 milliseconds. Let’s take a look at the disassembly for the inner loop to understand why:

adiw ;; Add word (16-bit)            ;; 2 cycles
cp   ;; Compare registers            ;; 1 cycle
cpc  ;; Compare registers with carry ;; 1 cycle
brcs ;; Branch if carry set          ;; 1 cycle (false) / 2 cycles (true)

We increment our counter and check to see if we’ve exceeded our limit. In all cases except the last iteration we will branch back to the beginning of the loop, bringing the total cycle count of the loop to six. Compare that to the naive calculation that the nop would take one cycle and the rest of the loop would be free. Dividing the inner loop constant by six gets us much closer to the appropriate duration.

The outer loop and the function call itself also have some overhead, but these only add up to a few cycles per inner loop. Since the inner loop corresponds to many thousands of cycles, a few cycles is a small error and I think can be safely ignored.
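The adjustment is just arithmetic, sketched here as host-side Rust. On the board we would hard-code the result, since division is not available to us yet. The function name iterations_per_ms is made up for this illustration.

```rust
/// Number of inner-loop iterations that take roughly one millisecond,
/// ignoring the small overhead of the outer loop and the function call.
const fn iterations_per_ms(frequency_hz: u32, cycles_per_iteration: u32) -> u32 {
    frequency_hz / 1_000 / cycles_per_iteration
}
```

At 16MHz there are 16,000 cycles in a millisecond. With six cycles per iteration, iterations_per_ms(16_000_000, 6) comes out to 2,666, which is the value to use in place of 16,000 for the inner loop limit.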

An interesting aside is that I have no idea why the nop does not occur inside the loop. The compiler has reordered the code such that the nop occurs in the variable initialization of the function. You can change the code to just asm!("") and accomplish the same goal of preventing the loop from being optimized away.

Next time, we will see if we can do something a little more structured than counting cycles to sleep. As before, check out the repository for the code up to this point.

Rust on an Arduino Uno, Part 2

2016-01-17T14:34:54-05:00

After my previous attempt, I started to think that the issues were caused by an inability to completely link the program. If that were the case, could we try to link in a different way?

Through a bit of trial and error, I was able to generate an object file:

rustc --target avr-atmel-none hello.rs --emit obj

Checking the disassembly of this file with objdump -d hello.o showed promise:

00000000 :
   0:   0e 94 00 00     call    0     ; 0x0 
   4:   08 95           ret

Disassembly of section .text._ZN4main10__rust_abiE:

00000000 <_ZN4main10__rust_abiE>:
   0:   8f ef           ldi r24, 0xFF ; 255
   2:   84 b9           out 0x04, r24 ; 4
   4:   00 c0           rjmp    .+0   ; 0x6 

00000006 :
   6:   8f ef           ldi r24, 0xFF ; 255
   8:   85 b9           out 0x05, r24 ; 5
   a:   80 e0           ldi r24, 0x00 ; 0
   c:   85 b9           out 0x05, r24 ; 5
   e:   fb cf           rjmp    .-10  ; 0x6 

I then used an existing installation of GCC with AVR support to finish linking the code together:

avr-gcc -mmcu=atmega328p hello.o -o hello.elf

Taking a look at the disassembly of this code shows a lot of things that were not present in the original object file:

The interrupt vector table is established. This occupies about the first 25 instructions. Each instruction is a jump to the appropriate interrupt handler. Most importantly, table index 0 is the reset “interrupt” which controls where the processor should jump to when it is initialized.
The EEPROM Control Register and GPIOR0 are initialized and external interrupts are disabled. Then main is called.
After main returns, interrupts are disabled and the chip goes into an infinite loop.

Getting code on board

Now that we have a compiled binary, we need to get it onto the Arduino proper. avrdude is an in-system programmer that will allow us to upload the compiled code, but it prefers input in a different format: Intel HEX. We can convert using avr-objcopy:

avr-objcopy -O ihex -R .eeprom hello.elf hello.hex

Now we can upload to the Arduino:

avrdude -p atmega328p -c arduino -P /dev/cu.usbmodem1411 -U flash:w:hello.hex:i

The Arduino Uno has a second Atmel chip (ATmega16U2) that looks like a USB-to-serial device to the host computer. On my OS X computer, that device shows up at /dev/cu.usbmodem1411. Your location will differ.

It’s alive!

Because I have such a basic level of code, I can’t do anything nice like blink an LED. Instead, I can write a tight loop that just turns the LED on or off some percentage of the time. This allows it to have a different relative brightness, which in turn lets me see that the code changes are actually happening.

Check out the LED marked L in the following pictures.

LED on 100% of the time

LED on 50% of the time

LED on 1% of the time

What’s next?

This isn’t the most elegant of solutions, and there are a lot of avenues to explore:

Avoid installing avr-gcc and avr-objcopy. Right now, avr-gcc is used when compiling Rust itself (for compiler-rt) and to finish assembly of the executable. It would be ideal if all of this could be handled within an AVR-enabled Rust or LLVM.
Set interrupt handlers. I think the typical solution is to use a linker script, but that’s one more moving piece I’d like to avoid adding.
Compile libcore! In order to get the most basic of things to compile, I had to straight-up copy code from libcore. An impressive number of things are included there: things you might want to use, like addition, not to mention Option or anything having to do with iterators. libstd is unlikely to ever be supported as it relies on memory allocation.
Merge the Rust fork of LLVM with the AVR fork of LLVM. The more frequently these are merged, the easier it will be to eventually include the AVR support in Rust proper. I started to do this, but had a large number of merge conflicts so I backed off.
Compile AVR-enabled Rust in non-debug mode. For some reason, when I compile Rust without debugging symbols, I get an “exciting” assertion failure from deep within LLVM. That is most likely a symptom of some problem that should be fixed.

TL;DR

Check out my repo for an example that worked for me. In short:

# Compile object
rustc --target avr-atmel-none -C target-cpu=atmega328p --emit=obj hello.rs -o hello.o
# Link together
avr-gcc -mmcu=atmega328p hello.o -o hello.elf
# Reformat for upload
avr-objcopy -O ihex -R .eeprom hello.elf hello.hex
# Upload to the board
avrdude -p atmega328p -c arduino -P /dev/cu.usbmodem1411 -U flash:w:hello.hex:i

If you are on OS X, you can install the things you need (except an AVR-enabled Rust build) with homebrew:

brew tap osx-cross/avr
brew install avr-libc
brew install avrdude

Continue on to part 3.

Rust on an Arduino Uno

2016-01-02T15:26:54-05:00

We have an Arduino Uno that’s been sitting around gathering dust for a little while, so I decided to see how Rust worked on it.

A bit of searching led to a fork of Rust with AVR support, AVR-Rust. This is built on top of a fork of LLVM with AVR support, AVR-LLVM. Both of these projects are led by Dylan McKay.

The current documentation for AVR-Rust is a bit lacking, and it was forked from a development version of Rust 1.4. The current development version is Rust 1.7, making the fork about 4.5 months old. However, the changes to LLVM are in the process of being merged into upstream, laying the groundwork for merging the changes into Rust as well.

Let’s start out by doing the bare minimum and try to get a version of rustc that can target the AVR chip:

git clone https://github.com/avr-rust/rust.git
mkdir build && cd build
../rust/configure
make

You’ll note that there’s nothing AVR specific here. Every Rust compiler is actually a cross-compiler, a compiler that executes on one architecture but produces code for another architecture. Because this fork of Rust has support files for AVR, it will be able to produce the correct executable code.

Unfortunately, I couldn’t get a basic file to compile out of the box.

So I did what any sane person would do – I started changing code without knowing exactly what the failure was or what the code I was changing did.

First I tried updating the branch of LLVM that AVR-Rust uses. There are two branches in the repository – avr-support is more actively updated and avr-rust-support lags behind.

Merging avr-support into avr-rust-support went smoothly, but the Rust LLVM driver code needed to be updated to handle the newer LLVM version. I grabbed the diff from the main Rust repository and applied that. This seemed to work, but then I got a segfault from the stage 1 Rust compiler, deep in the internals of LLVM.

make: *** [x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/stamp.term] Segmentation fault: 11

So I continued changing more stuff!

I merged current Rust into the AVR fork of Rust and resolved the merge conflicts as best I could figure out. After fixing a few new errors and some poor merge conflicts, I was on my way. Until I hit the segfault again.

That means it’s actually time to try to figure out where the segfault was coming from. I configured another build with some debug information:

./configure --enable-debug --disable-docs --enable-llvm-assertions --enable-debug-assertions

And built. This takes a long time, as nothing gets optimized. And then it turns out that doing this also hides the segfault. Ugh.

However, I do get to a new error:

ld: unknown option: --as-needed
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Fortunately, I know where to tweak that in the source. The downside is I’ll need to wait for another long build cycle…

Continue on to part 2.

Running dnsmasq on OS X and routing to virtual machines

2014-04-26T17:28:27-04:00

At work, I needed to run a Docker container with a Rails application that talked to another application running inside a VMWare virtual machine. Adding to the complexity, I use boot2docker, which runs inside of VirtualBox.

If I only needed to access rails.localdomain.dev or api.localdomain.dev from my Mac, I could have simply edited /etc/hosts and set both domains to resolve to 127.0.0.1 and been done with it. Unfortunately, Rails needed to be able to directly resolve the API server.

Setting up dnsmasq

NOTE: It’s possible that editing /etc/hosts would have been enough and I didn’t need to set up dnsmasq at all. Read the section about configuring the virtual machine’s DNS first.

I followed this tutorial to install and configure dnsmasq. You can ignore the parts about nginx and foreman.

Our Rails application must run at a domain like rails.localdomain.dev, and the API server runs at api.localdomain.dev, so I configured dnsmasq to manage the .localdomain.dev domain.

I moved the hard-coded DNS entry for api.localdomain.dev from /etc/hosts to dnsmasq.conf. I found this IP by logging into the API VM and looking at the output of ifconfig. I’m not certain why this IP will not change, but that’s what I was told.

Creating a routable “loopback address”

Originally, I had configured api.localdomain.dev to resolve to 127.0.0.1. This works fine when accessed from the Mac, but when a virtual machine resolved that domain, 127.0.0.1 would refer to the virtual machine itself! I needed an IP address that:

Would refer to my laptop.
Wouldn’t change when I changed network configuration.
Wouldn’t resolve to the VM inside the VM.

We can accomplish this by using an ifconfig alias:

sudo ifconfig lo0 alias 10.254.254.254

I picked 10.254.254.254 because it is a private network address and it is unlikely to be used on any networks I connect to. If I ever do have a conflict, there are many other private addresses to choose from!

I edited dnsmasq.conf and replaced 127.0.0.1 with 10.254.254.254. Requests for *.localdomain.dev will now resolve to an IP address that will always refer to the Mac, but that the virtual machines will not think resolves to the virtual machine itself.

Big thanks to Andre for helping me find and understand how aliasing works!

Configuring virtual machine DNS

Next I configured the virtual machine to route all DNS requests through the Mac’s resolving system. For VirtualBox, configure it according to this serverfault answer. If you are using Vagrant, you can add a stanza like:

config.vm.provider "virtualbox" do |vb|
  # Always go through OS X resolver, allowing us to redirect local domains.
  vb.customize ["modifyvm", :id, "--natdnsproxy1", "on"]
  vb.customize ["modifyvm", :id, "--natdnshostresolver1", "on"]
end

I’m not sure, but it’s possible that these settings would use entries configured in my Mac’s /etc/hosts. This could make it so that the dnsmasq step isn’t required.

Instead of resolving through the host, I could have edited /etc/resolv.conf and used 10.254.254.254 as my DNS server instead. If you do this, you definitely need to run dnsmasq.

Once the virtual machine could ping api.localdomain.dev, I restarted the Docker daemon to pick up the networking changes. Dropping into a Docker container, I was able to ping the API server as well. Success!

A final dip into Ruby's Marshal format

2013-01-20T20:30:00-05:00

This is the third and last of my posts about the Marshal format. The first part introduced the format and some straightforward serializations. The second part touched on strings and object links. This post rounds us off with regexes, classes, modules, and instances of objects.

Regexes

/hello/

0408 492f 0a68 656c 6c6f 0006 3a06 4546

Like strings, regexes are surrounded by an IVAR. The typecode 2f is ASCII / and denotes that this object is a regex. The length of the string follows, again encoded as an integer. The regex string is stored as a set of bytes, and must be interpreted with the string encoding from the IVAR. After the string, the regex options are saved.

/hello/imx

0408 492f 0a68 656c 6c6f 0706 3a06 4546

The regex option byte is a bitset of the five possible options. In this example, ignore case, extended, and multiline are set (0x1, 0x2, and 0x4 respectively).
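Reading the option byte back is a matter of testing each bit. Here is a minimal sketch in Rust that covers only the three options named above and ignores any other bits. The constant and function names are made up for this illustration.

```rust
/// Option bits named in the text above.
const IGNORE_CASE: u8 = 0x1;
const EXTENDED: u8 = 0x2;
const MULTILINE: u8 = 0x4;

/// Returns the regex suffix letters (`i`, `x`, `m`) for the bits set in `options`.
fn regex_flags(options: u8) -> String {
    let mut flags = String::new();
    if options & IGNORE_CASE != 0 {
        flags.push('i');
    }
    if options & EXTENDED != 0 {
        flags.push('x');
    }
    if options & MULTILINE != 0 {
        flags.push('m');
    }
    flags
}
```

For the two dumps above, regex_flags(0x00) is the empty string and regex_flags(0x07) is "ixm".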

Classes

String

0408 630b 5374 7269 6e67

The typecode 63 is ASCII c and denotes that this object is a class. The length of the class name followed by the class name are next.

Math::DomainError

0408 6316 4d61 7468 3a3a 446f 6d61 696e 4572 726f 72

Namespaces are separated by ::.

Modules

Enumerable

0408 6d0f 456e 756d 6572 6162 6c65

Modules are identical to classes, except the typecode 6d is ASCII m.
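Since classes and modules only differ in the typecode, one small function can produce both. This is a sketch in Rust rather than Ruby’s own implementation, and marshal_named is a made-up name. It leaves out the two-byte version header and only handles names of 1 to 122 bytes, where the length fits the one-byte integer form (the length plus 5).

```rust
/// Encodes a class (`b'c'`) or module (`b'm'`) reference without the
/// version header. Returns `None` for names this sketch does not handle.
fn marshal_named(typecode: u8, name: &str) -> Option<Vec<u8>> {
    let len = name.len();
    if len == 0 || len > 122 {
        return None;
    }
    // Short lengths are stored as a single byte holding length + 5.
    let mut out = vec![typecode, (len + 5) as u8];
    out.extend_from_slice(name.as_bytes());
    Some(out)
}
```

marshal_named(b'c', "String") yields the bytes 63 0b 53 74 72 69 6e 67, and marshal_named(b'm', "Enumerable") starts with 6d 0f, matching the dumps above.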

Instances of user objects

Let’s define a small class to test with.

class DumpTest
  def initialize(a)
    @a = a
  end
end

DumpTest.new(nil)

0408 6f3a 0d44 756d 7054 6573 7406 3a07 4061 30

The typecode 6f is ASCII o, and denotes that this is an object. The class name is next, written as a symbol – :DumpTest. The number of instance variables is encoded as an integer, followed by pairs of name, value. This example has 1 pair of instance variables, [:@a, nil].

Another dip into Ruby's Marshal format

2013-01-16T20:00:00-05:00

In a previous post I started to describe some details of Ruby’s Marshal format. This post goes further: a larger set of integers, IVARs, strings, and object links.

Larger integers

What happens once we go beyond integer values that can be represented in one byte? Marshal simply writes the number of bytes needed to represent the value, followed by the value, least significant byte first. Leading zeroes are not encoded.

123

0408 6901 7b

01 indicates that the value takes up one byte, followed by the value itself.

256

0408 6902 0001

256 requires two bytes.

2**30 - 1

0408 6904 ffff ff3f

This is the largest value you can serialize as an integer. Above this, Marshal starts serializing integers as a “bignum”.

Negative integers

-1

0408 69fa

fa is -6 in two’s complement, which mirrors how 1 is encoded as 6.

-124

0408 69ff 84

Here the first byte is -1 in two’s complement. This indicates that one byte of value follows. The value has had leading FF bytes removed, similar to large positive integers.

-257

0408 69fe fffe

-257 requires two bytes.

-(2**30)

0408 69fc 0000 00c0

This is the most negative value you can serialize as an integer before becoming a bignum.
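Putting the positive and negative rules together, the integer payload can be sketched as follows. This is an illustration in Rust based only on the dumps shown in these posts, not a port of Ruby’s implementation, and marshal_int is a made-up name. It produces the bytes that follow the i typecode and gives up where Marshal would switch to a bignum.

```rust
/// Encodes the integer payload that follows the `i` typecode.
/// Returns `None` outside -(2**30) ..= 2**30 - 1, where the bignum form takes over.
fn marshal_int(value: i32) -> Option<Vec<u8>> {
    const MIN: i32 = -(1 << 30);
    const MAX: i32 = (1 << 30) - 1;
    if value < MIN || value > MAX {
        return None;
    }
    // Small values fit in a single byte, shifted away from zero by 5.
    match value {
        0 => return Some(vec![0x00]),
        1..=122 => return Some(vec![(value + 5) as u8]),
        -123..=-1 => return Some(vec![(value - 5) as u8]),
        _ => {}
    }
    // Larger values: emit bytes least significant first and stop once only
    // sign-extension bytes (00 for positive, FF for negative) would remain.
    let mut rest = value;
    let mut bytes = Vec::new();
    loop {
        bytes.push((rest & 0xff) as u8);
        rest >>= 8; // arithmetic shift keeps the sign
        if rest == 0 || rest == -1 {
            break;
        }
    }
    // Positive values store the byte count, negative values its negation.
    let count = bytes.len() as i8;
    let header = if value > 0 { count } else { -count };
    let mut out = vec![header as u8];
    out.extend(bytes);
    Some(out)
}
```

For example, marshal_int(256) gives 02 00 01 and marshal_int(-257) gives fe ff fe, matching the dumps above.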

IVARs

Hang on to your seats, we’re going to jump into strings. First, however, we need to talk about IVARs. The crucial thing that IVARs bring to the table is the handling of string encodings.

'hello'

0408 4922 0a68 656c 6c6f 063a 0645 54

The typecode 49 is ASCII I and denotes that this object contains instance variables. After all the object data, the number of instance variables is provided. The first instance variable is a special one – it’s the string encoding of the object. In this example the string encoding is UTF-8, denoted by the symbol :E followed by a true.

'hello'.force_encoding('US-ASCII')

0408 4922 0a68 656c 6c6f 063a 0645 46

To represent US-ASCII, :E false is used instead. Both US-ASCII and UTF-8 are common enough string encodings that special indicators were created for them.

'hello'.force_encoding('SHIFT_JIS')

0408 4922 0a68 656c 6c6f 063a 0d65 6e63 6f64 696e 6722 0e53 6869 6674 5f4a 4953

For any other string encoding, the symbol :encoding is used and the full name of the string encoding is written out as a raw string – "Shift_JIS" in this dump.

'hello'.tap {|s| s.instance_variable_set(:@test, nil)}

0408 4922 0a68 656c 6c6f 073a 0645 543a 0a40 7465 7374 30

Additional instance variables follow the string encoding. There are now 2 instance variables. The symbol for the instance variable name :@test comes before the value, nil.

Raw strings

'hello'

0408 4922 0a68 656c 6c6f 063a 0645 54

Raw strings are safely nestled inside an IVAR, and are comparatively very simple. The typecode 22 is ASCII " and denotes that this object is a raw string. The length of the string data is next, encoded in the same form as integers. The string data follows as a set of bytes. These bytes must be interpreted using the encoding from the surrounding IVAR.

Object links

When the same object instance is repeated multiple times, the Marshal encoding allows subsequent instances to reference the first instance to save space in the stream.

a = 'hello'; [a, a]

0408 5b07 4922 0a68 656c 6c6f 063a 0645 5440 06

The typecode 40 is ASCII @. The typecode is followed by the position of the object in the cache table. This cache table is distinct from the symbol cache.
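One visible consequence of object links is that identity survives a round trip. The sketch below compares an array holding the same string twice with an array holding two equal but distinct strings; the variable names are only for illustration.

```ruby
a = 'hello'

shared_dump   = Marshal.dump([a, a])     # second element is written as a link
separate_dump = Marshal.dump([a, a.dup]) # second element is written in full

shared   = Marshal.load(shared_dump)
separate = Marshal.load(separate_dump)

puts shared[0].equal?(shared[1])     # true: still one object after loading
puts separate[0].equal?(separate[1]) # false: two distinct objects
puts shared_dump.bytesize < separate_dump.bytesize # true: the link is shorter
```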

The rest

There are more types that Marshal can handle, but not all of them are interesting. The next post covers regexes, classes, modules, and instances of objects.

A little dip into Ruby's Marshal format

2013-01-15T20:00:00-05:00

I recently tried to resolve a JRuby issue involving Marshal. I’ve used Marshal before, but never needed to pay attention to the actual bytes written to disk. I decided to write up what I learned in the process.

Version number

I collected this data using Ruby 1.9.3p327, which has Marshal version 4.8. The version number is encoded with two bytes, one each for the major and minor version. This version number precedes all dumps and I’ll ignore it for the rest of this post.

Nil, true, false

nil

0408 30

The typecode 30 is ASCII 0.

true

0408 54

The typecode 54 is ASCII T.

false

0408 46

The typecode 46 is ASCII F.

Integers (easy)

0

0408 6900

The typecode 69 is ASCII i. The typecode is followed by the value of the integer. Zero is represented as 00.

1

0408 6906

Here we see that the encoded value for one is 06, not 01 as we might expect at first. This allows for more efficient storage of smaller numbers. -123 <= x <= 122 can be encoded in just one byte.

Arrays

[]

0408 5b00

The typecode 5b is ASCII [. The typecode is followed by the number of elements in the array.

[1]

0408 5b06 6906

The number of items in the array is encoded in the same form as integers. Each value in the array is encoded sequentially after the size of the array.

Hashes

{}

0408 7b00

The typecode 7b is ASCII {. The typecode is followed by the number of (key, value) pairs in the hash.

{1 => 2}

0408 7b06 6906 6907

Like arrays, the number of items in the hash is encoded in the same form as integers. Each pair of (key, value) is encoded sequentially after the size of the hash.

Symbols

:hello

0408 3a0a 6865 6c6c 6f

The typecode 3a is ASCII :. The typecode is followed by the length of the symbol name and then the symbol name itself, encoded as UTF-8.

Symlinks

When a symbol is repeated multiple times, the Marshal encoding allows subsequent instances to reference the first instance to save space in the stream.

[:hello, :hello]

0408 5b07 3a0a 6865 6c6c 6f3b 00

The typecode 3b is ASCII ;. The typecode is followed by the position of the symbol in the cache table. This table is indexed by the order in which the symbol first appeared.

The rest

There’s a lot more to the Marshal format; I haven’t even covered strings yet! You can find more at the next post in this series, or jump right to the last post.

How to explore on your own

To generate the examples for this post, I hacked up a quick helper in irb:

def dump(x)
  File.open('/tmp/out', 'w') {|f| Marshal.dump(x, f)}
  `xxd /tmp/out`
end
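If xxd isn't available, or you would rather not write a temporary file, the same output can be approximated in pure Ruby. This is a sketch; dump_hex is a hypothetical name, and it groups the hex digits in fours to match the examples in this post.

```ruby
# Marshal an object in memory and format the bytes like the examples above.
def dump_hex(x)
  Marshal.dump(x).unpack('H*').first.scan(/.{1,4}/).join(' ')
end

dump_hex(nil)    # => "0408 30"
dump_hex(:hello) # => "0408 3a0a 6865 6c6c 6f"
```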

Conway's Game of Life without return values

2012-12-13T15:14:00-05:00

On 2012-12-08, I attended the Pittsburgh Global Day of Code Retreat facilitated by Joe Kramer and Jim Hurne. As usual, I had a great time, and got to meet new people from the Pittsburgh tech scene. It’s always good for me to remember that there are non-Ruby developers out there! I even started the day off by doing the Game of Life in C#.

One of the more contentious constraints of the day was “no return values”. I feel like I was the only one in the room that liked this constraint at all! As such, I wanted to finish it up to see what my final code and observations would look like.

Goal

As I understand it, the point of this constraint is to explore “tell don’t ask”, with a secondary exploration of mocks vs. stubs.

Constraint modifications

I made some small tweaks to the constraint to deal with how Ruby works and to avoid work orthogonal to the goal.

Allow return values from constructors
Allow return values from standard library classes
Allow return values from private methods

In Ruby, constructors are methods on a Class instance that return a new instance of the class. Since everything in Ruby is an object, it would be impossible to make progress if we didn’t allow creating new objects.

The goal of the constraint isn’t to rewrite all of Ruby’s standard library. If we cannot use return values from the standard library, we couldn’t do something as simple as a = 1 + 1! Our newly-created code will not return values, so it is safe to use return values hidden away inside of our objects.

Allowing private methods to return values isn’t strictly necessary, but it allows us to reduce code duplication. Technically, we could inline the private methods where they are used, but that would be ugly. Since these methods are private, they won’t add to the surface area of our objects and shouldn’t conflict with the goal of the exercise.

Things I liked

I usually start out Conway’s with the ability to see if a cell is alive, followed quickly by the ability to bring a cell to life. This means the first thing I do is rely on return values. This time, I began with the concept of a UI that would be told when a cell is alive. I found this interesting as I usually skip over the display completely, leaving it as a “trivial” thing to be added later.

The Board class came into existence while implementing the time_passes method because I needed to have both the current and next board state. I like that this concept was reified; the Game class deals with coordinating the rules and a board, but the Board class deals with the particulars of the board state.

I was forced into giving human names to more things than I usually would, such as has_two_neighbors, or AliveCellRules. I find that this is the extended version of creating a well-named temporary variable.

Things I didn’t like

There are two rule-related classes, one for alive cells and one for dead cells. The alive cell rules class is almost 100% duplication. This could be reduced using Ruby’s alias at the cost of reduced readability, and still wouldn’t help the duplication in the dead cell rules. It’s hard to tell if this would be good or bad in the absence of future changes, but I don’t like it as it stands now.

I wanted to create a Point class to abstract the concept of x / y coordinates and also to have a place to hang the idea of “neighbors”. Unfortunately, it would have solely existed to return values: a list of points, equality comparisons, etc. I think this would be an ideal example of a value type.

I love Ruby’s Struct; I have written too many class initializers longhand to ever want to go back. As far as I am concerned, Struct reduces the work to make an initializer from O(n) to O(1). Unfortunately, it automatically creates a public attr_accessor, which would be too tempting to use. I also avoided attr_reader for the same reason, even though I could have made the reader private. Seeing all the bare instance variables makes me uncomfortable.

Interesting implementation details

For each public method, I returned self. In Ruby, the last executed statement is implicitly returned. Returning self avoids accidentally relying on a return value. In production code I wouldn’t go this overboard, trusting the caller to not use incidental return values. In a language like Java, I would declare the method as void.

I’ve never used flat_map before, but I’m going to keep my eyes open for more places to use it. I’m not at the point where it comes without thinking, but looking for ary.map{ ... }.flatten(1) should be easy enough. Also, I learned that flatten can take an argument that controls how deep it will go.
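As a small illustration of the equivalence, here is a sketch with a made-up neighbors lambda (not the code from my solution) that returns a list of coordinates for each cell:

```ruby
# Hypothetical helper: the left and right neighbors of a cell.
neighbors = ->((x, y)) { [[x - 1, y], [x + 1, y]] }
cells = [[0, 0], [5, 5]]

with_flatten  = cells.map(&neighbors).flatten(1)
with_flat_map = cells.flat_map(&neighbors)

puts with_flatten == with_flat_map # true
p with_flat_map                    # [[-1, 0], [1, 0], [4, 5], [6, 5]]

# Without the argument, flatten goes all the way down and loses the pairs.
p cells.map(&neighbors).flatten    # [-1, 0, 1, 0, 4, 5, 6, 5]
```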

I swear that there is an existing method that does the equivalent of ary.reject { |x| x == CONSTANT }, but I couldn’t find it. delete will mutate the array in place, which isn’t quite the same.

Tests

As the code progressed, I had to start using RSpec’s as_null_object more frequently. This is because closely situated cells began interacting and would be output to the user interface. I wasn’t interested in these outputs, but they weren’t incorrect. After enough tests needed a null object, I changed the test-wide mock, which may have been too broad a change.

All of the tests that involve time passing have two duplicated lines. These lines could have been pulled into the rarely-used after block. I’ve never seen code that does this, and I’m not sure how I feel about it.

I don’t know what order I prefer for the should_receive calls relative to the rest of the setup. In this case, I chose to put the message expectations at the top of the test block.

Final thoughts

Like most exercises during Code Retreat, preventing return values has benefits and disadvantages. I like how certain concepts were forced to become reified and that I had to think more about the consumer of my code. Contrariwise, I missed being able to use Struct and really wanted a Point.

Will I change how I code because of this? Maybe a little bit. It probably would be good practice to avoid return values at first blush, but I certainly won’t stop using them completely. One thing I might look further into is Ruby 1.9’s Enumerator. This would allow me to provide a nice function that takes a block or returns an enumerable for further chaining.
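A minimal sketch of what that could look like, assuming a hypothetical Board class with an each_alive_cell method (neither name is from my actual solution):

```ruby
class Board
  def initialize(alive_cells)
    @alive_cells = alive_cells
  end

  # With a block: tell the caller about each cell.
  # Without a block: hand back an Enumerator for further chaining.
  def each_alive_cell
    return enum_for(:each_alive_cell) unless block_given?

    @alive_cells.each { |cell| yield cell }
    self
  end
end

board = Board.new([[0, 0], [1, 2]])

board.each_alive_cell { |x, y| puts "#{x},#{y} is alive" }
sums = board.each_alive_cell.map { |x, y| x + y }
p sums # [0, 3]
```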

Feel free to read over the code on GitHub if you are interested!

Refactor and make changes in different commits

2012-11-04T14:31:00-05:00

If you combine refactoring and making a change to your code into the same commit, you are going to have a bad time.

Just in case you’ve forgotten, refactoring is

the process of changing a computer program’s source code without modifying its external functional behavior

When I review a commit that claims to be refactoring, I shift into a very specific mindset. I visualize myself as a world-class goalie, ready to stop any rogue features that come my way; I’m going to stand my ground, no matter what.

Contrast refactoring to adding new functionality. In that case, the author should be adding new objects that fulfill a responsibility, following the open/closed principle. When I review a commit that adds new functionality, I pay attention that the tests cover the new functionality, the minimum amount of code was added, and that the new code is sufficient.

These are very different things to review for.

Why’s it bad?

When you combine refactoring and feature addition into the same commit, you double the work required to review it. In addition to figuring out if each changed line is correct, you also have to figure out what “correct” even means!

Beyond the doubled work, you have to change your mindset for every line of code. That’s an amazing amount of context switching. It’s very hard to thoroughly review each line when it’s hard to even remember what you are reviewing for.

Combining these disparate actions into one commit isn’t something that we do maliciously. In fact, it’s likely the opposite: good programmers have an innate drive to make the code better all the time. Sometimes we see a little problem that we just have to fix up.

The problem is that when we have our programmer hats on, we don’t always think about what this commit will mean to others downstream. This could mean reviewers, approvers, testers, documenters, whatever needs to happen after the commit.

What do I do when my commit is too big?

There are a few main techniques I use when I discover I’ve done work that should be in different commits.

If I haven’t committed yet, and the changes are separate enough, I use git add -pu to add certain lines of code and not others.

If the changes overlap with each other, I will edit a specific section of the file until it looks like the intermediate change I really wanted. I git add the file and immediately revert my editor changes. I then repeat with the next section.

If I have already committed, then I go into an interactive rebase and edit the particular commits that are too big. I often create a throw-away branch so I can easily compare the original and modified branches to make sure they end up the same.

All of these techniques create “false history” – I didn’t really make that small step. After I’m finished, I run a little script that checks out each commit and runs my tests.

Sometimes, trying to preserve my changes isn’t worth the time, or I can see into the future a bit and know ahead of time that I am about to make a big set of changes. In these cases, I try a spike: I make the changes willy-nilly, writing down each step as I do it. Then I throw it away and invert the order of steps. This allows each step to happen in the order I would prefer, and I often improve on each step.

Isn’t too many small commits just as bad?

I’ve heard something like this before:

It’s so small, it doesn’t deserve its own commit

I’ve never had to review a commit that was too small. I have had to review a commit that was too large. I’m willing to take the risk of creating many small commits, especially if all the changes are going to be made one way or another.

If a commit is small, then I can open it, read the commit message, and review it within seconds.

What do I do as a reviewer?

I use a modified version of the single responsibility principle that applies to commits:

A commit should have one, and only one, change.

I try to follow steps like these:

Read the whole commit message. It should have no ands, ors, or buts. If it does, I kick it back to the author and request that the commit be split up into those pieces. Otherwise, I sear the commit message into my brain.
Read the diff of the commit and evaluate each change against the commit message. If a line doesn’t fit with the message, mark the change as unrelated. If it does, review the line as usual.
Sometimes I keep reading the diff once I find an unrelated line, other times I stop at the first one; the original author may be faster at separating the concerns.
Make sure to thank the author when they provide a commit with a single focus – positive reinforcement lets us know that we are on the right track!

Run your tests in a deterministic random order

2012-10-18T18:56:00-04:00

Running your tests in a random order is a good idea to help shake out implicit dependencies between tests. Running your tests in a deterministic random order is even better.

What’s an implicit dependency?

It’s easy to accidentally create order-dependent tests:

it "creates a widget" do
  Widget.create(name: 'Awesome Widget')
  Widget.count.should eql(1)
end

it "deletes a widget" do
  Widget.first.delete # Implicitly requires the first test to have been run
  Widget.count.should eql(0)
end

Why should I care?

Dependencies between tests are bad for a number of reasons:

When a single test fails, you need to run many tests to reproduce the failure. This makes reproduction slower and more annoying.
The test method is no longer complete documentation. The required setup for the test is located in many different methods.
The complexity of the test is hidden. What looks like a two line test may actually comprise hundreds of lines of code. Complex test code is often an excellent indicator of complex production code.

Running tests in a random order isn’t enough; you need to be able to reproduce the same random order before you can fix it! RSpec and MiniTest both offer a way to specify the random seed on the command line or with environment variables. Unfortunately, the Surefire plugin for Maven does not offer a way to specify the seed, even though it allows random ordering.
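The idea behind a seeded order can be shown with nothing but Ruby's standard library. This is an illustration of the principle, not how RSpec or MiniTest implement it; the test names and the run_order helper are made up.

```ruby
TESTS = %w[creates_widget deletes_widget renames_widget lists_widgets].freeze

# Shuffle with a random number generator built from a known seed.
def run_order(tests, seed)
  tests.shuffle(random: Random.new(seed))
end

first_run  = run_order(TESTS, 1234)
second_run = run_order(TESTS, 1234)

puts first_run == second_run # true: same seed, same order, every time
```

With the seed written down next to a failure, anyone can replay exactly the order that exposed it.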

Continuous integration servers

At work, we use Gerrit for code reviews and Jenkins as our CI server. Whenever a new or updated commit is pushed to Gerrit, a build is started in Jenkins. There is also a Jenkins job to build origin/master every 15 minutes if it has been updated.

The Gerrit/Jenkins combination allows you to retrigger a specific build in case there were environmental issues that have since been fixed. Unfortunately for us, retriggering was being used as a way to avoid dealing with test failures due to order dependencies. To encourage us to stop and address our order dependency problem, we updated both jobs to use a deterministic seed.

For the Gerrit builds, we used the Gerrit change number, which remains constant across multiple revisions of the same commit. The Gerrit plugin makes this value available as an environment variable during script execution.

SPEC_OPTS="--seed $GERRIT_CHANGE_NUMBER" rspec

For the origin/master build, we chose to use the Git hash of the commit. Since the hash contains letters, we used a shell one-liner to scrape out something that looks reasonable as a seed.

SEED=$(git rev-parse HEAD | tr -d 'a-z' | cut -b 1-5)
SPEC_OPTS="--seed $SEED" rspec
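To make it clear what the shell one-liner produces, here is the same transformation as a Ruby sketch. The seed_from_sha name and the example hash are made up for illustration.

```ruby
# Drop the lowercase letters from a commit hash and keep the first five
# characters of what is left, just like `tr -d 'a-z' | cut -b 1-5`.
def seed_from_sha(sha)
  sha.delete('a-z')[0, 5]
end

seed_from_sha('9fceb02d0ae598e95dc970b74767f19372d61af8') # => "90205"
```

A hash with fewer than five digits gives a shorter seed, which is still fine for this purpose.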

Does it work?

Just a few days after making the above changes, another developer came to me with a strange problem. His commit was unable to pass the tests in Gerrit, but the failing test had nothing to do with his changes. We ran the tests locally using the seed from the Jenkins server and were able to reproduce the problem. Ultimately, we traced the problem to a request spec that modified some core configuration settings and didn’t reset them successfully. Success!

Watch out for lost updates when using Capybara with Selenium

2012-10-10T19:41:00-04:00

At work, I am still working on finding and squashing fun test failures. In this case, “fun” means tests that have an intermittent failure rate of 5% (or less!). The test issue I worked on today had to do with the “lost update” problem.

The lost update problem

Growing Object-Oriented Software, Guided by Tests has a great description and diagram of the problem:

The short version is that when you poll a system for its state, it’s entirely possible to miss the state you are looking for. In the diagram, the color changes to red and then to blue before the test ever has a chance to see that it was red. Since this system will never go back to red, the test will incorrectly fail.
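The effect is easy to simulate. In this sketch the timeline and the polled_colors helper are invented for illustration: the system is red for a single tick, and a poller that samples too slowly never sees it.

```ruby
# One entry per tick: the color the system shows at that moment.
TIMELINE = [:green, :red, :blue, :blue, :blue, :blue].freeze

# A poller that only looks at the system every `interval` ticks.
def polled_colors(timeline, interval)
  timeline.each_slice(interval).map(&:first)
end

fast_poller = polled_colors(TIMELINE, 1)
slow_poller = polled_colors(TIMELINE, 3)

puts fast_poller.include?(:red) # true: every tick is sampled
puts slow_poller.include?(:red) # false: samples land on ticks 0 and 3
```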

The lost update problem in Capybara

Like many other sites, we use the DataTables jQuery plugin to show tabular data. A test that ensured that the filtering worked looked something like this:

def wait_for_table_loading
  dialog = page.find('.loading_dialog')
  wait_until { dialog.visible? }
end

def wait_for_table_ready
  dialog = page.find('.loading_dialog')
  wait_until { ! dialog.visible? }
end

it 'filters the list' do
  visit list_path
  click_on 'Filter by active'
  wait_for_table_loading
  wait_for_table_ready
  page.all('.data-item').should have(3).items
end

Enabling filtering triggers some slow backend activity, which brings up the loading dialog. The test waits for that dialog to appear and disappear before continuing on. Now the entire table is populated and we can safely see how many elements are in the table.

However, the test will fail if the backend is too fast. The loading dialog will appear and disappear almost immediately. The test will time out waiting for the loading dialog that will never appear again. This behavior can be reliably replicated by adding a sleep to the test between lines 13 and 14, that is, after click_on and before wait_for_table_loading.

A Capybara solution

In order to make the test more robust, I rewrote it as:

it 'filters the list' do
  visit list_path
  click_on 'Filter by active'
  page.should have_css('.data-item', :count => 3)
end

The test now ignores the loading dialogs completely, instead asking Capybara to find a particular number of elements. Asking Capybara to find things in this manner will let the test leverage the built-in waiting facilities of Capybara.

In this test, the number of data items won’t change once the table is loaded, so it is a safe state to poll. As an additional benefit, the test now has fewer lines of code and is clearer.

As a downside, when the test fails, the Capybara error message doesn’t include how many items were found, which isn’t as informative as the equivalent message from the RSpec matcher.

Also, this test still ultimately relies on polling the DOM, so it’s possible for similar bugs to pop up in the future.

The GOOS solution to the lost update problem

GOOS provides a solution to the lost update problem that can avoid the problems with polling completely:

The system under test must be modified to provide notifications when something interesting happens. This system now has a listener that is notified when the color changes and what the color is changed to. The test supplies a simple listener that accumulates the changes and offers a nice API suited for the tests.
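In plain Ruby, the shape of that solution might look like the following sketch. TrafficLight and ColorRecorder are hypothetical names standing in for the system under test and the test's listener.

```ruby
# Hypothetical system under test: it notifies listeners on every change.
class TrafficLight
  def initialize
    @listeners = []
  end

  def on_change(&listener)
    @listeners << listener
    self
  end

  def change_to(color)
    @listeners.each { |listener| listener.call(color) }
    self
  end
end

# Test-side listener: accumulates every change so that none can be missed.
class ColorRecorder
  def initialize(light)
    @seen = []
    light.on_change { |color| @seen << color }
  end

  def saw?(color)
    @seen.include?(color)
  end
end

light = TrafficLight.new
recorder = ColorRecorder.new(light)
light.change_to(:red).change_to(:blue)

puts recorder.saw?(:red) # true, even though the light is blue by now
```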

A hypothetical Capybara solution without polling

I can imagine a Capybara test that looks something like this:

it 'filters the list' do
  visit list_path
  wait_for_js_event('table.loaded') do
    click_on 'Filter by active'
  end
  page.all('.data-item').should have(3).items
end

Under the hood, there’s some extra JavaScript going on. The wait_for_js_event method would inject some JavaScript into the running Selenium session that creates an event listener and binds it to the given event. This listener just collects all the events it receives. After yielding the block, the test code then polls the event listener, waiting for the event to be captured.

It’s entirely possible that code that does this already exists, but I don’t know of it. It wouldn’t be a large amount of code to write, but it would straddle the borders of Capybara, Selenium and JavaScript.

This might be a useful thing for Jasmine tests, so it might already exist in that ecosystem.

Finding a race condition in Capybara with Selenium

2012-10-08T20:40:00-04:00

At work, we’ve been using Capybara and Selenium to test our newest web application. Many of us have used this combination before for our own projects, but it’s new territory for a work project.

Every so often, we would get this error from a specific test:

Selenium::WebDriver::Error::StaleElementReferenceError:
  Element not found in the cache - perhaps the page has changed since it was looked up

The error was intermittent, so we fell into the seductive but dangerous trap of simply rerunning our tests whenever it failed. Recently, I had a bit of time and decided to dig into it and fix it once and for all.

My first task was to see if I could reproduce the error locally. We often saw the error when running the tests on our Jenkins continuous integration server, so there was the possibility that the problem was environmental. However, we also knew that the failure was intermittent, so we couldn’t be sure it was environmental even if the test passed locally a few times.

I rigged up a small shell script to simply run the test over and over again while I wandered away from my computer. The script looked something like:

(test-script-runner.sh)

#!/bin/bash
set -eu

failures=0

for run in `seq 20`; do
    if ! rspec -e 'the bad test'; then
        failures=$(($failures + 1))
    fi

    echo "Test run $run complete, $failures failures"
done

I’m sure there’s a proper statistical manner to determine how many times the test would have to be run without failing to be reasonably certain that the test won’t fail, but I didn’t have to worry about that – the test failed somewhere within the first ten or so runs.

Now that I knew the test could fail on my local setup, it was time to dig into what the test was doing. The test was fairly concise and readable (which I highly appreciated) and looked something like:

it 'deletes the element', :js => true do
  visit path_to_the_page
  click_on 'Remove item'
  page.should_not have_css(".item", text: "Old text")
end

The exception was coming from line #4 – when the test made the first assertion about the elements. Unfortunately, the stacktrace isn’t very useful, as it mostly contains references to the JavaScript running inside of Firefox. The exception text indicates that the test has a reference to an element, but it isn’t available in the cache anymore. Considering that the test just deleted the element, this is certainly suspicious.

At this point, I cloned the capybara repository and started poking around. A git grep quickly found where has_no_css? was defined. Following the thread of code led to has_no_selector?, which calls the all method. This method had a pretty clear split between the “finding” part of the code and the “filtering” part. There was no magic I used here to see this, just previous experience debugging race conditions.

I opened up the installed gem and inserted a sleep directly into the code between the “finding” and “filtering” sections. It’s ugly doing this, but it’s good to try to not change too many things at once when debugging. I played with the sleep value a bit and eventually found a value that reliably reproduced the failure. Success!

Well, maybe not complete success, but at least a step in the right direction. Even though I could reproduce the problem, I had only reproduced it in our production application, and I had modified my installed gem directly. It was time to make a nice test case.

I created a new Rails app and added the requisite RSpec gems. Since we only need a simple HTML page with a bit of JavaScript to remove the element, I modified the index.html that ships with Rails to have the JavaScript inline and created an element and link to wire the action to.

Since I knew that I would want to make changes to Capybara, I used the :path parameter in the Gemfile to point to my local checkout of Capybara. This is an awesome feature of Bundler that you might not know about. It also means I’m not messing with my generally-available copy of Capybara, which is good for my sanity.

I then created a stripped-down version of the test, the same as the example above. After getting everything hooked up, I ran the test but it didn’t fail. This was bad news – I had done a few big steps between the production app and the smaller test case – which one of them could have changed the behavior?

This is where my knowledge of our production system came in useful. In that application, we aren’t just removing something from the page, we are persisting that deletion to disk. Doing that can add some time before the JavaScript fires to remove the item. I changed the test JavaScript to have a delay less than the delay in Capybara and ran the test again. It failed, just like we wanted it to. To be sure, I ran the test case a bunch of times to make sure it always failed and for the expected reason. Success!

Well, almost. Even though I had a test case, I still needed to show that code to someone who could do something about it. Checking back at the Capybara website, I looked for how to submit a ticket. Right at the top is a nice, clear comment:

Need help? Ask on the mailing list (please do not open an issue on GitHub)

So, I pushed my changes to Capybara to my fork and updated my test app to use a remote git version of the gem (another cool Bundler feature). I then pushed my test case to GitHub as well. I took a bit of time to create a short README so that anyone stumbling on the test app would have a clue as to what it was.

After that, it was just a small matter to write up a clear email to the Capybara list. I still find emailing new lists scary. Who knows how the list will respond? This time I got a nice surprise:

That’s a very nice bug report, Jake.

It appears to be a bug indeed. I’ve been able to reproduce it on master as well.

A GitHub issue has been opened, and the bug is well on the way to being fixed. Yay for Open Source!

Name your variables by the roles they play

2012-10-03T19:49:00-04:00

Have you ever seen a variable with a terrible name? This is of course a trick question; everyone has. I’d like to look at a particular variable-naming annoyance: naming the variable based on the class name.

In a statically-typed language without type inference, like Java, you have likely seen something like this:

FooBarZed fooBarZed = new FooBarZed(true);

In a dynamically typed language, like Ruby, it would look more like this:

foo_bar_zed = FooBarZed.new(true)

This style of code may make sense to you. Maybe your class names are self-describing, and so duplicating the class name as the variable name is just “reusing a good thing”.

The problem with naming variables in this style is that the class name, at best, describes what is special about an object or how the object works. When you are using the object, you want to know why you are using it – you want to know the role of the object:

# this variable name says nothing new to the reader of the code
url_fetcher = UrlFetcher.new('http://example.com/')
# this variable name explains why we want to do something
conversion_rate_fetcher = UrlFetcher.new('http://example.com/')

As a pragmatic argument, think how often you rename a class to better describe it. Now think how often you’ve changed a class name AND all the variables to match the new class.

The role that an object plays changes at a vastly different rate than the name of the class. It’s unlikely that the role will change dramatically, as that new role would probably be better represented by a brand new object. Similarly, code that uses that object is unlikely to want an object that fills a different role without a rewrite of how the calling code works.

Using Ruby blocks to ensure resources are cleaned up

2012-10-01T18:05:00-04:00

In programming, cleaning up resources you have created is an easily-overlooked problem. In languages like C, you have to clean up everything by hand: memory, files, network sockets, etc. Languages that have a garbage collector take away the need to explicitly free memory, but you still have to manage the other resources.

In Ruby, we can use blocks to help ensure resources are closed. You’ve probably seen this idiom when dealing with files. The File class ensures that the file is closed after the block is finished:

File.open('file.txt') do |file|
  # ... work with the file ...
end

However, this can only be used when the lifespan of the resource is one method call. If you need to keep the file around in an instance variable, then you cannot use this pattern, and must fall back to explicitly closing the resource:

class Foo
  def initialize
    @file = File.open('file.txt')
  end

  def close
    @file.close
  end
end

To make this code nicer on yourself and others that have to use it, you should add a method that handles closing the resource for you, just like File does:

class Foo
  def self.open
    foo = Foo.new
    yield foo
  ensure
    foo.close
  end
end

Unfortunately, this clean implementation of open has a problem if the constructor of Foo can throw an exception, for example when File.open fails. The exception will occur before the variable foo can be set to anything, so the close message will be sent to nil instead, causing another exception! This certainly didn’t happen in any code I was writing…

To handle this case, we have to give up on using the implicit begin block from the function and create our own scope:

class Foo
  def self.open
    foo = Foo.new
    begin
      yield foo
    ensure
      foo.close
    end
  end
end

Now we can safely create a Foo and close it all in one place, but what if another object wants to keep an instance of Foo around for longer? It’s nice to transparently handle both cases:

class Foo
  def self.open
    foo = Foo.new

    return foo unless block_given?

    begin
      yield foo
    ensure
      foo.close
    end
  end
end
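
Both calling styles can be exercised with a short sketch. It assumes a variant of Foo that takes the path as an argument and exposes a closed? method, and it uses a hypothetical demo_both_forms helper with a temporary file; none of these appear in the class above.

```ruby
require 'tempfile'

# Variant of Foo for the demo: the path argument and closed? are assumptions.
class Foo
  def self.open(path)
    foo = Foo.new(path)

    return foo unless block_given?

    begin
      yield foo
    ensure
      foo.close
    end
  end

  def initialize(path)
    @file = File.open(path)
  end

  def close
    @file.close
  end

  def closed?
    @file.closed?
  end
end

# Returns [closed after the block form?, still open after the non-block form?]
def demo_both_forms
  result = nil
  Tempfile.create('foo-demo') do |tmp|
    from_block = nil
    Foo.open(tmp.path) { |foo| from_block = foo }

    kept = Foo.open(tmp.path)   # the caller now owns the resource
    still_open = !kept.closed?
    kept.close                  # and has to close it explicitly

    result = [from_block.closed?, still_open]
  end
  result
end

p demo_both_forms
```

The block form closes the resource for you, while the non-block form hands ownership to the caller.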

Be careful when using JUnit's expected exceptions

2012-09-26T18:52:00-04:00

For many people, JUnit is the grand-daddy of testing frameworks. Even though other testing frameworks came first, a lot of people got their start with JUnit.

People often start out testing with simple Boolean assertions, then move on to substring matching, then maybe on to mocks and stubs. At some point, however, most people want to assert that their code throws a particular exception, and that’s where our story starts.

When JUnit 3 was the latest and greatest, you were supposed to catch the exception yourself and fail the test if no such exception was thrown. Here’s an example I tweaked from Lasse’s blog and the JUnit documentation for @Test.

@Test
public void test_for_npe_with_try_catch() {
    try {
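        // Stand-in for the code under test. A bare throw statement here
        // would make the fail() call unreachable, which javac rejects.
        Object nothing = null;
        nothing.toString();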
        fail("should've thrown an exception!");
    } catch (NullPointerException expected) {
        // go team!
    }
}

With the newest versions of JUnit 4 (4.11 at the time of writing), there are two more options available to you: the @Test annotation and the ExpectedException rule.

@Test(expected = NullPointerException.class)
public void test_for_npe_with_annotation() {
    throw new NullPointerException();
}

@Rule
public ExpectedException thrown = ExpectedException.none();

@Test
public void test_for_npe_with_rule() {
    thrown.expect(NullPointerException.class);
    throw new NullPointerException();
}

Both of these forms offer a lot in the way of conciseness and readability, and I prefer to use them when I need to test this kind of thing. However, both forms can let a test pass when it shouldn’t if more than one line in the test can throw the expected exception:

@Test(expected = NullPointerException.class)
public void test_for_npe_but_which_one() {
    CoolObject obj = new CoolObject(null);
    obj.doSomeSetupWork(42);  // What actually throws the exception
    obj.calculateTheAnswer(); // What we want to throw the exception
}

In languages that have lambdas or equivalents, this problem is easily avoided. For example, you can use expect and raise_error in RSpec:

it 'throws_a_npe' do
  obj = CoolObject.new(nil)
  obj.do_some_setup_work(42)
  expect { obj.calculate_the_answer }.to raise_error(NoMethodError)
end
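
The same scoping idea can be sketched in plain Ruby, without RSpec. The raises? helper and this toy CoolObject are hypothetical, written only to show that the assertion covers the block and nothing else, so an exception from the setup call would propagate instead of being counted as a pass.

```ruby
# Toy stand-in for the CoolObject used in the examples (hypothetical behavior).
class CoolObject
  def initialize(value)
    @value = value
  end

  def do_some_setup_work(_count)
    # Nothing to do in this sketch. If this raised, the error would escape
    # before raises? was ever called.
  end

  def calculate_the_answer
    @value.upcase   # raises NoMethodError when @value is nil
  end
end

# True only if the block itself raises the given exception class.
def raises?(exception_class)
  yield
  false
rescue exception_class
  true
end

obj = CoolObject.new(nil)
obj.do_some_setup_work(42)
puts raises?(NoMethodError) { obj.calculate_the_answer }
```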

Alternate solutions

Until a version of Java is released with lambdas, I see no better solution than using try-catch blocks, the old JUnit 3 way. You could define an abstract class and then create anonymous subclasses in the test to have the desired level of granularity. This is a pretty bulky syntax: any variables you use in the object would need to be declared final, and then you have to explicitly run the code!

@Test
public void test_for_one_of_two_npe_bulky_syntax() {
    final CoolObject obj = new CoolObject(null);
    obj.doSomeSetupWork(42);

    new GonnaThrowException(NullPointerException.class) {
        public void test() {
            obj.calculateTheAnswer();
        }
    }.run();
}

If you can rephrase the problem slightly, you might be able to use the fact that ExpectedException can assert on the exception message to restrict your test. If you know that only your error can include a certain string, then checking for that string could prevent tests from passing when they shouldn’t.

Another solution would be to modify your code or tests so that you don’t have to deal with the problem in the first place. If you can move the setup code into a @Before block, then the exception wouldn’t be caught by the test. If you can change your code so it cannot throw the exception multiple ways, or if it throws different exceptions, then that would also allow you to sidestep the problem.

Update 2012-09-27

David Bradley points out that if you configure the JUnit rule right before the expected exception, you can reduce the possibility of error. Unfortunately, exceptions thrown after the desired line will still cause the test to pass incorrectly. This may not be a problem in practice, as you are unlikely to continue the test after an exception should be thrown, and most Java tests do not have a teardown phase.

@Rule
public ExpectedException thrown = ExpectedException.none();

@Test
public void test_for_npe_with_rule_at_last_moment() {
    CoolObject obj = new CoolObject(null);
    obj.doSomeSetupWork(42);

    thrown.expect(NullPointerException.class);
    obj.calculateTheAnswer();

    // Any exceptions here will still cause the test to pass incorrectly
}

A refactoring example: lots of if-else statements on strings

2012-09-24T12:00:00-04:00

I recently did a bit of work that turned out to be a great exercise for refactoring a huge sequence of if-else statements based on strings. There are a few ugly bits left, so I’m still poking at it, but I am pleased with my progress so far.

While the original code was Java, the meat of the problem can be easily shown in Ruby. Translating it to Ruby also makes it easier to make sure I don’t accidentally share any proprietary information!

The problem involves processing a hunk of XML to create nested configuration objects. The original implementation used a sequence of if-else blocks, but the Ruby version would have used a case statement.

class ObjectsFromXML
  def process(parent, element)
    new_object = nil

    case element.name
    when 'foo'
      f = Foo.new(element['name'])
      parent.add(f)
      new_object = f
    when 'bar'
      b = Bar.new(element['name'])
      b.weight = element['weight'].to_f
      parent.add(b)
      new_object = b
    # ... about 20 of these cases in total
    else
      raise "Invalid node"
    end

    element.children.each { |c| process(new_object, c) }
  end
end

This function has some issues: it is pretty long, and the cases are similar to one another but different enough to be annoying. The code certainly doesn’t try to adhere to the Single Responsibility Principle!

My first step was to split the blocks into classes with a common interface.

class FooHandler
  def process(parent, element)
    Foo.new(element['name']).tap do |f|
      parent.add(f)
    end
  end
end

class BarHandler
  def process(parent, element)
    Bar.new(element['name']).tap do |b|
      b.weight = element['weight'].to_f
      parent.add(b)
    end
  end
end

class ObjectsFromXML
  def process(parent, element)
    new_object = nil

    case element.name
    when 'foo'
      new_object = FooHandler.new.process(parent, element)
    when 'bar'
      new_object = BarHandler.new.process(parent, element)
    # ... other cases
    else
      raise "Invalid node"
    end

    element.children.each { |c| process(new_object, c) }
  end
end

This corrals the item-specific code to an item-specific class. If I need to change how the Bar class is created, only the BarHandler class needs to be updated.

class FooHandler
  # As above...
end

class BarHandler
  # As above...
end

class ObjectsFromXML
  attr_reader :handlers

  def initialize
    @handlers = {}
    handlers['foo'] = FooHandler.new
    handlers['bar'] = BarHandler.new
    # ... other cases
  end

  def process(parent, element)
    handler = handlers.fetch(element.name) { raise "Invalid node" }
    new_object = handler.process(parent, element)
    element.children.each { |c| process(new_object, c) }
  end
end

Now all the handlers are in a hash, keyed by the expected element name. This allows me to pull out the correct handler and go. The process function now only needs to be concerned with picking the right handler and dealing with children elements.

class FooHandler
  # As above...

  def element_name
    'foo'
  end
end

class BarHandler
  # As above...

  def element_name
    'bar'
  end
end

class Handlers
  def initialize
    @handlers = {}
  end

  def add(handler)
    @handlers[handler.element_name] = handler
  end

  def process(parent, element)
    handler = @handlers.fetch(element.name) { raise "Invalid node" }
    handler.process(parent, element)
  end
end

class ObjectsFromXML
  attr_reader :handlers

  def initialize
    @handlers = Handlers.new
    handlers.add(FooHandler.new)
    handlers.add(BarHandler.new)
    # ... other cases
  end

  def process(parent, element)
    new_object = handlers.process(parent, element)
    element.children.each { |c| process(new_object, c) }
  end
end

Now we have moved the concept of the expected element name into the handler itself. This makes sense, as each handler should know what name it expects, not some other piece of code. I also took the opportunity to move the code purely related to the handlers to a new class that is highly focused on that one responsibility.
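
To try this stage end to end without a real XML parser, the sketch below adds hypothetical stand-ins: Element, Node, and simple Foo and Bar classes are assumptions made for the demo, as is the build_demo_tree helper. The handler classes follow the code above.

```ruby
# Hypothetical stand-ins for the XML element and the configuration objects.
class Element
  attr_reader :name, :children

  def initialize(name, attributes = {}, children = [])
    @name = name
    @attributes = attributes
    @children = children
  end

  def [](key)
    @attributes[key]
  end
end

class Node
  attr_reader :name, :children

  def initialize(name)
    @name = name
    @children = []
  end

  def add(child)
    @children << child
  end
end

class Foo < Node; end

class Bar < Node
  attr_accessor :weight
end

# The handlers and dispatcher, following the code above.
class FooHandler
  def element_name
    'foo'
  end

  def process(parent, element)
    Foo.new(element['name']).tap { |f| parent.add(f) }
  end
end

class BarHandler
  def element_name
    'bar'
  end

  def process(parent, element)
    Bar.new(element['name']).tap do |b|
      b.weight = element['weight'].to_f
      parent.add(b)
    end
  end
end

class Handlers
  def initialize
    @handlers = {}
  end

  def add(handler)
    @handlers[handler.element_name] = handler
  end

  def process(parent, element)
    handler = @handlers.fetch(element.name) { raise 'Invalid node' }
    handler.process(parent, element)
  end
end

class ObjectsFromXML
  def initialize
    @handlers = Handlers.new
    @handlers.add(FooHandler.new)
    @handlers.add(BarHandler.new)
  end

  def process(parent, element)
    new_object = @handlers.process(parent, element)
    element.children.each { |c| process(new_object, c) }
  end
end

# Equivalent of <foo name="outer"><bar name="inner" weight="1.5"/></foo>.
def build_demo_tree
  inner = Element.new('bar', { 'name' => 'inner', 'weight' => '1.5' })
  outer = Element.new('foo', { 'name' => 'outer' }, [inner])
  Node.new('root').tap { |root| ObjectsFromXML.new.process(root, outer) }
end

bar = build_demo_tree.children.first.children.first
puts "#{bar.class} #{bar.name} #{bar.weight}"
```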

Some further refinements happened after this last point. The ObjectsFromXML class became another Handler class, which made it the same level of abstraction as the other handlers and removed a redundant process method. The return value was removed because it wasn’t used except in a few tests. Iterating over children was moved to each class that could contain children.
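
The refactored code for these refinements isn't shown here, so the following is only one guess at its shape. It assumes each handler is passed the dispatcher so that container handlers can walk their own children. It leaves out the top-level ObjectsFromXML handler, and uses hypothetical Element and Container stand-ins plus a build_refined_tree helper.

```ruby
# Hypothetical stand-ins: an element with a name and children, and a
# container object that records what was added to it.
Element = Struct.new(:name, :children)

class Container
  attr_reader :items

  def initialize
    @items = []
  end

  def add(item)
    @items << item
  end
end

class Foo < Container; end   # may contain children
class Bar; end               # leaf

class Handlers
  def initialize
    @handlers = {}
  end

  def add(handler)
    @handlers[handler.element_name] = handler
    self
  end

  # No return value: each handler attaches what it builds to the parent.
  def process(parent, element)
    handler = @handlers.fetch(element.name) { raise 'Invalid node' }
    handler.process(parent, element, self)
    nil
  end
end

class FooHandler
  def element_name
    'foo'
  end

  # Foo can hold children, so its handler walks them itself.
  def process(parent, element, handlers)
    foo = Foo.new
    parent.add(foo)
    element.children.each { |child| handlers.process(foo, child) }
  end
end

class BarHandler
  def element_name
    'bar'
  end

  # Bar is a leaf, so there is nothing to iterate over.
  def process(parent, _element, _handlers)
    parent.add(Bar.new)
  end
end

def build_refined_tree
  handlers = Handlers.new.add(FooHandler.new).add(BarHandler.new)
  tree = Element.new('foo', [Element.new('bar', []), Element.new('bar', [])])
  Container.new.tap { |root| handlers.process(root, tree) }
end

root = build_refined_tree
puts root.items.first.items.map(&:class).inspect
```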

The stages of code review

2012-07-01T13:37:00-04:00

We recently started using gerrit to perform code reviews for a legacy C codebase that I work on. I also help out on a couple of newer Java and Ruby projects that have had the benefit of having code reviews and testing infrastructure from day one.

Starting to use gerrit on our C code has led me to think about how we approach code reviews, and I’ve identified some stages that we have gone through. They are loosely sorted by the order in which we adopted each check. While not every commit needs each point, this is a general idea of what might be required.

Functionality

This was what started us on the code review path – sometimes we would commit something that just didn’t work right. Occasionally, the code wouldn’t even compile! To try and address these problems, we would have a coworker give the code a once-over before pushing it. We actually started doing this long before gerrit by walking over to another desk.

Function names

One of our explicit coding conventions is that non-static functions must be prefixed with the module name they belong to. This keeps us sane and helps prevent name collisions. We also have a few conventions for constructors, destructors and other common patterns. These are all easy to check for and were something we started doing almost immediately.

Resource leaks

After a few annoying memory leaks got committed, we started looking at the code with a critical eye for all kinds of leaks. Resource leaks are easy enough to add and subsequently miss, especially in C. Leaks are really just a special type of non-functioning code, but one that bites you days/weeks/years later instead of immediately.

Sometimes we use a tool such as valgrind to test for leaks, but in many cases we just inspect the code visually. We check to see if resources are handled consistently and pay special attention to various error conditions.

Efficiency

For better or worse, we almost always worry about how optimized our code is. Sometimes this can be a valuable exercise, but in most situations it is probably overkill. There’s just a warm fuzzy feeling when you catch an O(n²) algorithm that could be O(n log n), even if you spend more time coming up with the faster algorithm than will ever be saved in runtime. Since this focus on optimization is part of our culture, it has found its way into our reviews.

Tests

Tests fall lower on this list than I would prefer. Unit testing C code is, at best, hard and/or annoying. Add the fact that trying to test code that was never designed to be tested is painful, and you can easily see why we tend to turn a blind eye when a commit doesn’t add any new tests.

Documentation

Code written in C should probably have more documentation than most other languages, simply due to all the ways you can shoot yourself in the foot. For example, you can’t tell if any given function will take ownership of the pointer you pass it; that information has to be documented somewhere. When we prefix our function names with the module name, the descriptive part of the name is often shortened to prevent extremely long function names. This means the function documentation is vital to understand what the function does.

Reviewing documentation often comes down to checking that it makes sense and is proper English. It’s impressive how mangled a sentence can get when you refactor the code it is trying to describe.

Coding style

Many different aspects of code fall into “style”: the contents, formatting and spelling of comments; the names of variables, static functions, and structures; the size and complexity of functions; the contents of structures.

These stylistic issues can be difficult to talk about in a code review, since sometimes it comes down to personal preference. The best you can do is express your preference and try to sway the other developer to your line of thinking. It helps if both people realize that the code has to be read and maintained by the entire development group.

Test style

Test code style is an entirely different kettle of fish from production code style. A test needs to focus on how a user would want to use the code. It should minimize the clutter required to perform the test so as to make the test as readable as possible, while still highlighting the interesting part under test.

Similar to code style, the difficulty of reviewing tests comes from differences in personal preference. This is compounded by the fact that we are not as experienced with writing great tests yet.

Commit message

Right now, this is my holy grail of code reviews. I once spent 15 minutes skimming through the git log to determine if we had a preferred verb tense in our commit messages (we did). More reasonably, this can involve ensuring that commit messages are capitalized consistently and that they describe why a change is being made, not just what or how.

Where we are now

The C project is currently somewhere around the “Tests” or “Documentation” stages. The Java and Ruby projects expect tests and documentation, so they hover around the “Coding style” and “Test style” stages; I’ve only had one or two opportunities to correct someone’s verb tense :–).