Using Rust objects from other languages
Let’s create a Rust object that will tell us how many people live in each US ZIP code. We want to be able to use this logic in other languages, but we only need to pass simple primitives like integers or strings across the FFI boundary. The object will have both mutable and immutable methods. Because callers cannot look inside the object, this is often referred to as an opaque object or an opaque pointer.
extern crate libc;
use libc::c_char;
use std::collections::HashMap;
use std::ffi::CStr;
pub struct ZipCodeDatabase {
population: HashMap<String, u32>,
}
impl ZipCodeDatabase {
fn new() -> ZipCodeDatabase {
ZipCodeDatabase {
population: HashMap::new(),
}
}
fn populate(&mut self) {
for i in 0..100_000 {
let zip = format!("{:05}", i);
self.population.insert(zip, i);
}
}
fn population_of(&self, zip: &str) -> u32 {
self.population.get(zip).cloned().unwrap_or(0)
}
}
#[no_mangle]
pub extern "C" fn zip_code_database_new() -> *mut ZipCodeDatabase {
Box::into_raw(Box::new(ZipCodeDatabase::new()))
}
#[no_mangle]
pub extern "C" fn zip_code_database_free(ptr: *mut ZipCodeDatabase) {
if ptr.is_null() {
return;
}
unsafe {
// Retake ownership; dropping the Box frees the allocation.
drop(Box::from_raw(ptr));
}
}
#[no_mangle]
pub extern "C" fn zip_code_database_populate(ptr: *mut ZipCodeDatabase) {
let database = unsafe {
assert!(!ptr.is_null());
&mut *ptr
};
database.populate();
}
#[no_mangle]
pub extern "C" fn zip_code_database_population_of(
ptr: *const ZipCodeDatabase,
zip: *const c_char,
) -> u32 {
let database = unsafe {
assert!(!ptr.is_null());
&*ptr
};
let zip = unsafe {
assert!(!zip.is_null());
CStr::from_ptr(zip)
};
let zip_str = zip.to_str().unwrap();
database.population_of(zip_str)
}
The struct is defined in a normal way for Rust. One extern function is created for each function of the object. C has no built-in namespacing concept, so it is normal to prefix each function with a package name and/or a type name. For this example, we use zip_code_database. Following normal C conventions, a pointer to the object is always provided as the first argument.
To create a new instance of the object, we box the result of the object’s constructor. This places the struct onto the heap where it will have a stable memory address. This address is converted into a raw pointer using Box::into_raw.
This pointer points at memory allocated by Rust; memory allocated by Rust must be deallocated by Rust. We use Box::from_raw to convert the pointer back into a Box<ZipCodeDatabase> when the object is to be freed. Unlike the other functions, we do allow NULL to be passed, but simply do nothing in that case. This is a nicety for client programmers.
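The allocate/free round trip described above can be sketched in isolation. This is a minimal illustration rather than part of the example library: the Tracked type, the DROPS counter, and the tracked_new, tracked_free, and lifecycle_demo functions are hypothetical names introduced here, and the #[no_mangle] attribute is left out so the snippet compiles as an ordinary Rust program.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical stand-in for ZipCodeDatabase that counts how often it is dropped.
static DROPS: AtomicUsize = AtomicUsize::new(0);

pub struct Tracked {
    value: u32,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

pub extern "C" fn tracked_new(value: u32) -> *mut Tracked {
    // Move the value to the heap and hand ownership of the allocation to the caller.
    Box::into_raw(Box::new(Tracked { value }))
}

pub extern "C" fn tracked_free(ptr: *mut Tracked) {
    if ptr.is_null() {
        return; // freeing NULL is a no-op
    }
    // Retake ownership; dropping the Box runs Drop and releases the memory.
    unsafe { drop(Box::from_raw(ptr)) };
}

/// Runs one create/use/free cycle and returns how many drops it observed.
pub fn lifecycle_demo() -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    let ptr = tracked_new(7);
    assert!(!ptr.is_null());
    assert_eq!(unsafe { (*ptr).value }, 7);
    tracked_free(std::ptr::null_mut()); // has no effect
    tracked_free(ptr); // the destructor runs here
    DROPS.load(Ordering::SeqCst) - before
}
```

Between tracked_new and tracked_free, Rust no longer tracks the allocation at all, which is why the destructor only runs once the pointer is handed back.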
To create a reference from a raw pointer, you can use the terse syntax &*, which indicates that the pointer should be dereferenced and then re-referenced. Creating a mutable reference is similar, but uses &mut *. As with other pointers, you must ensure that the pointer is not NULL.
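As an alternative to asserting, the standard library's as_ref and as_mut methods on raw pointers fold the NULL check into an Option. The sketch below assumes a hypothetical Counter type with counter_hit and counter_hits functions, none of which belong to the example library. It reports a NULL pointer through the return value instead of aborting.

```rust
// Hypothetical opaque object used only for this illustration.
#[derive(Default)]
pub struct Counter {
    hits: u32,
}

/// Increments the counter; returns false instead of panicking when `ptr` is NULL.
pub extern "C" fn counter_hit(ptr: *mut Counter) -> bool {
    // as_mut() yields None for NULL and Some(&mut Counter) otherwise.
    match unsafe { ptr.as_mut() } {
        Some(counter) => {
            counter.hits += 1;
            true
        }
        None => false,
    }
}

/// Reads the counter; returns 0 when `ptr` is NULL.
pub extern "C" fn counter_hits(ptr: *const Counter) -> u32 {
    // as_ref() is the shared-reference counterpart of as_mut().
    unsafe { ptr.as_ref() }.map_or(0, |counter| counter.hits)
}
```

Both calls remain unsafe: a non-NULL pointer must still point at a live, properly aligned object, which only the caller can guarantee.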
Note that a *const T can be freely converted to and from a *mut T, and that nothing prevents the client code from never calling the deallocation function or from calling it more than once. Memory management and safety guarantees are completely in the hands of the programmer.
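To make the first point concrete, here is a small sketch in which a function declared with a const pointer still mutates the object. The Tally type and the tally_* and cast_demo names are hypothetical and exist only for this illustration. The write is sound here only because the pointer originally came from Box::into_raw, not from a shared reference.

```rust
// Hypothetical opaque object used only for this illustration.
pub struct Tally {
    count: u32,
}

pub extern "C" fn tally_new() -> *mut Tally {
    Box::into_raw(Box::new(Tally { count: 0 }))
}

pub extern "C" fn tally_free(ptr: *mut Tally) {
    if !ptr.is_null() {
        unsafe { drop(Box::from_raw(ptr)) };
    }
}

/// Declared as taking a const pointer, yet it mutates the object.
pub extern "C" fn tally_bump_via_const(ptr: *const Tally) {
    assert!(!ptr.is_null());
    let writable = ptr as *mut Tally; // a plain cast; the cast itself needs no unsafe
    unsafe { (*writable).count += 1 };
}

pub extern "C" fn tally_count(ptr: *const Tally) -> u32 {
    assert!(!ptr.is_null());
    unsafe { (*ptr).count }
}

/// Creates a Tally, bumps it through a const pointer, and returns the count.
pub fn cast_demo() -> u32 {
    let raw = tally_new();
    let read_only: *const Tally = raw; // *mut T coerces to *const T implicitly
    tally_bump_via_const(read_only);
    let count = tally_count(read_only);
    tally_free(raw); // must happen exactly once; nothing enforces that
    count
}
```

The const in a signature therefore documents intent for client programmers and does not restrict what the Rust side can do with the pointer.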
C
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
typedef struct zip_code_database zip_code_database_t;
extern zip_code_database_t *
zip_code_database_new(void);
extern void
zip_code_database_free(zip_code_database_t *);
extern void
zip_code_database_populate(zip_code_database_t *);
extern uint32_t
zip_code_database_population_of(const zip_code_database_t *, const char *zip);
int main(void) {
zip_code_database_t *database = zip_code_database_new();
zip_code_database_populate(database);
uint32_t pop1 = zip_code_database_population_of(database, "90210");
uint32_t pop2 = zip_code_database_population_of(database, "20500");
zip_code_database_free(database);
printf("%" PRId32 "\n", (int32_t)pop1 - (int32_t)pop2);
}
A dummy struct is created to provide a small amount of type-safety. The const modifier is used on functions where appropriate, even though const-correctness is much more fluid in C than in Rust.
Ruby
require 'ffi'
class ZipCodeDatabase < FFI::AutoPointer
def self.release(ptr)
Binding.free(ptr)
end
def populate
Binding.populate(self)
end
def population_of(zip)
Binding.population_of(self, zip)
end
module Binding
extend FFI::Library
ffi_lib 'objects'
attach_function :new, :zip_code_database_new,
[], ZipCodeDatabase
attach_function :free, :zip_code_database_free,
[ZipCodeDatabase], :void
attach_function :populate, :zip_code_database_populate,
[ZipCodeDatabase], :void
attach_function :population_of, :zip_code_database_population_of,
[ZipCodeDatabase, :string], :uint32
end
end
database = ZipCodeDatabase::Binding.new
database.populate
pop1 = database.population_of("90210")
pop2 = database.population_of("20500")
puts pop1 - pop2
To wrap the raw functions, we create a small class inheriting from AutoPointer. AutoPointer will ensure that the underlying resource is freed when the object is freed. To do this, the user must define the self.release method.
Unfortunately, because we inherit from AutoPointer, we cannot redefine the initializer. To better group concepts together, we bind the FFI methods in a nested module. We provide shorter names for the bound methods, which enables the client to just call ZipCodeDatabase::Binding.new.
Python
#!/usr/bin/env python3
import sys, ctypes
from ctypes import c_char_p, c_uint32, Structure, POINTER
class ZipCodeDatabaseS(Structure):
    pass
prefix = {'win32': ''}.get(sys.platform, 'lib')
extension = {'darwin': '.dylib', 'win32': '.dll'}.get(sys.platform, '.so')
lib = ctypes.cdll.LoadLibrary(prefix + "objects" + extension)
lib.zip_code_database_new.restype = POINTER(ZipCodeDatabaseS)
lib.zip_code_database_free.argtypes = (POINTER(ZipCodeDatabaseS), )
lib.zip_code_database_populate.argtypes = (POINTER(ZipCodeDatabaseS), )
lib.zip_code_database_population_of.argtypes = (POINTER(ZipCodeDatabaseS), c_char_p)
lib.zip_code_database_population_of.restype = c_uint32
class ZipCodeDatabase:
    def __init__(self):
        self.obj = lib.zip_code_database_new()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        lib.zip_code_database_free(self.obj)

    def populate(self):
        lib.zip_code_database_populate(self.obj)

    def population_of(self, zip):
        return lib.zip_code_database_population_of(self.obj, zip.encode('utf-8'))

with ZipCodeDatabase() as database:
    database.populate()
    pop1 = database.population_of("90210")
    pop2 = database.population_of("20500")
    print(pop1 - pop2)
We create an empty structure to represent our type. This will only be used in conjunction with the POINTER function, which creates a new type as a pointer to an existing one.
To ensure that memory is properly cleaned up, we use a context manager. This is tied to our class through the __enter__ and __exit__ methods. We use the with statement to start a new context. When the context is over, the __exit__ method will be automatically called, preventing the memory leak.
Haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Data.Word (Word32)
import Foreign.Ptr
import Foreign.ForeignPtr
import Foreign.C.String (CString(..), newCString)
data ZipCodeDatabase
foreign import ccall unsafe "zip_code_database_new"
  zip_code_database_new :: IO (Ptr ZipCodeDatabase)
foreign import ccall unsafe "&zip_code_database_free"
  zip_code_database_free :: FunPtr (Ptr ZipCodeDatabase -> IO ())
foreign import ccall unsafe "zip_code_database_populate"
  zip_code_database_populate :: Ptr ZipCodeDatabase -> IO ()
foreign import ccall unsafe "zip_code_database_population_of"
  zip_code_database_population_of :: Ptr ZipCodeDatabase -> CString -> Word32
createDatabase :: IO (Maybe (ForeignPtr ZipCodeDatabase))
createDatabase = do
  ptr <- zip_code_database_new
  if ptr /= nullPtr
    then do
      foreignPtr <- newForeignPtr zip_code_database_free ptr
      return $ Just foreignPtr
    else
      return Nothing
populate = zip_code_database_populate
populationOf :: Ptr ZipCodeDatabase -> String -> IO (Word32)
populationOf db zip = do
  zip_str <- newCString zip
  return $ zip_code_database_population_of db zip_str
main :: IO ()
main = do
  db <- createDatabase
  case db of
    Nothing -> putStrLn "Unable to create database"
    Just ptr -> withForeignPtr ptr $ \database -> do
        populate database
        pop1 <- populationOf database "90210"
        pop2 <- populationOf database "20500"
        print (pop1 - pop2)
We start by defining an empty type to refer to the opaque object. When defining the imported functions, we use the Ptr type constructor with this new type as the type of the pointer returned from Rust. We also use IO as allocating, freeing, and populating the object are all functions with side-effects.
As allocation can theoretically fail, we check for NULL and return a Maybe from the constructor. This is likely overkill, as Rust currently aborts the process when the allocator fails.
To ensure that the allocated memory is automatically freed, we use the ForeignPtr type. This takes a raw Ptr and a function to call when the wrapped pointer is deallocated.
When using the wrapped pointer, withForeignPtr is used to unwrap it before passing it back to the FFI functions.
Node.js
const ffi = require('ffi-napi');
const lib = ffi.Library('libobjects', {
zip_code_database_new: ['pointer', []],
zip_code_database_free: ['void', ['pointer']],
zip_code_database_populate: ['void', ['pointer']],
zip_code_database_population_of: ['uint32', ['pointer', 'string']],
});
const ZipCodeDatabase = function() {
this.ptr = lib.zip_code_database_new();
};
ZipCodeDatabase.prototype.free = function() {
lib.zip_code_database_free(this.ptr);
};
ZipCodeDatabase.prototype.populate = function() {
lib.zip_code_database_populate(this.ptr);
};
ZipCodeDatabase.prototype.populationOf = function(zip) {
return lib.zip_code_database_population_of(this.ptr, zip);
};
const database = new ZipCodeDatabase();
try {
database.populate();
const pop1 = database.populationOf('90210');
const pop2 = database.populationOf('20500');
console.log(pop1 - pop2);
} finally {
database.free();
}
When importing the functions, we simply declare that a pointer type is returned or accepted.
To make accessing the functions cleaner, we create a simple class that maintains the pointer for us and abstracts passing it to the lower-level functions. This also gives us an opportunity to rename the functions with idiomatic JavaScript camel-case.
To ensure that the resources are cleaned up, we use a try block and call the deallocation method in the finally block.
C#
using System;
using System.Runtime.InteropServices;
internal class Native
{
[DllImport("objects")]
internal static extern ZipCodeDatabaseHandle zip_code_database_new();
[DllImport("objects")]
internal static extern void zip_code_database_free(IntPtr db);
[DllImport("objects")]
internal static extern void zip_code_database_populate(ZipCodeDatabaseHandle db);
[DllImport("objects")]
internal static extern uint zip_code_database_population_of(ZipCodeDatabaseHandle db, string zip);
}
internal class ZipCodeDatabaseHandle : SafeHandle
{
public ZipCodeDatabaseHandle() : base(IntPtr.Zero, true) {}
public override bool IsInvalid
{
get { return this.handle == IntPtr.Zero; }
}
protected override bool ReleaseHandle()
{
if (!this.IsInvalid)
{
Native.zip_code_database_free(handle);
}
return true;
}
}
public class ZipCodeDatabase : IDisposable
{
private ZipCodeDatabaseHandle db;
public ZipCodeDatabase()
{
db = Native.zip_code_database_new();
}
public void Populate()
{
Native.zip_code_database_populate(db);
}
public uint PopulationOf(string zip)
{
return Native.zip_code_database_population_of(db, zip);
}
public void Dispose()
{
db.Dispose();
}
static public void Main()
{
var db = new ZipCodeDatabase();
db.Populate();
var pop1 = db.PopulationOf("90210");
var pop2 = db.PopulationOf("20500");
Console.WriteLine("{0}", pop1 - pop2);
}
}
As the responsibilities for calling the native functions are going to be more spread out, we create a Native class to hold all the definitions.
To ensure that the allocated memory is automatically freed, we create a subclass of the SafeHandle class. This requires implementing IsInvalid and ReleaseHandle. Since our Rust function accepts freeing a NULL pointer, we can say that every pointer is valid.
We can use our safe wrapper ZipCodeDatabaseHandle as the type of the FFI functions except for the free function. The actual pointer will be automatically marshalled to and from the wrapper.
We also allow the ZipCodeDatabase to participate in the IDisposable protocol, forwarding to the safe wrapper.
Julia
#!/usr/bin/env julia
using Libdl
libname = "objects"
if !Sys.iswindows()
libname = "lib$(libname)"
end
lib = Libdl.dlopen(libname)
zipcodedatabase_new_sym = Libdl.dlsym(lib, :zip_code_database_new)
zipcodedatabase_free_sym = Libdl.dlsym(lib, :zip_code_database_free)
zipcodedatabase_populate_sym = Libdl.dlsym(lib, :zip_code_database_populate)
zipcodedatabase_populationof_sym = Libdl.dlsym(lib, :zip_code_database_population_of)
struct ZipCodeDatabase
handle::Ptr{Nothing}
function ZipCodeDatabase()
handle = ccall(zipcodedatabase_new_sym, Ptr{Cvoid}, ())
new(handle)
end
function ZipCodeDatabase(f::Function)
database = ZipCodeDatabase()
out = f(database)
close(database)
out
end
end
close(database:: ZipCodeDatabase) = ccall(
zipcodedatabase_free_sym,
Cvoid, (Ptr{Cvoid},),
database.handle
)
populate!(database:: ZipCodeDatabase) = ccall(
zipcodedatabase_populate_sym,
Cvoid, (Ptr{Cvoid},),
database.handle
)
populationof(database:: ZipCodeDatabase, zipcode:: AbstractString) = ccall(
zipcodedatabase_populationof_sym,
UInt32, (Ptr{Cvoid}, Cstring),
database.handle, zipcode
)
ZipCodeDatabase() do database
populate!(database)
pop1 = populationof(database, "90210")
pop2 = populationof(database, "20500")
println(pop1 - pop2)
end
As in other languages, we hide a handle pointer behind a new data type. The method which populates the database is called populate! to follow the Julia convention of having the ! suffix on methods which modify the value.
There is currently no consensus on how Julia should handle native resources. While the inner constructor pattern for allocating the ZipCodeDatabase is suitable here, we can think of many ways to let Julia free it afterwards. In this example, we show two means of freeing the object: (1) a mapping constructor for use with the do syntax, and (2) a close overload for manually freeing the object.
The inner constructor ZipCodeDatabase(f) is in charge of both creating and freeing the object. With the do syntax, the user code becomes similar to code using Python’s with syntax. Alternatively, the programmer can use the other constructor and call the close method when the object is no longer needed.