Rust functions with string arguments
Let’s start on something a little more complex: accepting strings as
arguments. In Rust, strings are composed of a slice of u8 and are
guaranteed to be valid UTF-8, which allows for NUL bytes in the
interior of the string. In C, strings are just pointers to a char and
are terminated by a NUL byte (with the integer value 0). Some work is
needed to convert between these two representations.
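To make the mismatch concrete, here is a minimal sketch that is not part of the original example. It assumes a helper named to_c_string (a made-up name) and uses std::ffi::CString from the standard library to show that a Rust string with an interior NUL byte is perfectly valid in Rust but cannot be turned into a C string as is.

```rust
use std::ffi::{CString, NulError};

/// Hypothetical helper (not from the original example): try to convert a
/// Rust string slice into an owned, NUL-terminated C string.
fn to_c_string(s: &str) -> Result<CString, NulError> {
    // CString::new appends the terminating NUL byte and returns an error
    // if the input already contains a NUL byte anywhere inside it.
    CString::new(s)
}
```

Calling to_c_string("hello") succeeds, while to_c_string("he\0llo") returns an error whose nul_position() reports where the offending byte was found.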
extern crate libc;
use libc::c_char;
use std::ffi::CStr;
#[no_mangle]
pub extern "C" fn how_many_characters(s: *const c_char) -> u32 {
    let c_str = unsafe {
        assert!(!s.is_null());
        CStr::from_ptr(s)
    };
    let r_str = c_str.to_str().unwrap();
    r_str.chars().count() as u32
}
Getting a Rust string slice (&str) requires a few steps:
- We have to ensure that the C pointer is not NULL, as Rust references are not allowed to be NULL.
- Use std::ffi::CStr to wrap the pointer. CStr will compute the length of the string based on the terminating NUL. This requires an unsafe block as we will be dereferencing a raw pointer; the Rust compiler cannot verify that the pointer meets all the safety guarantees, so the programmer must do it instead.
- Ensure that the C string is valid UTF-8 and convert it to a Rust string slice.
- Use the string slice.
In this example, we simply abort the program if any of our preconditions fail. Each use case must evaluate which failure modes are appropriate, but failing loudly and early is a good initial position.
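For callers that would rather receive an error than have the process abort, one possible alternative is to return a sentinel value. The following is only a sketch under stated assumptions: the function name how_many_characters_checked and the constant COUNT_ERROR are hypothetical, and std::os::raw::c_char is used in place of libc::c_char so the block needs no external crate.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical sentinel meaning "the input could not be counted".
pub const COUNT_ERROR: u32 = u32::MAX;

/// Sketch of a variant that reports failure instead of aborting.
/// A non-null `s` must still point to a valid NUL-terminated string;
/// that part cannot be checked from the Rust side.
/// When exporting this from a real library, add the same #[no_mangle]
/// attribute used in the example above.
pub extern "C" fn how_many_characters_checked(s: *const c_char) -> u32 {
    // Precondition 1: the pointer must not be NULL.
    if s.is_null() {
        return COUNT_ERROR;
    }
    // Wrap the pointer; CStr finds the length from the terminating NUL.
    let c_str = unsafe { CStr::from_ptr(s) };
    // Precondition 2: the bytes must be valid UTF-8.
    match c_str.to_str() {
        Ok(r_str) => r_str.chars().count() as u32,
        Err(_) => COUNT_ERROR,
    }
}
```

The trade-off is that every caller now has to check for the sentinel, which is easy to forget; that is one reason the original example prefers to fail loudly.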
Ownership and lifetimes
In this example, the Rust code does not own the string slice, and
the compiler will only allow the string to live as long as the CStr
instance. It is up to the programmer to ensure that this lifetime is
sufficiently short.
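If the Rust side needs to keep the text after the foreign call returns, one option is to copy it into an owned String while the pointer is still valid. This is a minimal sketch, not part of the original example; the helper name copy_c_string is hypothetical, and std::os::raw::c_char stands in for libc::c_char to avoid an external dependency.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical helper: copy a C string into an owned Rust `String`.
/// Returns `None` if the bytes are not valid UTF-8.
///
/// # Safety
/// `s` must be non-null and point to a valid NUL-terminated string.
unsafe fn copy_c_string(s: *const c_char) -> Option<String> {
    // The borrowed &str only points into the caller's buffer...
    let borrowed: &str = unsafe { CStr::from_ptr(s) }.to_str().ok()?;
    // ...so copy the bytes into memory that Rust owns before returning.
    Some(borrowed.to_owned())
}
```

The returned String remains usable after the caller frees or reuses its buffer, at the cost of one allocation and copy.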
C
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
extern uint32_t
how_many_characters(const char *str);
int main(void) {
    uint32_t count = how_many_characters("göes to élevên");
    printf("%" PRIu32 "\n", count);
}
The C code declares the function to accept a pointer to a constant string, as the Rust function will not modify it. You can then call the function with a normal C string constant.
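Note that the Rust function counts Unicode scalar values (chars), not bytes, so its result can differ from what C's strlen would report for the same string. A small sketch in Rust, using a hypothetical helper named byte_and_char_counts, illustrates the difference:

```rust
/// Hypothetical helper: return (number of UTF-8 bytes, number of chars).
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    // len() is the length in bytes; chars() iterates Unicode scalar values.
    (s.len(), s.chars().count())
}
```

Assuming the accented letters in the sample string are stored as precomposed code points (two bytes each in UTF-8), this yields 17 bytes but 14 chars.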
Ruby
# coding: utf-8
require 'ffi'
module StringArguments
  extend FFI::Library
  ffi_lib 'string_arguments'
  attach_function :how_many_characters, [:string], :uint32
end
puts StringArguments.how_many_characters("göes to élevên")
The FFI gem automatically converts Ruby strings to the appropriate C string.
Python
#!/usr/bin/env python3
# coding: utf-8
import sys, ctypes
from ctypes import c_uint32, c_char_p
prefix = {'win32': ''}.get(sys.platform, 'lib')
extension = {'darwin': '.dylib', 'win32': '.dll'}.get(sys.platform, '.so')
lib = ctypes.cdll.LoadLibrary(prefix + "string_arguments" + extension)
lib.how_many_characters.argtypes = (c_char_p,)
lib.how_many_characters.restype = c_uint32
print(lib.how_many_characters("göes to élevên".encode('utf-8')))
Python strings must be encoded as UTF-8 to be passed through the FFI boundary.
Haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Data.Word (Word32)
import Foreign.C.String (CString, newCString)
foreign import ccall "how_many_characters"
  how_many_characters :: CString -> Word32
main :: IO ()
main = do
  str <- newCString "göes to élevên"
  print (how_many_characters str)
The Foreign.C.String module has support for converting Haskell’s
string representation to C’s packed-byte representation. We can
create one with the newCString function, and then pass the CString
value to our foreign call.
Node.js
const ffi = require('ffi-napi');
const lib = ffi.Library('libstring_arguments', {
  how_many_characters: ['uint32', ['string']],
});
console.log(lib.how_many_characters('göes to élevên'));
The ffi-napi package automatically converts JavaScript strings to the
appropriate C strings.
C#
using System;
using System.Runtime.InteropServices;
class StringArguments
{
    [DllImport("string_arguments", EntryPoint="how_many_characters")]
    public static extern uint HowManyCharacters(string s);
    public static void Main()
    {
        var count = StringArguments.HowManyCharacters("göes to élevên");
        Console.WriteLine(count);
    }
}
Native strings are automatically marshalled to C-compatible strings.
Julia
#!/usr/bin/env julia
using Libdl
libname = "string_arguments"
if !Sys.iswindows()
    libname = "lib$(libname)"
end
libstring_arguments = Libdl.dlopen(libname)
howmanycharacters_sym = Libdl.dlsym(libstring_arguments, :how_many_characters)
howmanycharacters(s::AbstractString) = ccall(
    howmanycharacters_sym,
    UInt32, (Cstring,),
    s)
println(howmanycharacters("göes to élevên"))
Julia strings (of base type AbstractString) are automatically
converted to C strings. The Cstring type from Julia is compatible
with the Rust type CStr, as it also assumes a NUL terminator byte
and does not allow NUL bytes embedded in the string.