ndarray::Array1
I can illustrate the [...] approach with the [...] image of a nut to be opened. The first analogy that came to my mind is of immersing the nut in some softening liquid, and why not simply water? From time to time you rub so the liquid penetrates better, and otherwise you let time pass. The shell becomes more flexible through weeks and months — when the time is ripe, hand pressure is enough, the shell opens like a perfectly ripened avocado! A different image came to me a few weeks ago. The unknown thing to be known appeared to me as some stretch of earth or hard marl, resisting penetration [...] the sea advances insensibly in silence, nothing seems to happen, nothing moves, the water is so far off you hardly hear it [...] yet finally it surrounds the resistant substance. Alexander Grothendieck
Scientific Computing: A Rust adventure (TOC)
- Part 0: Vectors
- Part 1: Zero-cost abstractions
- Part 2: Array1 (this post)
Checkpoint
In the previous episode we introduced several key concepts concerning Rust's type system: generics, traits, operators, associated types, Copy. Empowered by this new knowledge, we were able to write a generic vector product, suitable to manipulate several numerical types under very mild constraints.
In this episode we will finally start using ndarray: we will introduce Array1, its one-dimensional array type, and we will spend some time on closures and function traits. Our driving example is going to be numerical integration: we will write a very simple routine to compute the integral of well-behaved real-valued functions over closed 1-dimensional intervals.
Numerical integration
In hindsight, nothing gave me more pleasure as a mathematician than reframing problems. You have this thorny issue you are working on - all the techniques you are familiar with lead you nowhere, you are not moving forward. Then you start looking at the structures you are working with diagonally, from a weird angle. You rotate them in your mental eye, until it seems you are hallucinating: your objects look like something else, something you know, something you have met somewhere else, something familiar, but different. You can reframe your problem using new words, new structures, and those new words carry with them other tools, other theorems, other strategies you can use to finally get to the bottom of your problem. It's thrilling, you feel elated.1
This blog post will probably fail, by a long shot, to deliver such a powerful impact, but I promise you we will be spending most of our time dealing with a problem that is crucial to a lot of wildly different applications (inside and outside of math): numerical integration.
Two notable examples:
- ordinary differential equations - they are used to describe a wide variety of physical phenomena and their resolution can be framed as the computation of an integral;
- Bayesian inference - (nasty) integrals come in big time when trying to evaluate posterior probability distributions.
We won't be touching any state-of-the-art algorithm for numerical integration2. We will implement an elementary routine - a good occasion to introduce the core focus of this whole series: the ndarray crate. We will also introduce some new Rust language features to take full advantage of what ndarray provides.
The rectangle rule
Our goal is to compute
$$ \int_a^b{f(x)\,\text{d}x} $$ where $a$ and $b$ are finite real numbers, $a < b$, and $f$ is a function of one real variable, i.e. $f: \mathbb{R}\to\mathbb{R}$. The simplest approximation method is called the rectangle rule (or midpoint rule):
- pick an integer $n$;
- divide the $[a, b]$ interval into $n$ sub-intervals of equal length; $$ a < a+\frac{1}{n}(b-a) < a+\frac{2}{n}(b-a) < \dots < a+\frac{n-1}{n}(b-a) < a+\frac{n}{n}(b-a)=b $$
- for each sub-interval $I_i$, build a rectangle $R_i$ with $I_i$ as its base and the function value at the midpoint of $I_i$, $f\left(\frac{x_i + x_{i+1}}{2}\right)$, as its height;
- estimate the integral $\int_a^b{f(x)\,\text{d}x}$ using the sum of the areas of all rectangles.
The larger $n$, the better the estimate. Visually, it looks like this (for increasingly larger values of $n$):
We can get the formula out with a bit of algebra. Let's call $x_i$ and $x_{i+1}$ the endpoints of the $i$-th sub-interval, $I_i$.
Then, the area of the corresponding rectangle is
$$ Area(R_i) = (x_{i+1}-x_{i})f\left(\frac{x_i + x_{i+1}}{2}\right) $$
We built our sub-intervals by partitioning $[a, b]$ in equal parts, hence
$$ x_i = a+\frac{i}{n}(b-a) $$
Substituting it in the previous formula, we get:
$$ \begin{align} Area(R_i)&=(x_{i+1}-x_{i})f\left(\frac{x_i + x_{i+1}}{2}\right) \\ &= \left(a+\frac{i+1}{n}(b-a)-\left(a+\frac{i}{n}(b-a)\right)\right)f\left(\frac{a+\frac{i+1}{n}(b-a)+a+\frac{i}{n}(b-a)}{2}\right) \\ &= \frac{(b-a)}{n}f\left(a+\frac{2i+1}{2n}(b-a)\right) \end{align} $$
We can now sum all these areas together to get our integral estimate:
$$ \begin{align} \int_a^b{f(x)\,\text{d}x}&\approx\sum_{i=0}^{n-1} Area(R_i) \\ &= \sum_{i=0}^{n-1} \frac{(b-a)}{n}f\left(a+\frac{2i+1}{2n}(b-a)\right) \\ &= \frac{(b-a)}{n} \sum_{i=0}^{n-1} f\left(a+\frac{2i+1}{2}\frac{(b-a)}{n}\right) \\ &= L_n \sum_{i=0}^{n-1} f\left(a+\frac{2i+1}{2}L_n\right) \end{align} $$
where $L_n=\frac{(b-a)}{n}$ is just the length of each sub-interval.
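As a quick sanity check of the formula, take $f(x) = x^2$ on $[0, 1]$ with $n = 2$: the midpoints are $\frac{1}{4}$ and $\frac{3}{4}$, so the estimate is
$$ \frac{1}{2}\left(\left(\tfrac{1}{4}\right)^2 + \left(\tfrac{3}{4}\right)^2\right) = \frac{1}{2}\cdot\frac{10}{16} = 0.3125 $$
which is already reasonably close to the exact value $\int_0^1{x^2\,\text{d}x} = \frac{1}{3} \approx 0.333$.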
It's now time to turn the midpoint rule into a Rust routine.
Array1
Array1<T> is going to be our foothold into ndarray. Array1<T> is roughly equivalent to Vec<T>: it's a one-dimensional array which owns its elements. We will point out their differences along the way.
Constructors
We can create an instance of Array1<T> using the array macro:
// We import what we need from the `ndarray` crate
use ndarray::{Array1, array};
// The syntax is exactly the same we have already seen
// for the `vec` macro:
//
// array![<1st element>, <2nd element>, ..., <last element>]
let a: Array1<i32> = array![0, -1, 2, 3];
Alternatively, we can first build an instance of Vec<T> and then convert it into an instance of Array1<T>:
use ndarray::{Array1, array};
let a: Array1<i32> = array![0, -1, 2, 3];
// We start with a vector
let vector = vec![0, -1, 2, 3];
// `from` takes `vector` by value: `vector` is consumed due to move semantics
// We can't reuse `vector` after having performed this conversion
let b: Array1<i32> = Array1::from(vector);
// It does not matter how we built them - the resulting arrays are identical
assert_eq!(a, b);
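For reference, ndarray ships a few more constructors for one-dimensional arrays that often come in handy - a quick sketch, with names as they appear in ndarray's documentation:
use ndarray::Array1;

// An array of four zeros: [0.0, 0.0, 0.0, 0.0]
let zeros: Array1<f64> = Array1::zeros(4);
// An array of four ones: [1.0, 1.0, 1.0, 1.0]
let ones: Array1<f64> = Array1::from_elem(4, 1.);
// Five evenly spaced values from 0 to 1, endpoints included
let grid: Array1<f64> = Array1::linspace(0., 1., 5);
assert_eq!(grid.len(), 5);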
The first step in computing
$$ \int_a^b{f(x)\,\text{d}x} \approx L_n \sum_{i=0}^{n-1} f\left(a+\frac{2i+1}{2}L_n\right) $$
is generating an Array1<f32> containing the sequence of midpoints, i.e.
$$ \left\{ a+\frac{2i+1}{2}L_n \right\}_{i=0}^{n-1} $$
We can do it using a Vec<f32> as an intermediate step:
use ndarray::Array1;

// We take as inputs the variables of our problem:
// - n, the number of sub-intervals,
// - a, the left end of the interval
// - b, the right end of the interval
fn midpoints(n: u32, a: f32, b: f32) -> Array1<f32> {
    // Quick check to ensure our inputs are valid
    assert!(a < b);
    assert!(n > 0);

    // Initialise an empty vec
    let mut v = vec![];
    let length = (b - a) / n;
    for i in 0..n {
        // For each index, we compute the midpoint...
        let midpoint = a + (i + 0.5) * length;
        // ..and accumulate it in `v`
        v.push(midpoint);
    }
    // We return the conversion result
    Array1::from(v)
}
It looks pretty straightforward, but the compiler is not fully on board:
error[E0277]: cannot divide `f32` by `u32`
--> src/main.rs:14:21
|
14 | let length = (b - a) / n;
| ^
| no implementation for `f32 / u32`
|
= help: the trait `std::ops::Div<u32>` is
not implemented for `f32`
error[E0277]: cannot add `{float}` to `u32`
--> src/main.rs:17:31
|
17 | let midpoint = a + (i + 0.5) * length;
| ^
| no implementation for `u32 + {float}`
|
= help: the trait `std::ops::Add<{float}>` is
not implemented for `u32`
Rust does not perform implicit conversions between numerical types. If the types involved in the computation are not compatible (i.e. an implementation of the corresponding trait is missing, e.g. std::ops::Add<f32> for u32), the compiler will raise an error.
Type casting looks fairly innocent at first glance, but several subtle bugs are lurking around the corner if it's approached carelessly - I suggest you have a look at the relevant section in Rust By Example to get an idea of what scenarios are possible.
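To get a feel for the kind of surprises hiding there, here is a small illustration (plain Rust, nothing ndarray-specific) of what as silently does when the target type cannot represent the value:
// Integer-to-integer casts truncate: 257 does not fit in a `u8`
assert_eq!(257_i32 as u8, 1);
// Signed-to-unsigned casts reinterpret the bit pattern
assert_eq!(-1_i32 as u32, u32::MAX);
// Large integers can lose precision when cast to `f32`
// (an `f32` only has a 24-bit mantissa)
assert_eq!(16_777_217_u32 as f32, 16_777_216.0);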
In our case, it is sufficient to perform an explicit conversion, using the as keyword. The middle section of our midpoints function becomes:
let length = (b - a) / (n as f32);
for i in 0..n {
    // For each index, we compute the midpoint...
    let midpoint = a + ((i as f32) + 0.5) * length;
    // ..and accumulate it in `v`
    v.push(midpoint);
}
It compiles just fine now and we have our point sequence in an Array1, as desired.
It took a little bit of fiddling though, for what looks like a very common use case: generating a sequence of values from a starting point (a + 0.5 * length) using a fixed step (length) until the endpoint is reached (b - 0.5 * length).
We are also allocating memory more often than necessary - we know from the very beginning how many elements we are going to store.
Array1<T> has a dedicated method to take care of this very common pattern: it's called range (equivalent to arange in NumPy). The previous snippet becomes:
use ndarray::Array1;

// We take as inputs the variables of our problem:
// - n, the number of sub-intervals,
// - a, the left end of the interval
// - b, the right end of the interval
fn midpoints(n: u32, a: f32, b: f32) -> Array1<f32> {
    // Quick check to ensure our inputs are valid
    assert!(a < b);
    assert!(n > 0);

    // Step size
    let length = (b - a) / (n as f32);
    // Starting point (included in the generated array)
    let start = a + 0.5 * length;
    // End point (excluded from the generated array).
    // The last point in the generated sequence is going to be:
    // end - length = b - 0.5 * length
    let end = b + 0.5 * length;
    // The desired sequence of midpoints
    Array1::range(start, end, length)
}
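As a quick check (a small sketch reusing the function we just wrote): with n = 4 over [0, 1] the step is 0.25, so we expect the midpoints 0.125, 0.375, 0.625 and 0.875.
let points = midpoints(4, 0., 1.);
// Expect [0.125, 0.375, 0.625, 0.875]
// (up to the usual floating point rounding)
println!("{:?}", points);
assert_eq!(points.len(), 4);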
Transformations
We can now generate the sequence of midpoints for an arbitrary choice of a, b and n - the next step to evaluate
$$ \int_a^b{f(x)\,\text{d}x} \approx L_n \sum_{i=0}^{n-1} f\left(a+\frac{2i+1}{2}L_n\right) $$
is computing the value of $f$ at each of those points.
Let's try to do it for $f(x) = \sin(x)$. This can be done using the map method and a closure:
// Generate the point sequence using our `midpoints` routine
let points = midpoints(129, 0., 1.);
let f_values = points.map(
    |x| x.sin()
);
map borrows an immutable reference to each element in points (type: &f32), applies the specified closure to it (|x| x.sin()) and returns a new Array1 instance containing the results of this operation.
A closure is an anonymous function, Rust's version of Python's lambdas: the function argument is specified between pipes (i.e. |x|) and the output of the function body (x.sin()) is automatically returned.
It is also possible to specify more complex function bodies using a block:
// Generate the point sequence using our `midpoints` routine
let points = midpoints(129, 0., 1.);
let f_values = points.map(
    // The beginning of the block is marked by an open curly brace, `{`
    |x| {
        // This works just like a normal function body, taking `x` as input
        let sin = x.sin();
        let cos = x.cos();
        // The last expression, given that there is no trailing `;`,
        // is automatically returned
        cos + sin
        // The end of the block is marked by a closed curly brace, `}`
    }
);
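One small aside before moving on (plain Rust, not ndarray-specific): closure parameter and return types are usually inferred, but we can also spell them out explicitly, which occasionally helps when the compiler cannot figure them out on its own:
// A fully annotated version of our `sin` closure:
// the parameter type goes after a colon, the return type after `->`
let f = |x: f32| -> f32 { x.sin() };
assert_eq!(f(0.), 0.);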
Summing
We are almost there: we just need to sum our f_values and multiply the result by $L_n=(b-a)/n$.
It's as easy as it sounds:
let a = 0.;
let b = 1.;
let n = 129;

let points = midpoints(n, a, b);
let f_values = points.map(
    |x| x.sin()
);
let length = (b - a) / (n as f32);
// `Array1` provides a `sum` method
let integral_estimate = f_values.sum() * length;
Ez.
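As a quick sanity check (continuing the snippet above): the analytical value is $\int_0^1{\sin(x)\,\text{d}x} = 1 - \cos(1) \approx 0.4597$, and our estimate should land very close to it.
// Analytical value of the integral of sin over [0, 1]
let exact = 1. - 1_f32.cos();
println!("estimate: {}, exact: {}", integral_estimate, exact);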
Have we done a good job?
Done... or not? I'd say it would be wise to:
- test it on some easy integrals (for which we can compute the answer analytically);
- benchmark its performance.
To move forward with both objectives we need a little bit more flexibility. Our routine should become a function: a, b, n and the function we want to integrate (|x| x.sin() so far) have to become parameters.
The first three should be fairly easy at this point:
use ndarray::Array1;

fn midpoints(n: u32, a: f32, b: f32) -> Array1<f32> {
    assert!(a < b);
    assert!(n > 0);

    let length = (b - a) / (n as f32);
    let start = a + 0.5 * length;
    let end = b + 0.5 * length;
    Array1::range(start, end, length)
}

// We take as inputs the variables of our problem:
// - n, the number of sub-intervals,
// - a, the left end of the interval
// - b, the right end of the interval
fn rectangle_rule(n: u32, a: f32, b: f32) -> f32 {
    let points = midpoints(n, a, b);
    let f_values = points.map(
        |x| x.sin()
    );
    let length = (b - a) / (n as f32);
    let integral_estimate = f_values.sum() * length;
    integral_estimate
}
What about the integrand function? What is its type?
Yeah, that's a thorny one. If you have been following closely so far, you know what's coming... a bunch of new traits! 🌟
Function traits
If you want to start passing closures around like a pro you have to get familiar with the Fn traits provided by the standard library.
But, first of all, let's give a little bit more context around closures.
So far we have been describing closures as anonymous functions. A syntactic convenience, if you want, that allows you to define a function inline without having to go through the declaration boilerplate. Closures can do way more than that!
Closures can capture variables from the environment they are defined in. Let's see it in action with an example (borrowed from the Book):
fn main() {
    // A variable defined in the `main` scope
    let x = 4;
    // A closure that takes an input parameter (named `z`)
    // and compares it with `x`... but `x` is not an input
    // parameter of `equal_to_x`!
    // `x` comes from the definition environment of `equal_to_x`:
    // we are capturing its value inside the closure!
    let equal_to_x = |z| z == x;

    let y = 4;
    assert!(equal_to_x(y));
}
It compiles and the assertion is satisfied. If we try to do the same using a proper function...
fn main() {
    // A variable defined in the `main` scope
    let x = 4;
    // We try to define `equal_to_x` as a function
    // using `x` from the `main` scope in its body
    fn equal_to_x(z: usize) -> bool {
        z == x
    }

    let y = 4;
    assert!(equal_to_x(y));
}
...we get a compiler error:
error[E0434]: can't capture dynamic environment in a fn item
--> src/main.rs:8:14
|
8 | z == x
| ^
|
= help: use the `|| { ... }` closure form instead
Unsurprisingly, the compiler has seen what we were trying to do and suggested that we use a closure!
How does this fit into Rust's ownership system? The capturing mechanism follows the same rules. A variable in the environment can be captured:
- as an immutable reference;
- as a mutable reference;
- by value, using move semantics.
For each of these cases, there exists a corresponding trait in the standard library, as we anticipated:
- Fn is implemented by closures that can be called multiple times without changing the captured variables (they don't consume the captured variables with move semantics and they don't require them to be mutable);
- FnMut is implemented by closures that can be called multiple times with the possibility of mutating the captured variables (they don't consume the captured variables with move semantics but they might require them to be mutable);
- FnOnce is implemented by closures that might take some of their captured variables by value, thus consuming them due to move semantics (that's why it's called FnOnce: if there is a move involved, you can only call them once!).
We don't have to implement these traits manually for our closures: Rust infers them automatically based on the way we are manipulating our captured variables.
Let's see a couple of examples.
Fn:
fn main() {
    // A variable defined in the `main` scope
    let x = vec![1, 2, 3];
    // Let's create an exact copy
    let y = x.clone();
    // `equal_to_x` implements `Fn`:
    // we are only capturing an immutable reference
    // to a variable from the environment, `x`
    let equal_to_x = |z| z == x;

    // We can call it multiple times...
    equal_to_x(vec![0, 2]);
    equal_to_x(vec![0, 1, 3]);
    equal_to_x(vec![0, 100]);
    // ...and the captured variable is unchanged
    assert_eq!(x, y);
}
FnMut:
fn main() {
    // A mutable variable defined in the `main` scope
    let mut x = vec![1, 2, 3];
    // Let's create an exact copy
    let y = x.clone();
    // `push_to_x` implements `FnMut`:
    // we are capturing a mutable reference
    // to a variable from the environment, `x`
    let mut push_to_x = |z| x.push(z);

    // We can call it multiple times...
    push_to_x(4);
    push_to_x(5);
    push_to_x(6);
    // ...and the captured variable is changed accordingly
    assert_ne!(x, y);
    assert_eq!(x, vec![1, 2, 3, 4, 5, 6]);
}
FnOnce:
fn main() {
    // A variable defined in the `main` scope
    let x = vec![1, 2, 3];
    // `move_x` implements `FnOnce`:
    // we are taking ownership of a variable
    // from the environment, `x`, using the `move`
    // keyword
    let move_x = move || x;

    // We can only call it once
    move_x();
}
If we try to call move_x twice we get a compiler error:
error[E0382]: use of moved value: `move_x`
--> src/main.rs:13:5
|
12 | move_x();
| ------ value moved here
13 | move_x();
| ^^^^^^ value used here after move
|
note: closure cannot be invoked more than once because
it moves the variable `x` out of its environment
--> src/main.rs:9:26
|
9 | let move_x = move || x;
| ^
One important detail before moving on - there is a clear hierarchy between these traits:
- all closures that implement Fn automatically implement FnMut and FnOnce;
- all closures that implement FnMut automatically implement FnOnce.
All closures implement FnOnce - if you can't call them at least once, why would you even define them? ¯\_(ツ)_/¯
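A minimal sketch of what this hierarchy buys us in practice (call_twice is a hypothetical helper, not a standard library function): a function that only asks for FnMut happily accepts a plain Fn closure as well.
// `call_twice` is a hypothetical helper used only for illustration
fn call_twice<F: FnMut() -> i32>(mut f: F) -> i32 {
    f() + f()
}

fn main() {
    let x = 1;
    // `|| x + 1` only reads `x`: it implements `Fn`, hence also `FnMut`
    assert_eq!(call_twice(|| x + 1), 4);

    let mut counter = 0;
    // This closure mutates `counter`: it implements `FnMut`, but not `Fn`
    assert_eq!(call_twice(|| { counter += 1; counter }), 3);
}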
More details on closures can be found in the relevant section in the book.
The map method
If you think that was a deluge of information on function traits... well, you are right, it was (and we are not finished yet... 👀). As it always goes, everything will become clearer with practice, if put in the proper context.
Let's see how all of this applies to our integration problem. This is the line where we call our integrand function as a closure to compute its values on our point sequence:
let f_values = points.map(
    |x| x.sin()
);
The map method calls the provided closure on each element of the array (points). It sounds sensible to say that we can't use FnOnce as a trait bound (unless we restrict ourselves to working with arrays containing a single element - a curious constraint).
With respect to captured variables, I would say that we don't care too much whether they are taken as mutable or immutable references: as long as we can call this closure multiple times, we are good. FnMut seems to be the most reasonable choice at this point.3
How do we actually use FnMut in a function signature?
Well, it turns out that you can't write down the type of a closure, for reasons (if you are curious, take a look at the deep dive in this article or at Rust's language reference): you have to use a generic type parameter and constrain it using a trait bound (Fn, FnMut and FnOnce are our friends here!)
Let's give it a go with our rectangle_rule:
// We take as inputs the variables of our problem:
// - n, the number of sub-intervals,
// - a, the left end of the interval
// - b, the right end of the interval
// - f, our integrand function, as a closure
// To express `f` as an input parameter we introduce a type parameter, `F`,
// and we constrain `F` to implement the `FnMut` trait.
fn rectangle_rule<F>(n: u32, a: f32, b: f32, f: F) -> f32
where
    F: FnMut
{
    let points = midpoints(n, a, b);
    let f_values = points.map(f);
    let length = (b - a) / (n as f32);
    let integral_estimate = f_values.sum() * length;
    integral_estimate
}
Will it work?
Nope:
error[E0658]: the precise format of `Fn`-family traits'
type parameters is subject to change.
Use parenthetical notation (Fn(Foo, Bar) -> Baz) instead (see issue #29625)
--> src/main.rs:28:8
|
28 | F: FnMut
| ^^^^^
error[E0107]: wrong number of type arguments: expected 1, found 0
--> src/main.rs:28:8
|
28 | F: FnMut
| ^^^^^ expected 1 type argument
error: aborting due to 2 previous errors
If we put ourselves in the compiler's shoes, its complaint makes perfect sense.
Yes, we are passing in a closure.
Yes, it can be called multiple times because it implements FnMut.
But don't you feel we are missing some key information? What parameters does it take as input? What does it return as output? The usual stuff, the basics, the stuff we forgot about while we were lost in our trait side story.
We can fix it real quick: we can use the parenthetical notation, as suggested by the compiler. In other words:
fn rectangle_rule<F>(n: u32, a: f32, b: f32, f: F) -> f32
where
    // We just have to spell out inputs and outputs,
    // with the same notation used for functions
    F: FnMut(&f32) -> f32
It compiles 🎉
The full code of our example looks like this now:
use ndarray::Array1;

fn midpoints(n: u32, a: f32, b: f32) -> Array1<f32> {
    // Quick check to ensure our inputs are valid
    assert!(a < b);
    assert!(n > 0);

    let length = (b - a) / (n as f32);
    let start = a + 0.5 * length;
    let end = b + 0.5 * length;
    Array1::range(start, end, length)
}

// We take as inputs the variables of our problem:
// - n, the number of sub-intervals,
// - a, the left end of the interval
// - b, the right end of the interval
// - f, our integrand function, as a closure
fn rectangle_rule<F>(n: u32, a: f32, b: f32, f: F) -> f32
where
    F: FnMut(&f32) -> f32
{
    let points = midpoints(n, a, b);
    let f_values = points.map(f);
    let length = (b - a) / (n as f32);
    let integral_estimate = f_values.sum() * length;
    integral_estimate
}

fn main() {
    let integral_estimate = rectangle_rule(129, 0., 1., |x| x.sin());
    println!("Estimate: {:?}.", integral_estimate);
}
It took a bit of study to get here, but the output looks quite readable - good job!
Correctness
It's time to answer the first question about our implementation: is it correct?
The rectangle rule returns us an estimate of the true integral, but it can be proved that if the integrand $f$ is a $\mathcal{C}^2([a, b])$ function (twice differentiable and all its derivatives up to the second order are continuous) then
$$ \left| \text{Err}\left(\int_a^b{f}\right) \right| \leq \frac{b-a}{24}L_n^2 \max_{x\in[a,b]}{|f''(x)|} $$
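To make the bound concrete: for $f(x) = \sin(x)$ on $[0, 1]$ with $n = 129$ we have $\max_{x\in[0,1]}|f''(x)| = \sin(1) \approx 0.84$, hence
$$ \left| \text{Err} \right| \leq \frac{1}{24}\left(\frac{1}{129}\right)^2 \cdot 0.84 \approx 2.1\times10^{-6} $$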
A rigorous test suite should check that this estimate holds for a variety of functions, taking into account numerical errors due to floating point arithmetic. This, unfortunately, goes beyond the scope of this article.
For our purposes, we'll run rectangle_rule on a couple of smooth functions and check that the result is reasonably close to the analytical integral, for the chosen n.
Given that we are dealing with floating point numbers, we will take advantage of the approx crate: we can't assert strict equality between the estimate and the expected result - we need to state how close we require them to be using one of its useful macros, assert_abs_diff_eq.
Let's see a couple of examples:
// Import the PI constant from the standard library
use std::f32::consts::PI;
// Machine precision constant for f32
use std::f32::EPSILON;

use approx::assert_abs_diff_eq;

[...]

fn main() {
    let n = 513;

    // The integral of the sine function over its period should be zero
    let expected_sin = 0.;
    let estimate_sin = rectangle_rule(n, -PI, PI, |x| x.sin());
    // We require the estimated integral and the expected integral
    // to agree up to the sixth decimal digit
    assert_abs_diff_eq!(expected_sin, estimate_sin, epsilon = 1e-6);

    // The integral of a constant, 1, is x.
    // Given the interval, the result should be 1.
    let expected_linear = 1.;
    let estimate_linear = rectangle_rule(n, 0., 1., |x| 1.);
    // The rectangle rule should be exact for constants.
    // Thus we require the estimated integral and the expected integral
    // to agree up to machine precision
    assert_abs_diff_eq!(expected_linear, estimate_linear, epsilon = EPSILON);
}
Performance
We are now reasonably convinced of the correctness of our implementation. How fast is it?
Let's run a microbenchmark. We want to see how long it takes to compute an estimate of
$$ \int_0^1{\sin(x)\,\text{d}x} $$
using the rectangle rule, with $n=513$.
For Rust, we use criterion4.
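The actual benchmark code is linked in footnote 4; a minimal criterion sketch (assuming rectangle_rule lives in a crate we can import from - rectangle is a hypothetical name) could look like this:
use criterion::{criterion_group, criterion_main, Criterion};
// Hypothetical crate name - adjust to wherever `rectangle_rule` is defined
use rectangle::rectangle_rule;

fn bench_rectangle_rule(c: &mut Criterion) {
    c.bench_function("rectangle rule, n = 513", |b| {
        b.iter(|| rectangle_rule(513, 0., 1., |x| x.sin()))
    });
}

criterion_group!(benches, bench_rectangle_rule);
criterion_main!(benches);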
For Python, we use %timeit rectangle_rule(513, 0, 1, np.sin) in a Jupyter notebook - the code is:
import numpy as np

# We assume f is a vectorized function
# Almost all the execution time is spent in NumPy's routines
def rectangle_rule(n, a, b, f):
    length = (b-a)/n
    points = np.arange(a+length/2, b+length/2, length, dtype=np.float32)
    return f(points).sum() * length
The Python version runs in ~9.3 microseconds. The Rust version takes ~2.2 microseconds - roughly 4 times faster.
Well, it feels good. What happens if we increase the number of sub-intervals? Let's say $n=50000$.
The Python version runs in 224 ± 7 microseconds, while the Rust version takes 231 ± 6 microseconds: we lost most of our edge - it seems that the overhead of calling NumPy's C code was significant for small arrays, but not so much for bigger ones.
Is there anything we can do to make our code run faster?
Let's have a second look at our function:
fn rectangle_rule<F>(n: u32, a: f32, b: f32, f: F) -> f32
where
    F: FnMut(&f32) -> f32
{
    let points = midpoints(n, a, b);
    // `map` applies `f` to all elements in `points`,
    // then allocates a new array to hold the results
    let f_values = points.map(f);
    let length = (b - a) / (n as f32);
    let integral_estimate = f_values.sum() * length;
    integral_estimate
}
map is very cool, but it's creating a whole new array to hold the results of f applied to points' elements.
Let's have a look at a couple more methods provided by ndarray:
- map_mut takes a mutable reference to each element in the array (type: &mut f32) and it creates a new array to hold the results;
- mapv takes each element in the array by value (type: f32) and it creates a new array to hold the results;
- mapv_into takes each element in the array by value (type: f32) and it uses the original array to store the results, overriding the inputs. No additional memory is allocated.
mapv_into comes with an additional restriction: the output type of the closure has to be the same as its input type, in order to be sure that we can reuse the input memory slot to store the result of the computation (e.g. an f64 does not fit into the 32 bits of an f32, as you can imagine).
We satisfy both requirements of mapv_into:
- our closure takes an f32 value as input and it returns an f32 value;
- computing f_values is all we need points for.

We can thus use mapv_into instead of map: points is a temporary variable which is never returned or seen by the caller of rectangle_rule - we can override its elements and avoid performing any additional memory allocation.
The new code looks like this:
fn rectangle_rule<F>(n: u32, a: f32, b: f32, f: F) -> f32
where
    // The input is now `f32`, not `&f32`
    F: FnMut(f32) -> f32
{
    let points = midpoints(n, a, b);
    // `points` is consumed and its memory is reused to hold `f_values`
    let f_values = points.mapv_into(f);
    let length = (b - a) / (n as f32);
    let integral_estimate = f_values.sum() * length;
    integral_estimate
}
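For completeness, the call site does not need to change: the closure now receives an f32 by value instead of an &f32, but |x| x.sin() type-checks either way, so main can stay as it was (shown here with the larger n, as a sketch).
fn main() {
    // Same closure as before: `x` is now an `f32` taken by value
    let integral_estimate = rectangle_rule(50_000, 0., 1., |x| x.sin());
    println!("Estimate: {:?}.", integral_estimate);
}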
How does it affect our performance? The difference for $n=513$ is almost negligible, but for $n=50000$ we see a substantial speed-up:
The plot is a courtesy of criterion - it shows the estimated probability distribution (pdf, in the legend) of running times of our previous version (in red, using map) compared to the new version (in blue, using mapv_into).
The average running time is now 205.71 ± 5.5 microseconds, slightly better than NumPy's 224 ± 7 microseconds 🚀
NumPy and SciPy are extremely valuable: they provide a high quality point of reference that can be used to check both correctness and performance of alternative (Rust) implementations. They push us to do better - that's precious.
Wrapping up
We started with a very simple task (the rectangle rule for integrals) and we managed to learn something new at every road block:
- we started to touch ndarray with Array1, using a couple of constructors and a few methods;
- map has been our gateway drug into the world of closures and function traits (definitely a tricky corner of the language - you should feel a bit of pride at this point!);
- benchmarking against NumPy has forced us to take a closer look at our code to identify opportunities where we could have been smarter in managing our resources - good training for our optimization eye!
It feels as if we are constantly breaking new ground, but we are slowly cementing the road behind us. In this episode we took advantage of everything we introduced in the previous ones: generic functions, trait bounds, ownership - you name it. Practice makes perfect: the more Rust we write, the more familiar these concepts will become.
In the next episode we will get serious with ndarray: we'll uncover the true nature of Array1 and its connection to ArrayBase. We will thus discover how dimensions and ownership are expressed in ndarray.
Exciting times ahead!
Notes, suggestions and feedback to improve this introduction are more than welcome: you can reach me on Twitter at @algo_luca.
Scientific Computing: A Rust adventure (TOC)
- Part 0: Vectors
- Part 1: Zero-cost abstractions
- Part 2: Array1 (this post)
1: The last time this happened was when I was reading Neural Ordinary Differential Equations. If you have even a mild interest in neural networks, check it out - it's one of the most peculiar papers I happened to stumble upon in the area in the last couple of years.
2: If you are itching to flex your Rust skills, try to port SciPy's integration routines to Rust! ndarray-odeint focuses on solving ordinary differential equations, but a standalone crate dedicated to numerical integration is still missing.
3: You will not be surprised to find out that map uses exactly the same constraint on its input closure - you can check it out in ndarray's documentation, here.
4: You can look at the code used for benchmarking here.