Today I was preparing an R package for computing the nearest neighbors algorithm.

I plan to use this R package to explain in my CS499 class how to write C++ code in an R package. In particular I will explain how to use the Eigen C++ template library, which makes it easy to perform matrix operations and dynamic memory allocations.

Eigen has built-in checks/assertions which stop at either compile-time (for fixed/static matrix sizes that are defined in C++ code) or run-time (for dynamic matrix sizes that depend on the data). For my package there are only dynamic matrix sizes, defined using the Eigen::VectorXd etc functions. For example we have

Eigen::Map< Eigen::MatrixXd > train_inputs_mat(train_inputs_ptr, nrow, ncol);
Eigen::Map< Eigen::VectorXd > test_input_vec(test_input_ptr, ncol);

The first line defines a train feature matrix (nrow x ncol) and the second line defines a test feature vector (ncol), for a supervised machine learning problem with nrow observations and ncol features.

The Eigen documentation, page Matrix and vector arithmetic, section Addition and subtraction says that “The left hand side and right hand side must, of course, have the same numbers of rows and of columns.” Therefore I was expecting the following code to fail with a run-time assertion error:

for(int i=0; i<nrow; i++){
  distance_vec(i) = (train_inputs_mat.row(i)-test_input_vec).norm();
}

I was expecting it to fail because test_input_vec is a column vector (ncol x 1) and train_inputs_mat.row(i) is a row vector (1 x ncol). Instead I was seeing the code compile and run, and return the wrong answer!

I figured out the issue by looking at the compilation command line:

g++  -I"/usr/share/R/include" -DNDEBUG  -I"/home/tdhock/R/i686-pc-linux-gnu-library/3.5/RcppEigen/include" -I/home/tdhock/include   -fpic  -g -O2 -fdebug-prefix-map=/build/r-base-3pUaHF/r-base-3.5.1=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c knn.cpp -o knn.o

The important part is the -DNDEBUG flag which is explained in the Eigen Preprocessor Directives documentation.

EIGEN_NO_DEBUG - disables Eigen's assertions if defined. 
Not defined by default, unless the NDEBUG macro is defined
(this is a standard C++ macro which disables all asserts).

How to turn that off? R controls the compilation, which is explained in the Writing R Extensions manual, section 1.6, writing portable packages:

Under no circumstances should your compiled code ever call abort or exit: 
these terminate the user’s R process, 
quite possibly including all his unsaved work. 
One usage that could call abort is the assert macro in C or C++ functions, 
which should never be active in production code. 
The normal way to ensure that is to define the macro NDEBUG, 
and R CMD INSTALL does so as part of the compilation flags. 
If you wish to use assert during development 
you can include -UNDEBUG in PKG_CPPFLAGS. 
Note that your own src/Makefile or makefiles in sub-directories 
may also need to define NDEBUG.

So that means we can enable the compilation errors by including -UNDEBUG in the g++ command line. To do that in an R package we add the following to src/Makevars:

PKG_CPPFLAGS=-UNDEBUG

After doing that, R generates the following compilation command:

g++  -I"/usr/share/R/include" -DNDEBUG -UNDEBUG -I"/home/tdhock/R/i686-pc-linux-gnu-library/3.5/RcppEigen/include" -I/home/tdhock/include   -fpic  -g -O2 -fdebug-prefix-map=/build/r-base-3pUaHF/r-base-3.5.1=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c knn.cpp -o knn.o

and when I run the R code I get the following runtime assertion error,

R: /home/tdhock/R/i686-pc-linux-gnu-library/3.5/RcppEigen/include/Eigen/src/Core/CwiseBinaryOp.h:110: Eigen::CwiseBinaryOp<BinaryOp, Lhs, Rhs>::CwiseBinaryOp(const Lhs&, const Rhs&, const BinaryOp&) [with BinaryOp = Eigen::internal::scalar_difference_op<double, double>; LhsType = const Eigen::Block<Eigen::Map<Eigen::Matrix<double, -1, -1> >, 1, -1, false>; RhsType = const Eigen::Map<Eigen::Matrix<double, -1, 1> >; Eigen::CwiseBinaryOp<BinaryOp, Lhs, Rhs>::Lhs = Eigen::Block<Eigen::Map<Eigen::Matrix<double, -1, -1> >, 1, -1, false>; Eigen::CwiseBinaryOp<BinaryOp, Lhs, Rhs>::Rhs = Eigen::Map<Eigen::Matrix<double, -1, 1> >]: Assertion `aLhs.rows() == aRhs.rows() && aLhs.cols() == aRhs.cols()' failed.

Although the error message is pretty incomprehensible (it does not tell me the line number in my code which is problematic), I would argue that it is better than silently returning the wrong answer.

If the code is compiled and run interactively, one line at a time, it should be relatively simple to determine which line introduced the error. Another way to determine where the error happens is debugging or inserting print statements.