- 
      
Heap in R and python
Empirical time complexity
 - 
      
Ward clustering
Review of existing segmentation and clustering
 - 
      
Agglomerative hierarchical binary segmentation
Clustering using loss or distance minimization
 - 
      
Speed of named versus numbered indexing in R
Rotated Hi-C data visualization
 - 
      
Where does subtrain come from?
Review of references on cross-validation
 - 
      
Stratified batch sampler for mlr3torch
Demonstrations and verifications of correctness using sonar data
 - 
      
Paris tourist suggestions
Fun spots
 - 
      
MPI for parallelization in R
Another method for machine learning experiments
 - 
      
A custom DataLoader for mlr3torch
Stratified sampling for imbalanced classification
 - 
      
Creating large imbalanced data benchmarks
Tutorial with OpenML
 - 
      
Load-balanced parallel machine learning benchmarks
A new approach using filelock and batchtools
 - 
      
Debrief May 2025
Zurich, Munich, Mons
 - 
      
Parallel machine learning benchmarks
A new approach using targets and crew.cluster
 - 
      
New parallel computing frameworks
batchtools, clustermq, rush, mirai, crew, targets
 - 
      
Centralized vs de-centralized parallelization
Exploring rush
 - 
      
mlr3 tutorials
Links to other blogs
 - 
      
Comparing change-point pruning methods using square loss
Pruned Exact Linear Time (PELT) and Functional Pruning Optimal Partitioning (FPOP)
 - 
      
Organizing computational research projects
Guide for research students
 - 
      
Volcano plots
Redesign of SOAK paper results
 - 
      
Torch learning with binary classification
Implementing AUM loss in mlr3torch
 - 
      
Creating imbalanced data benchmarks
Tutorial with MNIST
 - 
      
Comparing neural network architectures using mlr3torch
Convolutional network versus linear model
 - 
      
Comparing pruning methods for optimal partitioning
Pruned Exact Linear Time (PELT) and Functional Pruning Optimal Partitioning (FPOP)
 - 
      
Are pipe operations linear or quadratic?
A demonstration of atime on mlr3torch
 - 
      
Bike ride map and time series data viz
A demonstration of animint2 and sf
 - 
      
Configuring eduroam
On cell phones and linux
 - 
      
Benchmarking data.table with polars, duckdb, and pandas
Demonstrating advantages of data.table
 - 
      
Implementing optimal partitioning in R
Comparison with Pruned Exact Linear Time (PELT)
 - 
      
Cross-validation experiments with torch learners
Demonstration of mlr3torch + mlr3resampling
 - 
      
Overhead of auto-grad in torch
Comparison with explicit gradients
 - 
      
AUC and AUM in torch
Demonstration of auto-grad
 - 
      
Ordinary least squares algorithms
Comparing computation time in R
 - 
      
Collaborations not allowed
Parsing a web page with regex
 - 
      
Code of conduct / conduite
Lecture obligatoire pour participants du labo LASSO / Required reading for LASSO lab participants
 - 
      
Visualizing prediction error
And clearly showing differences between algorithms
 - 
      
History of supervised change-point detection
Using git bisect to find a survival bug
 - 
      
Generate publications page
Parsing bibtex and generating markdown
 - 
      
Writing comprehensible tests
Documenting key code magic numbers in animint2 tests
 - 
      
Rust versus Go
Similarities and Differences
 - 
      
How reproducible are benchmarks?
Comparing atime results on different computers
 - 
      
Collapse reshape benchmark
Comparison with data.table
 - 
      
Porting base R regex code to nc
Case study with a complex regex
 - 
      
Benchmarking a change in data.table
Progress reporting for group by operations
 - 
      
Mammouth tutorial
Cluster computing for students at UdeS
 - 
      
Research student application
Please read if you want to do research under my supervision
 - 
      
HTML to Markdown
Regex for porting my lab web site
 - 
      
Short bio
Some text to use for talk introductions
 - 
      
Directions to my office in Sherbrooke
With a map in English!
 - 
      
New code for various kinds of cross-validation
Cross-validation in R with mlr3
 - 
      
Capturing regular expressions
Extracting data from loosely structured text
 - 
      
The importance of hyper-parameter tuning
And parallelizing machine learning experiments in R
 - 
      
When is it useful to train with combined subsets?
An exploration using cross-validation
 - 
      
Parsing check logs using regular expressions
A demonstration of nc R package
 - 
      
Unable to load shared object, Undefined symbol
Creating and explaining a linker error
 - 
      
Reshape performance comparison
Demonstration of asymptotic timing comparisons
 - 
      
Cross-validation with variable size train sets
Determining how many samples are necessary for optimal prediction
 - 
      
Upgrading R arrow
More build debugging
 - 
      
Partial matching on data frame row names
Comparing efficiency using atime
 - 
      
Interpretable learning algorithms with built-in feature selection
Regularized linear model and decision tree
 - 
      
Generalization to new subsets in R
Coding non-standard cross-validation
 - 
      
Comparing machine learning frameworks in R
for loop, mlr3, tidymodels
 - 
      
data.table CRAN diffs
Verifying consistency between CRAN and github
 - 
      
data.table asymptotic timings
Motivational figures
 - 
      
Debugging python code in emacs
Fixing a bug and building old emacs
 - 
      
Count unique students
Regex and data table summarization
 - 
      
Essential emacs key commands
Cheat sheet for my students
 - 
      
Splitting an R package
Recommendations from experience with spatstat
 - 
      
Installing Rmpi on the cluster
This package needs special treatment on compute nodes
 - 
      
Segfault using R arrow
Reproducing and fixing an error
 - 
      
Re-building vignettes on windows
Fixing mysterious error
 - 
      
Modifying default gcc compilation flags
When compiling R packages
 - 
      
Installing Ubuntu on an old Mac
Step by step instructions
 - 
      
spack package manager
contrast with conda
 - 
      
Checking R package on M1 Mac
Web services for R package developers
 - 
      
Comparing asymptotic timings of CSV read/write functions
Some surprising differences
 - 
      
Debugging C code
valgrind and gdb are essential tools
 - 
      
CRAN Meta-data
Backing up MRAN
 - 
      
Cross-validation experiments on the cluster
NAU monsoon tutorial
 - 
      
Generalization to new subsets
Cross-validation in python
 - 
      
R Package Release History
Extracting and plotting data from CRAN web site
 - 
      
Submitting python jobs on monsoon
And anaconda setup
 - 
      
Cloud Storage
Different options for internal and external sharing
 - 
      
Indirect reverse dependencies
Computing the entire graph, and histogram tutorial
 - 
      
GUI for WSL on Windows 10
use cygwin instead of vcxsrv
 - 
      
Reformatting NEWS files
Regular expression example
 - 
      
R packages on github
How to query CRAN meta-data
 - 
      
Positive and negative log transform
Non-linear transformations for heat maps and signed p-values
 - 
      
Research Mentorship Plan
Required reading for potential students
 - 
      
Historical reverse imports
Analysis of R package usage over time
 - 
      
Learning with Area Under the Min
How to use torch with a non-standard loss
 - 
      
Torch randomness
Reproducible neural network learning
 - 
      
No argument unpacking in C
But there is in R and Python
 - 
      
AUM in Torch
Auto-grad of a non-differentiable loss function
 - 
      
Erdos number
A distance calculator
 - 
      
Plotting the probability simplex
An application of matrix inversion
 - 
      
Link-time optimization
Fixing warnings from CRAN checks
 - 
      
Finding symbols in object files
Using objdump to find cerr
 - 
      
Ten years of R project in Google Summer of Code
Some success stories from my participation
 - 
      
Simple methods for defining small data by row
Comparison with base R and tribble
 - 
      
The C book
Documentation of stringize macros
 - 
      
Evidential machine learning
An alternative to probability
 - 
      
Stress testing reshape operations on list columns
Advantages of updated data.table::melt
 - 
      
Defining data by row and regex by sub-pattern
Avoiding separation of related concepts in code
 - 
      
Update about data reshaping and visualization in R and python
data.table, tidyr, nc, pandas, datatable, plotnine, altair, bokeh
 - 
      
Convex clustering theory
Recent results on trees and cluster shapes
 - 
      
R packages that depend on system libraries
How to pass CRAN checks
 - 
      
The UCR Time Series Archive
A benchmark for classification algorithms
 - 
      
Multi-threaded sorting
Thread safety of qsort variants
 - 
      
Faster AUM computation?
Log-linear C++ STL containers vs linear time radix sort
 - 
      
New ideas for classification
Weston-Watkins multiclass SVM and AUC optimization
 - 
      
Emulating the python interactive console
My hack using the code module
 - 
      
New packages for data storage and reshaping
tidyfast, tidyfst, fst, arrow, feather, parquet
 - 
      
Computing K-means train/validation error
Alternatives to for loops in R
 - 
      
Parsing CRAN maintainers
Regular expressions using nc R package
 - 
      
emacspeak
Teaching my son to type in emacs
 - 
      
Random train/validation/test assignment
Different methods tried by my students
 - 
      
C/C++ completion in emacs
Configuration details
 - 
      
Data manipulation libraries
Translating between data.table, pandas, dplyr
 - 
      
Custom evaluation metrics in TensorFlow
Implementing the exact area under the ROC curve
 - 
      
Fast parameter exploration
Caching and parallel execution
 - 
      
binsegRcpp inside a C++ program
Embedding Rcpp code into a main function
 - 
      
Embedding R
Compiling a program that links to R
 - 
      
R batchtools on Monsoon
Cluster computing tutorial for NAU students
 - 
      
Arizona time
Why does the internet tell people the wrong time?
 - 
      
Emacs local variables
Custom configurations for R
 - 
      
Ubuntu setup and LaTeX debugging
Installing and configuring a 10 year old Mac
 - 
      
X forwarding on windows
Installing and configuring cygwin
 - 
      
Scientific poster suggestions
A helpful video
 - 
      
Tinyverse
Complex software dependencies considered harmful
 - 
      
useR 2019 debrief
Interesting talks I saw in Toulouse
 - 
      
R in Docker on Mac
Reproducing valgrind messages using an R-hub image
 - 
      
R package installation on windows considered harmful
Warning for unsuccessful DLL copy should be an error
 - 
      
tikzDevice on windows
Fixing missing packages
 - 
      
future.batchtools
Simple parallel R code on a computer cluster
 - 
      
OpenMP
Simple parallel for loops in C++
 - 
      
gdb with R
how to find line numbers of assertion errors
 - 
      
Eigen and UNDEBUG
Turning on runtime assertion errors for compiled code in R packages
 - 
      
survivalsvm
Support vector machine for survival analysis
 - 
      
Testing PeakSegPipeline on Travis with SLURM
Also batchtools and texinfo
 - 
      
Tweet when donation received
My first google script
 - 
      
Setting default web browser in LXDE
Need to create a .desktop file
 - 
      
Keyboard remapping on windows
Changing caps lock to control on windows
 - 
      
Training Benchmark for Deep neural networks
A new benchmark data set for neural network training
 - 
      
The biglasso package
An on-disk implementation for huge data
 - 
      
True reproducibility in R
The switchr package and manifests
 - 
      
Seasonal temperature variations where I have lived
Using R to download and plot temperature data from wikipedia
 - 
      
Loon
 - 
      
What Science Is
 - 
      
Compiling R
 - 
      
R-GSOC-2017
R project in Google Summer of Code 2017
 - 
      
useR! 2017 debrief
summary of interesting work I saw in Brussels
 - 
      
Combining data tables in R
rbind inside the for loop is much slower than outside
 - 
      
new web site
now more complete and informative
 
    