My main contributions to free/open-source software are R packages that implement the methods described in my research papers (see below).
- Since 2012, I have been a co-administrator and mentor for the R project in Google Summer of Code, where I help teach college students all over the world how to write R packages. Because of this work, the R Foundation gave me the firstname.lastname@example.org email address.
- I was president of the organizing committee for “R in Montreal 2018,” a local conference for useRs and developeRs of R.
- I am an editor for the Journal of Statistical Software.
The PeakSeg R packages contain algorithms for inferring optimal segmentation models subject to the constraint that up changes must be followed by down changes, and vice versa. This ensures that the model can be interpreted in terms of peaks (after up changes) and background (after down changes).
- PeakSegDP provides a heuristic quadratic-time algorithm for computing models from 1 to S segments for a single sample. This was the original algorithm described in our ICML’15 paper, but it is neither fast nor optimal, so in practice we recommend using the newer packages below instead.
- PeakSegOptimal provides log-linear time algorithms for computing optimal models with multiple peaks for a single sample. arXiv:1703.03352
- PeakSegDisk provides an on-disk implementation of optimal log-linear algorithms for computing multiple peaks in a single sample (same as PeakSegOptimal but works for much larger data sets because disk is used for storage instead of memory). arXiv:1810.00117
- PeakSegJoint provides a fast heuristic algorithm for computing models with a single common peak in 0,…,S samples. arXiv:1506.01286
- PeakSegPipeline provides a supervised machine learning pipeline for genome-wide peak calling in multiple samples and cell types, as described in our PSB’20 paper.
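
The up/down constraint described above can be illustrated with a minimal Python sketch. This is not code from any of the PeakSeg packages: the function names (`change_directions`, `satisfies_up_down`, `peak_segments`) are hypothetical, and the sketch assumes a fitted model is given simply as a list of segment means. It only checks the constraint and reads off the peaks; it does not compute an optimal segmentation.

```python
def change_directions(means):
    """Return +1 for each up change and -1 for each down change
    between consecutive segment means (hypothetical helper)."""
    directions = []
    for prev, curr in zip(means, means[1:]):
        if curr == prev:
            raise ValueError("adjacent segments must have different means")
        directions.append(1 if curr > prev else -1)
    return directions


def satisfies_up_down(means):
    """True when every up change is followed by a down change and vice versa."""
    directions = change_directions(means)
    return all(a != b for a, b in zip(directions, directions[1:]))


def peak_segments(means):
    """Return the indices (0-based) of segments that follow an up change.

    Under the constraint, these are the segments interpreted as peaks;
    segments that follow a down change are interpreted as background.
    """
    if not satisfies_up_down(means):
        raise ValueError("segment means violate the up/down constraint")
    return [i + 1 for i, d in enumerate(change_directions(means)) if d == 1]


# Illustrative segment means, not real data.
valid_model = [1.0, 5.0, 2.0, 7.0, 0.5]
invalid_model = [1.0, 3.0, 5.0, 2.0]  # two up changes in a row

print(satisfies_up_down(valid_model))    # True
print(peak_segments(valid_model))        # [1, 3]
print(satisfies_up_down(invalid_model))  # False
```

In the valid model, segments 1 and 3 are peaks because each follows an up change, while segments 2 and 4 follow down changes and are background.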
To support our Bioinformatics (2017) paper about a labeling method for supervised peak detection, we created the R package PeakError, which computes the number of incorrect labels for a given set of predicted peaks.
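
As a rough illustration of what counting incorrect labels can look like, here is a simplified Python sketch. It is not the PeakError implementation or its API. It assumes only two label types, named `"peaks"` and `"noPeaks"` here for illustration, and the function names and the half-open interval convention are likewise assumptions. A `"peaks"` label counts as incorrect when no predicted peak overlaps it, and a `"noPeaks"` label counts as incorrect when any predicted peak overlaps it.

```python
def overlaps(peak, region):
    """True when two half-open intervals [start, end) share a position."""
    return peak[0] < region[1] and region[0] < peak[1]


def count_incorrect_labels(peaks, labels):
    """Count labels that disagree with the predicted peaks.

    peaks:  list of (start, end) tuples for predicted peaks.
    labels: list of (start, end, annotation) tuples, where annotation
            is "peaks" or "noPeaks" (a simplifying assumption).
    """
    errors = 0
    for start, end, annotation in labels:
        has_peak = any(overlaps(p, (start, end)) for p in peaks)
        if annotation == "peaks":
            if not has_peak:
                errors += 1  # false negative: expected a peak, found none
        elif annotation == "noPeaks":
            if has_peak:
                errors += 1  # false positive: expected no peak, found one
        else:
            raise ValueError(f"unknown annotation: {annotation!r}")
    return errors


# Illustrative coordinates, not real data.
predicted = [(100, 200), (500, 600)]
labels = [
    (50, 150, "peaks"),     # correct: overlaps the first peak
    (250, 400, "noPeaks"),  # correct: no peak in this region
    (450, 550, "noPeaks"),  # incorrect: overlaps the second peak
    (700, 800, "peaks"),    # incorrect: no peak in this region
]
print(count_incorrect_labels(predicted, labels))  # 2
```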
To support our paper about elastic net regularized interval regression models (in preparation), we created the iregnet R package.