Toby Dylan Hocking, PhD
Statistical machine learning researcher, focusing on fast, accurate and interpretable optimization algorithms for big data

LASSO lab director, Département d'informatique, Université de Sherbrooke
toby.dylan.hocking@usherbrooke.ca
+1(819)821-8000x65565
Directions to my office, D4-1010-11
Mailing address:
Toby Dylan Hocking
2500 Boulevard de l'Université
Sherbrooke, QC, J1K-2R1, Canada
Jobs
My lab is recruiting master's and PhD students who are interested in working on new statistical models, optimization algorithms, interactive systems, and software for machine learning. If you are interested in joining, please first read my mentorship plan to see what I will expect of you, then read the application instructions.
Brief CV
- 2024-present: Professeur Agrégé / Tenured Associate Professor, Université de Sherbrooke, Département d’informatique.
- 2018-2024: Tenure-Track Assistant Professor, Northern Arizona University, School of Informatics, Computing, and Cyber Systems.
- 2014-2018: Postdoc, McGill University, Human Genetics Department, with Guillaume Bourque.
- 2013: Postdoc, Tokyo Institute of Technology, Computer Science Department, with Masashi Sugiyama.
- 2009-2012: PhD in Mathematics from Ecole Normale Supérieure de Cachan, with Francis Bach and Jean-Philippe Vert.
- 2008-2009: Master's student, Université Paris 6, Statistics Department, research internship at INRA Jouy-en-Josas with Mathieu Gautier and Jean-Louis Foulley.
- 2006-2008: Research assistant at Sangamo BioSciences.
- 2002-2006: Undergraduate, UC Berkeley, double major in Molecular & Cell Biology and Statistics, honors thesis with Terry Speed.
- Full CV, Short Bio.
Research interests: fast, accurate, and interpretable algorithms for learning from large data, using continuous optimization (clustering, regression, ranking, classification) and discrete optimization (changepoint detection, dynamic programming). The main application domains for these algorithms are genomics, neuroscience, medicine, microbiome, cybersecurity, robotics, satellite/sonar imagery, and climate/carbon modeling.
I think reproducible research is important, so with every paper I write, I also provide:
- A reference implementation of the algorithm(s) described in the paper, typically as an R package.
- Source code for doing the analyses and creating the figures, typically in a GitHub repo.
For more info about my research activities, see my publications and software.
See my teaching page for lists of university course materials, university research students supervised, and Google Summer of Code students mentored.
As the leader of Data Table Ninjas, I also provide consulting services related to R data.table, including big data analysis, machine learning, data visualization, and teaching specialized programming classes related to these subjects.
If you want to send me encrypted messages, or verify my signed messages, you can use my GPG public key (updated April 2023, fingerprint 2AD6 F45A 31FF CF13 C3F7 0515 680A A3B7 3AA1 9C4F).
My ORCID is 0000-0002-3146-0865.
News
Mar 17, 2025 | To support our NSF POSE funded project about expanding the open-source ecosystem around R data.table, we will present a 2-hour tutorial at the Rencontres R conference in Mons, Belgium, 10 AM to noon on Monday May 19 (salle informatique Russel), Abstract, Source. We will also present a 20-minute talk (15:50 - 16:10) about animint2, in session “Visualisation et cartographie” (auditoire Van Gogh), 15:50 - 16:50. |
Mar 6, 2025 | Our paper Cross-validation for training and testing co-occurrence network inference algorithms has been published in BMC Bioinformatics, Local PDF. |
Feb 27, 2025 | To support our NSF POSE funded project about expanding the open-source ecosystem around R data.table, we are giving invited talks, Source. Madrid R User Group (27 Feb), Video; Montpellier seminar for Julie Josse’s group (13 March); Paris State of the R meeting (4 April); Zurich Applied Statistics Seminar (8 May); Munich workshop with Bernd Bischl’s group (14-15 May). |
Nov 18, 2024 | To support our NSF POSE funded project about expanding the open-source software ecosystem around R data.table, we will present Using and contributing to the data.table package for efficient big data analysis at the online PyData Global conference, Wednesday, Dec 4, 8:30 AM Eastern. |
Oct 17, 2024 | Our paper Assessment of the Climate Trace global powerplant CO2 emissions has been published in Environmental Research Letters. |