Toby Dylan Hocking, PhD
Statistical machine learning researcher, focusing on fast, accurate and interpretable optimization algorithms for big data
LASSO lab director, Département d'informatique, Université de Sherbrooke
toby.dylan.hocking@usherbrooke.ca
+1(819)821-8000x65565
Directions to my office, D4-1010-11
Mailing address:
Toby Dylan Hocking
2500 Boulevard de l'Université
Sherbrooke, QC, J1K-2R1, Canada
Jobs
My lab is recruiting master's and PhD students who are interested in working on new statistical models, optimization algorithms, interactive systems, and software for machine learning. If you are interested in joining, please first read my mentorship plan to see what I will expect of you, then read the application instructions.
Brief CV
- 2024-present: Professeur Agrégé / Tenured Associate Professor, Université de Sherbrooke, Département d’informatique.
- 2018-2024: Tenure-Track Assistant Professor, Northern Arizona University, School of Informatics, Computing, and Cyber Systems.
- 2014-2018: Postdoc, McGill University, Human Genetics Department, with Guillaume Bourque.
- 2013: Postdoc, Tokyo Institute of Technology, Computer Science Department, with Masashi Sugiyama.
- 2009-2012: PhD in Mathematics from Ecole Normale Supérieure de Cachan, with Francis Bach and Jean-Philippe Vert.
- 2008-2009: Master's student, Université Paris 6, Statistics Department, research internship at INRA Jouy-en-Josas with Mathieu Gautier and Jean-Louis Foulley.
- 2006-2008: research assistant at Sangamo BioSciences.
- 2002-2006: undergraduate, UC Berkeley, double major in Molecular & Cell Biology and Statistics, honors thesis with Terry Speed.
- Full CV, Short Bio.
Research interests: fast, accurate, and interpretable algorithms for learning from large data, using continuous optimization (clustering, regression, ranking, classification) and discrete optimization (changepoint detection, dynamic programming). The main application domains for these algorithms are genomics, neuroscience, medicine, microbiome, cybersecurity, robotics, satellite/sonar imagery, climate/carbon modeling.
I think reproducible research is important, so with every paper I write, I also provide:
- A reference implementation of the algorithm(s) described in the paper, typically as an R package.
- Source code for doing the analyses and creating the figures, typically in a GitHub repo.
For more info, see my publications and software.
If you want to send me encrypted messages, or verify my signed messages, you can use my GPG public key (updated April 2023, fingerprint 2AD6 F45A 31FF CF13 C3F7 0515 680A A3B7 3AA1 9C4F).
My ORCID is 0000-0002-3146-0865.
News
Nov 18, 2024 | To support our NSF POSE funded project about expanding the open-source software ecosystem around R data.table, we will present the session Using and contributing to the data.table package for efficient big data analysis on Wednesday, Dec 4, 8:30 AM, online. |
Aug 5, 2024 | To support our NSF POSE funded project about expanding the open-source software ecosystem around R data.table, Kelly Bodwin plans to talk about “Building Sustainable Open Source Ecosystems: Lessons From the #rstats Community and an NSF Grant” in the Strengthening The R Ecosystem session, Tuesday, Aug 13, 1:00 PM to 2:20 PM, at posit::conf(2024) in Seattle, Washington, USA. |
Jun 20, 2024 | To support our NSF POSE funded project about expanding the open-source software ecosystem around R data.table, we plan to present several talks at useR’2024 in Salzburg, Austria. Paola Corrales and Elio Campitelli will present Tutorial: Efficient Data Analysis with Data.Table on Monday, July 8 at 09:00 - 12:30. Tyson Barrett will present Past, Present, and Future of Data.Table on Tuesday, July 9 at 11:00 - 11:20. Kelly Bodwin will present a keynote on Tuesday, July 9 at 16:30 - 17:30, and host a data.table hackathon afterwards. Doris Afriyie Amoakohene will present Performance Testing and Comparative Benchmarking for Data.Table on Thursday, July 11 at 10:30 - 10:35, video. |
Apr 2, 2024 | To support our NSF POSE funded project about expanding the open-source software ecosystem around R data.table, we plan to present a talk in the session Big-ish Data in R: Efficient tools for large in-memory datasets, Tuesday, Aug 6, 2:00 PM - 3:50 PM, at JSM’2024 in Portland, Oregon. |
Jan 5, 2024 | Our paper Functional Labeled Optimal Partitioning has been published in Journal of Computational and Graphical Statistics. |