Extended the evaluation of BalDRO, a robust optimisation-based machine unlearning method, in two directions: testing alternative forget objectives (WGA and TNPO) and running experiments across four additional model families (Llama-3.2-1B, Llama-3.1-8B, Qwen3-8B, Mistral-7B). Using the TOFU benchmark, the clearest result is that BalDRO + NPO substantially recovers forgetting on Llama-3.1-8B where plain NPO nearly fails. Results on other models and objectives are mixed and preliminary.
Fit log-normal linear mixed models (LMMs) to ecological camera trap data from two sites in Germany, modelling distance to the nearest settlement as a function of landscape features and predator counts. Compared three estimation approaches (lme4 REML, blme MAP Bayesian, and brms full MCMC), all producing consistent results. Distance to the nearest road was the most strongly associated predictor.
Applied elastic net regularization for variable selection, then compared logistic regression, random forests, and SVMs for binary classification of successful bird/bat carcass detections near Spanish wind farms. Using real ecological field data (~400 observations), the analysis finds that dog searchers are approximately 10× more likely to successfully detect a carcass than human searchers, making searcher type the dominant predictor across all three models.
Built and deployed a full-stack R Shiny web application for Bayesian diagnostic model comparison, developed during a research assistantship at the University of Toronto. The app enables researchers to run ROC/AUC analysis and relative belief ratio inference across multiple models and sampling regimes without writing code.
Compared gradient boosting, bagging, and random forests for distinguishing two Turkish raisin varieties (Kecimen and Besni) from 7 image-derived morphological features. All three methods achieve ~80–82% accuracy; bagging and random forests outperform boosting. Major axis length and perimeter are the most discriminative features.
Analyzed GeoTIFF crop inventory data from Agriculture and Agri-Food Canada across Saskatchewan and Manitoba (2009–2013), tracking land use for corn and soybeans. Corn production increased roughly 7-fold over the period; most growth occurred in Manitoba. An interactive Shiny app extends the analysis to additional crop types with user-selectable plots and colour schemes.
Clustered 721 Pokémon species by six battle stats (HP, Attack, Defense, Special Attack, Special Defense, Speed) using agglomerative clustering, K-means, and PCA-based hierarchical clustering. All three methods consistently suggested two clusters, with Rand indices above 0.85 across all pairwise comparisons. Ward's linkage was the most stable across linkage types.
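For readers unfamiliar with the Rand index used to compare the clusterings, the following is a minimal sketch of how it can be computed for two label vectors. It is not the project's code: the function name `rand_index` and the toy labels are illustrative assumptions, and the original analysis may have used a library routine instead.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Unadjusted Rand index: the fraction of item pairs on which two
    clusterings agree (both place the pair together, or both apart).

    Hypothetical helper for illustration only.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items to form a pair")
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Toy labels (made up): two 2-cluster solutions over five items.
clustering_1 = [0, 0, 1, 1, 1]
clustering_2 = [1, 1, 0, 0, 1]
score = rand_index(clustering_1, clustering_2)
```

The index depends only on which items are grouped together, so swapping cluster names (0 for 1) does not change it. A value of 1.0 means the two partitions are identical.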
Compared elastic net regularization (ENET) and Bayesian additive regression trees (BART) for predicting vegan ice cream ratings from U.S. survey respondents (n = 274). Both models achieved an MSE of 0.91 on the vegan ice cream test set. Attitudes toward meat consumption, racial identity, and political orientation were the strongest predictors; ENET was preferred given comparable accuracy and much lower computational cost.