Academic Conferences
20241105: Two lower bounds for regression trees
Date: 2024-11-02


Time: Tuesday, November 5, 2024, 2:00-3:00 PM

Venue: Room 1016, Mingde Main Building

Title: Two lower bounds for regression trees


Abstract

Regression trees and their ensembles are among the most popular and important machine learning models and are especially useful for high-dimensional data. Most previous theoretical work has focused on deriving upper bounds for the estimation error of a true regression function. In this talk, we instead present two lower bounds for regression trees that demonstrate their weaknesses. The first lower bound shows that any regression tree is inefficient at fitting functions with additive structure, thereby demonstrating limitations of the representation power of regression trees. The second lower bound shows that greedy regression trees and random forests suffer from the curse of dimensionality when fitted to functions that have low-to-no “marginal signal”, demonstrating limitations of greedy optimization. These lower bounds suggest ways in which regression tree algorithms can be improved while preserving their strengths.
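As a rough illustration of what "low-to-no marginal signal" can mean for a greedy splitting rule, the toy sketch below computes the variance reduction of every single-feature split on the full Boolean hypercube. It is not the speaker's construction or proof: the XOR target, the helper names `variance` and `split_gain`, and the choice `d = 4` are all assumptions made for this example. For the XOR target, every first split has zero gain, so a greedy criterion has nothing to distinguish relevant features from irrelevant ones; for an additive target, the relevant features stand out immediately.

```python
from itertools import product


def variance(values):
    """Population variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def split_gain(points, targets, feature):
    """Variance reduction from splitting on one binary feature."""
    left = [y for x, y in zip(points, targets) if x[feature] == 0]
    right = [y for x, y in zip(points, targets) if x[feature] == 1]
    n = len(targets)
    weighted = (len(left) * variance(left) + len(right) * variance(right)) / n
    return variance(targets) - weighted


d = 4  # hypothetical dimension; features 2 and 3 are irrelevant to both targets
points = list(product([0, 1], repeat=d))

# Target with no marginal signal: depends on x0 and x1 only through their XOR.
xor_targets = [x[0] ^ x[1] for x in points]
# Additive target on the same two features, for comparison.
additive_targets = [x[0] + x[1] for x in points]

xor_gains = [split_gain(points, xor_targets, j) for j in range(d)]
additive_gains = [split_gain(points, additive_targets, j) for j in range(d)]

print("XOR gains per feature:     ", xor_gains)
print("Additive gains per feature:", additive_gains)
```

In this toy setting the two relevant features are indistinguishable from the irrelevant ones at the first split of the XOR target, which is the kind of situation in which a greedy search has to guess among all features.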


About the Speaker

Yanshuo Tan is currently an assistant professor at the Department of Statistics and Data Science at the National University of Singapore. He was previously a Neyman Visiting Assistant Professor at UC Berkeley's Statistics Department, where he was fortunate to be advised by Bin Yu. He did his PhD in Mathematics at the University of Michigan, where he was fortunate to be advised by Roman Vershynin. His current research is on statistical machine learning, focusing on the theory, methodology and applications of modelling with decision trees and tree ensembles.