Rule Based Regression
3/27 Department Seminar
Sheng Liu, Ph.D. Candidate in Computer Science, The University of Mississippi
3:00 p.m., March 27, 2013
235 Weir Hall
Title: Rule Based Regression and Feature Selection for Biological Data
Regression is widely used in biological problems involving continuous outcomes. Methods for building regression models range from linear models to more complex nonlinear ones. While linear regression techniques can identify linear correlations between input and output, in many practical applications the relations are nonlinear. Nonlinear regression techniques can model these relations effectively. However, models built with nonlinear techniques are generally harder for humans to interpret, and interpretability is crucial in many problems.
We propose a rule based regression algorithm that uses 1-norm regularized random forests. The proposed approach simultaneously extracts a small number of rules from the generated random forests and eliminates unimportant features. We tested the approach on a seacoast chemical sensors dataset, a Stockori flowering time dataset, and two datasets from the UCI repository. The proposed approach is able to construct a significantly smaller set of regression rules using a subset of attributes while achieving prediction performance comparable to that of random forest regression. It demonstrates high potential in aiding prediction and interpretation of nonlinear relationships in the subject being studied.
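The general idea of selecting rules from a forest with a 1-norm penalty can be illustrated with a small, self-contained sketch. This is not the speaker's algorithm: the synthetic data, the tiny hand-rolled forest of depth-2 trees, the coordinate-descent lasso, and every name and parameter value below (grow, covers, lasso, lam=0.1, 25 trees) are assumptions made only for illustration. Each leaf of each tree becomes one rule, the lasso assigns a weight to each rule, and rules with zero weight are dropped along with any features that appear only in dropped rules.

```python
import random

random.seed(0)

# Synthetic data (assumed for illustration): y depends on feature 0 only.
n, p = 120, 3
X = [[random.random() for _ in range(p)] for _ in range(n)]
y = [(3.0 if row[0] > 0.5 else 0.0) + random.gauss(0, 0.1) for row in X]

def sse(vals):
    """Sum of squared deviations from the mean."""
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals)

def grow(idx, depth, conds, rules):
    """Grow one regression tree; append each leaf's path (a rule) to rules."""
    best = None
    if depth > 0 and len(idx) >= 10:
        for j in random.sample(range(p), 2):  # random feature subset per split
            for t in sorted(X[i][j] for i in idx)[4:-4:4]:  # candidate thresholds
                left = [i for i in idx if X[i][j] <= t]
                right = [i for i in idx if X[i][j] > t]
                if left and right:
                    score = sse([y[i] for i in left]) + sse([y[i] for i in right])
                    if best is None or score < best[0]:
                        best = (score, j, t, left, right)
    if best is None:  # leaf: record the conjunction of conditions leading here
        if conds:
            rules.append(conds)
        return
    _, j, t, left, right = best
    grow(left, depth - 1, conds + [(j, "<=", t)], rules)
    grow(right, depth - 1, conds + [(j, ">", t)], rules)

def covers(rule, row):
    """True if the row satisfies every condition of the rule."""
    return all(row[j] <= t if op == "<=" else row[j] > t for j, op, t in rule)

def lasso(cols, y, lam, sweeps=100):
    """Coordinate descent for (1/2n)*||y - b0 - R w||^2 + lam*||w||_1."""
    n = len(y)
    b0 = sum(y) / n
    w = [0.0] * len(cols)
    r = [yi - b0 for yi in y]  # current residuals
    for _ in range(sweeps):
        shift = sum(r) / n  # unpenalized intercept update
        b0 += shift
        r = [ri - shift for ri in r]
        for j, col in enumerate(cols):
            norm = sum(c * c for c in col) / n
            if norm == 0:
                continue
            rho = sum(c * (ri + c * w[j]) for c, ri in zip(col, r)) / n
            new = max(abs(rho) - lam, 0.0) * (1 if rho > 0 else -1) / norm
            r = [ri + c * (w[j] - new) for c, ri in zip(col, r)]
            w[j] = new
    return b0, w

# 1. Generate rules from a small forest of bootstrapped depth-2 trees.
rules = []
for _ in range(25):
    boot = [random.randrange(n) for _ in range(n)]
    grow(boot, 2, [], rules)

# 2. Encode each rule as a 0/1 column and fit the 1-norm regularized model.
cols = [[1.0 if covers(rule, row) else 0.0 for row in X] for rule in rules]
b0, w = lasso(cols, y, lam=0.1)

# 3. Keep rules with nonzero weight and the features they mention.
selected = [(rules[j], w[j]) for j in range(len(rules)) if abs(w[j]) > 1e-9]
used = sorted({j for rule, _ in selected for j, _, _ in rule})
pred = [b0 + sum(w[j] * cols[j][i] for j in range(len(rules))) for i in range(n)]
mse = sum((a - b) ** 2 for a, b in zip(pred, y)) / n

print(len(rules), "rules generated;", len(selected), "kept; features used:", used)
print("training MSE:", round(mse, 4))
```

A larger penalty lam keeps fewer rules at the cost of a coarser fit. In practice the penalty would be tuned on held-out data rather than fixed as it is here.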
Sheng Liu received his Bachelor of Science in Biochemistry from Wuhan University and his Master of Science in Computer Science from the University of Mississippi. He is now a Doctor of Philosophy student in Computer Science in the Department of Computer and Information Science at the University of Mississippi. His research interests include machine learning and bioinformatics/computational biology.