## Abstract

We examine linear regression problems where the features may only be observable at some cost (e.g., in medical or financial domains where features may correspond to diagnostic tests or information-gathering that takes time and costs money). To do this, we define a \emph{parsimonious} linear regression objective criterion that jointly minimizes prediction error and feature cost, assuming they can be expressed in commensurable units. Formally, this objective results in an unconstrained non-convex optimization problem that can be recast as a mixed 0-1 integer quadratic program (MIQP). While this MIQP can be solved using off-the-shelf software, such approaches typically cannot scale to large numbers of features. Noting that a linear regression model in this setting incurs a feature cost for every feature with a non-zero weight, we modify least angle regression algorithms commonly used for sparse linear regression (with non-costly features) to produce the ParLiR algorithm. ParLiR not only provides an efficient and parsimonious solution to linear regression with costly features, as we demonstrate empirically, but it also comes with formal guarantees on parsimony that we prove theoretically.
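To make the objective concrete, here is a minimal sketch (with illustrative variable names, not the paper's implementation) of a parsimonious criterion that adds the total cost of features receiving non-zero weight to the squared prediction error:

```python
import numpy as np

def parsimonious_objective(w, X, y, costs):
    """Squared prediction error plus the cost of every feature
    with a non-zero weight. Names and form are illustrative;
    the paper recasts this non-convex objective as an MIQP."""
    prediction_error = np.sum((X @ w - y) ** 2)
    feature_cost = np.sum(costs[w != 0])  # cost is incurred only for used features
    return prediction_error + feature_cost

# Toy example: two redundant features, the second one expensive to observe.
X = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
y = np.array([1.0, 2.0, 3.0])
costs = np.array([0.1, 5.0])

# Both weight vectors fit y exactly, but using only the cheap
# feature yields a lower (more parsimonious) objective value.
cheap = parsimonious_objective(np.array([1.0, 0.0]), X, y, costs)
both = parsimonious_objective(np.array([0.5, 0.5]), X, y, costs)
```

Here `cheap` and `both` achieve the same zero prediction error, so the objective prefers the solution that avoids paying for the expensive second feature.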