## Abstract

Online learning is the process of answering a sequence of questions given knowledge of the correct answers to previous questions and possibly additional available information. Answering questions in an intelligent fashion and being able to make rational decisions as a result is a basic feature of everyday life. Will it rain today (so should I take an umbrella)? Should I fight the wild animal that is after me, or should I run away? Should I open an attachment in an email message or is it a virus? The study of online learning algorithms is thus an important domain in machine learning, and one that has interesting theoretical properties and practical applications.

This dissertation describes a novel framework for the design and analysis of online learning algorithms. We show that various online learning algorithms can all be derived as special cases of our algorithmic framework. This unified view explains the properties of existing algorithms and also enables us to derive several interesting new algorithms.

Online learning is performed in a sequence of consecutive rounds, where at each round the learner is given a question and is required to provide an answer to this question. After the learner predicts an answer, the correct answer is revealed, and the learner suffers a loss if there is a discrepancy between its answer and the correct one. The algorithmic framework for online learning we propose in this dissertation stems from a connection that we make between the notions of *regret* in online learning and *weak duality* in convex optimization.

Regret bounds are the common thread in the analysis of online learning algorithms. A regret bound measures the performance of an online algorithm relative to the performance of a competing prediction mechanism, called a competing hypothesis. The competing hypothesis can be chosen in hindsight from a class of hypotheses, after observing the entire sequence of question-answer pairs.
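
The round-by-round protocol and the notion of regret described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: the finite class of threshold hypotheses, the zero-one loss, the toy data, and the follow-the-leader rule used by the learner (a simple stand-in, not the proposed framework).

```python
from typing import Callable, List, Sequence, Tuple

Hypothesis = Callable[[float], int]


def make_threshold(t: float) -> Hypothesis:
    """Hypothetical hypothesis: answer 1 if the question is at least t, else 0."""
    return lambda x: 1 if x >= t else 0


def zero_one_loss(prediction: int, answer: int) -> int:
    """Loss of 1 when the prediction differs from the correct answer, else 0."""
    return int(prediction != answer)


def run_online(
    rounds: Sequence[Tuple[float, int]],
    hypotheses: List[Hypothesis],
) -> Tuple[int, int, int]:
    """Run the online protocol; return (learner loss, best fixed loss, regret)."""
    cumulative = [0] * len(hypotheses)  # loss of each fixed hypothesis so far
    learner_loss = 0
    for question, answer in rounds:
        # The learner answers using the hypothesis with the smallest loss on
        # past rounds (follow-the-leader; ties go to the lowest index).
        leader = min(range(len(hypotheses)), key=lambda i: cumulative[i])
        prediction = hypotheses[leader](question)
        # The correct answer is revealed and the learner suffers a loss.
        learner_loss += zero_one_loss(prediction, answer)
        # Track every fixed hypothesis for the hindsight comparison.
        for i, h in enumerate(hypotheses):
            cumulative[i] += zero_one_loss(h(question), answer)
    best_in_hindsight = min(cumulative)
    return learner_loss, best_in_hindsight, learner_loss - best_in_hindsight


# Toy question-answer sequence, labeled by a threshold at 0.5.
rounds = [(0.1, 0), (0.6, 1), (0.3, 0), (0.9, 1), (0.4, 0), (0.7, 1)]
hypotheses = [make_threshold(t) for t in (0.25, 0.5, 0.75)]

learner_loss, best_loss, regret = run_online(rounds, hypotheses)
print(f"learner loss={learner_loss}, best fixed loss={best_loss}, regret={regret}")
```

The regret is the learner's cumulative loss minus that of the best fixed hypothesis, where the latter is identified only after the whole sequence has been observed.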
Over the years, competitive analysis techniques have been refined and extended to numerous prediction problems by employing complex and varied notions of progress toward a good competing hypothesis.

We propose a new perspective on regret bounds which is based on the notion of duality in convex optimization. Regret bounds are universal in the sense that they hold for any possible fixed hypothesis in a given hypothesis class. We therefore cast the universal bound as a lower bound for an optimization problem, in which we search for the optimal competing hypothesis. While the optimal competing hypothesis can only be found in hindsight, after observing the entire sequence of question-answer pairs, this viewpoint relates regret bounds to lower bounds of minimization problems.

The notion of duality, commonly used in convex optimization theory, plays an important role in obtaining lower bounds for the minimal value of a minimization problem. By generalizing the notion of Fenchel duality, we are able to derive a dual optimization problem, which can be optimized incrementally as the online learning progresses. The main idea behind our derivation is the connection between regret bounds and Fenchel duality. This connection leads to a reduction from the process of online learning to the task of incrementally ascending the dual objective function.

In order to derive explicit quantitative regret bounds, we make use of the weak duality property, which tells us that the dual objective lower bounds the primal objective. The analysis of our algorithmic framework uses the increase in the dual for assessing the progress of the algorithm. This contrasts with most, if not all, previous work, which analyzed online algorithms by measuring the progress of the algorithm based on the correlation or distance between the online hypotheses and a competing hypothesis. We illustrate the power of our framework by deriving various learning algorithms.
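
The way weak duality turns dual ascent into a regret bound can be sketched as follows. The notation is assumed for illustration and does not appear in the abstract: $g_t$ is the convex loss of round $t$, $f$ is a convex complexity function with trade-off constant $c > 0$, $w_t$ is the learner's hypothesis at round $t$, $\lambda_t$ are dual variables, and $f^\star$, $g_t^\star$ denote Fenchel conjugates. A primal objective over competing hypotheses and one standard form of its Fenchel dual are

$$
P(u) = c\, f(u) + \sum_{t=1}^{T} g_t(u),
\qquad
D(\lambda_1, \ldots, \lambda_T) = -c\, f^\star\!\Big(-\tfrac{1}{c} \sum_{t=1}^{T} \lambda_t\Big) - \sum_{t=1}^{T} g_t^\star(\lambda_t).
$$

Weak duality states that $D(\lambda_1, \ldots, \lambda_T) \le P(u)$ for every choice of dual variables and every $u$. Suppose, as a hypothetical guarantee, that an online algorithm ascends the dual so that after $T$ rounds

$$
D(\lambda_1, \ldots, \lambda_T) \;\ge\; \sum_{t=1}^{T} g_t(w_t) - \Delta_T
$$

for some quantity $\Delta_T$. Chaining the two inequalities gives, for every competing hypothesis $u$,

$$
\sum_{t=1}^{T} g_t(w_t) - \sum_{t=1}^{T} g_t(u) \;\le\; c\, f(u) + \Delta_T,
$$

which is a regret bound obtained purely from the increase in the dual, without measuring distance or correlation to $u$.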
Our framework yields the tightest known bounds for several known online learning algorithms. Despite the generality of our framework, the resulting analysis is more distilled than earlier analyses. The framework also serves as a vehicle for deriving various new algorithms. First, we obtain new algorithms for classic prediction problems by utilizing different techniques for ascending the dual objective. We further propose efficient optimization procedures for performing the resulting updates of the online hypotheses. Second, we derive novel algorithms for complex prediction problems, such as ranking and structured output prediction.

The generality of our approach enables us to use it in the batch learning model as well. In particular, we underscore a primal-dual perspective on boosting algorithms, which enables us to analyze boosting algorithms based on the framework. We also describe and analyze several generic online-to-batch conversion schemes.

The proposed framework is applicable to a wide range of real-world problems. We demonstrate a successful application of the framework in two different domains. First, we address the problem of online email categorization, which serves as an example of a natural online prediction task. Second, we tackle the problem of speech-to-text and music-to-score alignment. The alignment problem is an example of a complex prediction task in the batch learning model.