Comment on ``On Discriminative vs. Generative Classifiers: A Comparison of Logistic Regression and Naive Bayes''

## Abstract

Comparison of generative and discriminative classifiers is a long-standing topic. Based on their theoretical and empirical comparisons between the na\"{i}ve Bayes classifier and linear logistic regression, Ref.~\citeauthor{Ng:2001} claimed that there exist two distinct regimes of performance between the generative and discriminative classifiers with regard to the training-set size. However, our empirical and simulation studies, as presented in this paper, suggest that the existence of these two distinct regimes cannot be reliably claimed. In addition, for real-world datasets, there is so far no theoretically correct, general criterion for choosing between the discriminative and the generative approaches to classification of an observation $\mathbf{x}$ into a class $y$; the choice depends on the relative confidence we have in the correctness of the specification of either $p(y|\mathbf{x})$ or $p(\mathbf{x}, y)$. This may partly explain why Ref.~\citeauthor{Efron:1975} and~\citeauthor{ONeill:1980} prefer LDA while other empirical studies may prefer linear logistic regression instead. Furthermore, we suggest that the pairing of either LDA assuming a common diagonal covariance matrix (LDA-$\Lambda$) or the na\"{i}ve Bayes classifier with linear logistic regression may not be perfect. Hence any claim derived from the comparison between LDA-$\Lambda$ or the na\"{i}ve Bayes classifier and linear logistic regression may not reliably be generalised to all generative and discriminative classifiers.
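
To make the pairing of LDA and linear logistic regression concrete, the standard textbook argument can be sketched as follows. The notation is an assumption introduced here for illustration and does not come from the abstract: two classes $y \in \{0, 1\}$ with priors $\pi_0, \pi_1$, class means $\boldsymbol{\mu}_0, \boldsymbol{\mu}_1$, and a common covariance matrix $\Sigma$, so that $\mathbf{x} \mid y = k \sim \mathcal{N}(\boldsymbol{\mu}_k, \Sigma)$. Under this generative specification of $p(\mathbf{x}, y)$, the log-odds are linear in $\mathbf{x}$:

$$
\log \frac{p(y = 1 \mid \mathbf{x})}{p(y = 0 \mid \mathbf{x})}
= \log \frac{\pi_1}{\pi_0}
- \frac{1}{2} (\boldsymbol{\mu}_1 + \boldsymbol{\mu}_0)^{\top} \Sigma^{-1} (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0)
+ \mathbf{x}^{\top} \Sigma^{-1} (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0).
$$

Linear logistic regression specifies the same functional form directly for $p(y \mid \mathbf{x})$, without any assumption on $p(\mathbf{x} \mid y)$:

$$
\log \frac{p(y = 1 \mid \mathbf{x})}{p(y = 0 \mid \mathbf{x})} = \beta_0 + \boldsymbol{\beta}^{\top} \mathbf{x}.
$$

Taking $\Sigma = \Lambda$ diagonal gives the LDA-$\Lambda$ case. The two models therefore share a linear decision boundary but differ in which distribution is specified and which likelihood is maximised, which is why the choice between them hinges on how much confidence one has in each specification.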