PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

μTOSS - Multiple hypotheses testing in an open software system
Thorsten Dickhaus, Gilles Blanchard, Niklas Hack, Frank Konietschke, Kornelius Rohmeyer, Jonathan Rosenblatt, Marsel Scheer and Wiebke Werft
In: 2nd Joint Statistical Meeting of the Deutsche Arbeitsgemeinschaft Statistik, "Statistics under one umbrella" (DAGStat 2010), Dortmund, Germany (2010).


Multiple hypotheses testing has emerged as one of the most active research fields in mathematical statistics over the last 10-15 years, driven especially by large-scale applications, e.g. in genomics, proteomics, and cosmology. New error criteria such as the now widely used "false discovery rate" (FDR) have been developed, and most research in this field has been implemented more or less directly in individual software. It is fair to say that, up to now, every research group uses its own implementation, making (simulation) study evaluations and related results not fully comparable. Moreover, the spread of newly emerging methods is hindered by the lack of a common software platform to agree on. Following a suggestion by Yoav Benjamini, we present an R-based, open software framework for multiple hypotheses testing called "$\mu$TOSS", sponsored by the PASCAL2 European Network of Excellence and realized at Berlin Institute of Technology.

General key assets of the $\mu$TOSS system are:
- open-source implementation (using R)
- well-documented developer interfaces for adding new procedures
- a graphical user interface
- an online user's guide on which procedure to use according to the user's specification of the test problem
- inclusion of a large part of the known MCP methods
- inclusion of testbed datasets for verification and exemplary purposes
- ongoing maintenance

The components of the $\mu$TOSS system provide (i) multiple tests controlling the family-wise error rate (single-step and stepwise rejective methods, resampling-based procedures), (ii) multiple tests controlling the false discovery rate (classical and data-adaptive frequentist methods as well as Bayesian approaches and resampling-based techniques), (iii) multiplicity-adjusted simultaneous confidence intervals, and (iv) modules for planning and evaluating studies with adaptive designs. The system will be exemplified with real-life datasets.
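To make the two error criteria concrete, the following is a minimal, self-contained sketch (in Python, for illustration only; $\mu$TOSS itself is implemented in R) of one classical procedure of each kind mentioned above: the Holm step-down procedure, which controls the family-wise error rate, and the Benjamini-Hochberg step-up procedure, which controls the false discovery rate. The function names and example p-values are ours, not part of the $\mu$TOSS API.

```python
def holm(pvalues, alpha=0.05):
    """Holm step-down procedure (FWER control).

    Walk through the sorted p-values and reject sequentially while
    p_(k) <= alpha / (m - k + 1); stop at the first failure.
    Returns a list of booleans in the original order (True = reject H0).
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha / (m - rank + 1):
            reject[idx] = True
        else:
            break  # step-down: once one hypothesis survives, all later ones do
    return reject


def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (FDR control).

    Find the largest k with p_(k) <= (k/m) * alpha and reject the
    hypotheses belonging to the k smallest p-values.
    Returns a list of booleans in the original order (True = reject H0).
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            k_max = rank  # step-up: keep the largest qualifying rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            reject[idx] = True
    return reject


# Illustrative p-values (hypothetical data, already sorted for readability).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
print(holm(pvals))                # rejects 1 hypothesis
print(benjamini_hochberg(pvals))  # rejects 2 hypotheses
```

As the example shows, the FDR criterion is less conservative than FWER control: with the same p-values, Benjamini-Hochberg rejects more hypotheses than Holm, which is exactly why the FDR has become popular in large-scale screening applications.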

EPrint Type: Conference or Workshop Item (Talk)
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics;
Learning/Statistics & Optimisation
ID Code: 7866
Deposited By: Thorsten Dickhaus
Deposited On: 17 March 2011