Assessment of multimedia quality relies heavily on subjective evaluation, typically carried out by human subjects in the form of preferences or continuous ratings. Such data are crucial both for the analysis of different multimedia-processing algorithms and for the validation of objective (computational) quality-assessment methods. To that end, statistical testing provides a theoretical framework for drawing meaningful inferences and making well-grounded conclusions and recommendations. While parametric tests (such as the t test and ANOVA, together with error estimates like confidence intervals) are popular and widely used in the community, there appears to be a certain degree of confusion in their application. Specifically, the assumptions of normality and homogeneity of variance are often not well understood, leading to incorrect application and/or interpretation of the test results. Therefore, the main goal of this paper is to present guidelines for the proper use of statistical tests and, in doing so, address some of the prevailing issues in multimedia quality assessment. These guidelines are derived from a theoretical analysis of the sampling distributions of test statistics and take into account practical aspects of data analysis in this domain. Experimental results on both simulated and real data are presented to support the arguments made. Software that implements the recommendations is also made publicly available, in order to help researchers and practitioners perform correct statistical comparisons of models. © 1999-2012 IEEE.
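As a minimal illustration of the homogeneity-of-variance issue mentioned above, the sketch below compares two simulated groups of subjective scores using Welch's t statistic, which, unlike the classical pooled-variance t test, does not assume equal variances. All names, group sizes, and distribution parameters here are hypothetical and for illustration only; they are not taken from the paper or its released software.

```python
import math
import random

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike the pooled-variance (Student) t test, this variant does not
    assume the two groups have equal variances.
    """
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    # Unbiased sample variances
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    se2 = va / na + vb / nb          # squared standard error of the mean difference
    t = (ma - mb) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Simulated subjective ratings for two hypothetical processing algorithms;
# group B is given a deliberately larger spread (unequal variances).
random.seed(0)
group_a = [random.gauss(3.5, 0.8) for _ in range(24)]
group_b = [random.gauss(3.9, 1.4) for _ in range(24)]

t, df = welch_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df:.1f}")
```

In practice one would obtain a p-value from the t distribution with the computed degrees of freedom (e.g. via `scipy.stats`); the point of the sketch is that the degrees of freedom shrink below the pooled-test value of n_a + n_b - 2 when the variances differ, which is one way such assumption violations are accounted for.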