Objective video quality assessment (VQA) is the use of computational models to evaluate video quality in line with the perception of the human visual system (HVS). It is challenging because of the underlying complexity and the relatively limited understanding of the HVS and its intricate mechanisms. Three important issues arise in objective VQA in comparison with image quality assessment: 1) temporal factors, in addition to spatial ones, need to be considered; 2) the contribution of each factor (spatial and temporal), and of their interaction, to the overall video quality needs to be determined; and 3) the computational complexity of the resulting method must be kept manageable. In this paper, we tackle the first issue by exploiting a worst-case pooling strategy and the variations of spatial quality along the temporal axis, with proper analysis and justification. The second issue is addressed through machine learning; we believe this approach to be more convincing since the relationship between the factors and the overall quality is derived via training on substantial ground truth (i.e., subjective scores). Experiments conducted on publicly available video databases show the effectiveness of the proposed full-reference (FR) algorithm in comparison with relevant existing VQA schemes. Emphasis has also been placed on demonstrating the robustness of the proposed method to new and untrained data. To that end, cross-database tests have been carried out to provide a proper perspective on the performance of the proposed scheme as compared to other VQA methods.
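To make the pooling idea concrete, the following minimal Python sketch illustrates one plausible reading of worst-case temporal pooling combined with a learned mapping to subjective quality. The two features, the 10% worst-frame fraction, and the use of support vector regression are assumptions for illustration only and are not taken from the paper.

```python
import numpy as np
from sklearn.svm import SVR

def temporal_features(frame_scores, worst_fraction=0.1):
    """Pool per-frame spatial quality scores into video-level features.

    Both features and `worst_fraction` are illustrative choices, not
    necessarily the exact formulation used in the paper.
    """
    scores = np.asarray(frame_scores, dtype=float)
    k = max(1, int(np.ceil(worst_fraction * scores.size)))
    worst_case = np.sort(scores)[:k].mean()       # emphasis on the worst frames
    variation = np.abs(np.diff(scores)).mean()    # temporal variation of spatial quality
    return np.array([worst_case, variation])

# Hypothetical training data: per-frame scores for several videos and their subjective scores.
train_frame_scores = [np.random.rand(250) for _ in range(8)]
train_mos = np.random.uniform(1, 5, size=8)

X = np.vstack([temporal_features(s) for s in train_frame_scores])
model = SVR(kernel="rbf").fit(X, train_mos)       # learn the feature-to-quality mapping

# Predict the quality of a new (hypothetical) video from its per-frame scores.
predicted_quality = model.predict(temporal_features(np.random.rand(250)).reshape(1, -1))
```

In this sketch, the regressor plays the role described in the abstract: rather than fixing the relative weights of the spatial and temporal factors by hand, their combination is learned from subjective ground truth.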