Diabetic retinopathy (DR) is a major contributor to the global increase in vision loss. Studies show that timely treatment can significantly reduce its incidence. It is therefore essential to distinguish the stages and severity of DR so that the appropriate medical attention can be recommended. To this end, this paper presents DRISTI (Diabetic Retinopathy classIfication by analySing reTinal Images), a hybrid deep learning model composed of VGG16 and a capsule network, which yields a statistically significant performance improvement over the state of the art. To validate this claim, we report detailed experimental and ablation studies. We also created an augmented dataset to increase the size of the APTOS dataset and to test the robustness of the model. On the expanded dataset, the five-class training and validation accuracies are 99.21% and 75.50%, respectively. On augmented APTOS, the two-class training and validation accuracies are 99.96% and 97.05%, respectively. Extending the two-class model to the mixed dataset gives training and validation accuracies of 99.92% and 91.43%, respectively. We also performed cross-dataset and mixed-dataset testing to demonstrate the efficiency of DRISTI. © 2021, The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature.