The Gene Ontology (GO) is a comprehensive resource describing the properties of gene products and the relationships among them. A similarity measure between two gene products can be defined using GO, and the resulting similarity score can be treated as the likelihood that the two products physically interact. However, GO is updated regularly through the addition of new terms and the removal or merging of obsolete ones. Therefore, the similarity score for a given interaction may differ from one version of GO to another. In this paper, we systematically study the impact of the continuous evolution of GO on the performance of similarity measures for the task of scoring the confidence of protein–protein interactions (PPIs). We find that the performance of a similarity measure is affected by the continuous evolution of GO. We further observe that the degree of robustness of a similarity measure is highly influenced by the particular setting considered. © 2020, Springer Nature Singapore Pte Ltd.