Image annotation (tagging) and classification play a critical role in many computer vision applications, such as image retrieval, scene understanding, and scene description. While databases such as ImageNet provide high-quality labels for images, many real-world images have missing labels or lack tags that completely describe their contents. To address this problem, we work on the hypothesis that class and tag information are correlated, and we propose a joint optimization for image classification and annotation. We construct a unified cost function to learn the class scoring vectors as well as the tag scoring vectors. The proposed approach achieves state-of-the-art results on benchmark datasets for joint tag prediction and classification. © 2018 IEEE.