Gene–environment interactions (G × E) have attracted considerable research interest in the past owing to their scientific and public health implications, but powerful statistical methods are required to successfully track down G × E, particularly at a genome-wide level. Previously, a case-only (CO) design has been proposed as a means to identify G × E with greater efficiency than traditional case–control or cohort studies. However, as with genotype–phenotype association studies themselves, hidden population stratification (PS) can impact the validity of G × E studies using a CO design. Because this problem has received little attention to date, we used comprehensive simulations to systematically assess the type I error rate, power and effect size bias of CO studies of G × E in the presence of PS. Three types of PS were considered, namely genetic-only (PSG), environment-only (PSE), and joint genetic and environmental stratification (PSGE). Our results reveal that the type I error rate of an unadjusted Wald test, appropriate for the CO design, remains close to its nominal level (0.05 in our study) as long as PS involves only one interaction partner (i.e., either PSG or PSE). In contrast, if the study population is stratified with respect to both G and E (i.e., if there is PSGE), then the type I error rate is seriously inflated and estimates of the underlying G × E interaction are biased. Comparison of the CO design to a family-based case–parent design confirmed that the latter is more robust against PSGE, as expected. However, case–parent trios may be particularly unsuitable for G × E studies because they require genotype data from parents, whereas many diseases with an environmental component are likely to be of late onset. An alternative approach to adjusting for PS is principal component analysis (PCA), which has been widely used for this very purpose in past genome-wide association studies (GWAS).
However, resolving genetic PS properly by PCA requires genetic data at the population level, the availability of which would conflict with the basic idea of the CO design. Therefore, we explored three modified Wald test statistics, inspired by the genomic control (GC) approach to GWAS, as an alternative means to allow for PSGE. The modified statistics were benchmarked against a stratified Wald test assuming known population affiliation, which should provide maximum power under PS. Our results confirm that GC is capable of successfully and efficiently correcting the PS-induced inflation of the type I error rate in CO studies of G × E. © 2015, Springer-Verlag Berlin Heidelberg.
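
The idea of a GC-corrected case-only Wald test can be illustrated with a minimal sketch. It assumes a binary genotype indicator G and a binary exposure E, so that the cases form a 2 x 2 table, and it assumes that Wald statistics from presumably null markers are available for estimating the inflation factor. The function names, the table layout and the flooring of the inflation factor at 1 are illustrative assumptions; the sketch shows a generic GC-style rescaling and is not necessarily any of the three modified statistics evaluated in the study.

```python
import math
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1_MEDIAN = 0.4549364


def co_wald(a, b, c, d):
    """Case-only Wald test from a 2 x 2 table of cases (hypothetical layout).

    a: G+ E+, b: G+ E-, c: G- E+, d: G- E- (all counts must be > 0).
    Returns the log odds ratio of G and E among cases, which estimates the
    multiplicative G x E interaction if G and E are independent in the
    source population, together with its squared Wald statistic.
    """
    log_or = math.log((a * d) / (b * c))
    variance = 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d
    return log_or, log_or ** 2 / variance


def gc_lambda(null_stats):
    """Inflation factor: median of null-marker statistics over the
    chi-square(1) median, floored at 1 (a common convention, assumed here)."""
    return max(1.0, median(null_stats) / CHI2_1_MEDIAN)


def chi2_1_pvalue(stat):
    """Upper-tail p-value of a chi-square(1) statistic."""
    return math.erfc(math.sqrt(stat / 2.0))


def gc_adjusted_pvalue(stat, null_stats):
    """P-value after dividing the Wald statistic by the inflation factor."""
    return chi2_1_pvalue(stat / gc_lambda(null_stats))
```

With an inflation factor above 1, the adjusted p-value is larger than the unadjusted one, which is how the rescaling counteracts the PSGE-induced inflation of the type I error rate described above.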