CAG repeat expansion causes a growing list of neurodegenerative disorders, especially hereditary ataxias. However, the genes responsible for 10-50% of clinically diagnosed ataxias, depending on the population, are still unidentified. Traditional linkage and repeat expansion detection methods, complemented with human genome sequence and expression information, can now accelerate the identification of putative disease candidates. We analyzed two CAG repeat-containing loci expressed in the brain, human SMARCA2 and THAP11, as putative candidates for spinocerebellar ataxias (SCAs), using both computational and polymorphism scanning approaches. Both loci exhibited features characteristic of genes associated with repeat disorders. Both are polymorphic with respect to repeat size and interruption pattern in the Indian population. Furthermore, computational analysis of the glutamine stretch-embedded domains in the respective proteins predicted these regions to be "natively unfolded" beyond a threshold of 40 glutamines. Comparative genome analysis suggested a stabilizing influence of CAA interspersions in the repeat tract of THAP11 but not of SMARCA2. Although no repeat expansion was detected within these genes in patients with unidentified ataxias reported in India, we suggest that these loci be screened in other populations, because the prevalence of these disorders varies widely between populations. © The Japan Society of Human Genetics and Springer-Verlag 2004.