The successful application of positional cloning, the process of disease susceptibility gene identification using gene mapping techniques, to identify the gene underlying cystic fibrosis (1) opened the era of gene hunting. The subsequent identification of susceptibility genes underlying numerous monogenic disorders led to the possibility that susceptibility genes for complex diseases could also be identified using the positional cloning approach. This, coupled with advances in both genotyping technology and the identification of large numbers of microsatellite markers across the genome, led to numerous genome-wide scans to identify susceptibility genes for various forms of diabetes. However, despite the genetic screening of genes known to be involved in the biology of diabetes (candidate genes) and positional cloning approaches, to date, few susceptibility genes for diabetes have been identified. Even less is known about the genetic basis for gestational diabetes mellitus (GDM). The inability to readily identify susceptibility genes for diabetes can be attributed to a variety of issues, including insufficient statistical power, etiologic heterogeneity, and the confounding effect of interactions with environmental factors. These same problems will likely apply to GDM, which appears to represent early stages of many forms of diabetes outside of pregnancy. Here, we briefly review the current state of knowledge regarding the genetics of diabetes and discuss specific issues regarding the genetics of GDM.
GENETICS OF DIABETES
Type 1 diabetes and rare/monogenic forms of diabetes
The HLA region on chromosome 6 was identified very early on as a major susceptibility locus for type 1 diabetes (2–4), with haplotypes within the HLA region accounting for as much as 50% of cases of type 1 diabetes in Caucasians (4). While the contribution of HLA to genetic susceptibility to type 1 diabetes was readily identified, numerous genome scans have also identified at least 16 additional loci across the genome that may harbor susceptibility genes for type 1 diabetes (5). However, identification of the specific genes underlying these linkage regions has proven to be difficult, partly because of the large genetic contribution of HLA alleles.
Greater success has been found in identifying susceptibility genes for monogenic (6–10), rare (11,12), or syndromic (13) forms of diabetes. Rare variation in the insulin gene results in an autosomal dominant form of diabetes (11), and the 3,243 A-G mutation in the mitochondrial tRNA (Leu-UUR) gene results in maternally inherited diabetes and deafness syndrome (12,14). Susceptibility genes identified in syndromic forms of diabetes include the Wolfram syndrome gene (WFS1) and the translation initiation factor 2-α kinase-3 gene (EIF2AK3), mutations in which result in Wolcott-Rallison syndrome. Genes underlying susceptibility for the six different forms of maturity-onset diabetes of the young (MODY) have been identified (6–10,15). MODY is a form of diabetes characterized by an early age of onset, autosomal dominant inheritance, and β-cell dysfunction in the absence of insulin resistance (15). All six MODY genes are associated with the pancreatic β-cell, either altering transcriptional regulation or possibly altering β-cell mass or turnover (15).
The physiological consequence of genetic variation in MODY genes was characterized in a series of elegant studies by Polonsky and colleagues (16–18). Insulin secretory response to intravenous glucose infusion was assessed in nondiabetic patients and patients with known MODY variants in glucokinase (GCK, MODY2), hepatocyte nuclear factor-1α (HNF1A, MODY3), and hepatocyte nuclear factor-4α (HNF4A, MODY1). These studies demonstrated that the insulin secretory dose-response in patients with MODY susceptibility variants in GCK was less responsive and generally right-shifted compared with nondiabetic patients. In contrast, patients with MODY susceptibility variants in HNF1A or HNF4A appeared to have a normal secretory dose-response at lower glucose concentrations, but had maximal secretory responses that were substantially blunted compared with nondiabetic patients. Interestingly, the dose-response curves for HNF1A and HNF4A appeared to have similar characteristics, suggesting that these variants may have similar effects on the β-cell and overall insulin secretory response. However, subsequent studies in which patients were exposed to prolonged hyperglycemia, brought about by a 42-h glucose infusion, revealed a difference: the increased insulin secretion normally observed after prolonged hyperglycemia, the so-called “priming effect” (19), was present, although blunted, in patients with HNF1A variants but absent in patients with HNF4A variants.
Type 2 diabetes
One of the first type 2 diabetes susceptibility genes was identified not by linkage analysis, but by association (20). Deeb et al. (20) observed association between the Pro12Ala polymorphism in peroxisome proliferator–activated receptor-γ and BMI and insulin sensitivity in a sample of Finnish subjects. Individuals with at least one copy of Ala had a lower BMI and a higher insulin sensitivity compared with individuals homozygous for Pro. In addition, they observed a significantly increased risk of type 2 diabetes (odds ratio = 4.25) for individuals homozygous for Pro in a sample of Japanese Americans. The lower frequency Ala allele appeared to play a “protective” role against type 2 diabetes, since the Ala allele was also associated with a lower transactivation of response elements and therefore higher insulin sensitivity (20).
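Because odds ratios recur throughout this review, a minimal sketch of how one is computed from a case-control 2×2 table may be useful. The counts below are hypothetical and are not taken from Deeb et al. (20) or any other cited study; the function name and the use of the Woolf (log-based) confidence interval are our assumptions for illustration.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-scale) confidence interval for a 2x2 table.

    a: cases with the risk genotype      b: cases without it
    c: controls with the risk genotype   d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical counts, for illustration only
estimate, lower, upper = odds_ratio_ci(a=80, b=20, c=60, d=40)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1 corresponds to nominal significance at the chosen level; with small cell counts an exact method would be preferable to this approximation.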
Peroxisome proliferator–activated receptor-γ is a nuclear receptor involved in adipocyte differentiation (21) and a target for the thiazolidinedione class of insulin-sensitizing drugs (22), making it an attractive candidate gene for type 2 diabetes. The initial report by Deeb et al. (20) led many groups to assess the Pro12Ala polymorphism, resulting in a variety of positive and negative associations. This led to some question as to whether Pro12Ala was a true diabetes susceptibility variant. However, this was resolved by a meta-analysis performed by Altshuler et al. (23). In their analysis of over 3,000 subjects, Altshuler et al. observed a modest but significant genotype relative risk (RR = 1.25) for type 2 diabetes associated with the Pro allele (23). When these results were combined with previously published reports in a meta-analysis, a significant protective effect for the Ala allele with a risk ratio of 0.79 was observed (23).
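To illustrate how a meta-analysis can reconcile a mixture of positive and negative reports, the sketch below pools study-level odds ratios with fixed-effect inverse-variance weighting on the log scale. This is one common pooling method and is not necessarily the method used by Altshuler et al. (23); the three input studies are hypothetical and chosen only to show that individually inconclusive results can yield a significant combined estimate.

```python
import math

def pool_fixed_effect(studies, z=1.96):
    """Fixed-effect inverse-variance pooling of odds ratios.

    studies: list of (odds_ratio, ci_lower, ci_upper) tuples, where the
    interval is assumed to be a 95% interval symmetric on the log scale.
    Returns (pooled_or, ci_lower, ci_upper).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for odds_ratio, ci_lower, ci_upper in studies:
        # Recover the standard error of log(OR) from the interval width
        se_log = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
        weight = 1 / se_log ** 2
        weighted_sum += weight * math.log(odds_ratio)
        total_weight += weight
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )

# Hypothetical studies of a protective allele; the first two intervals span 1
hypothetical_studies = [(0.70, 0.45, 1.09), (0.85, 0.60, 1.20), (0.80, 0.65, 0.98)]
pooled, low, high = pool_fixed_effect(hypothetical_studies)
print(f"pooled OR = {pooled:.2f} (95% CI {low:.2f}-{high:.2f})")
```

A fixed-effect model assumes a single underlying effect across populations; where effects may truly differ by ethnic group, a random-effects model would be a more cautious choice.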
Over 25 genome-wide linkage scans for type 2 diabetes have been completed to date (24). Despite the large number of studies, very few regions of the genome showed common evidence for linkage across studies. Regions showing the greatest replication among studies include chromosomes 1q, 12q, and 20q. Initially, seven studies showed overlapping evidence for linkage to chromosome 1q (24). This led to the formation of the International Type 2 Diabetes 1q Consortium, in which investigators from multiple studies are pooling samples, resources, and effort to fine-map susceptibility genes underlying this region.
One of the first genome-wide scans to be published came from Hanis et al. (25), who reported linkage results from Mexican Americans from Starr County, TX. The strongest linkage signal was observed on chromosome 2q, with the locus being dubbed NIDDM1. Subsequent analyses and fine-mapping efforts of this region led to the first type 2 diabetes susceptibility gene to be identified via positional cloning, calpain-10 (CAPN10), a ubiquitously expressed member of the calpain-like cysteine protease family (26). Three single-nucleotide polymorphisms (SNPs), SNP-43, -19, and -63, formed a high-risk haplotype within the Starr County Mexican-American sample. As with peroxisome proliferator–activated receptor-γ, subsequent studies have reported a mixture of both positive and negative associations in other populations, leading to extensive discussion of whether CAPN10 could be classified as a type 2 diabetes susceptibility gene. Evans et al. (27) were the first to demonstrate that another SNP within CAPN10, SNP-44, located 11 bp from SNP-43, was associated with type 2 diabetes in Caucasian subjects from the U.K. A subsequent meta-analysis by Weedon et al. (28) examined whether SNP-44 was associated with type 2 diabetes. Their analysis included Asian, Mexican-American, and Caucasian samples and concluded that CAPN10 was associated with an overall modest increase in risk for type 2 diabetes (odds ratio = 1.17). However, examination of the combined Mexican-American samples alone suggests CAPN10 may be associated with a higher risk for type 2 diabetes in this specific ethnic group (odds ratio = 2.13), compared with the Asian (odds ratio = 1.09) and Caucasian (odds ratio = 1.17) samples.
Most recently, the DeCode group attempting to positionally clone genes for a variety of common diseases reported association between type 2 diabetes and variation in the gene for transcription factor 7-like 2 (TCF7L2) (29). The initial genome-wide linkage analysis performed in large families from the population of Iceland revealed evidence for linkage on chromosome 10 (30). Subsequent fine-mapping of the chromosome 10 interval of interest revealed a microsatellite marker, DG10S478, to be associated with type 2 diabetes. This association was replicated in both Danish and U.S. Caucasian samples. Individuals heterozygous for the risk allele had a relative risk for type 2 diabetes of 1.45, whereas individuals homozygous for the risk allele had a relative risk of 2.41. The mechanism by which TCF7L2 confers risk for type 2 diabetes is unknown, but it has been hypothesized that it may regulate proglucagon gene expression in the enteroendocrine cells (29).
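As a small worked check, the two relative risks quoted above can be compared with what simple genetic models would predict for homozygotes given the heterozygote risk. The two models below (multiplicative and additive on the risk scale) are our illustrative assumptions, not an analysis reported in (29), and the comparison ignores the confidence intervals around the estimates.

```python
# Relative risks quoted in the text for the TCF7L2-associated risk allele
het_rr = 1.45   # one copy of the risk allele
hom_rr = 2.41   # two copies of the risk allele

# Multiplicative model: each copy multiplies risk by the same factor
expected_hom_multiplicative = het_rr ** 2

# Additive model on the risk scale: each copy adds the same excess risk
expected_hom_additive = 1 + 2 * (het_rr - 1)

print(f"observed homozygote RR:        {hom_rr:.2f}")
print(f"expected under multiplicative: {expected_hom_multiplicative:.2f}")
print(f"expected under additive:       {expected_hom_additive:.2f}")
```

The observed homozygote risk sits above both predictions and closer to the multiplicative one, although whether that difference is meaningful cannot be judged from point estimates alone.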
Chromosome 20
Linkage to chromosome 20q was also observed for multiple studies and was the impetus for the formation of the International Type 2 Diabetes Linkage Consortium. The 20q region was independently fine-mapped by several groups, resulting in the identification of a novel glucose transporter (SLC2A10) (31) that does not appear to confer susceptibility to type 2 diabetes (32,33), along with two type 2 diabetes susceptibility genes: protein-tyrosine phosphatase, nonreceptor-type 1 (PTPN1), and hepatocyte nuclear factor-4α (HNF4A). PTPN1 was first shown to be associated with type 2 diabetes in patients with diabetes and end-stage renal disease and participants of the Diabetes Heart Study (34). Subsequently, variation in this gene was also shown to be associated with insulin resistance and fasting glucose in Hispanic Americans from the Insulin Resistance Atherosclerosis Study Family Study (35). However, in one of the largest association studies of type 2 diabetes with over 3,000 cases and 3,000 control subjects from mainly Northern European Caucasian samples, no evidence for association with type 2 diabetes was found for single variants within PTPN1 or with observed haplotypes (36).
The fall and rise of HNF4A.
The second susceptibility gene to be identified in the 20q region is HNF4A, already described above as a susceptibility gene for MODY. HNF4A was initially rejected by one study as a type 2 diabetes susceptibility gene. The Finland-U.S. Investigation of Non-Insulin-Dependent Diabetes Mellitus (FUSION) study (37), an effort to positionally clone type 2 diabetes susceptibility genes in the Finnish population, was one of several groups (38–41) showing evidence for linkage in the 20q region (42). Because HNF4A fell within the linkage region and given that coding variants were known to confer susceptibility to MODY1, Ghosh et al. (42) sequenced this gene and tested variants for association with type 2 diabetes. None of the variants they identified were associated with type 2 diabetes, leading them to conclude that variation in HNF4A was “… unlikely to account for the linkage results on chromosome 20q.”
However, ∼5 years later, the FUSION group and Love-Gregory et al. jointly reported association between variation in the P2 promoter region of HNF4A and type 2 diabetes (43,44). Love-Gregory et al. (44), who were studying the Ashkenazi Jewish population, and the FUSION study observed the associations independently while fine-mapping the 20q region. Both groups subsequently collaborated to ensure that common variants were genotyped in both samples, resulting in common SNPs showing association with type 2 diabetes in the P2 promoter region. In both studies, the associated SNPs accounted for a significant fraction of the evidence for linkage of type 2 diabetes to the 20q region (43,44). Type 2 diabetes–related quantitative trait data were not available in the Ashkenazi Jewish sample, but FUSION observed association between variation in the P2 promoter region and acute insulin response and disposition index in offspring of their affected patients (43), consistent with the presumed effect of HNF4A within the β-cell. The association between variation in the P2 promoter region of HNF4A and type 2 diabetes has been replicated in other populations (45–47).
GENETICS OF GDM
Familial clustering
So what about GDM? Is there a genetic basis for GDM? Surprisingly, there has been relatively little research in the area of GDM genetics per se. An essential first step in genetics research has been the determination of evidence for a genetic basis for the disease. This can come in the form of twin concordance studies or estimates of familial risk or heritability. However, performing such studies in a prospective fashion is fraught with numerous difficulties, primarily the need to identify women who will become pregnant. Studies are also difficult to perform in a retrospective fashion. The clinical definition for GDM has evolved over the years and differs slightly among countries (48–51). Furthermore, there has not been consistent screening for GDM, leading to possible bias in ascertainment, e.g., missed cases. Finally, there are difficulties in ascertaining families with multiple cases of GDM, which is partly related to the relatively low prevalence of GDM. There has been, to our knowledge, only one unpublished attempt to estimate familiality of GDM. In 1999, Williams and colleagues used the statewide medical record system in the state of Washington to identify and link sisters diagnosed with GDM using International Classification of Diseases, Ninth Revision (ICD-9), coding. Based on their initial screening, they estimated that the sibling risk ratio for GDM was 1.75 (M. Williams, personal communication), suggesting some evidence for a genetic basis for GDM. This risk, likely an underestimate, is significantly lower than the estimated sibling risk ratio for type 2 diabetes, which ranges from 2 to 4 (52).
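The sibling risk ratio discussed above is simply the risk to siblings of affected individuals divided by the population prevalence. A minimal sketch follows; the counts and prevalence are hypothetical and are not the data from the Washington State analysis, and the function name is ours.

```python
def sibling_risk_ratio(affected_siblings, total_siblings, population_prevalence):
    """Sibling recurrence risk ratio (lambda_s).

    affected_siblings:     siblings of affected probands who are also affected
    total_siblings:        all siblings of affected probands who were assessed
    population_prevalence: prevalence of the disease in the general population
    """
    sibling_risk = affected_siblings / total_siblings
    return sibling_risk / population_prevalence

# Hypothetical example: 6 of 100 sisters of women with GDM also had GDM,
# against an assumed population prevalence of 3%
lambda_s = sibling_risk_ratio(6, 100, 0.03)
print(f"lambda_s = {lambda_s:.2f}")
```

Incomplete screening lowers the numerator more than it lowers a well-established prevalence estimate, which is one reason a ratio derived from medical records may understate the true value.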
The question of a genetic basis for GDM, however, is also closely tied to the debate of whether GDM is a unique disease state or whether pregnancy with its associated metabolic derangements simply provides us with a crystal ball with which to identify women who are susceptible to hyperglycemia and subsequent development of diabetes. If genetic variants associated with type 1 or type 2 diabetes are also associated with GDM, from a purely genetic perspective, it would be difficult to argue a unique genetic predisposition for GDM. This does not take into account the possibility of unique environmental exposures related to pregnancy that may interact with genetic variants to alter disease risk. Also, this does not negate the importance of using genetic information to improve treatment for GDM and minimize the deleterious effect of hyperglycemia on fetal outcomes.
Several studies have examined the familial clustering of GDM with type 1 and type 2 diabetes. Examples include the studies of Dorner et al. (53), who showed increased familial aggregation of diabetes on the maternal side of offspring with type 1 diabetes whose mothers had GDM. Similarly, there is evidence for clustering of type 2 diabetes and impaired glucose tolerance in families with GDM (54) and evidence for higher prevalence of type 2 diabetes in mothers of women with GDM (55). Thus, there is evidence of some link between both autoimmune and nonautoimmune forms of diabetes and GDM.
Candidate genes
Candidate genes related to both autoimmune and nonautoimmune forms of GDM have been assessed in a variety of cohorts. Freinkel et al. (56) examined whether HLA antigens were associated with GDM. They observed that HLA DR3 and DR4 antigens were uncommon overall, but nonetheless more frequent in women with GDM than in women with normal pregnancies. Similarly, Ober et al. (57) reported association between variation in the insulin receptor (INSR) and GDM in Caucasian and African-American women. They also noted that variation in INSR appeared to interact with both BMI and history of diabetes in mothers with GDM. Among Caucasian women, INSR variants also appeared to interact with variation in insulin-like growth factor-2 (IGF2). No associations between INSR and IGF2 were observed in Hispanic women with GDM (57).
There have also been several reports of association between variation in GCK and GDM (58–63). Stoffel et al. (58) estimated that the frequency of GCK variants in GDM was ∼5%, by extrapolating observations from 40 women with GDM. The relatively low frequency of GCK variants among GDM subjects was also confirmed by others (60,61). Despite the low frequency, the important contribution of GCK variation to risk for GDM can also be observed in families of MODY2 patients. Saker et al. (60) noted that a large proportion of female members of MODY2 families present with GDM. They further speculated that because variation in GCK typically results in subclinical hyperglycemia, the frequency of GCK variants may be higher and only detectable upon pregnancy (60). This possibility was confirmed by Ellard et al. (62), who used highly selective clinical criteria to select patients with GDM and tested whether they carried GCK variants. Their data suggest that the prevalence of GCK variants may be as high as 80% in a small subset of women with GDM selected by highly specific clinical criteria.
HNF4A in GDM.
Studies by a variety of investigators have identified a β-cell defect as being one of the primary characteristics of GDM (64–67). Given that β-cell function is a highly heritable trait (68–71), we became interested in trying to identify genes underlying β-cell dysfunction observed in GDM. This resulted in the BetaGene study, in which we are recruiting Mexican-American families of a proband with previous GDM (71,72). The reported associations between variation in the P2 promoter region of HNF4A and type 2 diabetes (43,44,46,73) led us to examine whether these variants might be associated with diabetes-related quantitative traits. Muller et al. (73) examined these variants for association with type 2 diabetes–related quantitative traits in Pima Indians and only observed modest association with insulin resistance, as assessed by the euglycemic glucose clamp. In the previous reports by Silander et al. (43) and Love-Gregory et al. (44), the frequency of the minor allele for the associated SNP (rs2144908) was between 16 and 27%, depending on the sample. In contrast, the same allele had a frequency of 49% in the Mexican-American sample from the BetaGene study (72). When we examined this SNP for association with type 2 diabetes–related phenotypes, we observed a significant association with disposition index (P = 0.035) under an additive genetic model. Disposition index is a measure of β-cell compensation (67,74), and this association is consistent with the known biologic function of HNF4A in the pancreatic β-cells (75). Thus, variation in HNF4A may contribute to the β-cell dysfunction observed in GDM.
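An additive genetic model of the kind mentioned above codes each genotype as the number of copies of the minor allele (0, 1, or 2) and estimates a constant change in the trait per copy. The sketch below fits that slope by ordinary least squares on hypothetical data. It assumes that disposition index is computed as the product of insulin sensitivity and acute insulin response, which is the common definition but is not spelled out in the text, and it treats subjects as unrelated, whereas a family-based sample such as BetaGene would require methods that account for relatedness.

```python
def disposition_index(insulin_sensitivity, acute_insulin_response):
    """Disposition index, assumed here to be the product of the two measures."""
    return insulin_sensitivity * acute_insulin_response

def additive_model_fit(allele_counts, trait_values):
    """Least-squares fit of trait = intercept + slope * (minor-allele count)."""
    n = len(allele_counts)
    mean_x = sum(allele_counts) / n
    mean_y = sum(trait_values) / n
    sxx = sum((x - mean_x) ** 2 for x in allele_counts)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(allele_counts, trait_values))
    slope = sxy / sxx                      # change in trait per allele copy
    intercept = mean_y - slope * mean_x    # expected trait with zero copies
    return intercept, slope

# Hypothetical subjects: minor-allele counts and made-up physiological measures
allele_counts = [0, 0, 1, 1, 2, 2]
insulin_sensitivity = [4.0, 5.0, 4.0, 3.0, 3.0, 2.0]
acute_insulin_response = [500.0, 400.0, 450.0, 500.0, 400.0, 450.0]

di_values = [disposition_index(s, a)
             for s, a in zip(insulin_sensitivity, acute_insulin_response)]
intercept, slope = additive_model_fit(allele_counts, di_values)
print(f"estimated change in disposition index per allele copy: {slope:.1f}")
```

A real analysis would also adjust for covariates such as age and adiposity and would report a standard error and P value for the slope.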
THE FUTURE
The field of genetics has come a long way since the days of Gregor Mendel and his pea experiments. During a period when positional cloning of complex disease genes by linkage analysis using microsatellite markers was reaching its zenith, a landmark article by Risch and Merikangas (76) appeared in the literature. This article provided a theoretical argument that large-scale association analysis was statistically more powerful than linkage analysis and therefore the preferable approach to identify genes underlying complex diseases. In that article, the authors wrote, “… imagine the time when all human genes (say 100,000 in total) have been found and that simple, diallelic polymorphisms in these genes have been identified.” Although far from reality at the time of publication, rapid advances in genotyping technology (77), drastic reductions in genotyping costs, the sequencing of the human genome (78,79), and the recent completion of the HapMap project (80) have now made whole genome association analysis a reality. These advances have now made it possible to select 250,000 to 500,000 SNPs across the human genome and rapidly genotype them in samples of 2,000–3,000 subjects in an affordable manner.
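Testing this many SNPs raises a multiple-comparisons problem. As a rough sketch, the Bonferroni correction divides the desired study-wide significance level by the number of tests; it is one conventional and conservative choice, assumed here for illustration, and it overcorrects when neighboring SNPs are correlated through linkage disequilibrium.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    return alpha / n_tests

# SNP panel sizes mentioned in the text, with an assumed study-wide alpha of 0.05
for n_snps in (250_000, 500_000):
    threshold = bonferroni_threshold(0.05, n_snps)
    print(f"{n_snps:>7} SNPs -> per-SNP threshold {threshold:.1e}")
```

Thresholds of this order imply that modest effects, such as the odds ratios near 1.2 described earlier, can only be detected reliably in samples of several thousand subjects.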
What does this new era of whole genome association mean for the genetics of GDM? It may now be possible to identify large case-control samples and perform whole genome association to identify regions of the genome harboring susceptibility genes for GDM. The unanswered question of whether GDM per se has a genetic basis may also be indirectly addressed using whole genome association by incorporating carefully selected samples of type 2 diabetes cases as a secondary contrast group into the study design. SNPs showing association with GDM, but not type 2 diabetes, may represent susceptibility genes unique to GDM. However, the identification of genetic variants underlying disease will not, on its own, have a major impact on clinical care for patients with GDM. As with the MODY studies of Polonsky and his colleagues (16–18), additional molecular and biochemical studies and, most importantly, clinical studies must be performed to understand the physiological and clinical consequences of genetic variation. Finally, we will need to assess how to make the most of genetic information and how best to incorporate it into the clinical care setting.
Article Information
R.M.W. was supported in part by a grant from the American Diabetes Association (05-RA-140). T.A.B. was supported in part by a Distinguished Clinical Scientist Award from the American Diabetes Association. The FUSION study was supported by intramural funds from the National Human Genome Research Institute (project no. OH95-C-N030) and by National Institutes of Health (NIH) Grants DK62370 and HG00376. The BetaGene Study was supported by NIH Grant DK61628 (to T.A.B.), with support from the University of Southern California General Clinical Research Center (M01-RR-00043).
This article is based on a presentation at a symposium. The symposium and the publication of this article were made possible by an unrestricted educational grant from LifeScan, Inc., a Johnson & Johnson company.
A table elsewhere in this issue shows conventional and Système International (SI) units and conversion factors for many substances.