Allele Frequency Calculator
Calculate allele frequencies, genotype frequencies, expected Hardy-Weinberg proportions, and heterozygosity from observed genotype counts. This interactive genetics calculator is designed for students, educators, and researchers, and for quick checks in clinical genomics workflows.
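Formula used: p = (2N11 + N12) / (2N) and q = (2N22 + N12) / (2N), where N11, N12, and N22 are the observed counts of genotypes AA, Aa, and aa, and N is the total number of individuals.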
Expert Guide to Using an Allele Frequency Calculator
An allele frequency calculator helps quantify how common different genetic variants are within a population. In population genetics, an allele is one version of a gene found at a particular locus. If a gene has two alleles, often represented as A and a, then the allele frequency describes the proportion of all copies of that gene that are A versus the proportion that are a. This concept is foundational in genetics, evolutionary biology, genomics, breeding science, and public health. It is also central to understanding Hardy-Weinberg equilibrium, expected genotype distributions, heterozygosity, and how populations change over time.
This calculator uses observed genotype counts to estimate the frequency of two alleles. For a diploid population, each individual contributes two alleles. If you know how many people or organisms have genotype AA, how many have genotype Aa, and how many have genotype aa, you can calculate the frequency of allele A and allele a directly. These values are usually denoted as p and q, where p + q = 1. Once p and q are known, you can also compute expected genotype frequencies under Hardy-Weinberg assumptions: p² for AA, 2pq for Aa, and q² for aa.
Because the calculator also displays observed genotype frequencies and expected heterozygosity, it can be used for classroom exercises, basic research summaries, and quick quality checks in laboratory and field datasets. It is especially useful when you want a fast answer without manually carrying out multiple arithmetic steps.
What allele frequency means in practical terms
Imagine a population of 100 diploid individuals. That means there are 200 total allele copies at a locus. If 120 of those copies are allele A and 80 are allele a, then the frequency of A is 120/200 = 0.60 and the frequency of a is 80/200 = 0.40. These numbers immediately summarize how common the two genetic variants are in the sample. Researchers use these frequencies to compare populations, evaluate selection pressures, estimate carrier rates, and test whether observed genotype proportions align with population-level expectations.
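The arithmetic in this example can be sketched in a few lines of Python. The function name `frequencies_from_copies` is an illustrative assumption, not part of the calculator itself; it takes direct counts of allele copies rather than genotype counts.

```python
from typing import Tuple


def frequencies_from_copies(copies_A: int, copies_a: int) -> Tuple[float, float]:
    """Return (p, q) from direct counts of allele copies (hypothetical helper)."""
    total = copies_A + copies_a
    if total <= 0:
        raise ValueError("at least one allele copy is required")
    # Each frequency is that allele's share of all copies at the locus.
    return copies_A / total, copies_a / total


# 100 diploid individuals carry 200 allele copies: 120 A and 80 a.
p, q = frequencies_from_copies(120, 80)
print(f"p = {p:.2f}, q = {q:.2f}")  # p = 0.60, q = 0.40
```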
Allele frequencies are widely applied in:
- Population genetics and evolutionary studies
- Medical genetics and inherited disease screening
- Conservation biology for monitoring genetic diversity
- Plant and animal breeding programs
- Forensic DNA interpretation and ancestry studies
- Genome-wide association and variant catalog analysis
How the calculator works
The calculator requires three genotype counts for a two-allele system:
- Homozygous for allele 1, such as AA
- Heterozygous, such as Aa
- Homozygous for allele 2, such as aa
From those values, the total number of individuals is:
N = N(AA) + N(Aa) + N(aa)
The total number of alleles is 2N because each individual carries two copies at the locus. The frequency of allele A is:
p = [2N(AA) + N(Aa)] / (2N)
The frequency of allele a is:
q = [2N(aa) + N(Aa)] / (2N)
Since there are only two alleles in this model, p + q should equal 1, apart from tiny rounding differences. The calculator then uses p and q to derive expected genotype frequencies under Hardy-Weinberg equilibrium:
- Expected AA frequency = p²
- Expected Aa frequency = 2pq
- Expected aa frequency = q²
It also computes expected genotype counts by multiplying those frequencies by the total sample size N. This lets you compare what you observed with what a random-mating, idealized population would produce.
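As a minimal sketch of the steps just described, the following Python function turns three genotype counts into p, q, expected Hardy-Weinberg frequencies, and expected counts. The names `HWSummary` and `summarize_genotypes` are assumptions made for illustration, and the input checks are one reasonable choice rather than a description of how the calculator validates its fields.

```python
from typing import NamedTuple, Tuple


class HWSummary(NamedTuple):
    """Illustrative result container; the name and fields are assumptions."""
    n: int
    p: float
    q: float
    expected_freqs: Tuple[float, float, float]   # (AA, Aa, aa)
    expected_counts: Tuple[float, float, float]  # (AA, Aa, aa)


def summarize_genotypes(n_AA: int, n_Aa: int, n_aa: int) -> HWSummary:
    counts = (n_AA, n_Aa, n_aa)
    if any(c < 0 for c in counts):
        raise ValueError("genotype counts must be non-negative")
    n = sum(counts)
    if n == 0:
        raise ValueError("at least one individual is required")
    # Each individual contributes two allele copies, so the denominator is 2N.
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = (2 * n_aa + n_Aa) / (2 * n)
    expected_freqs = (p * p, 2 * p * q, q * q)
    expected_counts = tuple(f * n for f in expected_freqs)
    return HWSummary(n, p, q, expected_freqs, expected_counts)


summary = summarize_genotypes(36, 48, 16)
print(f"N = {summary.n}, p = {summary.p:.4f}, q = {summary.q:.4f}")
print("expected counts:", [round(c, 2) for c in summary.expected_counts])
```

Expected counts are kept as floats rather than rounded to integers, since Hardy-Weinberg expectations are rarely whole numbers.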
| Metric | Formula | What it tells you |
|---|---|---|
| Allele A frequency | [2N(AA) + N(Aa)] / (2N) | Proportion of all allele copies that are A |
| Allele a frequency | [2N(aa) + N(Aa)] / (2N) | Proportion of all allele copies that are a |
| Expected heterozygosity | 2pq | Expected proportion of heterozygotes under Hardy-Weinberg equilibrium |
| Observed heterozygosity | N(Aa) / N | Actual heterozygous proportion in the sample |
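The last two rows of the table can be compared directly in code. This is a small sketch with a hypothetical function name (`heterozygosity`) and made-up counts chosen so that observed and expected values differ.

```python
from typing import Tuple


def heterozygosity(n_AA: int, n_Aa: int, n_aa: int) -> Tuple[float, float]:
    """Return (observed, expected) heterozygosity for a biallelic locus."""
    n = n_AA + n_Aa + n_aa
    if n <= 0:
        raise ValueError("at least one individual is required")
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    observed = n_Aa / n   # share of heterozygous individuals in the sample
    expected = 2 * p * q  # Hardy-Weinberg expectation from p and q
    return observed, expected


# Illustrative counts (not from the article): a sample with few heterozygotes.
h_obs, h_exp = heterozygosity(50, 20, 30)
print(f"observed = {h_obs:.2f}, expected = {h_exp:.2f}")  # observed = 0.20, expected = 0.48
```

Here p is still 0.60, the same as in a sample of 36, 48, and 16, but far fewer heterozygotes are observed than the model predicts.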
Worked example
Suppose you observe the following counts in a sample of 100 individuals:
- AA = 36
- Aa = 48
- aa = 16
Total individuals N = 36 + 48 + 16 = 100. Total allele copies = 200.
Now calculate allele frequencies:
- A copies = 2(36) + 48 = 120
- a copies = 2(16) + 48 = 80
So p = 120/200 = 0.60 and q = 80/200 = 0.40. Under Hardy-Weinberg equilibrium, expected genotype frequencies are:
- AA = p² = 0.36
- Aa = 2pq = 0.48
- aa = q² = 0.16
In this example, the observed and expected values match perfectly. That does not prove the population is in equilibrium in a broad biological sense, but it does mean the observed genotype pattern is exactly what the Hardy-Weinberg model would predict from the allele frequencies.
Observed versus expected genotype structure
One of the strongest uses of an allele frequency calculator is comparing observed genotype frequencies with expected values. Deviations may suggest non-random mating, selection, population substructure, inbreeding, recent migration, genotyping error, or simply small sample noise. In introductory genetics, this comparison is often the first step before applying a chi-square test for Hardy-Weinberg equilibrium.
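The chi-square comparison mentioned above can be sketched as follows. The text does not specify how the test is carried out, so this is one common textbook form under stated assumptions: a Pearson goodness-of-fit statistic with 1 degree of freedom (three genotype classes, minus one, minus one for the estimated allele frequency). The function name `hwe_chi_square` and the example counts are illustrative.

```python
import math
from typing import Tuple


def hwe_chi_square(n_AA: int, n_Aa: int, n_aa: int) -> Tuple[float, float]:
    """Return (chi-square statistic, p-value) for a biallelic HWE test, df = 1."""
    n = n_AA + n_Aa + n_aa
    if n <= 0:
        raise ValueError("at least one individual is required")
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    observed = (n_AA, n_Aa, n_aa)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    # Skip classes with zero expectation (a monomorphic sample) to avoid dividing by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = hwe_chi_square(50, 20, 30)  # illustrative counts with a heterozygote deficit
print(f"chi-square = {chi2:.2f}, p-value = {p_value:.2e}")
```

With small expected counts, the chi-square approximation becomes unreliable, which is one reason the small-sample caveat above matters.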
Below is a small comparison table of illustrative allele frequencies of the kind that commonly appear in genetics teaching. Each expected value follows directly from p², 2pq, and q².
| Scenario | p | q | Expected AA (p²) | Expected Aa (2pq) | Expected aa (q²) |
|---|---|---|---|---|---|
| Balanced variation | 0.50 | 0.50 | 0.25 | 0.50 | 0.25 |
| Moderately common major allele | 0.70 | 0.30 | 0.49 | 0.42 | 0.09 |
| Rare minor allele | 0.95 | 0.05 | 0.9025 | 0.0950 | 0.0025 |
This table reveals an important pattern: heterozygosity is highest when allele frequencies are balanced. In a two-allele system, expected heterozygosity reaches its maximum when p = q = 0.5, producing 2pq = 0.5. As one allele becomes rarer, heterozygosity declines and homozygosity rises: 2pq falls to 0.42 at q = 0.30 and to 0.095 at q = 0.05.
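The rows of the table, and the location of the heterozygosity maximum, can be reproduced with a short sketch. The function name and the 0.01 grid spacing are arbitrary choices made for illustration.

```python
from typing import Tuple


def expected_genotype_frequencies(p: float) -> Tuple[float, float, float]:
    """Return Hardy-Weinberg expectations (AA, Aa, aa) for allele frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    q = 1.0 - p
    return p * p, 2 * p * q, q * q


# Reproduce the three scenarios from the table.
for p in (0.50, 0.70, 0.95):
    f_AA, f_Aa, f_aa = expected_genotype_frequencies(p)
    print(f"p = {p:.2f}: AA = {f_AA:.4f}, Aa = {f_Aa:.4f}, aa = {f_aa:.4f}")

# Scan p from 0 to 1 in steps of 0.01 and find where 2pq is largest.
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=lambda p: expected_genotype_frequencies(p)[1])
print(f"2pq is largest at p = {best_p:.2f}")  # 2pq is largest at p = 0.50
```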
Why expected heterozygosity matters
Expected heterozygosity is a compact summary of genetic diversity. In conservation genetics, it is often used to evaluate whether a population is losing diversity through bottlenecks or inbreeding. In breeding programs, heterozygosity can reflect the amount of useful variation available for selection. In human genetics, it provides context for how common a minor allele is and what genotype distribution should be expected in a random-mating population.
For a biallelic locus, expected heterozygosity is simply 2pq. If p = 0.5 and q = 0.5, then expected heterozygosity is 0.50, the maximum possible for two alleles. If p = 0.99 and q = 0.01, then 2pq = 0.0198, which is much lower. In that case most individuals are expected to be homozygous for the common allele (p² = 0.9801), and very few carry the rare homozygous genotype (q² = 0.0001).
Common mistakes when calculating allele frequencies
- Forgetting diploidy: the denominator is 2N, not N, because each individual contributes two alleles.
- Confusing genotype frequency with allele frequency: the proportion of AA individuals is not the same thing as the frequency of allele A.
- Using percentages inconsistently: express observed and expected values in the same units (both as proportions or both as counts) before comparing them.
- Ignoring sample size: small samples can fluctuate widely due to random chance.
- Assuming equilibrium automatically: observed frequencies close to expected values do not guarantee all Hardy-Weinberg assumptions are satisfied.
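The first two mistakes in this list are easy to see with numbers. The sketch below reuses the counts 36, 48, and 16; the variable names are illustrative.

```python
n_AA, n_Aa, n_aa = 36, 48, 16
n = n_AA + n_Aa + n_aa

# Genotype frequency: the share of individuals that are AA.
genotype_freq_AA = n_AA / n

# Allele frequency: the share of allele copies that are A (denominator 2N).
allele_freq_A = (2 * n_AA + n_Aa) / (2 * n)

# Common mistake: dividing allele copies by N instead of 2N.
wrong_allele_freq_A = (2 * n_AA + n_Aa) / n

print(f"genotype frequency of AA = {genotype_freq_AA:.2f}")       # 0.36
print(f"allele frequency of A    = {allele_freq_A:.2f}")          # 0.60
print(f"with N as denominator    = {wrong_allele_freq_A:.2f}")    # 1.20
```

A result above 1 is a quick signal that the denominator was N rather than 2N, since a frequency cannot exceed 1.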
Hardy-Weinberg equilibrium assumptions
The classic Hardy-Weinberg model assumes an ideal population: random mating, a very large population size, no mutation, no migration, and no selection at the locus. Real populations rarely satisfy all of these assumptions perfectly, but the model remains extremely useful as a null expectation. If observed frequencies differ meaningfully from expected values, biologists can investigate which assumption may be violated. In practical terms, the conditions to check are:
- Random mating with respect to the locus
- No strong natural or artificial selection
- No substantial migration changing allele proportions
- No important new mutation pressure at the locus
- Sufficiently large population to reduce drift effects
How allele frequencies are used in medicine and genomics
In medical genetics, allele frequency is critical for variant interpretation. A variant that is common in a general population is less likely to be a highly penetrant cause of a severe rare disorder. Researchers and clinicians compare observed variant frequencies against reference databases to assess plausibility. Population frequency also affects expected carrier rates and expected genotype prevalence for recessive conditions. If a pathogenic recessive allele has frequency q, the Hardy-Weinberg expectation for affected homozygotes is q². The heterozygous carrier rate is 2pq = 2q(1 − q), which is approximately 2q when q is small.
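A short sketch shows how close the 2q approximation is for a rare recessive allele. The function name `recessive_expectations` and the value q = 0.01 are illustrative assumptions, and the calculation presumes Hardy-Weinberg proportions hold.

```python
from typing import Dict


def recessive_expectations(q: float) -> Dict[str, float]:
    """Hardy-Weinberg expectations for a recessive allele with frequency q."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie between 0 and 1")
    p = 1.0 - q
    return {
        "affected": q * q,           # homozygous for the recessive allele
        "carriers": 2 * p * q,       # exact heterozygote frequency
        "carriers_approx": 2 * q,    # small-q approximation
    }


result = recessive_expectations(0.01)
for label, value in result.items():
    print(f"{label}: {value:.4f}")
```

For q = 0.01 the exact carrier frequency is 0.0198 and the approximation gives 0.02, a difference of about 1 percent. The gap widens as q grows, so the shortcut is best reserved for rare alleles.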
This is one reason public datasets and educational calculators are so valuable. They connect abstract formulas to practical outcomes such as carrier screening, risk modeling, and population comparison. The basic mathematics remain simple, but the implications are significant.
Interpreting results from this calculator
After you click calculate, the tool reports:
- Total individuals and total allele copies
- Allele frequencies p and q
- Observed genotype frequencies
- Expected Hardy-Weinberg genotype frequencies
- Expected genotype counts based on your sample size
- Observed heterozygosity and expected heterozygosity
The accompanying chart visualizes the observed versus expected genotype frequencies so differences are easy to spot. This is useful for students learning the relationship between genotype counts and allele frequencies, and for analysts who want a quick visual quality check before exporting data elsewhere.
Authoritative genetics resources
If you want to deepen your understanding of allele frequencies, Hardy-Weinberg equilibrium, and variant interpretation, these authoritative sources are excellent starting points:
- National Human Genome Research Institute (.gov): Allele glossary and genetics fundamentals
- MedlinePlus Genetics (.gov): Hardy-Weinberg equilibrium overview
- University of California, Berkeley (.edu): Evolution and population genetics teaching resources
Final takeaways
An allele frequency calculator converts raw genotype counts into meaningful population genetic information. It helps you estimate p and q, compare observed and expected genotype frequencies, and understand heterozygosity at a glance. Whether you are reviewing a classroom problem, checking a breeding population, or exploring a simple genomics dataset, the core logic is the same: count genotypes, convert to allele totals, divide by 2N, and then interpret the results in the context of population genetics. Mastering this process provides a strong foundation for more advanced topics such as linkage disequilibrium, genetic drift, selection coefficients, association studies, and clinical variant interpretation.