Hyper Geometric Random Variable Calculator
Compute exact and cumulative hypergeometric probabilities for sampling without replacement. This interactive tool is designed for statistics students, quality analysts, auditors, researchers, card game enthusiasts, and anyone who needs precise finite population probability results.
Calculator
Enter your population size, number of success states in the population, sample size, and target number of observed successes. Choose the probability type, then calculate instantly.
Probability Distribution Chart
The chart displays the full hypergeometric probability mass function across all feasible values of X. Your selected value is highlighted.
How to use it
- 1. Set N: Enter the total population size. Example: 52 cards in a deck, 500 products in a lot, or 100 patients in a registry.
- 2. Set K: Enter how many successes exist in the population. Example: 4 aces in a deck or 25 defective units in a lot.
- 3. Set n: Enter how many items are sampled without replacement. This is what makes the hypergeometric model different from the binomial model.
- 4. Set k and probability type: Choose whether you need the probability of seeing exactly k successes, at most k successes, or at least k successes.
- 5. Interpret the output: The calculator returns the probability in decimal and percentage form, plus the expected value and variance for the distribution.
For a hypergeometric random variable X, P(X = k) = [C(K, k) × C(N – K, n – k)] / C(N, n)
where N is the population size, K is the number of success states, n is the sample size, and k is the number of observed successes.
Typical uses: lot acceptance sampling, election auditing, medical chart review, ecology capture studies, lottery and card probability, fraud detection, and any finite population inspection problem.
Expert Guide to the Hyper Geometric Random Variable Calculator
A hypergeometric random variable calculator helps you find probabilities when you sample from a finite population without replacement. That final phrase, without replacement, is the key idea. In many real-world settings, once an item is selected, it is no longer available to be selected again. Moving from replacement to no replacement makes the draws depend on one another, which changes the probability model and makes the hypergeometric distribution the correct choice instead of the binomial distribution.
If you are checking a production lot for defects, drawing cards from a deck, reviewing a fixed list of records, or auditing ballots from a known batch, the hypergeometric distribution is often the right statistical tool. This calculator allows you to evaluate exact and cumulative probabilities quickly while also visualizing the entire probability distribution. That means you can move from raw inputs to interpretable results in seconds.
What is a hypergeometric random variable?
A hypergeometric random variable counts the number of successes observed in a sample drawn from a finite population that contains both successes and failures. Suppose a population contains N total items, and exactly K of them are classified as successes. If you draw a sample of size n without replacement, the random variable X represents how many successes appear in the sample.
This is different from a binomial random variable, where every trial has the same probability of success and is independent of all other trials. In the hypergeometric setting, each draw changes the composition of the remaining population. If you pull out a success, there is one fewer success left, so the share of successes among the remaining items falls. If you pull out a failure, that share rises instead. That dependency structure is exactly why the hypergeometric model matters.
When should you use a hypergeometric calculator?
You should use a hypergeometric calculator when all of the following conditions hold:
- The population size is finite and known or meaningfully bounded.
- Each item can be classified as either success or failure for the purpose of the question.
- You draw a sample without replacement.
- You want the probability of observing a certain number of successes in the sample.
Some common examples include:
- Quality control: A shipment contains a fixed number of defective units, and you inspect a sample.
- Card games: A standard deck has known counts of aces, kings, hearts, or face cards.
- Public auditing: A known ballot batch contains a finite number of ballots, and a random sample is reviewed.
- Medical records review: A finite list of patient files contains a known count of cases meeting some criterion.
- Ecology and conservation: A finite group of organisms contains a known number of tagged individuals, and a sample from that group is captured and examined.
How the formula works
The hypergeometric probability formula is:
P(X = k) = [C(K, k) × C(N – K, n – k)] / C(N, n)
Each combination term has a clear interpretation:
- C(K, k) counts the ways to choose k successes from the K available successes.
- C(N – K, n – k) counts the ways to choose the remaining failures from the non-success items.
- C(N, n) counts the total number of possible samples of size n.
By dividing favorable samples by total possible samples, you get the exact probability of observing exactly k successes.
Not every value of k is possible. The valid range is from max(0, n – (N – K)) to min(n, K). This calculator checks that range automatically.
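The formula and the feasible range above translate almost directly into code. The following is a minimal Python sketch, not the calculator's own implementation; the function name `hypergeom_pmf` is an assumption made for illustration, and it relies only on `math.comb` from the standard library.

```python
from math import comb


def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k) for n draws without replacement from N items, K of which are successes."""
    # Feasible range of k: max(0, n - (N - K)) up to min(n, K)
    lowest = max(0, n - (N - K))
    highest = min(n, K)
    if k < lowest or k > highest:
        return 0.0  # impossible outcome for these inputs

    # Favorable samples divided by all possible samples of size n
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


# Exactly 2 aces in a 5-card hand from a standard 52-card deck
print(round(hypergeom_pmf(2, N=52, K=4, n=5), 5))  # 0.03993
```

Returning 0.0 for an infeasible k is one possible design choice; a stricter tool might raise an error instead.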
Understanding exact and cumulative probabilities
Most users need one of three results:
- Exact probability: P(X = k), the chance of observing exactly k successes.
- Lower cumulative probability: P(X ≤ k), the chance of observing at most k successes.
- Upper cumulative probability: P(X ≥ k), the chance of observing at least k successes.
In practice, exact probability is useful when you want one specific outcome. Cumulative probabilities are more common in decision making. For example, if you reject a lot when 3 or more defects appear in a sample, then you need P(X ≥ 3), not just P(X = 3).
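The three probability types can be sketched as sums of exact probabilities over the feasible range. This is an illustrative Python sketch under assumed names (`pmf`, `prob`, and the `kind` labels are hypothetical, not the calculator's interface).

```python
from math import comb


def pmf(k, N, K, n):
    """Exact probability P(X = k); zero outside the feasible range."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def prob(k, N, K, n, kind="exact"):
    """Return P(X = k), P(X <= k), or P(X >= k) depending on kind."""
    lowest, highest = max(0, n - (N - K)), min(n, K)
    if kind == "exact":
        return pmf(k, N, K, n)
    if kind == "at_most":
        # Sum from the smallest feasible value up to k
        return sum(pmf(i, N, K, n) for i in range(lowest, min(k, highest) + 1))
    if kind == "at_least":
        # Sum from k up to the largest feasible value
        return sum(pmf(i, N, K, n) for i in range(max(k, lowest), highest + 1))
    raise ValueError("kind must be 'exact', 'at_most', or 'at_least'")


# Rejection rule from the text: reject the lot when 3 or more defects appear.
# Illustrative inputs: 50 items, 5 defective, sample of 10.
print(f"P(X >= 3) = {prob(3, 50, 5, 10, 'at_least'):.5f}")
```

Because P(X ≤ k – 1) and P(X ≥ k) cover every feasible outcome exactly once, the two always sum to 1, which is a handy sanity check.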
Worked example
Imagine a batch of 50 components contains 5 defective parts. You randomly inspect 10 parts without replacement. What is the probability of finding exactly 2 defectives?
- N = 50
- K = 5
- n = 10
- k = 2
The exact hypergeometric probability is approximately 0.20984, or 20.984 percent. The expected number of defectives found is n(K/N) = 1. That tells you that observing two defectives is above the mean, but still not uncommon. This type of result is highly useful in acceptance sampling and operational risk analysis.
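The arithmetic behind this example can be checked with exact integers. The short Python sketch below uses only the standard library; the variable names are chosen for illustration.

```python
from fractions import Fraction
from math import comb

N, K, n, k = 50, 5, 10, 2

# Ways to pick 2 of the 5 defectives and 8 of the 45 good parts
favorable = comb(K, k) * comb(N - K, n - k)
# All possible samples of 10 parts from 50
total = comb(N, n)

p = Fraction(favorable, total)  # exact rational probability
expected = n * K / N            # expected number of defectives in the sample

print(favorable, total)         # 2155531950 10272278170
print(round(float(p), 5))       # 0.20984
print(expected)                 # 1.0
```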
Expected value and variance
The expected value of a hypergeometric random variable is:
E[X] = n(K / N)
The variance is:
Var(X) = n(K / N)(1 – K / N)[(N – n) / (N – 1)]
The term [(N – n) / (N – 1)] is called the finite population correction. For any sample of more than one item it is less than 1, so it shrinks the variance compared with a binomial model that uses p = K/N. The reason is that sampling without replacement makes the draws dependent, which reduces uncertainty relative to independent draws.
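As a quick illustration of the two formulas, here is a small Python sketch. The function name `hypergeom_mean_var` is hypothetical, and the sketch assumes N is at least 2 so that the correction term is defined.

```python
def hypergeom_mean_var(N: int, K: int, n: int):
    """Return (mean, variance, finite population correction). Assumes N >= 2."""
    p = K / N                      # success proportion in the population
    mean = n * p                   # E[X] = n(K / N)
    fpc = (N - n) / (N - 1)        # finite population correction
    variance = n * p * (1 - p) * fpc
    return mean, variance, fpc


mean, variance, fpc = hypergeom_mean_var(N=50, K=5, n=10)
binomial_variance = 10 * 0.1 * 0.9  # same n and p, independent draws

print(f"mean={mean:.3f} variance={variance:.5f} fpc={fpc:.5f}")
print(f"binomial variance with the same p: {binomial_variance:.5f}")
```

With N = 50, K = 5, and n = 10, the correction is 40/49, so the hypergeometric variance is about 18 percent smaller than the binomial value of 0.9.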
Comparison table: hypergeometric vs binomial
| Feature | Hypergeometric | Binomial | Practical implication |
|---|---|---|---|
| Population | Finite and fixed | Conceptually infinite or very large | Use hypergeometric when the sample meaningfully changes the remaining population. |
| Sampling method | Without replacement | With replacement or independent trials | Dependency between draws pushes you toward hypergeometric. |
| Success probability per draw | Changes from draw to draw | Constant across draws | Hypergeometric better fits card draws, lot inspection, and audits. |
| Mean | nK/N | np | The means can match when p = K/N. |
| Variance | n(K/N)(1 – K/N)(N – n)/(N – 1) | np(1 – p) | With p = K/N, the hypergeometric variance is smaller for any n > 1, and noticeably smaller when the sample is a substantial fraction of the population. |
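To see the comparison numerically, the sketch below evaluates both models for the same success proportion but two different population sizes. It is an illustration only; the function names and the choice of N = 5000 as the "large" population are assumptions.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """Exact probability of k successes when sampling without replacement."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def binom_pmf(k, n, p):
    """Probability of k successes in n independent trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def compare(N, K, n, k):
    """Return (hypergeometric, binomial) probabilities using p = K / N."""
    return hypergeom_pmf(k, N, K, n), binom_pmf(k, n, K / N)


# Same 10% success proportion, sample of 10, exactly 2 successes
for N, K in [(50, 5), (5000, 500)]:
    h, b = compare(N, K, n=10, k=2)
    print(f"N={N:>4}: hypergeometric={h:.5f}  binomial={b:.5f}  gap={abs(h - b):.5f}")
```

The gap between the two models narrows as the sample becomes a smaller fraction of the population, which is why the binomial model is often treated as an approximation for very large lots.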
Real probability examples
The hypergeometric distribution appears in many familiar settings. The following table includes exact probabilities that are commonly used in teaching and applied probability.
| Scenario | N | K | n | Target | Exact probability |
|---|---|---|---|---|---|
| Draw exactly 2 aces in a 5-card poker hand | 52 | 4 | 5 | P(X = 2) | 0.03993, or 3.993% |
| Find exactly 1 defective when sampling 10 from a lot of 100 with 8 defectives | 100 | 8 | 10 | P(X = 1) | 0.40150, or 40.150% |
| Find exactly 0 defective when sampling 5 from a lot of 40 with 3 defectives | 40 | 3 | 5 | P(X = 0) | 0.66245, or 66.245% |
| Draw exactly 1 heart in 3 cards from a standard deck | 52 | 13 | 3 | P(X = 1) | 0.43588, or 43.588% |
Why visualizing the distribution helps
A single probability value is useful, but a chart is often more informative. The distribution view shows where the most likely outcomes occur, whether the distribution is concentrated or spread out, and how your selected value compares to neighboring values. For example, if the selected outcome is near the center of the distribution, it is often relatively likely. If it lies in the tail, its probability is usually much smaller, and the tails are where threshold-based decision rules typically place their cutoffs.
The chart also reinforces an important point: exact probabilities for one value of k can be much smaller than cumulative probabilities over a range of values. This matters in quality control, risk screening, and compliance review, where decisions are usually based on cutoffs rather than single exact counts.
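The idea behind the chart can be sketched in plain text: compute the probability of every feasible value of X and draw a bar proportional to each one. This is not the calculator's rendering code; `pmf_table` and `text_chart` are hypothetical helper names.

```python
from math import comb


def pmf_table(N, K, n):
    """Map every feasible k to P(X = k)."""
    lowest, highest = max(0, n - (N - K)), min(n, K)
    total = comb(N, n)
    return {k: comb(K, k) * comb(N - K, n - k) / total
            for k in range(lowest, highest + 1)}


def text_chart(N, K, n, selected, width=40):
    """Render the distribution as text bars, marking the selected k."""
    table = pmf_table(N, K, n)
    peak = max(table.values())
    lines = []
    for k, p in table.items():
        bar = "#" * round(width * p / peak)  # scale so the mode fills the width
        marker = "  <- selected" if k == selected else ""
        lines.append(f"k={k:>2}  {p:.5f}  {bar}{marker}")
    return "\n".join(lines)


print(text_chart(N=50, K=5, n=10, selected=2))
```

Scaling each bar against the most likely outcome keeps the shape readable even when individual probabilities are small.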
How to interpret results responsibly
- A low exact probability does not always imply a rare event overall. Nearby outcomes might have substantial combined probability.
- Check whether without replacement is realistic. If the sample is a tiny fraction of the population, the binomial approximation may be acceptable, but the hypergeometric result remains the exact one.
- Keep the feasible range in mind. Some values of k are impossible given your inputs.
- Use cumulative probabilities for decision thresholds. Acceptance sampling plans, audit triggers, and screening protocols usually rely on cumulative logic.
Common mistakes to avoid
- Using a binomial calculator when sampling is clearly without replacement from a finite population.
- Entering inconsistent values, such as a sample size larger than the population.
- Choosing an impossible success count, such as more successes than draws or more successes than exist in the population.
- Interpreting P(X = k) as if it were the same as P(X ≥ k) or P(X ≤ k).
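Several of these mistakes can be caught before any probability is computed. The sketch below shows one way input checks might look; it is a hypothetical example, not the calculator's actual validation logic, and the error messages are invented for illustration.

```python
def validate_inputs(N, K, n, k):
    """Return a list of problems; an empty list means the inputs are usable."""
    errors = []
    for name, value in (("N", N), ("K", K), ("n", n), ("k", k)):
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            errors.append(f"{name} must be a non-negative integer")
    if errors:
        return errors

    if N < 1:
        errors.append("N must be at least 1")
    if K > N:
        errors.append("K cannot exceed N")
    if n > N:
        errors.append("n cannot exceed N")
    if errors:
        return errors

    # Feasible range for the observed number of successes
    lowest, highest = max(0, n - (N - K)), min(n, K)
    if not lowest <= k <= highest:
        errors.append(f"k must be between {lowest} and {highest} for these inputs")
    return errors


print(validate_inputs(50, 5, 10, 2))   # []
print(validate_inputs(50, 5, 60, 2))   # ['n cannot exceed N']
print(validate_inputs(50, 5, 10, 6))   # ['k must be between 0 and 5 for these inputs']
```

Treating an out-of-range k as an error is a design choice; reporting a probability of zero would be equally defensible.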
Applications in statistics, auditing, and science
In formal statistics, the hypergeometric distribution appears in exact tests, finite-population inference, survey design, and quality engineering. In auditing, it helps model findings in finite sets of claims, invoices, or ballots. In ecology, it appears in capture and classification studies. In genetics and bioinformatics, related forms of the hypergeometric model are often used in enrichment analysis, where the question becomes whether a category appears in a sample more often than expected by chance.
Government and university resources often discuss the hypergeometric model in the context of finite population sampling and exact probability. If you want a deeper mathematical treatment, see these authoritative references:
- Penn State University: Hypergeometric Distribution
- University-hosted introductory statistics reference on hypergeometric distribution
- U.S. Census Bureau reference material involving hypergeometric testing concepts
Why this calculator is useful
This calculator simplifies a distribution that can become computationally tedious by hand, especially when the numbers are large. It calculates exact and cumulative probabilities, displays the expected value and variance, validates your inputs, and renders a clean probability chart. Whether you are studying introductory probability or making a real inspection decision, this combination of speed, accuracy, and visualization saves time and reduces errors.
In short, use a hypergeometric random variable calculator whenever you are counting successes in a sample drawn without replacement from a finite population. If your problem fits that structure, the hypergeometric distribution is not just a good choice, it is the mathematically appropriate one.