How to Calculate Covariance of Two Variables
Use this covariance calculator to measure whether two variables move in the same direction, in opposite directions, or with little linear relationship. Enter paired data values, choose sample or population covariance, and see the result, the means, and a chart of the relationship.
Interactive Covariance Calculator
Enter two equal-length lists of numbers. Each position represents a paired observation. Example: X = 2,4,6 and Y = 1,3,5.
Expert Guide: How to Calculate Covariance of Two Variables
Covariance is one of the most useful introductory statistics tools for understanding how two variables move together. If you are studying finance, economics, business analytics, psychology, engineering, or data science, covariance helps you quantify whether increases in one variable tend to occur alongside increases in another variable, or whether one tends to rise while the other falls. Although the underlying idea is simple, many learners get confused by the formula, by the difference between sample and population covariance, and by how to interpret the final number. This guide explains the full process clearly and practically.
At a high level, covariance measures the joint variability of two variables. Suppose you have one variable X and another variable Y. If values of X above their mean tend to occur with values of Y above their mean, then the covariance is positive. If values of X above their mean tend to occur with values of Y below their mean, the covariance is negative. If there is no consistent linear pattern in how they move around their means, the covariance tends to be close to zero.
What covariance tells you
Covariance does not simply tell you whether two variables are “related.” Instead, it specifically summarizes how their deviations from their own means align. Think of each observation as a pair. For every paired observation, you look at how far X is from the mean of X and how far Y is from the mean of Y. Then you multiply those deviations together.
- If both deviations are positive, their product is positive.
- If both deviations are negative, their product is also positive.
- If one deviation is positive and the other is negative, the product is negative.
- Adding these products across all pairs reveals whether the variables generally move together or in opposite directions.
That is the essence of covariance. It is a directional measure of paired movement around the mean.
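The sign rules above can be checked with a few lines of Python. This is a minimal sketch: the helper name `deviation_products` and both small datasets are made up for illustration and do not come from the calculator.

```python
def deviation_products(xs, ys):
    """Return (x_i - mean_x) * (y_i - mean_y) for each paired observation."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]

# Illustrative data: Y rises as X rises, so no product is negative.
same_direction = deviation_products([1, 2, 3], [10, 20, 30])

# Illustrative data: Y falls as X rises, so no product is positive.
opposite_direction = deviation_products([1, 2, 3], [30, 20, 10])

print(same_direction, sum(same_direction))
print(opposite_direction, sum(opposite_direction))
```

The sum of the products is positive in the first case and negative in the second, which is exactly the sign the covariance will carry once the sum is divided by n or n – 1.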
The covariance formulas
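There are two common formulas, depending on whether your data represent an entire population or only a sample drawn from a larger population.

Population covariance:

$$\operatorname{Cov}(X, Y) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$$

Sample covariance:

$$s_{xy} = \frac{1}{n - 1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$$

In these formulas: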
- xi is an observed value of X
- yi is the paired observed value of Y
- x̄ is the mean of X
- ȳ is the mean of Y
- n is the number of paired observations
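The sample formula divides by n – 1 rather than n because the means x̄ and ȳ are estimated from the same sample, which makes the deviations slightly smaller on average than deviations from the true population means. Dividing by n – 1 (Bessel's correction) compensates for this, just as it does for sample variance.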
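Both versions of the formula differ only in the denominator, which a short function can make explicit. The following is a sketch rather than a reference implementation; the function name `covariance` and its `sample` flag are assumptions chosen for this example.

```python
def covariance(xs, ys, sample=True):
    """Covariance of paired data; divides by n - 1 if sample, else by n."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must contain the same number of values")
    n = len(xs)
    if n < (2 if sample else 1):
        raise ValueError("not enough paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of paired deviations from the means.
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - 1 if sample else n)

# The example pairs from the calculator instructions: X = 2,4,6 and Y = 1,3,5.
sample_cov = covariance([2, 4, 6], [1, 3, 5], sample=True)       # 8 / 2 = 4.0
population_cov = covariance([2, 4, 6], [1, 3, 5], sample=False)  # 8 / 3
print(sample_cov, population_cov)
```

With only three pairs the two results differ noticeably; the gap narrows as n grows because n – 1 and n become proportionally closer.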
Step-by-step: how to calculate covariance manually
Let’s walk through a simple example. Suppose a student tracks hours studied and quiz scores across five quizzes:
| Observation | Hours Studied (X) | Quiz Score (Y) |
|---|---|---|
| 1 | 2 | 65 |
| 2 | 4 | 70 |
| 3 | 5 | 78 |
| 4 | 7 | 88 |
| 5 | 9 | 95 |
- Compute the mean of X. Add the X values and divide by 5. Mean X = (2 + 4 + 5 + 7 + 9) / 5 = 5.4.
- Compute the mean of Y. Mean Y = (65 + 70 + 78 + 88 + 95) / 5 = 79.2.
- Find each deviation from the mean. For each row, subtract 5.4 from X and 79.2 from Y.
- Multiply paired deviations. For each row, compute (xi – x̄)(yi – ȳ).
- Add the products. This gives the total joint variation.
- Divide by n or n – 1. Use n for a population and n – 1 for a sample.
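When you do the arithmetic, the five deviation products are 48.28, 12.88, 0.48, 14.08, and 56.88, which sum to 132.6. Dividing by n – 1 = 4 gives a sample covariance of 33.15, and dividing by n = 5 gives a population covariance of 26.52. The covariance is positive, which tells you that more hours studied tend to accompany higher quiz scores. The magnitude depends on the units of measurement (here, hours times score points), which leads to an important caveat: covariance is not standardized.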
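The six manual steps can be reproduced for the study-hours table as a quick check of the hand calculation. This is a plain-Python sketch; the variable names are chosen only for this example.

```python
hours = [2, 4, 5, 7, 9]         # X: hours studied
scores = [65, 70, 78, 88, 95]   # Y: quiz scores
n = len(hours)

# Steps 1 and 2: the means (5.4 and 79.2).
mean_hours = sum(hours) / n
mean_scores = sum(scores) / n

# Steps 3 and 4: deviations from the means and their products, row by row.
products = []
for x, y in zip(hours, scores):
    dx = x - mean_hours
    dy = y - mean_scores
    products.append(dx * dy)
    print(f"{x:>2} {y:>3} {dx:>6.1f} {dy:>6.1f} {dx * dy:>7.2f}")

# Step 5: add the products (about 132.6).
total = sum(products)

# Step 6: divide by n (population) or n - 1 (sample).
population_cov = total / n        # about 26.52
sample_cov = total / (n - 1)      # about 33.15
print(round(total, 2), round(population_cov, 2), round(sample_cov, 2))
```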
How to interpret the sign of covariance
The sign is usually the easiest part to interpret:
- Positive covariance: higher values of X tend to pair with higher values of Y, and lower values of X tend to pair with lower values of Y.
- Negative covariance: higher values of X tend to pair with lower values of Y.
- Zero or near-zero covariance: little linear co-movement is present.
For example, outside temperature and ice cream sales often show positive covariance because warmer days are associated with greater demand. By contrast, price and quantity demanded often show negative covariance because higher prices may be associated with lower purchases.
Why the magnitude of covariance can be hard to compare
Unlike correlation, covariance depends on the units of the variables. If you measure one variable in dollars instead of cents, or in kilograms instead of grams, the numerical covariance changes. That means a covariance of 25 is not automatically “stronger” than a covariance of 10 unless both datasets use comparable scales and units.
This is why analysts often compute correlation after covariance. Correlation standardizes covariance by dividing by the product of the variables’ standard deviations, producing a value between -1 and 1.
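To see the unit dependence concretely, the sketch below re-expresses the study-hours data in minutes and compares covariance with correlation. The helper names `sample_covariance` and `correlation` are assumptions made for this illustration.

```python
import math

def sample_covariance(xs, ys):
    """Sample covariance (n - 1 denominator) of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return sample_covariance(xs, ys) / math.sqrt(
        sample_covariance(xs, xs) * sample_covariance(ys, ys)
    )

hours = [2, 4, 5, 7, 9]
scores = [65, 70, 78, 88, 95]
minutes = [h * 60 for h in hours]   # same study time, different unit

cov_hours = sample_covariance(hours, scores)
cov_minutes = sample_covariance(minutes, scores)   # 60 times larger
r_hours = correlation(hours, scores)
r_minutes = correlation(minutes, scores)           # unchanged by the rescaling

print(cov_hours, cov_minutes)
print(r_hours, r_minutes)
```

Changing hours to minutes multiplies the covariance by 60, while the correlation stays the same (close to 1 for this dataset), which is why correlation is the better choice for comparing relationships across datasets.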
| Measure | What It Shows | Scale | Best Use |
|---|---|---|---|
| Covariance | Direction of joint movement and raw co-variation | Depends on units of X and Y | Understanding directional movement and matrix calculations |
| Correlation | Direction and standardized strength of linear relationship | Always between -1 and 1 | Comparing relationships across different datasets |
| Variance | Spread of one variable around its mean | Squared units of the variable | Measuring dispersion of a single dataset |
Sample covariance vs population covariance
You should use population covariance when you have every member of the group you care about. For example, if a company studies all 12 months of revenue and ad spend for a single completed year and treats those 12 months as the full set of interest, population covariance may be appropriate.
You should use sample covariance when your observations are only a subset of a larger population. If you analyze 50 customers out of thousands, or 100 trading days out of many years of market data, you are usually working with a sample. In practice, many real-world analyses use sample covariance.
Common mistakes when calculating covariance
- Mismatched pairs: Covariance requires paired observations. If X has five values and Y has five values, the first X must correspond to the first Y, and so on.
- Using the wrong denominator: Dividing by n instead of n – 1 changes the result. Be clear whether you are treating the data as a population or a sample.
- Interpreting magnitude as universal strength: Since covariance is not standardized, large numbers do not automatically imply a stronger relationship across datasets.
- Ignoring outliers: Extreme values can strongly affect covariance.
- Assuming zero covariance means no relationship: A non-linear relationship can still exist even if covariance is near zero.
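The outlier problem in particular is easy to underestimate. The sketch below uses invented numbers (not real measurements) to show a single extreme value flipping the sign of the covariance; `sample_covariance` is a helper defined only for this example.

```python
def sample_covariance(xs, ys):
    """Sample covariance (n - 1 denominator) of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

xs = [1, 2, 3, 4, 5]
ys_clean = [2, 4, 6, 8, 10]      # illustrative: Y rises steadily with X
ys_outlier = [2, 4, 6, 8, -40]   # same data with one extreme final value

baseline = sample_covariance(xs, ys_clean)        # 5.0
with_outlier = sample_covariance(xs, ys_outlier)  # -20.0
print(baseline, with_outlier)
```

Four of the five pairs still rise together, yet the single outlier dominates the sum of deviation products. Plotting the data before trusting the number is a reasonable habit.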
Real-world comparison examples
To make interpretation more concrete, here is a simple comparison of plausible paired variables and the expected sign of covariance. These are illustrative examples used to help build intuition.
| Paired Variables | Typical Relationship | Expected Covariance Sign | Why |
|---|---|---|---|
| Daily temperature and residential electricity cooling demand | Same-direction movement | Positive | Hotter days often increase air-conditioning use. |
| Product price and units sold | Opposite-direction movement | Negative | Higher prices often reduce quantity demanded. |
| Study time and exam score | Same-direction movement | Positive | More preparation often aligns with better performance. |
| Speed and travel time over a fixed distance | Opposite-direction movement | Negative | As speed rises, travel time typically falls. |
Covariance in finance and portfolio analysis
Covariance is especially important in finance because it helps explain how asset returns move together. Portfolio risk is not just about the risk of individual assets. It also depends on whether those assets tend to rise and fall together. If two assets have strongly positive covariance, combining them may provide less diversification benefit. If they have lower or negative covariance, mixing them may reduce overall portfolio volatility.
This is one reason covariance matrices are foundational in modern portfolio theory. A covariance matrix summarizes covariances among many asset return series, helping analysts estimate portfolio variance and optimize asset allocation decisions.
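For two assets, portfolio variance can be written as w_a² Var(A) + w_b² Var(B) + 2 w_a w_b Cov(A, B), where w_a and w_b are the portfolio weights. The sketch below applies that identity to hypothetical return series; the returns, the weights, and the helper name are all assumptions made for illustration, not market data.

```python
def sample_covariance(xs, ys):
    """Sample covariance (n - 1 denominator) of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical periodic returns for two assets that tend to move oppositely.
returns_a = [0.02, -0.01, 0.03, 0.00, 0.01]
returns_b = [-0.01, 0.02, -0.02, 0.01, 0.00]
w_a, w_b = 0.6, 0.4   # hypothetical portfolio weights summing to 1

var_a = sample_covariance(returns_a, returns_a)   # variance is Cov(A, A)
var_b = sample_covariance(returns_b, returns_b)
cov_ab = sample_covariance(returns_a, returns_b)  # negative for this data

# Two-asset portfolio variance from the covariance terms.
portfolio_variance = w_a**2 * var_a + w_b**2 * var_b + 2 * w_a * w_b * cov_ab

# Cross-check: variance of the blended return series computed directly.
portfolio_returns = [w_a * a + w_b * b for a, b in zip(returns_a, returns_b)]
direct_variance = sample_covariance(portfolio_returns, portfolio_returns)

print(var_a, var_b, cov_ab)
print(portfolio_variance, direct_variance)
```

Because the covariance term is negative here, the portfolio variance comes out lower than the variance of either asset alone, which is the diversification effect described above.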
Covariance in data science and machine learning
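In data science, covariance appears in feature analysis, dimensionality reduction, and principal component analysis. A covariance matrix shows how every pair of features co-varies, with each feature's variance on the diagonal. Because raw covariance depends on each feature's scale, features are often standardized first. Pairs whose covariance is then far from zero (that is, pairs with high correlation in absolute value) may contain overlapping information, while pairs with covariance near zero may contribute more distinct linear signal. Understanding covariance can therefore improve feature engineering and model interpretation.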
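A covariance matrix is simply the pairwise covariance computed for every combination of features. The sketch below builds one in plain Python from a small hypothetical dataset; the feature names and values are invented for illustration. In practice a library routine such as NumPy's `numpy.cov` is the usual choice.

```python
def sample_covariance(xs, ys):
    """Sample covariance (n - 1 denominator) of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def covariance_matrix(columns):
    """Map each pair of feature names to the covariance of their columns."""
    names = list(columns)
    return {
        a: {b: sample_covariance(columns[a], columns[b]) for b in names}
        for a in names
    }

# Hypothetical features, one value per observation in each column.
features = {
    "height_cm": [150, 160, 170, 180],
    "weight_kg": [50, 58, 69, 80],
    "shoe_size": [36, 38, 41, 43],
}

matrix = covariance_matrix(features)
for name, row in matrix.items():
    print(name, [round(value, 2) for value in row.values()])
```

The matrix is symmetric, and each diagonal entry is the variance of that feature, since the covariance of a variable with itself is its variance.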
What a covariance near zero really means
A covariance close to zero means there is little linear pattern in how the two variables move around their means. However, this does not prove the variables are independent. For instance, a curved relationship can produce near-zero covariance even though Y clearly depends on X. This is why scatter plots are so useful. Visual inspection can reveal patterns that a single summary number hides.
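A small constructed example makes the point: below, Y is exactly X squared, so Y is fully determined by X, yet the covariance is zero because the X values are symmetric around their mean. The data and helper name are assumptions chosen for this illustration.

```python
def sample_covariance(xs, ys):
    """Sample covariance (n - 1 denominator) of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

xs = [-2, -1, 0, 1, 2]
ys = [x**2 for x in xs]   # perfect, but non-linear, dependence on X

cov = sample_covariance(xs, ys)
print(cov)   # zero: positive and negative deviation products cancel
```

The products on the left half of the curve cancel those on the right half, so the summary number hides a relationship that a scatter plot would show immediately.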
How this calculator works
The calculator above follows the standard statistical process. It reads your paired X and Y values, checks that both series contain the same number of valid numeric observations, computes the mean of each variable, calculates each pair’s deviation product, sums those products, and divides by either n or n – 1 depending on your selection. It also visualizes the data with either a scatter plot or a bar chart of deviation products so you can move beyond a single number.
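The same pipeline can be sketched in Python. This is not the calculator's actual source code, which is not shown on this page; the function names `parse_values` and `calculate`, the comma-separated input format, and the shape of the returned dictionary are all assumptions made for illustration.

```python
def parse_values(text):
    """Turn a comma-separated string into a list of floats."""
    values = []
    for token in text.split(","):
        token = token.strip()
        if token:
            values.append(float(token))   # raises ValueError if not numeric
    return values

def calculate(x_text, y_text, sample=True):
    """Validate paired input and return covariance with supporting values."""
    xs = parse_values(x_text)
    ys = parse_values(y_text)
    if len(xs) != len(ys):
        raise ValueError("X and Y must contain the same number of values")
    n = len(xs)
    if n < (2 if sample else 1):
        raise ValueError("not enough paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # One deviation product per pair; these could also feed a bar chart.
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return {
        "n": n,
        "mean_x": mean_x,
        "mean_y": mean_y,
        "deviation_products": products,
        "covariance": sum(products) / (n - 1 if sample else n),
    }

result = calculate("2, 4, 6", "1, 3, 5", sample=True)
print(result["mean_x"], result["mean_y"], result["covariance"])
```

Returning the individual deviation products alongside the final number is a design choice that supports the charted view: each product shows how much one pair contributes to the total.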
Final takeaway
To calculate covariance of two variables, start with paired data, compute the mean of each variable, measure each value’s deviation from its mean, multiply the paired deviations, sum those products, and divide by n or n – 1. The sign of the result shows direction. Positive values mean same-direction movement, negative values mean opposite-direction movement, and values near zero suggest weak linear co-movement. Because covariance depends on units, use correlation when you need a standardized comparison. Once you understand that covariance is really about paired deviations from the mean, the concept becomes much easier to apply correctly.