How To Calculate Covariance Between Two Variables In Excel

How to Calculate Covariance Between Two Variables in Excel

Use this premium calculator to estimate covariance, compare sample vs population covariance, and visualize how two variables move together. Paste your X and Y values exactly as you would arrange them in Excel, then review the formula, interpretation, and chart.

Excel-ready interpretation Sample and population modes Interactive scatter chart

Quick Excel Reminder

  • Use =COVARIANCE.S(array1,array2) for a sample.
  • Use =COVARIANCE.P(array1,array2) for a population.
  • Positive covariance means the variables tend to rise together.
  • Negative covariance means one tends to rise when the other falls.
  • A covariance near zero suggests little linear co-movement.

Covariance Calculator

Enter numbers separated by commas, spaces, or line breaks.
The number of Y values must match the number of X values.
Enter your X and Y values, choose sample or population covariance, and click Calculate Covariance.

Visualization

The chart plots each X,Y pair. A clear upward pattern often aligns with positive covariance, while a downward pattern often aligns with negative covariance.

How to Calculate Covariance Between Two Variables in Excel

Covariance is one of the most useful statistics for understanding whether two variables move together. If one variable tends to increase when another increases, covariance is positive. If one variable tends to increase when the other decreases, covariance is negative. If there is no consistent joint movement, covariance will usually be near zero. In Excel, covariance is easy to compute with built-in functions, but knowing what the number means is just as important as knowing which formula to use.

When people search for how to calculate covariance between two variables in Excel, they are often trying to compare sales and ad spend, returns on two investments, hours studied and test scores, temperature and electricity demand, or many other related datasets. Excel supports both sample covariance and population covariance, which is essential because the correct version depends on whether your data represents a full population or only a sample drawn from a larger group.

This guide explains what covariance is, how to calculate it manually, which Excel function to use, common mistakes to avoid, and how to interpret your results. If you want a practical workflow, the fastest method in Excel is to place one variable in one column, place the second variable in another column, and then use either COVARIANCE.S or COVARIANCE.P.

What covariance measures

Covariance measures the directional relationship between two variables. It does not standardize the scale of the data, so its size depends on the units involved. That is why covariance is excellent for showing the direction of joint movement, but less ideal for comparing relationships across very different datasets. For standardized comparison, analysts often use correlation after calculating covariance.

  • Positive covariance: both variables tend to move in the same direction.
  • Negative covariance: the variables tend to move in opposite directions.
  • Covariance near zero: there is little linear co-movement.

The covariance formula

For a sample, covariance is calculated as the sum of the products of each variable’s deviations from its mean, divided by n – 1. For a population, the same numerator is divided by n. Written conceptually:

  • Sample covariance: sum of (xi – x-bar)(yi – y-bar) divided by n – 1
  • Population covariance: sum of (xi – mu-x)(yi – mu-y) divided by n

The only difference is the denominator. This small change matters, especially when your dataset is not large. In Excel, that difference is represented by two separate functions, making it easier to choose the right method.

Excel functions for covariance

  1. Use =COVARIANCE.S(array1,array2) when your data is a sample.
  2. Use =COVARIANCE.P(array1,array2) when your data is the entire population.
  3. Make sure both arrays contain the same number of observations.
  4. Arrange data so each row contains one matched X,Y pair.

For example, if values for variable X are in cells A2 through A11 and values for variable Y are in cells B2 through B11, then your Excel formulas would be:

  • =COVARIANCE.S(A2:A11,B2:B11)
  • =COVARIANCE.P(A2:A11,B2:B11)

Step by step: how to calculate covariance in Excel

  1. Open Excel and enter the first variable in one column, such as Column A.
  2. Enter the second variable in the next column, such as Column B.
  3. Ensure each row is a matched pair. Row 2 in Column A must correspond to Row 2 in Column B.
  4. Click an empty cell where you want the result to appear.
  5. Type either =COVARIANCE.S(A2:A11,B2:B11) or =COVARIANCE.P(A2:A11,B2:B11).
  6. Press Enter.
  7. Interpret the sign and magnitude in context of your units.
Tip: If you are analyzing survey responses, experimental observations, or a subset of company data, sample covariance is often the correct choice. If you truly have every observation in the group of interest, population covariance may be appropriate.

Worked example with real style business data

Suppose a marketing analyst wants to understand whether weekly advertising spend and weekly sales revenue move together. The analyst records the following values:

Week Ad Spend ($000) Sales Revenue ($000)
112110
214118
313116
416125
518133
617130

If these six weeks are just a sample of many possible weeks, the analyst would use COVARIANCE.S. In Excel, with ad spend in A2:A7 and sales in B2:B7, the formula would be =COVARIANCE.S(A2:A7,B2:B7). Because higher ad spend weeks also tend to show higher sales, the result will be positive.

This does not prove causation. Covariance only indicates whether the variables tend to move together. Seasonal demand, promotions, pricing changes, and competitive actions could all influence both variables at the same time. That is why covariance is often the first step in analysis rather than the final conclusion.

Sample covariance vs population covariance

Many Excel users get this wrong because the formulas look similar. The correct choice depends on your data source. If you observed every item in the group you care about, use the population version. If you observed only some items from a bigger group, use the sample version.

Scenario Recommended Excel Function Why
Monthly returns for all 12 funds in a fixed portfolio you fully manage COVARIANCE.P You are using the full population of interest.
Customer spending from 300 surveyed customers out of 20,000 total customers COVARIANCE.S You only observed a sample from a larger population.
Scores from one classroom used to infer performance trends across the district COVARIANCE.S The classroom data is a sample relative to the district.
Measurements from every machine in a small production batch being audited in full COVARIANCE.P The batch itself is the complete group under study.

How to interpret covariance correctly

The sign is usually the easiest part to interpret. Positive means same-direction movement. Negative means opposite-direction movement. The magnitude is more difficult because it depends on the scale of your variables. For example, covariance between age in years and income in dollars may look large simply because income has large numeric values. That does not necessarily mean the relationship is stronger than another dataset with a smaller covariance number.

This is why analysts often compute correlation after covariance. Correlation is the standardized form of covariance and ranges between -1 and 1. Still, covariance remains valuable because it is foundational in finance, econometrics, machine learning, and multivariate statistics. Portfolio variance calculations, for example, rely heavily on covariance among asset returns.

Common Excel mistakes when calculating covariance

  • Mismatched rows: if X and Y are not correctly paired row by row, the result becomes meaningless.
  • Different array lengths: Excel requires equal-length arrays for valid covariance calculation.
  • Using sample instead of population: this can slightly distort the estimate, especially in small datasets.
  • Ignoring outliers: a few extreme values can heavily affect covariance.
  • Over-interpreting the number: covariance does not imply causation and is not scaled like correlation.

Manual method in Excel without the built-in function

If you want to understand the mechanics, you can also calculate covariance manually inside Excel. This is useful for teaching, audits, or validating formulas in a model.

  1. Calculate the mean of X with =AVERAGE(A2:A11).
  2. Calculate the mean of Y with =AVERAGE(B2:B11).
  3. In a new column, compute each xi – x-bar.
  4. In another column, compute each yi – y-bar.
  5. Multiply the deviations row by row.
  6. Sum those products.
  7. Divide by n – 1 for a sample or n for a population.

This approach makes the logic transparent. You can literally see how positive and negative cross-products contribute to the final result. Rows where both variables are above their means create positive contributions. Rows where one is above its mean and the other is below its mean create negative contributions.

Example using education style data

Imagine a researcher reviews hours studied and exam scores for a sample of students. If more study hours are generally associated with higher scores, covariance should be positive. If there is no pattern, the covariance may be close to zero. The researcher might then use a scatter plot in Excel to confirm the visual trend, followed by correlation or regression for deeper analysis.

Why finance professionals use covariance so often

Covariance is central to portfolio theory. Investors want to combine assets that do not always move together. If two stocks have highly positive covariance, they may rise and fall at the same time, which reduces diversification benefits. If their covariance is lower or negative, combining them may reduce portfolio risk. That is one reason finance textbooks and spreadsheet models emphasize covariance matrices.

Public educational resources from universities and government agencies can help reinforce statistical interpretation. For further reading, consult: U.S. Census Bureau, National Institute of Standards and Technology, and Penn State Statistics Online.

When covariance is useful and when it is not enough

Covariance is useful when you want a direct measure of how two variables move together in their original units. It is especially useful inside variance-covariance matrices and mathematical models. However, if your main goal is comparing relationship strength across datasets with very different scales, correlation is usually better. If your goal is prediction, you may need regression instead.

  • Use covariance to detect direction of joint variation.
  • Use correlation to compare relationships on a standardized scale.
  • Use regression when you want to model or predict one variable from another.

Final takeaway

To calculate covariance between two variables in Excel, organize your paired values in two columns and use COVARIANCE.S for sample data or COVARIANCE.P for population data. The sign tells you the direction of movement, while the magnitude reflects both the relationship and the scale of the variables. Always pair rows correctly, choose the right function, and interpret the result in context. If you need a quick answer before opening Excel, the calculator above gives you the same core logic instantly and visualizes the relationship in a chart.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top