Stata Percentage Calculator

How to Calculate Percentage of Variable in Stata

Use this premium calculator to compute percentages from raw values, preview the exact Stata command pattern you need, and visualize the result instantly. This tool is especially useful when you want to convert a count, subtotal, or category frequency into a percent of a total inside Stata.

What this helps you do

Common Stata pattern:
gen pct = (variable / total) * 100

Example:
gen income_pct = (income / household_total) * 100

Expert Guide: How to Calculate Percentage of Variable in Stata

If you are learning how to calculate the percentage of a variable in Stata, the core idea is straightforward: divide the value of interest by the appropriate total and multiply by 100. The challenge is not the arithmetic. The real challenge is choosing the correct denominator, deciding whether you need observation-level or group-level percentages, and writing efficient Stata code that is easy to audit later.

In Stata, percentages can be produced in several ways depending on your goal. You might want the percentage of one numeric variable relative to another numeric variable, the percentage distribution of a categorical variable, or the percentage share of a subgroup within a panel, household, firm, county, or survey domain. Each use case looks similar on the surface, but the command pattern differs. That is why understanding both the formula and the data structure matters.

The basic percentage formula

The standard percentage formula is:

percentage = (part / total) * 100

In Stata, a direct observation-level calculation usually looks like this:

gen pct = (part_variable / total_variable) * 100

For example, if you have employee sales and company total sales recorded on the same row, you can compute the employee share as:

gen sales_pct = (employee_sales / company_sales) * 100
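
To try this pattern without your own data, here is a minimal sketch. The three employees and their sales figures are hypothetical toy values, entered so that the company total (1000) is repeated on every row.

```stata
* Hypothetical toy data: the company total is repeated on every row
clear
input str8 employee employee_sales company_sales
"Ana"   200 1000
"Ben"   300 1000
"Chris" 500 1000
end

* Each employee's share of company sales, on a 0-100 scale
gen sales_pct = (employee_sales / company_sales) * 100
list employee sales_pct
```

The listed shares should be 20, 30, and 50, which sum to 100 because the three rows together make up the whole company total.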

When to use generate versus egen

Many Stata users confuse generate and egen. Use generate when both the numerator and denominator already exist row by row. Use egen when you need to create a total before calculating the percentage. For instance, if each observation is a person and you need a household total first, you would often use egen with a grouping variable:

bysort household_id: egen household_income = total(income)
gen income_share = (income / household_income) * 100

This pattern is extremely common in microdata analysis, labor data, firm records, and household surveys. The first line creates the denominator within each household. The second line converts each person’s income into a share of household income.
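
The two-step pattern can be checked on a small example. The sketch below uses hypothetical incomes for two households and adds a verification step (the share_check variable is an illustrative name, not part of the pattern itself) confirming that shares sum to 100 within each household.

```stata
* Hypothetical toy data: five people in two households
clear
input household_id income
1 30000
1 20000
2 40000
2 10000
2 50000
end

* Step 1: build the group-level denominator
bysort household_id: egen household_income = total(income)

* Step 2: convert each person's income to a share of that denominator
gen income_share = (income / household_income) * 100

* Sanity check: shares should add up to 100 inside every household
bysort household_id: egen share_check = total(income_share)
assert abs(share_check - 100) < 0.001

list, sepby(household_id)
```

If the assert line stops with an error, the grouping variable or the denominator is probably not what you intended.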

Three common ways to calculate percentages in Stata

1. Percentage of one numeric variable relative to another

This is the easiest case. Suppose a dataset contains profit and revenue. You want profit margin as a percentage:

gen profit_margin = (profit / revenue) * 100

Always check for zero or missing denominators before running this on production data:

gen profit_margin = .
replace profit_margin = (profit / revenue) * 100 if revenue > 0

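Stata returns a missing value (.) rather than an error or an infinite value when you divide by zero, so the guard is not needed to keep the command from failing. Its value is that it makes the rule explicit and leaves profit_margin missing wherever revenue is zero or negative. Be aware that Stata treats missing values as larger than any number, so revenue > 0 is also true when revenue is missing; the result is still missing in that case because arithmetic on a missing value returns missing.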
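
To see how Stata handles problem denominators in practice, the following sketch compares an unguarded and a guarded calculation. The three rows are hypothetical and chosen to include a zero and a missing revenue; raw_margin is an illustrative name.

```stata
* Hypothetical toy data: a normal row, a zero denominator, a missing denominator
clear
input profit revenue
50 200
10 0
20 .
end

* Unguarded: rows 2 and 3 come out as missing (.)
gen raw_margin = (profit / revenue) * 100

* Guarded: the condition documents the rule and also excludes missing revenue
gen profit_margin = (profit / revenue) * 100 if revenue > 0 & !missing(revenue)

list
```

Both variables end up with the same values here. The guarded version is mainly easier to audit, and it behaves differently only when revenue can be negative.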

2. Percentage distribution of a categorical variable

If you want percentages for categories such as sex, region, industry, education level, or response type, the simplest route is often a frequency table:

tabulate region

Stata automatically displays frequencies and percentages. If you need a two-way percentage table, use:

tabulate region sex, row
tabulate region sex, col
tabulate region sex, cell

row gives row percentages, col gives column percentages, and cell gives the percentage of the whole table. This is one of the most useful distinctions in applied work, especially when analyzing survey distributions or cross-tab summaries.
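
As a quick illustration, the options can be tried on the auto dataset that ships with Stata. This sketch assumes that dataset is available through sysuse; foreign and rep78 are its variable names.

```stata
sysuse auto, clear

* Row percentages only; nofreq suppresses the raw counts
tabulate foreign rep78, row nofreq

* Counts plus column and cell percentages in the same table
tabulate foreign rep78, col cell
```

The options can be combined freely, and observations with a missing value on either variable are left out of the table unless you add the missing option.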

3. Percentage share within groups

Many analysts need percentages within a subgroup, such as the percentage of household expenditure by category, vote share within district, or student share within school. In these cases, use bysort and egen:

bysort district: egen district_total = total(votes)
gen vote_share = (votes / district_total) * 100

This method is ideal when the denominator is not a fixed dataset total but a dynamic total that changes by group.
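
The same within-group denominator can also be written with the by() option of egen instead of the bysort prefix. The sketch below uses hypothetical districts, candidates, and vote counts to show that form.

```stata
* Hypothetical toy data: two candidates in each of two districts
clear
input str1 district str5 candidate votes
"A" "Lee"   600
"A" "Diaz"  400
"B" "Khan"  150
"B" "Moore" 350
end

* by() computes the total within each district; no prior sort is needed
egen district_total = total(votes), by(district)
gen vote_share = (votes / district_total) * 100

list, sepby(district)
```

Which form to use is largely a matter of style. The bysort prefix makes the grouping visible at the start of the line, while by() keeps the command on one level.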

Worked examples

Example 1: Individual percentage of total

Suppose 125 survey respondents out of 500 selected a given option. The percentage is:

(125 / 500) * 100 = 25

In Stata, if count_yes is 125 and count_total is 500 for a row or a subset, the code pattern remains:

gen yes_pct = (count_yes / count_total) * 100
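
When you only need the number and not a new variable, display works as a calculator. A short sketch using the figures from this example:

```stata
* One-off arithmetic: shows 25
display (125 / 500) * 100

* Same value shown with one decimal place
display %5.1f (125 / 500) * 100
```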

Example 2: Household expenditure shares

Imagine each row is one expense item within a household. You have variables household_id and expense. To calculate what percentage each item contributes to total household spending:

bysort household_id: egen household_expense_total = total(expense)
gen expense_share = (expense / household_expense_total) * 100

This is a textbook use case for within-group percentages in Stata because the denominator must be created inside each group.

Example 3: Category percentages from tabulate

If your variable is categorical, for example employment_status, and you simply want the percentage in each category, use:

tab employment_status

Stata outputs a table showing counts, percents, and cumulative percentages. This is often faster and cleaner than manually generating indicator variables unless you specifically need a new percentage variable stored in the dataset.
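
If you do need the category percentage stored as a variable, one possible approach is to count observations per category with _N. The sketch below uses the auto dataset that ships with Stata; cat_n and cat_pct are illustrative names.

```stata
sysuse auto, clear

* Under bysort, _N is the number of observations in the current category
bysort foreign: gen cat_n = _N

* Outside bysort, _N is the number of observations in the whole dataset
gen cat_pct = (cat_n / _N) * 100

* Compare with the Percent column (70.27 and 29.73 for this dataset)
tabulate foreign
```

This matches tabulate only when the categorical variable has no missing values, because tabulate drops missing categories by default while _N counts every row.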

Comparison table: best Stata method by use case

Use case	Recommended command pattern	Why it works	Typical output
Variable divided by another variable	`gen pct = (x / y) * 100`	Both numerator and denominator already exist by observation	Observation-level percentage
Share within group	`bysort group: egen total_y = total(y)` `gen pct = (y / total_y) * 100`	Creates a group-specific denominator	Within-group share
Category frequency percentage	`tab variable`	Stata computes counts and percents automatically	Frequency table
Two-way percentage table	`tab rowvar colvar, row` or `col`	Useful for cross-tab comparison	Row or column percentages

Real statistics context: why percentage calculations matter

Percentages are used constantly in policy, education, labor, and public health datasets. Agencies such as the U.S. Census Bureau report demographic, housing, and economic distributions as percentage shares across groups and geographies. In higher education data, institutions often report percentages of enrollment by race, sex, or attendance status rather than raw counts because proportions allow fair comparison between differently sized groups. Labor and health datasets similarly rely on percent distributions to compare categories across states, years, or demographic segments. The figures in the table below are illustrative examples, not published statistics.

Illustrative data context	Raw count	Total	Percentage
Students enrolled part time in a college sample	1,250	5,000	25.0%
Households with broadband in a county sample	18,400	23,000	80.0%
Workers in services in a local labor sample	7,350	14,700	50.0%
Survey respondents selecting option A	125	500	25.0%

Important data quality checks before calculating percentages

Confirm the denominator. A wrong denominator creates a wrong percentage even when the formula is correct.
Check for zero denominators. Use conditional replacement if totals may be zero.
Inspect missing values. Missing numerator or denominator values can propagate into the result.
Verify grouping logic. If you use bysort, make sure the grouping variable truly identifies the intended unit such as household, school, or district.
Decide on scale. Some workflows store shares between 0 and 1, while others store percentages between 0 and 100.
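
The missing-value check matters most when the denominator comes from egen total(), which treats missing values as zero by default. The sketch below, built on hypothetical household data, contrasts the default with the missing option; total_default and total_strict are illustrative names.

```stata
* Hypothetical toy data: household 2 has no reported income at all
clear
input household_id income
1 30000
1 .
2 .
2 .
end

* Default: missing incomes count as 0, so household 2 gets a total of 0
bysort household_id: egen total_default = total(income)

* With the missing option, an all-missing group gets a missing total
bysort household_id: egen total_strict = total(income), missing

gen income_share = (income / total_strict) * 100 if total_strict > 0 & !missing(total_strict)

list, sepby(household_id)
```

Household 1 still gets a total of 30000 under both versions, so its one reported income shows a share of 100 even though a second member's income is unknown. Whether that is acceptable depends on your analysis.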

Example of safe coding for missing or zero totals

gen pct = .
replace pct = (x / y) * 100 if !missing(x) & !missing(y) & y > 0

This pattern is safer in real datasets because it explicitly protects the result from invalid denominators.

How to calculate percentages after collapse or contract

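Two powerful Stata workflows are collapse and contract. Both replace the data in memory with a summarized dataset, so save your work or wrap them in preserve and restore first. If you need category percentages from the dataset itself, contract can create frequencies, from which you then compute percentages: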

contract occupation
egen total_freq = total(_freq)
gen pct = (_freq / total_freq) * 100

If you are summarizing data before percentage calculation, collapse can be equally helpful. For example, you can collapse to district totals first and then compute each district’s share of the national total.
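
Both workflows can be sketched on the auto dataset that ships with Stata. The variable names n_cars, pct, price_all, and price_pct are illustrative. The first part uses the percent() option of contract, which produces the percentage in one step.

```stata
sysuse auto, clear

* contract: one row per category with its count and percentage
preserve
contract foreign, freq(n_cars) percent(pct)
list
restore

* collapse: sum price within each group, then take each group's share of the overall total
collapse (sum) price, by(foreign)
egen price_all = total(price)
gen price_pct = (price / price_all) * 100
list
```

The preserve and restore pair brings the original data back after contract. The collapse step is left unrestored here so the summarized result stays in memory for inspection.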

Row, column, and cell percentages explained

When working with two-way tables, analysts often misuse row and column percentages. Here is the difference:

Row percentages add to 100 across each row. They answer: within this row category, how are observations distributed across columns?
Column percentages add to 100 down each column. They answer: within this column category, how are observations distributed across rows?
Cell percentages are based on the grand total of the table. They answer: what percent of the entire dataset is represented by this cell?

In Stata, these correspond to:

tabulate var1 var2, row
tabulate var1 var2, col
tabulate var1 var2, cell

Common mistakes users make in Stata

Misplacing parentheses in more complex formulas, for example writing a / b + c * 100 when (a / (b + c)) * 100 is intended.
Using the overall dataset total when a group-level total is required.
Forgetting that tabulate percentages are display output, not automatically saved variables.
Creating percentages from already rounded totals, which can distort results.
Comparing percentages from groups with very different sample sizes without checking the counts.

Recommended workflow for analysts

Inspect the variable with summarize, tabulate, or codebook.
Define whether your denominator is observation-level, group-level, or dataset-level.
Create totals if needed using egen total() within the correct bysort structure.
Generate the percentage variable with clear naming, such as income_pct or vote_share.
Validate the result using a few hand-checked observations.
Format or round only after the calculation, not before.
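
The steps above can be sketched end to end on the auto dataset that ships with Stata. The choice of price as the variable and foreign as the grouping variable is only for illustration, as are the names price_total, price_pct, and pct_check.

```stata
sysuse auto, clear

* 1. Inspect the variables
summarize price
tabulate foreign

* 2-3. The denominator is group-level: total price within each origin group
bysort foreign: egen price_total = total(price)

* 4. Generate the percentage with a descriptive name
gen price_pct = (price / price_total) * 100

* 5. Validate: shares should sum to 100 within each group
bysort foreign: egen pct_check = total(price_pct)
assert abs(pct_check - 100) < 0.01

* 6. Format for display only; the stored values keep full precision
format price_pct %5.2f
list make foreign price_pct in 1/5
```

Using format instead of round() changes only how the numbers are shown, so later calculations still use the unrounded values.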

Authoritative references for methodology and data reporting

For deeper reference material on statistics, survey reporting, and tabular percentages, review guidance from the U.S. Census Bureau, the National Center for Education Statistics, and the U.S. Bureau of Labor Statistics.

Final takeaway

To calculate the percentage of a variable in Stata, you generally divide a part by a total and multiply by 100. If the total already exists, use generate. If the total must be built within groups, use bysort with egen total(). If you only need category percentages for a table, use tabulate. The command is simple, but the denominator decision is everything. Once you understand that distinction, percentage calculations in Stata become consistent, scalable, and easy to explain in reports, appendices, and reproducible code files.