Calcipedia

Chi-Square Calculator

Run a chi-square goodness-of-fit or independence test and get the p-value, expected counts, a decision at your selected alpha, and cell-by-cell contribution rows.


Hypothesis testing

Chi-square test calculator

Test whether observed counts match an expected pattern or whether two categorical variables appear associated. The result sheet breaks out expected counts, major contributors, effect size, and small-cell warnings so you can audit the conclusion instead of relying on a lone p-value.

Test type

Use one categorical variable when you want to compare observed counts with a hypothesized distribution.


Before you interpret the result

  • Enter raw counts, not percentages or proportions.
  • Each person, record, or event should contribute to one category or cell only once.
  • For sparse 2×2 tables, treat Fisher’s exact test as the follow-up when expected counts are very small.

Result

χ² = 2

The observed counts are reasonably consistent with the expected distribution (χ² = 2, p = 0.849145). Small category differences are present, but this sample does not provide strong evidence of a meaningful departure from the hypothesized pattern.

P-value: 0.849145
Degrees of freedom: 5
Sample size: 88
Minimum expected count: 15
Expected cells below 5: 0
Cohen's w: 0.1508 (small effect)
Decision at α = 0.05: Do not reject H₀ (do not reject the null hypothesis)

Selected-alpha interpretation

The p-value (0.849145) is above α = 0.05, so this sample does not clear the selected significance threshold.

Report-ready summary

χ²(5, N = 88) = 2, p = .849.

Assumption check

Approximation looks healthy
  • Use raw category counts rather than percentages or already-normalized proportions.
  • The chi-square approximation assumes independent observations and mutually exclusive categories.
  • Expected counts are all at least 5, which supports the usual chi-square approximation.

Cell contributions

The largest contribution rows are the categories or cells that are pushing the chi-square statistic upward.

Cell         Observed   Expected   O − E   Contribution   Share
Category 2   18         15          3      0.6            30%
Category 5   12         15         −3      0.6            30%
Category 6   12         15         −3      0.6            30%
Category 1   16         15          1      0.0667         3.33%
Category 3   16         15          1      0.0667         3.33%
Category 4   14         15         −1      0.0667         3.33%


Chi-square calculator: run goodness-of-fit and independence tests with p-values

Use this chi-square calculator to test whether observed counts match an expected distribution or whether two categorical variables appear associated. It returns the chi-square statistic, p-value, degrees of freedom, effect size, expected counts, and cell-level contributions so you can interpret why the result is significant or not significant.

When to use goodness-of-fit vs. independence

A chi-square goodness-of-fit test is for one categorical variable. You already have a hypothesis about the expected distribution, and you want to know whether the observed counts are close enough to that benchmark to be explained by chance. Typical examples include checking whether a die looks fair, whether market-share counts match a target split, or whether observed genetic ratios match a theoretical expectation.

A chi-square test of independence is for two categorical variables recorded in a contingency table. Instead of supplying expected counts directly, you supply the observed table, and the expected counts are calculated from the row totals, column totals, and grand total. This version answers whether the row and column variables behave as if they are independent.

That distinction matters because the null hypothesis changes. In goodness-of-fit, the null is that the sample follows the expected distribution you supplied. In independence testing, the null is that the row and column variables are unrelated in the population, so any cell imbalances are due to sampling variation.

χ² = Σ (Oᵢ − Eᵢ)² / Eᵢ

Core chi-square formula: add each cell's squared observed-minus-expected gap divided by its expected count.

df = k − 1

Degrees of freedom for a goodness-of-fit test with k categories.

df = (r − 1)(c − 1)

Degrees of freedom for an independence test with r rows and c columns.

How the calculator works

For a goodness-of-fit test, enter the observed counts and the expected counts in the same category order. The calculator compares each pair, computes the cell contribution (O − E)² / E, sums those contributions into the overall chi-square statistic, and then converts that statistic into a right-tailed p-value using the chi-square distribution.
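
The goodness-of-fit steps above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function names chi2_sf and goodness_of_fit are hypothetical, and the right-tail probability uses the closed-form finite sums that exist for integer degrees of freedom so that only the standard library is needed.

```python
import math

def chi2_sf(x, df):
    """Right-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum_r x^(r - 1/2) / (1*3*...*(2r-1))
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total

def goodness_of_fit(observed, expected):
    """Return (chi-square statistic, degrees of freedom, right-tailed p-value)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must list the same categories")
    # Each cell contributes (O - E)^2 / E; the statistic is the sum of contributions.
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df, chi2_sf(stat, df)

# Counts from the result sheet on this page, with expected counts of 15 per category.
stat, df, p = goodness_of_fit([16, 18, 16, 14, 12, 12], [15] * 6)
print(round(stat, 4), df, round(p, 6))
```

With the counts shown in the result sheet, the sketch gives a statistic of 2 on 5 degrees of freedom and a p-value of about 0.849.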

For an independence test, enter the contingency table with one row per line. The calculator computes the expected count for each cell as row total × column total ÷ grand total, then uses those expected counts to build the chi-square statistic. The result sheet highlights the cells making the biggest contribution, which is often the fastest way to see where the association is coming from.
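
As a rough sketch of the independence calculation just described, the snippet below derives expected counts from the margins and sums the cell contributions. The function name independence_statistic is hypothetical and the code is illustrative only.

```python
def independence_statistic(table):
    """Return (chi-square statistic, degrees of freedom, expected counts) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count for each cell: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, expected

stat, df, expected = independence_statistic([[30, 10], [5, 25]])
print(expected)            # expected counts under independence
print(round(stat, 2), df)  # chi-square statistic and degrees of freedom
```

For the 2×2 table [[30, 10], [5, 25]] this reproduces expected counts of [[20, 20], [15, 15]] and a statistic of about 23.33 on 1 degree of freedom.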

The calculator also reports an effect-size guide. For goodness-of-fit this is Cohen's w. For independence tables it reports the phi coefficient for 2×2 tables or Cramér's V for larger tables. Those measures do not replace the p-value, but they help you judge whether a statistically significant result is also practically meaningful.
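
The effect-size measures named above all rescale the chi-square statistic by the sample size. The sketch below uses the standard definitions; the function names are hypothetical, and the label cutoffs of 0.1, 0.3, and 0.5 are Cohen's conventional benchmarks for w, assumed here because the page does not state which cutoffs the calculator applies.

```python
import math

def cohens_w(chi2, n):
    """Cohen's w for a goodness-of-fit test: sqrt(chi2 / N)."""
    return math.sqrt(chi2 / n)

def cramers_v(chi2, n, rows, cols):
    """Cramér's V: sqrt(chi2 / (N * (min(rows, cols) - 1))). Equals phi for a 2x2 table."""
    return math.sqrt(chi2 / (n * (min(rows, cols) - 1)))

def label_w(w):
    """Rough label using Cohen's conventional cutoffs (an assumption, not the page's stated rule)."""
    if w < 0.1:
        return "negligible"
    if w < 0.3:
        return "small"
    if w < 0.5:
        return "medium"
    return "large"

w = cohens_w(2.0, 88)
print(round(w, 4), label_w(w))                  # goodness-of-fit example from this page
print(round(cramers_v(70 / 3, 70, 2, 2), 2))    # phi for the 2x2 example
```

For the goodness-of-fit example this gives w ≈ 0.1508, labeled small, and for the 2×2 example it gives phi ≈ 0.58.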

Worked examples

Suppose you roll a die 88 times and observe counts of 16, 18, 16, 14, 12, and 12. If the die were fair, the expected count in each category would be 88 / 6 ≈ 14.67; the simplified worked example rounds this to 15, so its six expected counts sum to 90 rather than 88. With either set of expected counts, the six category contributions add up to χ² = 2 on 5 degrees of freedom, which gives p ≈ 0.849 and no strong evidence that the die is unfair.

Now consider a 2×2 independence table with counts [[30, 10], [5, 25]]. Under independence, the expected counts are derived from the row totals, column totals, and grand total, giving [[20, 20], [15, 15]]. The observed counts differ sharply from those expectations, so the calculator returns a large chi-square statistic, a very small p-value, and a clear sign that the two variables are associated in this sample.

The most useful reading step after the headline result is the contribution table. If one category or cell is responsible for most of the chi-square value, that is usually where the practical story sits. A significant chi-square result tells you that the pattern is unlikely under the null hypothesis, but the contribution rows tell you what part of the data is doing the work.
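
A contribution table of this kind can be built by computing each cell's (O − E)² / E, dividing by the total, and sorting. The sketch below is illustrative; contribution_rows and its dictionary keys are hypothetical names.

```python
def contribution_rows(labels, observed, expected):
    """Return one row per cell, sorted by contribution to chi-square (largest first)."""
    rows = []
    for label, o, e in zip(labels, observed, expected):
        rows.append({
            "cell": label,
            "observed": o,
            "expected": e,
            "diff": o - e,
            "contribution": (o - e) ** 2 / e,
        })
    total = sum(row["contribution"] for row in rows)
    for row in rows:
        # Share of the overall statistic attributable to this cell.
        row["share"] = row["contribution"] / total if total else 0.0
    # sorted() is stable, so tied cells keep their input order.
    return sorted(rows, key=lambda row: row["contribution"], reverse=True)

labels = [f"Category {i}" for i in range(1, 7)]
for row in contribution_rows(labels, [16, 18, 16, 14, 12, 12], [15] * 6):
    print(row["cell"], row["diff"], round(row["contribution"], 4), f"{row['share']:.2%}")
```

Run on the die example, Categories 2, 5, and 6 come first with 30% of the statistic each, and the remaining three categories contribute about 3.33% each.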

How to interpret p-values, effect size, and expected counts

The p-value is the probability of observing a chi-square statistic at least as large as the one in your data if the null hypothesis were true. A p-value below your chosen alpha threshold, such as 0.05, is evidence against the null hypothesis. A p-value above that threshold does not prove the null is true; it means the sample does not provide strong enough evidence to reject it.

Expected counts matter because the chi-square test is an approximation. As a rule of thumb, expected counts should generally not fall below 5 in too many cells, and cells below 1 are a stronger warning sign that the approximation may be unstable. When the calculator flags sparse expected counts, treat the result more cautiously and consider whether Fisher's exact test or a category redesign is more appropriate.
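
A small-cell check along these lines might look like the sketch below. It only counts cells against the two thresholds mentioned in the text (5 and 1); the function name and the returned keys are hypothetical, and how many sparse cells count as "too many" is left to the analyst.

```python
def expected_count_summary(expected_cells):
    """Summarize how sparse a collection of expected counts is."""
    cells = list(expected_cells)
    return {
        "min_expected": min(cells),
        "below_5": sum(1 for e in cells if e < 5),   # usual rule-of-thumb threshold
        "below_1": sum(1 for e in cells if e < 1),   # stronger warning sign
        "all_at_least_5": all(e >= 5 for e in cells),
    }

print(expected_count_summary([15] * 6))            # die example from this page
print(expected_count_summary([0.8, 4.2, 6.0, 9.0]))  # made-up sparse example
```

The second call uses invented numbers purely to show what a flagged result looks like.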

Effect size helps prevent over-reading large samples. With enough data, even a tiny mismatch can become statistically significant. Cohen's w, phi, and Cramér's V help you describe whether the departure from the null hypothesis is negligible, small, medium, or large. Those thresholds are only rough guides, but they are useful context when reporting a result.

Choosing alpha and reporting chi-square results

A useful chi-square test calculator should let you choose the significance level before you read the result. Alpha is the threshold for rejecting the null hypothesis, so α = 0.05, α = 0.01, and α = 0.10 can lead to different wording even when the chi-square statistic and p-value stay the same. Pick the threshold that matches your study plan, class assignment, or reporting convention rather than changing alpha after seeing the p-value.
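
To see how the same p-value can lead to different wording, the sketch below applies a simple decision rule at three common thresholds. The function name decide and the p-value of 0.03 are illustrative assumptions, not output from the calculator.

```python
def decide(p_value, alpha=0.05):
    """Return the decision wording for a chosen significance level."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    return "Reject H0" if p_value < alpha else "Do not reject H0"

p_value = 0.03  # hypothetical p-value for illustration
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: {decide(p_value, alpha)}")
```

With p = 0.03, the null is rejected at α = 0.10 and α = 0.05 but not at α = 0.01, which is why the threshold should be fixed before looking at the result.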

For a report-ready sentence, include the test type, degrees of freedom, sample size, chi-square statistic, p-value, and effect-size measure. For example, an independence test might be written as χ²(1, N = 70) = 23.33, p < .001, phi = 0.58, followed by a plain-language note about which cells contributed most. That extra contribution context is what separates a useful chi-square analysis from a bare p-value.
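
One way to assemble such a sentence is sketched below. The function name and the formatting choices (two decimals for the statistic with trailing zeros dropped, three decimals for p without a leading zero, "p < .001" for very small values) are assumptions inferred from the examples on this page.

```python
def format_chi_square_report(stat, df, n, p_value, effect=None):
    """Build a compact report line; effect is an optional (name, value) pair."""
    stat_text = f"{stat:.2f}".rstrip("0").rstrip(".")
    if p_value < 0.001:
        p_text = "p < .001"
    else:
        # Drop the leading zero, e.g. 0.849 becomes .849
        p_text = "p = " + f"{p_value:.3f}".lstrip("0")
    parts = [f"χ²({df}, N = {n}) = {stat_text}", p_text]
    if effect is not None:
        name, value = effect
        parts.append(f"{name} = {value:.2f}")
    return ", ".join(parts) + "."

print(format_chi_square_report(2.0, 5, 88, 0.849145))
print(format_chi_square_report(70 / 3, 1, 70, 1.4e-6, effect=("phi", 0.5774)))
```

The p-value passed in the second call is a placeholder below .001; only the "p < .001" wording depends on it.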

A chi-square test of homogeneity uses the same contingency-table mechanics as a test of independence. The difference is the study question: homogeneity compares whether several groups share the same categorical distribution, while independence asks whether two categorical variables are associated in one sample. In both cases, the expected-count table and contribution rows help explain where the observed pattern departs from the null model.

Common mistakes and limitations

The most common input error is using percentages instead of raw counts. Chi-square tests work on frequency counts, not proportions that have already been normalized. If you only have percentages, convert them back to counts using the underlying sample size before you interpret the output.
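
If only percentages and the sample size are available, counts can be approximately recovered as sketched below. The largest-remainder rounding used here is one possible choice, not something the calculator is known to do, and the recovered counts are only as accurate as the reported percentages.

```python
def percentages_to_counts(percentages, sample_size):
    """Convert percentages to whole-number counts that sum to sample_size."""
    raw = [p * sample_size / 100 for p in percentages]
    counts = [int(x) for x in raw]  # round down first
    leftover = sample_size - sum(counts)
    # Hand the remaining units to the categories with the largest fractional parts.
    by_remainder = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts

print(percentages_to_counts([25, 50, 25], 200))
print(percentages_to_counts([33.3, 33.3, 33.4], 10))
```

The sketch assumes the percentages sum to roughly 100; it does not validate that.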

Another frequent mistake is treating a non-significant chi-square result as proof that categories are identical or that variables are unrelated. A non-significant result may simply reflect low power, small sample size, or sparse data. Likewise, a significant result does not prove causation, practical importance, or a flawless study design.

This calculator does not automatically switch to Yates correction or Fisher's exact test. It also does not fit log-linear models, estimate confidence intervals for association measures, or diagnose confounding in observational data. Use it as a transparent calculator and interpretation aid, not as a substitute for full study design review or domain-specific statistical advice.

Frequently asked questions

What are the assumptions of a chi-square test?

The core assumptions are independence of observations, mutually exclusive categories, and expected counts that are not too small. In practice, the usual rule of thumb is that expected counts should generally be at least 5 in most cells, with cells below 1 being a stronger warning that the chi-square approximation may be unreliable. The test also assumes you are working with raw counts rather than percentages or rates.

What is the difference between a goodness-of-fit test and a test of independence?

A goodness-of-fit test compares one observed categorical distribution with an expected distribution you provide. A test of independence starts from a contingency table and asks whether two categorical variables are related. The formula for the chi-square statistic is the same, but the null hypothesis and the way expected counts are created are different.

Can I use percentages instead of counts in a chi-square calculator?

Not directly. Chi-square tests are based on observed and expected frequencies, so the safest input is raw counts. If you only have percentages, convert them back to counts using the sample size first. Feeding percentages straight into the formula can distort both the chi-square statistic and the effect-size interpretation.

What if some expected counts are below 5?

Small expected counts make the usual chi-square p-value less reliable. If only a few cells dip below 5, the result can still be a reasonable approximation, but you should treat it with caution. For sparse 2×2 tables, Fisher's exact test is often the preferred follow-up. In larger tables, you may need to combine sparse categories or collect more data before making a firm conclusion.

What do the degrees of freedom mean in a chi-square test?

Degrees of freedom determine which chi-square distribution should be used to convert the test statistic into a p-value. For goodness-of-fit, df usually equals the number of categories minus one. For independence tables, df equals (rows − 1) × (columns − 1). More degrees of freedom shift the chi-square reference distribution to the right and widen it, so a larger statistic is needed to reach the same p-value.

What does a non-significant chi-square result mean?

A non-significant result means the observed differences are not large enough, relative to the sample size and expected variation, to reject the null hypothesis at your chosen alpha level. It does not prove that the categories are identical or that the variables are independent. The sample may simply be too small, too noisy, or too sparse to detect a real effect.

What is Cramér's V or phi in the result sheet?

Those are effect-size measures that describe how strong the departure from independence appears to be. For 2×2 tables the common effect-size measure is phi; for larger contingency tables it is Cramér's V. They help answer a different question from the p-value: not just whether the association is statistically detectable, but whether it looks negligible, small, medium, or large in magnitude.

Is the chi-square test always right-tailed?

Yes. The chi-square statistic is built from squared deviations, so it cannot be negative. Larger chi-square values indicate greater departure from the null hypothesis, which means evidence always accumulates in the right tail of the chi-square distribution.

Should I use Yates correction for a 2×2 table?

Yates correction is a continuity adjustment sometimes used for 2×2 chi-square tests, especially with small samples. It makes the test more conservative by reducing the chi-square statistic. Some analysts prefer it, while others move directly to Fisher's exact test for sparse 2×2 tables. This calculator reports the standard uncorrected chi-square result, so use judgment when counts are small.
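
For comparison, the sketch below computes both the uncorrected statistic and a Yates-corrected one for a 2×2 table. It is an illustration of the standard continuity adjustment (shrinking each |O − E| by 0.5, never below zero), not a feature of this calculator, and the function name is hypothetical.

```python
def chi_square_2x2(table, yates=False):
    """Chi-square statistic for a 2x2 table, optionally with Yates continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            gap = abs(observed - expected)
            if yates:
                # Shrink the gap by 0.5, but never past zero.
                gap = max(0.0, gap - 0.5)
            stat += gap ** 2 / expected
    return stat

table = [[30, 10], [5, 25]]
print(round(chi_square_2x2(table), 2))              # uncorrected
print(round(chi_square_2x2(table, yates=True), 2))  # Yates-corrected, smaller
```

On the 2×2 example from this page the corrected statistic is about 21.06 versus 23.33 uncorrected, which shows the conservative effect described above.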

Which significance level should I choose for a chi-square test?

Use the alpha level that was chosen before the analysis. Many introductory examples use α = 0.05, but α = 0.01 is more conservative and α = 0.10 is sometimes used for exploratory screening. The calculator lets you compare common thresholds, but the interpretation should come from your study plan, course instructions, or reporting standard rather than from trying several cutoffs until one looks significant.

How do I report a chi-square result in words?

Report the test type, degrees of freedom, sample size, chi-square statistic, p-value, and effect size. A compact sentence might say, χ²(1, N = 70) = 23.33, p < .001, phi = 0.58, followed by a short explanation of which expected-versus-observed cells contributed most. If expected counts are sparse, mention that limitation and consider Fisher's exact test for a 2×2 table.

When is Fisher's exact test better than chi-square?

Fisher's exact test is usually preferred for 2×2 tables when expected counts are small, because it does not rely on the same large-sample approximation. Chi-square is fast and useful for moderate-to-large samples, but Fisher's exact test is the safer option when one or more expected cells fall below 5 and the table is small enough for an exact calculation.
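
As a follow-up illustration, a two-sided Fisher's exact test for a 2×2 table can be sketched with the hypergeometric distribution. This calculator does not run Fisher's test; the function below is a hypothetical helper that uses one common two-sided definition, summing the probabilities of all tables with the same margins that are no more likely than the observed one.

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2x2 table of non-negative integer counts."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so ties are not lost to floating-point noise.
    total = sum(p for p in map(prob, range(lowest, highest + 1)) if p <= p_observed * (1 + 1e-9))
    return min(1.0, total)

print(round(fisher_exact_two_sided([[3, 1], [1, 3]]), 4))  # small-sample illustration
print(fisher_exact_two_sided([[30, 10], [5, 25]]) < 0.001)
```

The first table is a made-up sparse example where every expected count is 2, the kind of case where the exact test is the safer choice.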

Can a chi-square test prove causation?

No. A chi-square test can show that observed counts differ from expectation or that two categorical variables are associated more than chance alone would suggest. It does not prove a causal relationship. Study design, bias control, confounding, measurement quality, and subject-matter knowledge still matter.
