How to use the calculator (quick steps)
- Name your criteria (for example: Cost, Quality, Reliability, Responsiveness). Use short labels that everyone on your team understands the same way.
- Fill the upper triangle of the matrix with your pairwise judgments using Saaty’s scale (1/9 to 9). You can also fill the lower triangle; the tool will keep reciprocals synchronized.
- The calculator automatically fills reciprocals (if A vs B is 5, then B vs A becomes 1/5). The diagonal is fixed at 1.
- Click Compute priorities to generate weights, λmax, CI, and CR. The results table shows intermediate values so you can audit the computation.
- If CR is high, revise the comparisons that feel least defensible and recompute. Small changes to one or two entries often reduce inconsistency substantially.
Tip: if you are working with stakeholders, start by agreeing on definitions. For example, “Reliability” could mean uptime, defect rate, warranty claims, or delivery predictability. AHP works best when each criterion has a shared meaning.
Saaty’s 1–9 scale (what the numbers mean)
Enter values between 1/9 (≈0.111) and 9. Values above 1 mean the row criterion is preferred over the column criterion; values below 1 mean the column criterion is preferred. The scale is intentionally coarse: it encourages you to express the strength of preference without pretending you know an exact ratio.
Saaty’s fundamental scale for pairwise comparisons.
| Value | Meaning | Practical interpretation |
| --- | --- | --- |
| 1 | Equal importance | You would be comfortable treating the two criteria as equally influential. |
| 3 | Moderate importance | One criterion is preferred, but the other still matters a lot. |
| 5 | Strong importance | You would usually choose the preferred criterion when tradeoffs arise. |
| 7 | Very strong importance | The preferred criterion dominates in most realistic scenarios. |
| 9 | Extreme importance | Only rare exceptions would make you prioritize the other criterion. |
| 2, 4, 6, 8 | Intermediate values | Use when you are between two verbal judgments. |
| 1/3, 1/5, 1/7, 1/9 | Opposite preference | Same strengths, but favoring the column criterion instead of the row. |
The calculator keeps the matrix reciprocal automatically. This matters because reciprocity is a core AHP assumption: if you say “Cost is 5× more important than Quality,” then “Quality is 1/5 as important as Cost.”
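The reciprocity rule can be sketched in a few lines of Python. This is illustrative only, not the calculator's actual code; the function name `set_judgment` and the list-of-lists matrix are assumptions made for the example.

```python
def set_judgment(A, i, j, value):
    """Set a_ij and keep the matrix reciprocal by setting a_ji = 1 / a_ij."""
    if i == j:
        raise ValueError("diagonal entries are fixed at 1")
    if not (1 / 9 <= value <= 9):
        raise ValueError("use Saaty's scale: values from 1/9 to 9")
    A[i][j] = float(value)
    A[j][i] = 1.0 / value

# Start from an all-ones matrix: diagonal fixed at 1, no judgments yet.
A = [[1.0] * 4 for _ in range(4)]

# "Cost is 5x more important than Quality" -> a_01 = 5, a_10 = 1/5.
set_judgment(A, 0, 1, 5)
```

Entering one judgment updates both cells, which is exactly what the tool does when you type into either triangle.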
Formulas and assumptions (what is computed)
Let A be your 4×4 comparison matrix where each entry aij expresses how much more important criterion i is than criterion j. AHP assumes:
- Positivity: all comparisons are positive numbers.
- Reciprocity: aji = 1 / aij.
- Diagonal equals 1: a criterion compared to itself is 1.
- Reasonable scale: values typically stay within 1/9 to 9 to avoid overconfident ratios.
This calculator derives weights using the geometric mean method (a common, stable approach for small matrices). For each row i:
- Row geometric mean: gi = (ai1 · ai2 · ai3 · ai4)^(1/4)
- Normalize to weights: wi = gi / (g1 + g2 + g3 + g4)
Consistency is checked by computing the product Aw and averaging the ratios (Aw)i / wi across the rows to estimate λmax. From that:
- Consistency index: CI = (λmax − n) / (n − 1), with n = 4
- Consistency ratio: CR = CI / RI, using RI ≈ 0.90 for n = 4
A common guideline is CR < 0.10 for acceptable consistency. Treat this as a quality check, not a guarantee. A perfectly consistent matrix can still reflect poor priorities if the criteria are defined badly or if important factors are missing.
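The full pipeline above — row geometric means, normalization, λmax, CI, and CR — can be sketched as follows. The function name `ahp_priorities` is illustrative; this is a minimal re-implementation of the stated formulas, not the calculator's source.

```python
import math

RI_4 = 0.90  # Saaty's random index for n = 4

def ahp_priorities(A):
    """Geometric-mean AHP weights plus lambda_max, CI, and CR for a 4x4 matrix."""
    n = len(A)
    # Row geometric means: g_i = (prod_j a_ij)^(1/n)
    g = [math.prod(row) ** (1.0 / n) for row in A]
    total = sum(g)
    w = [gi / total for gi in g]  # normalized weights, sum to 1
    # Estimate lambda_max by averaging (Aw)_i / w_i over the rows.
    Aw = [sum(a * wj for a, wj in zip(row, w)) for row in A]
    lam = sum(awi / wi for awi, wi in zip(Aw, w)) / n
    CI = (lam - n) / (n - 1)
    CR = CI / RI_4
    return w, lam, CI, CR

# A perfectly consistent matrix (built from the ratio 4:2:1:1) should give
# lambda_max = 4 and CR = 0, up to floating-point error.
weights, lam, ci, cr = ahp_priorities([
    [1,    2,   4, 4],
    [1/2,  1,   2, 2],
    [1/4, 1/2,  1, 1],
    [1/4, 1/2,  1, 1],
])
```

The consistent-matrix check at the end is a useful sanity test for any AHP implementation: if it does not return λmax = n and CR = 0 there, something is wrong with the formulas.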
Worked example (supplier criteria)
Suppose you are weighting four supplier-selection criteria: Cost, Quality, Reliability, and Responsiveness. Your team agrees on these judgments:
- Cost is moderately more important than Quality → enter 3 in Cost vs Quality.
- Cost is strongly more important than Reliability → enter 5 in Cost vs Reliability.
- Cost is very strongly more important than Responsiveness → enter 7 in Cost vs Responsiveness.
- Quality is slightly to moderately more important than Reliability → enter 2 in Quality vs Reliability.
- Quality is moderately to strongly more important than Responsiveness → enter 4 in Quality vs Responsiveness.
- Reliability is moderately more important than Responsiveness → enter 3 in Reliability vs Responsiveness.
Enter those values in the upper triangle; the tool fills the reciprocals automatically. After computing, you’ll get weights that sum to 100%. If CR is below 0.10, your comparisons are reasonably coherent. If it’s higher, look for contradictions such as “A ≫ B” and “B ≫ C” but “C ≫ A.”
A quick mental check: if you believe Cost dominates everything, you should expect Cost’s weight to be the largest. If the output shows a different criterion on top, that is a signal to re-check the entries or the criterion names.
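If you want to audit the worked example outside the calculator, a self-contained sketch of the same computation (the geometric-mean method described above) looks like this:

```python
import math

# The supplier matrix from the worked example:
# upper triangle as entered, lower triangle as the automatic reciprocals.
A = [
    [1,   3,   5,   7],   # Cost
    [1/3, 1,   2,   4],   # Quality
    [1/5, 1/2, 1,   3],   # Reliability
    [1/7, 1/4, 1/3, 1],   # Responsiveness
]

g = [math.prod(row) ** 0.25 for row in A]          # row geometric means
w = [gi / sum(g) for gi in g]                      # normalized weights
Aw = [sum(a * wj for a, wj in zip(row, w)) for row in A]
lam = sum(awi / wi for awi, wi in zip(Aw, w)) / 4  # lambda_max estimate
CR = ((lam - 4) / 3) / 0.90                        # CI / RI with RI = 0.90

# Cost comes out heaviest (roughly 0.58), and CR lands well under 0.10.
```

This matches the mental check: Cost was said to dominate every other criterion, and it receives the largest weight by a wide margin.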
How to interpret the results
- Weights: Use them as importance multipliers when scoring alternatives. A weight of 0.40 means “40% of the decision emphasis.”
- Ranking criteria: Sort criteria by weight to see what drives the decision most. If two weights are close, treat them as roughly tied and avoid over-interpreting tiny differences.
- λmax, CI, CR: These indicate internal coherence. A perfectly consistent 4×4 matrix would have λmax = 4. As inconsistency increases, λmax rises above 4.
- When CR is high: Re-check the comparisons that were hardest to justify. Often one or two entries drive most inconsistency.
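To see how weights act as importance multipliers, here is a hypothetical weighted-sum scoring step. The supplier names, 0–10 scores, and the weights themselves are made up for illustration; substitute your own computed weights and real performance data.

```python
# Illustrative AHP weights (must sum to 1) and made-up performance scores.
weights = {"Cost": 0.40, "Quality": 0.30, "Reliability": 0.20, "Responsiveness": 0.10}
scores = {
    "Supplier A": {"Cost": 7, "Quality": 9, "Reliability": 6, "Responsiveness": 8},
    "Supplier B": {"Cost": 9, "Quality": 6, "Reliability": 7, "Responsiveness": 5},
}

# Weighted sum: each criterion score scaled by its weight, then totaled.
totals = {
    name: sum(weights[c] * s[c] for c in weights)
    for name, s in scores.items()
}
```

Note how Supplier A can win overall despite losing on the single heaviest criterion: the weights spread the decision emphasis instead of letting one criterion decide alone.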
Troubleshooting a high consistency ratio (CR)
If your CR is above the guideline threshold, it does not mean AHP “failed.” It means your set of pairwise statements does not fit together cleanly. Use the following practical workflow:
- Find the weakest comparison: Identify the pair you were least confident about and adjust it toward 1 (less extreme) to see if CR improves.
- Check transitivity: If A is preferred to B and B is preferred to C, then A should usually be preferred to C. Large violations often create inconsistency.
- Clarify definitions: High CR often comes from ambiguous criteria (e.g., “Quality” mixing durability, aesthetics, and compliance). Tighten the definition and re-enter comparisons.
- Avoid double-counting: If two criteria overlap (e.g., “Total cost” and “Purchase price”), comparisons become unstable. Merge or redefine criteria.
- Use sensitivity testing: Change one entry at a time and observe how weights move. If weights swing wildly, the decision is preference-sensitive and may need more evidence.
In group settings, a useful technique is to ask each stakeholder for a quick independent matrix, then discuss only the comparisons with the largest disagreement. Even if you still enter a single consensus matrix here, that discussion tends to reduce inconsistency.
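The "adjust the weakest comparison toward 1" step can also be tried programmatically. Below is a minimal sketch using an intentionally intransitive set of judgments (A ≻ B, B ≻ C, yet C ≻ A) and the same geometric-mean CR formula described earlier; the matrix values are invented for the demonstration.

```python
import math

def cr(A, ri=0.90):
    """Consistency ratio of a 4x4 reciprocal matrix via geometric-mean weights."""
    g = [math.prod(row) ** 0.25 for row in A]
    w = [x / sum(g) for x in g]
    lam = sum(sum(a * wj for a, wj in zip(row, w)) / wi
              for row, wi in zip(A, w)) / 4
    return ((lam - 4) / 3) / ri

# Intransitive judgments: criterion 1 beats 2, 2 beats 3, but 3 beats 1.
A = [
    [1,   5,   1/5, 1],
    [1/5, 1,   5,   1],
    [5,   1/5, 1,   1],
    [1,   1,   1,   1],
]
before = cr(A)

# Pull the least defensible comparison (a_13 and its reciprocal) toward 1.
A[0][2], A[2][0] = 1.0, 1.0
after = cr(A)

# The softened matrix is still imperfect, but its CR drops substantially.
```

This mirrors the manual workflow: one extreme, transitivity-violating entry often accounts for most of the inconsistency, so moderating it gives the biggest CR improvement per change.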
Common use cases for a 4-criteria AHP model
A four-criterion model is intentionally small. It is ideal when you want structure without turning the exercise into a long workshop. Typical scenarios include:
- Vendor or supplier selection: Balance cost, quality, reliability, and service responsiveness.
- Product roadmap prioritization: Compare customer impact, engineering effort, revenue potential, and risk.
- Hiring decisions: Weight experience, technical skill, communication, and culture fit (with careful definitions).
- Project portfolio triage: Compare strategic alignment, ROI, feasibility, and urgency.
- Policy or community investments: Balance equity, cost, effectiveness, and implementation speed.
If you have more than four criteria, you can still use this page as a first pass: combine similar criteria into broader buckets, compute weights, and then refine later with a larger AHP model in a spreadsheet or dedicated tool.
Limitations and good practice
- Exactly four criteria: This page is intentionally limited to a 4×4 matrix for usability and speed.
- Subjective inputs: The output quality depends on your judgments. A low CR indicates coherence, not correctness.
- Weights are not a final decision: You still need performance data for each alternative and a method to combine scores with weights.
- Scale compression: The 1–9 scale cannot express extremely fine distinctions; that is a feature, not a bug.
- Context matters: Preferences can change by scenario (budget cuts, deadlines, regulatory changes). Re-run the matrix when context changes.
Good practice: document the meaning of each criterion and the rationale for any comparison above 5 or below 1/5. Extreme values are sometimes justified, but they are also where teams most often overstate certainty.