Matthews Correlation Coefficient Calculator

Measures the correlation between the predicted and observed binary classification of a sample.



The Matthews correlation coefficient (MCC), also known as the phi coefficient, measures the strength of the correlation between two binary variables: here, the predicted class and the observed class.

It was introduced in 1975 by the biochemist Brian W. Matthews. The coefficient accounts for true and false positives and negatives and can be used even when the classes are of very different sizes.


The MCC quantifies how closely the predicted classes agree with the observed ones and returns a value between -1 and 1:

+1 describes a perfect prediction;

0 describes a prediction no better than random;

-1 describes complete inconsistency between prediction and observation.



True Positives (TP)
True Negatives (TN)
False Positives (FP)
False Negatives (FN)

About the correlation coefficient


The Matthews correlation coefficient formula is based on the so-called confusion matrix, whose four cells are its variables:

■ True positive (TP) is the outcome where the model correctly predicts the positive class (condition is detected when present).

■ True negative (TN) is the outcome where the model correctly predicts the negative class (condition is not detected when absent).

■ False positive (FP) is the outcome where the model incorrectly predicts the positive class (condition is detected when absent).

■ False negative (FN) is the outcome where the model incorrectly predicts the negative class (condition is not detected when present).

                 Condition present    Condition absent
Test positive    True Positive        False Positive
Test negative    False Negative       True Negative
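
As an illustration of how the four cells of the matrix are tallied, here is a minimal Python sketch. The function name confusion_counts and the sample label lists are hypothetical, and labels are assumed to be booleans (True meaning the condition is present, or the test is positive).

```python
def confusion_counts(observed, predicted):
    """Tally (TP, TN, FP, FN) from paired boolean labels.

    Assumption for this sketch: True marks the positive class in both lists.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    tp = tn = fp = fn = 0
    for actual, guess in zip(observed, predicted):
        if guess and actual:
            tp += 1      # condition detected when present
        elif not guess and not actual:
            tn += 1      # condition not detected when absent
        elif guess and not actual:
            fp += 1      # condition detected when absent
        else:
            fn += 1      # condition not detected when present
    return tp, tn, fp, fn


# Hypothetical sample of eight observations.
observed = [True, True, True, False, False, False, False, True]
predicted = [True, True, False, False, False, True, False, True]
print(confusion_counts(observed, predicted))  # (3, 3, 1, 1)
```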

The relation between these classifications is expressed by the Matthews correlation coefficient formula:

MCC = (TP × TN - FP × FN) / √((TP + FP) × (TP + FN) × (TN + FP) × (TN + FN))
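
A worked example may help. The counts TP = 90, TN = 80, FP = 20 and FN = 10 are hypothetical values chosen only for illustration:

```latex
\mathrm{MCC}
  = \frac{TP \cdot TN - FP \cdot FN}
         {\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
  = \frac{90 \cdot 80 - 20 \cdot 10}
         {\sqrt{110 \cdot 100 \cdot 100 \cdot 90}}
  = \frac{7000}{\sqrt{99\,000\,000}}
  \approx \frac{7000}{9949.87}
  \approx 0.7035
```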


If any of the four sums in the denominator is zero, the denominator can be set to 1 by convention. The numerator is also zero in that case, so the MCC becomes 0.
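
The calculation, including the zero-denominator convention, could be sketched in Python as follows. The function name mcc and the example counts are assumptions made for illustration, not taken from the calculator itself.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from the four confusion matrix counts."""
    numerator = tp * tn - fp * fn
    sums = (tp + fp, tp + fn, tn + fp, tn + fn)

    if any(s == 0 for s in sums):
        # Convention described in the text: set the denominator to 1.
        # The numerator is 0 here as well, so the result is 0.
        denominator = 1.0
    else:
        denominator = math.sqrt(sums[0] * sums[1] * sums[2] * sums[3])

    return numerator / denominator


print(round(mcc(90, 80, 20, 10), 4))  # hypothetical counts
print(mcc(50, 0, 50, 0))              # TN + FN = 0, so the convention applies
```

In the second call every sample is predicted positive, so the sum TN + FN is zero and the function returns 0.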

If FP = FN = 0 (a perfect classifier) and both classes are present, the MCC is 1, indicating perfect positive correlation. Conversely, if TP = TN = 0 (every sample misclassified) and both classes are present, the MCC is -1, indicating perfect negative correlation.

The MCC is also symmetric: neither class is treated as more important than the other, and swapping the positive and negative labels yields the same value.
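
The two extreme cases and the symmetry property can be checked with a short Python sketch. The helper name mcc and all counts are hypothetical. Swapping the class labels is modelled here by exchanging TP with TN and FP with FN.

```python
import math


def mcc(tp, tn, fp, fn):
    """MCC with the zero-denominator convention (denominator set to 1)."""
    product = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    denominator = math.sqrt(product) if product else 1.0
    return (tp * tn - fp * fn) / denominator


# No errors at all: perfect positive correlation.
perfect = mcc(tp=50, tn=50, fp=0, fn=0)

# Every sample misclassified: perfect negative correlation.
inverted = mcc(tp=0, tn=0, fp=50, fn=50)

# Relabelling the classes exchanges TP with TN and FP with FN.
original = mcc(tp=90, tn=80, fp=20, fn=10)
swapped = mcc(tp=80, tn=90, fp=10, fn=20)

print(perfect, inverted)    # 1.0 -1.0
print(original == swapped)  # True
```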

The Matthews correlation coefficient is used in research, in the biological sciences and in machine learning (the field that combines statistical models and algorithms). Some authors (for example Chicco and Jurman, 2020) regard it as a more informative single measure of binary classifier quality than accuracy or the F1 score when evaluating a confusion matrix.

 

References

Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta (BBA) - Protein Structure. 1975; 405 (2): 442–451.

Powers DMW. Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation. Journal of Machine Learning Technologies. 2011; 2 (1): 37–63.

Chicco D, Jurman G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics. 2020; 21 (6).


Specialty: Research

No. Of Variables: 4

Abbreviation: MCC

Article By: Denise Nedea

Published On: April 15, 2020 · 12:00 AM

Last Checked: April 15, 2020

Next Review: April 15, 2025