Multi-class confusion matrix library in Python
Expert Video Review by SEOGANT · March 2026
PyCM is a Python library for computing and displaying multi-class confusion matrix statistics in machine learning classification tasks.
While scikit-learn provides basic confusion matrix functionality, PyCM offers an extensive set of over 100 performance metrics derived from confusion matrices, covering not just accuracy, precision, recall, and F1, but also Matthews Correlation Coefficient, Cohen's Kappa, Informedness, Markedness, AUC estimates, and dozens of statistical tests relevant to specific evaluation contexts.
This breadth makes it a comprehensive tool for thorough classification model evaluation.
The library handles both binary and multi-class classification problems and provides per-class metrics alongside aggregated statistics, helping practitioners identify whether a model's aggregate accuracy masks poor performance on specific classes.
PyCM generates human-readable reports in multiple formats (printed tables, CSV, HTML, and JSON) suitable for embedding in automated evaluation pipelines or sharing with stakeholders who need to understand model performance without running code themselves.
It also includes statistical significance tests for comparing two models' performance on the same dataset.
Data scientists evaluating classification models in domains where class imbalance is significant (medical diagnosis, fraud detection, rare event prediction) use PyCM to go beyond accuracy metrics that can be misleading when class frequencies differ substantially.
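To see why accuracy misleads under imbalance, the plain-Python sketch below works through the standard formulas on hypothetical counts (10 positives in 1,000 cases). It does not use PyCM; it only shows the arithmetic behind accuracy, recall, precision, and the Matthews Correlation Coefficient.

```python
from math import sqrt

# Hypothetical fraud-detection counts: 10 positives among 1,000 cases.
tp, fn = 2, 8      # positives caught / missed
fp, tn = 3, 987    # false alarms / correctly ignored negatives

total = tp + fn + fp + tn

# Accuracy is dominated by the large negative class.
accuracy = (tp + tn) / total

# Recall and precision look only at the positive class.
recall = tp / (tp + fn)
precision = tp / (tp + fp)

# MCC uses all four cells, so it stays low when the minority class is missed.
mcc = (tp * tn - fp * fn) / sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print(f"accuracy={accuracy:.3f} recall={recall:.3f} "
      f"precision={precision:.3f} mcc={mcc:.3f}")
```

With these counts, accuracy is close to 99% while the model finds only one in five positives, and the MCC stays well below 0.5.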
ML engineers building automated model evaluation pipelines integrate it to generate comprehensive performance reports at each training run, tracking not just primary metrics but the full distribution of per-class performance over time.