Different international plant protection organisations advocate different schemes for conducting pest risk assessments. Most of these schemes use structured questionnaire in which experts are asked to score several items using an ordinal scale. The scores are then combined using a range of procedures, such as simple arithmetic mean, weighted averages, multiplication of scores, and cumulative sums. The most useful schemes will correctly identify harmful pests and identify ones that are not. As the quality of a pest risk assessment can depend on the characteristics of the scoring system used by the risk assessors (i.e., on the number of points of the scale and on the method used for combining the component scores), it is important to assess and compare the performance of different scoring systems. In this article, we proposed a new method for assessing scoring systems. Its principle is to simulate virtual data using a stochastic model and, then, to estimate sensitivity and specificity values from these data for different scoring systems. The interest of our approach was illustrated in a case study where several scoring systems were compared. Data for this analysis were generated using a probabilistic model describing the pest introduction process. The generated data were then used to simulate the outcome of scoring systems and to assess the accuracy of the decisions about positive and negative introduction. The results showed that ordinal scales with at most 5 or 6 points were sufficient and that the multiplication-based scoring systems performed better than their sum-based counterparts. The proposed method could be used in the future to assess a great diversity of scoring systems.