Testing yourself is the most effective way to learn
The most robust finding in learning science is the testing effect: actively retrieving information from memory produces dramatically better long-term retention than passively reviewing it.[1] This isn't a small effect — in a landmark 2011 study published in Science, Karpicke and Blunt found that retrieval practice outperformed even elaborative concept mapping for learning.[2]
This is why SecProve is built around questions, not videos or readings. Every time you answer a question — whether you get it right or wrong — you're engaging in retrieval practice. The act of trying to recall the answer strengthens the neural pathways that store that knowledge.
Dunlosky et al. (2013) reviewed ten common learning techniques and rated practice testing as "high utility" — one of only two techniques to earn that rating. Highlighting, rereading, and summarization were all rated "low utility."[3]
Competition accelerates learning
There's a reason chess players improve faster playing opponents than solving puzzles alone. Competitive environments increase effort, engagement, and ultimately learning outcomes.[4]
A comprehensive review of gamification in education found that elements like points, badges, leaderboards, and levels improve engagement, motivation, and learning outcomes when designed thoughtfully.[5] Hamari et al. (2014) reached a similar conclusion in a literature review of 24 empirical studies: gamification generally produces positive effects, particularly when competition and social-comparison elements are present.[6]
SecProve's 1v1 challenges, ELO leaderboard, and tier system aren't just for fun. They create the competitive context that research shows drives deeper engagement with the material.
Skill-based rating measures what matters
Most learning platforms use volume-based scoring — answer more questions, get more points. This rewards persistence, not skill. A user who answers 1,000 easy questions shouldn't outrank someone who answers 100 expert-level questions correctly.
SecProve uses an ELO-style rating system, originally developed by Arpad Elo for chess[7] and now the standard for skill measurement in competitive domains. Your rating adjusts based on the difficulty of each question relative to your current ability — harder questions move the needle more. This approach is grounded in Item Response Theory (IRT), the statistical framework used in standardized testing worldwide.[8]
Speed matters too. Correct answers given quickly earn a small bonus — up to 25% — reflecting the difference between confident recall and labored reasoning. Research on response latency in testing shows that faster correct responses correlate with deeper, more automated knowledge.[14] However, wrong answers never benefit from speed, preventing a guessing incentive. The result: your rating rewards both accuracy and fluency.
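As a concrete illustration, here is a minimal sketch of that kind of update in Python. The logistic expected-score formula is the standard Elo one; the K-factor of 32, the 10-second speed threshold, and the function names are assumptions made for the example, and only the 25% cap comes from the description above.

```python
def expected_score(user_rating: float, question_rating: float) -> float:
    """Standard Elo expectation: the probability the user answers correctly."""
    return 1.0 / (1.0 + 10 ** ((question_rating - user_rating) / 400))


def update_rating(user_rating: float, question_rating: float,
                  correct: bool, seconds_taken: float,
                  k: float = 32.0, fast_threshold: float = 10.0) -> float:
    """Illustrative rating update: questions that are hard relative to the user's
    current rating move the rating more, and fast correct answers earn up to 25% extra."""
    expected = expected_score(user_rating, question_rating)
    actual = 1.0 if correct else 0.0
    delta = k * (actual - expected)

    # The speed bonus only scales the positive gain from a correct answer,
    # so answering quickly never softens the penalty for a wrong one.
    if correct:
        speed_factor = max(0.0, 1.0 - seconds_taken / fast_threshold)  # 1.0 = instant answer
        delta *= 1.0 + 0.25 * speed_factor

    return user_rating + delta
```

Because the bonus multiplies a gain that only exists when the answer is correct, racing through questions you don't know can never inflate your rating.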
SecProve also asks "How confident are you?" before revealing each answer. This isn't just for show — the hypercorrection effect shows that high-confidence errors are corrected more effectively than low-confidence errors.[15] Your self-reported confidence feeds into your proficiency profile, helping the system identify dangerous blind spots where you're wrong but don't know it.
Those confidence ratings now feed a second, published rating: the Calibration Score. It sits alongside Knowledge Rating and measures how well your self-reported confidence tracks your actual accuracy. Chess has ELO. Forecasting has Brier scores. Cybersecurity has historically had neither for everyday practitioners.
A practitioner with high knowledge and low calibration is the hardest person to work with on an incident: they confidently assert things that aren't true, and the team grants them authority on the strength of their knowledge rating. Separating the two numbers makes that pattern visible. A full write-up covers the research behind the score.
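For readers who want the intuition behind a calibration number, here is a minimal sketch built on the Brier score mentioned above. The mapping onto a 0–100 scale and the function names are assumptions for illustration, not SecProve's published formula.

```python
def brier_score(confidences: list[float], outcomes: list[int]) -> float:
    """Mean squared error between stated confidence (0..1) and the outcome (1 = correct).
    Lower is better: 0.0 means confidence perfectly matched what actually happened."""
    return sum((c - o) ** 2 for c, o in zip(confidences, outcomes)) / len(confidences)


def calibration_score(confidences: list[float], outcomes: list[int]) -> float:
    """Illustrative mapping of the Brier score onto a 0-100 scale,
    where higher means self-reported confidence tracks accuracy more closely."""
    return round(100 * (1.0 - brier_score(confidences, outcomes)), 1)


# e.g. a user who says "90% sure" and is right, then "80% sure" and is wrong:
print(calibration_score([0.9, 0.8], [1, 0]))  # 67.5
```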
The result: your SecProve ratings reflect genuine skill and accurate self-assessment — not just how much time you've spent on the platform.
Calibrating question difficulty — the input to your rating
An ELO rating is only as meaningful as the difficulty signal that feeds it. If every question is scored the same, the math collapses — answering an easy question and answering an expert one would move your rating identically. So before a question ever reaches you, SecProve grades it on a 1–10 difficulty scale built from five independent components, each grounded in educational-measurement research.
Cognitive level comes from Bloom's revised taxonomy[16]: does the question test recall, analysis, or evaluation? Scenario complexity counts the distinct facts the user must integrate, drawing on Sweller's Cognitive Load Theory.[17] Distractor similarity measures how semantically close the wrong answers sit to the correct one; Haladyna's item-writing research found that the tightness of the option field drives observed difficulty more than stem length does.[18] Linguistic load uses Flesch–Kincaid readability,[19] and a final modifier load accounts for negation, quantitative reasoning, and multi-step structure, all of which systematically depress p-values even when content difficulty is constant.[20]
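The linguistic-load component is the most mechanical of the five. Here is a minimal sketch of the Flesch–Kincaid grade-level formula[19] in Python; the crude vowel-group syllable counter is a stand-in for whatever syllabifier a real pipeline would use.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic: count groups of consecutive vowels (minimum one syllable)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59"""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text) or ["x"]
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59
```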
The weighted sum produces a score that maps to the four ELO tiers (Beginner 0.5× → Expert 2.0×); the tier sets the multiplier applied to your K-factor every time you answer a question. Our rubric is an a priori estimate; once a question has ≥50 graded responses, we compare it against observed Item Response Theory parameters[8] (p-values and point-biserial discrimination) and recalibrate the weights. Your rating math gets more accurate, not less, as the platform matures.
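To make the mechanics concrete, here is a sketch of how component scores could combine into a difficulty score and a K-factor multiplier. The weights and tier cut-offs are illustrative assumptions; only the 1–10 scale, the four tiers, and the 0.5×–2.0× multiplier range come from the description above.

```python
# Illustrative weights for the five components described above (assumptions,
# not SecProve's actual rubric). Each component score is on a 1-10 scale.
WEIGHTS = {
    "cognitive_level": 0.30,      # Bloom's revised taxonomy level
    "scenario_complexity": 0.25,  # distinct facts the solver must integrate
    "distractor_similarity": 0.20,
    "linguistic_load": 0.15,      # e.g. Flesch-Kincaid grade, rescaled to 1-10
    "modifier_load": 0.10,        # negation, quantitative steps, multi-part structure
}

# Hypothetical mapping from difficulty score to tier and K-factor multiplier.
TIERS = [
    (3.0, "Beginner", 0.5),
    (5.5, "Intermediate", 1.0),
    (8.0, "Advanced", 1.5),
    (10.0, "Expert", 2.0),
]


def difficulty_score(components: dict[str, float]) -> float:
    """Weighted sum of the five component scores, clamped to the 1-10 scale."""
    score = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return min(10.0, max(1.0, score))


def k_factor_multiplier(score: float) -> tuple[str, float]:
    """Map a 1-10 difficulty score onto a tier and its K-factor multiplier."""
    for upper_bound, tier, multiplier in TIERS:
        if score <= upper_bound:
            return tier, multiplier
    return TIERS[-1][1], TIERS[-1][2]
```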
Explanatory feedback is what makes wrong answers valuable
Getting a question wrong is only useful if you understand why it's wrong. Hattie and Timperley (2007) showed that feedback is among the most powerful influences on learning — but only when it explains the reasoning, not just the verdict.[9] Simple "correct/incorrect" indicators are far less effective than detailed explanations.[10]
That's why SecProve provides per-choice explanations for every answer — not just the correct one. Each wrong answer explains the specific misconception it represents and when that answer might be correct in a different context. Every explanation cites authoritative sources: NIST, OWASP, MITRE ATLAS, and peer-reviewed research.
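Concretely, per-choice feedback implies that every answer option carries its own explanation payload. Below is a minimal sketch of what that data might look like; the field names are hypothetical, while the content requirements (a rationale, the misconception a wrong answer represents, when it would be correct, and cited sources) come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Choice:
    text: str
    is_correct: bool
    explanation: str          # why this choice is right or wrong
    misconception: str = ""   # for wrong answers: the specific error it represents
    correct_when: str = ""    # a context in which this answer would be right
    sources: list[str] = field(default_factory=list)  # e.g. NIST, OWASP, MITRE ATLAS references


@dataclass
class Question:
    stem: str
    choices: list[Choice]
```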
Consistency beats cramming — always
A meta-analysis of 317 experiments found that spacing practice over time dramatically improves retention compared to massed practice (cramming).[11] This is the spacing effect — one of the most replicated findings in all of cognitive psychology.
SecProve's daily streak system, knowledge decay tracking, and "maintain" recommendations are all designed around this principle. Short, consistent practice sessions are more effective than marathon study sessions — and the research strongly supports this.
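SecProve's decay model isn't spelled out here, but a simple exponential forgetting-curve assumption shows why "maintain" prompts follow from the spacing effect: estimated retention falls with time since the last successful retrieval, and a short review is recommended once it drops below a threshold. The half-life and threshold values below are placeholders, not the platform's real parameters.

```python
def estimated_retention(days_since_review: float, half_life_days: float = 14.0) -> float:
    """Exponential forgetting-curve sketch: estimated probability the material
    is still recallable, given a per-topic retention half-life."""
    return 0.5 ** (days_since_review / half_life_days)


def should_recommend_maintain(days_since_review: float,
                              half_life_days: float = 14.0,
                              threshold: float = 0.8) -> bool:
    """Flag a topic for a short 'maintain' session once estimated retention
    falls below the threshold, favoring frequent short reviews over cramming."""
    return estimated_retention(days_since_review, half_life_days) < threshold
```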
Traditional cybersecurity training isn't working
Research on cybersecurity training effectiveness shows that interactive, scenario-based assessment outperforms passive training methods.[12] The cybersecurity skills gap continues to widen in part because traditional certification approaches fail to keep pace with evolving threats — one-time exams don't verify ongoing competence.[13]
SecProve addresses this with continuous, adaptive assessment. Your knowledge profile is always current, always honest, and always pointing you toward what to learn next.
References
- [1] Roediger, H.L. & Karpicke, J.D. (2006). Test-Enhanced Learning: Taking Memory Tests Improves Long-Term Retention. Psychological Science, 17(3), 249-255.
- [2] Karpicke, J.D. & Blunt, J.R. (2011). Retrieval Practice Produces More Learning than Elaborative Studying with Concept Mapping. Science, 331(6018), 772-775.
- [3] Dunlosky, J., Rawson, K.A., Marsh, E.J., Nathan, M.J. & Willingham, D.T. (2013). Improving Students' Learning With Effective Learning Techniques. Psychological Science in the Public Interest, 14(1), 4-58.
- [4] Bigoni, M., Camera, G. & Casari, M. (2019). The Effect of Competition on Learning: Evidence from a Classroom Experiment. Working Paper.
- [5] Dicheva, D., Dichev, C., Agre, G. & Angelova, G. (2015). Gamification in Education: A Systematic Mapping Study. Educational Technology & Society, 18(3), 75-88.
- [6] Hamari, J., Koivisto, J. & Sarsa, H. (2014). Does Gamification Work? A Literature Review of Empirical Studies on Gamification. Proc. 47th Hawaii International Conference on System Sciences.
- [7] Elo, A.E. (1978). The Rating of Chessplayers, Past and Present. Arco Publishing, New York.
- [8] Embretson, S.E. & Reise, S.P. (2000). Item Response Theory for Psychologists. Lawrence Erlbaum Associates.
- [9] Hattie, J. & Timperley, H. (2007). The Power of Feedback. Review of Educational Research, 77(1), 81-112.
- [10] Pashler, H., Cepeda, N.J., Wixted, J.T. & Rohrer, D. (2005). When Does Feedback Facilitate Learning of Words? Journal of Experimental Psychology: Learning, Memory, and Cognition, 31(1), 3-8.
- [11] Cepeda, N.J., Pashler, H., Vul, E., Wixted, J.T. & Rohrer, D. (2006). Distributed Practice in Verbal Recall Tasks: A Review and Quantitative Synthesis. Psychological Bulletin, 132(3), 354-380.
- [12] Caulfield, T. et al. (2023). A Systematic Review of Cybersecurity Training and Education. ACM Computing Surveys.
- [13] Dawson, J. & Thomson, R. (2018). The Future Cybersecurity Workforce: Going Beyond Technical Skills for Successful Cyber Performance. Computers & Security, 73, 283-293.
- [14] Ratcliff, R. & McKoon, G. (2008). The Diffusion Decision Model: Theory and Data for Two-Choice Decision Tasks. Neural Computation, 20(4), 873-922.
- [15] Butterfield, B. & Metcalfe, J. (2001). Errors Committed with High Confidence Are Hypercorrected. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27(6), 1491-1494.
- [16] Anderson, L.W. & Krathwohl, D.R. (2001). A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom's Taxonomy of Educational Objectives. Longman, New York.
- [17] Sweller, J., Ayres, P. & Kalyuga, S. (2011). Cognitive Load Theory. Springer, New York.
- [18] Haladyna, T.M., Downing, S.M. & Rodriguez, M.C. (2002). A Review of Multiple-Choice Item-Writing Guidelines for Classroom Assessment. Applied Measurement in Education, 15(3), 309-333.
- [19] Kincaid, J.P., Fishburne, R.P., Rogers, R.L. & Chissom, B.S. (1975). Derivation of New Readability Formulas for Navy Enlisted Personnel. Naval Technical Training Command Research Branch Report 8-75.
- [20] Downing, S.M. (2002). Construct-Irrelevant Variance and Flawed Test Questions: Do Multiple-Choice Item-Writing Principles Make Any Difference? Academic Medicine, 77(10 Suppl), S103-S104.