LLM Security Leaderboard
Evaluate and compare the security and safety of leading AI models.
The higher the score, the more secure and reliably helpful the model is.
| Rank | Model | CloudsineAI Attack Dataset | Prompt Injection Security | Jailbreak Security | Output Safety | Sensitive Information | LLM Native Protection | Protector Plus Protection |
|---|---|---|---|---|---|---|---|---|
| 1 | Claude-3.5-Sonnet | 86.9 | 97.3 | 100.0 | 99.4 | 99.9 | 93.0 | 97.9 |
| 2 | GPT-4o | 72.4 | 98.3 | 100.0 | 95.4 | 99.4 | 85.3 | 94.6 |
| 3 | Llama3.3:70b | 74.9 | 92.9 | 98.2 | 91.3 | 95.7 | 84.7 | 97.2 |
| 4 | grok-2-1212 | 72.8 | 92.8 | 99.8 | 86.1 | 97.4 | 83.4 | 97.1 |
| 5 | Llama3.2 | 82.6 | 78.8 | 92.8 | 80.2 | 79.1 | 82.7 | 97.5 |
| 6 | Gemini-2.0-flash | 66.8 | 88.7 | 96.2 | 89.0 | 87.3 | 78.5 | 95.3 |
| 7 | GPT4o-mini | 70.5 | 67.7 | 97.8 | 83.1 | 84.5 | 77.0 | 96.0 |
| 8 | DeepSeek-R1-Distill-Qwen-32B | 54.2 | 68.0 | 48.6 | 39.8 | 44.0 | 52.1 | 95.0 |
| 9 | Llama2-Uncensored | 65.5 | 73.7 | 11.5 | 10.3 | 17.1 | 41.9 | 96.1 |
| 10 | DeepSeek-R1-Distill-Llama-8B | 43.8 | 52.9 | 5.2 | 7.5 | 9.0 | 31.3 | 91.5 |
| Rank | Model | CloudsineAI Attack Dataset | Prompt Injection Security | Jailbreak Security | Output Safety | Sensitive Information | LLM Native Protection | Protector Plus Protection |
|---|---|---|---|---|---|---|---|---|
| 1 | Claude-3.5-Sonnet | 86.9 | 97.3 | 100.0 | 99.4 | 99.9 | 93.0 | 97.9 |
| 2 | Claude-3.7-Sonnet | 85.6 | 94.6 | 99.2 | 94.8 | 97.5 | 91.1 | 98.1 |
| 3 | GPT-4o | 72.4 | 98.3 | 100.0 | 95.4 | 99.4 | 85.3 | 94.6 |
| 4 | Llama3.3:70b | 74.9 | 92.9 | 98.2 | 91.3 | 95.7 | 84.7 | 97.2 |
| 5 | grok-2-1212 | 72.8 | 92.8 | 99.8 | 86.1 | 97.4 | 83.4 | 97.1 |
| 6 | Llama3.2 | 82.6 | 78.8 | 92.8 | 80.2 | 79.1 | 82.7 | 97.5 |
| 7 | GPT-4.1 | 72.1 | 91.5 | 98.9 | 93.1 | 84.8 | 82.1 | 94.8 |
| 8 | Gemini-2.0-flash | 66.8 | 88.7 | 96.2 | 89.0 | 87.3 | 78.5 | 95.3 |
| 9 | GPT4o-mini | 70.5 | 67.7 | 97.8 | 83.1 | 84.5 | 77.0 | 96.0 |
| 10 | Grok-3 | 56.3 | 89.3 | 92.1 | 74.4 | 79.9 | 70.1 | 87.9 |
| 11 | Gemini-2.5-Flash | 34.8 | 76.8 | 88.6 | 65.6 | 71.3 | 55.2 | 90.2 |
| 12 | DeepSeek-R1-Distill-Qwen-32B | 54.2 | 68.0 | 48.6 | 39.8 | 44.0 | 52.1 | 95.0 |
| 13 | Llama2-Uncensored | 65.5 | 73.7 | 11.5 | 10.3 | 17.1 | 41.9 | 96.1 |
| 14 | DeepSeek-R1-Distill-Llama-8B | 43.8 | 52.9 | 5.2 | 7.5 | 9.0 | 31.3 | 91.5 |
| Rank | Model | CloudsineAI Attack Dataset (%) | Prompt Injection Security (%) | Jailbreak Security (%) | Output Safety (%) | Sensitive Information (%) | Benign Requests (%) | Overall Score | Overall Score (With Protector Plus) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | GPT-4o | 75.12 | 96.65 | 98.60 | 90.16 | 85.35 | 93.40 | 89.88 | 94.7 |
| 2 | GPT-5 | 81.33 | 94.08 | 96.93 | 88.86 | 84.68 | 93.17 | 89.84 | 96.2 |
| 3 | GPT-4.1 | 74.85 | 94.21 | 99.21 | 92.76 | 85.80 | 92.10 | 89.82 | 94.7 |
| 4 | Claude-Sonnet-4 | 83.51 | 98.33 | 94.74 | 89.00 | 82.66 | 88.04 | 89.38 | 94.7 |
| 5 | Qwen3:8b | 68.91 | 94.59 | 94.82 | 74.38 | 79.15 | 90.77 | 83.77 | 95.2 |
| 6 | Gemini-2.5-Flash | 65.90 | 71.69 | 79.12 | 59.91 | 63.15 | 91.34 | 71.85 | 92.7 |
| 7 | Grok-4 | 72.87 | 61.90 | 38.51 | 39.22 | 59.34 | 83.34 | 59.20 | 92.7 |
| 8 | Llama2-Uncensored | 68.98 | 80.31 | 17.46 | 23.30 | 27.88 | 94.90 | 52.14 | 95.9 |
Explore how the leading models perform on our CloudsineAI attack dataset and across Prompt Injection Security, Jailbreak Security, Output Safety, and Sensitive Information, while still fulfilling benign requests.
Our leaderboard evaluates models across a comprehensive set of categories. Higher scores indicate not only increased robustness against adversarial prompts, but also a greater willingness to comply with legitimate queries without unnecessary refusals. This dual assessment ensures that models are measured both for their security resilience and for their responsiveness in real-world applications.
We evaluated the models on our leaderboard using a combination of curated datasets and our proprietary dataset.
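The page does not spell out how the Overall Score is aggregated, but in the last table it is consistent with an unweighted mean of the six category percentages (for GPT-4o: (75.12 + 96.65 + 98.60 + 90.16 + 85.35 + 93.40) / 6 = 89.88). A minimal sketch under that averaging assumption:

```python
# Illustrative sketch only: aggregate per-category scores into an overall score.
# Assumes an unweighted mean of the six categories, which reproduces the published
# numbers in the last table above; the leaderboard's actual weighting is not
# documented on this page.

CATEGORIES = [
    "cloudsineai_attack_dataset",
    "prompt_injection_security",
    "jailbreak_security",
    "output_safety",
    "sensitive_information",
    "benign_requests",
]

def overall_score(scores: dict[str, float]) -> float:
    """Average the six category percentages into a single overall score."""
    return round(sum(scores[c] for c in CATEGORIES) / len(CATEGORIES), 2)

gpt4o = {
    "cloudsineai_attack_dataset": 75.12,
    "prompt_injection_security": 96.65,
    "jailbreak_security": 98.60,
    "output_safety": 90.16,
    "sensitive_information": 85.35,
    "benign_requests": 93.40,
}

print(overall_score(gpt4o))  # 89.88, matching the published Overall Score
```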
The evaluation categories are defined as follows:

- **CloudsineAI Attack Dataset** – our proprietary dataset for evaluating LLM security, designed from a hacker's perspective.
- **Prompt Injection Security** – measures the model's ability to resist malicious prompt injection attempts.
- **Jailbreak Security** – measures the model's ability to resist advanced jailbreak techniques such as analyzing-based jailbreaks and adaptive jailbreak attacks.
- **Output Safety** – measures the model's consistency in generating safe and appropriate outputs, covering harmful content, crime, and hate speech.
- **Sensitive Information** – evaluates the model's resistance to requests for unethical or confidential information.
- **Benign Requests** – evaluates the model's responsiveness to safe, non-adversarial prompts by determining how often it provides direct answers instead of issuing unwanted refusals (a toy sketch of this idea follows the list).
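The grading behind these percentages is not described here, so the following is only a toy sketch of the Benign Requests idea: send safe prompts to a model and count how many receive a direct answer rather than a refusal. The `ask_model` callable and the keyword-based refusal check are illustrative assumptions, not CloudsineAI's methodology.

```python
# Toy sketch of a Benign Requests score: the fraction of safe, non-adversarial
# prompts answered directly rather than refused. The keyword check is a
# deliberately naive placeholder, and `ask_model` stands in for any
# chat-completion call.

REFUSAL_MARKERS = ("i can't help", "i cannot help", "i'm sorry, but", "i am unable to")

def looks_like_refusal(response: str) -> bool:
    """Naive heuristic: flag responses that open with common refusal phrasing."""
    return response.strip().lower().startswith(REFUSAL_MARKERS)

def benign_request_score(prompts: list[str], ask_model) -> float:
    """Percentage of benign prompts answered directly instead of refused."""
    answered = sum(not looks_like_refusal(ask_model(p)) for p in prompts)
    return 100.0 * answered / len(prompts)

if __name__ == "__main__":
    # Toy run with a canned "model" that refuses one of three harmless prompts.
    canned = {
        "How do I sort a list in Python?": "Use the built-in sorted() function.",
        "Summarise the water cycle.": "Water evaporates, condenses, and falls as rain.",
        "Give me a pasta recipe.": "I'm sorry, but I can't help with that.",
    }
    print(benign_request_score(list(canned), canned.get))  # ~66.67
```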
Contact our team of experts to learn how CloudsineAI can enhance your GenAI security. Leave the security to us and focus on innovating.