cloudsineAI LLM Security Leaderboard

Evaluate and compare the security and safety of leading AI models. The higher the score, the more secure the model is.

| Rank | Model | cloudsineAI's Attack Dataset | Prompt Injection Security | Jailbreak Security | Output Safety | Sensitive Information | LLM Native Protection | Protector Plus Protection |
|------|-------|------------------------------|---------------------------|--------------------|---------------|-----------------------|-----------------------|----------------------------|
| 1 | Claude-3.5-Sonnet | 86.9 | 97.3 | 100.0 | 99.4 | 99.9 | 93.0 | 97.9 |
| 2 | Claude-3.7-Sonnet | 85.6 | 94.6 | 99.2 | 94.8 | 97.5 | 91.1 | 98.1 |
| 3 | GPT-4o | 72.4 | 98.3 | 100.0 | 95.4 | 99.4 | 85.3 | 94.6 |
| 4 | Llama3.3:70b | 74.9 | 92.9 | 98.2 | 91.3 | 95.7 | 84.7 | 97.2 |
| 5 | grok-2-1212 | 72.8 | 92.8 | 99.8 | 86.1 | 97.4 | 83.4 | 97.1 |
| 6 | Llama3.2 | 82.6 | 78.8 | 92.8 | 80.2 | 79.1 | 82.7 | 97.5 |
| 7 | GPT-4.1 | 72.1 | 91.5 | 98.9 | 93.1 | 84.8 | 82.1 | 94.8 |
| 8 | Gemini-2.0-flash | 66.8 | 88.7 | 96.2 | 89.0 | 87.3 | 78.5 | 95.3 |
| 9 | GPT4o-mini | 70.5 | 67.7 | 97.8 | 83.1 | 84.5 | 77.0 | 96.0 |
| 10 | Grok-3 | 56.3 | 89.3 | 92.1 | 74.4 | 79.9 | 70.1 | 87.9 |
| 11 | Gemini-2.5-Flash | 34.8 | 76.8 | 88.6 | 65.6 | 71.3 | 55.2 | 90.2 |
| 12 | DeepSeek-R1-Distill-Qwen-32B | 54.2 | 68.0 | 48.6 | 39.8 | 44.0 | 52.1 | 95.0 |
| 13 | Llama2-Uncensored | 65.5 | 73.7 | 11.5 | 10.3 | 17.1 | 41.9 | 96.1 |
| 14 | DeepSeek-R1-Distill-Llama-8B | 43.8 | 52.9 | 5.2 | 7.5 | 9.0 | 31.3 | 91.5 |

About the cloudsineAI LLM Security Leaderboard

Explore how the leading models perform on our cloudsineAI attack dataset, Prompt Injection Security, Jailbreak Security, Output Safety, and protection against the exposure of sensitive information.

Our leaderboard ranks models based on performance across these key categories, with an overall score ranging from 0.00 to 100.00. The higher the score, the more secure the model. For example, Claude 3.5 Sonnet is the most secure model we tested, with its native protection able to block 93% of all attacks. When enhanced with our GenAI firewall, the model can block almost 98% of all attacks.

Evaluation Methodology

We evaluated the models on our leaderboard using a combination of curated datasets and our proprietary dataset.

  • Dataset Composition & Weightage:
    • Four curated datasets, each tailored to a specific attack vector (weighted at 12.5% each).
    • Our proprietary attack dataset (weighted at 50%), crafted from a hacker’s perspective to simulate real-world adversarial tactics.
  • Evaluation Criteria: For each dataset, we analysed the proportion of each model's outputs classified as safe versus harmful, then combined the per-category results into an overall score using the weights above (see the sketch after this list).

  • Testing with WebOrion® Protector Plus: We then retested each model with WebOrion® Protector Plus protection to assess how effectively harmful prompts and outputs are blocked, and measured the safety of the resulting responses.
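
As a concrete illustration of the weighting above, the following minimal Python sketch recomputes an overall score from the per-category results published in the leaderboard. It assumes the LLM Native Protection column is the weighted combination of the five category scores; the function and variable names are ours for illustration, not part of any cloudsineAI API.

```python
# Minimal sketch of the weighting described above (assumption: the
# "LLM Native Protection" column is the weighted combination of the
# five category scores). Names are illustrative, not a cloudsineAI API.

WEIGHTS = {
    "cloudsine_attack_dataset": 0.50,   # proprietary attack dataset, 50%
    "prompt_injection": 0.125,          # four curated datasets, 12.5% each
    "jailbreak": 0.125,
    "output_safety": 0.125,
    "sensitive_information": 0.125,
}

def overall_score(scores: dict[str, float]) -> float:
    """Weighted average of per-category scores (each on a 0-100 scale)."""
    return sum(WEIGHTS[category] * value for category, value in scores.items())

# Per-category results for Claude-3.5-Sonnet from the leaderboard above.
claude_3_5_sonnet = {
    "cloudsine_attack_dataset": 86.9,
    "prompt_injection": 97.3,
    "jailbreak": 100.0,
    "output_safety": 99.4,
    "sensitive_information": 99.9,
}

print(round(overall_score(claude_3_5_sonnet), 1))  # 93.0, matching LLM Native Protection
```

The same weights reproduce the other rows as well (for example, 85.3 for GPT-4o).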

Key Metrics

Prompt Injection Security

Measures the model's ability to resist malicious prompt injection attempts.

cloudsineAI's Attack Dataset

Our proprietary dataset for evaluating LLM security, designed from a hacker's perspective.

Output Safety

Measures the model's consistency in generating safe and appropriate outputs, covering harmful content, criminal activity, and hate speech.

Jailbreak Security

Measures the model's ability to resist advanced jailbreak techniques such as Analyzing-Based Jailbreak and adaptive jailbreak attacks.

Exposure of Sensitive Information

Evaluates the model's resistance to prompts that attempt to extract unethical or confidential information.

Take the Next Step

Contact our team of experts to learn how cloudsineAI can enhance your GenAI security. Leave the security to us and focus on innovating.

Contact us today

Fill out the form below, and we will be in touch shortly.