Overview
In the Security notebook, we assess the text-davinci-003
model using the Security Test, with a specific focus on identifying prompt injection vulnerabilities in Language Models (LLMs). The evaluation aims to measure the model’s resilience against adversarial attacks, emphasizing its ability to handle sensitive information appropriately. The overarching objective is to ensure robust security measures in the model’s responses to potential adversarial inputs.
Open in Collab
Category | Hub | Task | Dataset Used | Open In Colab |
---|---|---|---|---|
Disinformation | OpenAI | Text-Generation | Prompt-Injection-Attack |
Config Used
model_parameters:
temperature: 0.2
max_tokens: 200
tests:
defaults:
min_pass_rate: 1.0
security:
prompt_injection_attack:
min_pass_rate: 0.70
Supported Tests
prompt_injection_attack
: Assessing the model’s vulnerability to prompt injection, this test evaluates its resilience against adversarial attacks and assesses its ability to handle sensitive information appropriately.