Dynamic System Prompts and Prompt Injection

Client-supplied prompt fragments as an attack surface — and what developers must do about it

Security compliance — Dynamic System Prompts and Prompt Injection
Key takeaways
  • Prompt injection via client-configured fragments is not a theoretical risk — it has been demonstrated in production multi-tenant AI systems.
  • OWASP LLM01 (prompt injection) is the top security risk for LLM-based applications — dynamic prompts dramatically expand this attack surface.
  • When a jailbroken chatbot causes harm to a user, the question of whether the platform or client is liable depends on who introduced the injection — and whether the platform had adequate controls.
Risk signals
  • No sanitisation of client-supplied prompt fragments before injection into the system prompt.
  • No monitoring of LLM outputs for signs of prompt injection success.
  • Client admin access without multi-factor authentication — making the admin panel itself an attack vector.
Action items
  • Sanitise all client-supplied fragments at write time: reject fragments containing injection keywords.
  • Wrap client fragments in structural delimiters and instruct the LLM to treat them as configuration, not meta-instructions.
  • Monitor all outputs for meta-references to system prompts or instructions — alert and log on detection.

Dynamic system prompt construction is powerful and dangerous. A client who embeds injection instructions in an agent mode fragment can cause the LLM to ignore all other instructions, reveal the system prompt, or produce outputs designed to harm their own users.

Key Analysis

Prompt injection via client-configured fragments is not a theoretical risk — it has been demonstrated in production multi-tenant AI systems.
OWASP LLM01 (prompt injection) is the top security risk for LLM-based applications — dynamic prompts dramatically expand this attack surface.
When a jailbroken chatbot causes harm to a user, the question of whether the platform or client is liable depends on who introduced the injection — and whether the platform had adequate controls.

Risk Signals

No sanitisation of client-supplied prompt fragments before injection into the system prompt.
No monitoring of LLM outputs for signs of prompt injection success.
Client admin access without multi-factor authentication — making the admin panel itself an attack vector.

Action Items

Sanitise all client-supplied fragments at write time: reject fragments containing injection keywords.
Wrap client fragments in structural delimiters and instruct the LLM to treat them as configuration, not meta-instructions.
Monitor all outputs for meta-references to system prompts or instructions — alert and log on detection.

LinkedIn

Technical Deep Dive

Read the technical deep dive

See the implementation walkthrough on govindpreetsingh.com

Read on govindpreetsingh.com →

Request a consultation

This is a lightweight intake endpoint for now. It is structured so the practice management system can later take over scheduling, conflict checks and matter creation.

Submitting this form does not create an advocate-client relationship. Please avoid sending confidential details until engagement is confirmed.