Dynamic system prompt construction is powerful and dangerous. A client who embeds injection instructions in an agent mode fragment can cause the LLM to ignore all other instructions, reveal the system prompt, or produce outputs designed to harm their own users.
Key Analysis
Prompt injection via client-configured fragments is not a theoretical risk — it has been demonstrated in production multi-tenant AI systems.
OWASP LLM01 (prompt injection) is the top security risk for LLM-based applications — dynamic prompts dramatically expand this attack surface.
When a jailbroken chatbot causes harm to a user, the question of whether the platform or client is liable depends on who introduced the injection — and whether the platform had adequate controls.
Risk Signals
No sanitisation of client-supplied prompt fragments before injection into the system prompt.
No monitoring of LLM outputs for signs of prompt injection success.
Client admin access without multi-factor authentication — making the admin panel itself an attack vector.
Action Items
Sanitise all client-supplied fragments at write time: reject fragments containing injection keywords.
Wrap client fragments in structural delimiters and instruct the LLM to treat them as configuration, not meta-instructions.
Monitor all outputs for meta-references to system prompts or instructions — alert and log on detection.