I just proposed per-request middleware to greyproxy to handle dangerous LLM outputs. This demo shows a Python middleware intercepting and stripping a "halt system" command before it reaches the client. It works even with compressed responses.
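A rough sketch of the idea, not greyproxy's actual middleware API: the function name `strip_dangerous`, its signature, the `BLOCKED` constant, and the `[removed]` placeholder are all hypothetical, and gzip is assumed as the compression scheme. It decompresses the body if needed, strips the command, and recompresses it.

```python
import gzip

# Hypothetical blocked phrase, taken from the demo description.
BLOCKED = "halt system"


def strip_dangerous(body: bytes, headers: dict) -> tuple:
    """Hypothetical response hook; the real greyproxy interface may differ."""
    # Assumption: only gzip is handled; other encodings pass through as plain bytes.
    compressed = headers.get("Content-Encoding", "").lower() == "gzip"
    raw = gzip.decompress(body) if compressed else body

    # Remove the blocked phrase from the decoded text.
    text = raw.decode("utf-8", errors="replace")
    cleaned = text.replace(BLOCKED, "[removed]").encode("utf-8")

    # Recompress so the body still matches the Content-Encoding header.
    out = gzip.compress(cleaned) if compressed else cleaned

    # The body length changed, so Content-Length has to be updated too.
    new_headers = dict(headers)
    new_headers["Content-Length"] = str(len(out))
    return out, new_headers
```

Calling `strip_dangerous(gzip.compress(b"run: halt system now"), {"Content-Encoding": "gzip"})` would return a gzip body that decompresses to `run: [removed] now`.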
asciinema.org/a/4vHZIHogMu... #ai #llm #middleware