ChatGPT Just (Accidentally) Shared All Its Secret Rules – Here's What We Learned


ChatGPT inadvertently revealed a set of internal instructions embedded by OpenAI to a user, who then shared the discovery on Reddit. OpenAI has since closed off this unintended access to its chatbot's directives, but the revelation has sparked further debate about the intricacies and safety measures built into the AI's design.

Reddit user F0XMaster explained that they had greeted ChatGPT with a casual "Hi," and, in response, the chatbot divulged a complete set of system instructions from OpenAI — guidelines meant to steer the chatbot and keep it within predefined safety and ethical boundaries across many use cases.

“You are ChatGPT, a large language model trained by OpenAI, based on the GPT-4 architecture. You are chatting with the user via the ChatGPT iOS app,” the chatbot wrote. “This means that most of the time your lines should be one or two sentences, unless the user's request requires extensive reasoning or results. Never use emojis, unless explicitly asked. Knowledge limit: 2023-10 Current date: 2024-06-30.”


ChatGPT then laid out rules for DALL-E, the AI image generator integrated with ChatGPT, and for its browsing tool. The user replicated the result by directly asking the chatbot for its exact instructions. Notably, these system-level rules are distinct from the custom directives that users can enter themselves. For instance, one of the disclosed instructions pertaining to DALL-E explicitly limits the chatbot to generating a single image per request, even if a user asks for more. The instructions also emphasize avoiding copyright infringement when generating images.
