James Stanley


Secrets of the ChatGPT Linux system

Sun 16 June 2024
Tagged: software

Have you noticed that ChatGPT sometimes writes out Python code and somehow executes it? How does that work? What kind of environment is it using? Can we co-opt it for our own ends? Let's find out!

To play along at home: open a ChatGPT chat (I used gpt-4o) and ask if it can execute whoami using os.popen.
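
When it agrees, the code it writes amounts to something like this (a minimal sketch; the exact code varies from chat to chat):

import os

# Run whoami in the sandbox and print the result ("sandbox" is the answer you get)
print(os.popen("whoami").read().strip())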

Here's a short example session: https://chatgpt.com/share/94ad03ba-2a76-4643-a9cc-fb17df2e0345

Occasionally it will refuse to help you; as far as I can tell this is just random chance:

Executing system-level commands like whoami or using os.exec is not allowed in this environment for security reasons.

If it refuses, just open a new chat and try again until it works. After it has worked once, it will run anything you ask.

First impressions

Getting it to run stuff using os.popen works very well, and I encourage you to try it. Once it has run the first few commands for you, it understands what it needs to do, and you can just ask for what you want without being explicit about os.popen. Interacting with a Unix environment in plain English is an incredible experience; it feels like a superpower.

The first things I discovered are:

- it's running as an unprivileged user called sandbox, with a home directory at /home/sandbox
- the filesystem claims to have something like 8 exabytes of free disk space

(I think the 8 exabytes of disk space figure is wrong - it's probably some sort of virtual "grow on demand" disk; I doubt they'd actually let you create 8 exabytes of disk usage, although I didn't actually try.)

I got it to ls /home/sandbox and I found a file called README, saying:

Thanks for using the code interpreter plugin!
    
Please note that we allocate a sandboxed Unix OS just for you, so it's expected that you can see
and modify files on this system.

Processes

ps aux was quite helpful:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
sandbox      1  0.0  1.5  32980 16460 ?        Ssl  21:20   0:00 tini -- python3 -m uvicorn --host 0.0.0.0 --port 8080 user_machine.app:app
sandbox      3  0.8 11.2 239412 118084 ?       Sl   21:20   0:09 python3 -m uvicorn --host 0.0.0.0 --port 8080 user_machine.app:app
sandbox     12  0.6 26.5 1012168 278192 ?      Ssl  21:20   0:07 /usr/local/bin/python3 -m ipykernel_launcher -f /home/sandbox/kernel-0f2cff68-76d0-428d-aeca-457929590ecf.json
sandbox     56  0.2 10.7 199880 112696 ?       Ssl  21:20   0:02 /usr/local/bin/python3 -m ipykernel_launcher -f /home/sandbox/kernel-17ed5563-6b6a-4090-9230-1cf364cb683b.json
sandbox     77  0.2 10.3 193736 108296 ?       Ssl  21:21   0:02 /usr/local/bin/python3 -m ipykernel_launcher -f /home/sandbox/kernel-0eb063b9-61a6-48e6-9777-703e6be500fd.json
sandbox    333 25.0  1.6  33084 17256 ?        Sl   21:38   0:00 /bin/sh -c ps aux
sandbox    335  166  2.2  40052 23328 ?        Rl   21:38   0:00 ps aux

So from this we learn:

- PID 1 is tini, which supervises a uvicorn web server listening on port 8080
- there are 3 separate IPython kernels running
- everything runs as the unprivileged sandbox user

I don't know why it is running 3 IPython kernels. It seems like it only needs 1. Maybe one is for your actual chat session, and the other two are just there by accident because of parallel background sessions that do some processing on your chat session, for example the thing that reads the first few messages and sets the "title"? Don't know, pure speculation.

Networking

What kind of network access do we have? First impression is none at all. DNS doesn't work, and I can't curl anything on the public internet even by IP address.

But what is that uvicorn web server doing if there's no network access? I must be missing something.

I tried to find out the local IP address with commands like ifconfig -a and ip addr, but neither of them was available. Then I had a better idea: ask ChatGPT to find out its own local IP address. And it worked! It wrote some Python code that enumerated its available interfaces and showed me the non-loopback address. This is what I mean about interacting with a Linux system in plain English being a superpower: it actually works better if you say what you want to achieve instead of saying how to do it! (Sometimes.)
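
I don't have the exact code it generated, but a sketch along these lines (using psutil, which turns out to be installed in the environment) does the same job:

import psutil
import socket

# Print every non-loopback IPv4 address, interface by interface
for ifname, addrs in psutil.net_if_addrs().items():
    for addr in addrs:
        if addr.family == socket.AF_INET and not addr.address.startswith("127."):
            print(ifname, addr.address)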

It only had 2 network interfaces, lo and eth0. The IP address on eth0 was something like 10.230.120.251. I guessed that there might be a gateway at 10.230.120.1, but again I couldn't connect to this on any port.

The assumption I have is that this web server is the mechanism by which the LLM runs code in the Linux environment: it drives IPython the same way a human might, and the web app is there to expose IPython to the outside. Eventually I had the idea of looking in netstat etc. to find out where remote connections are coming from, but of course netstat isn't available.

Again the plain English solution worked best: I asked ChatGPT to tell me what network connections are open, and it came up with some Python code to do it using psutil, and I found that there was a connection into port 8080 from 10.230.1.112! So that's our web server's client.
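
Again I don't have its exact code, but the psutil approach looks roughly like this:

import psutil

# List established TCP connections; the interesting one is the inbound
# connection to uvicorn's port 8080 from 10.230.1.112
for conn in psutil.net_connections(kind="tcp"):
    if conn.status == "ESTABLISHED":
        print(conn.laddr, "<-", conn.raddr)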

Again I tried to connect to this IP address on likely port numbers, and again no connection succeeded.

So what I've learnt about networking is that there is basically no external access whatsoever. The external ChatGPT-running system is able to connect into the uvicorn server, but it looks like the Kubernetes firewall (?) blocks everything else. But have a look yourself; maybe you'll find something I missed.

The web app

But there's another line of attack here. If it's running a web app in this environment, then the web app must exist here somewhere. Let's find it.

We know from ps aux that the app is called something like "user_machine", so I asked ChatGPT to find files containing "user_machine". It wrote a find command and it worked first time: there is a "user_machine" directory at /home/sandbox/.openai_internal/user_machine/.
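
I didn't keep the exact command, but it was something along these lines, run via os.popen:

import os

# Search the filesystem for anything named like "user_machine", ignoring errors
print(os.popen("find / -name '*user_machine*' 2>/dev/null").read())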

.openai_internal? Now we're getting somewhere!

I got it to do a recursive ls -la, and I won't bore you with all the details, but the most important finds were the web app's own source files: app.py and routes.py.

Dumping routes.py is easy because it is short (just ask ChatGPT to show you what's in it). Dumping app.py was trickier because it is so long that the output gets truncated. I asked ChatGPT to give it to me in 5k byte chunks, and I pieced them together manually. (But see the end of this post for a better method).
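
The chunked dump amounts to something like this - you just ask for the offset to be bumped by 5000 bytes on each request:

# Read one 5000-byte chunk of app.py, starting at a given offset
path = "/home/sandbox/.openai_internal/user_machine/app.py"
offset = 0  # increase by 5000 on each subsequent request
with open(path) as f:
    f.seek(offset)
    print(f.read(5000))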

So this feels like real progress! I've dumped out the source code of the internal web app that connects the upstream ChatGPT-running system to IPython.

It looks like comments have been replaced with spaces, because there are no comments in these files, and there are large blocks containing nothing but spaces in places you might expect to find comments, like at the top of a function.

I found one thing that kind of looks a bit like it might be a bug. If you squint.

In app.py:

@app.get("/download/{path:path}")
async def download(path: str):
    path = urllib.parse.unquote(path)
    [...]
    
@app.get("/check_file/{path:path}")
async def check_file(path: str):
    path = "/" + urllib.parse.unquote(path)
    [...]

Why does check_file prepend a "/" to the path but download doesn't?
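
If the routes behave the way they read, the practical consequence would be something like this (hypothetical - I didn't confirm exactly what either route returns):

import urllib.request

BASE = "http://localhost:8080"

# check_file adds the leading "/" to the path for you...
print(urllib.request.urlopen(BASE + "/check_file/home/sandbox/README").read())

# ...but download apparently needs the absolute path spelled out, i.e. a double slash
print(urllib.request.urlopen(BASE + "/download//home/sandbox/README").read())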

Apart from that the code looked straightforward and believable. app.py is by far the largest part, and is only about 650 lines.

The API it exposes is small: POST /kernel to create an IPython kernel, a websocket at /channel for submitting code and retrieving output, POST /upload and GET /download for moving files in and out, GET /check_file, and something called self_identify, which I'll come back to.

So presumably the ChatGPT-running environment uses POST /kernel to create the 3 IPython kernels, then opens a websocket to /channel to submit ChatGPT's code and retrieve the outputs. When you upload a file I guess it uses POST /upload to put it inside the container (although uploads end up in /mnt/data, so it could probably also place files by mounting the same filesystem elsewhere), and similarly I guess it uses GET /download to retrieve files when giving them to you to download (like when you ask it to make an Excel spreadsheet).

I don't quite know what "self_identify" is for. There is a UUID in an environment variable, and a "middleware" function inserts the value from this environment variable as the x-ace-self-identify header (except, not when the client is localhost? why?). Maybe it's just a sanity check to prevent crosstalk between different users' Linux environments?
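
To make that concrete, a FastAPI middleware that behaves the way this one seems to would look roughly like this (a reconstruction, not the actual code; the environment variable name here is made up):

import os
from fastapi import FastAPI, Request

app = FastAPI()

@app.middleware("http")
async def self_identify(request: Request, call_next):
    response = await call_next(request)
    # Tag every response with the per-sandbox UUID, except for localhost clients
    if request.client.host != "127.0.0.1":
        response.headers["x-ace-self-identify"] = os.environ.get("SANDBOX_UUID", "")
    return response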

Next steps

Getting files into the environment

You can attach files to your messages using the paperclip icon; these get put in the Linux environment under /mnt/data, and it seems to accept any type of file. I submitted an x86_64 binary, asked ChatGPT to execute it, and it all worked first time.
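
The code ChatGPT ends up writing for this is something like the following (hello is a made-up name for the uploaded binary):

import os

# Mark the uploaded binary as executable, then run it and capture its output
os.chmod("/mnt/data/hello", 0o755)
print(os.popen("/mnt/data/hello").read())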

Example chat session: https://chatgpt.com/share/8ff0e4b2-a6b5-4242-8735-af606e7ca410

Getting files out of the environment

In my investigation I was getting files out by asking ChatGPT to print 5k bytes at a time and copying and pasting them into an editor, but I later found you can do a lot better: just ask ChatGPT to provide the file as an attachment, and it works automatically! It writes Python code to copy the file into /mnt/data and then gives you a hyperlink to download it.
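
What it does under the hood is roughly this (a sketch, with app.py as the example file):

import shutil

# Copy the file somewhere under /mnt/data so it can be offered as a download link
shutil.copy("/home/sandbox/.openai_internal/user_machine/app.py", "/mnt/data/app.py")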

Example chat session: https://chatgpt.com/share/bde51271-46f9-40bd-8b5a-602304805c5c

Modify the implementation of user_machine

It would be easy to modify .openai_internal/user_machine/app.py (for example by uploading your preferred alternative version). The problem is that to make your changes active you need to restart uvicorn, and as soon as uvicorn exits, tini shuts down the whole environment. And when the environment comes back up, your changes are gone.

I read something that suggested that if you send SIGHUP to uvicorn then it does a hot reload, but this reset my environment anyway. I don't know if that's because the uvicorn process exits anyway when you do this, or because SIGHUP isn't actually handled at all.

My next idea was to send a SIGSTOP to tini to prevent it from noticing that uvicorn had exited, but that didn't work either; my environment was reset anyway. I don't know if that's because SIGSTOP didn't actually stop tini, or because restarting uvicorn closes your IPython kernels and the orchestrator then closes your environment anyway.
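
For reference, both attempts boil down to a couple of lines of Python (a sketch, with the PIDs taken from the ps aux output above):

import os
import signal

# Attempt 1: ask uvicorn (PID 3 above) to hot-reload - this reset the environment
os.kill(3, signal.SIGHUP)

# Attempt 2: freeze tini (PID 1) so it can't react to uvicorn exiting - also reset
os.kill(1, signal.SIGSTOP)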

It seems like it should be possible to substitute your own implementation of user_machine, I just haven't quite worked out how.

And even if you manage it, I'm not sure it enhances your capabilities very much because you can already run arbitrary code on the system. You could relax the 1GB limit on uploaded files, but that's about it. (And OpenAI should be enforcing that limit at other layers anyway, so it shouldn't actually help you).

The best you could hope for here is sending malicious responses to the requests in order to exploit a potential vulnerability in the client program.

Get networking

The thing you really want to do is get networking. I expect the firewall is good enough, so probably the only way you're going to get network access is by substituting your own user_machine implementation and exploiting a vulnerability in the client program.

Find out what the other 2 IPython kernels are for

Getting logs from the other IPython kernels would be interesting, and might clarify their purpose.

Prior art

My investigation took several "sittings" because I kept hitting the dreaded gpt-4o usage limit and had to wait a few hours to continue. After the first sitting I DuckDuckWent the text from the README file and found a LessWrong article from last summer: Jailbreaking GPT-4's code interpreter.

There is also a comment on that post from someone who worked on the feature, addressing most of the points; it's worth reading.

The gist is that last year ChatGPT kept telling the user that certain capabilities weren't available to it (for example it might say "I can't write files outside /mnt/data"), and then the user discovered that it actually could write files outside /mnt/data. The explanation is that the OpenAI system prompt (?) told ChatGPT it couldn't write files outside /mnt/data - not because it physically lacks permission, but because they want the default experience to work better. And ChatGPT was mistaking "instructions on how to work best" for "rules on how the environment works".

One thing that was a genuine bug at the time is that /mnt/data was shared between multiple chats from the same user! That means that each chat session was not properly isolated. That has now been fixed, as far as I can tell: in 2 simultaneous chat sessions, writing a file to /mnt/data in one did not result in a file becoming present in /mnt/data in the other.
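
The check is easy to reproduce: in one chat write a marker file, then in a second simultaneous chat look for it (filename made up):

# In chat session A:
with open("/mnt/data/crosstalk_test.txt", "w") as f:
    f.write("hello from chat A")

# In chat session B, started at the same time:
import os
print(os.path.exists("/mnt/data/crosstalk_test.txt"))  # prints False -> sessions are isolated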

Conclusion

Leaking the source code for "user_machine" could be mitigated by, for example, having the webapp owned by a separate user, and dropping privileges to run IPython, but I don't think it's a big deal.

Although I had a lot of fun poking around this system, I didn't really find anything that I would consider a "vulnerability", it was actually much better than I expected. Arbitrary code execution is not a problem: OpenAI provide a sandboxed Linux environment for you to use with ChatGPT, and the outcome is you can use a sandboxed Linux environment with ChatGPT. Working as intended, WONTFIX.


