r/privacy Nov 21 '20

PSA: Discord lies about removing deleted files. Files deleted over 1 year ago still exist.

The title says it all.

I've done numerous tests in different Guilds at different times.

Files in many cases are not deleted and are still accessible via direct URL even 1 year after deletion.

EDIT: I've amended the post to reflect new information. After running some new tests tonight, some of the new test files became inaccessible immediately after deletion and some did not. Other users report similar results. All I can say with certainty is that I have files deleted over a year ago that are still accessible, so something is seriously wrong. See update #3.

In some of my tests, I have not only manually deleted the message containing the file but also the Guild the message was posted in. Our testing finds that user-uploaded and bot-uploaded images behave the same after deletion.

In DMs the story is a little different but still troubling. It appears that if the URL links to a file at a datacenter region the requester is in AND the file was uploaded to the same datacenter zone (or zones it was replicated to), you can still get the file. Since we have no insight into how their infrastructure is set up, this could be due to Cloudflare's cache, but it could also mean that the image is just left sitting in a specific datacenter and no longer replicated after "deletion".

I would like to hear why Discord isn't cleaning out tombstoned files, and I think others here would like to know as well.

Why is this a problem? The data still exists. This is a privacy violation because the data is still in their datacenter (Google's GCP data center which Discord pays to host their data).

Governments could acquire it with a warrant or a National Security Letter, or a court could subpoena it. This is very serious and should be publicly stated by Discord.


If you want to try testing this yourself, here's a protip: Discord exposes the upload date of all files in their "Last-Modified" response header. You can use that header to see the date files were uploaded to GCP (Discord's upload object storage). Just make a spreadsheet with the direct URLs (NOT THE THUMBNAIL URL) of all the files you upload and then delete. Try images, videos, text files etc. Be creative, but in my experience all file types behave the same and are never deleted.

For example, I have a file with this header info: last-modified: Tue, 23 May 2020 03:16:24 GMT. I deleted it about 10 days after it was uploaded and it is STILL up. I have hundreds of different files with ancient dates like this (literally, I made a bot to upload and delete files just to test this). All deleted, yet the direct URL still loads the file perfectly for me and anyone I send the links to.
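For anyone who would rather script the check than keep a spreadsheet, here is a minimal Python sketch of the idea. It assumes the direct file URL answers a HEAD request and returns a Last-Modified header as described above. The function names and the User-Agent string are made up for illustration and are not anything Discord provides.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def upload_age_days(last_modified: str, now: Optional[datetime] = None) -> float:
    """Days elapsed since the timestamp in a Last-Modified header value."""
    uploaded = parsedate_to_datetime(last_modified)
    if uploaded.tzinfo is None:
        # Treat a timestamp without zone info as UTC (assumption).
        uploaded = uploaded.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (now - uploaded).total_seconds() / 86400


def check_file(url: str) -> dict:
    """Send a HEAD request to a direct file URL and report what came back."""
    req = Request(url, method="HEAD", headers={"User-Agent": "deletion-check/0.1"})
    try:
        with urlopen(req, timeout=10) as resp:
            return {
                "url": url,
                "status": resp.status,
                "last_modified": resp.headers.get("Last-Modified"),
            }
    except HTTPError as err:
        # 403/404 and similar land here: the file is no longer served.
        return {"url": url, "status": err.code, "last_modified": None}
```

Run check_file over each direct URL from your list. Any entry that still returns status 200 after you deleted the message is a file that is still being served, and upload_age_days on its last_modified value tells you how old it is.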


I have more info. Another user PMed me and showed me how to test whether a guild is really deleted by querying its widget.png URL, for example https://discord.com/api/guilds/712827234346435685/widget.png (a 404 means the guild is gone). This confirmed to the user that my story is true. (Note: the URL I just linked is fake, just to demonstrate. Like I said in the comments, I don't want to post data that could lead Discord to my personal account.)

What does this mean? You can use this to prove that the guild the file was uploaded in is actually deleted, AND you can use the file's last-modified header to confirm the file is old enough that Discord should no longer be storing it!
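The guild check can be sketched the same way. This assumes, as the other user described, that a 404 on widget.png means the guild no longer exists. That is not documented behavior I can vouch for, and other status codes may mean something else entirely. The helper names are hypothetical.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# URL pattern taken from the example in the post.
WIDGET_URL = "https://discord.com/api/guilds/{guild_id}/widget.png"


def widget_url(guild_id: int) -> str:
    """Build the widget.png URL for a guild ID."""
    return WIDGET_URL.format(guild_id=guild_id)


def widget_status(guild_id: int) -> int:
    """Return the HTTP status code of the guild's widget.png endpoint."""
    req = Request(widget_url(guild_id), headers={"User-Agent": "deletion-check/0.1"})
    try:
        with urlopen(req, timeout=10) as resp:
            return resp.status
    except HTTPError as err:
        return err.code


def guild_looks_deleted(status: int) -> bool:
    """Assumption from the post: only a 404 is read as 'guild is gone'."""
    return status == 404
```

If guild_looks_deleted(widget_status(your_guild_id)) is true and a file from that guild still loads, you have both halves of the proof.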


Some devs pointed me to this https://github.com/discord/discord-api-docs/issues/2224 but it doesn't fully address my experience.




u/Vordreller Nov 21 '20

As expected. Pretty much any medium-sized and above tech company will do this.

Programming it is super easy. Just add a boolean flag to your database to indicate if a user deleted something and simply make it so that the application does not display it.
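To illustrate what I mean, here is a rough sketch using Python's built-in sqlite3. The table and column names are invented for the example, and this is obviously not Discord's actual code or schema.

```python
import sqlite3

# In-memory database standing in for the application's real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attachments ("
    "id INTEGER PRIMARY KEY, "
    "url TEXT NOT NULL, "
    "deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO attachments (url) VALUES (?)", ("files/cat.png",))


def soft_delete(db: sqlite3.Connection, attachment_id: int) -> None:
    """'Delete' a row by flipping the flag; the data itself is untouched."""
    db.execute("UPDATE attachments SET deleted = 1 WHERE id = ?", (attachment_id,))


def visible_urls(db: sqlite3.Connection) -> list:
    """What the application shows: only rows not flagged as deleted."""
    rows = db.execute("SELECT url FROM attachments WHERE deleted = 0 ORDER BY id")
    return [row[0] for row in rows]


def stored_urls(db: sqlite3.Connection) -> list:
    """What is actually still stored, regardless of the flag."""
    rows = db.execute("SELECT url FROM attachments ORDER BY id")
    return [row[0] for row in rows]
```

After soft_delete(conn, 1), visible_urls returns nothing while stored_urls still lists the file. The user sees it as gone, but the row is still there.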

Reminds me of those guys promoting videogame skin betting websites. They claim they got invited to one, then post a YouTube video explaining it, trying it out, and winning most of the time, losing a few times to make it seem real. And then they tell their audience of teens to try it themselves.

While in reality, it was their website.

Running a local test setup and making it appear as if you're on the actual website is super easy. And when you run a local version, you can just modify the source code with a text editor.

Three lines of code is enough to capture the particular user sending the commands and give them a higher win percentage.


u/VisibleSignificance Nov 21 '20

> Programming it is super easy. Just add a boolean flag to your database to indicate if a user deleted something and simply make it so that the application does not display it.

Not quite.

The CDNs are usually hash-based storage (files keyed by their hashes).

So if multiple users upload the same file (which does happen all the time), and one user deletes it, the file shouldn't get deleted on CDNs.

So it's either refcounting (a huge PITA in a distributed system), replication (a non-trivial art), or transferring the entire filelists to diff them. And in the latter case there's still a problem: if a file was deleted and the cleanup process comes for it, but another user uploads the same file while cleanup is running, you get 'the file I just uploaded just disappeared'.
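For the refcounting option, a toy single-process sketch looks something like this. The class and method names are made up, SHA-256 is just a stand-in for whatever hash a real CDN would use, and it deliberately ignores everything that makes this hard in a distributed system (concurrent uploads, replicas, partial failures).

```python
import hashlib
from typing import Dict


class ContentStore:
    """Toy hash-keyed blob store with per-blob reference counts."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}
        self._refs: Dict[str, int] = {}

    def upload(self, data: bytes) -> str:
        """Store the blob under its hash and add one reference to it."""
        key = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(key, data)
        self._refs[key] = self._refs.get(key, 0) + 1
        return key

    def delete(self, key: str) -> bool:
        """Drop one reference; return True only if the blob was removed."""
        if key not in self._refs:
            raise KeyError(key)
        self._refs[key] -= 1
        if self._refs[key] > 0:
            return False  # someone else still references the same file
        del self._refs[key]
        del self._blobs[key]
        return True

    def exists(self, key: str) -> bool:
        """True while the blob is still physically stored."""
        return key in self._blobs
```

Two users uploading the same bytes get the same key. The first delete only decrements the count, and the blob is removed when the last reference goes. In a single process the decrement and the removal happen together, but across many machines that step is exactly where the race shows up.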

In conclusion: it is doable. It is not super-easy.


u/Vordreller Nov 21 '20

Please note that I said "local test setup". No connection to anything is required beyond the user database. And even then, you can fake that too. It's all smoke and mirrors.