restricting mirror permissions #12

Open
opened 2023-06-16 18:38:23 +00:00 by luna · 4 comments
Owner

while mirrors are useful, letting anyone create them leads to the entire gitdab website being a sea of mirrors. for a new user, this leaves the service with little value beyond "free storage". and since disks don't grow on trees, we have to control the space (i regularly have to clean up disk space to get the service running again, or email users about repositories that are too large for what they do).

the idea i have is to put an http service between nginx and gitdab that checks whether the user is on a certain list (people i trust to mirror without exploding storage space); if they are, the mirroring actions go through (TODO: find out how the forgejo frontend does mirrors so we can attach to those paths).

it's a compromise i'm okay with, hence i'm proposing it.
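a minimal sketch of the allowlist check described above, in python. everything named here is an assumption for illustration: the path prefixes stand in for whatever the forgejo frontend actually uses for mirrors (still a TODO), the allowlist contents are placeholders, and how the username reaches the check (session lookup, header set by the proxy, etc.) is left out entirely.

```python
from typing import Optional

# Hypothetical allowlist of users trusted to create mirrors.
TRUSTED_MIRROR_USERS = frozenset({"luna"})

# ASSUMPTION: placeholder path prefixes for mirror-creating requests.
# The real paths still need to be confirmed against the forgejo frontend.
MIRROR_PATH_PREFIXES = ("/repo/migrate", "/api/v1/repos/migrate")


def is_mirror_request(method: str, path: str) -> bool:
    """True if the request looks like a mirror-creating action."""
    return method.upper() == "POST" and path.startswith(MIRROR_PATH_PREFIXES)


def decide(
    method: str,
    path: str,
    username: Optional[str],
    allowlist: frozenset = TRUSTED_MIRROR_USERS,
) -> int:
    """Return the HTTP status the gate would answer with.

    200 means "pass the request through", 403 means "block it".
    """
    if not is_mirror_request(method, path):
        # Anything that is not a mirror action is never blocked here.
        return 200
    if username is not None and username in allowlist:
        return 200
    return 403
```

keeping the decision in a pure function means it can be tested without a running proxy; the actual http layer would only need to extract the method, path and user, then call `decide`.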

What worries me is that people are even cloning mirrors that aren't theirs for random projects. I'd personally find it "okay" to mirror your own projects for a backup and such, as long as they're not massive repos (especially not random files other than code), but other than that...

That said, I don't see a feasible way of restricting this aside from a "trusted whitelist".

Author
Owner

> people are even cloning mirrors that aren't theirs for random projects

to some degree, mirroring a repository that's not yours can be a value-add in general, like youtube-dl (though everyone who worked on it also has a full copy of the repo).

> I'd personally find it "okay" to mirror your own projects for a backup and such

that's where i take issue: gitdab being seen as "free storage" for a user to mirror their own projects. there was a user (not on gitdab anymore) who set up an *account* mirror from their github account to their gitdab account (consuming multiple gigabytes of disk). it took many weeks of back-and-forth emails to find out they had lost access to the server running the mirror system, at which point they asked for an account deletion.

which kind of goes back to what i said: if you keep clones of your own repos on your machine and treat those as the source of truth, you can already bootstrap yourself on a separate git forge (though not including the issues/PRs/etc; that's something forge federation should help with, by standardizing these workflows in the ActivityStreams2 format). plus, you are incentivized to keep repo sizes small so you don't explode your own disk.

> That said, I don't see a feasible way of restricting this aside from a "trusted whitelist".

i don't know if we could do something automated like account age... i don't see any issues with that idea at the moment, but i may be missing something
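a rough sketch of what an account-age rule could look like, in python. the 90-day threshold and the function name are made up for illustration, and where the account creation time would come from is not decided here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical threshold; the thread does not propose a specific value.
MIN_ACCOUNT_AGE = timedelta(days=90)


def old_enough_to_mirror(
    created_at: datetime,
    now: Optional[datetime] = None,
    min_age: timedelta = MIN_ACCOUNT_AGE,
) -> bool:
    """True if the account has existed for at least `min_age`.

    Both datetimes are expected to be timezone-aware.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - created_at >= min_age
```

a rule like this could sit next to a manual allowlist rather than replace it, since account age alone says nothing about how large the mirrored repositories are.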

Author
Owner

We cleaned up our largest repository (freeing around 6GB), but days later we reached 0% available disk space (around 20 minutes ago). This is not sustainable at all.

All mirrors are now **DISABLED** (they will NOT sync new commits, but they **WILL STILL BE AVAILABLE**) until the new allowlist mirroring system can be set up.

I can't give an ETA on this, but something had to be done.

Author
Owner

the cause was funnier than we expected. turns out we were affected by this:

https://forgejo.org/docs/v1.20/admin/search-engines-indexation/#disallow-crawling-archives-to-save-disk-space

robots.txt was edited, and mirroring was turned on some weeks ago as an experiment. it's likely it'll be turned off later on, so that we can actually do restricted mirror perms

Reference: gitdab/gitdab#12