17. February 2012

Subversion (SVN): Permanently remove files from repository (history)

Filed under: Linux, Server Administration | Christopher Kramer @ 20:02

As I am about to make CrazyStat’s Subversion repository public, I checked whether it contains anything that is not suitable for publication. I stumbled upon some log files that I had once used for testing and accidentally committed to the repo. These log files contained private data, so I needed to remove them from the history before making the repository public.

This is how it can be done:

As SVN has no ‘obliterate’ command yet (see feature request here), you need to perform the following steps:

  1. Make sure nobody else uses the repo at the time
  2. Dump your repository to a dumpfile
  3. Filter the dumpfile (remove the files you do not want to be in there anymore)
  4. Create a new repository
  5. Import the dumpfile in the new repository
  6. Replace the old with the new repository
  7. Check it
  8. Clean up

These steps in detail:

Step 1: Make sure nobody else uses the repo at the time

I think the easiest way is to remove write permissions from the repository folder. For example, if you access your SVN repository through Apache, just chown it from www-data to root, and nobody should be able to write to it anymore:

chown -R root:root /var/svn/REPOSITORY
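Since the chown overwrites the original owner, it may be worth noting it down beforehand so that it can be restored in Step 6. A minimal sketch, assuming GNU coreutils (`stat -c`); the function name `repo_owner` and the file `repo_owner.txt` are made up for illustration:

```shell
#!/bin/sh
# Sketch: print the current owner of a repository folder as "user:group".
# Assumes GNU stat (Linux); repo_owner is a hypothetical helper name.

repo_owner() {
    # $1 = path to the repository folder
    stat -c '%U:%G' "$1"
}

# Possible usage BEFORE running the chown from Step 1:
#   repo_owner /var/svn/REPOSITORY > repo_owner.txt
# and later, in Step 6:
#   chown -R "$(cat repo_owner.txt)" /var/svn/REPOSITORY_NEW
```

This only looks at the top-level folder; if files inside the repository have mixed owners, that would need a closer look.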

Step 2: Dump your repository to a dumpfile

svnadmin dump /var/svn/REPOSITORY > dumpfile
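Before filtering, a quick sanity check of the dumpfile can help. The sketch below counts the revision records in the dump. A full dump also contains revision 0, so the count should normally be the youngest revision number plus one. The function name `count_revisions` is an assumption for illustration, and the count can be off if a committed file happens to contain a line starting with "Revision-number: ".

```shell
#!/bin/sh
# Sketch: count the revision records in an SVN dumpfile.
# count_revisions is a hypothetical helper name.

count_revisions() {
    # $1 = dumpfile
    # -a: treat the dump as text even if it contains binary file data
    # -c: print only the number of matching lines
    grep -a -c '^Revision-number: ' "$1"
}

# Possible usage:
#   count_revisions dumpfile
#   svnlook youngest /var/svn/REPOSITORY   # expected: one less than the count
```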

Step 3: Filter the dumpfile

svndumpfilter exclude /path/of/file/to/remove < dumpfile > newdumpfile

This will remove the file “/path/of/file/to/remove”. You can remove multiple files at a time like this:

svndumpfilter exclude file1 file2 < dumpfile > newdumpfile

I did not find any way to use wildcards, though. Let me know in case you find anything.

Update: Thanks to the comment by Florian! Here is the way to use wildcards:

svndumpfilter exclude --pattern "*.OLD" < dumpfile > newdumpfile

Florian also pointed us to the svndumpfilter documentation, which might be helpful for some of you.
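To find out which paths to exclude, and to check afterwards that they are really gone, the paths recorded in a dumpfile can be listed. A rough sketch, assuming the dump stores each changed path on a line starting with "Node-path: " (usually without a leading slash); `list_paths` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: list the distinct paths in an SVN dumpfile that match a pattern.
# list_paths is a hypothetical helper name.

list_paths() {
    # $1 = dumpfile, $2 = extended regular expression matched against the path
    grep -a '^Node-path: ' "$1" \
        | sed 's/^Node-path: //' \
        | grep -E -- "$2" \
        | sort -u
}

# Possible usage:
#   list_paths dumpfile '\.log$'      # candidates to pass to svndumpfilter exclude
#   list_paths newdumpfile '\.log$'   # should print nothing after filtering
```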

Step 4: Create a new repository

svnadmin create /var/svn/REPOSITORY_NEW

Familiar, right? 😉

Step 5: Import the dumpfile in the new repository

svnadmin load /var/svn/REPOSITORY_NEW < newdumpfile
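After loading, it might be reassuring to verify the new repository and compare its youngest revision with that of the old one. A minimal sketch; `check_loaded_repo` is a made-up name. It assumes svndumpfilter was run as shown above, without options such as --drop-empty-revs, so that the revision numbers stay the same:

```shell
#!/bin/sh
# Sketch: basic checks on a freshly loaded repository.
# check_loaded_repo is a hypothetical helper name.

check_loaded_repo() {
    # $1 = old repository path, $2 = new repository path
    old_repo=$1
    new_repo=$2
    # Let Subversion check the integrity of the new repository
    svnadmin verify -q "$new_repo" || return 1
    # Both repositories should end at the same revision number
    [ "$(svnlook youngest "$old_repo")" = "$(svnlook youngest "$new_repo")" ]
}

# Possible usage:
#   check_loaded_repo /var/svn/REPOSITORY /var/svn/REPOSITORY_NEW && echo "looks fine"
```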

Step 6: Replace the old with the new repository

chown -R www-data:www-data /var/svn/REPOSITORY_NEW
mv /var/svn/REPOSITORY /var/svn/REPOSITORY_OLD
mv /var/svn/REPOSITORY_NEW /var/svn/REPOSITORY

In the first line, I also changed the file owner and group to www-data to make the new repository accessible to Apache. If you do not use Apache (e.g. you use svnserve), skip that line or change the owner and group to suit your setup (check who owns the old repo with “ls -l /var/svn” before you change it in Step 1).

Step 7: Check it

Update your working copy; this should not change anything. But when you browse the history and try to view one of the files you removed, you will get an error saying that the file could not be found.
You might also want to make a fresh checkout and a commit to see whether everything still works as expected.
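The history check can also be done on the command line by searching the changed paths of all revisions. A rough sketch; `path_in_history` is a hypothetical name, and the URL and path in the usage example are placeholders. Note that it is a plain substring match, so a longer path starting with the same characters would match as well:

```shell
#!/bin/sh
# Sketch: check whether a path still shows up in the repository history.
# path_in_history is a hypothetical helper name.

path_in_history() {
    # $1 = repository URL
    # $2 = path as printed by "svn log -v" (with a leading slash)
    # -v lists the changed paths of each revision, -q omits the log messages
    svn log -v -q "$1" | grep -F -q -- " $2"
}

# Possible usage (placeholder URL and path):
#   if path_in_history file:///var/svn/REPOSITORY /path/of/file/to/remove; then
#       echo "still in the history" >&2
#   fi
```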

Step 8: Clean up

In case everything went well, you can delete a couple of things:

rm -R dumpfile newdumpfile /var/svn/REPOSITORY_OLD

 

Deleting old revisions

I also found a useful blog post on how to delete old revisions and only keep new ones. Some users might prefer this option if it is not a single file they want to get rid of but complete old revisions.

 

By the way, the CrazyStat SVN repository will be publicly available soon…

I hope somebody finds some of this useful.
