Migrate Subversion Repository to GitHub

If your Subversion repository is reachable from the internet and has a valid SSL certificate, this is easy: GitHub provides a simple import tool. My Subversion server has neither, so this is the fully offline, hard way to migrate a repository with its history.

Use Linux or macOS, as many of the commands below do not work on Windows at this time.

Clone the svn server contents with git

git svn clone https://svnserver/repo/ targetFolder
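
If you would rather map Subversion usernames to Git identities at clone time, git svn accepts an authors file through its --authors-file option. The sketch below builds such a file from svn log -q output. The helper name svn_log_to_authors and the login@domain email pattern are assumptions for illustration only, so review and edit the generated file by hand before using it.

```shell
# Hypothetical helper: turn `svn log -q` output (read from stdin) into
# git-svn authors-file lines of the form "login = Name <email>".
# The login@domain address is a placeholder guess, not real data.
svn_log_to_authors() {
  domain="${1:-example.com}"
  awk -F'|' -v domain="$domain" '
    /^r[0-9]+ \|/ {
      user = $2
      gsub(/^ +| +$/, "", user)   # trim the padding around the username
      print user " = " user " <" user "@" domain ">"
    }' | sort -u
}

# Usage (needs access to the svn server):
#   svn log -q https://svnserver/repo/ | svn_log_to_authors email.com > authors.txt
#   git svn clone --authors-file=authors.txt https://svnserver/repo/ targetFolder
```

This does not replace the committer fix further down; it only reduces how much there is left to fix afterwards.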

Change into targetFolder, then attach the new remote origin if you are converting into a new git repository

git remote add origin https://github.com/TargetOwner/newRepo.git

Use git filter-branch to fix up committers and filter out large files

Fix committers

git filter-branch --env-filter '
OLD_EMAIL="svnuser@email.com"
CORRECT_NAME="GitHubUserName"
CORRECT_EMAIL="user@email.com"
if [ "$GIT_COMMITTER_EMAIL" = "$OLD_EMAIL" ]
then
 export GIT_COMMITTER_NAME="$CORRECT_NAME"
 export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
fi
if [ "$GIT_AUTHOR_EMAIL" = "$OLD_EMAIL" ]
then
 export GIT_AUTHOR_NAME="$CORRECT_NAME"
 export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
fi
' --tag-name-filter cat -- --branches --tags
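
To decide which values to put in OLD_EMAIL, it helps to list every identity present in the converted history first. This is a minimal sketch to run inside the cloned repository. The function name list_identities is made up for this example.

```shell
# Print each distinct "Name <email>" that appears as author or committer
# on any local branch or tag (the same refs the filter-branch call rewrites).
list_identities() {
  git log --branches --tags --format='%an <%ae>%n%cn <%ce>' | sort -u
}
```

Running it again after the rewrite should show only the corrected identities. Old ones that remain need another pass with a different OLD_EMAIL.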

Find the 20 largest files across the history of the master branch

git rev-list master | while read rev; do git ls-tree -lr "$rev" | cut -c54- | sed -E 's/^ +//g;'; done | sort -u | perl -e 'while (<>) { chomp; @stuff=split("\\t");$sums{$stuff[1]} += $stuff[0];} print "$sums{$_} $_\n" for (keys %sums);' | sort -rn | head -n 20 > /tmp/largefiles.txt

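An alternative way to find large files, not the method used above, is to ask git for the size of every blob directly with git cat-file --batch-check. Unlike the pipeline above, this sketch reports the size of each individual blob rather than a per-path total, and it looks at all refs rather than only master. The function name largest_blobs is hypothetical.

```shell
# List the N largest blobs reachable from any ref as "size path" lines,
# largest first. N defaults to 20.
largest_blobs() {
  count="${1:-20}"
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    sed -n 's/^blob //p' |       # keep blobs only, drop the type column
    sort -rn |
    head -n "$count"
}

# Example: largest_blobs 20 > /tmp/largefiles.txt
```
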
Now filter out these large files from the commit history

git filter-branch --tree-filter 'rm -rf `cat /tmp/largefiles.txt | cut -d " " -f 2` ' --prune-empty
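
The backtick substitution above splits on whitespace, so a path containing spaces is broken into several arguments and not removed. This sketch of a line-by-line alternative assumes each line of the list is a size, one space, then the path, which is the format written to /tmp/largefiles.txt. The function name and the script path in the usage comment are hypothetical.

```shell
# Remove every path named in a "size path" list file, relative to the
# current directory. Paths may contain spaces.
remove_listed_paths() {
  while IFS= read -r line; do
    path="${line#* }"            # drop the size column up to the first space
    [ -n "$path" ] && rm -rf -- "$path"
  done < "$1"
  return 0
}

# Usage sketch: save the function in a file such as /tmp/remove_listed.sh, then
#   git filter-branch --tree-filter '. /tmp/remove_listed.sh; remove_listed_paths /tmp/largefiles.txt' --prune-empty
```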

Sometimes this still doesn't work and you have to remove the files from history by name

git filter-branch --tree-filter 'rm -rf Path/To/Big/file.txt' -f HEAD
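
To check that a path is really gone from the rewritten branches and tags, ask git log for commits that touch it. No output means no remaining commit changes that path. This is a minimal sketch, and path_in_history is a hypothetical name.

```shell
# Succeeds (exit 0) if any commit reachable from a local branch or tag
# still touches the given path; fails otherwise.
# Backup refs under refs/original are deliberately not searched.
path_in_history() {
  [ -n "$(git log --branches --tags --format=%H -- "$1")" ]
}

# Example: path_in_history Path/To/Big/file.txt && echo "still present"
```

Because the command above rewrites only HEAD, a hit here usually means another branch or a tag still carries the file.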

Now push the complete amended and cleaned history to GitHub

git push --force --tags origin 'refs/heads/*'