Migrate Subversion Repository to GitHub
If your Subversion repository is reachable from the internet and has a valid SSL certificate, migration is easy because GitHub provides a simple import tool. My Subversion server has neither, so this is the fully offline, hard way to migrate a repository with its history.
Use Linux or macOS; many of the script commands below do not work on Windows at this time.
Clone the svn server contents with git
git svn clone https://svnserver/repo/ targetFolder
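git svn can also map Subversion usernames to Git identities at clone time through its --authors-file option, which may reduce how much rewriting is needed later. A minimal sketch of building such a file; the usernames, the example.com addresses, and the /tmp paths are placeholders, not values from the real server:

```shell
# Hypothetical SVN usernames; on a real server they could be collected from `svn log --quiet`.
printf '%s\n' alice bob alice > /tmp/svn-users.txt

# Emit one "svnname = Full Name <email>" line per distinct user (the format git svn expects).
sort -u /tmp/svn-users.txt | sed 's/^\(.*\)$/\1 = \1 <\1@example.com>/' > /tmp/authors.txt
cat /tmp/authors.txt

# Assumed usage (needs access to the SVN server, so it is left commented out):
# git svn clone --authors-file=/tmp/authors.txt https://svnserver/repo/ targetFolder
```

The generated names and addresses would still need to be edited by hand to match each person's GitHub identity.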
From inside targetFolder, attach the new remote origin if you are converting into a new git repository
git remote add origin https://github.com/TargetOwner/newRepo.git
Use git filter-branch to fix up committers and filter out large files
Fix committers
git filter-branch --env-filter '
OLD_EMAIL="svnuser@email.com"
CORRECT_NAME="GitHubUserName"
CORRECT_EMAIL="user@email.com"
if [ "$GIT_COMMITTER_EMAIL" = "$OLD_EMAIL" ]
then
export GIT_COMMITTER_NAME="$CORRECT_NAME"
export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
fi
if [ "$GIT_AUTHOR_EMAIL" = "$OLD_EMAIL" ]
then
export GIT_AUTHOR_NAME="$CORRECT_NAME"
export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
fi
' --tag-name-filter cat -- --branches --tags
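After the rewrite it can help to list which identities remain in the history, so that any SVN user that was missed shows up. A small sketch of that check, run here against a throwaway repository (the /tmp path is a placeholder); in a real migration only the git log line would be run inside targetFolder:

```shell
# Throwaway repository with one commit made under the old SVN identity.
demo=/tmp/svn-migrate-identity-demo
rm -rf "$demo"
git init -q "$demo"
GIT_AUTHOR_NAME="svnuser" GIT_AUTHOR_EMAIL="svnuser@email.com" \
GIT_COMMITTER_NAME="svnuser" GIT_COMMITTER_EMAIL="svnuser@email.com" \
    git -C "$demo" commit -q --allow-empty -m "imported revision"

# List every distinct author and committer identity reachable from any branch or tag.
identities=$(git -C "$demo" log --branches --tags --format='%an <%ae>%n%cn <%ce>' | sort -u)
echo "$identities"
```

Any line that still shows an SVN address is a candidate for another OLD_EMAIL pass.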
Find the 20 largest files in the current master branch
git rev-list master | while read rev; do git ls-tree -lr $rev | cut -c54- | sed -r 's/^ +//g;'; done | sort -u | perl -e 'while (<>) { chomp; @stuff=split("\\t");$sums{$stuff[1]} += $stuff[0];} print "$sums{$_} $_\n" for (keys %sums);' | sort -rn | head -n 20 > /tmp/largefiles.txt
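A different way to get a similar list is to ask Git directly for the size of every blob, which avoids the fixed column offset in cut. This is a sketch of that alternative, not the pipeline above: it ranks individual blobs instead of summing all versions of a path, and the demo repository, file names, and output path are made up for illustration.

```shell
# Throwaway repository with one large and one small file.
demo=/tmp/svn-migrate-largefiles-demo
rm -rf "$demo"
git init -q "$demo"
git -C "$demo" config user.name demo
git -C "$demo" config user.email demo@example.com
head -c 200000 /dev/zero > "$demo/big.bin"
echo "small" > "$demo/small.txt"
git -C "$demo" add big.bin small.txt
git -C "$demo" commit -q -m "add files"

# Walk every object reachable from any ref, ask Git for its type and size,
# keep only blobs, and list the largest first as "size path".
# Note: awk's $3 keeps only the first word of a path that contains spaces.
git -C "$demo" rev-list --objects --all \
  | git -C "$demo" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' \
  | awk '$1 == "blob" { print $2, $3 }' \
  | sort -rn | head -n 20 > /tmp/largefiles-demo.txt
cat /tmp/largefiles-demo.txt
```

The output uses the same "size path" layout as /tmp/largefiles.txt, so it could feed the filtering step in the same way.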
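Now filter these large files out of the commit history. The -f flag overwrites the backup refs left by the earlier filter-branch run, which would otherwise make this one refuse to start. Paths containing spaces are not handled by this command.
git filter-branch -f --tree-filter 'rm -rf `cat /tmp/largefiles.txt | cut -d " " -f 2` ' --prune-empty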
Sometimes this still doesn't work and you have to remove the files from history one path at a time
git filter-branch --tree-filter 'rm -rf Path/To/Big/file.txt' -f HEAD
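git filter-branch also offers an --index-filter mode, which edits the index without checking out each revision and is usually faster than --tree-filter on large histories. The sketch below swaps in that mode on a throwaway repository and then checks the result; the repository path and file names are placeholders.

```shell
# Throwaway repository containing one file to keep and one to purge.
demo=/tmp/svn-migrate-filter-demo
rm -rf "$demo"
git init -q "$demo"
git -C "$demo" config user.name demo
git -C "$demo" config user.email demo@example.com
echo "code" > "$demo/keep.txt"
head -c 100000 /dev/zero > "$demo/big.bin"
git -C "$demo" add keep.txt big.bin
git -C "$demo" commit -q -m "add files"

# Drop big.bin from every commit reachable from HEAD; --ignore-unmatch keeps
# the filter from failing on commits that never contained the file.
(
  cd "$demo" || exit 1
  FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f \
    --index-filter 'git rm -q --cached --ignore-unmatch big.bin' \
    --prune-empty HEAD >/dev/null 2>&1
)

# Only keep.txt should remain in the rewritten tip commit.
files=$(git -C "$demo" ls-tree -r --name-only HEAD)
echo "$files"
```

The same ls-tree check works after a --tree-filter run, and is a quick way to confirm a path is really gone before pushing.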
Now push the complete amended and cleaned history to GitHub
git push --force --tags origin 'refs/heads/*'
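Because a forced push is hard to undo, it may be worth rehearsing it against a local bare repository first. A sketch under that assumption; the two /tmp paths and the v1 tag are invented for the rehearsal and stand in for targetFolder and the GitHub remote:

```shell
# Stand-in "remote" (bare) and a working repository with one commit and one tag.
work=/tmp/svn-migrate-push-demo
remote=/tmp/svn-migrate-push-demo.git
rm -rf "$work" "$remote"
git init -q --bare "$remote"
git init -q "$work"
git -C "$work" config user.name demo
git -C "$work" config user.email demo@example.com
git -C "$work" commit -q --allow-empty -m "rewritten history"
git -C "$work" tag v1
git -C "$work" remote add origin "$remote"

# Same push command as above, aimed at the local stand-in remote.
git -C "$work" push -q --force --tags origin 'refs/heads/*'

# The stand-in remote should now hold the same commit on the branch and the tag.
branch=$(git -C "$work" symbolic-ref --short HEAD)
local_tip=$(git -C "$work" rev-parse HEAD)
remote_tip=$(git -C "$remote" rev-parse "refs/heads/$branch")
remote_tag=$(git -C "$remote" rev-parse refs/tags/v1)
echo "$branch: local $local_tip, remote $remote_tip"
```

If the branch tips and tags match in the rehearsal, the same command can be pointed at the real origin.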