Thursday, July 30, 2015

How to Delete Big Files From Git History

The steps below are summarized from other articles on this topic:

  1. List every object in the repository together with its path, so you can find the big objects later and decide whether they are worth deleting
    $ git rev-list --objects --all | sort -k 2 > allfileshas.txt
  2. Get the latest object SHA for every committed file and sort them from biggest to smallest
    $ git gc && git verify-pack -v .git/objects/pack/pack-*.idx | grep -E "^\w+ blob\W+[0-9]+ [0-9]+ [0-9]+$" | sort -k 3 -n -r > bigobjects.txt
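  3. Iterate through each line of that result to get the SHA, the file size in bytes, and the real file name (this also needs the allfileshas.txt output file from step 1)
    #!/bin/sh
    # For each big blob, join its verify-pack line with its path from allfileshas.txt.
    for SHA in $(cut -f 1 -d ' ' < bigobjects.txt); do
        # After joining: $1 = SHA, $3 = size in bytes, $7 = file name (paths containing spaces are cut at the first space)
        echo $(grep "$SHA" bigobjects.txt) $(grep "$SHA" allfileshas.txt) | awk '{print $1,$3,$7}' >> bigtosmall.txt
    done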
  4. Use filter-branch to remove the file or directory from every commit (replace MY-BIG-DIRECTORY-OR-FILE with the path that you’d like to delete, relative to the root of the git repo)
    $ git filter-branch --prune-empty --index-filter 'git rm -rf --cached --ignore-unmatch MY-BIG-DIRECTORY-OR-FILE' --tag-name-filter cat -- --all
  5. Then clone the repo, making sure not to leave any hard links
    $ git clone --no-hardlinks file:///Users/yourUser/your/full/repo/path repo-clone-name
  6. Change the remote origin URL
    $ git remote remove origin
    $ git remote add origin YOUR-PROJECT-GIT-URL
  7. Force-push your local changes to overwrite your GitHub repository, including all the branches you've pushed up
    $ git push origin --force --all
  8. To remove the big file from your tagged releases, you'll also need to force-push your Git tags
    $ git push origin --force --tags
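
Steps 1 to 3 above can probably be collapsed into a single pipeline. The following is only a sketch, not the method from the steps themselves: it swaps `git verify-pack` for `git cat-file --batch-check`, which reports each object's type and size directly, so no intermediate text files are needed. The function name `list_big_blobs` and its two arguments (repository path and number of results) are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): print the N largest blobs reachable
# from any ref, one per line, as "size-in-bytes SHA path".
#   $1 = path to the repository (assumed default: current directory)
#   $2 = how many entries to print (assumed default: 20)
list_big_blobs() {
    repo=${1:-.}
    count=${2:-20}
    # rev-list prints "SHA path" for every reachable object; cat-file keeps
    # the path as %(rest) and prepends the object's type and size.
    git -C "$repo" rev-list --objects --all |
        git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        awk '$1 == "blob"' |      # keep file contents only, drop commits and trees
        cut -d ' ' -f 2- |        # remove the type column
        sort -k 1,1 -n -r |       # biggest first
        head -n "$count"
}
```

Calling `list_big_blobs . 10` from the root of the repo should print the ten biggest files ever committed, which gives you the candidates for MY-BIG-DIRECTORY-OR-FILE in step 4. Unlike the awk step above, the path is kept whole even when it contains spaces.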