Definitive retroactive .gitignore (how to make Git completely/retroactively forget about a file now in .gitignore)
Preface
This question attempts to clear the confusion regarding applying .gitignore retroactively, not just to the present/future.1
Rationale
I've been searching for a way to make my current .gitignore be retroactively enforced, as if I had created .gitignore in the first commit.
The solution I am seeking:
- Will not require manually specifying files
- Will not require a commit
- Will apply retroactively to all commits of all branches
- Will ignore .gitignore-specified files in working dir, not delete them (just like an originally root-committed .gitignore file would)
- Will use git, not BFG
- Will apply to .gitignore exceptions like:
*.ext
!*special.ext
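To make the exception requirement concrete, here is a minimal sketch that can be run in a throwaway directory. The file names (plain.ext, myspecial.ext) are made up for illustration; the only Git feature used is git ls-files listing untracked files that the ignore rules match.

```shell
set -eu
sandbox=$(mktemp -d)      # throwaway directory; file names are hypothetical
git init -q "$sandbox/repo"
cd "$sandbox/repo"

# The two patterns from the question: ignore *.ext, except *special.ext
printf '%s\n' '*.ext' '!*special.ext' > .gitignore
touch plain.ext myspecial.ext

# List untracked files that the ignore rules match
ignored=$(git ls-files --others --ignored --exclude-standard)
echo "$ignored"
```

Only plain.ext should be listed; myspecial.ext is rescued by the negated pattern. A retroactive solution has to honor that distinction in every commit.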
Not solutions
git rm --cached *.ext
git commit
This requires 1. manually specifying files and 2. an additional commit, which will delete the newly-ignored files for other developers when they pull. (It is effectively just a git rm, which removes the file from Git tracking, except that it leaves the file alone in your local working directory. Others who git pull afterwards will receive the file-deletion commit.)
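The deletion-on-pull problem can be reproduced in a sandbox. This is only a sketch: the repository names (upstream, colleague) and the file build.ext are hypothetical, and the identity variables are set just so the commits succeed on a machine without Git configured.

```shell
set -eu
sandbox=$(mktemp -d)      # throwaway repositories; all names are hypothetical
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$sandbox/upstream"
cd "$sandbox/upstream"
echo data > build.ext
git add build.ext
git commit -q -m "Track build.ext by mistake"

# A second developer clones while the file is still tracked
git clone -q "$sandbox/upstream" "$sandbox/colleague"

# The approach described above: untrack the file and commit
echo '*.ext' > .gitignore
git rm -q --cached build.ext
git add .gitignore
git commit -q -m "Stop tracking build.ext"

# The colleague pulls that commit
git -C "$sandbox/colleague" pull -q --ff-only

[ -f "$sandbox/upstream/build.ext" ] && echo "upstream: build.ext still on disk"
[ -f "$sandbox/colleague/build.ext" ] || echo "colleague: build.ext was deleted by the pull"
```

The file survives where git rm --cached was run, but the pull removes it from the other working directory, which is exactly what the question wants to avoid.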
git filter-branch --index-filter 'git rm --cached *.ext'
While this does purge files retroactively, it 1. requires manually specifying files and 2. deletes the specified files from the local working directory, just like plain git rm (and so also for others who git pull)!
Footnotes
1 There are many similar posts here on SO, with less-than-specifically-defined questions and even more answers that are less than accurate. See this question with 23 answers where the accepted answer with ~4k votes is incorrect according to the standard definition of "forget" as noted by one mostly-correct answer, and only 2 answers include the required git filter-branch command.
This question with 21 answers was marked as a duplicate of the previous one, but the question is defined differently (ignore vs forget), so while the answers may be appropriate, it is not a duplicate.
This question is the closest I've found to what I'm looking for, but the answers don't work in all cases (paths with spaces...) and perhaps are a bit more complex than necessary regarding creating an external-to-repository .gitignore file and copying it into every commit.
Solution 1:[1]
This may be only a partial answer but here is how I accomplished retroactively removing files from previous git commits based on my current .gitignore file:
- Make a backup of the repo folder you are working on. I just made a .7z archive of the entire folder.
- Install git-filter-repo
- Copy your .gitignore file somewhere else temporarily. Since I'm on Windows and using Command Prompt, I ran the following, which puts the temp copy one directory level up:
copy .gitignore ..\
- If your .gitignore file has wildcard filters (like nbproject/Makefile-*), you'll need to edit your temp copied .gitignore file so those lines read glob:nbproject/Makefile-*
- Run:
git filter-repo --invert-paths --paths-from-file ..\.gitignore
My understanding is that this uses the temp copy as a list of files/directories to remove. Note: if you receive an error regarding your repo not being a clean clone, search for "FRESH CLONE SAFETY CHECK AND --FORCE" in the git-filter-repo help. Be careful.
For more info see: git-filter-repo help (Search for "Filtering based on many paths")
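The manual edit of the temp copy could be scripted. The sketch below is an assumption-heavy approximation, not part of git-filter-repo: gitignore_to_paths is a hypothetical helper, it assumes a POSIX shell with awk (for example Git Bash on Windows), and it only prefixes wildcard lines with glob:, strips a leading slash, and drops comments, blank lines and negated (!) patterns. .gitignore matching rules differ from filter-repo's path matching (for instance, a plain name in .gitignore matches at any depth), so the output should be reviewed by hand before it is used.

```shell
set -eu
# Hypothetical helper: approximate a .gitignore as a git-filter-repo paths file
gitignore_to_paths() {
  awk '
    /^[[:space:]]*$/ { next }                      # skip blank lines
    /^#/             { next }                      # skip comments
    /^!/ { print "skipped negation: " $0 > "/dev/stderr"; next }
    { sub(/^\//, "") }                             # drop a leading slash
    /[*?]/ || /\[/ { print "glob:" $0; next }      # wildcard lines get glob:
    { print }                                      # everything else stays literal
  ' "$1"
}

# Example input; these patterns are made up for illustration
sample=$(mktemp)
printf '%s\n' '# build output' 'nbproject/Makefile-*' '/dist/' '!keep.me' 'notes.tmp' > "$sample"
paths=$(gitignore_to_paths "$sample")
echo "$paths"
```

Negated patterns are reported on stderr and left out, because a removal list cannot express "keep this one"; any file such a pattern protects would need separate handling.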
Disclaimer: I have no idea what I'm doing but this worked for me.
Solution 2:[2]
EDIT: I've recently found git-filter-repo. It may be a better choice. Perhaps a good idea to investigate the rationale and filter-branch gotchas for yourself, but they wouldn't have affected my use-case below.
This method makes Git completely forget ignored files (past/present/future), but does not delete anything from working directory (even when re-pulled from remote).
This method requires usage of /.git/info/exclude (preferred) OR a pre-existing .gitignore in all the commits that have files to be ignored/forgotten. 1
This method avoids removing the newly-ignored files from other developers' machines on the next git pull. 2
All methods of enforcing Git ignore behavior after-the-fact effectively re-write history and thus have significant ramifications for any public/shared/collaborative repos that might be pulled after this process. 3
General advice: start with a clean repo - everything committed, nothing pending in working directory or index, and make a backup!
Also, the comments/revision history of this answer (and revision history of this question) may be useful/enlightening.
#commit up-to-date .gitignore (if not already existing)
#these commands must be run on each branch
#these commands are not strictly necessary if you don't want/need a .gitignore file. .git/info/exclude can be used instead
git add .gitignore
git commit -m "Create .gitignore"
#apply standard git ignore behavior only to current index, not working directory (--cached)
#if this command returns nothing, ensure /.git/info/exclude AND/OR .gitignore exist
#this command must be run on each branch
#if using .git/info/exclude, it will need to be modified per branch run, if the branches have differing (per-branch) .gitignore requirements.
#-c (--cached) is spelled out because newer Git versions refuse --ignored without -c or -o
git ls-files -z -c --ignored --exclude-standard | xargs -r0 git rm --cached
#Commit to prevent working directory data loss!
#this commit will be automatically deleted by the --prune-empty flag in the following command
#this command must be run on each branch
#optionally use the --amend flag to merge this commit with the previous one instead of creating 2 commits.
git commit -m "ignored index"
#Apply standard git ignore behavior RETROACTIVELY to all commits from all branches (--all)
#This step WILL delete ignored files from working directory UNLESS they have been dereferenced from the index by the commit above
#This step will also delete any "empty" commits. If deliberate "empty" commits should be kept, remove --prune-empty and instead run git reset HEAD^ immediately after this command
#-c (--cached) is required alongside --ignored on newer Git versions
git filter-branch --tree-filter 'git ls-files -z -c --ignored --exclude-standard | xargs -r0 git rm -f --ignore-unmatch' --prune-empty --tag-name-filter cat -- --all
#List all still-existing files that are now ignored properly
#if this command returns nothing, it's time to restore from backup and start over
#this command must be run on each branch
git ls-files --other --ignored --exclude-standard
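Before touching a real repository, the index-only part of the steps above can be tried in a sandbox. This is a sketch under stated assumptions: the repository and file names are hypothetical, the rule lives in .git/info/exclude, and the history-rewriting git filter-branch step is deliberately left out.

```shell
set -eu
sandbox=$(mktemp -d)      # throwaway repository; file names are hypothetical
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$sandbox/repo"
cd "$sandbox/repo"
echo keep > keep.txt
echo junk > junk.ext
git add keep.txt junk.ext
git commit -q -m "Track everything"

# Ignore rule added after the fact, kept out of history via .git/info/exclude
mkdir -p .git/info
echo '*.ext' >> .git/info/exclude

# Dry run: tracked files that the ignore rules now match
tracked_ignored=$(git ls-files -c --ignored --exclude-standard)
echo "tracked but ignored: $tracked_ignored"

# Drop them from the index only; the working directory is left alone
git ls-files -z -c --ignored --exclude-standard | xargs -r0 git rm -q --cached

# After un-tracking, the same file is reported as an ignored untracked file
untracked_ignored=$(git ls-files --others --ignored --exclude-standard)
echo "untracked and ignored: $untracked_ignored"
```

Running the first ls-files command on its own is a cheap way to preview what would be un-tracked.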
Finally, follow the rest of this GitHub guide (starting at step 6) which includes important warnings/information about the commands below.
git push origin --force --all
git push origin --force --tags
git for-each-ref --format="delete %(refname)" refs/original | git update-ref --stdin
git reflog expire --expire=now --all
git gc --prune=now
Other devs that pull from now-modified remote repo should make a backup and then:
#fetch modified remote
git fetch --all
#"Pull" changes WITHOUT deleting newly-ignored files from working directory
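#This moves the current branch and index to the fetched commit but leaves working-directory files as they are; local commits not on the remote are dropped from the branch, so ensure any local work is backed-up/stashed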
git reset FETCH_HEAD
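The effect of fetch plus reset on a collaborator's copy can be checked in a sandbox. In this sketch the history rewrite is simulated by amending the only commit, which is an assumption made to keep the example short; the repository names (upstream, colleague) and file names are hypothetical.

```shell
set -eu
sandbox=$(mktemp -d)      # throwaway repositories; all names are hypothetical
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$sandbox/upstream"
cd "$sandbox/upstream"
echo keep > keep.txt
echo junk > junk.ext
git add keep.txt junk.ext
git commit -q -m "Initial commit"

git clone -q "$sandbox/upstream" "$sandbox/colleague"

# Stand-in for the history rewrite: amend the commit so it never contained junk.ext
git rm -q --cached junk.ext
git commit -q --amend -m "Initial commit"

# Collaborator side: fetch the rewritten history, then reset instead of pulling
cd "$sandbox/colleague"
git fetch -q origin
git reset -q FETCH_HEAD

tracked=$(git ls-files)
echo "tracked after reset: $tracked"
[ -f junk.ext ] && echo "junk.ext is still on disk"
```

Because the reset only moves the branch and the index, junk.ext stays in the collaborator's working directory as an untracked file.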
Footnotes
1 Because /.git/info/exclude can be applied to all historical commits using the instructions above, perhaps details about getting a .gitignore file into the historical commit(s) that need it are beyond the scope of this answer. I wanted a proper .gitignore to be in the root commit, as if it was the first thing I did. Others may not care since /.git/info/exclude can accomplish the same thing regardless of where the .gitignore exists in the commit history, and clearly re-writing history is a very touchy subject, even when aware of the ramifications.
FWIW, potential methods may include git rebase or a git filter-branch that copies an external .gitignore into each commit, like the answers to this question.
2 Enforcing git ignore behavior after-the-fact by committing the results of a standalone git rm --cached command may result in newly-ignored file deletion in future pulls from the force-pushed remote. The --prune-empty flag in the git filter-branch command (or git reset HEAD^ afterwards) avoids this problem by automatically removing the previous "delete all ignored files" index-only commit.
3 Re-writing git history also changes commit hashes, which will wreak havoc on future pulls from public/shared/collaborative repos. Please understand the ramifications fully before doing this to such a repo. This GitHub guide specifies the following:
Tell your collaborators to rebase, not merge, any branches they created off of your old (tainted) repository history. One merge commit could reintroduce some or all of the tainted history that you just went to the trouble of purging.
Alternative solutions that do not affect the remote repo are git update-index --assume-unchanged </path/file> or git update-index --skip-worktree <file>, examples of which can be found here.
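As a rough illustration of the local-only alternative, the sketch below marks a tracked file with --skip-worktree in a throwaway repository. The file name settings.ini is hypothetical. Note that this only hides local edits from Git on one machine; it does not remove anything from history.

```shell
set -eu
sandbox=$(mktemp -d)      # throwaway repository; file names are hypothetical
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$sandbox/repo"
cd "$sandbox/repo"
echo "mode=default" > settings.ini
git add settings.ini
git commit -q -m "Add settings.ini"

# Ask Git to stop looking at local changes to this tracked file
git update-index --skip-worktree settings.ini
echo "mode=local" > settings.ini

status=$(git status --porcelain)
flags=$(git ls-files -v settings.ini)
echo "status output: '$status'"
echo "ls-files -v:   $flags"

# Undo with: git update-index --no-skip-worktree settings.ini
```

The status output should be empty even though the file was edited, and git ls-files -v marks the entry with S so the flag can be found again later.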
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | wreckfix |
Solution 2 |