Git pre-merge-commit hook: How do I ignore a file during a merge?

Context

I'm working in a complex Git flow where certain branches carry their own submodules and config files. These files must be committed, but they must never be merged.

There are only a few such files, but it is too risky to rely on everyone remembering to leave them out when merging branches.

To automate this, I worked on pre-merge-commit hooks, both server-side and local.

When there is a conflict, I use .gitattributes and .git/config to resolve it with a custom merge driver. It works like a charm.
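The question does not show this configuration. As a rough sketch of what a conflict-time setup of this kind could look like, the demo below builds a throwaway repository and registers a merge driver under the made-up name keep-ours. Using true as the driver command is an assumption: it simply keeps the current branch's version. The commit messages and the contentA2 value are illustrative only.

```shell
#!/bin/sh
# Sketch: a custom merge driver that keeps the current branch's copy
# of a protected file whenever that file needs a content-level merge.
set -e

repo=$(mktemp -d)                  # throwaway repository
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/A_current
git config user.name demo
git config user.email demo@example.com

# .gitattributes routes the file to a driver named "keep-ours" (made-up name).
echo 'do_not_merge_me merge=keep-ours' > .gitattributes
# The driver definition lives in .git/config. "true" exits 0 and leaves
# the current branch's version untouched, so that version is the result.
git config merge.keep-ours.name 'always keep the current branch version'
git config merge.keep-ours.driver true

echo contentA > do_not_merge_me
git add .gitattributes do_not_merge_me
git commit -q -m 'base'

git checkout -q -b B_incoming
echo contentB > do_not_merge_me
git commit -q -am 'change on B_incoming'

git checkout -q A_current
echo contentA2 > do_not_merge_me   # diverge on our side too, so the driver runs
git commit -q -am 'change on A_current'

git merge -q --no-edit B_incoming
cat do_not_merge_me                # our side's content survives the merge
```

The driver is only consulted when the file changed on both sides, which is exactly the limitation described in the Problem section.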

Problem

However, I'm struggling to make it work when there is no conflict. In this case, the merge is carried out successfully and my pre-merge-commit hook is triggered. It does its magic and then exits successfully. But for some reason, Git redoes some merging after the hook, which makes the hook useless. Here is the behavior I'm seeing:

Before the merge

I have two branches, let's say A_current and B_incoming.

Both contain a file I don't want to be merged, called do_not_merge_me. At some point, the content of do_not_merge_me changed in B_incoming. Let's say it went from contentA to contentB.

During the merge

Here is what I see when merging B_incoming into A_current:

  • The merge proceeds and adds files to the staging area, including do_not_merge_me.
  • The merge succeeds, so it triggers my hook.
  • My hook finds do_not_merge_me in the staging area and removes it from there (in the end, it's a git reset do_not_merge_me followed by a git checkout do_not_merge_me).
  • My hook exits cleanly, and do_not_merge_me is no longer in the staging area.
  • Git does some dark magic: it redoes a merge and re-stages do_not_merge_me.
  • Git finalizes the commit, and I see this in my console:
Merge made by the 'recursive' strategy.
 do_not_merge_me               | 2 +-

  • Weirdly, once the merge is done, I have the correct versions of the files in my staging area (I had never seen anything left in the staging area after a merge before this).

Question

The Git documentation, available at https://git-scm.com/docs/githooks#_pre_merge_commit, states that the pre-merge-commit hook is invoked after the merge has been carried out successfully and before the commit is made.

My questions are:

  1. Why do I get the correct version in the staging area?
  2. Is there any way to achieve what I'm trying to do?
  3. Why is Git doing some merging after the hook? Is it a bug?


Solution 1:[1]

The short answer is that you can't.

When git merge runs, it reads three commits into Git's index. These three commits are:

  • the merge base (in slot 1);
  • the --ours commit (in slot 2); and
  • the --theirs commit (in slot 3).

These are stored in the usual index format: a path name including slashes, a mode (100644 or 100755 for regular files, 120000 for symbolic links, and 160000 for gitlinks), and a hash ID.

The first part of the merge then compares the modes to make sure those are suitable (if not, this is a merge conflict). Assuming normal files and suitable modes here, it goes on to compare the hash IDs:

  • all three equal? file is successfully merged, drop to slot 0, erase slots 1-3
  • two equal? take the third one: drop to slot 0, erase slots 1-3
  • all three unequal? leave for later, for the real merge code.

There are a few more special cases (e.g., file exists in merge base and theirs/ours, but deleted in ours/theirs) that are also handled directly in the index, I think, but your particular case—file modified in theirs, but identical in ours and base—hits the middle "two equal? take third" case: the file is the same in your commit and the merge base, so Git just assumes that their updated file is the correct result.
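To make the three cases concrete, here is a small sketch of that decision in shell. The function name resolve_trivial is hypothetical, and its arguments stand in for the blob hash IDs held in slots 1 to 3. Real Git does this internally on index entries, so treat it as an illustration of the rule, not as Git's actual code.

```shell
#!/bin/sh
# Hypothetical helper: mimics the early hash-only pass described above.
# Arguments: base, ours, theirs (placeholders for blob hash IDs).
# Prints the hash that would drop to slot 0, or "conflict" when all
# three differ and the file is left for the real merge code.
resolve_trivial() {
    base=$1
    ours=$2
    theirs=$3
    if [ "$ours" = "$theirs" ]; then
        echo "$ours"        # both sides agree (this covers "all three equal")
    elif [ "$base" = "$ours" ]; then
        echo "$theirs"      # only theirs changed: take theirs
    elif [ "$base" = "$theirs" ]; then
        echo "$ours"        # only ours changed: keep ours
    else
        echo "conflict"     # all three unequal: needs a content merge
    fi
}

# The situation from the question: unchanged in ours, modified in theirs.
resolve_trivial contentA contentA contentB   # prints contentB
```

No file content is examined at this stage, which is why a merge driver never gets a say in this case.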

When Git does this in the early pass, it never runs your merge driver at all. The file goes to staging slot zero—"ready to be committed"—rather than conflicted and you never get a chance to do anything. Your pre-merge-commit will get invoked, but the copy of the file in the index will be the one from the theirs commit.

We now get into the seriously dark magic part: "the index" assumes that there's a single index (.git/index) that is always used. This isn't really the case: it's mostly true, but:

  • $GIT_INDEX_FILE overrides the name;
  • added work-trees (from git worktree add) have their own index; and
  • various Git commands read the index into memory and then work with that.

In this case, it looks like git merge has the index in-memory and just uses it as is to make the new commit. Your hook replaces the stage-zero copy in the .git/index file, but git merge does not notice this, and goes on to produce the new merge commit using the incoming copy that was there before it even ran your pre-merge-commit hook.

Assuming this is all true (and it may change from one Git version to another, depending on when and whether Git does any re-reading of the index), this would answer your question #1, make the answer to #2 "no", and make the answer to #3 "you're trying to do something outside the range of what Git handles".

What you want to do is not inherently unreasonable, but Git just doesn't support it.

Solution 2:[2]

For anyone who needs to apply changes during a merge, here is the solution I came up with.

Keep in mind that this solution can cause issues in some corner cases, as pointed out by @torek in a comment.

Most of the time, you should avoid making modifications during a merge. Prefer verification.
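As one possible shape for the verification approach, the sketch below installs a pre-merge-commit hook that refuses the merge when the staged result would change a protected file. It runs in a throwaway repository, and the hook's message and its list of protected paths are assumptions. It relies on the documented behavior that a non-zero exit from pre-merge-commit aborts git merge before a commit is created (the hook needs Git 2.24 or later).

```shell
#!/bin/sh
# Sketch: verify instead of modify. The hook rejects any merge whose
# staged result differs from HEAD for a protected path.
set -e

repo=$(mktemp -d)                  # throwaway repository
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/A_current
git config user.name demo
git config user.email demo@example.com

echo contentA > do_not_merge_me
git add do_not_merge_me
git commit -q -m 'base'

# Install the hook (the list of protected paths is an assumption).
cat > .git/hooks/pre-merge-commit <<'EOF'
#!/bin/sh
for path in do_not_merge_me; do
    # Compare the staged merge result with the current branch tip.
    if ! git diff --cached --quiet HEAD -- "$path"; then
        echo "pre-merge-commit: merge would change protected file: $path" >&2
        exit 1
    fi
done
EOF
chmod +x .git/hooks/pre-merge-commit

git checkout -q -b B_incoming
echo contentB > do_not_merge_me
git commit -q -am 'change protected file'

git checkout -q A_current
echo other > unrelated             # diverge, so this is not a fast-forward
git add unrelated
git commit -q -m 'unrelated change'

if git merge --no-edit B_incoming; then
    merge_status=merged
else
    merge_status=rejected
fi
echo "$merge_status"
```

After a refusal the merge is left in progress, so git merge --abort returns the branch to its pre-merge state.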

These steps work well for me with my version of Git (2.31.1). I don't know whether this behavior is consistent across versions.

  1. Implement a custom merge driver, configured through .gitattributes, for the files you need to modify. It must apply those modifications. This does the same job as step 3, but for the case where the targeted files conflict.

  2. Implement a pre-merge-commit hook. This will be triggered after conflicts are resolved. This means you will have access to a staging area that mirrors the merge result.

  3. Modify the staging area: from your pre-merge-commit hook, you can modify the staging area. This won't actually change the merge outcome (which is stored somewhere else). Instead, when your script exits successfully, your modifications are left in the staging area. This is the first time I have seen something left in the staging area after a merge.

Note: Git seems to behave this way because it initiates a second merge.

  4. Finally, implement a post-merge hook that amends the merge commit with the actual staging area content. You need to delete the file .git/MERGE_HEAD before doing so.
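Steps 2 to 4 could be wired together roughly as follows. This is only a sketch in a throwaway repository: the hook bodies are one reading of the steps above, not the answer author's actual scripts, and the behavior they depend on was only reported for Git 2.31.1. The feature.txt file is a made-up stand-in for the ordinary changes that should still be merged.

```shell
#!/bin/sh
# Sketch: restore the protected file in pre-merge-commit, then amend
# the merge commit from post-merge with what was left staged.
set -e

repo=$(mktemp -d)                  # throwaway repository
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/A_current
git config user.name demo
git config user.email demo@example.com

echo contentA > do_not_merge_me
git add do_not_merge_me
git commit -q -m 'base'

# Steps 2 and 3: put the current branch's version back in the index
# and in the work tree.
cat > .git/hooks/pre-merge-commit <<'EOF'
#!/bin/sh
git reset -q -- do_not_merge_me
git checkout -- do_not_merge_me
EOF

# Step 4: if something is still staged after the merge, fold it into
# the merge commit. MERGE_HEAD must go first or --amend is refused.
cat > .git/hooks/post-merge <<'EOF'
#!/bin/sh
git diff --cached --quiet && exit 0
rm -f "$(git rev-parse --git-dir)/MERGE_HEAD"
git commit -q --amend --no-edit
EOF
chmod +x .git/hooks/pre-merge-commit .git/hooks/post-merge

git checkout -q -b B_incoming
echo contentB > do_not_merge_me
echo feature > feature.txt
git add feature.txt
git commit -q -am 'feature plus protected change'

git checkout -q A_current
echo other > unrelated             # diverge, so this is not a fast-forward
git add unrelated
git commit -q -m 'unrelated change'

git merge --no-edit B_incoming
git show HEAD:do_not_merge_me      # content recorded in the final merge commit
```

Because the amend rewrites the merge commit after the fact, the corner cases mentioned above still apply, so it is worth checking the result on the Git version you actually run.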

Solution 3:[3]

I ran into this need myself, and I managed to solve it with a Git alias.

I published it on this repository, so you can use it too.

You are very welcome to add your own ideas to it for future updates and contribute.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution     Source
Solution 1   torek
Solution 2   NicolasDg
Solution 3   Tal Jacob - Sir Jacques