How to automatically create fixup commits

Save time when implementing PR fixes with this neat script

Let’s say you’ve submitted a pull request with nice atomic commits and your colleagues have found a few improvements. They are easy to implement, but if you want a nice history, you now have to figure out to which commit you should add them. Usually that’s just the last commit where the file was changed. But it still takes a while to find that — or you let this script do the work.

Made by the author using [carbon.sh](https://cdn.hashnode.com/res/hashnode/image/upload/v1633809858519/2jm_n2wSf.html)

Usage

Before showing you the script, I first want to show you what you can do with it.

By default this script automatically figures out the commit a changed file was last modified in and creates fixup commits for them. There will only be one fixup commit for one original commit, even if multiple changed files were modified in the same one.

You should be able to squash all of the generated commits without a merge conflict (using git rebase -i).
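As a rough illustration of that squashing step, here is a small sketch that builds a throwaway repository, creates one fixup commit by hand, and folds it in with git rebase -i --autosquash. The tag name base, the file names, and the temporary directory are made up for this demo, and GIT_SEQUENCE_EDITOR=: is only there so the demo runs without opening an editor. In day-to-day use you would probably just run git rebase -i --autosquash against your main branch and review the todo list yourself.

```bash
#!/bin/bash
# Throwaway repository; all names and contents are made up for this demo.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > base.txt; git add .; git commit -q -m "initial commit"
git tag base
echo a > A; git add .; git commit -q -m "A"
echo b > B; git add .; git commit -q -m "B"

# A fixup commit for "A", like the ones the script generates
echo change > A; git add A
git commit -q --fixup "$(git log -1 --format=%H -- A)"

# ":" as sequence editor accepts the proposed todo list without opening an editor
GIT_SEQUENCE_EDITOR=: git rebase -q -i --autosquash base

# The fixup commit is gone; only the original commits remain
subjects=$(git log --format=%s)
echo "$subjects"
```

After the rebase, the log should contain only the original three commits, with the change to A folded into commit "A".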

Additionally, you can also tweak its behavior a bit with some options:

  • The plain text parameter “root”: This script will only look at commits starting from (but excluding) the root commit. This can be used to ensure that no fixup commits will be created for commits that aren’t on the current branch. Files that were last changed in or before this commit will simply be left uncommitted.

  • -n | --dry-run: If dry run is specified no lasting changes are made. Instead it will just output which files it would assign to which commits.

  • -d <default-root> | --default-root <default-root>: Specifies the default root. If both the default root and root are specified, only root is considered. This is meant for a git alias, where you can specify main or master as default and still override it when using the alias.

  • -M[<n>] | --find-renames[=<n>]: If specified, will attempt to also assign renames. The format is the same as for git diff.

For example, let’s look at the following command:

assign -n -d main -M master

This makes the script list the commit for every changed file instead of creating commits (thanks to -n), including renamed files (due to -M). Files whose last change was made in or before the commit that master currently points to are not listed. With just the -d option the root would be main, but the trailing root parameter overrides it.

See the next section for an example output.

Example

To test this, I also made a small script that creates a git repository and adds some commits to it.

#!/bin/sh

rm -rf repo
mkdir repo
cd repo

git init

echo "* text eol=lf" > .gitattributes
git add .
git commit -m "initial commit"
git checkout -b "feature-branch"

echo a > A
git add .
git commit -m "A"

echo b > B
git add .
git commit -m "B"

echo with spaces > "file with spaces"
git add .
git commit -m "file with spaces"

echo aAndC >> B
echo c > C
git add .
git commit -m "B and C"

echo change > A
echo change > B
echo change > C
mv "file with spaces" "other file with spaces"

This will create a few commits and some uncommitted changes:

Screenshot of the [git for windows](https://cdn.hashnode.com/res/hashnode/image/upload/v1633809862416/yFLBsSJqP.html) bash made by the author

If I now execute assign -M -d main -n in that repository, the output will look like this:

Screenshot of the [git for windows](https://gitforwindows.org/) bash made by the author

Note that in this screenshot, I already took advantage of a git alias that I’ll show you in the end.

The code

Okay, enough explanation, here is the code:

#!/bin/bash

# Usage information:
#   assign [-n | --dry-run] [(-d | --default-root) <default-root>] [-M[<n>] | --find-renames[=<n>]] [<root>]
# 
#   -n | --dry-run:
#       If dry run is specified no lasting changes are made.
#       Instead it will just output which files it would assign to which commits
#   -d | --default-root:
#       Specifies the default root.
#       If both the default root and root are specified, only root is considered.
#       This is meant for a git alias, where you can specify 'main' or 'master' as default
#       and still overwrite it when using the alias.
#   -M[<n>] | --find-renames[=<n>]:
#       If specified, will attempt to also assign renames.
#       The format is the same as for git diff.
#       (see https://git-scm.com/docs/git-diff#Documentation/git-diff.txt--Mltngt)
#   root:
#       This script will only look at commits starting from (but excluding) the root commit.
#       This can be used to ensure that no fixup commits will be created for commits that
#       aren't on the current branch.
#       Files that were last changed in or before this commit will simply be left uncommitted.

# Default options
DRY_RUN=false
DEFAULT_ROOT=""
ROOT=""
FIND_RENAMES=()

# Read potential options
while [ "$1" != "" ];
do
    case $1 in
        -n | --dry-run)
            DRY_RUN=true
            ;;
        -d | --default-root)
            shift
            DEFAULT_ROOT="$1"
            ;;
        -M* | --find-renames*)
            FIND_RENAMES=("$1")
            ;;
        -*)
            printf "unrecognised option: %s\n" "$1" >&2
            exit 1
            ;;
        *)
            if [[ $ROOT == "" ]];
            then
                ROOT="$1"
            else
                printf "Cannot specify root twice (specified both '%s' and '%s')\n" "${ROOT}" "$1" >&2
                exit 1
            fi
            ;;
    esac
    shift
done

if [[ $DEFAULT_ROOT != "" && $ROOT == "" ]];
then
    ROOT="${DEFAULT_ROOT}"
fi

# For each changed file find the commit in which it was last changed.
# lastCommits is an associative array with the commit hash as key and
# a newline separated list of the file names as value.
declare -A lastCommits
IFS=$'\n'; for file in $(git diff --name-only);
do
    if [[ $ROOT != "" ]];
    then
        commit=$(git log -1 --format="format:%H" "${ROOT}.." -- "${file}")

        # When a root is specified, the file may not have been changed after it.
        # We simply ignore that file then.
        if [[ $commit != "" ]];
        then
            lastCommits[$commit]+="$file"$'\n'
        fi
    else
        commit=$(git log -1 --format="format:%H" -- "${file}")
        lastCommits["$commit"]+="$file"$'\n'
    fi
done

declare -A renamedFiles=()
if (( ${#FIND_RENAMES[@]} ));
then
    # first add all untracked (but not ignored) files individually as intent-to-add
    intentToAdd=$(git ls-files -o --exclude-standard)
    for file in $intentToAdd;
    do
        git add -N "$file"
    done

    # get all renames and save them into an associative array
    while IFS=$'\t' read -r origFile newFile;
    do
        renamedFiles["$origFile"]="$newFile"
    done < <(git diff --name-status --diff-filter=R "${FIND_RENAMES[@]}" | cut -f 2,3)

    # remove intent-to-add
    for file in $intentToAdd;
    do
        git reset -q -- "$file"
    done
fi

# For each commit in lastCommits, we now add all of the associated
# files and then create a fixup commit for the respective commit

add_file() {
    if [[ $DRY_RUN == false ]];
    then
        git add "$1"
    else
        printf "\t%s\n" "$1"
    fi
}


IFS=$'\n'
for commit in "${!lastCommits[@]}";
do
    if [[ $DRY_RUN == true ]];
    then
        printf "%s\n" "$(git log -1 --color --format="format:%C(auto)%h%d%Creset: (%aN) %s" "${commit}")"
    fi

    for file in ${lastCommits[$commit]};
    do
        add_file "$file"
        if [[ -v "renamedFiles[${file}]" ]];
        then
            add_file "${renamedFiles[${file}]}"
        fi
    done

    if [[ $DRY_RUN == false ]];
    then
        git commit --fixup "${commit}"
    fi
done
unset IFS

It’s a lot, but luckily you can just copy-paste it into a file on your computer. And instead of always calling the file directly, I’d recommend making a git alias for it.

I did this by adding the following to my ~/.gitconfig file:

[alias]
 a = assign
 assign = !bash -c '~/.gitscripts/assign -M -d "$(git merge-base HEAD main)" "$@"' -
 assign-all = !bash -c '~/.gitscripts/assign -M "$@"' -

Of course you might want to change the default options for your aliases a bit (e.g. changing main to master), but I find that those are the most sensible defaults for me. Just don’t forget to replace the path to your script (in my case I saved it as ~/.gitscripts/assign).
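The trailing - in these aliases is easy to overlook. With bash -c, the first argument after the command string becomes $0, and only the ones after it end up in "$@". The - is therefore just a placeholder that keeps the first real argument from being swallowed when git hands the alias arguments on to the command. A quick way to see the effect, using made-up arguments:

```bash
#!/bin/bash
# With the placeholder: "-" becomes $0, both arguments reach "$@".
with_placeholder=$(bash -c 'printf "%s|" "$@"' - main "other root")

# Without it: the first argument is consumed as $0 and is lost.
without_placeholder=$(bash -c 'printf "%s|" "$@"' main "other root")

echo "with:    ${with_placeholder}"
echo "without: ${without_placeholder}"
```

The first line should show both arguments, the second only "other root".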

Oh, and no guarantees if your file names have newlines or tabs in them.

Explanation

If you just want to use it, you can stop reading now. But if you’re curious how it works, here is an incremental explanation:

The simplest script

At first I just wanted to have a script that automatically creates fixup commits based on the last commit a changed file was modified in. I came up with a simple 5-line script:

for file in $(git diff --name-only);
do
    git add "${file}"
    git commit --fixup "$(git log -1 --format="format:%H" -- "${file}")"
done

Pretty simple, right? Go through each changed file, add it and create a fixup commit.
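One caveat with this first version: an unquoted $(...) is split on spaces, tabs, and newlines, so a name like "file with spaces" from the example repository would be treated as several separate files. The sketch below shows the effect with a hard-coded list in place of the real git diff --name-only output (list_changed is a stand-in invented for this demo), and why splitting on newlines only, as the full script does, avoids it.

```bash
#!/bin/bash
# Stand-in for `git diff --name-only`: prints one path per line.
list_changed() {
    printf '%s\n' "A" "file with spaces"
}

# Default IFS (space, tab, newline): the second path falls apart.
default_count=0
for file in $(list_changed); do
    default_count=$((default_count + 1))
done

# Splitting on newlines only keeps each path intact.
IFS=$'\n'
newline_count=0
for file in $(list_changed); do
    newline_count=$((newline_count + 1))
done
unset IFS

echo "default IFS: ${default_count} words, newline IFS: ${newline_count} words"
```

With the default IFS the loop sees four words, with the newline-only IFS it sees the two actual paths.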

The simple script

But the previous script can create multiple fixup commits for the same commit. I thought that was ugly, so instead of adding and committing right away, we first build a map from each commit to its changed files.

#!/bin/bash

# For each changed file find the commit in which it was last changed.
# lastCommits is an associative array with the commit hash as key and
# a newline separated list of the file names as value.
declare -A lastCommits
IFS=$'\n'; for file in $(git diff --name-only);
do
    commit=$(git log -1 --format="format:%H" -- "${file}")
    lastCommits[$commit]+="$file"$'\n'
done

# For each commit in lastCommits, we now add all of the associated
# files and then create a fixup commit for the respective commit
IFS=$'\n'
for commit in "${!lastCommits[@]}";
do
    for file in ${lastCommits[$commit]};
    do
        git add "${file}"
    done

    git commit --fixup "$commit"
done
unset IFS
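The grouping step can also be tried on its own, without a repository. In this sketch the commit ids and file names are made up and fed in through a here-document instead of coming from git log, but the associative array is filled and read the same way as above.

```bash
#!/bin/bash
# Made-up (commit, file) pairs standing in for the `git log -1` lookups.
declare -A lastCommits
while read -r commit file; do
    lastCommits[$commit]+="$file"$'\n'
done <<'EOF'
1111111 A
2222222 B
2222222 C
EOF

# Each key now holds a newline-separated list of its files.
IFS=$'\n'
for commit in "${!lastCommits[@]}"; do
    files=(${lastCommits[$commit]})
    echo "${commit}: ${#files[@]} file(s)"
done
unset IFS
```

Note that bash does not guarantee any particular order when iterating over the keys of an associative array, so the fixup commits may be created in a different order than the original commits.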

Adding some options

Next I wanted the ability to do a dry run for the times when I’m not sure whether the last commit is actually the correct one. So I added a bit more command parsing boilerplate at the beginning and some if statements.

#!/bin/bash

# Default options
DRY_RUN=false

# Read potential options
while [ "$1" != "" ];
do
    case $1 in
        -n | --dry-run)
            DRY_RUN=true
            ;;
        *)
            printf "unrecognised option: %s\n" "$1" >&2
            exit 1
            ;;
    esac
    shift
done

# For each changed file find the commit in which it was last changed.
# lastCommits is an associative array with the commit hash as key and
# a newline separated list of the file names as value.
declare -A lastCommits
IFS=$'\n'; for file in $(git diff --name-only);
do
    commit=$(git log -1 --format="format:%H" -- "${file}")
    lastCommits[$commit]+="$file"$'\n'
done

# For each commit in lastCommits, we now add all of the associated
# files and then create a fixup commit for the respective commit
IFS=$'\n'
for commit in "${!lastCommits[@]}";
do
    if [[ $DRY_RUN == true ]];
    then
        printf "commit %s:\n" "${commit}"
    fi

    for file in ${lastCommits[$commit]};
    do
        if [[ $DRY_RUN == false ]];
        then
            git add "${file}"
        else
            printf "\t%s\n" "${file}"
        fi
    done

    if [[ $DRY_RUN == false ]];
    then
        git commit --fixup "${commit}"
    fi
done
unset IFS

Adding the root options

In order to add the root option(s), I extended the boilerplate code at the beginning. After the options are read, the root is set to either the default root or the specific root.

Now, when figuring out which files belong to which commit in the first loop, we just also specify the root commit (if we have one). This will cause git log to return empty if it finds no relevant commit after the specified commit.

No other changes are required.

#!/bin/bash

# Default options
DRY_RUN=false
DEFAULT_ROOT=""
ROOT=""

# Read potential options
while [ "$1" != "" ];
do
    case $1 in
        -n | --dry-run)
            DRY_RUN=true
            ;;
        -d | --default-root)
            shift
            DEFAULT_ROOT="$1"
            ;;
        -*)
            printf "unrecognised option: %s\n" "$1" >&2
            exit 1
            ;;
        *)
            if [[ $ROOT == "" ]];
            then
                ROOT="$1"
            else
                printf "Cannot specify root twice (specified both '%s' and '%s')\n" "${ROOT}" "$1" >&2
                exit 1
            fi
            ;;
    esac
    shift
done

if [[ $DEFAULT_ROOT != "" && $ROOT == "" ]];
then
    ROOT="${DEFAULT_ROOT}"
fi

# For each changed file find the commit in which it was last changed.
# lastCommits is an associative array with the commit hash as key and
# a newline separated list of the file names as value.
declare -A lastCommits
IFS=$'\n'; for file in $(git diff --name-only);
do
    if [[ $ROOT != "" ]];
    then
        commit=$(git log -1 --format="format:%H" "${ROOT}.." -- "${file}")

        # When a root is specified, the file may not have been changed after it.
        # We simply ignore that file then.
        if [[ $commit != "" ]];
        then
            lastCommits[$commit]+="$file"$'\n'
        fi
    else
        commit=$(git log -1 --format="format:%H" -- "${file}")
        lastCommits[$commit]+="$file"$'\n'
    fi
done
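To see what the "${ROOT}.." range does, here is a small sketch in a throwaway repository. The tag root-demo stands in for main or master, and the file names are made up. A file that was changed after the root yields a commit hash, while a file that was last changed in the root commit itself yields an empty string, which is exactly the case the script skips.

```bash
#!/bin/bash
# Throwaway repository; all names and contents are made up for this demo.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo old > old.txt; git add .; git commit -q -m "initial commit"
git tag root-demo                      # stands in for main or master
echo a > A; git add .; git commit -q -m "A"

# Changed after the root: git log prints the hash of commit "A"
in_range=$(git log -1 --format="format:%H" "root-demo.." -- A)
# Last changed in the root commit itself: git log prints nothing
out_of_range=$(git log -1 --format="format:%H" "root-demo.." -- old.txt)

echo "A:       ${in_range:-<no commit in range>}"
echo "old.txt: ${out_of_range:-<no commit in range>}"
```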

Adding find-renames

The previous script works pretty well already. The only annoying thing is that if I rename a file, it will only create a fixup commit for the deletion of the original file, but not for the adding of the “new” file.

So I added a new option:

# Default options
DRY_RUN=false
DEFAULT_ROOT=""
ROOT=""
FIND_RENAMES=()

# Read potential options
while [ "$1" != "" ];
do
    case $1 in
        -n | --dry-run)
            DRY_RUN=true
            ;;
        -d | --default-root)
            shift
            DEFAULT_ROOT="$1"
            ;;
        -M* | --find-renames*)
            FIND_RENAMES=("$1")
            ;;
        -*)
            printf "unrecognised option: %s\n" "$1" >&2
            exit 1
            ;;
        *)
            if [[ $ROOT == "" ]];
            then
                ROOT="$1"
            else
                printf "Cannot specify root twice (specified both '%s' and '%s')\n" "${ROOT}" "$1" >&2
                exit 1
            fi
            ;;
    esac
    shift
done

Note that the variable FIND_RENAMES is either empty or has the whole, unchanged parameter. This means that I can just provide it as-is to git diff and you can automatically use all of the features from that command.
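The reason FIND_RENAMES is an array rather than a plain string is that an empty array, when expanded as "${FIND_RENAMES[@]}", disappears completely, whereas a quoted empty string would still be passed on as one empty argument. A minimal sketch, where count_args is a hypothetical helper that only reports how many arguments it received:

```bash
#!/bin/bash
# Hypothetical helper that just reports how many arguments it received.
count_args() {
    echo "$#"
}

# Empty array: expands to no argument at all.
FIND_RENAMES=()
without_option=$(count_args --name-status "${FIND_RENAMES[@]}")

# Array with the unchanged option: expands to exactly one argument.
FIND_RENAMES=("-M90%")
with_option=$(count_args --name-status "${FIND_RENAMES[@]}")

# For comparison: a quoted empty string still counts as an argument.
as_string=""
empty_string=$(count_args --name-status "$as_string")

echo "empty array: ${without_option}, with -M90%: ${with_option}, empty string: ${empty_string}"
```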

So, after we’ve assigned each changed file its commit, we also want to figure out whether git considers any of them renamed. To do this, we first need to find all untracked files. Then we tell git that we intend to add them (using git add -N). After that, git can check whether an untracked file is actually a renamed one. We ask for the result using git diff --name-status and filter for renames using --diff-filter=R.

We then simply save which file was renamed to what.

declare -A renamedFiles=()
if (( ${#FIND_RENAMES[@]} ));
then
    # first add all untracked (but not ignored) files individually as intent-to-add
    intentToAdd=$(git ls-files -o --exclude-standard)
    for file in $intentToAdd;
    do
        git add -N "$file"
    done

    # get all renames and save them into an associative array
    while IFS=$'\t' read -r origFile newFile;
    do
        renamedFiles["$origFile"]="$newFile"
    done < <(git diff --name-status --diff-filter=R "${FIND_RENAMES[@]}" | cut -f 2,3)

    # remove intent-to-add
    for file in $intentToAdd;
    do
        git reset -q -- "$file"
    done
fi
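The parsing part of this can be tried without touching a repository. For a rename, git diff --name-status prints the status letter with a similarity score, the old path, and the new path, separated by tabs. The sample line below is made up to mimic that format; cut -f 2,3 drops the status column and read splits the remaining two fields at the tab.

```bash
#!/bin/bash
# Made-up sample of what `git diff --name-status --diff-filter=R -M` prints:
# status with similarity score, old path, new path, separated by tabs.
sample=$'R100\tfile with spaces\tother file with spaces'

declare -A renamedFiles=()
while IFS=$'\t' read -r origFile newFile; do
    renamedFiles["$origFile"]="$newFile"
done < <(printf '%s\n' "$sample" | cut -f 2,3)

for origFile in "${!renamedFiles[@]}"; do
    printf '%s -> %s\n' "$origFile" "${renamedFiles[$origFile]}"
done
```

Because the fields are split on tabs only, spaces in file names survive, which is also why tabs in file names are not supported.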

And when adding the files, we now also add all renamed files:

if [[ -v "renamedFiles[${file}]" ]];
then
    add_file "${renamedFiles[${file}]}"
fi
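The -v test on an array element needs a reasonably recent bash (4.3 or later, as far as I know). If the script has to run on an older version, one possible alternative is the ${array[key]+word} expansion, which yields word only when the key is set. This is not what the script above uses, just a sketch of the same check with made-up entries and a hypothetical is_renamed helper:

```bash
#!/bin/bash
declare -A renamedFiles=(["file with spaces"]="other file with spaces")

# Expands to "set" when the key exists (even with an empty value), else to nothing.
is_renamed() {
    [[ -n "${renamedFiles[$1]+set}" ]]
}

for file in "file with spaces" "A"; do
    if is_renamed "$file"; then
        echo "${file}: renamed to ${renamedFiles[$file]}"
    else
        echo "${file}: not renamed"
    fi
done
```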

And that’s it.

Photo by [Alex](https://cdn.hashnode.com/res/hashnode/image/upload/v1633809867125/9eCN0B9K2.html) on [Unsplash](https://unsplash.com?utm_source=medium&utm_medium=referral)