Fixing a typo across multiple repos

Yesterday I spotted a typo in a pull request description while browsing another team’s project. I mentioned it to the author, but it turned out that the text came from the repository’s pull request template, which means every pull request would carry this amusing but irritating mistake. I sent them a pull request modifying the template, to fix the mistake at the source and avoid it in future, and thought that would be the end of it.

It turns out that template was written once and then copied across to new repos, which means this typo actually exists in almost all the pull requests in all of that team’s projects. Well, that escalated quickly. This is the point where the average person probably says “OK whatever, it’s not worth it for something so small, there are too many repos, it’s just a small typo, never mind” and stops. A very determined person might actually start opening browser tabs and psyching themselves up to do pull requests. I open my terminal emulator and start writing a for loop.

xkcd #1319: Automation

I have a really low tolerance for repetitive tasks, but unlike in the xkcd comic above, this actually worked on the second try. The first attempt was thwarted by a typo, which would be ironic if typos weren’t the number one cause of programming errors. Here’s the result, reformatted for readability:

# Assumes bash, with git aliased to hub
for i in {android,ios}-{foo,bar,baz,many,projects}; do
    git clone "myorg/$i" &&
        cd "$i" &&
        git checkout -b typofix &&
        sed -i -e 's/teh tyop/the typo/' PULL_REQUEST_TEMPLATE.md &&
        git add . &&
        git commit -am 'Fix typo in PR template' &&
        git push origin typofix &&
        git pull-request -F -<<PRTEXT
Fix typo

Typo: "teh tyop" misspelled in the PULL_REQUEST_TEMPLATE, fixed it to "the typo"
so that future PRs have it right.
PRTEXT
    cd ..
done | tee ~/typo.log | grep http

This loops over the repos, clones each one, makes the edit and runs the git commands you would otherwise run one by one, opens a pull request, and then collects all the PR URLs. What’s interesting here, though? A bunch of stuff, all of which is easy once you know how but can otherwise be quite esoteric.

Hub

My git command is actually aliased to hub, a GitHub command-line tool that wraps git with extra GitHub-specific commands. It’s easy to use, but was a bit tricky to set up with our particular GitHub Enterprise (GHE) instance. Hub lets you open pull requests from the command line with git pull-request, which is handy since that’s where I’m writing and committing code, but it’s critical here because I need to script it. Hub also lets me clone the repo without typing the whole name: the repo name and owner are all you need, so git clone myorg/foo will do what you expect but with much less typing.

The hub pull-request command usually opens an editor for you to write your pull request description, unless you give it the -F flag, in which case it gets the text from an existing file. What’s even cooler is that if you give it - as a filename it’ll get the text from standard input. That means you could echo out the text, pipe it into the hub command and that’ll be your pull request, but it can be cooler still if you use a HERE document.

This is a special kind of shell redirection in which everything between two markers is passed to the command via stdin. The marker is traditionally HERE or EOF, but it can be PRTEXT or anything else that’s unlikely to occur on a line by itself inside your text, because that line signifies the end of the document. With an unquoted marker the shell still expands variables and command substitutions inside the text; quote the marker (<<'PRTEXT') if you want the text passed through exactly as typed. It’s especially useful here because hub uses the first line of text as the PR title and the following lines as the description body. After typing the starting marker, you just press enter and type whatever you want until you repeat the marker on a line of its own, and that’ll be your text.
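To see what the command on the receiving end actually gets, here is a minimal sketch that swaps hub for a hypothetical shell function, show_pr. It only mimics the behaviour described above (first line as title, remaining lines as body); the function names and the sample text are made up for illustration, and bash is assumed.

```shell
#!/usr/bin/env bash
# show_pr is a hypothetical stand-in for `hub pull-request -F -`.
# It reads stdin and labels the first line as the title, the rest as the body.
show_pr() {
    local title
    IFS= read -r title            # consume only the first line
    printf 'title: %s\n' "$title"
    sed 's/^/body:  /'            # prefix every remaining line
}

demo_heredoc() {
    # Everything between the two PRTEXT markers becomes show_pr's stdin.
    show_pr <<PRTEXT
Fix typo

Typo: "teh tyop" fixed to "the typo".
PRTEXT
}

pr_output=$(demo_heredoc)
printf '%s\n' "$pr_output"
```

The closing PRTEXT has to start in the first column, which is why it is not indented along with the rest of the function.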

Shell Built-ins

You’ll also notice I use && after almost every line. This relies on short-circuit evaluation of logical expressions and is extremely common in shell scripting: the part after && only executes if the part before it succeeded. I use this here to chain together things that depend on each other. For example, if the push fails (maybe a branch with that name already exists) then it should not try to make a pull request with that branch. Notably, the last cd .. is not preceded by one of these because, whatever happens in that repo, good or bad, we want to get out again before carrying on with the next one. The obvious edge case is: what if the clone fails? Then the cd into the repo is skipped as well, and the final cd .. moves the loop one directory above where it started. But that didn’t happen! 😉
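The short-circuit behaviour is easy to check in isolation. The sketch below uses a hypothetical helper, step, that prints its name and then succeeds or fails on demand; nothing in it touches git. The second half shows one possible way to close the failed-clone gap, which is not what the script above does: run the cd inside a subshell, so the outer working directory never changes. The directory name is an assumption chosen so that it does not exist.

```shell
#!/usr/bin/env bash
# step is a hypothetical helper: it prints its name, then returns the given status.
step() {
    printf '%s\n' "$1"
    return "$2"
}

# "push" fails (status 1), so "pull-request" is skipped.
# "cleanup" is not part of the && chain, so it always runs.
chain_output=$(
    step clone 0 &&
        step push 1 &&
        step pull-request 0
    step cleanup 0
)
printf '%s\n' "$chain_output"

# One possible alternative to a trailing "cd ..": do the cd in a subshell.
# Whether the cd succeeds or fails, the outer shell stays where it was.
start_dir=$PWD
missing_dir=/no/such/dir/for-this-demo   # assumed not to exist
( cd "$missing_dir" 2>/dev/null && pwd ) || echo "skipped $missing_dir"
( cd / && pwd )
end_dir=$PWD
```

With the subshell form, a failed clone just skips that repo and the next iteration starts from the same directory as the first.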

As for the list of repos to clone, I already know their names, and they’re consistent, so I can save some typing. This team does iOS and Android versions of various apps, so there’s one of each. With brace expansion you can write {ios,android}-{foo,bar}, which expands to ios-foo ios-bar android-foo android-bar. You can pass that straight to your loop without having to type each name twice.
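A quick way to preview an expansion before trusting it in a loop is to capture it in an array and print it. This is a small sketch assuming bash (plain POSIX sh does not expand braces); foo and bar are the same placeholder names used above.

```shell
#!/usr/bin/env bash
# The shell expands the braces before anything runs, left-hand group first.
repos=( {ios,android}-{foo,bar} )

printf '%s\n' "${repos[@]}"      # one repo name per line
repo_count=${#repos[@]}
echo "count: $repo_count"
```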

The last part of this script contains more normal things. The output of the for loop is piped through tee and on to grep, which keeps only the lines containing URLs (matching http) printed by git pull-request, so I can share them with the team for review. Obviously, this means any other output would be lost, so the tee command between the two writes everything to a log file for me to check afterwards if anything goes wrong. One caveat: the pipe only carries standard output, so anything git writes to standard error still shows up on the terminal and is not captured in the log unless you redirect it with 2>&1 before the pipe.
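The tee and grep arrangement can be tried without any repos at all. In this sketch, fake_run is a hypothetical stand-in for the loop, and the example.com URL and the warning text are invented. It also adds 2>&1, which is not in the script above, so that messages written to stderr land in the log as well.

```shell
#!/usr/bin/env bash
# fake_run is a hypothetical stand-in for the loop: some chatter and one URL
# on stdout, plus a warning on stderr.
fake_run() {
    echo "Cloning into 'ios-foo'..."
    echo "warning: something odd" >&2
    echo "https://example.com/myorg/ios-foo/pull/1"
}

log_file=$(mktemp)

# 2>&1 folds stderr into the pipe; tee copies every line to the log,
# and grep keeps only the lines that contain "http".
urls=$(fake_run 2>&1 | tee "$log_file" | grep http)

log_contents=$(cat "$log_file")
rm -f "$log_file"

printf 'urls: %s\n' "$urls"
printf 'log:\n%s\n' "$log_contents"
```

Dropping the 2>&1 gives the behaviour of the original pipeline: the warning would go straight to the terminal and the log would hold only the two stdout lines.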

automate ALL THE PULL REQUESTS

In summary, I learnt a thing or two about the git pull-request command and relearnt the HERE document syntax (by tomorrow I’ll probably forget if it’s 2 arrows or 3), and even though this solved a really simple problem, it was fun and educational to do. More importantly, now you know it’s possible, so never shy away from a challenge and have fun automating ALL THE PULL REQUESTS!!!