Are there pitfalls to putting $HOME in git instead of symlinking dotfiles?

I have for many years had my entire $HOME directory checked into subversion. This has included all my dotfiles and application profiles, many scripts, tools and hacks, my preferred basic home directory structure, not a few oddball projects and a warehouse worth of random data. This was a good thing. While it lasted.

But it’s gotten out of hand. The basic checkout is the same across dozens of systems, but not all that stuff is appropriate for all my machines. It doesn’t even all play nicely with different distros.

I’m in the process of cleaning house — separating the data out where it belongs, splitting out some scripts as separate projects, fixing some broken links in stuff that should be automated, etc.

My intent is to replace subversion with git for the toplevel checkout of $HOME, but I’d like to pare this down to just the things I’d like to have on ALL my systems, meaning dotfiles, a few directories and some basic custom scripts.

In reading up online a lot of people seem to be doing this using the symlink approach: clone into a subdirectory then create symlinks from $HOME into the repository. Having had my $HOME under full version control for over a decade, I don’t like the idea of this approach and I can’t figure out why people seem so averse to the straight checkout method. Are there pitfalls I need to know about specific to git as a top level checkout for $HOME?

P.S. Partly as an exercise in good coding, I’m also planning on making my root checkout public on GitHub. It’s scary how much security sensitive information I’ve allowed to collect in files that ought to be sharable without a second thought! WiFi password, un-passphrased RSA keys, etc. Eeek!

Answers:


Method 1

Yes, there is at least one major pitfall when considering git to manage a home directory that is not a concern with subversion.

Git is both greedy and recursive by default.

Subversion will naively ignore anything it doesn’t know about, and it stops processing folders either up or down from your checkout when it reaches one that it doesn’t know about (or that belongs to a different repository). Git, on the other hand, keeps recursing into all child directories, which makes nested checkouts very complicated due to namespace issues. Since your home directory is likely also the place where you check out and work on various other git repositories, having your home directory in git is almost certainly going to make your life an impossible mess.

As it turns out, this is the main reason people check out their dotfiles into an isolated folder and then symlink into it. It keeps git out of the way when doing anything else in any child directory of your $HOME. While this is purely a matter of preference if checking your home into subversion, it becomes a matter of necessity if using git.

However, there is an alternate solution. Git allows for something called a “fake root”, where all the repository machinery is hidden in an alternate folder that can be physically separated from the checkout working directory. The result is that the git toolkit won’t get confused: it won’t even SEE your repository, only the working copy. By setting a couple of environment variables you can tip off git where to find the goods for those moments when you are managing your home directory. Without the environment variables set, nobody is the wiser and your home looks like its classic file-y self.
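A minimal sketch of that idea, assuming the two variables meant here are GIT_DIR and GIT_WORK_TREE (the answer does not name them). A scratch directory stands in for $HOME so it is safe to run, and the repository location .config/home.git is made up for illustration.

```shell
#!/bin/sh
set -eu

# A scratch directory stands in for $HOME so the sketch is safe to run.
fake_home=$(mktemp -d)
repo="$fake_home/.config/home.git"   # hypothetical location for the repository data
mkdir -p "$repo"
echo 'export EDITOR=vi' > "$fake_home/.profile"

# Create the repository away from the working copy: no .git appears in $fake_home.
git init --quiet --bare "$repo"

# With both variables set, git treats $fake_home as the working tree of $repo.
(
    export GIT_DIR="$repo" GIT_WORK_TREE="$fake_home"
    cd "$fake_home"
    git add .profile
    git -c user.name='Example' -c user.email='example@example.invalid' \
        commit --quiet -m 'Track .profile'
)

# Outside that subshell the variables are unset again; query the repository explicitly.
tracked=$(git --git-dir="$repo" ls-tree --name-only HEAD)
echo "tracked files: $tracked"

if [ -e "$fake_home/.git" ]; then
    echo "unexpected: a .git entry exists in the work tree"
else
    echo "no .git entry in the work tree"
fi
```

In daily use the two exports would probably live in a shell alias or wrapper function rather than being typed each time; tools like vcsh automate roughly this bookkeeping.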

To make this trick flow a little smoother, there are some great tools out there. The vcs-home mailing list seems like the de facto place to start, and the about page has a convenient wrap-up of howtos and people’s experiences. Along the way are some nifty little tools like vcsh and mr. If you want to keep your home directory directly in git, vcsh is almost a must-have tool. If you end up splitting your home directory into several repositories behind the scenes, combine vcsh with mr for a quick and not very dirty way to manage it all at once.

Method 2

I wouldn’t want my entire home directory checked into version control simply because it means every subdirectory I go into would have the version-control context of my home dir. Commands like git checkout would have an actual action in that case, causing issues if I accidentally run something from the wrong directory, whether that something is git itself or a script that calls git.

It also makes it more likely that you add something to the repo that you don’t want there, which would not have been an issue when you had everything checked in, but now becomes a problem. What if you accidentally add a private key file (perhaps out of habit) and push it to GitHub?
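One possible guard against that accident is a pre-commit hook. The sketch below is only an illustration, not something the answer proposes: the hook body and the PEM-header pattern it greps for are assumptions, and a scratch repository stands in for the real $HOME checkout.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)          # scratch repository standing in for the $HOME checkout
git init --quiet "$repo"
cd "$repo"
mkdir -p .git/hooks

# Hypothetical hook: refuse the commit if a staged file contains a PEM private key header.
# File names containing newlines are not handled, for brevity.
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
git diff --cached --name-only --diff-filter=ACM | while IFS= read -r path; do
    if git show ":$path" | grep -q -e '-----BEGIN .*PRIVATE KEY-----'; then
        echo "refusing to commit private key: $path" >&2
        exit 1
    fi
done
EOF
chmod +x .git/hooks/pre-commit

# Stage something that looks like a key (it is not a real one) and try to commit it.
printf '%s\n' '-----BEGIN OPENSSH PRIVATE KEY-----' 'not a real key' > id_rsa
git add id_rsa
if git -c user.name='Example' -c user.email='example@example.invalid' \
        commit --quiet -m 'oops' 2>/dev/null; then
    result=committed
else
    result=blocked
fi
echo "commit was $result"
```

A hook like this only protects the machine it is installed on, since hooks are not cloned along with the repository.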

Having said that, I think the primary disadvantages are not really technical — just wanting to save me from myself.

As for symlinking: you could clone your repo into a subdirectory, and have a script which updates any symlinks that need to be updated. The amount of maintenance required for this script might outweigh the benefits of having it at all, though; symlinking by hand might turn out to be less work.

With symlinks, you can also easily make distro-specific (or even host-specific) additions that get checked into git. Your symlink-update script will ignore files intended for incompatible platforms or different hosts, and only update the appropriate ones.

Something like:

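# Link files from the shared, per-host and per-OS directories into $HOME.
# Note: * does not match names that start with a dot.
HOMEREPO="$HOME/homerepo"
HOST=$(hostname)
UNAME=$(uname)

for dotfile in "$HOMEREPO"/shared/* "$HOMEREPO/host-$HOST"/* "$HOMEREPO/uname-$UNAME"/*
do
    [ -e "$dotfile" ] || continue    # skip patterns that matched nothing
    target="$HOME/$(basename "$dotfile")"
    # Create the link only if nothing (not even a dangling symlink) is in the way.
    [ -e "$target" ] || [ -L "$target" ] || ln -s "$dotfile" "$target"
done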

Personally: I use symlinks, and I don’t symlink directories; only the files within. This gives me some flexibility to make site-local changes in those directories (i.e. add/remove files). Setting up my account on a new system is tedious because I have to recreate all the symlinks by hand.

Method 3

To give another point of view: I have had my $HOME under git for some time now and haven’t found any drawbacks. I obviously do not sync this git repo to GitHub; I use a service which has private repos. I also do not put any media files, downloads or packages under git control.

  • git status is a kind of “to do, to clean” checklist.
  • I have a ~/tmp for temporary things, which is gitignored.
  • I like to see in git status anything that recently installed software dares to add to my $HOME, and I often delete these files, or even uninstall the culprits.
  • I manually add the really useful local files and dirs to .gitignore, which has a ‘know what you do when installing things’ benefit.
  • If I build a new VM or install a new PC, I just clone my remote home to $HOME and immediately have everything I need at hand.
  • Things like vundle for vim plugins are not necessary anymore.

I dislike complexity. When I tweak any rcfile, I just do it, commit and push. Then, as a reflex, I git pull in $HOME every other day, and always have the latest config. It is that simple.

Machines currently under this regimen: Home laptop, work PC, work VM, plus 3 or 4 remote servers.
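The answer does not say how the clone onto a new machine is done. One wrinkle is that git clone refuses a non-empty target directory, and a fresh $HOME is rarely empty. A sketch of one way around that, with scratch directories standing in for both the hosting service and the new home (all paths and file names below are made up):

```shell
#!/bin/sh
set -eu

scratch=$(mktemp -d)
remote="$scratch/remote-home.git"   # stands in for the private hosting service
new_home="$scratch/new-home"        # stands in for $HOME on a fresh machine

# Build a tiny "remote home" with one tracked rcfile.
git init --quiet "$scratch/seed"
echo 'set -o vi' > "$scratch/seed/.bashrc"
git -C "$scratch/seed" add .bashrc
git -C "$scratch/seed" -c user.name='Example' -c user.email='example@example.invalid' \
    commit --quiet -m 'Add .bashrc'
git clone --quiet --bare "$scratch/seed" "$remote"
branch=$(git -C "$scratch/seed" symbolic-ref --short HEAD)   # default branch name of this git

# The new home already contains something, so initialise in place and pull
# instead of cloning. Existing untracked files are left alone; the pull would
# fail rather than overwrite one that clashes with a tracked file.
mkdir -p "$new_home/Downloads"
cd "$new_home"
git init --quiet
git remote add origin "$remote"
git pull --quiet origin "$branch"

ls -A "$new_home"
```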

Method 4

I’ve tried both, and preferred the symlink approach in the end:

  • Check out to wherever
  • make install
  • Log out and in again to load the X settings

Disadvantages:

  • Have to move files to the repo before adding them
  • Have to maintain the list of symbolic links in the Makefile

Advantages:

  • No need for a massive .gitignore (I have 133 dotfiles in ~ on my humble Ubuntu box)
  • Can keep maintenance scripts and other ~-related stuff (such as Makefile and utility scripts) out of the way
  • Can version control personal and public settings separately

Restrictions:

  • Unlike @mrb, I only create symlinks in ~. That keeps the symlinking simple, and makes it trivial to notice new files in for example ~/.vim, at the cost of some very rare .gitignore maintenance.

Those last two advantages tipped the scales in my case – I don’t want to clutter the home directory, and I want to keep private and public content clearly separate.

The only application I know of which has (or at least had) problems with handling symlinks was Pidgin – It kept overwriting my symlinks with ordinary files.

Method 5

Here’s one: If you try to do git rebase -i --root and you have checked in .gitconfig in the first commit in the repository, git will temporarily remove the .gitconfig file, which in turn will make it unable to finish the rebase operation since it requires your name and your email to do that, which are stored in that file.

You may configure them back again and do git rebase --continue, but after I did that and finished the rebase operation, my git repository had gained an empty commit without a commit message before the commit that was previously the first commit in the repository, which I do not know how to get rid of.

I don’t know what happens if you do git rebase -i <commit> instead, and .gitconfig is checked in together with any commit after <commit>.

Perhaps the easiest solution is to refrain from adding .gitconfig to the repository and instead list it in .gitignore.

Method 6

This is how I do it:

  1. install a clean linux (not necessary, but makes life more pleasant in step 4)
  2. install etckeeper
  3. run git init in your home
  4. create .gitignore and add everything that looks like it doesn’t interest you or that might change a lot. Be sure to add things like *.cache, *.lock etc. I don’t recommend adding /* because then you won’t be notified automatically when something new is added to your home. This is a blacklist approach rather than a whitelist approach: I basically want to keep my config for all software except for volatile stuff and some software I don’t care about. When you later merge, migrate or compare systems, being able to diff everything is quite handy. You can set up your new systems much faster than if you only had .bashrc and a few other dotfiles stored. This way you also keep configuration that you might otherwise set through the GUI without knowing which dotfiles store the settings. (If it ever turns out you have committed volatile files, you can still tell git to treat them as unchanged with git update-index --assume-unchanged.)
  5. run etckeeper init -d /home/username
  6. run etckeeper commit -d /home/username
  7. set up aliases in your shell to make the command line nicer, like homekeeper checkout

The reason for using etckeeper is that it will store metadata like permissions for your files (rather important for certain things like ssh keys). You should now have a pre-commit hook that will save metadata automatically. I’m not so sure about post-checkout. You should probably use etckeeper checkout xxx -d /home/user. I will look into it a bit more and elaborate on this answer.
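The blacklist idea from step 4 can be tried without etckeeper. The following is a rough sketch using plain git in a scratch directory; the file names are invented, and only the *.cache and *.lock patterns come from the answer.

```shell
#!/bin/sh
set -eu

home=$(mktemp -d)      # scratch directory standing in for $HOME
cd "$home"
git init --quiet

# Blacklist: ignore only the volatile patterns, so anything new still shows up.
printf '%s\n' '*.cache' '*.lock' > .gitignore
echo 'export EDITOR=vi' > .bashrc
touch app.cache session.lock

git add .gitignore .bashrc
git -c user.name='Example' -c user.email='example@example.invalid' \
    commit --quiet -m 'Initial home configuration'

# A new, unignored file is reported; the ignored ones are not.
touch .newtoolrc
new_files=$(git ls-files --others --exclude-standard)

# Mark an already committed file as volatile after the fact.
git update-index --assume-unchanged .bashrc
echo '# local tweak' >> .bashrc
status_after=$(git status --porcelain)
flagged=$(git ls-files -v | grep '^h' || true)   # lowercase tag = assume-unchanged

echo "new files: $new_files"
echo "status: $status_after"
echo "assume-unchanged: $flagged"
```

The flag is local to one checkout and is not pushed or cloned, so it has to be set again on every machine.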

Method 7

My $HOME has been a git repository for years, and I have other repos under it, and haven’t had any problems.

A sticking point for many people seems to be the .gitignore file, presumably because they don’t want every untracked file to show up when doing git status. There is a concern that your ~/.gitignore might exclude things that you wouldn’t want to exclude from other repos under your home directory.

I didn’t add anything to my ~/.gitignore; instead, I set status.showUntrackedFiles = no in my ~/.git/config. Then the output of git status is clean. If for some reason you want to see the “normal” output, just say git status -unormal. Or you can use git ls-files -o --exclude-standard to list the untracked files under the CWD for a more targeted listing.

Alternatively, you can use git status -uno (or an alias for it) to get the same “quiet” status output, as someone suggested in a comment.
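A small sketch of these settings, run in a scratch repository rather than a real $HOME (the file names are invented for the demonstration):

```shell
#!/bin/sh
set -eu

home=$(mktemp -d)      # scratch directory standing in for $HOME
cd "$home"
git init --quiet

echo 'alias ll="ls -l"' > .bashrc
git add .bashrc
git -c user.name='Example' -c user.email='example@example.invalid' \
    commit --quiet -m 'Track .bashrc'
touch notes.txt        # an untracked file that would normally clutter git status

# Stored in this repository's .git/config only, so other repos are unaffected.
git config status.showUntrackedFiles no

quiet=$(git status --porcelain)              # untracked files hidden
normal=$(git status --porcelain -unormal)    # one-off override
others=$(git ls-files --others --exclude-standard)

echo "quiet:  '$quiet'"
echo "normal: '$normal'"
echo "others: '$others'"
```

Because the setting lives in the repository’s own config, repositories nested under the home directory keep their usual status output.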

Method 8

My major problem with using Git on the home directory is that Git does not store file attributes such as timestamps or file permissions (beyond the executable bit). For me it is important to know when certain files were created; that may or may not be the case for you. Furthermore, losing the permissions on files and directories such as .ssh is problematic. I understand that you plan on keeping .ssh out of Git, but there will be other places where permissions might matter (such as uncompressed website backups).
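To see what Git actually records, here is a quick check in a scratch repository (file names are made up). It assumes a filesystem where Git honours the executable bit, which is the usual case on Linux.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)      # scratch repository
cd "$repo"
git init --quiet

echo 'secret' > private.txt
chmod 600 private.txt              # owner-only, like a file under ~/.ssh
printf '#!/bin/sh\n' > tool.sh
chmod 700 tool.sh                  # owner-only and executable

git add private.txt tool.sh

# The first field of ls-files --stage is the mode stored in the index.
private_mode=$(git ls-files --stage private.txt | cut -d' ' -f1)
tool_mode=$(git ls-files --stage tool.sh | cut -d' ' -f1)

echo "private.txt stored as $private_mode"
echo "tool.sh stored as $tool_mode"
```

The stored modes only distinguish executable from non-executable, so a fresh checkout recreates private.txt with whatever the umask allows instead of 600.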

Method 9

A git-based solution is especially useful if you need to deploy your files to different machines, and even more so if you have parts that are common to all machines, and parts that are specific to some machines. You can make multiple repositories and use a tool like multigit or vcsh to clone them over the same directory (your home dir in this case).


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
