gitolite/doc/3-faq-tips-etc.mkd
2010-02-01 10:44:48 +05:30

assorted faqs, tips, and notes on gitolite

In this document:

  • common errors and mistakes
  • git version dependency
  • other errors, warnings, notes...
    • ssh-copy-id
    • cloning an empty repo
    • @all syntax for repos
    • umask setting
  • getting a tar file from a clone
  • differences from gitosis
    • simpler syntax
    • two levels of access rights checking
    • file/dir NAME based restrictions
    • error checking the config file
    • including config lines from other files
    • delegating parts of the config file
    • easier to specify gitweb "description" and gitweb/daemon access
    • easier to link gitweb authorisation with gitolite
    • better logging
    • one user, many keys
    • support for git installed outside default PATH
    • what repos do I have access to?
    • "exclude" (or "deny") rules
    • "personal" branches
    • custom hooks and custom git config
  • design choices
    • keeping the parser and the access control separate

common errors and mistakes

  • adding repositories/ at the start of the repo name in the git clone. This error is typically made by the admin himself -- because he knows what $REPO_BASE is set to and thinks he has to provide that prefix on the client side also :-) In fact gitolite prepends $REPO_BASE internally, so you shouldn't also do the same thing!

  • being able to clone but getting errors on push. Most likely caused by a combination of:

    • you already have shell access to the server, not just "gitolite" access, and

    • you cloned using git clone git@server:repositories/repo.git (notice there's an extra "repositories/" in there?)

    In other words, you used a key that completely bypassed gitolite and went straight to the shell to do the clone.

    Please see doc/6-ssh-troubleshooting.mkd for what all this means.

git version dependency

Here's a workaround for a version dependency that the normal flow of gitolite has.

When you edit your config file to create a new repo, and push the changes to the server, gitolite creates an empty, bare repo for you. Normally, you're expected to clone this on the client side, and start working -- make your first commit(s), then push, etc.

However, cloning an empty repo requires a server side git version that is at least 1.6.2. Gitolite detects this when creating a repo, and warns you.

The workaround is to use the older (gitosis-style) method on the client: create an empty repo locally, make a commit or two, set an "origin" remote, and then push. Something like:

mkdir my-new-project
cd    my-new-project
git init
git commit --allow-empty -m 'Initial repository'
# or, if your client side git is too old for --allow-empty, just make some
# files, "git add" them, then "git commit"
git remote add origin git@gitolite-server:my-new-project.git
git push origin master:master

Once this is done, the repo is available for cloning by anyone else in the normal way, since it's not empty anymore.

other errors, warnings, notes...

ssh-copy-id

Don't have ssh-copy-id? This is broadly what that command does, in case you want to replicate it manually. The input is your pubkey, typically ~/.ssh/id_rsa.pub on your client/workstation.

  • it copies it to the server as some file

  • it appends that file to ~/.ssh/authorized_keys on the server (creating it if it doesn't already exist)

  • it then makes sure that all these files/directories have go-w perms set (assuming user is "git"):

    /home/git/.ssh/authorized_keys
    /home/git/.ssh
    /home/git
    

[Actually, sshd requires that even directories above ~ (/, /home, typically) also must be go-w, but that needs root. And typically they're already set that way anyway. (Or if they're not, you've got bigger problems than gitolite install not working!)]
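
The steps above can be sketched as a small shell function to run on the server once the pubkey file has been copied there. The name install_pubkey and the paths in the usage note are made up for illustration; this is a minimal sketch of the same steps, not what ssh-copy-id literally executes.

```shell
# install_pubkey HOME_DIR PUBKEY_FILE
# Hypothetical helper: does by hand what the list above describes.
install_pubkey() {
    home_dir=$1     # home of the hosting user, e.g. /home/git
    pubkey=$2       # the copy of id_rsa.pub uploaded from the workstation

    mkdir -p "$home_dir/.ssh"
    # append the key; ">>" creates authorized_keys if it is not there yet
    cat "$pubkey" >> "$home_dir/.ssh/authorized_keys"
    # remove group/other write permission from all three paths
    chmod go-w "$home_dir" "$home_dir/.ssh" "$home_dir/.ssh/authorized_keys"
}
```

For example, after copying the key over with something like "scp ~/.ssh/id_rsa.pub git@server:/tmp/sitaram.pub", you would run "install_pubkey /home/git /tmp/sitaram.pub" as the git user on the server.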

cloning an empty repo

Cloning an empty repo is only possible with clients newer than git 1.6.2. So at least one of your clients needs to have a recent git. Once at least one commit has been made, older clients can also use it.

When you clone an empty repo, git seems to complain about fatal: The remote end hung up unexpectedly. However, you can ignore this, since it doesn't seem to hurt anything. [Update 2009-09-14; this has been fixed in git 1.6.4.3]

@all syntax for repos

There is a way to use the @all syntax for repos also, as described in conf/example.conf. However, there is an important difference between this and the old @all (for users):

  • @all for repos is immediately expanded, when found, into the currently known list of repos. "Currently" means up to this point in the config file, and "known" means having some user with some permissions associated with the repo!

  • This means that if you really want all repos, you'd better put this para at the end of the config file!
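
To see why the position matters, here is a rough simulation of the expansion rule described above. It is only a sketch: expand_all is a hypothetical helper, it ignores group definitions, and it assumes every repo named on a "repo" line also gets at least one access rule (which is what makes it "known").

```shell
# expand_all [CONFIG_FILE]
# Hypothetical simulation: prints what "@all" would stand for at each
# "repo" line where it appears, given only the repos seen so far.
expand_all() {
    awk '
        $1 == "repo" {
            for (i = 2; i <= NF; i++) {
                if ($i == "@all") {
                    printf "line %d: @all =%s\n", NR, known
                } else if (!seen[$i]++) {
                    known = known " " $i
                }
            }
        }' "$@"
}
```

Fed a config in which "repo @all" appears once after repo foo and again after repo bar, the first expansion contains only foo while the second contains both.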

umask setting

Gitweb not able to read your repos? You can change the umask for newly created repos to something more relaxed -- see the ~/.gitolite.rc file.

getting a tar file from a clone

You can clone the repo from github or indefero, then execute a make command to extract a tar file of the branch you want. Please use the make command, not a plain "git archive", because the Makefile adds a file called .GITOLITE-VERSION that will help you identify which version you are using.

git clone git://github.com/sitaramc/gitolite.git
        # (OR)
git clone git://sitaramc.indefero.net/sitaramc/gitolite.git
cd gitolite
make master.tar
# or maybe "make rebel.tar" or "make pu.tar"

differences from gitosis

Apart from the big ones listed in the top level README, and subjective ones like "better config file format", there are some small, but significant and concrete, differences from gitosis.

simpler syntax

The basic syntax is simpler and cleaner but it goes beyond that: you can specify access in bits and pieces, even if they overlap.

Some access needs are best grouped by repo, some by username, and some by both. So just do all of them, and gitolite will combine all the access lists! Here's an example:

# define groups of people
@bosses     = phb1 phb2 phb3
@devs       = dev1 dev2 dev3
@interns    = int1 int2 int3

# define groups of projects
@open       = git gitolite linux rakudo
@closed     = c1 c2 c3
@topsecret  = ts1 ts2 ts3

# all bosses have read access to all projects
repo @open @closed @topsecret
    R   =   @bosses

# everyone has read access to "open" projects
repo @open
    R   =   @bosses @devs @interns

[...or any other combination you want...]

# later in the file:

# specify access for individual repos (like RW, RW+, etc)
repo c1
    [...]

[...etc...]

If you notice that @bosses are given read access to @open via both rules, do not worry that this causes some duplication or inefficiency. It doesn't :-)

See the "specify gitweb/daemon access" section below for one more example.

two levels of access rights checking

Gitolite has two levels of access checks. The first check is what I will call the pre-git level (this is the only check that gitosis has). At this stage, the gl-auth-command has been invoked by sshd, and it knows just three things:

  • who,
  • what repository, and
  • what type of access (R or W)

Note that at this point no git program has entered the picture, and we have no way of knowing what ref (branch, tag, etc) he is trying to update, even if it is a "write" operation.

For a "read" operation to pass this check, the username (or @all) must have read permission (i.e., R, RW, or RW+) on at least one branch of the repo.

For a "write" operation, there is an additional restriction: lines specifying only R (read access) don't count. The user must have write access to some ref in the repo in order to pass this stage!
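
The first-level rule can be sketched in a few lines of shell. The flattened "perm refex user" rule list and the name first_level_check are assumptions made for the example; gitolite's own compiled config looks different.

```shell
# Access rules for one repo, one per line: "<perm> <refex> <user>".
# This flattened form is made up for the sketch.
RULES='RW+ refs/heads/master sitaram
RW refs/heads/ user1
R refs/heads/ user3'

# first_level_check USER WANT   (WANT is R or W)
# Exit status 0 if the user gets past the pre-git check.
first_level_check() {
    printf '%s\n' "$RULES" | awk -v user="$1" -v want="$2" '
        # R, RW and RW+ all contain "R"; only RW and RW+ contain "W",
        # so a plain R line never satisfies a write request
        ($3 == user || $3 == "@all") && index($1, want) { found = 1 }
        END { exit !found }'
}
```

With these rules, "first_level_check user3 R" succeeds but "first_level_check user3 W" fails, because user3 only appears on a read-only line. Note that the refex column is not consulted at all here; no ref is known at this stage.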

The second check is via a git update hook. This check only happens for write operations. By this time we know what "ref" he is trying to update, as well as the old and the new SHAs of that ref (by which we can also deduce whether it's a rewind or not). This is where the "per-branch" permissions come into play.

Each refex that allows W access (or + if this is a rewind) for this user, on this repo, is matched against the actual refname being updated. If any of the refexes match, the push succeeds. If none of them match, it fails.

Gitolite also allows "exclude" or "deny" rules. See later in this document for details.

file/dir NAME based restrictions

In addition to branch-name based restrictions, gitolite also allows you to restrict what files or directories can be involved in changes being pushed. This basically uses git diff --name-only to obtain the list of files being changed, treating each filename as a "ref" to be matched.

Please see conf/example.conf for syntax and examples.

error checking the config file

gitosis does not do any. I just found out that if you mis-spell members as member, gitosis will silently ignore it, and leave you wondering why access was denied.

Gitolite "compiles" the config file first and keyword typos are caught so you know right away.

including config lines from other files

See the entry under "INCLUDE SOME OTHER FILE" in conf/example.conf.

delegating parts of the config file

You can now split up the config file and delegate to other admins the authority to specify access control for their own pieces. See doc/5-delegation.mkd for details.

easier to specify gitweb "description" and gitweb/daemon access

To enable access to a repo via gitweb and create a "description" for it to show up on the webpage, just add a line like this, anywhere in the config file:

reponame = "one line of description"

You can also specify an "owner":

reponame "owner name" = "one line of description"

To enable access to one or more repos via git daemon, just give "read" permissions to the special username daemon.

There is also a special user called gitweb to specify gitweb access; useful if you don't care about specifying individual descriptions for each repo and just want to quickly enable gitweb access to one or more repos.

Remember gitolite lets you specify the access control specs in bits and pieces, so you can keep all the daemon/gitweb access in one place, even if each repo has more specific branch-level access config specified elsewhere. Here's an example, using really short reponames because I'm lazy:

# maybe near the top of the file, for ease of access:

@only_web       = r1 r2 r3
@only_daemon    = r4 r5 r6
@web_and_daemon = r7 r8 r9

repo @only_web
    R   = gitweb
repo @only_daemon
    R   = daemon
repo @web_and_daemon
    R   = gitweb
    R   = daemon

# ...maybe much later in the file:

repo r1
    # normal developer access lists for r1 and its branches/tags in the
    # usual way

repo r2
# ...and so on...

easier to link gitweb authorisation with gitolite

Over and above whether a repo is even shown by gitweb, you may want to further restrict people, allowing them to view only those repos for which they have been given read access by gitolite.

This requires that:

  • you have to have some sort of HTTP auth on your web server (out of my scope, sorry!)
  • the HTTP auth should use the same username (like "sitaram") as used in the gitolite config (for the corresponding user)

Once that is done, it's easy. Gitweb allows you to specify a subroutine to decide on access. We use that feature and tie it to gitolite. Sample code (untested, munged from something I saw here) is given below.

Note the utter simplicity of the actual check (just 1 line!). This is an unexpected piece of luck coming from the decision to keep the config parsing separate from the actual access control. The config parser puts a pure perl hash in the file named below as $gl_conf_compiled, so all the parsing is already done and we just use it!

# completely untested... but the basic idea should work fine

# change these as needed
# projectroot should be the same as gitolite's REPO_BASE, but converted to
# an absolute path
$projectroot = '/home/git/repositories/';
my $gl_conf_compiled = '/home/git/.gitolite/conf/gitolite.conf-compiled.pm';

# I assume this gives us the HTTP auth username
my $username = $cgi->remote_user;

# ----------

# parse the config file; updates %repos hash
our %repos;
die "parse $gl_conf_compiled failed: " . ($! or $@) unless do $gl_conf_compiled;

# this is gitweb's mechanism; it calls whatever sub is pointed at by this
# variable to decide access yes/no
$export_auth_hook = sub {
    my $reponame = shift;
    # gitweb passes us the full repo path; so we strip the beginning...
    $reponame =~ s/\Q$projectroot\E\/?//;
    # ...and the end, to get the repo name as it is specified in gitolite conf
    $reponame =~ s/\.git$//;

    return exists $repos{$reponame}{R}{$username}
        || exists $repos{$reponame}{R}{'@all'};
};

better logging

If you have been too liberal with the permission to rewind, gitolite has built-in logging as an emergency fallback if someone goes too far, or for audit purposes [*]. The logfile names and location are configurable, and can include the year/month/day etc in the filename for easy archival or further processing. The log file even tells you which pattern in the config file matched to allow that specific access to proceed.

[*] setting core.logAllRefUpdates true does provide a safety net against over-zealous rewinds, but it does not tell you "who". And strangely, management does not seem to share the view that "blame" is just a synonym for "annotate" ;-)

The log lines look like this:

2009-09-19.10:24:37  +  b4e76569659939  4fb16f2a88d8b5  myrepo refs/heads/master       user2   refs/heads/master

The "+" at the start indicates a non-fast forward update, in this case from b4e76569659939 to 4fb16f2a88d8b5. So b4e76569659939 is the one to restore! Can it get easier?

The other parts of the log line are the name of the repo, the refname being updated, the user updating it, and the refex pattern (from the config file) that matched, in case you need to debug the config file itself.
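
Since the fields are whitespace-separated, picking rewinds out of a log file is easy to script. The sketch below assumes every log line has the same field order as the sample above; the format of non-rewind lines is not shown in this document, so treat that as an assumption. list_rewinds is a hypothetical name.

```shell
# list_rewinds [LOGFILE...]
# Hypothetical helper: prints one line per rewind found in the log(s).
list_rewinds() {
    # fields: 1 timestamp, 2 flag, 3 old SHA, 4 new SHA, 5 repo,
    #         6 refname, 7 user, 8 matching refex
    awk '$2 == "+" { print $7, "rewound", $6, "in", $5 "; old value was", $3 }' "$@"
}
```

Run against a file containing the sample line, this prints who rewound which ref in which repo, along with the old SHA you would want to restore.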

one user, many keys

I have a laptop and a desktop I need to access the server from. I have different private keys on them, but as far as gitolite is concerned both of them should be treated as "sitaram". How does this work?

In gitosis, the admin creates a single "sitaram.pub" containing one line for each of my pubkeys. In gitolite, we keep them separate: "sitaram@laptop.pub" and "sitaram@desktop.pub". The part before the "@" is the username, so gitolite knows these two keys belong to the same person.

Note that you don't say "sitaram@laptop" and so on in the config file -- as far as the config file is concerned there's just one user called "sitaram" -- so you only say "sitaram" there.

I think this is easier to maintain if you have to delete or change one of those keys.

However, now that sitaramc@gmail.com is also a valid username, we need to distinguish between sitaramc@gmail.com.pub and sitaramc@desktop.pub. We do that by requiring that the multi-key suffix you use (like "desktop" and "laptop") should not have a "." in it. If it does, it looks like an email address. The following table lists sample pubkey filenames and the corresponding derived usernames (which is what goes into the conf/gitolite.conf file):

  • old style multikeys; not mistaken for emails because there is no "." in hostname part

    sitaramc.pub                            sitaramc
    sitaramc@laptop.pub                     sitaramc
    sitaramc@desktop.pub                    sitaramc
    
  • new style, email keys; there is a "." in hostname part; so it's an email address

    sitaramc@gmail.com.pub                  sitaramc@gmail.com
    
  • multikeys with email address

    sitaramc@gmail.com@laptop.pub           sitaramc@gmail.com
    sitaramc@gmail.com@desktop.pub          sitaramc@gmail.com
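
The naming rule in the table can be expressed as two substitutions: drop the ".pub" extension, then drop a trailing "@suffix" only if that suffix contains no ".". This is a sketch of the rule as described here, not gitolite's actual code, and pubkey_to_user is a made-up name.

```shell
# pubkey_to_user FILENAME
# Hypothetical helper: prints the username derived from a pubkey filename.
pubkey_to_user() {
    # 1st expression: strip the extension
    # 2nd expression: strip a final "@suffix" that has no "." (and no "@") in it
    printf '%s\n' "$1" | sed -e 's/\.pub$//' -e 's/@[^.@][^.@]*$//'
}
```

For instance, "pubkey_to_user sitaramc@gmail.com@laptop.pub" prints sitaramc@gmail.com, while "pubkey_to_user sitaramc@laptop.pub" prints sitaramc.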
    

support for git installed outside default PATH

The normal solution is to add to the system default PATH somehow, either by munging /etc/profile or by enabling PermitUserEnvironment in /etc/ssh/sshd_config and then setting the PATH in ~/.ssh/environment. All these are security risks because they allow a lot more than just you and your git install :-)

And if you don't have root, you can't do this anyway.

The only solution till now has been to ask every client to set the config parameters remote.<name>.receivepack and remote.<name>.uploadpack. But telling every client to do so is a pain...

Gitolite lets you specify the directory in which git binaries are to be found, via a new variable ($GIT_PATH) in the "rc" file. If this variable is non-empty, it will be appended to the PATH environment variable before attempting to run git stuff.

Very easy, very simple, and completely transparent to the users :-)
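
In shell terms, the effect is roughly the following. Gitolite itself is written in Perl, so this is only an illustration of the rule; add_git_path is a hypothetical name and /opt/git/bin is just an example location.

```shell
# add_git_path DIR
# Mirrors the rule above: a non-empty value is appended to PATH,
# an empty one leaves PATH alone.
add_git_path() {
    if [ -n "$1" ]; then
        PATH="$PATH:$1"
    fi
}

GIT_PATH=/opt/git/bin     # example location, an assumption
add_git_path "$GIT_PATH"
```

Because the directory goes at the end of PATH, a git that is already on the default PATH would still be found first.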

what repos do I have access to?

Sometimes there are too many repos, maybe even named similarly, or with the potential for typos, confusion about hyphens/underscores or upper/lower case, etc. You'd just like a simple way to know what repos you have access to.

Easy! Just use ssh and try to log in as if you were attempting to get a shell:

$ ssh gitolite info
PTY allocation request failed on channel 0
hello sitaram, the gitolite version here is v0.6-17-g94ed189
you have the following permissions:
  R  W  Anu-WSD
  R     ROtest
  R  W  SecureBrowse
  R  W  entrans
  R  W  git-notes
  R  W  gitolite
  R  W  gitolite-admin
  R  W  indic_web_input
  R  W  proxy
  @  @  testing
  R  W  vkc

Note that until this version, we used to put out an ugly need SSH_ORIGINAL_COMMAND error, just like gitosis used to. All we did is put that code path to better use :-)

"exclude" (or "deny") rules

IMPORTANT CAVEAT: if you use deny rules, the order of the rules also makes a difference, where earlier it did not. Please review your ruleset carefully or test it. In particular, do not use @all in a deny rule -- it won't work as you might expect. Also, deny rules are only processed in the second level checks (see "two levels of access rights checking" above), which means they only apply to write operations.

Take a look at the following snippet, which seems to say that "bruce" can write versioned tags (anything containing refs/tags/v[0-9]), but the other staffers can't:

    @staff = bruce whitfield martin
            [... and later ...]
    RW refs/tags/v[0-9]     = bruce
    RW refs/tags            = @staff

But that's not how the matching works. As long as any refex matches the refname being updated, it's a "yes". Since the second refex (which says "anything containing refs/tags") is a superset of the first one, it lets anyone on @staff create versioned tags, not just Bruce.

One way to fix this is to allow "excludes" -- some changes in syntax, combined with a rigorous, ordered, interpretation would do it.

Let's recap the existing semantics:

the first matching refex that has the permission you're looking for (W or +), results in success. A fallthrough results in failure

Here are the new semantics; the changes from the "main" one are the added minus (-) permission and the failure it produces:

the first matching refex that has the permission you're looking for (W or +) or a minus (-), results in success or failure, respectively. A fallthrough also results in failure

So the example we started with becomes, if you use "deny" rules:

    RW refs/tags/v[0-9]     = bruce
    -  refs/tags/v[0-9]     = @staff
    RW refs/tags            = @staff

And here's how it works:

  • for non-version tags, only the 3rd rule matches, so anyone on staff can push them
  • for version tags by bruce, the first rule matches so he can push them
  • for version tags by staffers other than bruce, the second rule matches before the third one, and it has a - as the permission, so the push fails
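
The ordered, first-match-wins evaluation can be sketched as below. This is an illustration under stated assumptions, not gitolite's update hook: the flattened rule list, the hard-coded @staff group, and the name can_push are all invented for the example; refexes are treated as unanchored regular expressions, following the "anything containing" wording above; and only plain writes (W) are handled, not rewinds (+).

```shell
STAFF='bruce whitfield martin'

# Rules in config order, one per line: "<perm> <refex> <who>".
RULES='RW refs/tags/v[0-9] bruce
- refs/tags/v[0-9] @staff
RW refs/tags @staff'

# can_push USER REFNAME
# Exit status 0 if a (non-rewind) push of REFNAME by USER is allowed.
can_push() {
    printf '%s\n' "$RULES" | awk -v user="$1" -v ref="$2" -v staff=" $STAFF " '
        BEGIN { denied = 1 }                 # fallthrough results in failure
        {
            applies = ($3 == user) || ($3 == "@staff" && index(staff, " " user " "))
            if (!applies || ref !~ $2) next  # rule is for someone/something else
            if ($1 == "-")      { denied = 1; exit }   # first match is a deny
            if (index($1, "W")) { denied = 0; exit }   # first match grants W
        }
        END { exit denied }'
}
```

Here "can_push bruce refs/tags/v1.0" succeeds on the first rule, "can_push whitfield refs/tags/v1.0" fails on the second, and "can_push martin refs/tags/release-candidate" succeeds on the third.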

"personal" branches

"personal" branches are great for corporate environments, where unauthenticated pull/clone is a no-no. Since a dev workstation cannot do authentication, even work shared just between 2 devs has to go via the server. This causes the same branch name clutter as in a centralised VCS, plus setting up permissions for this becomes a chore for the admin.

gitolite lets you define a "personal" or "scratch" namespace prefix for each developer (e.g., refs/personal/<devname>/*), with full permissions for that dev and read-only for everyone else. And you get this without adding a single line to the access config file -- pretty much fire and forget as far as the admin is concerned, even if there is constant churn in the project teams.

Not bad for something that took just one line of code to implement. And that's one clean, readable, line, by the way ;-)

The admin would set $PERSONAL in the rc file and communicate this to all users. It could be something like refs/heads/personal, which means all such branches will show up in git branch lookups and git clone will fetch them. Or he could use, say, refs/personal, which means it won't show up in any normal "branch-y" commands and stuff, and generally be much less noisy.
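
The test for "is this ref inside my personal namespace?" amounts to a prefix match on the refname. Here is a sketch in shell; it is not gitolite's actual one-liner, and is_personal_ref is a hypothetical name.

```shell
# is_personal_ref PREFIX USER REFNAME
# Exit status 0 if REFNAME lies inside USER's personal namespace,
# i.e. it starts with PREFIX/USER/ .
is_personal_ref() {
    case "$3" in
        "$1/$2/"*) return 0 ;;
        *)         return 1 ;;
    esac
}
```

With a prefix of refs/personal, "is_personal_ref refs/personal sitaram refs/personal/sitaram/wip" succeeds, while the same call with refs/personal/user1/wip or refs/heads/master fails.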

Note that a user who has NO write access cannot have personal branches; if you read the section (above) on "two levels of access rights checking" you'll understand why.

For instance, in the following example, user3 cannot push to any refs/heads/personal/user3/* branches because the first level check stops him cold:

# assume $PERSONAL = 'refs/heads/personal' in ~/.gitolite.rc
repo myrepo
    RW+ master      = sitaram
    RW+ release     = qa_guy
    RW              = user1 user2
    R               = user3

If we relax that check, any access becomes write access. Yes it will be caught later, by the hook, but it's good practice to catch things in multiple places.

If you want user3 to have his own personal branch, but without write access to any of the "real" branches (like "master", "release", etc.), just use a dummy branch. Choose a name that will never exist in practice, or even if someone creates it, we don't care. For example, this will get him past the first check:

    RW dummy        = user3

Just don't show the user this config file; it might sound insulting :-)

custom hooks and custom git config

You can specify hooks that you want to propagate to all repos, as well as per-repo "gitconfig" settings. Please see doc/2-admin.mkd and conf/example.conf for details.

design choices

keeping the parser and the access control separate

There are two programs concerned with access control:

  • gl-auth-command, the program that is run via ~/.ssh/authorized_keys; this decides whether git should even be allowed to run (basic R/W/no access). (This one cannot decide on the branch-level access; it is not known at this point what branch is being accessed)
  • the update-hook on each repo, which decides the per-branch permissions

I have chosen to keep the relatively complex task of parsing the config file out of them to keep them simpler (and faster). So any changes to the config have to be first "compiled", and the access control programs use this "compiled" version of the config. (The compile step also refreshes ~/.ssh/authorized_keys).

If you choose the "easy install" method, all this is quite transparent to you anyway. If you cannot use the easy install and must install manually, I have clear instructions on how to set it up.