gitolite/doc/3-faq-tips-etc.mkd

# assorted faqs, tips, and notes on gitolite

In this document:

  * common errors and mistakes
  * git version dependency
  * other errors, warnings, notes...
      * ssh-copy-id
      * cloning an empty repo
      * `@all` syntax for repos
      * umask setting
  * getting a tar file from a clone
  * features
      * syntax and normal usage
          * simpler syntax
          * one user, many keys
      * security, access control, and auditing
          * two levels of access rights checking
          * better logging
          * "exclude" (or "deny") rules
          * the "D" permission -- separating delete and rewind rights
          * file/dir NAME based restrictions
          * delegating parts of the config file
      * convenience features
          * what repos do I have access to?
          * error checking the config file
          * including config lines from other files
          * support for git installed outside default PATH
          * "personal" branches
          * custom hooks and custom git config
      * helping with gitweb
          * easier to specify gitweb "description" and gitweb/daemon access
          * easier to link gitweb authorisation with gitolite
      * advanced features
          * repos named with wildcards
          * access control for external commands
  * design choices
      * keeping the parser and the access control separate

## common errors and mistakes

  * adding `repositories/` at the start of the repo name in the `git clone`.
    This error is typically made by the *admin* himself -- because he knows
    what `$REPO_BASE` is set to and thinks he has to provide that prefix on
    the client side also :-)  In fact gitolite prepends `$REPO_BASE`
    internally, so you shouldn't also do the same thing!

  * being able to clone but getting errors on push.  Most likely caused by a
    combination of:

      * you already have shell access to the server, not just "gitolite"
        access, *and*

      * you cloned using `git clone git@server:repositories/repo.git` (notice
        there's an extra "repositories/" in there?)

    In other words, you used a key that completely bypassed gitolite and went
    straight to the shell to do the clone.

    Please see doc/6-ssh-troubleshooting.mkd for what all this means.

## git version dependency

Here's a workaround for a version dependency that the normal flow of gitolite
has.

When you edit your config file to create a new repo, and push the changes to
the server, gitolite creates an empty, bare repo for you.  Normally, you're
expected to clone this on the client side, and start working -- make your
first commit(s), then push, etc.

However, cloning an empty repo requires a server side git version that is at
least 1.6.2.  Gitolite detects this when creating a repo, and warns you.

The workaround is to use the older (gitosis-style) method on the client:
create an empty repo locally, make a commit or two, set an "origin" remote,
and then push.  Something like:

    mkdir my-new-project
    cd    my-new-project
    git init
    git commit --allow-empty -m 'Initial repository'
    # or, if your client side git is too old for --allow-empty, just make some
    # files, "git add" them, then "git commit"
    git remote add origin git@gitolite-server:my-new-project.git
    git push origin master:master

Once this is done, the repo is available for cloning by anyone else in the
normal way, since it's not empty anymore.

## other errors, warnings, notes...

### ssh-copy-id

don't have `ssh-copy-id`?  This is broadly what that command does, if you want
to replicate it manually.  The input is your pubkey, typically
`~/.ssh/id_rsa.pub` from your client/workstation.

  * it copies it to the server as some file

  * it appends that file to `~/.ssh/authorized_keys` on the server
    (creating it if it doesn't already exist)

  * it then makes sure that all these files/directories have go-w perms
    set (assuming user is "git"):

        /home/git/.ssh/authorized_keys
        /home/git/.ssh
        /home/git

[Actually, `sshd` requires that even directories *above* `~` (`/`, `/home`,
typically) also must be `go-w`, but that needs root.  And typically
they're already set that way anyway.  (Or if they're not, you've got
bigger problems than gitolite install not working!)]

### cloning an empty repo

Cloning an empty repo is only possible with clients greater than 1.6.2.  So at
least one of your clients needs to have a recent git.  Once at least one
commit has been made, older clients can also use it

When you clone an empty repo, git seems to complain about `fatal: The remote
end hung up unexpectedly`.  However, you can ignore this, since it doesn't
seem to hurt anything.  [Update 2009-09-14; this has been fixed in git
1.6.4.3]

### `@all` syntax for repos

There *is* a way to use the `@all` syntax for repos also, as described in
`conf/example.conf`.  However, there are a couple of minor cautions:

  * don't use `NAME/` or such restrictions on the special `@all` repo.  Due to
    the potential for defeating a crucial optimisation and slowing down *all*
    access, we do not support this.
  * don't try giving `@all` users some permission for `@all` repos

### umask setting

Gitweb not able to read your repos?  You can change the umask for newly
created repos to something more relaxed -- see the `~/.gitolite.rc` file

## getting a tar file from a clone

You can clone the repo from github or indefero, then execute a make command to
extract a tar file of the branch you want.  Please use the make command, not a
plain "git archive", because the Makefile adds a file called
`.GITOLITE-VERSION` that will help you identify which version you are using.

    git clone git://github.com/sitaramc/gitolite.git
            # (OR)
    git clone git://sitaramc.indefero.net/sitaramc/gitolite.git
    cd gitolite
    make master.tar
    # or maybe "make pu.tar"

<a name="features"></a>

## features

Apart from the big ones listed in the top level README, and subjective ones
like "better config file format", gitolite has evolved to have many useful
fearures than the original goal of "gitosis + branch-level access control".

### syntax and normal usage

<a name="simpler_syntax"></a>

#### simpler syntax

The basic syntax is simpler and cleaner but it goes beyond that: **you can
specify access in bits and pieces**, even if they overlap.

Some access needs are best grouped by repo, some by username, and some by
both.  So just do all of them, and gitolite will combine all the access lists!
Here's an example:

    # define groups of people
    @bosses     = phb1 phb2 phb3
    @devs       = dev1 dev2 dev3
    @interns    = int1 int2 int3

    # define groups of projects
    @open       = git gitolite linux rakudo
    @closed     = c1 c2 c3
    @topsecret  = ts1 ts2 ts3

    # all bosses have read access to all projects
    repo @open @closed @topsecret
        R   =   @bosses

    # everyone has read access to "open" projects
    repo @open
        R   =   @bosses @devs @interns

    [...or any other combination you want...]

    # later in the file:

    # specify access for individual repos (like RW, RW+, etc)
    repo c1
        [...]

    [...etc...]

If you notice that `@bosses` are given read access to `@open` via both rules,
do not worry that this causes some duplication or inefficiency.  It doesn't
:-)

See the "specify gitweb/daemon access" section below for one more example.

<a name="multikeys"></a>

#### one user, many keys

I have a laptop and a desktop I need to access the server from.  I have
different private keys on them, but as far as gitolite is concerned both of
them should be treated as "sitaram".  How does this work?

In gitosis, the admin creates a single "sitaram.pub" containing one line for
each of my pubkeys.  In gitolite, we keep them separate: "sitaram@laptop.pub"
and "sitaram@desktop.pub".  The part before the "@" is the username, so
gitolite knows these two keys belong to the same person.

Note that you don't say "sitaram@laptop" and so on in the **config** file --
as far as the config file is concerned there's just **one** user called
"sitaram" -- so you only say "sitaram" there.

I think this is easier to maintain if you have to delete or change one of
those keys.

However, now that `sitaramc@gmail.com` is also a valid username, we need to
distinguish between `sitaramc@gmail.com.pub` and `sitaramc@desktop.pub`.  We
do that by requiring that the multi-key suffix you use (like "desktop" and
"laptop") should not have a `"."` in it.  If it does, it looks like an email
address.  The following table lists sample pubkey filenames and the
corresponding derived usernames (which is what goes into the
`conf/gitolite.conf` file):

  * old style multikeys; not mistaken for emails because there is no "." in
    hostname part

        sitaramc.pub                            sitaramc
        sitaramc@laptop.pub                     sitaramc
        sitaramc@desktop.pub                    sitaramc

  * new style, email keys; there is a "." in hostname part; so it's an email
    address

        sitaramc@gmail.com.pub                  sitaramc@gmail.com

  * multikeys *with* email address

        sitaramc@gmail.com@laptop.pub           sitaramc@gmail.com
        sitaramc@gmail.com@desktop.pub          sitaramc@gmail.com

### security, access control, and auditing

<a name="two_levels"></a>

#### two levels of access rights checking

Gitolite has two levels of access checks.  The **first check** is what I will
call the **pre-git** level (this is the only check that gitosis has).  At this
stage, the `gl-auth-command` has been invoked by `sshd`, and it knows just
three things:

  * who,
  * what repository, and
  * what type of access (R or W)

Note that at this point no git program has entered the picture, and we have no
way of knowing what **ref** (branch, tag, etc) he is trying to update, even if
it is a "write" operation.

For a "read" operation to pass this check, the username (or `@all`) must have
read permission (i.e., R, RW, or RW+) on at least one branch of the repo.

For a "write" operation, there is an additional restriction: lines specifying
only `R` (read access) don't count.  *The user must have write access to
**some** ref in the repo in order to pass this stage!*

The **second check** is via a git `update hook`.  This check only happens for
write operations.  By this time we know what "ref" he is trying to update, as
well as the old and the new SHAs of that ref (by which we can also deduce
whether it's a rewind or not).  This is where the "per-branch" permissions
come into play.

Each refex that allows `W` access (or `+` if this is a rewind) for *this*
user, on *this* repo, is matched against the actual refname being updated.  If
any of the refexes match, the push succeeds.  If none of them match, it fails.

Gitolite also allows "exclude" or "deny" rules.  See later in this document
for details.

#### better logging

If you have been too liberal with the permission to rewind, it has built-in
logging as an emergency fallback if someone goes too far, or for audit
purposes [`*`].  The logfile names and location are configurable, and can
include the year/month/day etc in the filename for easy archival or further
processing.  The log file even tells you which pattern in the config file
matched to allow that specific access to proceed.

>   [`*`] setting `core.logAllRefUpdates true` does provide a safety net
>   against over-zealous rewinds, but it does not tell you "who".  And
>   strangely, management does not seem to share the view that "blame" is just
>   a synonym for "annotate" ;-)]

The log lines look like this:

    2009-09-19.10:24:37  +  b4e76569659939  4fb16f2a88d8b5  myrepo refs/heads/master       user2   refs/heads/master

The "+" at the start indicates a non-fast forward update, in this case from
b4e76569659939 to 4fb16f2a88d8b5.  So b4e76569659939 is the one to restore!
Can it get easier?

The other parts of the log line are the name of the repo, the refname being
updated, the user updating it, and the refex pattern (from the config file)
that matched, in case you need to debug the config file itself.

#### "exclude" (or "deny") rules

Here is an illustrative explanation of "deny" rules.  However, please be sure
to read the "DENY/EXCLUDE RULES" section in `conf/example.conf` for important
notes/caveats before using "deny" rules.

Take a look at the following snippet, which *seems* to say that "bruce" can
write versioned tags (anything containing `refs/tags/v[0-9]`), but the other
staffers can't:

        @staff = bruce whitfield martin
                [... and later ...]
        RW refs/tags/v[0-9]     = bruce
        RW refs/tags            = @staff

But that's not how the matching works.  As long as any refex matches the
refname being updated, it's a "yes".  Since the second refex (which says
"anything containing `refs/tags`") is a superset of the first one, it lets
anyone on `@staff` create versioned tags, not just Bruce.

One way to fix this is to allow "excludes" -- some changes in syntax, combined
with a rigorous, ordered, interpretation would do it.

Let's recap the **existing semantics**:

>   the first matching refex that has the permission you're looking for (`W`
>   or `+`), results in success.  A fallthrough results in failure

Here are the **new semantics**, with changes from the "main" one in bold:

>   the first matching refex that has the permission you're looking for (`W`
>   or `+`) **or a minus (`-`)**, results in success **or failure,
>   respectively**.  A fallthrough **also** results in failure

So the example we started with becomes, if you use "deny" rules:

        RW refs/tags/v[0-9]     = bruce
        -  refs/tags/v[0-9]     = @staff
        RW refs/tags            = @staff

And here's how it works:

  * for non-version tags, only the 3rd rule matches, so anyone on staff can
    push them
  * for version tags by bruce, the first rule matches so he can push them
  * for version tags by staffers *other than bruce*, the second rule matches
    before the third one, and it has a `-` as the permission, so the push
    fails

#### the "D" permission -- separating delete and rewind rights

Since the beginning, `RW+` meant being able to rewind *or* delete a ref.  My
stand is that these two are fairly similar, and infact a rewind is almost the
same as a delete+push (the only difference I can see is if you had
core.logAllRefUpdates set, which is *not* a default setting).

However, there seem to be cases where it is useful to distinguish them --
situations where one of them should be restricted more than the other.
([Arguments][sdrr] exist for both sides: restrict delete more than rewind, and
vice versa).

So we now allow these two rights to be separated.  Just use the new `D`
permission anywhere in the config for the repo, and instantly all `RW+`
permissions (for that repo) cease to permit deletion of the ref matched.

This provides the *greatest* backward compatibility (if you don't specify any
`D` permissions, everything works just as before), while also enabling the new
semantics at the granularity of a repo, instead of the entire config.

Note 1: if you find that `RW+` no longer allows deletion but you can't see a
`D` permission in the rules, remember that gitolite allows a repo config to be
specified in multiple places for convenience, included delegated or included
files.  Be sure to search everywhere :)

Note 2: a quick way to make this the default for *all* your repos is:

    repo @all
        D   dummy-branch    =   foo

where foo can be either the administrator, or if you can ignore the warning
message when you push, a non-existant user.

[sdrr]: http://groups.google.com/group/gitolite/browse_thread/thread/9f2b4358ce406d4c#

#### file/dir NAME based restrictions

In addition to branch-name based restrictions, gitolite also allows you to
restrict what files or directories can be involved in changes being pushed.
This basically uses `git diff --name-only` to obtain the list of files being
changed, treating each filename as a "ref" to be matched.

Please see `conf/example.conf` for syntax and examples.

#### delegating parts of the config file

You can now split up the config file and delegate the authority to specify
access control for their own pieces.  See
[doc/5-delegation.mkd](http://github.com/sitaramc/gitolite/blob/pu/doc/5-delegation.mkd)
for details.

### convenience features

<a name="myrights"></a>

#### what repos do I have access to?

Sometimes there are too many repos, maybe even named similarly, or with the
potential for typos, confusion about hyphens/underscores or upper/lower case,
etc.  You'd just like a simple way to know what repos you have access to.

Easy!  Just use ssh and try to log in as if you were attempting to get a
shell:

    $ ssh gitolite info
    PTY allocation request failed on channel 0
    hello sitaram, the gitolite version here is v0.6-17-g94ed189
    you have the following permissions:
      R  W  Anu-WSD
      R     ROtest
      R  W  SecureBrowse
      R  W  entrans
      R  W  git-notes
      R  W  gitolite
      R  W  gitolite-admin
      R  W  indic_web_input
      R  W  proxy
      @  @  testing
      R  W  vkc

Note that until this version, we used to put out an ugly `need
SSH_ORIGINAL_COMMAND` error, just like gitosis used to.  All we did is put
that code path to better use :-)

#### error checking the config file

gitosis does not do any.  I just found out that if you mis-spell `members` as
`member`, gitosis will silently ignore it, and leave you wondering why access
was denied.

Gitolite "compiles" the config file first and keyword typos *are* caught so
you know right away.

#### including config lines from other files

See the entry under "INCLUDE SOME OTHER FILE" in `conf/example.conf`.

#### support for git installed outside default PATH

The normal solution is to add to the system default PATH somehow, either by
munging `/etc/profile` or by enabling `PermitUserEnvironment` in
`/etc/ssh/sshd_config` and then setting the PATH in `~/.ssh/.environment`.
All these are security risks because they allow a lot more than just you and
your git install :-)

And if you don't have root, you can't do this anyway.

The only solution till now has been to ask every client to set the config
parameters `remote.<name>.receivepack` and `remote.<name>.uploadpack`.  But
telling *every* client to do so is a pain...

Gitolite lets you specify the directory in which git binaries are to be found,
via a new variable (`$GIT_PATH`) in the "rc" file.  If this variable is
non-empty, it will be appended to the PATH environment variable before
attempting to run git stuff.

Very easy, very simple, and completely transparent to the users :-)

#### "personal" branches

"personal" branches are great for corporate environments, where
unauthenticated pull/clone is a no-no.  Since a dev workstation cannot do
authentication, even work shared just between 2 devs has to go *via* the
server.  This causes the same branch name clutter as in a centralised VCS,
plus setting up permissions for this becomes a chore for the admin.

gitolite lets you define a "personal" or "scratch" namespace prefix for each
developer (e.g., `refs/personal/<devname>/*`).  Just add a line like:

        RW+ personal/USER/      =   @userlist

This means I (user "sitaram") can do anything to any branch whose name starts
with `personal/sitaram/` assuming I'm in "userlist".

You can have any number of such lines with different prefixes (for example,
using topic names instead of "personal") or even suffixes if you like.  The
important thing is that the "branch" name should contain `/USER/` (including
the slashes).  At runtime this will match whoever is the current user.  Access
is still determined by the right hand side of course.

#### custom hooks and custom git config

You can specify hooks that you want to propagate to all repos, as well as
per-repo "gitconfig" settings.  Please see `doc/2-admin.mkd` and
`conf/example.conf` for details.

<a name="gitweb"></a>

### helping with gitweb

Although gitweb is a completely separate program, gitolite can do quite a
lot to help you manage gitweb access as well; once the initial setup is
complete, you can do it all from within the gitolite config file!

#### easier to specify gitweb "description" and gitweb/daemon access

To enable access to a repo via gitweb *and* create a "description" for it to
show up on the webpage, just add a line like this, anywhere in the config
file:

    reponame = "one line of description"

You can also specify an "owner":

    reponame "owner name" = "one line of description"

To enable access to one or more repos via git daemon, just give "read"
permissions to the special username `daemon`.

There is also a special user called `gitweb` to specify gitweb access; useful
if you don't care about specifying individual descriptions for each repo and
just want to quickly enable gitweb access to one or more repos.

Remember gitolite lets you specify the access control specs in bits and
pieces, so you can keep all the daemon/gitweb access in one place, even if
each repo has more specific branch-level access config specified elsewhere.
Here's an example, using really short reponames because I'm lazy:

    # maybe near the top of the file, for ease of access:

    @only_web       = r1 r2 r3
    @only_daemon    = r4 r5 r6
    @web_and_daemon = r7 r8 r9

    repo @only_web
        R   = gitweb
    repo @only_daemon
        R   = daemon
    repo @web_and_daemon
        R   = gitweb
        R   = daemon

    # ...maybe much later in the file:

    repo r1
        # normal developer access lists for r1 and its branches/tags in the
        # usual way

    repo r2
    # ...and so on...

<a name="gitwebauth"></a>

#### easier to link gitweb authorisation with gitolite

Over and above whether a repo is even *shown* by gitweb, you may want to
further restrict people, allowing them to view *only* those repos for which
they have been given read access by gitolite.

This requires that:

  * you have to have some sort of HTTP auth on your web server (out of my
    scope, sorry!)
  * the HTTP auth should use the same username (like "sitaram") as used in the
    gitolite config (for the corresponding user)

Normally a superuser sets up passwords for users using the "htpasswd" command,
but this is an administrative chore.

Robin Smidsrød had the *great* idea that, since each user already has pubkey
access to `git@server`, this gives us a very neat way of using gitolite to let
the users *manage their own HTTP passwords*.  Here's how:

  * setup apache so that the htaccess file it looks for is owned by the "git"
    user
  * in the `~/.gitolite.rc` file, look for the variable `$HTPASSWD_FILE` and
    point it to this file
  * tell your users to type in `ssh git@server htpasswd` to set or change
    their HTTP passwords

Here's the rest of how it hangs together.

Gitweb allows you to specify a subroutine to decide on access.  We use that
feature and tie it to gitolite.  Sample code (untested by me, but others do
use it, munged from something I saw [here][leho]) is given below.

Note the **utter simplicity** of the actual check (just 1 line!).  This is an
unexpected piece of luck coming from the decision to keep the config parse
separate from the actual access control.  The config parser puts a pure perl
hash in that file named below as `$gl_conf_compiled`, so all the parsing is
already done and we just use it!

    # completely untested... but the basic idea should work fine

    # change these as needed
    # projectroot should be the same as gitolite's REPO_BASE, but converted to
    # an absolute path
    $projectroot = '/home/git/repositories/';
    my $gl_conf_compiled = '/home/git/.gitolite/conf/gitolite.conf-compiled.pm';

    # I am told this gives us the HTTP auth username
    my $username = $cgi->remote_user;

    # ----------

    # parse the config file; updates %repos hash
    our %repos;
    die "parse $gl_conf_compiled failed: " . ($! or $@) unless do $gl_conf_compiled;

    # this is gitweb's mechanism; it calls whatever sub is pointed at by this
    # variable to decide access yes/no.  Gitweb calls it with one argument
    # containing the full path of the repo being accessed
    $export_auth_hook = sub {
        my $reponame = shift;
        # take the full path provided, strip the beginning...
        $reponame =~ s/\Q$projectroot\E\/?//;
        # ...and the end, to get the repo name as it is specified in gitolite conf
        $reponame =~ s/\.git$//;

        return exists $repos{$reponame}{R}{$username}
            || exists $repos{$reponame}{R}{'@all'};
    };


[leho]: http://leho.kraav.com/news/2009/10/27/using-apache-authentication-with-gitweb-gitosis-repository-access-control/

### advanced features

#### repos named with wildcards

Please see `doc/4-wildcard-repositories.mkd` for all the details.

#### access control for external commands

Gitolite now has a mechanism for allowing access control for arbitrary
external commands, as long as they are invoked via ssh and present a
server-side command that contains enough information to make an access control
decision.  The first (and only, so far) such command implemented is rsync.

Note that this is incompatible with giving people shell access as described in
`doc/6-ssh-troubleshooting.mkd` -- people who have shell access are not
subject to this mechanism (it wouldn't make sense to try and control someone
who has shell access anyway).

Please see the config files (both of them) for examples and usage.

## design choices

### keeping the parser and the access control separate

There are two programs concerned with access control:

  * `gl-auth-command`, the program that is run via `~/.ssh/authorized_keys`;
    this decides whether git should even be allowed to run (basic R/W/no
    access).  (This one cannot decide on the branch-level access; it is not
    known at this point what branch is being accessed)
  * the update-hook on each repo, which decides the per-branch permissions

I have chosen to keep the relatively complex task of parsing the config file
out of them to keep them simpler (and faster).  So any changes to the config
have to be first "compiled", and the access control programs use this
"compiled" version of the config.  (The compile step also refreshes
`~/.ssh/authorized_keys`).

If you choose the "easy install" method, all this is quite transparent to you
anyway.  If you cannot use the easy install and must install manually, I have
clear instructions on how to set it up.