

virtual refs

IMPORTANT: fallthru is success in VREFs, unlike the normal refs. That won't make sense until you read further, but I had to put it up here for folks who stop reading halfway!


Here's an example to start you off.

repo    r1
    RW+                         =   lead_dev dev2 dev3
    -   VREF/COUNT/9            =   dev2 dev3
    -   VREF/COUNT/3/NEWFILES   =   dev2 dev3

Now dev2 and dev3 cannot push changes that affect more than 9 files at a time, nor those that have more than 3 new files.

Another example is detecting duplicate pubkeys in a push to the admin repo:

repo gitolite-admin
    # ... normal rules ...
    -   VREF/DUPKEYS            =   @all

rule matching recap

You won't get any joy out of this if you don't understand at least [refex][]es and how [rules][] are processed.

But VREFs have one very important difference from normal rules. With VREFs, a fallthru results in success. You'll see why this is more convenient as you read on.


what is a virtual ref

A ref like refs/heads/master is the main property of a push that gitolite uses to make its yes/no decision. I call this a "real" ref.

Any other property of the push that you want to use to help in the decision is therefore a virtual ref. This could be a property that git knows about, like the number of files changed in the example above, or one that comes from outside git, like the current time; see the examples section later for some ideas.

#vref-fallthru fallthru is success here

Notice that you didn't need to add an RW+ VREF/... rule for user lead_dev in our example. This section explains why.

Virtual refs are best used as additional "deny" rules, performing extra checks that core gitolite cannot.

If fallthru were a failure, you would have to add rules for all users, instead of just the ones who should have those extra checks. Worse, since every virtual ref involves calling an external program, many of these calls would be wasted.

There's another advantage to doing it this way: a VREF can choose to simply die if things look bad, and it will have the same effect, assuming you used the VREF only in "deny" rules.

This in turn means any existing update hook can be used as a VREF as-is, as long as it (a) prints nothing on success and (b) dies on failure. See the email-check and dupkeys examples later.

how it works -- overview

Briefly, a refex starting with VREF/FOO triggers a call to a program called FOO in $GL_BINDIR/VREF.

That program is expected to print zero or more lines to its STDOUT; each line that starts with VREF/ is taken by gitolite as a new "ref" to be matched against all the refexes for this user in the config, including, of course, the refex that caused the vref call.

Normally, you send back the refex itself, if the test determines that the rule should be matched, otherwise nothing. So, in our example, we print VREF/COUNT/9 if the count was indeed greater than 9. Otherwise we just exit.
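As a rough illustration of that convention, here is a minimal shell sketch. The function name emit_if_over and the message text are made up for this example; it is not the COUNT code shipped with gitolite.

```shell
#!/bin/sh
# Hypothetical sketch of the "print the refex or print nothing" convention.

# emit_if_over COUNT LIMIT REFEX
# Prints "REFEX (message)" when COUNT is greater than LIMIT, so that the
# rule which invoked the VREF matches. Prints nothing otherwise.
emit_if_over() {
    count=$1
    limit=$2
    refex=$3
    if [ "$count" -gt "$limit" ]; then
        echo "$refex (changed $count files, limit is $limit)"
    fi
    return 0    # a VREF that merely reports should still exit successfully
}
```

Called as `emit_if_over 12 9 VREF/COUNT/9`, this prints one line starting with VREF/COUNT/9; called with a count of 3 it prints nothing, and the rule does not match.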

how it works -- details

  • The VREF code is only called if there are any VREF rules for the user, which means when the lead developer pushes, the VREF is not called at all.

    Side note: this is enormously more efficient than adding additional update hooks, which will get executed whether they are needed or not, for every repo and every user!

  • When dev2 or dev3 push, gitolite first checks the real ref (refs/heads/master or whatever). After this it looks at VREF rules, and calls an external program for every one it finds. Specifically, in a line like

       -   VREF/COUNT/3/NEWFILES    =   user
    

    COUNT is the vref name, so the program called is $GL_BINDIR/VREF/COUNT.

    The program is passed nine arguments in this case (see next section for details).

  • The script can print anything it wants to STDOUT. Lines not starting with VREF/ are printed as is (so your VREF can do mostly-normal printing to STDOUT).

    For lines starting with VREF/, the first word in each such line will be treated as a virtual ref to be matched against all the rules, while the rest, if any, is a message to be added to the standard "...DENIED..." message that gitolite prints if that refex matches.

    Usually it only makes sense to either

    • Print nothing that starts with VREF/ -- if you don't want the rule that triggered it to match (i.e., the condition being tested was not violated; for example, the count of changed files did not exceed 9 in our earlier example).
    • Print the refex itself (plus an optional message), so that it matches the line which invoked it.
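The way a VREF's output lines are treated, as described above, can be pictured with the following shell sketch. Gitolite does this internally; the function classify_line and its vref=/msg=/print= output format are hypothetical and exist only to illustrate the rule.

```shell
#!/bin/sh
# Illustration only: how one line of VREF output is interpreted.

# classify_line LINE
# A line starting with VREF/ is split into a virtual ref (the first word)
# and an optional message (the rest). Any other line is passed through.
classify_line() {
    case "$1" in
        VREF/*)
            ref=${1%% *}        # everything before the first space
            msg=${1#"$ref"}     # whatever follows the ref
            msg=${msg# }        # drop the separating space, if any
            echo "vref=$ref msg=$msg"
            ;;
        *)
            echo "print=$1"
            ;;
    esac
}
```

For example, `classify_line "VREF/COUNT/9 too many files"` reports the ref VREF/COUNT/9 with the message "too many files", while `classify_line "checking files"` is reported as plain output.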

arguments passed to the vref code

  • Arguments 1, 2, 3: the 'ref', 'oldsha', and 'newsha' that git passed to the update hook (see 'man githooks').

    This, combined with the fact that non-zero exits are detected, means that you can simply use an existing update.secondary as a new VREF as-is, no changes needed.

  • Arguments 4 and 5: the 'oldtree' and 'newtree' SHAs. These are the same as the oldsha and newsha values, except if one of them is all-0 (indicating a ref creation or deletion). In that case the corresponding 'tree' SHA is set (by gitolite, as a courtesy) to the special SHA 4b825dc642cb6eb9a060e54bf8d69288fbee4904, which is the hash of an empty tree.

    (None of these shenanigans would have been needed if git diff $oldsha $newsha did not error out when passed an all-0 SHA.)

  • Argument 6: the attempted access flag. Typically W or +, but could also be C, D, or any of these 4 followed by M. If you have to ask what they mean, you haven't read enough gitolite documentation to be able to make virtual refs work.

  • Argument 7: the entire refex; in our example VREF/COUNT/3/NEWFILES.

  • Arguments 8 onward: the portions of the refex, split on /, excluding the first two components. In our example they would be 3 followed by NEWFILES.

Yes, argument 7 is redundant if you have 8 and 9. It's meant to make it easy to write vref scripts in any language. See script examples in source.
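To make the argument positions concrete, here is a small shell sketch that gives each one a name. The function describe_args and the variable names (aa for the access flag, and so on) are chosen for this example and are not mandated by gitolite.

```shell
#!/bin/sh
# Hypothetical skeleton: name the arguments a VREF program receives.

# describe_args "$@"
describe_args() {
    if [ "$#" -lt 7 ]; then
        echo "expected at least 7 arguments, got $#" >&2
        return 1
    fi
    ref=$1          # the real ref being pushed
    oldsha=$2       # old commit SHA (all-0 on ref creation)
    newsha=$3       # new commit SHA (all-0 on ref deletion)
    oldtree=$4      # same as oldsha, or the empty tree if oldsha is all-0
    newtree=$5      # same as newsha, or the empty tree if newsha is all-0
    aa=$6           # attempted access flag, such as W or +
    refex=$7        # the complete refex that triggered this call
    shift 7         # what remains are the refex portions after the vref name
    echo "ref=$ref old=$oldsha new=$newsha oldtree=$oldtree newtree=$newtree aa=$aa refex=$refex extra=$*"
}
```

A real VREF would typically start with the same assignments at the top of the script and then use only the values it needs.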

what (else) can the vref code pass back

Actually, the vref code can pass anything back; each line in its output that starts with VREF/ will be matched against all the rules as usual (with the exception that fallthru is not failure).

For example, you could have a ruleset like this:

repo r1
    # ... normal rules ...

    -   VREF/TIME/WEEKEND       =   @interns
    -   VREF/TIME/WEEKNIGHT     =   @interns
    -   VREF/TIME/HOLIDAY       =   @interns

and you could write the TIME vref code to pass back any or all of the times that match. Then if an intern tried to access the system, each rule would trigger a call to $GL_BINDIR/VREF/TIME.

The script should send back any of the applicable times (even more than one, or none at all, as the case may be). So even if it was invoked using the first rule, it might pass back (to gitolite) a virtual ref saying 'VREF/TIME/HOLIDAY', which would promptly cause the request to be denied.
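A TIME vref along these lines might look like the following shell sketch. Everything specific in it is an assumption made for illustration: the function name time_vrefs, the choice of Saturday and Sunday as the weekend, and 20:00 to 06:00 as "night". HOLIDAY is left out because it would need a calendar of some kind.

```shell
#!/bin/sh
# Hypothetical TIME vref logic; the weekend and night boundaries are assumed.

# time_vrefs DAY HOUR
# DAY is 1..7 (Monday..Sunday, as printed by `date +%u`); HOUR is 0..23.
# Prints one virtual ref for each time category that applies.
time_vrefs() {
    day=$1
    hour=$2
    if [ "$day" -ge 6 ]; then
        echo "VREF/TIME/WEEKEND (no pushes on weekends)"
    elif [ "$hour" -ge 20 ] || [ "$hour" -lt 6 ]; then
        echo "VREF/TIME/WEEKNIGHT (no pushes at night)"
    fi
    return 0
}
```

In an actual script the inputs would come from the clock, for example `hour=$(date +%H); time_vrefs "$(date +%u)" "${hour#0}"`, where the `${hour#0}` strips a leading zero before the numeric comparison. Note that the output does not depend on which of the three rules caused the call.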

VREFs shipped with gitolite

#NAME restricting pushes by dir/file name

The "NAME" VREF allows you to restrict pushes by the names of dirs and files changed.

Here's an example. Say you don't want junior developers pushing changes to the Makefile, because it's quite complex:

repo foo
        RW+                             =   @senior_devs
        RW                              =   @junior_devs

        -   VREF/NAME/Makefile          =   @junior_devs

When a senior dev pushes, the VREF is not invoked at all. But when a junior dev pushes, the VREF is invoked, and it returns a list of files changed as refs, looking like this:

VREF/NAME/file-1
VREF/NAME/dir-2/file-3
...etc...

Each of these refs is matched against the access rules. If one of them happens to be the Makefile, then the ref returned (VREF/NAME/Makefile) will match the deny rule and kill the push.
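The conversion from changed file names to virtual refs can be sketched in shell as below. This is an illustration, not gitolite's own NAME handling; the function name names_to_vrefs is invented here, and file names containing spaces are not handled, since only the first word of each line is treated as the ref.

```shell
#!/bin/sh
# Hypothetical sketch: turn a list of changed paths into VREF/NAME/ refs.

# names_to_vrefs
# Reads one path per line on stdin and prints VREF/NAME/<path> for each.
names_to_vrefs() {
    while IFS= read -r name; do
        if [ -n "$name" ]; then
            echo "VREF/NAME/$name"
        fi
    done
    return 0
}
```

Inside a VREF, the input could come from something like `git diff --name-only "$4" "$5" | names_to_vrefs`, using the oldtree and newtree arguments.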

Another way to use this is when you know what is allowed instead of what is not allowed. Let's say the QA person is only allowed to touch a file called CHANGELOG and any files in a directory called ReleaseNotes:

repo foo
        RW+                             =   @senior_devs
        RW                              =   @junior_devs
        RW+                             =   QA-guy

        RW+ VREF/NAME/CHANGELOG         =   QA-guy
        RW+ VREF/NAME/ReleaseNotes/     =   QA-guy
        -   VREF/NAME/                  =   QA-guy

number of new files

The COUNT VREF is used like this:

-   VREF/COUNT/9                    =   @junior-developers

In response, if anyone in the user list pushes a commit series that changes more than 9 files, a vref of "VREF/COUNT/9" is returned. Gitolite uses that as a "ref" to match against all the rules, hits the same rule that invoked it, and denies the request.

If the user did not push more than 9 files, the VREF code returns nothing, and nothing happens.

COUNT can take one more argument:

-   VREF/COUNT/9/NEWFILES           =   @junior-developers

This works the same way, but the push is denied only if it adds more than 9 new files, rather than merely changing more than 9 files.
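One way to obtain the two kinds of count is sketched below in shell. It is not the shipped COUNT script; the helper names diff_filter_for and count_names are invented for this example, and the only git feature relied on is the `--diff-filter=A` option of git diff, which restricts the diff to added files.

```shell
#!/bin/sh
# Hypothetical helpers for a COUNT-like vref.

# diff_filter_for MODE
# Prints the extra git-diff option for the optional last refex component:
# NEWFILES limits the diff to added files; anything else adds no option.
diff_filter_for() {
    if [ "$1" = "NEWFILES" ]; then
        echo "--diff-filter=A"
    fi
    return 0
}

# count_names
# Reads path names on stdin and prints how many distinct non-empty ones
# there are.
count_names() {
    grep . | sort -u | wc -l | tr -d ' '
}
```

A script could combine them as `git diff --name-only $(diff_filter_for "$9") "$4" "$5" | count_names` and compare the result against the limit in argument 8. The command substitution is deliberately unquoted so that an empty result adds no argument.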

advanced filetype detection

Note: this is more for illustration than use; it's rather specific to one of the projects I manage but the idea is the important thing.

Sometimes a file has a standard extension (that cannot be 'gitignore'd), but it is actually automatically generated. Here's one way to catch it:

 -   VREF/FILETYPE/AUTOGENERATED     =   @all

You can look at src/VREF/FILETYPE to see how it handles the 'AUTOGENERATED' option. You could also have a more generic option, like perhaps BINARY, and handle that in the FILETYPE vref too.

checking author email

Some people want to ensure that "you can only push your own commits".

If you force it on everyone, this is a very silly idea (see "Philosophical Notes" section of src/VREF/EMAIL-CHECK).

But there may be value in enforcing it just for the junior developers.

The neat thing is that the existing contrib/update.email-check was just copied to src/VREF/EMAIL-CHECK and it works, because VREFs get the same first 3 arguments and those are all that it cares about. (Note: you have to change one subroutine in that script if you want to use it.)

catching duplicate pubkeys

This checks keydir/ for duplicate keys and aborts the push if it finds any. You should use this only on the gitolite-admin repo.

We covered this as a teaser example at the start.

other ideas -- code welcome!

"no non-merge first-parents"

Shruggar on #gitolite wanted this. Possible code to implement it would be something like this (untested):

[ -z "$(git rev-list --first-parent --no-merges $2..$3)" ]

This can be implemented using src/VREF/MERGE-CHECK as a model. That script does what the 'M' qualifier does in access rules (see last part of [this][write-types]), although the syntax to be used in conf/gitolite will be quite different.

other ideas for VREFs

Here are some more ideas:

  • Number of commits (git rev-list --count $old..$new).
  • Number of binary files in commit (currently I only know to count occurrences of Bin in the output of git diff --stat).
  • Number of new binary files (count Bin 0 -> in git diff --stat output).
  • Time of day/day of week (see example snippet somewhere above).
  • IP address.
  • Phase of the moon.

Note that pretty much anything that involves $oldsha..$newsha will have to deal with the issue that when you push a new tag or branch, the "old" part is all 0's, and unless you consider --all existing branches and tags it becomes meaningless in terms of "number of new files" etc.
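A common way to cope with the all-0 case is to ask for commits reachable from the new SHA but not from any existing ref. The shell sketch below picks the rev-list arguments accordingly. The function names are made up for this example, and it assumes the code runs from the update hook, before the pushed ref itself has been updated.

```shell
#!/bin/sh
# Hypothetical helper: choose rev-list arguments for "what is new here".

# is_zero_sha SHA
# Succeeds if SHA is non-empty and consists only of the digit 0.
is_zero_sha() {
    [ -n "$1" ] && [ -z "$(printf '%s' "$1" | tr -d 0)" ]
}

# revlist_args OLDSHA NEWSHA
# For a new branch or tag (all-0 OLDSHA), selects commits reachable from
# NEWSHA but not from any existing ref; otherwise the usual OLD..NEW range.
revlist_args() {
    if is_zero_sha "$1"; then
        echo "$2 --not --all"
    else
        echo "$1..$2"
    fi
}
```

It would be used as `git rev-list --count $(revlist_args "$2" "$3")`. Deletions, where the new SHA is all-0, are not covered and would need a separate check.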