2010-05-14 14:50:57 +02:00
# what is a "big-config"

In this document:

  * <a href="#_when_why_do_we_need_it_">when/why do we need it?</a>
  * <a href="#_how_do_we_use_it_">how do we use it?</a>
  * <a href="#_other_optimisations">other optimisations</a>
      * <a href="#_disabling_various_defaults">disabling various defaults</a>
      * <a href="#_optimising_the_authkeys_file">optimising the authkeys file</a>
  * <a href="#_what_are_the_downsides_">what are the downsides?</a>
  * <a href="#_storing_usergroup_information_outside_gitolite_like_in_LDAP_">storing usergroup information outside gitolite (like in LDAP)</a>
      * <a href="#_why">why</a>
      * <a href="#_how">how</a>
  * <a href="#_implementation_notes">implementation notes</a>

<a name="_when_why_do_we_need_it_"></a>

### when/why do we need it?

A "big config" is anything that has a few thousand users and a few thousand
repos, organised into groups that are much smaller in number (like maybe a few
hundred repogroups and a few dozen usergroups).

So let's say you have

    @wbr = lynx firefox
    @devs = alice bob

    repo @wbr
        RW+ next = @devs
        RW master = @devs

Gitolite internally translates this to

    repo lynx firefox
        RW+ next = alice bob
        RW master = alice bob

Not just that: it now generates the actual config rules once for each
user-repo-ref combination (there are 8 combinations above). The compiled config
file looks partly like this:

    %repos = (
      'firefox' => {
        'R' => {
          'alice' => 1,
          'bob' => 1
        },
        'W' => {
          'alice' => 1,
          'bob' => 1
        },
        'alice' => [
          {
            'refs/heads/next' => 'RW+'
          },
          {
            'refs/heads/master' => 'RW'
          }
        ],
        'bob' => [
          {
            'refs/heads/next' => 'RW+'
          },
          {
            'refs/heads/master' => 'RW'
          }
        ]
      },
      'lynx' => {
        'R' => {
          'alice' => 1,
          'bob' => 1
        },
        'W' => {
          'alice' => 1,
          'bob' => 1
        },
        'alice' => [
          {
            'refs/heads/next' => 'RW+'
          },
          {
            'refs/heads/master' => 'RW'
          }
        ],
        'bob' => [
          {
            'refs/heads/next' => 'RW+'
          },
          {
            'refs/heads/master' => 'RW'
          }
        ]
      }
    );

Phew!

You can imagine what that does when you have 10,000 users and 10,000 repos.
Let's just say it's not pretty :)

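
To get a feel for how quickly this grows, the arithmetic can be sketched in shell. This is only a rough worst-case estimate: it assumes every user has a rule for every ref on every repo, which a real site may not. The name `count_rules` is illustrative and not part of gitolite.

```shell
#!/bin/sh
# Rough worst-case count of user-repo-ref rule entries in the compiled
# config when groups are fully expanded.
# Usage: count_rules USERS REPOS REFS   (hypothetical helper)
count_rules() {
    echo $(( $1 * $2 * $3 ))
}

count_rules 2 2 2           # the lynx/firefox example: prints 8
count_rules 10000 10000 2   # prints 200000000
```

Two hundred million entries is the kind of compiled file the paragraph above is warning about.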
<a name="_how_do_we_use_it_"></a>

### how do we use it?

Now, if you had all those 10,000 users and repos explicitly listed (no
groups), then nothing can help. But if you use groups, as in the example
above, there is hope.

Just set

    $GL_BIG_CONFIG = 1;

in the `~/.gitolite.rc` file on the server (see next section for more
variables). When you do that, and push this configuration, the compiled file
looks like this:

    %repos = (
      '@wbr' => {
        '@devs' => [
          {
            'refs/heads/next' => 'RW+'
          },
          {
            'refs/heads/master' => 'RW'
          }
        ],
        'R' => {
          '@devs' => 1
        },
        'W' => {
          '@devs' => 1
        }
      },
    );
    %groups = (
      '@devs' => {
        'alice' => 'master',
        'bob' => 'master'
      },
      '@wbr' => {
        'firefox' => 'master',
        'lynx' => 'master'
      }
    );

That's a lot smaller, and allows orders of magnitude more repos and groups to
be supported.

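
As a rough illustration of why the grouped form scales better, the sketch below counts rule entries per group pair and adds one membership entry per user and per repo. It assumes each user and each repo belongs to exactly one group, and the figures 36 and 300 are made-up stand-ins for "a few dozen" and "a few hundred". `count_big_config` is a hypothetical helper, not a gitolite command.

```shell
#!/bin/sh
# Estimate entries in a big-config compiled file.
# Usage: count_big_config USERGROUPS REPOGROUPS REFS USERS REPOS
count_big_config() {
    rules=$(( $1 * $2 * $3 ))      # one ref rule per usergroup-repogroup-ref
    memberships=$(( $4 + $5 ))     # one %groups entry per user and per repo
    echo "rules=$rules memberships=$memberships"
}

count_big_config 1 1 2 2 2             # the example above: rules=2 memberships=4
count_big_config 36 300 2 10000 10000  # rules=21600 memberships=20000
```

Under these assumptions the compiled file grows with the number of groups rather than with the product of users and repos.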
<a name="_other_optimisations"></a>

### other optimisations

<a name="_disabling_various_defaults"></a>

#### disabling various defaults

The default RC file contains the following lines (we've already discussed the
first one):

    $GL_BIG_CONFIG = 0;
    $GL_NO_DAEMON_NO_GITWEB = 0;
    $GL_NO_CREATE_REPOS = 0;
    $GL_NO_SETUP_AUTHKEYS = 0;

`GL_NO_DAEMON_NO_GITWEB` is a very useful optimisation that you *must* enable
(set to 1) if you *do* have a large number of repositories, and do *not* use
gitolite's support for gitweb or git-daemon access (see "[easier to specify gitweb
description and gitweb/daemon access][gwd]" for details). This will save a
lot of time when you push the gitolite-admin repo with changes. This variable
also controls whether "git config" lines (such as `config hooks.emailprefix =
"[gitolite]"`) will be processed or not.

Setting this is relatively harmless to a normal installation, unlike the next
two variables :-) `GL_NO_CREATE_REPOS` and `GL_NO_SETUP_AUTHKEYS` are meant
for installations where some backend system already exists that does all the
actual repo creation, and all the authentication setup (ssh auth keys),
respectively.

Summary: Please **leave those two variables alone** unless your initials are
"JK" ;-)

Also note that using all 3 of the `GL_NO_*` variables will result in
*everything* after the config compile being skipped. In other words, gitolite
is being used **only** for its access control language.

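
Before changing any of these, it can help to see what the rc file currently sets. The following is a minimal sketch, assuming the rc file keeps the one-assignment-per-line layout of the default shown above; `show_gl_flags` is a made-up helper name.

```shell
#!/bin/sh
# Print the current big-config related assignments from an rc file.
# Commented-out lines are skipped because the pattern is anchored at the
# start of the line.
# Usage: show_gl_flags RCFILE   (hypothetical helper)
show_gl_flags() {
    grep -E '^[[:space:]]*\$GL_(BIG_CONFIG|NO_DAEMON_NO_GITWEB|NO_CREATE_REPOS|NO_SETUP_AUTHKEYS)[[:space:]]*=' "$1"
}
```

On the server this would be run as `show_gl_flags ~/.gitolite.rc`.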
<a name="_optimising_the_authkeys_file"></a>

#### optimising the authkeys file

Sshd does a linear scan of the `~/.ssh/authorized_keys` file when an incoming
connection shows up. This means that keys found near the top get served
faster than keys near the bottom. On my laptop, it takes about 2500 keys
before I notice the delay; on a typical server it could be double that, so
don't worry about all this unless your user-count is in that range.

One way to deal with 5000+ keys is to use customised, database-backed ssh
daemons, but many people are uncomfortable with running non-standard versions
of such a critical piece of the security infrastructure. In addition, most
distributions do not make it painless to use them.

So what do you do?

The following trick uses the Pareto principle (a.k.a. the "80-20 rule")
to get an immediate boost in response for the most frequent or prolific
developers. It can allow you to ignore the problem until the next big
increase in your user counts!

Here's how:

  * create subdirectories of keydir/ called 0, 1, (maybe 2, 3, etc., also),
    and 9.
  * in 0/, put in the pubkeys of the most frequent users
  * in 1/, add the next most important set of users, and so on for 2, 3, etc.
  * finally, put all the rest in 9/

Make sure "9" contains at least 70-90% of the total number of pubkeys,
otherwise this doesn't really help.
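
The steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not a gitolite tool: it assumes the pubkeys sit directly under keydir/ as `<user>.pub`, that a list file names one user per line with the most active first, and it only uses the 0/ and 9/ buckets. The function name `bucket_keys` and the list file are hypothetical.

```shell
#!/bin/sh
# Split keydir/ into a fast bucket (0/) and a catch-all bucket (9/).
# Usage: bucket_keys KEYDIR TOP_LIST TOP_N
bucket_keys() {
    keydir=$1
    top_list=$2
    top_n=$3

    mkdir -p "$keydir/0" "$keydir/9"

    # The first TOP_N users named in the list go into 0/ ...
    head -n "$top_n" "$top_list" | while IFS= read -r user; do
        if [ -f "$keydir/$user.pub" ]; then
            mv "$keydir/$user.pub" "$keydir/0/"
        fi
    done

    # ... and every key still at the top level goes into 9/.
    for key in "$keydir"/*.pub; do
        if [ -f "$key" ]; then
            mv "$key" "$keydir/9/"
        fi
    done
}
```

A call such as `bucket_keys keydir top_users.txt 50` would be run inside a clone of the gitolite-admin repo, followed by the usual commit and push.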

You can easily determine who your top users are by running something like
this (note the date command, which gets you last month's log file on most
days):

    cat .gitolite/logs/gitolite-`date +%Y-%m -d -30days`.log |
        cut -f2 | sort | uniq -c | sort -n -r

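
The same pipeline can be wrapped in two small functions. Going back 30 days does not always land in the previous calendar month (on the 31st it is still the same month), so this sketch anchors on the 15th before stepping back one month. It assumes GNU `date` and, like the pipeline above, a tab-separated log with the user name in field 2. The function names are illustrative.

```shell
#!/bin/sh
# Year-month of the previous calendar month (GNU date only).
last_month() {
    date -d "$(date +%Y-%m-15) -1 month" +%Y-%m
}

# Rank users by number of log lines, busiest first.
# Usage: top_users LOGFILE
top_users() {
    cut -f2 "$1" | sort | uniq -c | sort -n -r
}
```

For example: `top_users ".gitolite/logs/gitolite-$(last_month).log" | head -n 20`.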
<a name="_what_are_the_downsides_"></a>

### what are the downsides?

There is one minor issue.

If you use the delegation feature, you can no longer define or extend
@groups in a fragment, for security reasons. It will also not let you use any
group other than the @fragname itself (specifically, groups which contained a
subset of the allowed @fragname, which would work normally, do not work now).

(If you didn't understand all that, you're probably not using delegation, so
feel free to ignore it!)

<a name="_storing_usergroup_information_outside_gitolite_like_in_LDAP_"></a>

### storing usergroup information outside gitolite (like in LDAP)

[Please NOTE: this is all about *user* groups, not *repo* groups]

[WARNING: the earlier method of doing this has been discontinued; please see
the commit message for details]

Gitolite now allows usergroup information to be stored outside its own config
file. We'll see "why" first, then the "how".

<a name="_why"></a>

#### why

Large sites often have LDAP servers that already contain user and group
information, including group membership details. Such sites may prefer that
gitolite just pick up that info instead of having to redundantly put it in
gitolite's config file.

Consider this example config for one repo:

    repo foo
        RW+ = @lead_devs
        RW = @devs
        R = @interns

Normally, you would also need to specify:

    @lead_devs = dilbert alice
    @devs = wally
    @interns = ashok

However, if the corporate LDAP server already tags these people correctly, and
if there is some way of getting that information out **at run time**, that
would be cool.

<a name="_how"></a>

#### how

All you need is a script that, given a username, queries your LDAP or similar
server, and prints a space-separated list of all the groups that user is a
member of. If an invalid user name is sent in, or the user is valid but is not
part of any groups, it should print nothing.

This script will probably be specific to your site. [**Help wanted**: I don't
know LDAP, so if someone wants to contribute some sample code I'd be happy to
put it in contrib/, with credit of course!]

Then set the `$GL_GET_MEMBERSHIPS_PGM` variable in the rc file to the full
path to this program, set `$GL_BIG_CONFIG` to 1, and that will be that.

[gwd]: http://github.com/sitaramc/gitolite/blob/pu/doc/3-faq-tips-etc.mkd#gwd

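
To make the contract of such a membership program concrete, here is a stand-in that reads a flat text file instead of querying LDAP. Everything site-specific is an assumption: the map file format (a user name followed by that user's groups, one user per line) and the name `get_memberships` are invented for illustration, and a real site would replace the lookup with its own directory query.

```shell
#!/bin/sh
# Print the space-separated groups of USER, or nothing if the user is
# unknown or has no groups.
# Usage: get_memberships USER MAPFILE
# MAPFILE lines look like:  alice lead_devs devs
get_memberships() {
    awk -v u="$1" '
        $1 == u && NF > 1 {
            $1 = ""            # drop the user name, keep the groups
            sub(/^ +/, "")     # trim the separator left behind
            print
            exit
        }' "$2"
}
```

The program named in `$GL_GET_MEMBERSHIPS_PGM` is described above as taking a username, so a real script would wrap this as something like `get_memberships "$1" /path/to/mapfile`.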
<a name="_implementation_notes"></a>

### implementation notes

To understand how big-config works, we'll first look at how gitolite works
without this setting. Think back to the example at the top, and assume 'alice'
is accessing the 'lynx' repo. The various rights are governed by the following
hash elements:

    # for the first level checks
    $repos{'lynx'}{'R'}{'alice'} = 1;
    $repos{'lynx'}{'W'}{'alice'} = 1;

    # for the second level checks
    $repos{'lynx'}{'alice'}{'refs/heads/master'} = 'RW';
    $repos{'lynx'}{'alice'}{'refs/heads/next'} = 'RW+';

Those elements are explicitly specified in the compiled hash, as you can see
(you don't need to know much perl to read a hash; just make some educated
guesses if needed!)

Now look at the compiled hash produced when `GL_BIG_CONFIG` is set. In place
of both 'firefox' and 'lynx' you have '@wbr', and similarly '@devs' for both
'alice' and 'bob'. In addition, there is a group hash at the bottom that
lists each group and its members.

When 'alice' tries to access the 'lynx' repo, gitolite collects all the group
names that these names belong to, so '@devs' is added to the list of 'user'
names that 'alice' inherits permissions from, and '@wbr' is added to the list
of 'repo' names that 'lynx' inherits from. This means that the final access
inherits all permissions pertaining to the following combinations:

    alice, lynx
    alice, @wbr
    @devs, lynx
    @devs, @wbr

(Actually there are 3 more... try and guess what they may be!)

Anyway, all ACL rules for these combinations are clubbed together to make the
composite set of rules that 'alice' accessing 'lynx' is subject to.
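
The pairing step can be pictured as a simple cross product of "names the user answers to" and "names the repo answers to". The sketch below only reproduces the four combinations listed above; it is not gitolite's actual code, and the remaining three combinations are left as the exercise the text intends. `list_combinations` is an illustrative name.

```shell
#!/bin/sh
# Print every user-name/repo-name pair whose rules apply to one access.
# Each argument is a space-separated list: the name itself plus its groups.
# Usage: list_combinations "USER_NAMES" "REPO_NAMES"
list_combinations() {
    for u in $1; do
        for r in $2; do
            echo "$u, $r"
        done
    done
}

# 'alice' is in @devs, 'lynx' is in @wbr:
list_combinations "alice @devs" "lynx @wbr"
```

The rules found under each printed pair are then merged into the one rule set that applies to the access.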