(new mirroring) documentation
parent
5143cc890f
commit
37ce28a43b
|
@@ -1,12 +1,13 @@
|
||||||
## mirroring a gitolite setup
|
# mirroring gitolite servers
|
||||||
|
|
||||||
Mirroring git repos is essentially a one-liner. For each mirror you want to
|
Mirroring a repo is simple in git; you just need code like this in a
|
||||||
update, you just add a post-receive hook that says
|
`post-receive` hook in each repo:
|
||||||
|
|
||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
git push --mirror slave_user@mirror.host:/path/to/repo.git
|
git push --mirror slave_user@mirror.host:/path/to/repo.git
|
||||||
|
|
||||||
But life is never that simple...
|
The hard part is managing this across multiple mirror sites with multiple
|
||||||
|
repositories being mirrored.
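
To make the scale problem concrete, here is a naive sketch of what the one-liner turns into once there is more than one mirror. The mirror host names and the `mirror_push_commands` helper are made up for illustration, and this is not the hook gitolite ships; it only prints the commands it would run.

```shell
#!/bin/bash
# Naive sketch: one `git push --mirror` per mirror host.
# MIRRORS is a placeholder list, not a gitolite setting.
MIRRORS="mirror1.host mirror2.host"

# Print the push command for every mirror of the given repo path.
mirror_push_commands() {
    local repo_path=$1 host
    for host in $MIRRORS; do
        echo "git push --mirror slave_user@$host:$repo_path"
    done
}

# A real post-receive hook would run these commands instead of printing them.
mirror_push_commands /path/to/repo.git
```

Every repo needs its own copy of such a list, and every change to it has to be repeated in each repo, which is the bookkeeping the rest of this document tries to take off your hands.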
|
||||||
|
|
||||||
**This document has been tested using a 3-server setup, all installed using
|
**This document has been tested using a 3-server setup, all installed using
|
||||||
the "non-root" method (see doc/1-INSTALL.mkd). However, the process is
|
the "non-root" method (see doc/1-INSTALL.mkd). However, the process is
|
||||||
|
@@ -20,33 +21,25 @@ never *really* lost until you do a `git gc`**.
|
||||||
|
|
||||||
----
|
----
|
||||||
|
|
||||||
**Update 2011-03-10**: I wrote this with a typical "corporate" setup in mind
|
|
||||||
where all the servers involved are owned and administered by the same group of
|
|
||||||
people. As a result, the scripts assume the servers trust each other
|
|
||||||
completely. If that is not your situation, you will have to add code into
|
|
||||||
`gl-mirror-shell` to limit the commands the remote may send. Patches welcome
|
|
||||||
:-)
|
|
||||||
|
|
||||||
----
|
|
||||||
|
|
||||||
In this document:
|
In this document:
|
||||||
|
|
||||||
* <a href="#_RULE_NUMBER_ONE_">RULE NUMBER ONE!</a>
|
* <a href="#_RULE_NUMBER_ONE_">RULE NUMBER ONE!</a>
|
||||||
* <a href="#_things_that_will_NOT_be_mirrored_by_this_process">things that will NOT be mirrored by this process</a>
|
* <a href="#_what_will_will_not_work">what will/will not work</a>
|
||||||
* <a href="#_conventions_in_this_document">conventions in this document</a>
|
* <a href="#_concepts_and_terminology">concepts and terminology</a>
|
||||||
* <a href="#_setting_up_mirroring">setting up mirroring</a>
|
* <a href="#_setup_and_usage">setup and usage</a>
|
||||||
* <a href="#_install_gitolite_on_all_servers">install gitolite on all servers</a>
|
* <a href="#_server_level_setup">server level setup</a>
|
||||||
* <a href="#_generate_keypairs">generate keypairs</a>
|
* <a href="#_repository_level_setup">repository level setup</a>
|
||||||
* <a href="#_setup_the_mirror_shell_on_each_server">setup the mirror-shell on each server</a>
|
* <a href="#_commands_to_re_sync_mirrors">commands to (re-)sync mirrors</a>
|
||||||
* <a href="#_set_slaves_to_slave_mode">set slaves to slave mode</a>
|
* <a href="#_details">details</a>
|
||||||
* <a href="#_set_slave_server_lists">set slave server lists</a>
|
* <a href="#_the_conf_gitolite_conf_file">the `conf/gitolite.conf` file</a>
|
||||||
* <a href="#_efficiency_versus_paranoia">efficiency versus paranoia</a>
|
* <a href="#_redirecting_pushes">redirecting pushes</a>
|
||||||
* <a href="#_syncing_the_mirrors_the_first_time">syncing the mirrors the first time</a>
|
* <a href="#_discussion">discussion</a>
|
||||||
* <a href="#_switching_over">switching over</a>
|
* <a href="#_problems_with_the_old_mirroring_model">problems with the old mirroring model</a>
|
||||||
* <a href="#_the_return_of_foo">the return of foo</a>
|
* <a href="#_the_new_mirroring_model">the new mirroring model</a>
|
||||||
* <a href="#_switching_back">switching back</a>
|
* <a href="#_appendix_A_example_cronjob_based_mirroring">appendix A: example cronjob based mirroring</a>
|
||||||
* <a href="#_making_foo_a_slave">making foo a slave</a>
|
* <a href="#_appendix_B_efficiency_versus_paranoia">appendix B: efficiency versus paranoia</a>
|
||||||
* <a href="#_URLs_that_your_users_will_use">URLs that your users will use</a>
|
|
||||||
|
----
|
||||||
|
|
||||||
<a name="_RULE_NUMBER_ONE_"></a>
|
<a name="_RULE_NUMBER_ONE_"></a>
|
||||||
|
|
||||||
|
@@ -62,285 +55,491 @@ Corollary: if the primary went down and you effected a changeover, you must
|
||||||
make sure that the primary does not come up in a push-enabled mode when it
|
make sure that the primary does not come up in a push-enabled mode when it
|
||||||
recovers.
|
recovers.
|
||||||
|
|
||||||
<a name="_things_that_will_NOT_be_mirrored_by_this_process"></a>
|
<a name="_what_will_will_not_work"></a>
|
||||||
|
|
||||||
### things that will NOT be mirrored by this process
|
### what will/will not work
|
||||||
|
|
||||||
Let's get this out of the way. This procedure will only mirror your git
|
This process will only mirror your git repositories, using `git push
|
||||||
repositories, using `git push --mirror`. Therefore, certain files will not be
|
--mirror`. It will not mirror log files, and repo-specific files like
|
||||||
mirrored:
|
`gl-creator` and `gl-perms` files, or indeed anything that was manually
|
||||||
|
created or added (for example, custom config entries added manually instead of
|
||||||
* gitolite log files
|
via gitolite).
|
||||||
* "gl-creator" and "gl-perms" files
|
|
||||||
* "projects.list", "description", and entries in the "config" files within
|
|
||||||
each repo
|
|
||||||
|
|
||||||
None of these affect actual repo contents of course, but they could be
|
None of these affect actual repo contents of course, but they could be
|
||||||
important (especially the gl-creator, although if your wildcard pattern had
|
important (especially the gl-creator, although if your wildcard pattern had
|
||||||
"CREATOR" in it you can recreate those files easily enough anyway).
|
"CREATOR" in it you can recreate those files easily enough anyway).
|
||||||
|
|
||||||
Your best bet is to use rsync for the log files, and tar for the others, at
|
Mirroring has not been, and will not be, tested when gitolite is installed
|
||||||
regular intervals.
|
using the deprecated 'from-client' method. Please use one of the other
|
||||||
|
methods.
|
||||||
|
|
||||||
<a name="_conventions_in_this_document"></a>
|
Also, none of this has been tested with smart-http. I'm not even sure it'll
|
||||||
|
work; http is very fiddly to get right. If you want mirroring, at least your
|
||||||
|
server-to-server comms should be over ssh.
|
||||||
|
|
||||||
### conventions in this document
|
<a name="_concepts_and_terminology"></a>
|
||||||
|
|
||||||
The userid hosting gitolite is `gitolite` on all machines. The servers are
|
### concepts and terminology
|
||||||
foo, bar, and baz. At the beginning, foo is the master, the other 2 are
|
|
||||||
slaves.
|
|
||||||
|
|
||||||
<a name="_setting_up_mirroring"></a>
|
Servers can host 3 kinds of repos: master, slave, and local.
|
||||||
|
|
||||||
### setting up mirroring
|
* A repo can be a **master** on one and only one server. A repo on its
|
||||||
|
"master" server is a **native** repo, on slaves it is "non-native".
|
||||||
|
|
||||||
<a name="_install_gitolite_on_all_servers"></a>
|
* A **slave** repo cannot be pushed to by a user. It will only accept
|
||||||
|
pushes from a master server. (But see later for an exception).
|
||||||
|
|
||||||
#### install gitolite on all servers
|
* A **local** repo is not involved in mirroring at all, in either direction.
|
||||||
|
|
||||||
* before running the final step in the install sequence, make sure you go to
|
<a name="_setup_and_usage"></a>
|
||||||
the `hooks/common` directory and rename `post-receive.mirrorpush` to
|
|
||||||
`post-receive`. See doc/hook-propagation.mkd if you're not sure where you
|
|
||||||
should look for `hooks/common`.
|
|
||||||
|
|
||||||
* if the server already has gitolite installed, use the normal methods to
|
### setup and usage
|
||||||
make sure this hook gets in.
|
|
||||||
|
|
||||||
* Use the same "admin key" on all the machines, so that the same person has
|
<a name="_server_level_setup"></a>
|
||||||
gitolite-admin access to all of them.
|
|
||||||
|
|
||||||
<a name="_generate_keypairs"></a>
|
#### server level setup
|
||||||
|
|
||||||
#### generate keypairs
|
To start with, assign each server a short name. We will use 'frodo', 'sam',
|
||||||
|
and 'gollum' as examples here.
|
||||||
|
|
||||||
Each server will be potentially logging on to one or more of the other
|
1. Generate ssh keys on each machine. Copy the `.pub` files to all other
|
||||||
servers, so first generate keypairs on each of them (`ssh-keygen`) and copy
|
machines with the appropriate names. I.e., frodo should have sam.pub and
|
||||||
the `.pub` files to all other servers, named appropriately. So foo will have
|
gollum.pub, etc.
|
||||||
bar.pub and baz.pub, etc.
|
|
||||||
|
|
||||||
<a name="_setup_the_mirror_shell_on_each_server"></a>
|
2. Install gitolite on all servers, under some 'hosting user' (we'll use
|
||||||
|
`git` in our examples here). You need not use the same hosting user on
|
||||||
|
all machines.
|
||||||
|
|
||||||
#### setup the mirror-shell on each server
|
It is not necessary to use the same "admin key" on all the machines.
|
||||||
|
However, if you do plan to mirror the gitolite-admin repo also, they will
|
||||||
|
eventually become the same anyway. In our example, frodo does mirror the
|
||||||
|
admin repo to sam, but not to gollum. (Can you really see frodo or sam
|
||||||
|
trusting gollum?)
|
||||||
|
|
||||||
XXX review this document after testing mirroring...
|
3. Now copy `hooks/common/post-receive.mirrorpush` from the gitolite source,
|
||||||
|
and install it as a custom hook called `post-receive`; see [here][ch] for
|
||||||
|
instructions.
|
||||||
|
|
||||||
If you installed gitolite using the from client method, run the following:
|
4. Edit `~/.gitolite.rc` on each machine and add/edit the following lines.
|
||||||
|
The `GL_HOSTNAME` variable **must** have the correct name for that host
|
||||||
|
(frodo, sam, or gollum), so that will definitely be different on each
|
||||||
|
server. The other line can be the same, or may have additional patterns
|
||||||
|
for other `git config` keys you have previously enabled. See [here][rsgc]
|
||||||
|
and the description for `GL_GITCONFIG_KEYS` in [this][vsi] for details.
|
||||||
|
|
||||||
# on foo
|
$GL_HOSTNAME = 'frodo'; # will be different on each server!
|
||||||
export GL_BINDIR=$HOME/.gitolite/src
|
$GL_GITCONFIG_KEYS = "gitolite.mirror.*";
|
||||||
cat bar.pub baz.pub |
|
|
||||||
sed -e 's,^,command="'$GL_BINDIR'/gl-mirror-shell" ,' >> ~/.ssh/authorized_keys
|
|
||||||
|
|
||||||
If you installed using any of the other 3 methods do this:
|
(Remember the "rc" file is NOT mirrored; it is meant to be site-local).
|
||||||
|
|
||||||
# on foo
|
Note: if `GL_HOSTNAME` is undefined, all mirroring features are disabled
|
||||||
export GL_BINDIR=`gl-query-rc GL_BINDIR`
|
on that server, regardless of other settings.
|
||||||
cat bar.pub baz.pub |
|
|
||||||
sed -e 's,^,command="'$GL_BINDIR'/gl-mirror-shell" ,' >> ~/.ssh/authorized_keys
|
|
||||||
|
|
||||||
Also do the same thing on the other machines.
|
5. On each machine, add the keys for all other machines. For example, on
|
||||||
|
frodo you'd run these two commands:
|
||||||
|
|
||||||
Now test this access:
|
gl-tool add-mirroring-peer sam.pub
|
||||||
|
gl-tool add-mirroring-peer gollum.pub
|
||||||
|
|
||||||
# on foo
|
6. Create "host" aliases on each machine to refer to all other machines. See
|
||||||
ssh gitolite@bar pwd
|
[here][ha] for what/why/how.
|
||||||
# should print /home/gitolite/repositories
|
|
||||||
ssh gitolite@bar uname -a
|
|
||||||
# should print the appropriate info for that server
|
|
||||||
|
|
||||||
Similarly test the other combinations.
|
The host alias for a host (in other machines' `~/.ssh/config` files) MUST
|
||||||
|
be the same as the `GL_HOSTNAME` in the referred host's `~/.gitolite.rc`.
|
||||||
|
Gitolite mirroring **requires** this consistency in naming; things will
|
||||||
|
NOT work otherwise.
|
||||||
|
|
||||||
<a name="_set_slaves_to_slave_mode"></a>
|
For example, if machine A's `~/.gitolite.rc` says `$GL_HOSTNAME =
|
||||||
|
'frodo';`, then all other machines must use a host alias of "frodo" in
|
||||||
|
their `~/.ssh/config` files to refer to machine A.
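
As a rough illustration of the naming rule above, the following sketch prints a `~/.ssh/config` stanza whose alias is identical to the peer's `GL_HOSTNAME`. The helper name `host_alias_stanza`, the real host name `frodo.example.com`, and the user `git` are assumptions chosen for the example; substitute whatever applies to your machines.

```shell
#!/bin/bash
# Sketch: print a ~/.ssh/config stanza whose alias equals the peer's
# GL_HOSTNAME. The real host name and user passed below are made-up examples.
host_alias_stanza() {
    local name=$1 real_host=$2 user=$3
    printf 'Host %s\n    HostName %s\n    User %s\n' "$name" "$real_host" "$user"
}

# On sam, refer to machine A (whose rc file says $GL_HOSTNAME = 'frodo'):
host_alias_stanza frodo frodo.example.com git
```

Appending the output to `~/.ssh/config` on every other machine (for instance with `>> ~/.ssh/config`) would give them all the same alias for machine A.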
|
||||||
|
|
||||||
#### set slaves to slave mode
|
Once you've done this, each host should be able to reach the other hosts and
|
||||||
|
get a response back. For example, running this on sam:
|
||||||
|
|
||||||
Set slave mode on all the *slave* servers by setting `$GL_SLAVE_MODE = 1`
|
ssh frodo info
|
||||||
(uncommenting the line if necessary).
|
|
||||||
|
|
||||||
Leave the master server's file as is.
|
should get you
|
||||||
|
|
||||||
<a name="_set_slave_server_lists"></a>
|
Hello sam, I am frodo.
|
||||||
|
|
||||||
#### set slave server lists
|
Check this command from *everywhere to everywhere else*, and make sure you get
|
||||||
|
expected results. **Do NOT proceed otherwise.**
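
If you have more than a couple of servers, a small loop can do the checking for you. This is only a sketch: the function names are invented here, `ME` must be set to the local server's `GL_HOSTNAME`, `PEERS` to the host aliases of the other servers, and it assumes the reply to `info` is exactly the one-line greeting shown above.

```shell
#!/bin/bash
# Sketch: verify that every peer answers `info` with the expected greeting.
# ME must match this server's GL_HOSTNAME; PEERS are the other host aliases.
ME=${ME:-sam}
PEERS=${PEERS:-"frodo gollum"}

# The reply we expect from a peer, following the example above.
expected_greeting() {
    echo "Hello $1, I am $2."
}

# Ask one peer for its greeting and compare; returns non-zero on mismatch.
check_peer() {
    local peer=$1 reply
    reply=$(ssh "$peer" info 2>/dev/null)
    [ "$reply" = "$(expected_greeting "$ME" "$peer")" ]
}

# Check every peer in turn; returns non-zero if any of them failed.
check_all_peers() {
    local peer status=0
    for peer in $PEERS; do
        if check_peer "$peer"; then
            echo "ok: $peer"
        else
            echo "FAILED: $peer" >&2
            status=1
        fi
    done
    return $status
}
```

Run `check_all_peers` by hand on each server in turn; any `FAILED` line means the keys or host aliases for that pair still need attention.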
|
||||||
|
|
||||||
On the master (foo), set the names of the slaves by editing the
|
<a name="_repository_level_setup"></a>
|
||||||
`~/.gitolite.rc` to contain:
|
|
||||||
|
|
||||||
$ENV{GL_SLAVES} = 'gitolite@bar gitolite@baz';
|
#### repository level setup
|
||||||
|
|
||||||
**Note the syntax well; this is critical**:
|
Setting up mirroring at the repository level instead of at the "entire server"
|
||||||
|
level gives you a lot of flexibility (see "discussion" section below).
|
||||||
|
|
||||||
* **this must be in single quotes** (or you must remember to escape the `@`)
|
The basic idea is to use `git config` variables within each repo (gitolite
|
||||||
* the variable is an ENV var, not a plain perl var
|
allows you to create them from within the gitolite.conf file so that's
|
||||||
* the values are *space separated*
|
convenient), and use these to specify which machine is the master and which
|
||||||
* each value represents the userid and hostname for one server
|
machines are slaves for the repo.
|
||||||
|
|
||||||
The basic idea is that this string, should be usable in both the following
|
<font color="gray">
|
||||||
syntaxes:
|
|
||||||
|
|
||||||
git clone gitolite@bar:repo
|
> Side note: if you just want to simulate the old mirroring scheme, despite
|
||||||
ssh gitolite@bar pwd
|
> its limitations, it's very easy. Say frodo is the master for all repos,
|
||||||
|
> and the other 2 are slaves. Just clone the gitolite-admin repos of all
|
||||||
|
> servers, add these lines to the top of each:
|
||||||
|
|
||||||
You can also use ssh host aliases. Let's say server "bar" has a non-standard
|
repo @all
|
||||||
port number:
|
config gitolite.mirror.master = "frodo"
|
||||||
|
config gitolite.mirror.slaves = "sam gollum"
|
||||||
|
|
||||||
# in ~/.ssh/config on foo
|
> then commit, and push all 3. Finally, make a dummy commit on just the
|
||||||
host mybar
|
> frodo clone and push again. You're done.
|
||||||
hostname bar
|
|
||||||
user gitolite
|
|
||||||
port 2222
|
|
||||||
|
|
||||||
# in ~/.gitolite.rc on foo
|
</font>
|
||||||
$ENV{GL_SLAVES} = 'bar gitolite@baz';
|
|
||||||
|
|
||||||
And that's really all there is, unless...
|
Let's say frodo and sam are internal servers, while gollum is an external (and
|
||||||
|
therefore less trusted) server that has agreed to help us out by mirroring one
|
||||||
|
of our high traffic repos. We want the following setup:
|
||||||
|
|
||||||
<a name="_efficiency_versus_paranoia"></a>
|
* the "gitolite-admin" repo, as well as an internal project repo called
|
||||||
|
"ip1", should be mastered on frodo and mirrored to sam.
|
||||||
|
|
||||||
### efficiency versus paranoia
|
* internal project "ip2" has almost all of its developers closer to sam, so
|
||||||
|
it should be mastered there, and mirrored on frodo.
|
||||||
|
|
||||||
|
* an open source project we manage, "os1", should be mastered on frodo and
|
||||||
|
mirrored on both sam and gollum.
|
||||||
|
|
||||||
|
So here's how our example would go:
|
||||||
|
|
||||||
|
1. Clone frodo's and sam's gitolite-admin repos to your workstation, then add
|
||||||
|
the following lines to both their gitolite.conf files:
|
||||||
|
|
||||||
|
repo ip1 gitolite-admin
|
||||||
|
config gitolite.mirror.master = "frodo"
|
||||||
|
config gitolite.mirror.slaves = "sam"
|
||||||
|
|
||||||
|
repo ip2
|
||||||
|
config gitolite.mirror.master = "sam"
|
||||||
|
config gitolite.mirror.slaves = "frodo"
|
||||||
|
|
||||||
|
You also need normal access control lines for ip1 and ip2; I'm assuming
|
||||||
|
you already have them elsewhere, at least on frodo. (What you have on sam
|
||||||
|
won't matter in a few minutes, as you will see!)
|
||||||
|
|
||||||
|
Commit and push these changes.
|
||||||
|
|
||||||
|
2. There are a couple of quirks to keep in mind when you make changes to the
|
||||||
|
gitolite-admin repo's config.
|
||||||
|
|
||||||
|
* the first push will create the `git config` entries required, but by
|
||||||
|
then it is too late to *act* on them; i.e., actually do the mirroring.
|
||||||
|
If there were any older values, like a different list of slaves
|
||||||
|
perhaps, then those would be in effect.
|
||||||
|
|
||||||
|
This is largely because git invokes post-receive before post-update.
|
||||||
|
In theory I can work around this but I do not intend to.
|
||||||
|
|
||||||
|
Anyway, this means that after the 2 pushes, you have to make a dummy
|
||||||
|
push from frodo:
|
||||||
|
|
||||||
|
git commit --allow-empty -m empty; git push
|
||||||
|
|
||||||
|
which gets you something like this amidst the other messages:
|
||||||
|
|
||||||
|
remote: (25158&) frodo ==== (gitolite-admin) ===> sam
|
||||||
|
|
||||||
|
telling you that frodo is sending gitolite-admin to sam in the
|
||||||
|
background.
|
||||||
|
|
||||||
|
* the second quirk is that your clone of server sam's gitolite-admin
|
||||||
|
repo is now completely out of date, since frodo has overwritten it on
|
||||||
|
the server. You have to 'cd' to that clone and do this:
|
||||||
|
|
||||||
|
git fetch
|
||||||
|
git reset --hard origin/master
|
||||||
|
|
||||||
|
3. That completes the setup of the gitolite-admin and the internal project
|
||||||
|
repos. We'll now set up things for the open source project, "os1".
|
||||||
|
|
||||||
|
On frodo's gitolite-admin clone, add the following lines to
|
||||||
|
`conf/gitolite.conf`, then commit and push:
|
||||||
|
|
||||||
|
repo os1
|
||||||
|
config gitolite.mirror.master = "frodo"
|
||||||
|
config gitolite.mirror.slaves = "sam gollum"
|
||||||
|
|
||||||
|
Also, send the same lines to gollum's administrator and ask him to add
|
||||||
|
them into his conf/gitolite.conf file, commit, and push.
|
||||||
|
|
||||||
|
<a name="_commands_to_re_sync_mirrors"></a>
|
||||||
|
|
||||||
|
#### commands to (re-)sync mirrors
|
||||||
|
|
||||||
|
Sometimes there's a network problem and a mirror will not receive an update
|
||||||
|
immediately on a push. When the network is back up, you can do one of these
|
||||||
|
things to get it back in sync.
|
||||||
|
|
||||||
|
1. On the master server, you can start a **background** job to mirror a repo.
|
||||||
|
For example, this:
|
||||||
|
|
||||||
|
gl-mirror-shell request-push ip1
|
||||||
|
|
||||||
|
triggers a mirror-push of repo "ip1" to all slaves listed in that repo's
|
||||||
|
"gitolite.mirror.slaves" config.
|
||||||
|
|
||||||
|
On the other hand, this:
|
||||||
|
|
||||||
|
gl-mirror-shell request-push ip1 gollum
|
||||||
|
|
||||||
|
triggers a mirror-push of "ip1" *only* to the gollum server, regardless of
|
||||||
|
what servers are listed as slaves in the config.
|
||||||
|
|
||||||
|
Note that this invocation does not even check if gollum is listed as a
|
||||||
|
slave for "ip1"; since you're doing it at the command line on the master
|
||||||
|
server, you're allowed to push it to *any* slave that will accept it.
|
||||||
|
|
||||||
|
<font color="gray">
|
||||||
|
|
||||||
|
> Side note: if you want to start a **foreground** job, the syntax is
|
||||||
|
> `gl-mirror-shell request-push ip1 -fg gollum`. Foreground mode
|
||||||
|
> requires one (and only one) slave name -- you cannot send to an
|
||||||
|
> implicit list, nor to more than one slave.
|
||||||
|
|
||||||
|
</font>
|
||||||
|
|
||||||
|
2. Cronjobs and custom mirroring schemes are now very easy to do. Just use
|
||||||
|
the second form of the command above to push any repo to any slave, and it
|
||||||
|
can form the basis of any scheme you like. Appendix A contains an example
|
||||||
|
setup.
|
||||||
|
|
||||||
|
3. Once in a while a slave will realise it needs an update, and wants to ask
|
||||||
|
for one. It can run this command to do so:
|
||||||
|
|
||||||
|
ssh sam request-push ip2
|
||||||
|
|
||||||
|
If the requesting server is not one of the slaves listed in the config
|
||||||
|
variable gitolite.mirror.slaves on the master, it will be rejected.
|
||||||
|
|
||||||
|
This is always a foreground push, reflecting the fact that the slave may
|
||||||
|
want to know why their push errored out or didn't work last time or
|
||||||
|
whatever.
|
||||||
|
|
||||||
|
<a name="_details"></a>
|
||||||
|
|
||||||
|
### details
|
||||||
|
|
||||||
|
<a name="_the_conf_gitolite_conf_file"></a>
|
||||||
|
|
||||||
|
#### the `conf/gitolite.conf` file
|
||||||
|
|
||||||
|
One goal I have is to minimise the code changes to "core" gitolite due to
|
||||||
|
this, so all repo-specific mirror settings are stored as `git config`
|
||||||
|
variables (you know you can specify git config variables in the gitolite
|
||||||
|
config file right?). These are:
|
||||||
|
|
||||||
|
* `gitolite.mirror.master`
|
||||||
|
|
||||||
|
The name of the server which is the master for this repo. Each server
|
||||||
|
will compare this with `$GL_HOSTNAME` (from its own rc file) to
|
||||||
|
determine if it's the master or a slave. Here are the possible values:
|
||||||
|
|
||||||
|
* **undefined** or `local`: this repo is local to this server
|
||||||
|
* **same** as `$GL_HOSTNAME`: this server is the "master" for this
|
||||||
|
repo. (The repo is "native" to this server).
|
||||||
|
* **not same** as `$GL_HOSTNAME`: this server is a "slave" for the
|
||||||
|
repo. (The repo is non-native on this server).
|
||||||
|
|
||||||
|
* `gitolite.mirror.slaves`
|
||||||
|
|
||||||
|
Ignored for non-native repos. For native repos, this is a space-separated
|
||||||
|
list of servers to push to from the `post-receive` hook.
|
||||||
|
|
||||||
|
Clearly, you can have different sets of slaves for different repos (again,
|
||||||
|
see "discussion" section later for more on this).
|
||||||
|
|
||||||
|
* `gitolite.mirror.redirectOK`
|
||||||
|
|
||||||
|
See the section on "redirecting pushes"
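
The comparison described under `gitolite.mirror.master` can be sketched as a small shell function. This is an illustration of the rule only, not gitolite's actual code; the function name `repo_role` and the label `disabled` (for a server with no `GL_HOSTNAME`) are invented here.

```shell
#!/bin/bash
# Sketch of the comparison described above; not gitolite's actual code.
# $1 = this server's GL_HOSTNAME, $2 = the repo's gitolite.mirror.master
repo_role() {
    local hostname=$1 master=$2
    if [ -z "$hostname" ]; then
        echo "disabled"      # no GL_HOSTNAME: mirroring is off on this server
    elif [ -z "$master" ] || [ "$master" = "local" ]; then
        echo "local"
    elif [ "$master" = "$hostname" ]; then
        echo "master"        # repo is native here
    else
        echo "slave"         # repo is non-native here
    fi
}

repo_role frodo frodo    # master
repo_role sam frodo      # slave
repo_role sam ""         # local
```

Inside a repo, the second argument could come from `git config --get gitolite.mirror.master`.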
|
||||||
|
|
||||||
|
<a name="_redirecting_pushes"></a>
|
||||||
|
|
||||||
|
### redirecting pushes
|
||||||
|
|
||||||
|
**Please read carefully; there are security implications if you enable this
|
||||||
|
for mirrors NOT under your control**.
|
||||||
|
|
||||||
|
When a user pushes to a non-native repo, it is possible to transparently
|
||||||
|
redirect the push to the correct master server. This is a very neat feature,
|
||||||
|
because now all your users just use one URL (the mirror nearest to them).
|
||||||
|
They don't need to know where the actual master is, and more importantly, if
|
||||||
|
you and the other admins change it, they don't need to know it changed!
|
||||||
|
|
||||||
|
The `gitolite.mirror.redirectOK` config variable decides where this
|
||||||
|
redirection is OK. If it is set to 'true', any valid 'slave' can redirect an
|
||||||
|
incoming non-native push from a developer. Otherwise, it contains a list of
|
||||||
|
slaves that are permitted to redirect pushes (this might happen if you don't
|
||||||
|
trust some of your slaves enough to accept a redirected push from them).
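
A minimal sketch of that rule, for illustration only: the function name is invented, the list is assumed to be space-separated like `gitolite.mirror.slaves`, and the sketch does not check whether the slave is a valid one in the first place.

```shell
#!/bin/bash
# Sketch of the redirectOK rule described above; not gitolite's actual code.
# $1 = value of gitolite.mirror.redirectOK, $2 = name of the redirecting slave
redirect_allowed() {
    local redirect_ok=$1 slave=$2 name
    [ "$redirect_ok" = "true" ] && return 0
    for name in $redirect_ok; do      # assumed to be a space-separated list
        [ "$name" = "$slave" ] && return 0
    done
    return 1
}

redirect_allowed "true" gollum && echo "gollum may redirect"
redirect_allowed "sam" gollum  || echo "gollum may not redirect"
```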
|
||||||
|
|
||||||
|
This check needs to pass on both the master and slave servers; both have a say
|
||||||
|
in deciding if this is allowed. (The master may have real reasons not to
|
||||||
|
allow this; see below. I cannot think of any real reason for the *slave* to
|
||||||
|
disable this, but it's there in case some admin doesn't like it).
|
||||||
|
|
||||||
|
There are some potential issues that you MUST consider before enabling this:
|
||||||
|
|
||||||
|
* (security) If the slave and master server are so different or autonomous
|
||||||
|
that a user, say "alice", on the slave is not guaranteed to be the same
|
||||||
|
one as "alice" on the master, then the master admin should NOT enable this
|
||||||
|
feature.
|
||||||
|
|
||||||
|
This is because, in this scheme, authentication happens on the slave, but
|
||||||
|
authorisation is on the master. The slave-authenticated userid (alice) is
|
||||||
|
passed to the master.
|
||||||
|
|
||||||
|
(If you know ssh well enough, you know that the ssh authentication has
|
||||||
|
already happened, so all we can do is ensure authorisation happens with
|
||||||
|
whatever username we know so far).
|
||||||
|
|
||||||
|
* If your slave is out of sync with the master for whatever reason, then the
|
||||||
|
user will get confusing results. A `git fetch` may say everything is
|
||||||
|
up-to-date but the push fails saying it is not a fast-forward push. (Of
|
||||||
|
course there's a way to fix this; see the "commands to (re-)sync mirrors"
|
||||||
|
section above).
|
||||||
|
|
||||||
|
* We cannot redirect non-git commands like ADC, setperms, etc because we
|
||||||
|
don't really have a way of knowing what repo the user is talking about (different
|
||||||
|
commands have different syntaxes, some have more than one reponame...).
|
||||||
|
Any user who needs to do that should access the end server directly. It
|
||||||
|
should be easy enough to write an ADC to do the forwarding, in case the
|
||||||
|
slave server is the only one that can reach the real master due to network
|
||||||
|
or firewall setup.
|
||||||
|
|
||||||
|
Ideally, I recommend that ad hoc repos not be mirrored at all. Keep
|
||||||
|
mirroring for "blessed" repos only.
|
||||||
|
|
||||||
|
<a name="_discussion"></a>
|
||||||
|
|
||||||
|
### discussion
|
||||||
|
|
||||||
|
<a name="_problems_with_the_old_mirroring_model"></a>
|
||||||
|
|
||||||
|
#### problems with the old mirroring model
|
||||||
|
|
||||||
|
The old mirroring model had a single server as the master for *all*
|
||||||
|
repositories. Slaves were effectively only for load-balancing reads, or for
|
||||||
|
failover if the master died.
|
||||||
|
|
||||||
|
This is not good enough for corporate setups where the developers are spread
|
||||||
|
fairly evenly across the world. Some repos need to be closer to some teams
|
||||||
|
(NUMA is a good analogy).
|
||||||
|
|
||||||
|
A model where different repos are "mastered" in different cities is much more
|
||||||
|
efficient here.
|
||||||
|
|
||||||
|
The old model had other rigidities too, though they're not really *problems*,
|
||||||
|
as such:
|
||||||
|
|
||||||
|
* the slaves are just slaves; they can't have any "local" repos.
|
||||||
|
|
||||||
|
* a slave had to carry *all* repos; it couldn't choose to carry just a
|
||||||
|
subset.
|
||||||
|
|
||||||
|
* it implicitly assumed all the mirrors were under the same admin, and that
|
||||||
|
the gitolite-admin repo was itself mirrored too.
|
||||||
|
|
||||||
|
<a name="_the_new_mirroring_model"></a>
|
||||||
|
|
||||||
|
#### the new mirroring model
|
||||||
|
|
||||||
|
In the new model, servers can be (but, I hasten to add, don't *have to* be!)
|
||||||
|
much more independent and autonomous than in the old model. This has a few
|
||||||
|
pros/cons:
|
||||||
|
|
||||||
|
* The gitolite-admin repo (and config) need not be mirrored. This allows
|
||||||
|
site-local repos not meant to be mirrored, without unnecessarily creating
|
||||||
|
a second gitolite install just for those.
|
||||||
|
|
||||||
|
(Site-local repos are useful for purely local projects that need
|
||||||
|
not/should not be mirrored for some reason, or ad-hoc personal repos that
|
||||||
|
developers create for themselves, etc.)
|
||||||
|
|
||||||
|
Of course, then the admin(s) need to make an effort to keep things
|
||||||
|
consistent for the "blessed" repos. For example, two servers can both
|
||||||
|
claim to be "master"!
|
||||||
|
|
||||||
|
* Servers can choose to mirror a subset of the repos from one of the bigger
|
||||||
|
servers.
|
||||||
|
|
||||||
|
In the open source world, you can imagine more popular repos (or more
|
||||||
|
popular parts of huge projects like KDE) having more mirrors. Or
|
||||||
|
substitute "more popular" with "larger in size" if you wish
|
||||||
|
(FlightGear-data anyone?)
|
||||||
|
|
||||||
|
In the corporate world it could help with jurisdiction issues if the
|
||||||
|
mirror is in a different country with different laws.
|
||||||
|
|
||||||
|
I'm sure people will find other uses for this. And I'm *positive* the
|
||||||
|
pros will outweigh the cons. If you don't like it, follow the suggestion
|
||||||
|
in the side note somewhere up above, and just forget this feature exists
|
||||||
|
:-)
|
||||||
|
|
||||||
|
----
|
||||||
|
|
||||||
|
<a name="_appendix_A_example_cronjob_based_mirroring"></a>
|
||||||
|
|
||||||
|
### appendix A: example cronjob based mirroring
|
||||||
|
|
||||||
|
Let's say you have some repos that are so active that you're pushing halfway
|
||||||
|
across the world every few seconds. The slaves do not need to be that closely
|
||||||
|
updated, and it is sufficient to update them once an hour instead. Here's how
|
||||||
|
you might do that:
|
||||||
|
|
||||||
|
repo foo bar frob/nitz
|
||||||
|
config gitolite.mirror.hourly = "slave1 slave2 slave3"
|
||||||
|
|
||||||
|
Then you'd write a cron job that looks like this (untested):
|
||||||
|
|
||||||
|
#!/bin/bash
|
||||||
|
|
||||||
|
REPO_BASE=`${0%/*}/gl-query-rc REPO_BASE`
|
||||||
|
GL_BINDIR=`${0%/*}/gl-query-rc GL_BINDIR`
|
||||||
|
|
||||||
|
cd $REPO_BASE
|
||||||
|
find . -type d -name "*.git" -prune | while read r
|
||||||
|
do
|
||||||
|
cd $REPO_BASE; cd $r
|
||||||
|
|
||||||
|
# get reponame as gitolite knows it
|
||||||
|
r=${r:2}
|
||||||
|
r=${r%.git}
|
||||||
|
|
||||||
|
# get slaves list
|
||||||
|
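# skip repos that have no hourly list; an empty list would make
# request-push fall back to the repo's normal gitolite.mirror.slaves
slaves=`git config --get gitolite.mirror.hourly` || continue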
|
||||||
|
|
||||||
|
gl-mirror-shell request-push $r $slaves
|
||||||
|
|
||||||
|
# that command backgrounds the push, so you'd best wait a few seconds
|
||||||
|
# before hitting the next one, otherwise you'll have all your repos
|
||||||
|
# going out at once!
|
||||||
|
sleep 10
|
||||||
|
done
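
To run the script above once an hour, it needs a crontab entry for the hosting user. The sketch below just builds that line; the helper name and the script path are made up for the example.

```shell
#!/bin/bash
# Sketch: build the crontab line that runs the script above once an hour.
# The script path passed in is a made-up example.
hourly_cron_line() {
    echo "0 * * * * $1"
}

hourly_cron_line "$HOME/bin/mirror-hourly.sh"
# To install it for the hosting user, something like:
#   (crontab -l 2>/dev/null; hourly_cron_line "$HOME/bin/mirror-hourly.sh") | crontab -
```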
|
||||||
|
|
||||||
|
<a name="_appendix_B_efficiency_versus_paranoia"></a>
|
||||||
|
|
||||||
|
### appendix B: efficiency versus paranoia
|
||||||
|
|
||||||
If you're paranoid enough to use mirrors, you should be paranoid enough to
|
If you're paranoid enough to use mirrors, you should be paranoid enough to
|
||||||
like the `receive.fsckObjects` setting we now default to :-) However, informal
|
use the `receive.fsckObjects` setting. However, informal tests indicate a
|
||||||
tests indicate a 40-50% CPU overhead from this. If you don't like that,
|
40-50% CPU overhead from this. If you're ok with that, make the appropriate
|
||||||
remove that line from the post-receive code.
|
adjustments to `GL_GITCONFIG_KEYS` and possibly `GL_GITCONFIG_WILD` in the rc
|
||||||
|
file, then add this to your gitolite.conf file:
|
||||||
|
|
||||||
Please also note that we only set it on mirrors, and that too at the time the
|
repo @all
|
||||||
mirrored repo is *created*. This means, when you start using your old "main"
|
config receive.fsckObjects = "true"
|
||||||
server as a mirror (see later sections on switching over to a mirror, etc.),
|
|
||||||
its repos do not have this setting. Repos created by previous versions of
|
|
||||||
gitolite also will not have this setting.
|
|
||||||
|
|
||||||
Personally, I just set `git config --global receive.fsckObjects true`, since
|
Personally, I just set `git config --global receive.fsckObjects true`, since
|
||||||
those servers aren't doing anything else anyway, and are idle for long
|
those servers aren't doing anything else anyway, and are idle for long
|
||||||
stretches of time. It's up to you what you want to do here.
|
stretches of time. It's up to you what you want to do here.
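
Whichever way you set it, it may help to see which repos do not have `receive.fsckObjects` in effect. A rough sketch follows, assuming all repos are bare and live under one base directory. The function name `repos_without_fsck` is hypothetical and not part of gitolite:

```shell
# hypothetical helper, not part of gitolite
# usage: repos_without_fsck /path/to/REPO_BASE
repos_without_fsck() {
    (
        cd "$1" || exit 1
        find . -type d -name "*.git" -prune | sort | while read -r r
        do
            # "git config --get" fails when the key is unset; treat that as false
            val=$(git --git-dir="$r" config --bool --get receive.fsckObjects) || val=false
            [ "$val" = "true" ] || echo "$r"
        done
    )
}
```

Note that `git config --get` also sees the `--global` setting, so if you set it globally for the gitolite user this should print nothing.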
|
||||||
|
|
||||||
<a name="_syncing_the_mirrors_the_first_time"></a>
|
[ch]: http://sitaramc.github.com/gitolite/doc/2-admin.html#_custom_hooks
|
||||||
|
[ha]: http://sitaramc.github.com/gitolite/doc/ssh-troubleshooting.html#_appendix_4_host_aliases
|
||||||
|
[rsgc]: http://sitaramc.github.com/gitolite/doc/gitolite.conf.html#_repo_specific_git_config_commands
|
||||||
|
[vsi]: http://sitaramc.github.com/gitolite/doc/gitolite.rc.html#_variables_with_a_security_impact
|
||||||
|
|
||||||
### syncing the mirrors the first time
|
|
||||||
|
|
||||||
This is fine if you're setting up everything from scratch. But if your master
|
|
||||||
server already had some repos with commits on them, you have to manually sync
|
|
||||||
them up once.
|
|
||||||
|
|
||||||
# on foo
|
|
||||||
gl-mirror-sync gitolite@bar
|
|
||||||
# path to "sync" program is ~/.gitolite/src if "from-client" install
|
|
||||||
|
|
||||||
<a name="_switching_over"></a>
|
|
||||||
|
|
||||||
### switching over
|
|
||||||
|
|
||||||
Let's say foo goes down. You want to make bar the main server, and continue
|
|
||||||
to have "baz" be a slave.
|
|
||||||
|
|
||||||
* on bar, edit `~/.gitolite.rc` and set
|
|
||||||
|
|
||||||
$GL_SLAVE_MODE = 0;
|
|
||||||
$ENV{GL_SLAVES} = 'gitolite@baz';
|
|
||||||
|
|
||||||
* **sanity check**: go to your gitolite-admin clone, add a remote for "bar",
|
|
||||||
fetch it, and make sure they are the same:
|
|
||||||
|
|
||||||
git remote add bar gitolite@bar:gitolite-admin
|
|
||||||
git fetch bar
|
|
||||||
git branch -a -v
|
|
||||||
# check that all SHAs are the same
|
|
||||||
|
|
||||||
* inform everyone of the new URL for their repos (see next section for more
|
|
||||||
on this)
|
|
||||||
|
|
||||||
* make sure that if "foo" does come up, it will not immediately start
|
|
||||||
serving requests. You'll be in trouble if (a) foo comes up as it was
|
|
||||||
before, and (b) some developer still had the old URL lying around and
|
|
||||||
started pushing changes to it.
|
|
||||||
|
|
||||||
You could jump in quickly and set `$GL_SLAVE_MODE = 1` as soon as the
|
|
||||||
system comes up. Better still, use extraneous means to block incoming
|
|
||||||
connections from normal users (out of scope for this document).
|
|
||||||
|
|
||||||
<a name="_the_return_of_foo"></a>
|
|
||||||
|
|
||||||
### the return of foo
|
|
||||||
|
|
||||||
<a name="_switching_back"></a>
|
|
||||||
|
|
||||||
#### switching back
|
|
||||||
|
|
||||||
Switching back is fairly easy.
|
|
||||||
|
|
||||||
* synchronise all repos from bar to foo. This may take some time, depending
|
|
||||||
on how long foo was down.
|
|
||||||
|
|
||||||
# on bar
|
|
||||||
gl-mirror-sync gitolite@foo
|
|
||||||
# path to "sync" program is ~/.gitolite/src if "from-client" install
|
|
||||||
|
|
||||||
* turn off pushes on "bar" by setting slave mode to 1
|
|
||||||
* run the sync once again; this should complete quickly
|
|
||||||
|
|
||||||
* **double check by comparing some of the repos on both sides if needed**. You
|
|
||||||
could run the following snippet on all servers for a quick check:
|
|
||||||
|
|
||||||
cd ~/repositories # or wherever $REPO_BASE is
|
|
||||||
find . -type d -name "*.git" | sort |
|
|
||||||
while read r
|
|
||||||
do
|
|
||||||
echo $r
|
|
||||||
git ls-remote $r | sort
|
|
||||||
done | md5sum
|
|
||||||
|
|
||||||
* on foo, set the slave list (or check that it is correct)
|
|
||||||
* on foo, set slave mode off
|
|
||||||
* tell everyone to switch back
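
The quick-check snippet in the steps above can be wrapped in a function so that the same code runs on every server. This is only a sketch: the name `repos_digest` is made up, variables are quoted, and `-prune` is added to the `find` call, so the digest is comparable only between servers that run this same version:

```shell
# hypothetical helper, not part of gitolite
# usage: repos_digest ~/repositories   (or wherever $REPO_BASE is)
repos_digest() {
    (
        cd "$1" || exit 1
        find . -type d -name "*.git" -prune | sort | while read -r r
        do
            echo "$r"                  # the repo path is part of the digest
            git ls-remote "$r" | sort  # every ref and the SHA it points to
        done | md5sum
    )
}
```

Run it on each server and compare the printed digests by eye; if they differ, some repo or ref differs between those servers.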
|
|
||||||
|
|
||||||
<a name="_making_foo_a_slave"></a>
|
|
||||||
|
|
||||||
#### making foo a slave
|
|
||||||
|
|
||||||
If "foo" does come up in a controlled manner, you might not want to switch
|
|
||||||
back right away. Unless you're doing DNS tricks, users may be peeved at
|
|
||||||
having to do 2 switches.
|
|
||||||
|
|
||||||
If you want to make foo a slave, you know the drill by now:
|
|
||||||
|
|
||||||
* set slave mode to 1 on foo
|
|
||||||
* on bar, add foo as a slave
|
|
||||||
|
|
||||||
# in ~/.gitolite.rc on bar
|
|
||||||
$ENV{GL_SLAVES} = 'gitolite@foo gitolite@baz';
|
|
||||||
|
|
||||||
I think that should cover pretty much everything. I *have* tested most of
|
|
||||||
this, but YMMV.
|
|
||||||
|
|
||||||
----
|
|
||||||
|
|
||||||
<a name="_URLs_that_your_users_will_use"></a>
|
|
||||||
|
|
||||||
### URLs that your users will use
|
|
||||||
|
|
||||||
Unless you play DNS tricks, it is more than likely that your users would have
|
|
||||||
to change the URLs they use to access their repos if you change the server
|
|
||||||
they push to.
|
|
||||||
|
|
||||||
I cannot speak for the plethora of git client software out there but for
|
|
||||||
normal git, this problem can be mitigated somewhat by doing this:
|
|
||||||
|
|
||||||
* in `~/.ssh/config` on my workstation, I have
|
|
||||||
|
|
||||||
host gl
|
|
||||||
hostname=primary.server.ip
|
|
||||||
user=gitolite
|
|
||||||
|
|
||||||
* all my `git clone` commands use `gl:reponame` as the URL
|
|
||||||
|
|
||||||
* if the primary goes down, and I have to access the secondary, I just
|
|
||||||
change the `hostname` line in `~/.ssh/config`.
|
|
||||||
|
|
||||||
That's it. Every clone of every repo used anywhere in this userid is now
|
|
||||||
changed.
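
If you would rather not edit the file by hand, the change to the `hostname` line can be scripted. A minimal sketch follows, assuming the `host gl` block looks like the one above. The function name `switch_gl_host` is hypothetical, and you should try it on a copy of your config first:

```shell
# hypothetical helper; rewrites the hostname line inside the "host gl" block only
# usage: switch_gl_host ~/.ssh/config secondary.server.ip
switch_gl_host() {
    sgh_cfg=$1
    sgh_new=$2
    sgh_tmp=$(mktemp) || return 1
    awk -v new="$sgh_new" '
        # track whether we are inside the "host gl" block
        tolower($1) == "host" { in_gl = ($2 == "gl") }
        # replace the hostname line of that block, leave everything else alone
        in_gl && tolower($0) ~ /^[ \t]*hostname[ \t=]/ { print "    hostname=" new; next }
        { print }
    ' "$sgh_cfg" > "$sgh_tmp" || { rm -f "$sgh_tmp"; return 1; }
    # write back in place so the file keeps its permissions
    cat "$sgh_tmp" > "$sgh_cfg"
    rm -f "$sgh_tmp"
}
```

Other `host` blocks in the file are left untouched, so unrelated aliases keep working.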
|
|
||||||
|
|
||||||
To repeat, this may or may not work with all the git clients that exist (like
|
|
||||||
jgit, or any of the GUI tools, and especially if you're on Windows).
|
|
||||||
|
|
||||||
If anyone has a better idea, something that works more universally, I'd love
|
|
||||||
to hear it.
|
|
||||||
|
|