Darcs 2.0.0 (+ 205 patches)
Darcs

David Roundy


Introduction

Darcs is a revision control system, along the lines of CVS or arch. That means that it keeps track of various revisions and branches of your project, and allows changes to propagate from one branch to another. Darcs is intended to be an ``advanced'' revision control system. Darcs has two particularly distinctive features that set it apart from other revision control systems: 1) each copy of the source is a fully functional branch, and 2) underlying darcs is a consistent and powerful theory of patches.

Every source tree a branch

The primary simplifying notion of darcs is that every copy of your source code is a full repository. This is dramatically different from CVS, in which the normal usage is for there to be one central repository from which source code will be checked out. It is closer to the notion of arch, since the `normal' use of arch is for each developer to create his own repository. However, darcs makes it even easier, since simply checking out the code is all it takes to create a new repository. This has several advantages, since you can harness the full power of darcs in any scratch copy of your code, without committing your possibly destabilizing changes to a central repository.

Theory of patches

The development of a simplified theory of patches is what originally motivated me to create darcs. This patch formalism means that darcs patches have a set of properties, which make possible manipulations that couldn't be done in other revision control systems. First, every patch is invertible. Secondly, sequential patches (i.e. patches that are created in sequence, one after the other) can be reordered, although this reordering can fail, which means the second patch is dependent on the first. Thirdly, patches which are in parallel (i.e. both patches were created by modifying identical trees) can be merged, and the result of a set of merges is independent of the order in which the merges are performed. This last property is critical to darcs' philosophy, as it means that a particular version of a source tree is fully defined by the list of patches that are in it, i.e. there is no issue regarding the order in which merges are performed. For a more thorough discussion of darcs' theory of patches, see Appendix [*].
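The three properties can be illustrated with a deliberately tiny model. The sketch below is not darcs' patch format or algorithm: the LineChange class and the commute function are hypothetical names, and a patch here only ever replaces the text of a single line, so two patches conflict or depend on each other exactly when they touch the same line.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class LineChange:
    """Toy patch (hypothetical, not a darcs type): replace the text of one line."""
    index: int
    old: str
    new: str

    def apply(self, tree: List[str]) -> List[str]:
        # A patch only applies to a tree that contains the text it expects.
        if tree[self.index] != self.old:
            raise ValueError("patch does not apply to this tree")
        result = list(tree)
        result[self.index] = self.new
        return result

    def invert(self) -> "LineChange":
        # Property 1: every patch has an inverse that undoes it.
        return LineChange(self.index, self.new, self.old)


def commute(first: LineChange,
            second: LineChange) -> Optional[Tuple[LineChange, LineChange]]:
    """Property 2: reorder two sequential patches.

    Returns the pair in swapped order, or None when the second patch
    depends on the first (in this toy model: they touch the same line).
    """
    if first.index == second.index:
        return None
    return second, first


tree = ["alpha", "beta", "gamma"]
p = LineChange(0, "alpha", "ALPHA")
q = LineChange(2, "gamma", "GAMMA")

# Invertibility: applying a patch and then its inverse restores the tree.
assert p.invert().apply(p.apply(tree)) == tree

# Commutation: the reordered pair produces the same tree.
q_first, p_second = commute(p, q)
assert p_second.apply(q_first.apply(tree)) == q.apply(p.apply(tree))

# Dependency: a patch that edits what p produced cannot be moved before p.
r = LineChange(0, "ALPHA", "Alpha")
assert commute(p, r) is None

# Property 3: p and q were both made against the same tree (in parallel),
# and merging them gives the same result in either order.
assert q.apply(p.apply(tree)) == p.apply(q.apply(tree))
```

Real darcs patches also have to adjust line numbers when they are reordered, which this model avoids by never inserting or removing lines.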

A simple advanced tool

Besides being ``advanced'' as discussed above, darcs is actually also quite simple. Versioning tools can be seen as three layers. At the foundation is the ability to manipulate changes. On top of that must be placed some kind of database system to keep track of the changes. Finally, at the very top is some sort of distribution system for getting changes from one place to another.

Really, only the first of these three layers is of particular interest to me, so the other two are done as simply as possible. At the database layer, darcs just has an ordered list of patches along with the patches themselves, each stored as an individual file. Darcs' distribution system is strongly inspired by that of arch. Like arch, darcs uses a dumb server, typically apache or just a local or network file system, when pulling patches. Darcs has built-in support for using ssh to write to a remote file system. A darcs executable is called on the remote system to apply the patches. Arbitrary other transport protocols are supported, through an environment variable describing a command that will run darcs on the remote system. See the documentation for DARCS_APPLY_FOO in Chapter [*] for details.

The recommended method is to send patches through gpg-signed email messages, which has the advantage of being mostly asynchronous.

Keeping track of changes rather than versions

In the previous section, I explained revision control systems in terms of three layers. One can also look at them as having two distinct uses. One is to provide a history of previous versions. The other is to keep track of changes that are made to the repository, and to allow these changes to be merged and moved from one repository to another. These two uses are distinct, and almost orthogonal, in the sense that a tool can support one of the two uses optimally while providing no support for the other. Darcs is not intended to maintain a history of versions, although it is possible to kludge together such a revision history, either by making each new patch depend on all previous patches, or by tagging regularly. In a sense, this is what the tag feature is for, but the intention is that tagging will be used only to mark particularly notable versions (e.g. released versions, or perhaps versions that pass a time-consuming test suite).

Other revision control systems are centered upon the job of keeping track of a history of versions, with the ability to merge changes being added as it was seen that this would be desirable. But the fundamental object remained the versions themselves.

In such a system, a patch (I am using patch here to mean an encapsulated set of changes) is uniquely determined by two trees. Merging changes that are in two trees consists of finding a common parent tree, computing the diffs of each tree with their parent, and then cleverly combining those two diffs and applying the combined diff to the parent tree, possibly at some point in the process allowing human intervention, to allow for fixing up problems in the merge such as conflicts.

In the world of darcs, the source tree is not the fundamental object, but rather the patch is the fundamental object. Rather than a patch being defined in terms of the difference between two trees, a tree is defined as the result of applying a given set of patches to an empty tree. Moreover, these patches may be reordered (unless there are dependencies between the patches involved) without changing the tree. As a result, there is no need to find a common parent when performing a merge. Or, if you like, their common parent is defined by the set of common patches, and may not correspond to any version in the version history.
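The idea that a tree is determined by its set of patches can be made concrete with ordinary set operations. This is only a sketch: the patch names are invented, and real repositories also have to respect dependencies and conflicts between patches, which a plain set ignores.

```python
# Each repository is modelled as the set of patches it contains.
# All patch names below are hypothetical.
mine = {"initial import", "fix typo", "add parser"}
yours = {"initial import", "fix typo", "speed up lexer"}

# The "common parent" is simply the set of patches both repositories share.
# It need not correspond to any version that ever existed on disk.
common_parent = mine & yours

# The patches a pull from your repository would offer me.
offered_by_pull = yours - mine

# The result of merging does not depend on who merges from whom.
merged_by_me = mine | yours
merged_by_you = yours | mine

print(sorted(common_parent))
print(sorted(offered_by_pull))
print(merged_by_me == merged_by_you)
```

Because the merged repository is described by a set, there is no separate step of locating a common ancestor version before merging.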

One useful consequence of darcs' patch-oriented philosophy is that since a patch need not be uniquely defined by a pair of trees (old and new), we can have several ways of representing the same change, which differ only in how they commute and what the result of merging them is. Of course, creating such a patch will require some sort of user input. This is a Good Thing, since the user creating the patch should be the one forced to think about what he really wants to change, rather than the users merging the patch. An example of this is the token replace patch (see Section [*]). This feature makes it possible to create a patch, for example, which changes every instance of the variable ``stupidly_named_var'' to ``better_var_name'', while leaving ``other_stupidly_named_var'' untouched. When this patch is merged with any other patch involving ``stupidly_named_var'', that instance will also be modified to ``better_var_name''. This is in contrast to a more conventional merging method which would not only fail to change new instances of the variable, but would also involve conflicts when merging with any patch that modifies lines containing the variable. By using additional information about the programmer's intent, darcs is thus able to make the process of changing a variable name the trivial task that it really is: a simple search and replace, modulo tokenizing the code appropriately.
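The effect of a token replace can be sketched in a few lines. This is an illustration of the idea rather than darcs' implementation: replace_token is a hypothetical helper, and the default set of token characters used here (letters, digits and underscore) is an assumption.

```python
import re


def replace_token(text: str, old: str, new: str,
                  token_chars: str = "A-Za-z_0-9") -> str:
    """Replace `old` with `new` only where `old` stands alone as a token.

    `token_chars` is the body of a regular-expression character class
    naming the characters that may appear inside a token (assumed default).
    """
    pattern = re.compile(
        rf"(?<![{token_chars}]){re.escape(old)}(?![{token_chars}])"
    )
    # A function is used as the replacement so `new` is inserted literally.
    return pattern.sub(lambda match: new, text)


source = (
    "stupidly_named_var = 1\n"
    "other_stupidly_named_var = stupidly_named_var + 1\n"
)
renamed = replace_token(source, "stupidly_named_var", "better_var_name")
print(renamed)

# A line added in parallel by someone else is renamed the same way when
# the replace is applied to it, which is what makes the merge painless.
added_elsewhere = "print(stupidly_named_var)\n"
print(replace_token(added_elsewhere, "stupidly_named_var", "better_var_name"))
```

The lookbehind and lookahead ensure that ``other_stupidly_named_var'' is left alone, because the match there is preceded by a token character.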

The patch formalism discussed in Appendix [*] is what makes darcs' approach possible. In order for a tree to consist of a set of patches, there must be a deterministic merge of any set of patches, regardless of the order in which they must be merged. This requires that one be able to reorder patches. While I don't know that the patches are required to be invertible as well, my implementation certainly requires invertibility. In particular, invertibility is required to make use of Theorem [*], which is used extensively in the manipulation of merges.

Features

Record changes locally

In darcs, the equivalent of a cvs ``commit'' is called record, because it doesn't put the change into any remote or centralized repository. Changes are always recorded locally, meaning no net access is required in order to work on your project and record changes as you make them. Moreover, this means that there is no need for a separate ``disconnected operation'' mode.

Interactive records

You can choose to perform an interactive record, in which case darcs will prompt you for each change you have made and ask if you wish to record it. Of course, you can tell darcs to record all the changes in a given file, or to skip all the changes in a given file, or go back to a previous change, or whatever. There is also an experimental graphical interface, which allows you to view and choose changes even more easily, and in whichever order you like.

Unrecord local changes

As a corollary to the ``local'' nature of the record operation, if a change hasn't yet been published to the world--that is, if the local repository isn't accessible by others--you can safely unrecord a change (even if it wasn't the most recently recorded change) and then re-record it differently, for example if you forgot to add a file, introduced a bug or realized that what you recorded as a single change was really two separate changes.

Interactive everything else

Most darcs commands support an interactive interface. The ``revert'' command, for example, which undoes unrecorded changes, has the same interface as record, so you can easily revert just a single change. Pull, push, send and apply all allow you to view and interactively select which changes you wish to pull, push, send or apply.

Test suites

Darcs has support for integrating a test suite with a repository. If you choose to use this, you can define a test command (e.g. ``make check'') and have darcs run that command on a clean copy of the project either prior to recording a change or prior to applying changes--and to reject changes that cause the test to fail.
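The accept-or-reject decision described here amounts to running a command in a scratch copy and checking its exit status. The following sketch shows that logic only; passes_test is a hypothetical helper and not how darcs is implemented.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import List


def passes_test(tree: Path, test_command: List[str]) -> bool:
    """Run `test_command` in a throwaway copy of `tree`.

    Returns True when the command exits with status 0, so the caller can
    reject a change whose test fails without touching the original tree.
    """
    with tempfile.TemporaryDirectory() as scratch:
        clean_copy = Path(scratch) / "copy"
        shutil.copytree(tree, clean_copy)
        completed = subprocess.run(test_command, cwd=clean_copy)
        return completed.returncode == 0


if __name__ == "__main__":
    # Stand-in test commands; a real project might use ["make", "check"].
    with tempfile.TemporaryDirectory() as project:
        (Path(project) / "README").write_text("example project\n")
        ok = [sys.executable, "-c", "raise SystemExit(0)"]
        failing = [sys.executable, "-c", "raise SystemExit(1)"]
        print(passes_test(Path(project), ok))
        print(passes_test(Path(project), failing))
```

Running the test in a copy matters because the working directory may contain unrecorded changes or build products that would hide a failure.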

Any old server

Darcs does not require a specialized server in order to make a repository available for read access. You can use http, ftp, or even just a plain old ssh server to access your darcs repository.

You decide write permissions

Darcs doesn't try to manage write access. That's your business. Supported push methods include direct ssh access (if you're willing to give direct ssh access away), using sudo to allow users who already have shell access to only apply changes to the repository, or verification of gpg-signed changes sent by email against a list of allowed keys. In addition, there is good support for submission of patches by email that are not automatically applied, but can easily be applied with a shell escape from a mail reader (this is how I deal with contributions to darcs).

Symmetric repositories

Every darcs repository is created equal (well, with the exception of a ``partial'' repository, which doesn't contain a full history...), and every working directory has an associated repository. As a result, there is a symmetry between ``uploading'' and ``downloading'' changes--you can use the same commands (push or pull) for either purpose.

CGI script

Darcs has a CGI script that allows browsing of the repositories.

Portable

Darcs runs on UNIX (or UNIX-like) systems (which includes Mac OS X) as well as on Microsoft Windows.

File and directory moves

Renames or moves of files and directories are, of course, handled properly, so when you rename a file or move it to a different directory, its history is unbroken, and merges with repositories that don't have the file renamed will work as expected.

Token replace

You can use the ``darcs replace'' command to modify all occurrences of a particular token (defined by a configurable set of characters that are allowed in ``tokens'') in a file. This has the advantage that merges with changes that introduce new copies of the old token will have the effect of changing it to the new token--which comes in handy when changing a variable or function name that is used throughout a project.

Configurable defaults

You can easily configure the default flags passed to any command on either a per-repository or a per-user basis or a combination thereof.

Switching from CVS

Darcs is refreshingly different from CVS.

CVS keeps version controlled data in a central repository, and requires that users check out a working directory whenever they wish to access the version-controlled sources. In order to modify the central repository, a user needs to have write access to the central repository; if he doesn't, CVS merely becomes a tool to get the latest sources.

In darcs there is no distinction between working directories and repositories. In order to work on a project, a user makes a local copy of the repository he wants to work in; he may then harness the full power of version control locally. In order to distribute his changes, a user who has write access can push them to the remote repository; one who doesn't can simply send them by e-mail in a format that makes them easy to apply on the remote system.

Darcs commands for CVS users

Because of the different models used by cvs and darcs, it is difficult to provide a complete equivalence between cvs and darcs. A rough correspondence for the everyday commands follows:
cvs checkout                    darcs get
cvs update                      darcs pull
cvs -n update                   darcs pull --dry-run (summarize remote changes)
cvs -n update                   darcs whatsnew --summary (summarize local changes)
cvs -n update | grep '?'        darcs whatsnew -ls | grep ^a (list potential files to add)
rm foo.txt; cvs update foo.txt  darcs revert foo.txt (revert to foo.txt from repo)
cvs diff                        darcs whatsnew (if checking local changes)
cvs diff                        darcs diff (if checking recorded changes)
cvs commit                      darcs record (if committing locally)
cvs commit                      darcs tag (if marking a version for later use)
cvs commit                      darcs push or darcs send (if committing remotely)
cvs diff | mail                 darcs send
cvs add                         darcs add
cvs tag -b                      darcs get
cvs tag                         darcs tag

Migrating CVS repositories to darcs

Tools and instructions for migrating CVS repositories to darcs are provided on the darcs community website: http://darcs.net/DarcsWiki/ConvertingFromCvs

Switching from arch

Although arch, like darcs, is a distributed system, and the two systems have many similarities (both require no special server, for example), their essential organization is very different.

Like CVS, arch keeps data in two types of data structures: repositories (called ``archives'') and working directories. In order to modify a repository, one must first check out a corresponding working directory. This requires that users remember a number of different ways of pushing data around -- tla get, update, commit, archive-mirror and so on.

In darcs, on the other hand, there is no distinction between working directories and repositories, and just checking out your sources creates a local copy of a repository. This allows you to harness the full power of version control in any scratch copy of your sources, and also means that there are just two ways to push data around: darcs record, which stores edits into your local repository, and pull, which moves data between repositories. (darcs push is merely the opposite of pull; send and apply are just the two halves of push).

Darcs commands for arch users

Because of the different models used by arch and darcs, it is difficult to provide a complete equivalence between arch and darcs. A rough correspondence for the everyday commands follows:

tla init-tree                darcs initialize
tla get                      darcs get
tla update                   darcs pull
tla file-diffs f | patch -R  darcs revert
tla changes --diffs          darcs whatsnew
tla logs                     darcs changes
tla file-diffs               darcs diff -u
tla add                      darcs add
tla mv                       darcs mv (not tla move)
tla commit                   darcs record (if committing locally)
tla commit                   darcs tag (if marking a version for later use)
tla commit                   darcs push or darcs send (if committing remotely)
tla archive-mirror           darcs pull or darcs push
tla tag                      darcs get (if creating a branch)
tla tag                      darcs tag (if creating a tag)

Migrating arch repositories to darcs

Tools and instructions for migrating arch repositories to darcs are provided on the darcs community website: http://darcs.net/DarcsWiki/ConvertingFromArch

Building darcs

This chapter should walk you through the steps necessary to build darcs for yourself. There are in general two ways to build darcs. One is for building released versions from tarballs, and the other is to build the latest and greatest darcs, from the darcs repo itself.

Please let me know if you have any problems building darcs, or if you don't run into the problems described in this chapter and think something here is obsolete, so I can keep this page up-to-date.

Prerequisites

To build darcs you will need to have ghc, the Glorious Glasgow Haskell Compiler. You should have at the very minimum version 6.4.

It is a good idea (but not required) to have software installed that provides darcs with network access. The libwww-dev, libwww-ssl-dev or libcurl packages newer than 7.18.0 are recommended because they provide pipelining support, which speeds up HTTP access. They have to be explicitly chosen with --with-libwww or --with-curl-pipelining. Otherwise, darcs will automatically look for one of libcurl, curl or wget. You also might want to have scp available if you want to grab your repos over ssh...

To use the diff command of darcs, a diff program supporting options -r (recursive diff) and -N (show new files as differences against an empty file) is required. The configure script will look for gdiff, gnudiff and diff in this order. You can force the use of another program by setting the DIFF environment variable before running configure.

To rebuild the documentation (which should not be necessary since it is included in html form with the tarballs), you will need to have latex installed, as well as latex2html if you want to build it in html form.

Building on Mac OS X

To build on Mac OS X, you will need the Apple Developer Tools and the ghc 6.4 package installed.

Building on Microsoft Windows

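To build on Microsoft Windows, you will need GHC, MinGW and MSYS, plus the zlib and curl libraries and headers. The steps below refer to their installation directories as <ghc-dir>, <mingw-dir>, <msys-dir> and <curl-dir>.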

Copy the zlib and curl libraries and headers to both GHC and MinGW. GHC stores C headers in <ghc-dir>/gcc-lib/include and libraries in <ghc-dir>/gcc-lib. MinGW stores headers in <mingw-dir>/include and libraries in <mingw-dir>/lib.

Set PATH to include the <msys-dir>/bin, <mingw-dir>/bin, <curl-dir>, and a directory containing a pre-built darcs.exe if you want the build's patch context stored for `darcs --exact-version'.

C:\darcs> cd <darcs-source-dir>
C:\darcs> sh

$ export GHC=/c/<ghc-dir>/bin/ghc.exe
$ autoconf
$ ./configure --target=mingw
$ make

Building from tarball

If you get darcs from a tarball, the procedure (after unpacking the tarball itself) is as follows:
% ./configure
% make
# Optional, but recommended
% make test
% make install

There are options to configure that you may want to check out with

% ./configure --help

If your header files are installed in a non-standard location, you may need to define the CFLAGS and CPPFLAGS environment variables to include the path to the headers. e.g. on NetBSD, you may need to run

% CFLAGS=-I/usr/pkg/include CPPFLAGS=-I/usr/pkg/include ./configure

Building darcs from the repository

To build the latest darcs from its repository, you will first need a working copy of Darcs 2. You can get darcs using:
% darcs get -v http://darcs.net/
and once you have the darcs repository you can bring it up to date with a
% darcs pull

The repository doesn't hold automatically generated files, which include the configure script and the HTML documentation, so you need to run autoconf first.

You'll need autoconf 2.50 or higher. Some systems have more than one version of autoconf installed. For example, autoconf may point to version 2.13, while autoconf259 runs version 2.59.

Also note that make is really "GNU make". On some systems, such as the *BSDs, you may need to type gmake instead of make for this to work.

If you want to create readable documentation you'll need to have latex installed.

% autoconf
% ./configure
% make
% make install

If you want to tweak the configure options, you'll need to run ./configure yourself after the make, and then run make again.

Submitting patches to darcs

I know, this doesn't really belong in this chapter, but if you're using the repository version of darcs it's really easy to submit patches to me using darcs. In fact, even if you don't know any Haskell, you could submit fixes or additions to this document (by editing building_darcs.tex) based on your experience building darcs...

To do so, just record your changes (which you made in the darcs repository)

% darcs record --no-test
making sure to give the patch a nice descriptive name. The --no-test option keeps darcs from trying to run the unit tests, which can be rather time-consuming. Then you can send the patch to the darcs-devel mailing list by email with
% darcs send -u
The darcs repository stores the email address to which patches should be sent by default. The email address you see is actually my own, but when darcs notices that you haven't signed the patch with my GPG key, it will forward the message to darcs-devel.

Getting started

This chapter will lead you through an example use of darcs, which hopefully will allow you to get started using darcs with your project.

Creating your repository

Creating your repository in the first place just involves telling darcs to create the special directory (called _darcs) in your project tree, which will hold the revision information. This is done by simply calling from the root directory of your project:

% cd my_project/
% darcs initialize --darcs-2
This creates the _darcs directory and populates it with whatever files and directories are needed to describe an empty project. You now need to tell darcs what files and directories in your project should be under revision control. You do this using the command darcs add:
% darcs add *.c Makefile.am configure.ac
When you have added all your files (or at least, think you have), you will want to record your changes. ``Recording'' always includes adding a note as to why the change was made, or what it does. In this case, we'll just note that this is the initial version.
% darcs record --all
What is the patch name? Initial revision.
Note that since we didn't specify a patch name on the command line we were prompted for one. If the environment variable `EMAIL' isn't set, you will also be prompted for your email address. Each patch that is recorded is given a unique identifier consisting of the patch name, its creator's email address, and the date when it was created.
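The identifier described above can be pictured as a small immutable record. The sketch below is only an illustration: PatchId is a hypothetical name, the address and dates are invented, and it does not reproduce how darcs actually stores or hashes this information.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PatchId:
    """Hypothetical model of a patch identifier: name, author, date."""
    name: str
    author_email: str
    recorded_at: datetime


first = PatchId(
    "Initial revision.",
    "user@example.org",
    datetime(2008, 4, 1, 9, 30, tzinfo=timezone.utc),
)
# Same name and author, recorded a day later: a different patch.
second = PatchId(
    "Initial revision.",
    "user@example.org",
    datetime(2008, 4, 2, 9, 30, tzinfo=timezone.utc),
)

print(first == second)
print(len({first, second}))
```

Since the date is part of the identity, reusing a patch name does not make two patches the same.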

Making changes

Now that we have created our repository, make a change to one or more of your files. After making the modification run:

% darcs whatsnew
This should show you the modifications that you just made, in the darcs patch format. If you prefer to see your changes in a different format, read Section [*], which describes the whatsnew command in detail.

Let's say you have now made a change to your project. The next thing to do is to record a patch. Recording a patch consists of grouping together a set of related changes, and giving them a name. It also tags the patch with the date it was recorded and your email address.

To record a patch simply type:

% darcs record
darcs will then prompt you with all the changes that you have made that have not yet been recorded, asking you which ones you want to include in the new patch. Finally, darcs will ask you for a name for the patch.

You can now rerun whatsnew, and see that indeed the changes you have recorded are no longer marked as new.

Making your repository visible to others

How do you let the world know about these wonderful changes? Obviously, they must be able to see your repository. Currently the easiest way to do this is typically by http using any web server. The recommended way to do this (using apache in a UNIX environment) is to create a directory called /var/www/repos, and then put a symlink to your repository there:
% cd /var/www/repos
% ln -s /home/username/myproject .

As long as you're running a web server and making your repository available to the world, you may as well make it easy for people to see what changes you've made. You can do this by running make installserver, which installs the program darcs_cgi at /usr/lib/cgi-bin/darcs. You also will need to create a cache directory named /var/cache/darcs_cgi, and make sure the owner of that directory is the same user that your web server runs its cgi scripts as. For me, this is www-data. Now your friends and enemies should be able to easily browse your repositories by pointing their web browsers at http://your.server.org/cgi-bin/darcs.

Getting changes made to another repository

Ok, so I can now browse your repository using my web browser... so what? How do I get your changes into my repository, where they can do some good? It couldn't be easier. I just cd into my repository, and there type:
% darcs pull http://your.server.org/repos/yourproject
Darcs will check to see if you have recorded any changes that aren't in my current repository. If so, it'll prompt me for each one, to see which ones I want to add to my repository. Note that you may see a different series of prompts depending on your answers, since sometimes one patch depends on another, so if you answer yes to the first one, you won't be prompted for the second if the first depends on it.

Of course, maybe I don't even have a copy of your repository. In that case I'd want to do a

% darcs get http://your.server.org/repos/yourproject
which gets the whole repository.

I could instead create an empty repository and fetch all of your patches with pull. Get is just a more efficient way to clone a whole repository.

Get, pull and push also work over ssh. Ssh-paths are of the same form accepted by scp, namely [username@]host:/path/to/repository.
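To make the path form concrete, here is a small parser for scp-style locations. It is a sketch for illustration, not code taken from darcs: parse_ssh_path and SshPath are hypothetical names, and the rule that anything containing ``://'' is treated as a URL rather than an ssh path is an assumption.

```python
import re
from typing import NamedTuple, Optional


class SshPath(NamedTuple):
    user: Optional[str]
    host: str
    path: str


# [username@]host:path, where neither user nor host may contain @, / or :
_SCP_STYLE = re.compile(
    r"^(?:(?P<user>[^@/:]+)@)?(?P<host>[^@/:]+):(?P<path>.+)$"
)


def parse_ssh_path(location: str) -> Optional[SshPath]:
    """Return the parts of an scp-style location, or None if it is not one."""
    if "://" in location:
        return None  # assumed to be a URL such as http://...
    match = _SCP_STYLE.match(location)
    if match is None:
        return None  # for example a plain local path
    return SshPath(match.group("user"), match.group("host"), match.group("path"))


print(parse_ssh_path("repouser@repo.server:repo/directory"))
print(parse_ssh_path("host:/path/to/repository"))
print(parse_ssh_path("http://your.server.org/repos/yourproject"))
print(parse_ssh_path("/home/username/myproject"))
```

The sketch does not try to handle Windows drive letters such as C:\darcs, which would be mistaken for a host named C.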

Moving patches from one repository to another

Darcs is flexible as to how you move patches from one repository to another. This section will introduce all the ways you can get patches from one place to another, starting with the simplest and moving to the most complicated.

All pulls

The simplest method is the ``all-pull'' method. This involves making each repository readable (by http, ftp, nfs-mounted disk, whatever), and running darcs pull in the repository you want to move the patch to. This is nice, as it doesn't require you to give write access to anyone else, and is reasonably simple.

Send and apply manually

Sometimes you have a machine on which it is not convenient to set up a web server, perhaps because it's behind a firewall or perhaps for security reasons, or because it is often turned off. In this case you can use darcs send from that computer to generate a patch bundle destined for another repository. You can either let darcs email the patch for you, or save it as a file and transfer it by hand. Then in the destination repository you (or the owner of that repository) run darcs apply to apply the patches contained in the bundle. This is also quite a simple method since, like the all-pull method, it doesn't require that you give anyone write access to your repository. But it's less convenient, since you have to keep track of the patch bundle (in the email, or whatever).

If you use the send and apply method with email, you'll probably want to create a _darcs/prefs/email file containing your email address. This way anyone who sends to your repository will automatically send the patch bundle to your email address.

If you receive many patches by email, you probably will benefit by running darcs apply directly from your mail program. I have in my .muttrc the following:

auto_view text/x-patch text/x-darcs-patch
macro pager A "<pipe-entry>darcs apply --verbose --mark-conflicts \
        --reply droundy@abridgegame.org --repodir ~/darcs"
which allows me to view a sent patch, and then apply the patch directly from mutt, sending a confirmation email to the person who sent me the patch. The auto_view line relies on the following lines, or something like them, being present in one's .mailcap:
text/x-patch;                           cat; copiousoutput
text/x-darcs-patch;                     cat; copiousoutput

Push

If you use ssh (and preferably also ssh-agent, so you won't have to keep retyping your password), you can use the push method to transfer changes (using the scp protocol for communication). This method is again not very complicated, since you presumably already have the ssh permissions set up. Push can also be used when the target repository is local, in which case ssh isn't needed. On the other hand, in this situation you could as easily run a pull, so there isn't much benefit.

Note that you can use push to administer a multiple-user repository. You just need to create a user for the repository (or repositories), and give everyone with write access ssh access, perhaps using .ssh/authorized_keys. Then they run

% darcs push repouser@repo.server:repo/directory

Push --apply-as

Now we get more subtle. If you like the idea in the previous paragraph about creating a repository user to own a repository which is writable by a number of users, you have one other option.

Push --apply-as can run on either a local repository or one accessed with ssh, but uses sudo to run a darcs apply command (having created a patch bundle as in send) as another user. You can add the following line in your sudoers file to allow the users to apply their patches to a centralized repository:

ALL   ALL = (repo-user) NOPASSWD: /usr/bin/darcs apply --all --repodir /repo/path*
This method is ideal for a centralized repository when all the users have accounts on the same computer, if you don't want your users to be able to run arbitrary commands as repo-user.

Sending signed patches by email

Most of the previous methods are a bit clumsy if you don't want to give each person with write access to a repository an account on your server. Darcs send can be configured to send a cryptographically signed patch by email. You can then set up your mail system to have darcs verify that patches were signed by an authorized user and apply them when a patch is received by email. The results of the apply can be returned to the user by email. Unsigned patches (or patches signed by unauthorized users) will be forwarded to the repository owner (or whoever you configure them to be forwarded to...).

This method is especially nice when combined with the --test option of darcs apply, since it allows you to run the test suite (assuming you have one) and reject patches that fail--and it's all done on the server, so you can happily go on working on your development machine without slowdown while the server runs the tests.

Setting up darcs to run automatically in response to email is by far the most complicated way to get patches from one repository to another... so it'll take a few sections to explain how to go about it.

Security considerations

When you set up darcs to run apply on signed patches, you should assume that a user with write access can write to any file or directory that is writable by the user under which the apply process runs. Unless you specify the --no-test flag to darcs apply (and this is not the default), you are also allowing anyone with write access to that repository to run arbitrary code on your machine (since they can run a test suite--which they can modify however they like). This is quite a potential security hole.

For these reasons, if you don't implicitly trust your users, it is recommended that you create a user for each repository to limit the damage an attacker can do with access to your repository. When considering who to trust, keep in mind that a security breach on any developer's machine could give an attacker access to their private key and passphrase, and thus to your repository.

Installing necessary programs

You also must install the following programs: gnupg, a mailer configured to receive mail (e.g. exim, sendmail or postfix), and a web server (usually apache). If you want to be able to browse your repository on the web you must also configure your web server to run cgi scripts and make sure the darcs cgi script was properly installed (by either a darcs-server package, or `make install-server').

Granting access to a repository

You create your gpg key by running (as your normal user):

% gpg --gen-key
You will be prompted for your name and email address, among other options. Of course, you can skip this step if you already have a gpg key you wish to use.

You now need to export the public key so we can tell the patcher about it. You can do this with the following command (again as your normal user):

% gpg --export "email@address" > /tmp/exported_key
And now we can add your key to the allowed_keys:
(as root)> gpg --keyring /var/lib/darcs/repos/myproject/allowed_keys \
               --no-default-keyring --import /tmp/exported_key
You can repeat this process any number of times to authorize multiple users to send patches to the repository.

You should now be able to send a patch to the repository by running as your normal user, in a working copy of the repository:

% darcs send --sign http://your.computer/repos/myproject
You may want to add ``send sign'' to the file _darcs/prefs/defaults so that you won't need to type --sign every time you want to send...
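
A minimal sketch of adding that default without creating duplicate lines. The helper name add_default is made up for this example and is not part of darcs; it only edits a text file.

```shell
# add_default FILE LINE: append LINE to FILE unless an identical line is already there.
# (Hypothetical helper, not a darcs command.)
add_default() {
    file=$1
    line=$2
    mkdir -p "$(dirname "$file")"
    touch "$file"
    # -x matches whole lines only, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Typical use, run from the top of a working copy:
#   add_default _darcs/prefs/defaults 'send sign'
```

Running it twice leaves a single ``send sign'' line in the file.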

If your gpg key is protected by a passphrase, then executing send with the --sign option might give you the following error:

darcs failed:  Error running external program 'gpg'
The most likely cause of this error is that you have a misconfigured gpg that tries to automatically use a non-existent gpg-agent program. GnuPG will still work without gpg-agent when you try to sign or encrypt your data with a passphrase protected key. However, it will exit with an error code 2 (ENOENT) causing darcs to fail. To fix this, you will need to edit your ~/.gnupg/gpg.conf file and comment out or remove the line that says:
use-agent
If after commenting out or removing the use-agent line in your gpg configuration file you still get the same error, then you probably have a modified GnuPG with use-agent as a hard-coded option. In that case, you should change use-agent to no-use-agent to disable it explicitly.
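
If you would rather not edit the file by hand, something along these lines should comment out a bare use-agent line. The function name disable_use_agent is an assumption of this sketch; keep a backup of your configuration before trying it.

```shell
# disable_use_agent CONF: comment out a bare "use-agent" line in a gpg.conf-style file.
# (Hypothetical helper; lines such as "no-use-agent" are left alone.)
disable_use_agent() {
    conf=$1
    # Write to a temporary file first so a failed sed cannot truncate the original,
    # then copy the contents back so the original file keeps its permissions.
    sed 's/^[[:space:]]*use-agent[[:space:]]*$/# use-agent/' "$conf" > "$conf.tmp" &&
        cat "$conf.tmp" > "$conf" &&
        rm -f "$conf.tmp"
}

# Typical use:
#   cp ~/.gnupg/gpg.conf ~/.gnupg/gpg.conf.bak
#   disable_use_agent ~/.gnupg/gpg.conf
```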

Setting up a sendable repository using procmail

If you don't have root access on your machine, or perhaps simply don't want to bother creating a separate user, you can set up a darcs repository using procmail to filter your mail. I will assume that you already use procmail to filter your email. If not, you will need to read up on it, or perhaps should use a different method for routing the email to darcs.

To begin with, you must configure your repository so that a darcs send to your repository will know where to send the email. Do this by creating a file in /path/to/your/repo/_darcs/prefs called email containing your email address. As a trick (to be explained below), we will create the email address with ``my darcs repo'' as your name, in an email address of the form ``David Roundy <droundy@abridgegame.org>''.

% echo 'my darcs repo <user@host.com>' \
      > /path/to/your/repo/_darcs/prefs/email

The next step is to set up a gnupg keyring containing the public keys of people authorized to send to your repository. Here I'll give a second way of going about this (see above for the first). This time I'll assume you want to give me write access to your repository. You can do this by:

gpg --no-default-keyring \
    --keyring /path/to/the/allowed_keys --recv-keys D3D5BCEC
This works because ``D3D5BCEC'' is the ID of my gpg key, and I have uploaded my key to the gpg keyservers. Actually, this also requires that you have configured gpg to access a valid keyserver. You can, of course, repeat this command for all keys you want to allow access to.

Finally, we add a few lines to your .procmailrc:

:0
* ^TOmy darcs repo
|(umask 022; darcs apply --reply user@host.com \
    --repodir /path/to/your/repo --verify /path/to/the/allowed_keys)
The purpose for the ``my darcs repo'' trick is partially to make it easier to recognize patches sent to the repository, but is even more crucial to avoid nasty bounce loops by making the --reply option have an email address that won't go back to the repository. This means that unsigned patches that are sent to your repository will be forwarded to your ordinary email.

Like most mail-processing programs, Procmail by default sets a tight umask. However, this will prevent the repository from remaining world-readable; thus, the ``umask 022'' is required to relax the umask. (Alternatively, you could set Procmail's global UMASK variable to a more suitable value.)
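
To see what the umask does to newly created files, here is a small experiment that has nothing to do with darcs itself. The function name mode_with_umask is made up for the illustration.

```shell
# mode_with_umask MASK: create a file under the given umask and print its mode string.
# The body runs in a subshell, so the caller's umask is not changed.
mode_with_umask() (
    dir=$(mktemp -d) || exit 1
    umask "$1"
    : > "$dir/patch"
    # The first ten characters of `ls -l` are the file type and permission bits.
    ls -l "$dir/patch" | cut -c1-10
    rm -rf "$dir"
)

mode_with_umask 077   # tight mask: -rw-------
mode_with_umask 022   # relaxed mask: -rw-r--r--
```

With the tight mask, files that darcs apply creates would be readable only by their owner, which is why the recipe relaxes it.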

Checking if your e-mail patch was applied

After sending a patch with darcs send, you may not receive any feedback, even if the patch is applied. You can confirm whether or not your patch was applied to the remote repository by pointing darcs changes at a remote repository:

darcs changes --last=10 --repo=http://darcs.net/

That shows you the last 10 changes in the remote repository. You can adjust the options given to changes if a more advanced query is needed.


Reducing disk space usage

A Darcs repository contains the patches that Darcs uses to store history, the working directory, and a pristine tree (a copy of the working directory files with no local modifications). For large repositories, this can add up to a fair amount of disk usage.

There are two techniques that can be used to reduce the amount of space used by Darcs repositories: linking and using no pristine tree. The former can be used on any repository; the latter is only suitable in special circumstances, as it makes some operations much slower.

Linking between repositories

A number of filesystems support linking files, sharing a single file data between different directories. Under some circumstances, when repositories are very similar (typically because they represent different branches of the same piece of software), Darcs will use linking to avoid storing the same file multiple times.
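
For readers who have not met hard links before, the following sketch shows the underlying filesystem mechanism using plain ln. The directory names and the helper same_file are invented for the example; it says nothing about how darcs lays out or rewrites its own files.

```shell
# same_file A B: succeed if A and B are two names for the same inode.
same_file() {
    [ "$(ls -i "$1" | awk '{print $1}')" = "$(ls -i "$2" | awk '{print $1}')" ]
}

demo=$(mktemp -d)
mkdir "$demo/trunk" "$demo/branch"
echo 'patch contents' > "$demo/trunk/patch"
ln "$demo/trunk/patch" "$demo/branch/patch"     # hard link: no second copy on disk
same_file "$demo/trunk/patch" "$demo/branch/patch" && echo linked

# Replacing one of the names with a freshly written file breaks the link.
cp "$demo/trunk/patch" "$demo/branch/patch.new"
mv "$demo/branch/patch.new" "$demo/branch/patch"
same_file "$demo/trunk/patch" "$demo/branch/patch" || echo unlinked
```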

Whenever you invoke darcs get to copy a repository from a local filesystem onto the same filesystem, Darcs will link patches whenever possible.

In order to save time, darcs get does not link pristine trees even when individual files are identical. Additionally, as you pull patches into trees, patches will become unlinked. This will result in a lot of wasted space if two repositories have been living for a long time but are similar. In such a case, you should relink files between the two repositories.

Relinking is an asymmetric operation: you relink one repository (to which you must have write access) to another repository, called the sibling. This is done with darcs optimize --relink, with the --sibling flag specifying the sibling.

  $ cd /var/repos/darcs.net
  $ darcs optimize --relink --sibling /var/repos/darcs
The --sibling flag can be repeated multiple times, in which case Darcs will try to find a file to link to in all of the siblings. If a default repository is defined, Darcs will try, as a last resort, to link against the default repository.

Additional space savings can be achieved by relinking files in the pristine tree (see below) by using the --relink-pristine flag. However, doing this prevents Darcs from having precise timestamps on the pristine files, which carries a moderate performance penalty.

Alternate formats for the pristine tree

By default, every Darcs repository contains a complete copy of the pristine tree, the working tree as it would be if there were no local edits. By avoiding the need to consult a possibly large number of patches just to find out if a file is modified, the pristine tree makes a lot of operations much faster than they would otherwise be.

Under some circumstances, keeping a whole pristine tree is not desirable. This is the case when preparing a repository to back up, when publishing a repository on a public web server with limited space, or when storing a repository on floppies or small USB keys. In such cases, it is possible to use a repository with no pristine tree.

Darcs automatically recognizes a repository with no pristine tree. In order to create such a tree, specify the --no-pristine-tree flag to darcs initialize or darcs get. There is currently no way to switch an existing repository to use no pristine tree.

The support for --no-pristine-tree repositories is fairly new, and has not been extensively optimized yet. Please let us know if you use this functionality, and which operations you find are too slow.


Configuring darcs

There are several ways you can adjust darcs' behavior to suit your needs. The first is to edit files in the _darcs/prefs/ directory of a repository. Such configuration only applies when working with that repository. To configure darcs on a per-user rather than per-repository basis (but with essentially the same methods), you can edit (or create) files in the ~/.darcs/ directory. Finally, the behavior of some darcs commands can be modified by setting appropriate environment variables.

prefs

The _darcs directory contains a prefs directory. This directory exists simply to hold user configuration settings specific to this repository. The contents of this directory are intended to be modifiable by the user, although in some cases a mistake in such a modification may cause darcs to behave strangely.


defaults

Default values for darcs commands can be configured on a per-repository basis by editing (and possibly creating) the _darcs/prefs/defaults file. Each line of this file has the following form:

COMMAND FLAG VALUE
where COMMAND is either the name of the command to which the default applies, or ALL to indicate that the default applies to all commands accepting that flag. The FLAG term is the name of the long argument option without the ``--'', i.e. verbose rather than --verbose. Finally, the VALUE option can be omitted if the flag is one such as verbose that doesn't involve a value. If the value has spaces in it, use single quotes, not double quotes, to surround it. Each line only takes one flag. To set multiple defaults for the same command (or for ALL commands), use multiple lines.
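
As a reading aid, this sketch lists which lines of a defaults file would apply to a given command. It is an awk one-liner wrapped in a function called defaults_for (a name invented here), not darcs' own parser, and it assumes fields are separated by spaces or tabs.

```shell
# defaults_for COMMAND FILE: print the "FLAG VALUE" part of every line in FILE
# that applies to COMMAND, either by name or through ALL.
defaults_for() {
    awk -v cmd="$1" '
        $1 == cmd || $1 == "ALL" {
            sub(/^[ \t]*[^ \t]+[ \t]*/, "")   # drop the COMMAND field
            print
        }' "$2"
}

# Typical use, run from the top of a repository:
#   defaults_for record _darcs/prefs/defaults
```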

Note that the use of ALL can easily have unexpected consequences, especially if commands in newer versions of darcs accept flags that they didn't in previous versions. A command like obliterate could be devastating with the ``wrong'' flags (for example --all). Only use safe flags with ALL.

~/.darcs/defaults            provides defaults for this user account
repo/_darcs/prefs/defaults   provides defaults for one project, overriding the per-user defaults

For example, if your system clock is bizarre, you could instruct darcs to always ignore the file modification times by adding the following line to your _darcs/prefs/defaults file. (Note that this would have to be done for each repository!)

ALL ignore-times

If you never want to run a test when recording to a particular repository (but still want to do so when running check on that repository), and like to name all your patches ``Stupid patch'', you could use the following:

record no-test
record patch-name Stupid patch

If you would like a command to be run every time patches are recorded in a particular repository (for example if you have one central repository, that all developers contribute to), then you can set apply to always run a command when apply is successful. For example, if you need to make sure that the files in the repository have the correct access rights you might use the following. There are two things to note about using darcs this way:

apply posthook chmod -R a+r *
apply run-posthook

Similarly, if you need a command to run automatically before darcs performs an action you can use a prehook. Using prehooks it could be possible to canonicalize line endings before recording patches.

There are some options which are meant specifically for use in _darcs/prefs/defaults. One of them is --disable. As the name suggests, this option disables every command that receives it as an argument. So, if you are afraid that you could damage your repositories by inadvertent use of a command like amend-record, add the following line to _darcs/prefs/defaults:

amend-record disable

Also, a global preferences file can be created with the name .darcs/defaults in your home directory. Options present there will be added to the repository-specific preferences. If they conflict with repository-specific options, the repository-specific ones will take precedence.
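
The precedence rule can be pictured as a lookup that consults the repository file first and the per-user file second. The sketch below is only a model of that rule: the function name effective_default is invented, and it deliberately ignores ALL lines to stay short.

```shell
# effective_default COMMAND FLAG REPO_DEFAULTS USER_DEFAULTS
# Print the first line that sets FLAG for COMMAND, looking in the repository
# file before the per-user file. Fails if neither file sets the flag.
effective_default() {
    for file in "$3" "$4"; do
        [ -f "$file" ] || continue
        hit=$(awk -v cmd="$1" -v flag="$2" '$1 == cmd && $2 == flag { print; exit }' "$file")
        if [ -n "$hit" ]; then
            printf '%s\n' "$hit"
            return 0
        fi
    done
    return 1
}

# Typical use, run from the top of a repository:
#   effective_default record patch-name _darcs/prefs/defaults "$HOME/.darcs/defaults"
```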

repos

The _darcs/prefs/repos file contains a list of repositories you have pulled from or pushed to, and is used for autocompletion of pull and push commands in bash. Feel free to delete any lines from this list that might get in there, or to delete the file as a whole.


author

The _darcs/prefs/author file contains the email address (or name) to be used as the author when patches are recorded in this repository, e.g. David Roundy <droundy@abridgegame.org>. This file overrides the contents of the environment variables $DARCS_EMAIL and $EMAIL.


boring

The _darcs/prefs/boring file may contain a list of regular expressions describing files, such as object files, that you do not expect to add to your project. As an example, the boring file that I use with my darcs repository is:
\.hi$
\.o$
^\.[^/]
^_
~$
(^|/)CVS($|/)
A newly created repository has a longer boring file that includes many common source control, backup, temporary, and compiled files.

You may want to have the boring file under version control. To do this you can use darcs setpref to set the value ``boringfile'' to the name of your desired boring file (e.g. darcs setpref boringfile .boring, where .boring is the repository path of a file that has been darcs added to your repository). The boringfile preference overrides _darcs/prefs/boring, so be sure to copy that file to the boringfile.

You can also set up a ``boring'' regexps file in your home directory, named ~/.darcs/boring, which will be used with all of your darcs repositories.

Any file not already managed by darcs and whose repository path (such as manual/index.html) matches any of the boring regular expressions is considered boring. The boring file is used to filter the files provided to darcs add, to allow you to use a simple darcs add newdir newdir/* without accidentally adding a bunch of object files. It is also used when the --look-for-adds flag is given to whatsnew or record. Note that once a file has been added to darcs, it is not considered boring, even if it matches the boring file filter.
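
To check by hand whether a path would match a set of boring patterns, you can approximate the test with grep. This is a sketch under two assumptions: that grep -E is close enough to the regular expression syntax darcs uses, and that the pattern file contains only regular expressions, one per line, with no blank lines. The name is_boring is made up.

```shell
# is_boring PATH BORINGFILE: succeed if PATH matches any regular expression in BORINGFILE.
is_boring() {
    printf '%s\n' "$1" | grep -E -q -f "$2"
}

# Typical use, run from the top of a repository:
#   is_boring src/Main.o _darcs/prefs/boring && echo 'would be skipped by darcs add'
```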

binaries

The _darcs/prefs/binaries file may contain a list of regular expressions describing files that should be treated as binary files rather than text files. Darcs automatically treats files containing ^Z or '\0' within the first 4096 bytes as being binary files. You probably will want to have the binaries file under version control. To do this you can use darcs setpref to set the value ``binariesfile'' to the name of your desired binaries file (e.g. darcs setpref binariesfile ./.binaries, where .binaries is a file that has been darcs added to your repository). As with the boring file, you can also set up a ~/.darcs/binaries file if you like.
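
The automatic check can be imitated with standard tools, which is handy when you wonder why a file is being treated as binary. The sketch below looks for a NUL or ^Z byte in the first 4096 bytes; the function name looks_binary is invented and the code is not taken from darcs.

```shell
# looks_binary FILE: succeed if a NUL or ^Z byte occurs in the first 4096 bytes of FILE.
looks_binary() {
    # Keep only NUL (octal 000) and ^Z (octal 032) bytes, then count what is left.
    count=$(head -c 4096 "$1" | LC_ALL=C tr -cd '\000\032' | wc -c | tr -d ' ')
    [ "$count" -gt 0 ]
}

# Typical use:
#   looks_binary picture.png && echo 'darcs would treat this as binary'
```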

email

The _darcs/prefs/email file is used to provide the e-mail address for your repository that others will use when they darcs send a patch back to you. The contents of the file should simply be an e-mail address.

sources

The _darcs/prefs/sources file is used to indicate alternative locations from which to download patches when using a ``hashed'' repository. This file contains lines such as:
cache:/home/droundy/.darcs/cache
readonly:/home/otheruser/.darcs/cache
repo:http://darcs.net
This would indicate that darcs should first look in /home/droundy/.darcs/cache for patches that might be missing, and if the patch isn't there, it should save a copy there for future use. If the patch isn't in that cache, darcs will next look in /home/otheruser/.darcs/cache to see if that user might have downloaded a copy, but won't try to save a copy there, of course. Finally, it will look in http://darcs.net. Note that the sources file can also exist in ~/.darcs/. Also note that the sources mentioned in your sources file will be tried before the repository you are pulling from. This can be useful in avoiding downloading patches multiple times when you pull from a remote repository to more than one local repository.

We strongly advise that you enable a global cache directory, which will allow darcs to avoid re-downloading patches (for example, when doing a second darcs get of the same repository), and also allows darcs to use hard links to reduce disk usage. To do this, simply

mkdir -p $HOME/.darcs/cache
echo cache:$HOME/.darcs/cache > $HOME/.darcs/sources
Note that the cache directory should reside on the same filesystem as your repositories, so you may need to vary this. You can also use multiple cache directories on different filesystems, if you have several filesystems on which you use darcs.
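
One rough way to check that a cache directory and a repository share a filesystem is to compare the mount points reported by df. The helper names below are made up, and the comparison is conservative: it assumes mount points without spaces, and unusual setups such as bind mounts may report different mount points for the same filesystem.

```shell
# mount_point PATH: print the mount point of the filesystem that holds PATH.
mount_point() {
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

# same_filesystem A B: succeed if both paths report the same mount point.
same_filesystem() {
    [ "$(mount_point "$1")" = "$(mount_point "$2")" ]
}

# Typical use (the repository path is a placeholder):
#   same_filesystem "$HOME/.darcs/cache" /path/to/your/repo || echo 'different filesystems'
```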


motd

The _darcs/prefs/motd file may contain a ``message of the day'' which will be displayed to users who get or pull from the repository without the --quiet option.

Environment variables

There are a few environment variables whose contents affect darcs' behavior. Here is a quick list of the variables, each of which is documented elsewhere in the manual:

DARCS_EDITOR, EDITOR, VISUAL
DARCS_PAGER, PAGER
HOME
TERM
DARCS_EMAIL, EMAIL
DARCS_APPLY_FOO
DARCS_GET_FOO
DARCS_MGET_FOO
DARCS_MGETMAX
DARCS_PROXYUSERPWD
DARCS_WGET
DARCS_SSH
DARCS_SCP
DARCS_SFTP
SSH_PORT
DARCS_ALTERNATIVE_COLOR
DARCS_ALWAYS_COLOR
DARCS_DO_COLOR_LINES
DARCS_DONT_COLOR
DARCS_DONT_ESCAPE_TRAILING_CR
DARCS_DONT_ESCAPE_TRAILING_SPACES
DARCS_DONT_ESCAPE_8BIT
DARCS_DONT_ESCAPE_ANYTHING
DARCS_DONT_ESCAPE_ISPRINT
DARCS_ESCAPE_EXTRA
DARCS_DONT_ESCAPE_EXTRA

General-purpose variables


DARCS_EDITOR

When pulling up an editor (for example, when adding a long comment in record), darcs uses the contents of DARCS_EDITOR if it is defined. If not, it tries the contents of VISUAL, and if that isn't defined (or fails for some reason), it tries EDITOR. If none of those environment variables are defined, darcs tries vi, emacs, emacs -nw and nano in that order.
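
The lookup order can be written down as a small shell function, which is useful for checking what your current environment would select. This is a simplified model with an invented name, pick_editor: it treats an empty variable as undefined, does not model the retry when an editor fails, and leaves out the emacs -nw fallback.

```shell
# pick_editor: print the editor chosen by the lookup order described above.
pick_editor() {
    if [ -n "${DARCS_EDITOR:-}" ]; then printf '%s\n' "$DARCS_EDITOR"
    elif [ -n "${VISUAL:-}" ]; then printf '%s\n' "$VISUAL"
    elif [ -n "${EDITOR:-}" ]; then printf '%s\n' "$EDITOR"
    else
        # No variable is set: fall back to the first built-in candidate that is installed.
        for candidate in vi emacs nano; do
            if command -v "$candidate" >/dev/null 2>&1; then
                printf '%s\n' "$candidate"
                return 0
            fi
        done
        return 1
    fi
}
```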


DARCS_PAGER

When using a pager for displaying a patch, darcs uses the contents of DARCS_PAGER if it is defined. If not, it tries the contents of PAGER and then more.


DARCS_TMPDIR

If the environment variable DARCS_TMPDIR is defined, darcs will use that directory for its temporaries. Otherwise it will use TMPDIR, if that is defined. Failing that, it uses /tmp, and if /tmp doesn't exist, it puts the temporaries in _darcs.
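
Written as a shell function, the rule looks roughly like this. The name pick_tmpdir is invented, empty variables are treated as undefined, and the function only reports a directory; it creates nothing.

```shell
# pick_tmpdir: print the directory chosen by the lookup order described above.
pick_tmpdir() {
    if [ -n "${DARCS_TMPDIR:-}" ]; then printf '%s\n' "$DARCS_TMPDIR"
    elif [ -n "${TMPDIR:-}" ]; then printf '%s\n' "$TMPDIR"
    elif [ -d /tmp ]; then printf '%s\n' /tmp
    else printf '%s\n' _darcs
    fi
}

# To override the choice for a single command ("$HOME/scratch" is a placeholder
# for a directory that suits your setup):
#   DARCS_TMPDIR="$HOME/scratch" darcs record
```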

This is very helpful, for example, when recording with a test suite that uses MPI, in which case using /tmp to hold the test copy is no good, as /tmp isn't shared over NFS and thus the mpirun call will fail, since the binary isn't present on the compute nodes.


HOME

HOME is used to find the per-user prefs directory, which is located at $HOME/.darcs.


TERM

If darcs is compiled with libcurses support and support for color output, it uses the environment variable TERM to decide whether or not color is supported on the output terminal.

Remote repositories


SSH_PORT

When using ssh, if the SSH_PORT environment variable is defined, darcs will use that port rather than the default ssh port (which is 22).


DARCS_SSH

The DARCS_SSH environment variable defines the command that darcs will use when asked to run ssh. This command is not interpreted by a shell, so you cannot use shell metacharacters, and the first word in the command must be the name of an executable located in your path.


DARCS_SCP and DARCS_SFTP

The DARCS_SCP and DARCS_SFTP environment variables define the commands that darcs will use when asked to run scp or sftp. Darcs uses scp and sftp to access repositories whose address is of the form user@foo.org:foo or foo.org:foo. Darcs will use scp to copy single files (e.g. repository meta-information), and sftp to copy multiple files in batches (e.g. patches). These commands are not interpreted by a shell, so you cannot use shell metacharacters, and the first word in the command must be the name of an executable located in your path. By default, scp and sftp are used. When you can use sftp, but not scp (e.g. some ISP web sites), it works to set DARCS_SCP to `sftp'. The other way around does not work, i.e. DARCS_SFTP must reference an sftp program, not scp.


DARCS_PROXYUSERPWD

This environment variable allows darcs and libcurl to access remote repositories via a password-protected HTTP proxy. The proxy itself is specified with the standard environment variable for this purpose, namely 'http_proxy'. The DARCS_PROXYUSERPWD environment variable specifies the proxy username and password. It must be given in the form username:password.
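
A setup along these lines might go into your shell startup file. The proxy host, port, user name and password are all placeholders invented for the example.

```shell
# Hypothetical proxy at proxy.example.com:3128 with user "alice"; substitute your own values.
http_proxy='http://proxy.example.com:3128'
DARCS_PROXYUSERPWD='alice:s3cret'
export http_proxy DARCS_PROXYUSERPWD

# Sanity check: the value needs a colon separating the user name from the password.
case $DARCS_PROXYUSERPWD in
    ?*:?*) echo 'DARCS_PROXYUSERPWD looks well-formed' ;;
    *)     echo 'DARCS_PROXYUSERPWD must be username:password' >&2 ;;
esac
```

Bear in mind that a password stored in a startup file is readable by anyone who can read that file.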


DARCS_GET_FOO, DARCS_MGET_FOO and DARCS_APPLY_FOO

When trying to access a repository with a URL beginning foo://, darcs will invoke the program specified by the DARCS_GET_FOO environment variable (if defined) to download each file, and the command specified by the DARCS_APPLY_FOO environment variable (if defined) when pushing to a foo:// URL.

This method overrides all other ways of getting foo://xxx URLs.

Note that each command should be constructed so that it sends the downloaded content to STDOUT, and the next argument to it should be the URL. Here are some examples that should work for DARCS_GET_HTTP:

fetch -q -o -  
curl -s -f
lynx -source 
wget -q -O -

If set, DARCS_MGET_FOO will be used to fetch many files from a single repository simultaneously. Replace FOO and foo as appropriate to handle other URL schemes. These commands are not interpreted by a shell, so you cannot use shell metacharacters, and the first word in the command must be the name of an executable located in your path. The GET command will be called with a URL for each file. The MGET command will be invoked with a number of URLs and is expected to download the files to the current directory, preserving the file name but not the path. The APPLY command will be called with a darcs patchfile piped into its standard input. Example:

wget -q


DARCS_MGETMAX

When invoking a DARCS_MGET_FOO command, darcs will limit the number of URLs presented to the command to the value of this variable, if set, or 200.
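
The effect of the limit can be pictured with xargs, which groups its input the same way. This is only an illustration of batching with an invented function name, batch_urls; it does not show how darcs itself invokes the command.

```shell
# batch_urls: read URLs from standard input, one per line, and print them in groups
# of at most DARCS_MGETMAX (default 200), one group per output line.
batch_urls() {
    xargs -n "${DARCS_MGETMAX:-200}" echo
}

# Five URLs with a limit of two are split into groups of 2, 2 and 1.
printf 'http://example.org/repo/patch%s\n' 1 2 3 4 5 | DARCS_MGETMAX=2 batch_urls
```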


DARCS_WGET

This is a very old option that is only used if libcurl is not compiled in and no DARCS_GET_FOO command is set. Using one of those alternatives is recommended instead.

The DARCS_WGET environment variable defines the command that darcs will use to fetch all URLs for remote repositories. The first word in the command must be the name of an executable located in your path. Extra arguments can be included as well, such as:

wget -q

Darcs will append -i to the argument list, which it uses to provide a list of URLs to download. This allows wget to download multiple patches at the same time. It's possible to use another command besides wget with this environment variable, but it must support the -i option in the same way.

These commands are not interpreted by a shell, so you cannot use shell meta-characters.


Highlighted output

If the terminal understands ANSI color escape sequences, darcs will highlight certain keywords and delimiters when printing patches. This can be turned off by setting the environment variable DARCS_DONT_COLOR to 1. If you use a pager that happens to understand ANSI colors, like less -R, darcs can be forced always to highlight the output by setting DARCS_ALWAYS_COLOR to 1. If you can't see colors you can set DARCS_ALTERNATIVE_COLOR to 1, and darcs will use ANSI codes for bold and reverse video instead of colors. In addition, there is an extra-colorful mode, which is not enabled by default, which can be activated with DARCS_DO_COLOR_LINES.

By default darcs will escape (by highlighting if possible) any kind of spaces at the end of lines when showing patch contents. If you don't want this you can turn it off by setting DARCS_DONT_ESCAPE_TRAILING_SPACES to 1. A special case exists for only carriage returns: DARCS_DONT_ESCAPE_TRAILING_CR.


Character escaping and non-ASCII character encodings

Darcs needs to escape certain characters when printing patch contents to a terminal. Characters like backspace can otherwise hide patch content from the user, and other character sequences can even in some cases redirect commands to the shell if the terminal allows it.

By default darcs will only allow printable 7-bit ASCII characters (including space), and the two control characters tab and newline. (See the last paragraph in this section for a way to tailor this behavior.) All other octets are printed in quoted form (as ^<control letter> or \<hex code>).

Darcs has some limited support for locales. If the system's locale is a single-byte character encoding, like the Latin encodings, you can set the environment variable DARCS_DONT_ESCAPE_ISPRINT to 1 and darcs will display all the printables in the current system locale instead of just the ASCII ones. NOTE: This currently does not work on some architectures if darcs is compiled with GHC 6.4. Some non-ASCII control characters might be printed and can possibly spoof the terminal.

For multi-byte character encodings things are less smooth. UTF-8 will work if you set DARCS_DONT_ESCAPE_8BIT to 1, but non-printables outside the 7-bit ASCII range are no longer escaped. E.g., the extra control characters from Latin-1 might leave your terminal at the mercy of the patch contents. Space characters outside the 7-bit ASCII range are no longer recognized and will not be properly escaped at line endings.

As a last resort you can set DARCS_DONT_ESCAPE_ANYTHING to 1. Then everything that doesn't flip code sets should work, and so will all the bells and whistles in your terminal. This environment variable can also be handy if you pipe the output to a pager or external filter that knows better than darcs how to handle your encoding. Note that all escaping, including the special escaping of any line ending spaces, will be turned off by this setting.

There are two environment variables you can set to explicitly tell darcs to not escape or escape octets. They are DARCS_DONT_ESCAPE_EXTRA and DARCS_ESCAPE_EXTRA. Their values should be strings consisting of the verbatim octets in question. The do-escapes take precedence over the dont-escapes. Space characters are still escaped at line endings though. The special environment variable DARCS_DONT_ESCAPE_TRAILING_CR turns off escaping of carriage return last on the line (DOS style).

Best practices

Introduction

This chapter is intended to review various scenarios and describe in each case effective ways of using darcs. There is no one ``best practice'', and darcs is a sufficiently low-level tool that there are many high-level ways one can use it, which can be confusing to new users. The plan (and hope) is that various users will contribute here describing how they use darcs in different environments. However, this is not a wiki, and contributions will be edited and reviewed for consistency and wisdom.

Creating patches

This section will lay down the concepts around patch creation. The aim is to develop a way of thinking that corresponds well to how darcs is behaving -- even in complicated situations.

In a single darcs repository you can think of two ``versions'' of the source tree. They are called the working and pristine trees. Working is your normal source tree, with or without darcs alongside. The only thing that makes it part of a darcs repository is the _darcs directory in its root. Pristine is the recorded state of the source tree. The pristine tree is constructed from groups of changes, called patches (some other version control systems use the term changeset instead of patch). Darcs will create and store these patches based on the changes you make in working.

Changes

If working and pristine are the same, there are ``no changes'' in the repository. Changes can be introduced (or removed) by editing the files in working. They can also be caused by darcs commands, which can modify both working and pristine. It is important to understand for each darcs command how it modifies working, pristine or both of them.

whatsnew (as well as diff) can show the difference between working and pristine to you. It will be shown as a difference in working. In advanced cases it need not be working that has changed; it can just as well have been pristine, or both. The important thing is the difference and what darcs can do with it.

Keeping or discarding changes

If you have a difference in working, you can do two things with it: record it to keep it, or revert it to lose the changes.

If you have a difference between working and pristine (for example, after editing some files in working), whatsnew will show some ``unrecorded changes''. To save these changes, use record. It will create a new patch in pristine with the same changes, so working and pristine are no longer different. To instead undo the changes in working, use revert. It will modify the files in working to be the same as in pristine (where the changes do not exist).
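
If it helps, here is a toy model of these three commands in which two plain directories stand in for working and pristine. It is emphatically not how darcs works inside (darcs stores patches and lets you pick changes interactively); the toy_ function names are invented, and the model only mirrors which tree each command changes.

```shell
# Toy model only: two plain directories stand in for the working and pristine trees.
model=$(mktemp -d)
mkdir "$model/working" "$model/pristine"
echo 'hello' > "$model/working/README"
echo 'hello' > "$model/pristine/README"

# "whatsnew": show how working differs from pristine (succeeds when nothing differs).
toy_whatsnew() { diff -r "$model/pristine" "$model/working"; }
# "record": pristine catches up with working.
toy_record()   { rm -rf "$model/pristine" && cp -R "$model/working" "$model/pristine"; }
# "revert": working goes back to what pristine says.
toy_revert()   { rm -rf "$model/working" && cp -R "$model/pristine" "$model/working"; }

echo 'hello, world' > "$model/working/README"   # edit a file in working
toy_whatsnew || echo 'unrecorded changes'
toy_record
toy_whatsnew && echo 'no changes'
```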

Unrecording changes

unrecord is a command meant to be run only in private repositories. Its intended purpose is to allow developers the flexibility to undo patches that haven't been distributed yet.

However, darcs does not prevent you from unrecording a patch that has been copied to another repository. Be aware of this danger!

If you unrecord a patch, that patch will be deleted from pristine. This will cause working to be different from pristine, and whatsnew to report unrecorded changes. The difference will be the same as just before that patch was recorded. Think about it. record examines what's different with working and constructs a patch with the same changes in pristine so they are no longer different. unrecord deletes this patch; the changes in pristine disappear and the difference is back.

If the recorded changes included an error, the resulting flawed patch can be unrecorded. When the changes have been fixed, they can be recorded again as a new (hopefully flawless) patch.

If the whole change was wrong it can be discarded from working too, with revert. revert will update working to the state of pristine, in which the changes no longer exist after the patch was deleted.

Keep in mind that the patches are your history, so deleting them with unrecord makes it impossible to track what changes you really made. Redoing the patches is how you ``cover the tracks''. On the other hand, it can be a very convenient way to manage and organize changes while you try them out in your private repository. When all is ready for shipping, the changes can be reorganized into what seem like useful and impressive patches. Use it with care.

All patches are global, so don't ever replace an already ``shipped'' patch in this way! If an erroneous patch is deleted and replaced with a better one, you have to replace it in all repositories that have a copy of it. This may not be feasible, unless it's all private repositories. If other developers have already made patches or tags in their repositories that depend on the old patch, things will get complicated.

Special patches and pending

The patches described in the previous sections have mostly been hunks. A hunk is one of darcs' primitive patch types, and it is used to remove old lines and/or insert new lines. There are other types of primitive patches, such as adddir and addfile which add new directories and files, and replace which does a search-and-replace on tokens in files.

Hunks are always calculated in place with a diff algorithm just before whatsnew or record. But other types of primitive patches need to be explicitly created with a darcs command. They are kept in pending until they are either recorded or reverted.

Pending can be thought of as a special extension of working. When you issue, e.g., a darcs replace command, the replace is performed on the files in working and at the same time a replace patch is put in pending. Patches in pending describe special changes made in working. The diff algorithm will notionally apply these changes to pristine before it compares it to working, so all lines in working that are changed by a replace command will also be changed in pending+pristine when the hunks are calculated. That's why no hunks with the replaced lines will be shown by whatsnew; it only shows the replace patch in pending responsible for the change.

If a special patch is recorded, it will simply be moved to pristine. If it is instead reverted, it will be deleted from pending and the accompanying change will be removed from working.

Note that reverting a patch in pending is not the same as simply removing it from pending. It actually applies the inverse of the change to working. Most notable is that reverting an addfile patch will delete the file in working (the inverse of adding it). So if you add the wrong file to darcs by mistake, don't revert the addfile. Instead use remove, which cancels out the addfile in pending.

Using patches

This section will lay down the concepts around patch distribution and branches. The aim is to develop a way of thinking that corresponds well to how darcs is behaving -- even in complicated situations.

A repository is a collection of patches. Patches have no defined order, but patches can have dependencies on other patches. Patches can be added to a repository in any order as long as all patches depended upon are there. Patches can be removed from a repository in any order, as long as no remaining patches depend on them.
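The two ordering rules above can be made concrete with a toy model. The following Python sketch is an illustration only, not a description of how darcs stores anything: the Repo class, its methods and the patch names are all hypothetical, and each patch simply carries the set of patch names it depends on.

```python
class Repo:
    """Toy model: a repository is an unordered collection of patches."""

    def __init__(self):
        # Maps patch name -> set of patch names it depends on.
        self.patches = {}

    def add(self, name, depends_on=()):
        """Add a patch; every patch it depends on must already be present."""
        missing = set(depends_on) - set(self.patches)
        if missing:
            raise ValueError(f"missing dependencies: {sorted(missing)}")
        self.patches[name] = set(depends_on)

    def remove(self, name):
        """Remove a patch; no remaining patch may depend on it."""
        dependents = [p for p, deps in self.patches.items() if name in deps]
        if dependents:
            raise ValueError(f"still needed by: {sorted(dependents)}")
        del self.patches[name]


repo = Repo()
repo.add("add-file")
repo.add("edit-line", depends_on=["add-file"])
repo.add("unrelated")      # independent patches can arrive in any order
repo.remove("unrelated")   # nothing depends on it, so this succeeds
```

In this model, trying to remove "add-file" while "edit-line" is still present raises an error, which mirrors the way darcs refuses to break dependencies.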

Repositories can be cloned to create branches. Patches created in different branches may conflict. A conflict is a valid state of a repository. A conflict makes the working tree ambiguous until the conflict is resolved.

Dependencies

There are two kinds of dependencies: implicit dependencies and explicit dependencies.

Implicit dependencies are by far the most common kind. These are calculated automatically by darcs. If a patch removes a file or a line of code, it will have to depend on the patch that added that file or line of code.[5.4] If a patch adds a line of code, it will usually have to depend on the patch or patches that added the adjacent lines.

Explicit dependencies can be created if you give the --ask-deps option to darcs record. This is good for assuring that logical dependencies hold between patches. It can also be used to group patches--a patch with explicit dependencies doesn't need to change anything--and pulling the patch also pulls all patches it was made to depend on.

Branches: just normal repositories

Darcs does not have branches--it doesn't need to. Every repository can be used as a branch. This means that any two repositories are ``branches'' in darcs, but it is not of much use unless they have a large portion of patches in common. If they are different projects they will have nothing in common, but darcs may still very well be able to merge them, although the result probably is nonsense. Therefore the word ``branch'' isn't a technical term in darcs; it's just the way we think of one repository in relation to another.

Branches are very useful in darcs. They are in fact necessary if you want to do more than only simple work. When you get someone's repository from the Internet, you are actually creating a branch of it. It may first seem inefficient (or if you come from CVS--frightening), not to say plain awkward. But darcs is designed this way, and it has means to make it efficient. The answer to many questions about how to do a thing with darcs is: ``use a branch''. It is a simple and elegant solution with great power and flexibility, which contributes to darcs' uncomplicated user interface.

You create new branches (i.e., clone repositories) with the get and put commands.

Moving patches around--no versions

Patches are global, and a copy of a patch either is or is not present in a branch. This way you can rig a branch almost any way you like, as long as dependencies are fulfilled--darcs won't let you break dependencies. If you suspect a certain feature from some time ago introduced a bug, you can remove the patch or patches that add the feature, and try without it.[5.5]

Patches are added to a repository with pull and removed from a repository with obliterate. Don't confuse these two commands with record and unrecord, which construct and deconstruct patches.

It is important not to lose patches when (re)moving them around. pull needs a source repository to copy the patch from, whereas obliterate just erases the patch. Beware that if you obliterate all copies of a patch it is completely lost--forever. Therefore you should work with branches when you obliterate patches. The obliterate command can wisely be disabled in a dedicated main repository by adding obliterate disable to the repository's defaults file.

For convenience, there is a push command. It works like pull but in the other direction. It also differs from pull in an important way: it starts a second instance of darcs to apply the patch in the target repository, even if it's on the same computer. It can cause surprises if you have a ``wrong'' darcs in your PATH.

Tags--versions

While pull and obliterate can be used to construct different ``versions'' in a repository, it is often desirable to name specific configurations of patches so they can be identified and retrieved easily later. This is how darcs implements what is usually known as versions. The command for this is tag, and it records a tag in the current repository.

A tag is just a patch, but it only contains explicit dependencies. It will depend on all the patches in the current repository.[5.6] Darcs can recognize if a patch is a tag; tags are sometimes treated specially by darcs commands.

While traditional revision control systems tag versions in the time line history, darcs lets you tag any configuration of patches at any time, and pass the tags around between branches.

With the --tag option to get, you can easily retrieve a named version from the repository as a new branch.

Conflicts

This part of darcs becomes a bit complicated, and the description given here is slightly simplified.

Conflicting patches are created when you record changes to the same line in two different repositories. Same line does not mean the same line number and file name, but the same line added by a common depended-upon patch.

If you are using a darcs-2 repository (Section [*]), darcs does not consider two patches making the same change to be a conflict, much in the same fashion as other version control systems. (The caveat here is that two non-identical patches with some identical changes may conflict. For the most part, darcs should just do what you expect.)

A conflict happens when two conflicting patches meet in the same repository. This is no problem for darcs; it can happily pull together just any patches. But it is a problem for the files in working (and pristine). The conflict can be thought of as two patches telling darcs different things about what a file should look like.

Darcs escapes this problem by ignoring those parts[5.7] of the patches that conflict. They are ignored in both patches. If patch A changes the line ``FIXME'' to ``FIXED'', and patch B changes the same line to ``DONE'', the two patches together will produce the line ``FIXME''. Darcs doesn't care which one you pulled into the repository first, you still get the same result when the conflicting patches meet. All other changes made by A and B are performed as normal.
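As a rough illustration of this rule, the Python sketch below models each patch as a mapping from an original line to its replacement. The merge function and the data layout are assumptions made for this example and say nothing about darcs' internals; the sketch only shows that conflicting changes cancel while all other changes apply, whatever the order.

```python
def merge(lines, patch_a, patch_b):
    """Apply two toy patches; changes that conflict are ignored in both."""
    result = []
    for line in lines:
        in_a, in_b = line in patch_a, line in patch_b
        if in_a and in_b and patch_a[line] != patch_b[line]:
            result.append(line)            # conflict: keep the common original
        elif in_a:
            result.append(patch_a[line])   # changed by A (or identically by both)
        elif in_b:
            result.append(patch_b[line])   # changed by B only
        else:
            result.append(line)            # untouched by either patch
    return result


base = ["FIXME", "alpha", "beta"]
patch_a = {"FIXME": "FIXED", "alpha": "ALPHA"}
patch_b = {"FIXME": "DONE", "beta": "BETA"}

merged = merge(base, patch_a, patch_b)
```

Swapping the two patch arguments gives the same list, which is the order independence described above.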

Darcs can mark a conflict for you in working. This is done with mark-conflicts. Conflicts are marked such that both conflicting changes are inserted with special delimiter lines around them. Then you can merge the two changes by hand, and remove the delimiters.

When you pull patches, darcs automatically performs a mark-conflicts for you if a conflict happens. You can remove the markup with revert. Remember that the result will be the lines from the previous version common to both conflicting patches. The conflict marking can be redone with mark-conflicts.

A special case is when a pulled patch conflicts with unrecorded changes in the repository. The conflict will be automatically marked as usual, but since the markup is also an unrecorded change, it will get mixed in with your unrecorded changes. There is no guarantee you can revert only the markup after this, and mark-conflicts will not be able to redo this markup later if you remove it. It is good practice to record important changes before pulling.

mark-conflicts can't mark complicated conflicts. In that case you'll have to use darcs diff and other commands to understand what the conflict is all about. If for example two conflicting patches create the same file, mark-conflicts will pick just one of them, and no delimiters are inserted. So watch out if darcs tells you about a conflict.

mark-conflicts can also be used to check for unresolved conflicts. If there are none, darcs replies ``No conflicts to resolve''. While pull reports when a conflict happens, obliterate and get don't.

Resolving conflicts

A conflict is resolved (not marked, as with the command mark-conflicts) as soon as some new patch depends on the conflicting patches. This will usually be the resolve patch you record after manually putting together the pieces from the conflict markup produced by mark-conflicts (or pull). But it can just as well be a tag. So don't forget to fix conflicts before you accidentally ``resolve'' them by recording other patches.

If the conflict is with one of your not-yet-published patches, you may choose to amend that patch rather than creating a resolve patch.

If you want to back out and wait with the conflict, you can obliterate the conflicting patch you just pulled. Before you can do that you have to revert the conflict markups that pull inserted when the conflict happened.


Distributed development with one primary developer

This is how darcs itself is developed. There are many contributors to darcs, but every contribution is reviewed and manually applied by myself. For this sort of a situation, darcs send is ideal, since the barrier for contributions is very low, which helps encourage contributors.

One could simply set the _darcs/prefs/email value to the project mailing list, but I also use darcs send to send my changes to the main server, so instead the email address is set to ``Davids Darcs Repo <droundy@abridgegame.org>''. My .procmailrc file on the server has the following rule:

:0
* ^TODavids Darcs Repo
|(umask 022; darcs apply --reply darcs-devel@abridgegame.org \
             --repodir /path/to/repo --verify /path/to/allowed_keys)
This causes darcs apply to be run on any email sent to ``Davids Darcs Repo''. apply actually applies them only if they are signed by an authorized key. Currently, the only authorized key is mine, but of course this could be extended easily enough.

The central darcs repository contains the following values in its _darcs/prefs/defaults:

apply test
apply verbose
apply happy-forwarding
The first line tells apply to always run the test suite. The test suite is in fact the main reason I use send rather than push, since it allows me to easily continue working (or put my computer to sleep) while the tests are being run on the main server. The second line is just there to improve the email response that I get when a patch has either been applied or failed the tests. The third line tells darcs not to complain about unsigned patches, but just to forward them to darcs-devel.

On my development computer, I have the following macro in my .muttrc, which allows me to easily apply patches that I get via email directly to my darcs working directory:

macro pager A "<pipe-entry>(umask 022; darcs apply --no-test -v \
        --repodir ~/darcs)"


Development by a small group of developers in one office

This section describes the development method used for the density functional theory code DFT++, which is available at http://dft.physics.cornell.edu/dft.

We have a number of workstations which all mount the same /home via NFS. We created a special ``dft'' user, with the central repository living in that user's home directory. The ssh public keys of authorized persons are added to the ``dft'' user's .ssh/authorized_keys, and we commit patches to this repository using darcs push. As in Section [*], we have the central repository set to run the test suite before the push goes through.

Note that no one ever runs as the dft user.

A subtlety that we ran into showed up in the running of the test suite. Since our test suite includes the running of MPI programs, it must be run in a directory that is mounted across our cluster. To achieve this, we set the $DARCS_TMPDIR environment variable to ~/tmp.

Note that even though there are only four active developers at the moment, the distributed nature of darcs still plays a large role. Each developer works on a feature until it is stable, a process that often takes quite a few patches, and only once it is stable does he push to the central repository.

Personal development

It's easy to have several personal development trees using darcs, even when working on a team or with shared code. The number of personal trees and the way you use each one are limited only by such grand limitations as your disk space, your imagination, and available time.

For example, if the central darcs repository for your development team is $R_{c}$, you can create a local working directory for feature $f_1$. This local working directory contains a full copy of $R_c$ (as of the time of the ``darcs get'' operation) and can be denoted $R_1$. In the midst of working on feature $f_1$, you realize it requires the development of a separate feature $f_2$. Rather than intermingling $f_1$ and $f_2$ in the same working tree, you can create a new working tree for working on $f_2$, where that working tree contains the repository $R_2$.

While working on $f_2$, other developers may have made other changes; these changes can be retrieved on a per-patch selection basis by periodic ``darcs pull'' operations.

When your work on $f_2$ is completed, you can publish it for the use of other developers by a ``darcs push'' (or ``darcs send'') from $R_2$ to $R_c$. Independently of the publishing of $f_2$, you can merge your $f_2$ work into your $f_1$ working tree by a ``darcs pull $R_2$'' in the $R_1$ development tree (or ``darcs push'' from $R_2$ to $R_1$).

When your work on $f_1$ is completed, you publish it as well by a ``darcs push'' from $R_1$ to $R_c$.

Your local feature development efforts for $f_1$ or $f_2$ can each consist of multiple patches. When pushing or pulling to other trees, these patches can either all be selected or specific patches can be selected. Thus, if you introduce a set of debugging calls into the code, you can commit the debugging code in a distinctly separate patch (or patches) that you will not push to $R_c$.

Private patches

As discussed in the section above, a developer may have various changes to their local development repositories that they do not ever wish to publish to a group repository (e.g. personal debugging code), but which they would like to keep in their local repository, and perhaps even share amongst their local repositories.

This is easily done via darcs, since those private changes can be committed in patches that are separate from other patches; during the process of pushing patches to the common repository ($R_c$), the developer is queried for which patches should be moved to ($R_c$) on a patch-by-patch basis.

The --complement flag for the ``darcs pull'' operation can further simplify this effort. If the developer copies (via ``darcs push'' or ``darcs pull'') all private patches into a special repository/working tree ($R_p$), then those patches are easily disregarded for pulling by adding --complement to the ``darcs pull'' line and listing $R_p$ after the primary source repository.

The --complement flag is only available for ``darcs pull'', and not ``darcs push'' or ``darcs send'', requiring the user to have pull access to the target repository. While the actual public repository is often not shared in this manner, it's simple to create a local version of the public repository to act as the staging area for that public repository.
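The effect of --complement can be pictured as a set difference. In this Python sketch, repositories are plain sets of patch names; the pull_candidates function and the patch labels are illustrative assumptions, not darcs code, and dependencies between patches are ignored.

```python
def pull_candidates(target, source, *complement):
    """Patches a pull would offer: in source, but neither in target
    nor in any of the complement repositories."""
    excluded = set(target).union(*complement)
    return set(source) - excluded


Rl = set()            # local staging copy of the public repository
R2 = {"p2", "p3"}     # feature work plus a private debugging patch
Rp = {"p2"}           # repository holding only the private patches

offered = pull_candidates(Rl, R2, Rp)
```

Without the complement repository, the same call would also offer the private patch p2.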

The following example extends the two feature addition example in the previous section using a local staging repository ($R_l$) and a private patch repository:

$ cd working-dir
$ darcs get http://server/repos/Rc Rl

$ darcs get Rl R1
$ cd R1
# ... development of f1
$ darcs record -m'p1: f1 initial work'
# ...
$ darcs record -m'p2: my debugging tracepoints'
# ...

$ cd ..
$ darcs get http://server/repos/Rc R2
$ cd R2
$ darcs pull -p p2 ../R1
# ... development of f2
$ darcs record -m'p3: f2 finished'

$ cd ..
$ darcs get Rl Rp
$ cd Rp
$ darcs pull -p p2 ../R2

$ cd ../Rl
$ darcs pull --complement ../R2 ../Rp
$ darcs send
# ... for publishing f2 patches to Rc

$ cd ../R1
$ darcs pull ../R2
# ... updates R1 with f2 changes from R2
# ... more development of f1
$ darcs record -m'p4: f1 feature finished.'

$ cd ../Rl
$ darcs pull --complement ../R1 ../Rp
$ darcs send

Darcs commands

The general format of a darcs command is

% darcs COMMAND OPTIONS ARGUMENTS ...
Here COMMAND is a command such as add or record, which of course may have one or more arguments. Options have the form --option or -o, while arguments vary from command to command. There are many options which are common to a number of different commands, which will be summarized here.

If you wish, you may use any unambiguous beginning of a command name as a shortcut: for darcs record, you could type darcs recor or darcs rec, but not darcs re since that could be confused with darcs replace, darcs revert and darcs remove.
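The shortcut rule can be sketched in a few lines of Python. This is an illustration, not the darcs implementation: the resolve function is hypothetical, and COMMANDS lists only a handful of the commands mentioned in this manual, so a prefix may be more ambiguous against the full command set.

```python
# Assumption: a partial list of darcs commands, for illustration only.
COMMANDS = ["add", "record", "remove", "replace", "revert",
            "pull", "push", "put"]


def resolve(prefix, commands=COMMANDS):
    """Return the single command that starts with prefix, or raise."""
    if prefix in commands:
        return prefix                      # an exact name always wins
    matches = [c for c in commands if c.startswith(prefix)]
    if len(matches) != 1:
        raise ValueError(f"{prefix!r} matches {matches or 'nothing'}")
    return matches[0]


chosen = resolve("rec")   # only "record" starts with "rec"
```

Calling resolve("re") raises an error here, because record, remove, replace and revert all start with those two letters.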

In some cases, COMMAND actually consists of two words, a super-command and a subcommand. For example, the ``display the manifest'' command has the form darcs query manifest.

Command overview

Not all commands modify the ``patches'' of your repository (that is, the named patches which other users can pull); some commands only affect the copy of the source tree you're working on (your ``working directory''), and some affect both. This table summarizes what you should expect from each one and will hopefully serve as guide when you're having doubts about which command to use.

command      affects patches   affects working directory
record       yes               no
unrecord     yes               no
rollback     yes               yes
revert       no                yes
unrevert     no                yes
pull         yes               yes
obliterate   yes               yes
apply        yes               yes
push[6.1]    no                no
send[6.2]    no                no
put[6.3]     no                no

Common options to darcs commands

--help
Every COMMAND accepts --help as an argument, which tells it to provide a bit of help. Among other things, this help always provides an accurate listing of the options available with that command, and is guaranteed never to be out of sync with the version of darcs you actually have installed (unlike this manual, which could be for an entirely different version of darcs).
% darcs COMMAND --help

--disable
Every COMMAND accepts the --disable option, which can be used in _darcs/prefs/defaults to disable some commands in the repository. This can be helpful if you want to protect the repository from accidental use of advanced commands like obliterate, unpull, unrecord or amend-record.

--verbose, --quiet, --normal-verbosity
Most commands also accept the --verbose option, which tells darcs to provide additional output. The amount of verbosity varies from command to command. Commands that accept --verbose also accept --quiet, which suppresses non-error output, and --normal-verbosity, which can be used to restore the default verbosity if --verbose or --quiet is in the defaults file.

--debug
Many commands also accept the --debug option, which causes darcs to generate additional output that may be useful for debugging its behavior, but which otherwise would not be interesting.

--repodir
Another common option is the --repodir option, which allows you to specify the directory of the repository in which to perform the command. This option is used with commands, such as whatsnew, that ordinarily would be performed within a repository directory, and allows you to use those commands without actually being in the repository directory when calling the command. This is useful when running darcs in a pipe, as might be the case when running apply from a mailer.

--remote-repo

Some commands, such as pull, require a remote repository to be specified, either from the command line or as a default. The --remote-repo option provides an alternative way to supply this remote repository path. This flag can be seen as temporarily ``replacing'' the default repository: setting it causes the command to ignore the default repository, without overwriting it. On the other hand, if any other repositories are supplied as command line arguments, this flag will be ignored (and the default repository may be overwritten).


Selecting patches

Many commands operate on a patch or patches that have already been recorded. There are a number of options that specify which patches are selected for these operations: --patch, --match, --tag, and variants on these, which for --patch are --patches, --from-patch, and --to-patch. The --patch and --tag forms simply take (POSIX extended, aka egrep) regular expressions and match them against tag and patch names. --match, described below, allows more powerful patterns.

The plural forms of these options select all matching patches. The singular forms select the last matching patch. The range (from and to) forms select patches after or up to (both inclusive) the last matching patch.
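The difference between these forms can be shown on a small list of patch names. The Python sketch below rests on several assumptions: the select function and its form argument are hypothetical, the list is taken to be in repository order with the oldest patch first (so ``last'' means the final match), and Python's re module stands in for POSIX extended regular expressions, which it only approximates.

```python
import re


def select(patches, pattern, form):
    """patches: names in repository order, oldest first.
    form: 'patch', 'patches', 'from-patch' or 'to-patch'."""
    hits = [i for i, name in enumerate(patches) if re.search(pattern, name)]
    if not hits:
        return []
    last = hits[-1]
    if form == "patches":
        return [patches[i] for i in hits]   # plural: every match
    if form == "patch":
        return [patches[last]]              # singular: last match only
    if form == "from-patch":
        return patches[last:]               # last match and everything after
    if form == "to-patch":
        return patches[:last + 1]           # everything up to the last match
    raise ValueError(f"unknown form: {form}")


history = ["init", "bugfix A", "feature", "bugfix B", "docs"]
all_fixes = select(history, "bugfix", "patches")
latest_fix = select(history, "bugfix", "patch")
```

Both range forms include the matching patch itself, as the text above states.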

These options use the current order of patches in the repository. darcs may reorder patches, so this is not necessarily the order of creation or the order in which patches were applied. However, as long as you are just recording patches in your own repository, they will remain in order.

When a patch or a group of patches is selected, all patches they depend on get silently selected too. For example: darcs pull --patches bugfix means ``pull all the patches with `bugfix' in their name, along with any patches they require.'' If you really only want patches with `bugfix' in their name, you should use the --no-deps option, which makes darcs exclude any matched patches from the selection which have dependencies that are themselves not explicitly matched by the selection.
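The two behaviours can be contrasted in a small model. This Python sketch is a simplification under stated assumptions: the function names and the deps mapping are hypothetical, and the model ignores the fact that a dependency already present in the target repository would not need to be pulled again.

```python
def with_deps(matched, deps):
    """Default: add everything the matched patches depend on, transitively."""
    selected, todo = set(), list(matched)
    while todo:
        patch = todo.pop()
        if patch not in selected:
            selected.add(patch)
            todo.extend(deps.get(patch, ()))
    return selected


def no_deps(matched, deps):
    """--no-deps: drop matched patches whose dependencies are not all matched."""
    selected = set(matched)
    changed = True
    while changed:                 # repeat until nothing more is dropped
        changed = False
        for patch in sorted(selected):
            if not set(deps.get(patch, ())) <= selected:
                selected.discard(patch)
                changed = True
    return selected


# Hypothetical patch names; "bugfix 2" depends on an unmatched patch.
deps = {"bugfix 1": [], "bugfix 2": ["refactor"], "refactor": []}
matched = {"bugfix 1", "bugfix 2"}

default_selection = with_deps(matched, deps)
strict_selection = no_deps(matched, deps)
```

Here the default selection silently gains "refactor", while the --no-deps selection shrinks to "bugfix 1" alone.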

For unrecord, unpull and obliterate, patches that depend on the selected patches are silently included or, if --no-deps is used, selected patches with dependencies on unselected patches are excluded from the selection.

Match

Currently --match accepts five primitive match types, although there are plans to expand it to match more patterns. Also, note that the syntax is still preliminary and subject to change.

The first match type accepts a literal string which is checked against the patch name. The syntax is

darcs annotate --summary --match 'exact foo+bar'
This is useful for situations where a patch name contains characters that could be considered special for regul