
DNA - FreeSoftware - distcc

distcc is a program to distribute builds of C, C++, Objective C or Objective C++ code across several machines on a network. distcc should always generate the same results as a local compile, is simple to install and use, and is often much faster than a local compile.

distcc sends the complete preprocessed source code and compiler arguments across the network for each job. So the machines do not need to share a filesystem, header files, or libraries, or even be running the same platform. The compiler must be installed under the same name on the client and on every volunteer machine.


distcc

Put the names of the servers in your environment:

$ DISTCC_HOSTS="host_to_use_for_compilation ..."
$ export DISTCC_HOSTS

List the fastest machine first: distcc prefers hosts that appear earlier in the list.

Then build your program:

$ make -j8 CC=distcc

The -j value should normally be set to about twice the total number of available CPUs, to allow for some tasks being blocked waiting for disk or network IO.

For example, with one local and three remote PCs, each with a single CPU, there are four CPUs in total, so set the -j value to 8 or lower.
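The rule of thumb above can be sketched as a small shell helper. The function name estimate_jobs and the host names are hypothetical, and the sketch assumes every host has the same number of CPUs and that the host list contains plain names without per-host options:

```shell
# Hypothetical helper: estimate a -j value as 2 x (hosts x CPUs per host).
# $1: space-separated host list, $2: CPUs per host (assumed uniform, default 1)
estimate_jobs() {
    hosts=$1
    cpus_per_host=${2:-1}
    count=0
    for h in $hosts; do          # unquoted on purpose: split on whitespace
        count=$((count + 1))
    done
    echo $((count * cpus_per_host * 2))
}

# One local and three remote single-CPU machines (example names).
njobs=$(estimate_jobs "localhost host1 host2 host3" 1)
echo "make -j$njobs CC=distcc"
```

Treat the result as an upper bound and lower it if the client machine becomes the bottleneck.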

pump mode

The major improvement in distcc 3.0 is the inclusion of "pump" mode.

$ DISTCC_POTENTIAL_HOSTS="host_to_use_for_compilation ..."
$ export DISTCC_POTENTIAL_HOSTS
$ pump make -j8 CC=distcc
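If you set DISTCC_HOSTS yourself instead of DISTCC_POTENTIAL_HOSTS, each server that should run in pump mode needs the ,cpp,lzo options in its host entry. A minimal sketch follows; the helper name pump_hosts and the host names are made up for illustration:

```shell
# Hypothetical helper: append the pump-mode options to every host given.
pump_hosts() {
    out=""
    for h in "$@"; do
        out="$out $h,cpp,lzo"    # cpp: pump mode, lzo: compression
    done
    echo "${out# }"              # drop the leading space
}

DISTCC_HOSTS=$(pump_hosts host1 host2)
export DISTCC_HOSTS
echo "DISTCC_HOSTS=$DISTCC_HOSTS"

# Then build as before:
#   pump make -j8 CC=distcc
```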

Using distcc with ccache

Normally it is better to run ccache before distcc, so that only cache misses are sent over the network. However, do not use pump mode and ccache at the same time. See the note on ccache (ccache.html) for details.
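One common way to run ccache before distcc is ccache's CCACHE_PREFIX setting, which makes ccache invoke the compiler through the named wrapper on a cache miss. A minimal sketch, assuming the compiler is installed as gcc (these notes do not name a compiler) and that DISTCC_HOSTS is already exported:

```shell
# Ask ccache to run the real compiler through distcc on a cache miss.
CCACHE_PREFIX="distcc"
export CCACHE_PREFIX

# Assumed compiler name: gcc. Printed rather than executed here.
echo 'make -j8 CC="ccache gcc"'
```

Do not wrap this build in pump; as noted above, the two do not combine.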

pump mode, or ccache

Pump mode also distributes preprocessing to the servers, but using ccache prevents the use of pump mode.
So which is better, pump mode or ccache?

distcc, pump mode and ccache

combination      preprocess    compile              advantage
distcc           on localhost  on servers           -
distcc pump      on servers    on servers           distributed preprocessing
distcc + ccache  on localhost  on servers or cache  sharing a cache directory

Consider a very simple model. Let t_cpp be the preprocessing time, t_cc1 the compilation time, t_ld the link time, h the cache hit rate (0 ≤ h ≤ 1), and n the number of compilations running in parallel (n ≥ 1).

t_distcc ∼ t_cpp + t_cc1/n + t_ld
t_pump ∼ (t_cpp + t_cc1)/n + t_ld
t_ccache ∼ t_cpp + (1-h) t_cc1/n + t_ld

Figure: Performance of ccache and distcc pump mode.

With ccache, the build time lies between t_ccache(h=0; n) and t_ccache(h=1; n), depending on the hit rate. In pump mode, the preprocessing and compilation time is inversely proportional to n.
Performance depends on a number of conditions, but pump mode has the advantage if you have many servers, and ccache has the advantage if you have only one to three servers.
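As a worked step under the simple model above, the point at which pump mode overtakes ccache follows from comparing the two estimates. The link time appears on both sides and cancels. The result is only as good as the model, which assumes, for instance, that preprocessing parallelizes perfectly in pump mode:

```latex
\begin{align*}
t_{pump} < t_{ccache}
  &\iff \frac{t_{cpp} + t_{cc1}}{n} < t_{cpp} + \frac{(1-h)\,t_{cc1}}{n} \\
  &\iff t_{cpp} + h\,t_{cc1} < n\,t_{cpp} \\
  &\iff n > 1 + h\,\frac{t_{cc1}}{t_{cpp}}
\end{align*}
```

With a cold cache (h = 0), pump mode wins as soon as n > 1. With purely illustrative values h = 0.5 and t_cc1 = 4 t_cpp, it wins only for n > 3, which fits the observation that ccache is preferable with one to three servers.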


distccd

On the server side, distccd may be used either from inetd or as a stand-alone server.

Standalone server

The recommended method for running distccd is as a standalone server. To start distccd as a standalone service, run a command like this either as root or an ordinary user:

$ distccd --daemon
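Because distccd accepts compile jobs over the network, it is usually worth restricting which clients may connect. A minimal sketch using distccd's --allow option; the subnet 192.168.0.0/24 is an assumed example, not a value from these notes:

```shell
# ALLOW_NET is an assumed example subnet; replace it with your own network.
ALLOW_NET="192.168.0.0/24"

if command -v distccd >/dev/null 2>&1; then
    # --allow restricts which client addresses may connect.
    distccd --daemon --allow "$ALLOW_NET"
else
    echo "distccd is not installed or not in PATH" >&2
fi
```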

Traditional Unix inetd

With a traditional Unix inetd, inetd runs the command distccd whenever it receives a connection on the distcc port.

Add a line like the following to each of these files (after checking that it is not already there):

/etc/services
distcc 3632/tcp #distributed C/C++ compiler server
/etc/inetd.conf
distcc stream tcp nowait root /usr/bin/distccd distccd --verbose --inetd --user nobody --nice 20
/etc/hosts.allow
distcc: LOCAL

If your system uses xinetd, as Red Hat does, create a configuration file for distcc instead:

/etc/xinetd.d/distcc
distcc configuration file
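The linked file is not reproduced here. As a rough sketch of what such an entry might look like, the snippet below mirrors the inetd.conf line above in xinetd syntax and writes it to a scratch file for review. The only_from subnet and the scratch path are assumptions; adjust both before installing anything:

```shell
# Write a sample xinetd service entry to a scratch file (no root needed).
conf="${TMPDIR:-/tmp}/distcc.xinetd.sample"

cat > "$conf" <<'EOF'
service distcc
{
    disable     = no
    socket_type = stream
    protocol    = tcp
    wait        = no
    user        = root
    server      = /usr/bin/distccd
    server_args = --verbose --inetd --user nobody --nice 20
    only_from   = 192.168.0.0/24
}
EOF

echo "review $conf, then install it as /etc/xinetd.d/distcc"
```

The service name distcc relies on the /etc/services entry shown earlier to map it to port 3632.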

Then send the daemon (inetd or xinetd) a SIGHUP signal so that it rereads its configuration file.


References

http://distcc.org/
distcc - Google Code
http://distcc.samba.org/
distcc: a fast, free distributed C/C++ compiler (v2 or older)
ccache.html
Note for ccache.
