Commit 3468a4e9 authored by Jason Rhinelander

Made parallel-runner public

(Moved from one of my private repositories).
parallel-runner: parallel-runner.cpp
	$(CXX) -O2 $(CXXFLAGS) -lpthread -std=c++11 parallel-runner.cpp -o parallel-runner
# parallel-runner - remote host job manager
parallel-runner is a program that runs a command a given number of times,
distributing the jobs across multiple hosts reached via ssh. Each host has a
maximum number of active jobs; new jobs are submitted to hosts as current jobs
finish.
## Compiling
Run `make` in the project directory. A not-too-old C++ compiler is needed (the
Makefile builds with `-std=c++11`).
This will probably only compile on Linux systems.
## Usage
`parallel-runner [FILE=/path/to/parallel.hosts] [HOSTS='host1=j host2 ...'] [PER_HOST=J] N CMD`
### Example
`parallel-runner HOSTS='host1 host2 host3=4' PER_HOST=3 20 CMD`
Runs `CMD` 20 times using remote machines `host1`, `host2`, and `host3`,
without running more than 3 instances at a time on `host1` and `host2`, and
without running more than 4 instances at a time on `host3`.
### Detailed description
The command `CMD` (including any arguments) is executed via ssh on multiple
hosts, as listed in the `HOSTS` argument (if given) or otherwise in the
parallel.hosts file (see below). If `PER_HOST=J` is specified, where J is an
integer, it sets the number of jobs allowed to run on a single host at a time.
If omitted, the default comes from the parallel.hosts file. Each host may
optionally have "=j" (where j is an integer) appended to override the maximum
number of jobs for that host.

Jobs are allocated from a pool as other jobs finish on existing hosts, so as to
keep as many processes as allowed running at any given time until finished.
This program runs until all jobs have finished.
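
The pool behaviour described above can be modelled with a short simulation. This is only a sketch of the idea, not the program's actual implementation: the function name `simulate` and the per-job `durations` input are hypothetical, introduced here for illustration.

```python
import heapq


def simulate(limits, durations):
    """Model pool-style allocation of jobs to (host, thread) slots.

    limits:    dict mapping host name -> maximum concurrent jobs on that host
    durations: list of run times, one per job (hypothetical simulation input)

    Returns a list of (jobno, host, thread, start, end) tuples.
    """
    # Every (host, thread) slot starts out free at time 0.
    free = [(0.0, host, thread)
            for host, max_jobs in limits.items()
            for thread in range(1, max_jobs + 1)]
    if not free:
        raise ValueError("no host slots available")
    heapq.heapify(free)

    schedule = []
    for jobno, duration in enumerate(durations, start=1):
        # The slot that becomes free earliest takes the next job from the pool.
        start, host, thread = heapq.heappop(free)
        end = start + duration
        schedule.append((jobno, host, thread, start, end))
        heapq.heappush(free, (end, host, thread))
    return schedule
```

With the limits from the example above (`host1` and `host2` at 3, `host3` at 4) there are 10 slots, so 20 equal-length jobs run in two waves of 10.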
Any output is logged (on the local host) to `logs/TIMESTAMP/HOST-N.log`, where
`TIMESTAMP` is the timestamp when *this* program is invoked, `HOST` is the remote
hostname, and `N` is a number from 1 to J, reflecting the J parallel task
streams. A symlink `logs/latest` is updated each time parallel-runner is invoked
to point at the most recent timestamp.

`CMD` may reference the environment variables `PARALLEL_HOST`, `PARALLEL_JOBNO`, and
`PARALLEL_THREAD` which contain the specified host, overall job number (from 1 to
N), and host thread number (from 1 to J), respectively. The host/thread pair is
unique across currently-running jobs; the jobno is unique and sequential across
all initiated jobs.
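
As an illustration of how a job might use these variables, here is a minimal sketch of helper functions for a Python script run as `CMD`. It assumes the three variables are visible in the job's environment as described above; the helper names `job_context` and `output_name` and the `result-NNNN.csv` naming scheme are hypothetical.

```python
import os


def job_context(environ=os.environ):
    """Collect the three variables described above into a dict."""
    return {
        "host": environ["PARALLEL_HOST"],
        "jobno": int(environ["PARALLEL_JOBNO"]),    # 1..N, unique per job
        "thread": int(environ["PARALLEL_THREAD"]),  # 1..J on this host
    }


def output_name(ctx, prefix="result"):
    """Build a per-job file name (hypothetical naming scheme).

    The job number is unique across all initiated jobs, so names built
    from it do not collide between jobs.
    """
    return "{}-{:04d}.csv".format(prefix, ctx["jobno"])
```

A job script could call `job_context()` at startup and write its results to `output_name(ctx)`. The job number is the safer key for output files, since a host/thread pair is reused once its job finishes.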
### Program Arguments:
#### `HOSTS='host1 username@host3 username@host4=2'`
This directive (which may be repeated) contains a whitespace-separated list of
hosts to ssh to. You will need an SSH key installed on each one, as password
authentication is not supported. (If your key is password-protected, you will
also need an authenticated, active ssh-agent with the key loaded.) Each host
item has the form `[username@]hostname[=j]`,
where username is the login username, hostname is the hostname to give to ssh,
and j is the maximum number of jobs to run at once on the given host (if 0, the
host is not used at all). If =j is omitted, the maximum number of jobs comes
from the `PER_HOST` value.

Additional ssh options for the connection cannot be specified here; instead set
any desired options in an appropriate `~/.ssh/config` file.
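
The host item format described above can be illustrated with a small parser. This is a sketch under stated assumptions rather than the program's own parsing code: the names `parse_host_item` and `parse_hosts` are hypothetical, and rejecting negative job counts is an assumption.

```python
def parse_host_item(item, per_host):
    """Split 'username@hostname=j' (username and =j optional) into a tuple.

    Returns (username or None, hostname, max_jobs). When '=j' is absent,
    max_jobs falls back to per_host.
    """
    target, sep, jobs = item.partition("=")
    max_jobs = int(jobs) if sep else per_host
    if max_jobs < 0:
        raise ValueError("negative job count in " + repr(item))
    username, _, hostname = target.rpartition("@")
    if not hostname:
        raise ValueError("missing hostname in " + repr(item))
    return (username or None, hostname, max_jobs)


def parse_hosts(value, per_host):
    """Parse a whitespace-separated HOSTS string, dropping hosts with j == 0."""
    parsed = [parse_host_item(item, per_host) for item in value.split()]
    return [host for host in parsed if host[2] > 0]
```

For example, `parse_hosts("host1 username@host4=2", 3)` gives `host1` the default of 3 jobs and `host4` its own limit of 2.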
#### `HOSTS+='host5 host6'`
This is like the HOSTS= argument, in that it appends hosts to connect to, but
unlike 'HOSTS=' it does not suppress loading of hosts from the parallel.hosts
configuration file.
#### `PER_HOST=J`
This specifies the default maximum jobs per host (for any hosts without an
individual '=j' specifier).
#### `FILE=/path/to/parallel.hosts`
If one or both of the `HOSTS=` and `PER_HOST=` arguments are omitted, the `HOSTS` and
`PER_HOST` values are read from a parallel.hosts configuration file, which can
specify the two variables as in the following example:

    # This is an example comment.
    # Define three hosts--note the parentheses, not quotes, here:
    HOSTS=(host1 host2 myusername@host3=8)
    # You can append hosts here using HOSTS+=(...):
    HOSTS+=(host5 username@host6 host7=2)
    # Run up to 4 jobs at once on each host without a specific '=j' job number value:
    PER_HOST=4
    # PARALLEL-RUNNER-DONE -- everything below this line is ignored
If the `FILE=` argument to parallel-runner is specified, that file is loaded.
Otherwise, a suitable file is searched for in the following locations; the first
one found is used:
- `parallel.hosts` (in the current directory)
- `.parallel.hosts` (in the current directory)
- `$XDG_CONFIG_HOME/parallel.hosts` (if `XDG_CONFIG_HOME` is set)
- `$HOME/.config/parallel.hosts` (if `XDG_CONFIG_HOME` is not set)
- `$HOME/.parallel.hosts`
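
The search order above can be sketched as follows. The function names are hypothetical, and treating an empty `XDG_CONFIG_HOME` as unset is an assumption, not something the list specifies.

```python
import os


def candidate_paths(environ=os.environ, cwd="."):
    """Return the configuration file locations in the order listed above."""
    paths = [
        os.path.join(cwd, "parallel.hosts"),
        os.path.join(cwd, ".parallel.hosts"),
    ]
    xdg = environ.get("XDG_CONFIG_HOME")
    home = environ.get("HOME")
    if xdg:  # assumption: an empty value counts as "not set"
        paths.append(os.path.join(xdg, "parallel.hosts"))
    elif home:
        paths.append(os.path.join(home, ".config", "parallel.hosts"))
    if home:
        paths.append(os.path.join(home, ".parallel.hosts"))
    return paths


def find_hosts_file(explicit=None, environ=os.environ, cwd="."):
    """Return the FILE= path if given, else the first candidate that exists."""
    if explicit is not None:
        return explicit
    for path in candidate_paths(environ, cwd):
        if os.path.isfile(path):
            return path
    return None
```

Only one of the two XDG-style locations is ever considered, so the candidate list has at most four entries.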
The `HOSTS` and `PER_HOST` values function in the same way as the command-line
arguments. Note one difference: the value in the file is a bash-compatible
array declaration surrounded by parentheses rather than a simple string. (The
parenthesized forms are permitted on the command line, but not required.)
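
To make the file format concrete, here is a minimal reader for it. It is a sketch, not the program's parser, and it rests on several assumptions: one directive per line, `HOSTS=(...)` replacing any earlier list (bash assignment semantics), and unrecognized lines being skipped. The name `parse_hosts_file` is hypothetical.

```python
import re

DONE_MARKER = "# PARALLEL-RUNNER-DONE"
HOSTS_RE = re.compile(r"^HOSTS(\+?)=\((.*)\)$")
PER_HOST_RE = re.compile(r"^PER_HOST=(\d+)$")


def parse_hosts_file(text):
    """Return (host_items, per_host) read from parallel.hosts content."""
    hosts, per_host = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(DONE_MARKER):
            break  # everything below the marker is ignored
        if not line or line.startswith("#"):
            continue  # blank line or comment
        match = HOSTS_RE.match(line)
        if match:
            items = match.group(2).split()
            # '+=' appends; plain '=' replaces (assumed, as in bash)
            hosts = hosts + items if match.group(1) else items
            continue
        match = PER_HOST_RE.match(line)
        if match:
            per_host = int(match.group(1))
        # any other line is skipped (assumption)
    return hosts, per_host
```

The returned host items are still raw strings such as `myusername@host3=8`; splitting them into username, hostname, and job count is a separate step.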
Command-line options override the file options, and a command-line `PER_HOST`
argument sets an upper bound for host jobs when `HOSTS` is not listed on the
command line. In other words, `parallel-runner PER_HOST=2 N 'some command'` with a
`parallel.hosts` file containing `HOSTS=(hosta=1 hostb=2 hostc=3)` will run one
job at a time on `hosta` and two jobs at a time on both `hostb` and `hostc`.
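
The upper-bound rule can be written out as a short calculation. This is a sketch of the rule as described above, for the case where `HOSTS` comes from the file; the function name `effective_limits` and its argument layout are hypothetical.

```python
def effective_limits(file_hosts, file_per_host=None, cli_per_host=None):
    """Compute per-host job limits when HOSTS is read from parallel.hosts.

    file_hosts:    list of (hostname, j) pairs; j is None when '=j' is absent
    file_per_host: PER_HOST from the file, or None
    cli_per_host:  PER_HOST from the command line, or None
    """
    # The command-line value overrides the file value as the default...
    default = cli_per_host if cli_per_host is not None else file_per_host
    limits = {}
    for host, j in file_hosts:
        if j is None:
            if default is None:
                raise ValueError("no PER_HOST value available for " + host)
            j = default
        # ...and also caps any larger per-host '=j' value from the file.
        if cli_per_host is not None:
            j = min(j, cli_per_host)
        limits[host] = j
    return limits
```

Applied to the example, `effective_limits([("hosta", 1), ("hostb", 2), ("hostc", 3)], cli_per_host=2)` leaves `hosta` at 1 and `hostb` at 2, and caps `hostc` at 2.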