Submit Jobs with CLI Commands
This chapter provides examples of submitting jobs using CLI (command line interface) commands.
nc run
Resource values may be given with unit suffixes; for example, nc run -r SWAP/1GB -- sleep 0 and nc run -r RAM/0.1Tb -- sleep 0 are both supported. The parameter names for which this unit conversion is supported are RAM/, RAM#, RAMFREE#, RAMFREE/, RAMTOTAL#, RAMTOTAL/, SWAP/, SWAP#, SWAPFREE#, SWAPFREE/, SWAPTOTAL#, SWAPTOTAL/, and TMP# or TMP/. By default the unit is MB (megabytes), where 1 MB is 1<<20 bytes.
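For illustration, assuming 1GB is interpreted as 1024 MB under the default 1 MB = 1<<20 bytes scaling, the following two requests should be equivalent:
% nc run -r RAM/2GB -- sleep 0
% nc run -r RAM/2048 -- sleep 0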
nc: Usage Message
NC RUN:
Run one or more jobs. The jobs are added to the system, and
will remain in the system until you use 'forget' to forget them
or they are automatically forgotten by the system.
If taskers and resources are available, the jobs are dispatched
immediately, else they are queued.
USAGE:
% nc run [OPTIONS] command ...
GENERAL OPTIONS:
-h -- Help usage message.
-v <level> -- Verbose level from 0 (silent) to 9 (very verbose).
-- -- Null option. In case of ambiguity, use this to separate
the options from the command.
In addition, the value of the environment variable NC_RUN_ARGS is
prepended to the argument list for this command, while the value of
NC_RUN_ARGS_AFTER is appended.
JOB CHARACTERISTICS OPTIONS:
-autokill <time> -- Kill job if it runs longer than specified time.
Set it to zero to disable autokill (the default).
-clearcase -- This is a job to be run in a ClearCase view
(see docs).
-C <class> -- The job belongs to the given class.
If argument is empty, the option is ignored.
Option can be repeated. The jobclass of the job
will be the last one specified.
-e <env> -- Set the environment. Default is current env, as
defined by the variable VOV_ENV.
Setting this to the null string "" or to
"SNAPSHOT", forces the use of an environment
snapshot.
-e+ <env> -- Append to existing environment.
-ep -- Capture environment in a SNAPSHOT property. Uses
SNAPPROP environment.
-first -- Schedule job first in its bucket.
-forceterm -- In the case of interactive jobs where the output is
piped, the job's TERM environment variable is set
to 'network'. This option disables that behavior.
-fstokens <N> -- Multiply weight of this job in FairShare by N.
Default 1, range is [0..50000]
-g <group> -- Specify the FairShare group.
The .user:subgroup suffix will be added. You need
attach permission to run in the specified group.
-G <group.tag> -- Specify the complete FairShare group with the
<group>.<tag> and/or <group>.<tag>:<subgroup> syntax.
(<tag> is typically a user.) If the <group> or
<subgroup> does not exist, it will be created with
the current user as the owner. You need attach
permission to run in the specified group.
-ioprofile -- Activate enhanced job profiling.
-I, -Ir -- Run interactive job. TTY signals like <ctrl>C are
propagated to the job. If the environment variable
VOV_INTERACTIVE_PING is set, its value (TIMESPEC
format, minimum is 1m) will be used to keep the
connection with the interactive job alive by pinging
the job at the specified interval.
-Il -- Run interactive job. TTY signals like <ctrl>C are
kept local, not propagated to the job. Appropriate
for piping stdout to a file or command. See above
for usage of VOV_INTERACTIVE_PING.
-Ix -- Run X Window based interactive job, no TTY, no wait.
Adds env D(DISPLAY=...) so job displays on submission
display. See above for usage of VOV_INTERACTIVE_PING.
-jobproj <name> -- The job project is set to <name>. The default is
determined by the environment variables VOV_JOBPROJ,
LM_PROJECT and RLM_PROJECT.
-jpp <JPP> -- Specify a job-placement-policy. These policies are
advisory only. Legal values for JPP are a comma
separated list of one or more words from the following
list:
At most one of these:
fastest -- This is the default job placement
policy: among all the taskers that can
execute a job, choose the one with the
highest power. The assumption is that a
tasker with a higher power will complete
the job faster.
slowest -- Among all the taskers that can execute a
job, choose the one with the least power.
This policy may be useful to run
regression jobs on older, less powerful
hardware.
first -- As soon as the scheduler finds a tasker
to execute the job, it uses that tasker
without checking all other taskers. This
policy is useful for lowering the
scheduling effort.
largest -- Among all the taskers that can execute a
job, choose the tasker with the most
amount of unused slots and unused RAM
((MB of unused RAM) + 16000*(Number of
unused cores)). This policy tends to
spread the jobs on idle machines. In
most cases, this policy may not be the
most effective.
smallest -- Among all the taskers that can execute a
job, choose the tasker with the least
amount of unused slots and unused RAM
((MB of unused RAM) + 16000*(Number of
unused cores)). This policy tends to
pack the jobs on machines that are
already busy, thus keeping idle machines
available if a large job is submitted.
smallram -- Among all the taskers that can execute a
job, choose the tasker with the least
amount of total RAM. This policy is
useful to pack the jobs on the smaller
machines first, which keeps large
machines available if a large job is
submitted.
At most one of these (Linux-only):
pack -- NUMA control: assign the job to a
NUMA node with the least number of
available resources that will fit the
job. If none of the NUMA nodes have
sufficient job slots and RAM, the job
will be allowed to run on as many
NUMA nodes as needed to satisfy its
resource requirement.
spread -- NUMA control: assign the job to a
NUMA node that has the largest number
of available resources.
none -- NUMA control: allow Linux to place
jobs. The Linux CPUs Allowed affinity
list will be all the CPUs on the
system (default).
Examples:
-jpp slowest
-jpp spread
-jpp smallest,pack
-jpp first,spread
Note: To place jobs on the same machines,
use first or smallest.
-J <jobname> -- Set the job name (same as -N <jobname>)
-keep -- Keep job after completion, disabling auto-forget.
-keepfor <time> -- Keep job after completion for specified time.
Disables auto-forget.
-limit <spec> -- Add limit to jobs submitted with this option. <spec>
could be just a number, or a name followed by a
number. Throttles the running jobs of a user
submitted with the same -limit option to the
specified number.
-L <exitstatus> -- Legal exit status list (default is 0). You can also
use commas to separate the valid statuses.
Examples:
-L 0,2,10 -L 0,200-208
-maxresched <N> -- Maximum number of times the job can be rescheduled.
Must be >= 1 and <= 10. Implemented
via the MAX_RESCHEDULE property on the job.
-mpres RESLIST -- Specification of the resources required by a
multiphase job. The RESLIST specifies the resource
lists for all phases of the job with % characters
delimiting the phases. The sublists of resources for
each phase are percent sign delimited.
Example: -mpres "RAM/200 CORES/2%RAM/20 CORES/3"
-mpres+ <rsrc> -- Append one resource to the multiphase resource list.
This option must follow the "-mpres" or "-mpres<n>"
option, otherwise the resources specified in "-mpres+"
will be overwritten.
-mpres1 RESLIST
-mpres2 RESLIST
-mpres<n> RESLIST -- Specify resources for a stage <n> of a multiphase
job. The number <n> is in the range from 1 to 9.
Example: -mpres1 "RAM/200"
-mpres2 "CORES/4 RAM/10"
-N <jobname> -- Same as -J <jobname>.
(compatible with FDL 'N' procedure)
-p <priorities> -- Set priorities for scheduling and execution of job.
Format is <schedulingPriority>[.executionPriority]
Priority is either a number from 1 (low) to 15 (top)
or a symbolic value 'low' 'normal' 'high' 'top' or
any abbreviation thereof.
Examples: -p n -p 4.high -p high.low
-pre <SCRIPT> -- Execute <SCRIPT> as precondition. The JOBID will be
-precmd <SCRIPT> appended to the arguments of the script. If the
script exits with non-zero status, the job is not
run. See examples in $VOVDIR/etc/pre.
-post <SCRIPT> -- Execute <SCRIPT> as postcondition. The JOBID and
-postcmd <SCRIPT> EXITSTATUS will be appended to the arguments of the
script. The script is executed irrespective of the
success of the job. The exit status of the script
becomes the exit status of the job.
See examples in $VOVDIR/etc/post.
-preemptable <N> -- Set preemptable mode:
N=0 not preemptable
N=1 preemption allowed (default)
-profile -- Activate job profiling to track and graph over time
the following: RAM usage, CPU usage, cumulative I/O,
and License usage. Without this option, only the
current usage of RAM, CPU, and Licenses are reported
in the web UI.
-r <r1> [r2..rN] -- Set requested resources of the job. Accepts
multiple resource arguments and may be repeated.
If -r is the last option, use '--' to separate
the last resource from the command line.
-r+ <resource> -- Append one resource to resource list.
No termination necessary.
-r- <resource> -- Remove one resource from the resource list.
It is an error to remove a resource that does not
exist. No termination necessary.
-rf -- Add Filer:<FILER_NAME> resource
(computed from run dir)
-reconcilemem -- Monitor a job's actual memory usage and decrease
consumed resources for the job if the consumed RAM for
the job is less than the requested RAM. Optional value
is a triplet of time specs:
start[:interval[:end]]
where:
- start is the time after job starts that monitoring
begins.
- interval is the time between checks. Default is
to check only once.
- end is the time after job start that monitoring
ends. Default is end of job.
For example, "-reconcilemem 1m:2m:8m" instructs
Accelerator to start checking memory usage 1
minute after the job has begun, do it every 2
minutes and stop when the job has been running
for 8 minutes. Note: if actual memory usage
for a job exceeds requested RAM, the consumed
RAM resource for the job is increased whether
or not -reconcilemem is specified.
-rundir <dir> -- Specify a different run directory (default ".")
If the <dir> specification is quoted by single
quotes, the directory is taken exactly as given,
instead of being canonicalized. When using -rundir
with the SNAPSHOT environment, the -ep argument
must also be passed. Implies -D.
-set <setName> -- Assign the job(s) to the given set.
-sg <subgroup> -- Specify a subgroup for fairshare for the current user
-splitstderr -- Write the stderr output of the interactive job to
stderr. Default is to write the job's stderr output
to stdout. Note that using this option will probably
result in garbled terminal output due to interleaving
of stdout and stderr outputs.
-tool <toolName> -- Specify a "toolName" different from the tail of the
first command line argument. The argument must be
less than 100 characters long and contain only
alphanumeric chars.
-x | -xdur <xdur> -- Set the expected duration of the job.
-deadline <duration> -- Job is expected to be completed within the
given duration.
Set it to zero to disable it (the default).
-deadlineat <time> -- Job is expected to be completed before the
given time.
The time is parsed by the Tcl command
[clock scan $time]
Set it to zero to disable it (the default).
SUBMISSION OPTIONS:
-after <time> -- Fire job after specified time.
-array <n> -- Submit a jobarray of 'n' repeated commands
Some fields may contain the strings @INDEX@, @JOBID@,
and @ARRAYID@, which are substituted when the array
is created.
These fields are: command, env, wd, toolname, jobname
The output files are also subject to the same
substitutions.
Three comma-separated formats for <n> are supported:
last
first,last
first,last,increment
-at <date> -- Specify earliest date to fire job
The date is parsed by the Tcl command
[clock scan $date]
-atomic -- Create job array using a single RPC between
client and server.
-dp N -- Run a Distributed Parallel (DP) job requiring N
components.
-dpactive <n> -- The n-th component is the one that becomes active
(default 1).
-dpres RESLIST -- Specification of the resources required by a parallel
job. Example: -dpres "RAM/200 CORES/2"
See vovcreatepartialjobs for more info.
-dpres+ <rsrc> -- Append one resource to the distributed processing
resource list.
No termination necessary.
-dpres1 RESLIST
-dpres2 RESLIST
-dpres<n> RESLIST -- Specify resources for a component <n> of a DP job.
The number <n> is in the range from 1 to <N>
(option -dp)
Example: -dpres1 "RAM/200"
-dpres2 "CORES/4 RAM/10"
-dpwait TIMESPEC -- The time the components wait to rendezvous
(default 30s, minimum 3s). The wait is increased with
each attempt. The maximum wait is controlled by the
property DP_WAIT_MAX
-dpnocohortwait -- Partial jobs may exit without waiting for primary.
-dpinitialport N -- Specify starting port on which partial jobs should
attempt to communicate.
-D -- Do not check the validity of the directory.
-f <file> -- Get a list of commands from a file, one per line.
Jobs are created and then scheduled in blocks
of 200 jobs (unless otherwise specified by -fb).
-fb <n> -- Change the size of blocks of jobs scheduled with -f
(default 200).
-fw <S> -- Specify delay between blocks of jobs, in seconds.
Value must be >= 0, default is 0. Use with -f.
-dribble -- Shorthand for -fb 1 -fw 0.1
-F -- Force running of job even if it is already valid.
This is useful only if you are also using option -l
to set the name of the log file, otherwise this
option has no effect.
-multiphase <N> -- Set multiphase mode:
N=0 not multiphase (default)
N=1 multiphase
If a job will have more than 4 phases, then the
"-maxresched N" option must also be specified, where
N is the number of phases the job will have.
LOGFILE AND OTHER DEPENDENCIES OPTIONS:
-dep <Id|Name> -- Specify a dependency on the list of jobs.
-d <Id|Name> The argument can be a list of job Ids or job names.
In the case of job names, the dependency is looked
for in the set of jobs belonging to the submitting
user.
The current job will not start until the
specified jobs have completed successfully.
May be repeated.
Performance note: dependencies on job names are much
slower than dependencies on job ids.
-depset <Name> -- Specify a dependency on all jobs in the named set at
the time of submission. If other jobs are added to
the set later, they will not be added to the
dependencies. May be repeated.
-forcelog -- Force the declared output log to be the output of
-force this job. If another job was declaring the same
output, it will become black (SLEEPING).
-forcedequeue -- Force the declared output log to be the output of
this job. If any job was declaring the same
output, the upcone of all the jobs producing this
file will be stopped and dequeued, and those jobs
will become black (SLEEPING).
-i <in_file> -- Specify an input dependency.
-l <logfile> -- Specify name of logfile.
As with -rundir, if the <logfile> is quoted with
either " or ', then the name is taken literally
and not canonicalized.
Quoted or not, variable substitution on the file name
is performed for the following variables
@JOBID@ -> Id of job.
@ARRAYID@ -> Id of job array (if applicable).
@DATE@ -> ISO_TIMESTAMP
@UNIQUE@ -> %Y%m%d_%H%M%S.SUBMISSION_PID
@JOBCLASS@ -> job class (the alphanumeric part)
@JOBNAME@ -> job name (the alphanumeric part)
You may need to use -forcelog together with -l.
Timestamp in format '%Y%m%d_%H%M%S.SUBMISSION_PID'
will be added to the logfile name for array jobs
when '@UNIQUE@' is not present in the logfile name.
-n -- Use no wrapper (default: use 'vw').
-nolog -- Do not keep a log.
-o <out_file> -- Specify an output dependency.
-P <NAME=VALUE> -- Add the given property to the jobs (may be repeated).
-s -- Declare that the logfile is SHARED (see docs).
You rarely need this option. If misused, this option
causes extra buckets to be created in the scheduler.
Probably you need '-forcelog -F' instead.
-uniqueid -- Force NC to use a unique new
VovId for each job submission,
even when the same job is submitted multiple times.
-wrapper <W> -- Use specified wrapper '<W>' (default: use 'vw').
E-MAIL NOTIFICATION AND WAIT OPTIONS:
-m -- Send me mail upon job completion.
-M <mail rule> -- Send mail according to the given rule (see docs).
-w -- Wait for the job(s) to finish: do not show any log.
For the meaning of the exit status, check nc wait.
-wl -- Wait for the job(s) to finish: show the log of
the last job.
For the meaning of the exit status, check nc wait.
EXTRAS:
-nodb -- The job is not stored in the jobs log or in the
database.
-nopolicy -- For ADMIN only. Disable the policy layer.
EXAMPLES:
% nc run sleep 10
% nc run -autokill 30m sleep 10000000
% nc run -r SLOT/4 -xdur 500 -deadline 1h sleep 500
% nc run -r SLOT/4 -xdur 500 -deadlineat 10am sleep 500
% nc run -array 10 sleep 1 # submit 10 sleep jobs via a job array
% nc run -array 10,200,10 sleep 1 # submit sleep jobs with index from 10 to 200, step 10
% nc run -g /teams/chipA -sg session12 sleep 1
% nc run -G /teams/chipA.any sleep 1
% nc run -C longjobs sleep 10000
% nc run -C longjobs -r+ RAM/200 sleep 10000
% nc run -r unix -- sleep 10
% nc run -p high sleep 10
% nc run -e BASE -p h sleep 10
% nc run -e SNAPSHOT+SIM -p h sleep 10
% nc run -m sleep 10; # email when job finishes.
% nc run -M ":ERROR" sleep 10; # email only if the job fails
% nc run -dp 3 -dpres sun7,linux vovparallel clone sleep 10
% nc run -at 6pm sleep 10
% nc run -at "tomorrow 6pm" sleep 10
% nc run -after 10m sleep 10
% nc run -forcelog -F -l mylog.txt ./myjob
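The options described above can also be combined. The sketches below use placeholder job and file names: the first submits a job array with a per-job log file (using the @JOBID@ substitution documented for -l), and the next two form a simple dependency chain, assuming the name given with -N is the name matched by -dep:
% nc run -array 5 -l log.@JOBID@.txt -- sleep 1
% nc run -N prepdata -- sleep 10
% nc run -dep prepdata -- sleep 1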
Default Output of nc run
The default output of nc run includes the following information:
- The resource list assigned to the job, which can be controlled with the option -r.
- The environment used for the job, which can be controlled with the option -e.
- The command line.
- The log file used to store both stderr and stdout of the command, which can be controlled with the option -l.
- The JobId assigned by Accelerator to this job. JobIds are used as handles with many of the Accelerator commands.
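For example, a submission that sets each of these items explicitly might look like the following; the resource value, environment name, and log file name are placeholders for illustration:
% nc run -r RAM/512 -e BASE -l run1.log -- sleep 10
The JobId reported for this submission can then be used as a handle with other nc commands.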