Submit Jobs with CLI Commands
This chapter provides examples of submitting jobs using CLI (command line interface) commands.
nc run
Resource values may be given with unit suffixes; for example, nc run -r SWAP/1GB -- sleep 0 and nc run -r RAM/0.1Tb -- sleep 0 are both supported. The parameter names for which this unit conversion is supported are RAM/, RAM#, RAMFREE#, RAMFREE/, RAMTOTAL#, RAMTOTAL/, SWAP/, SWAP#, SWAPFREE#, SWAPFREE/, SWAPTOTAL#, SWAPTOTAL/, and TMP# or TMP/. By default the unit is MB (megabytes), where 1 MB is 1<<20 bytes.
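For illustration, assuming 1GB is interpreted as 1024 MB under the default 1 MB = 1<<20 bytes scaling, the following two requests should be equivalent:
% nc run -r RAM/2GB -- sleep 0
% nc run -r RAM/2048 -- sleep 0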
nc: Usage Message
NC RUN:
Run one or more jobs. The jobs are added to the system, and
will remain in the system until you use 'forget' to forget them
or they are automatically forgotten by the system.
If taskers and resources are available, the jobs are dispatched
immediately, else they are queued.
USAGE:
% nc run [OPTIONS] command ...
GENERAL OPTIONS:
-h -- Help usage message.
-v <level> -- Verbose level from 0 (silent) to 9 (very verbose).
-- -- Null option. In case of ambiguity, use this to separate
the options from the command.
In addition, the value of the environment variable NC_RUN_ARGS is
prepended to the argument list for this command, while the value of
NC_RUN_ARGS_AFTER is appended.
JOB CHARACTERISTICS OPTIONS:
-autokill <time> -- Kill job if it runs longer than specified time.
Set it to zero to disable autokill (the default).
-clearcase -- This is a job to be run in a ClearCase view
(see docs).
-C <class> -- The job belongs to the given class.
If argument is empty, the option is ignored.
Option can be repeated. The jobclass of the job
will be the last one specified.
-e <env> -- Set the environment. Default is current env, as
defined by the variable VOV_ENV.
Setting this to the null string "" or to
"SNAPSHOT", forces the use of an environment
snapshot.
-e+ <env> -- Append to existing environment.
-ep -- Capture environment in a SNAPSHOT property. Uses
SNAPPROP environment.
-first -- Schedule job first in its bucket.
-forceterm -- In the case of interactive jobs where the output is
piped, the job's TERM environment variable is set
to 'network'. This option disables that behavior.
-fstokens <N> -- Multiply weight of this job in FairShare by N.
Default 1, range is [0..50000]
-g <group> -- Specify the FairShare group.
The .user:subgroup suffix will be added. You need
attach permission to run in the specified group.
-G <group.tag> -- Specify the complete FairShare group with the
<group>.<tag> and/or <group>.<tag>:<subgroup> syntax.
(<tag> is typically a user.) If the <group> or
<subgroup> does not exist, it will be created with
the current user as the owner. You need attach
permission to run in the specified group.
-ioprofile -- Activate enhanced job profiling.
-I, -Ir -- Run interactive job. TTY signals like <ctrl>C are
propagated to the job. If the environment variable
VOV_INTERACTIVE_PING is set, its value (TIMESPEC
format, minimum is 1m) will be used to keep the
connection with the interactive job alive by pinging
the job at the specified interval.
-Il -- Run interactive job. TTY signals like <ctrl>C are
kept local, not propagated to the job. Appropriate
for piping stdout to a file or command. See above
for usage of VOV_INTERACTIVE_PING.
-Ix -- Run X Window based interactive job, no TTY, no wait.
Adds env D(DISPLAY=...) so job displays on submission
display. See above for usage of VOV_INTERACTIVE_PING.
-jobproj <name> -- The job project is set to <name>. The default is
determined by the environment variables VOV_JOBPROJ,
LM_PROJECT and RLM_PROJECT.
-jpp <JPP> -- Specify a job-placement-policy. These policies are
advisory only. Legal values for JPP are a comma
separated list of one or more words from the following
list:
At most one of these:
fastest -- This is the default job placement
policy: among all the taskers that can
execute a job, choose the one with the
highest power. The assumption is that a
tasker with a higher power will complete
the job faster.
slowest -- Among all the taskers that can execute a
job, choose the one with the least power.
This policy may be useful to run
regression jobs on older, less powerful
hardware.
first -- As soon as the scheduler finds a tasker
to execute the job, it uses that tasker
without checking all other taskers. This
policy is useful for lowering the
scheduling effort.
largest -- Among all the taskers that can execute a
job, choose the tasker with the most
amount of unused slots and unused RAM
((MB of unused RAM) + 16000*(Number of
unused cores)). This policy tends to
spread the jobs on idle machines. In
most cases, this policy may not be the
most effective.
smallest -- Among all the taskers that can execute a
job, choose the tasker with the least
amount of unused slots and unused RAM
((MB of unused RAM) + 16000*(Number of
unused cores)). This policy tends to
pack the jobs on machines that are
already busy, thus keeping idle machines
available if a large job is submitted.
smallram -- Among all the taskers that can execute a
job, choose the tasker with the least
amount of total RAM. This policy is
useful to pack the jobs on the smaller
machines first, which keeps large
machines available if a large job is
submitted.
At most one of these (Linux-only):
pack -- NUMA control: assign the job to a
NUMA node with the least number of
available resources that will fit the
job. If none of the NUMA nodes have
sufficient job slots and RAM, the job
will be allowed to run on as many
NUMA nodes as needed to satisfy its
resource requirement.
spread -- NUMA control: assign the job to a
NUMA node that has the largest number
of available resources.
none -- NUMA control: allow Linux to place
jobs. The Linux CPUs Allowed affinity
list will be all the CPUs on the
system (default).
Examples:
-jpp slowest
-jpp spread
-jpp smallest,pack
-jpp first,spread
Note: To place jobs on the same machines,
use first or smallest.
-J <jobname> -- Set the job name (same as -N <jobname>)
-keep -- Keep job after completion, disabling auto-forget.
-keepfor <time> -- Keep job after completion for specified time.
Disables auto-forget.
-limit <spec> -- Add limit to jobs submitted with this option. <spec>
could be just a number, or a name followed by a
number. Throttles the running jobs of a user
submitted with the same -limit option to the
specified number.
-L <exitstatus> -- Legal exit status list (default is 0). You can also
use commas to separate the valid statuses.
Examples:
-L 0,2,10 -L 0,200-208
-maxresched <N> -- Maximum number of times the job can be rescheduled.
Must be >= 1 and <= 10. Implemented
via the MAX_RESCHEDULE property on the job.
-mpres RESLIST -- Specification of the resources required by a
multiphase job. The RESLIST specifies the resource
lists for all phases of the job with % characters
delimiting the phases. The sublists of resources for
each phase are percent sign delimited.
Example: -mpres "RAM/200 CORES/2%RAM/20 CORES/3"
-mpres+ <rsrc> -- Append one resource to the multiphase resource list.
This option must follow the "-mpres" or "-mpres<n>"
option, otherwise the resources specified in "-mpres+"
will be overwritten.
-mpres1 RESLIST
-mpres2 RESLIST
-mpres<n> RESLIST -- Specify resources for a stage <n> of a multiphase
job. The number <n> is in the range from 1 to 9.
Example: -mpres1 "RAM/200"
-mpres2 "CORES/4 RAM/10"
-N <jobname> -- Same as -J <jobname>.
(compatible with FDL 'N' procedure)
-p <priorities> -- Set priorities for scheduling and execution of job.
Format is <schedulingPriority>[.executionPriority]
Priority is either a number from 1 (low) to 15 (top)
or a symbolic value 'low' 'normal' 'high' 'top' or
any abbreviation thereof.
Examples: -p n -p 4.high -p high.low
-pre <SCRIPT> -- Execute <SCRIPT> as precondition. The JOBID will be
-precmd <SCRIPT> appended to the arguments of the script. If the
script exits with non-zero status, the job is not
run. See examples in $VOVDIR/etc/pre.
-post <SCRIPT> -- Execute <SCRIPT> as postcondition. The JOBID and
-postcmd <SCRIPT> EXITSTATUS will be appended to the arguments of the
script. The script is executed irrespective of the
success of the job. The exit status of the script
becomes the exit status of the job.
See examples in $VOVDIR/etc/post.
-preemptable <N> -- Set preemptable mode:
N=0 not preemptable
N=1 preemption allowed (default)
-profile -- Activate job profiling to track and graph over time
the following: RAM usage, CPU usage, cumulative I/O,
and License usage. Without this option, only the
current usage of RAM, CPU, and Licenses are reported
in the web UI.
-r <r1> [r2..rN] -- Set requested resources of the job. Accepts
multiple resource arguments and may be repeated.
If -r is the last option, use '--' to separate
the last resource from the command line.
-r+ <resource> -- Append one resource to resource list.
No termination necessary.
-r- <resource> -- Remove one resource from the resource list.
It is an error to remove a resource that does not
exist. No termination necessary.
-rf -- Add Filer:<FILER_NAME> resource
(computed from run dir)
-reconcilemem -- Monitor a job's actual memory usage and decrease
consumed resources for the job if the consumed RAM for
the job is less than the requested RAM. Optional value
is a triplet of time specs:
start[:interval[:end]]
where:
- start is the time after job starts that monitoring
begins.
- interval is the time between checks. Default is
to check only once.
- end is the time after job start that monitoring
ends. Default is end of job.
For example, "-reconcilemem 1m:2m:8m" instructs
Accelerator to start checking memory usage 1
minute after the job has begun, do it every 2
minutes and stop when the job has been running
for 8 minutes. Note: if actual memory usage
for a job exceeds requested RAM, the consumed
RAM resource for the job is increased whether
or not -reconcilemem is specified.
-rundir <dir> -- Specify a different run directory (default ".")
If the <dir> specification is quoted by single
quotes, the directory is taken exactly as given,
instead of being canonicalized. When using -rundir
with the SNAPSHOT environment, the -ep argument
must also be passed. Implies -D.
-set <setName> -- Assign the job(s) to the given set.
-sg <subgroup> -- Specify a subgroup for fairshare for the current user
-splitstderr -- Write the stderr output of the interactive job to
stderr. Default is to write the job's stderr output
to stdout. Note that using this option will probably
result in garbled terminal output due to interleaving
of stdout and stderr outputs.
-tool <toolName> -- Specify a "toolName" different from the tail of the
first command line argument. The argument must be
less than 100 characters long and contain only
alphanumeric chars.
-x | -xdur <xdur> -- Set the expected duration of the job.
-deadline <duration> -- Job is expected to be completed within the
given duration.
Set it to zero to disable it (the default).
-deadlineat <time> -- Job is expected to be completed before the
given time.
The time is parsed by the Tcl command
[clock scan $time]
Set it to zero to disable it (the default).
SUBMISSION OPTIONS:
-after <time> -- Fire job after specified time.
-array <n> -- Submit a jobarray of 'n' repeated commands
Some fields may contain the strings @INDEX@, @JOBID@,
and @ARRAYID@, which are substituted when the array
is created.
These fields are: command, env, wd, toolname, jobname
The output files are also subject to the same
substitutions.
Three comma-separated formats for <n> are supported:
last
first,last
first,last,increment
-at <date> -- Specify earliest date to fire job
The date is parsed by the Tcl command
[clock scan $date]
-atomic -- Create job array using a single RPC between
client and server.
-dp N -- Run a Distributed Parallel (DP) job requiring N
components.
-dpactive <n> -- The n-th component is the one that becomes active
(default 1).
-dpres RESLIST -- Specification of the resources required by a parallel
job. Example: -dpres "RAM/200 CORES/2"
See vovcreatepartialjobs for more info.
-dpres+ <rsrc> -- Append one resource to the distributed processing
resource list.
No termination necessary.
-dpres1 RESLIST
-dpres2 RESLIST
-dpres<n> RESLIST -- Specify resources for a component <n> of a DP job.
The number <n> is in the range from 1 to <N>
(option -dp)
Example: -dpres1 "RAM/200"
-dpres2 "CORES/4 RAM/10"
-dpwait TIMESPEC -- The time the components wait to rendezvous
(default 30s, minimum 3s). The wait is increased with
each attempt. The maximum wait is controlled by the
property DP_WAIT_MAX
-dpnocohortwait -- Partial jobs may exit without waiting for primary.
-dpinitialport N -- Specify starting port on which partial jobs should
attempt to communicate.
-D -- Do not check the validity of the directory.
-f <file> -- Get a list of commands from a file, one per line.
Jobs are created and then scheduled in blocks
of 200 jobs (unless otherwise specified by -fb).
-fb <n> -- Change the size of blocks of jobs scheduled with -f
(default 200).
-fw <S> -- Specify delay between blocks of jobs, in seconds.
Value must be >= 0, default is 0. Use with -f.
-dribble -- Shorthand for -fb 1 -fw 0.1
-F -- Force running of job even if it is already valid.
This is useful only if you are also using option -l
to set the name of the log file, otherwise this
option has no effect.
-multiphase <N> -- Set multiphase mode:
N=0 not multiphase (default)
N=1 multiphase
If a job will have more than 4 phases, then the
"-maxresched N" option must also be specified, where
N is the number of phases the job will have.
LOGFILE AND OTHER DEPENDENCIES OPTIONS:
-dep <Id|Name> -- Specify a dependency on the list of jobs.
-d <Id|Name> The argument can be a list of job Ids or job names.
In the case of job names, the dependency is looked
for in the set of jobs belonging to the submitting
user.
The current job will not start until the
specified jobs have completed successfully.
May be repeated.
Performance note: dependencies on job names are much
slower than dependencies on job ids.
-depset <Name> -- Specify a dependency on all jobs in the named set at
the time of submission. If other jobs are added to
the set later, they will not be added to the
dependencies. May be repeated.
-forcelog -- Force the declared output log to be the output of
-force this job. If another job was declaring the same
output, it will become black (SLEEPING).
-forcedequeue -- Force the declared output log to be the output of
this job. If any job was declaring the same
output, the upcone of all the jobs producing this
file will be stopped and dequeued, and those jobs
will become black (SLEEPING).
-i <in_file> -- Specify an input dependency.
-l <logfile> -- Specify name of logfile.
As with -rundir, if the <logfile> is quoted with
either " or ', then the name is taken literally
and not canonicalized.
Quoted or not, variable substitution on the file name
is performed for the following variables
@JOBID@ -> Id of job.
@ARRAYID@ -> Id of job array (if applicable).
@DATE@ -> ISO_TIMESTAMP
@UNIQUE@ -> %Y%m%d_%H%M%S.SUBMISSION_PID
@JOBCLASS@ -> job class (the alphanumeric part)
@JOBNAME@ -> job name (the alphanumeric part)
You may need to use -forcelog together with -l.
Timestamp in format '%Y%m%d_%H%M%S.SUBMISSION_PID'
will be added to the logfile name for array jobs
when '@UNIQUE@' is not present in the logfile name.
-n -- Use no wrapper (default: use 'vw').
-nolog -- Do not keep a log.
-o <out_file> -- Specify an output dependency.
-P <NAME=VALUE> -- Add the given property to the jobs (may be repeated).
-s -- Declare that the logfile is SHARED (see docs).
You rarely need this option. If misused, this option
causes extra buckets to be created in the scheduler.
Probably you need '-forcelog -F' instead.
-uniqueid -- Force NC to use a unique new
VovId for each job submission,
even when the same job is submitted multiple times.
-wrapper <W> -- Use specified wrapper '<W>' (default: use 'vw').
E-MAIL NOTIFICATION AND WAIT OPTIONS:
-m -- Send me mail upon job completion.
-M <mail rule> -- Send mail according to the given rule (see docs).
-w -- Wait for the job(s) to finish: do not show any log.
For the meaning of the exit status, check nc wait.
-wl -- Wait for the job(s) to finish: show the log of
the last job.
For the meaning of the exit status, check nc wait.
EXTRAS:
-nodb -- The job is not stored in the jobs log or in the
database.
-nopolicy -- For ADMIN only. Disable the policy layer.
EXAMPLES:
% nc run sleep 10
% nc run -autokill 30m sleep 10000000
% nc run -r SLOT/4 -xdur 500 -deadline 1h sleep 500
% nc run -r SLOT/4 -xdur 500 -deadlineat 10am sleep 500
% nc run -array 10 sleep 1 # submit 10 sleep jobs via a job array
% nc run -array 10,200,10 sleep 1 # submit sleep jobs with index from 10 to 200, step 10
% nc run -g /teams/chipA -sg session12 sleep 1
% nc run -G /teams/chipA.any sleep 1
% nc run -C longjobs sleep 10000
% nc run -C longjobs -r+ RAM/200 sleep 10000
% nc run -r unix -- sleep 10
% nc run -p high sleep 10
% nc run -e BASE -p h sleep 10
% nc run -e SNAPSHOT+SIM -p h sleep 10
% nc run -m sleep 10; # email when job finishes.
% nc run -M ":ERROR" sleep 10; # email only if the job fails
% nc run -dp 3 -dpres sun7,linux vovparallel clone sleep 10
% nc run -at 6pm sleep 10
% nc run -at "tomorrow 6pm" sleep 10
% nc run -after 10m sleep 10
% nc run -forcelog -F -l mylog.txt ./myjob
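The options described above can also be combined. The sketches below use placeholder job and file names: the first submits a job array with a per-job log file (using the @JOBID@ substitution documented for -l), and the next two form a simple dependency chain, assuming the name given with -N is the name matched by -dep:
% nc run -array 5 -l log.@JOBID@.txt -- sleep 1
% nc run -N prepdata -- sleep 10
% nc run -dep prepdata -- sleep 1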
Default Output of nc run
The default output of nc run includes the following information:
- The resource list assigned to the job, which can be controlled with the option -r.
- The environment used for the job, which can be controlled with the option -e.
- The command line.
- The log file used to store both stderr and stdout of the command, which can be controlled with the option -l.
- The JobId assigned by Accelerator to this job. JobIds are used as handles with many of the Accelerator commands.
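For example, a submission that sets each of these items explicitly might look like the following; the resource value, environment name, and log file name are placeholders for illustration:
% nc run -r RAM/512 -e BASE -l run1.log -- sleep 10
The JobId reported for this submission can then be used as a handle with other nc commands.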