Managing Jobs on Quest

The sections below show you how to manage your batch jobs after they have been submitted. Please note: the scheduler records a job’s resource requirements at submission time. Any changes made to the job script (jobscript.sh in the example above) after the job has been submitted with msub will have no effect on the job. After submitting a job, you can only hold, release, cancel, or modify it using the Moab commands listed below.

Job ID

When you submit a job on Quest using msub, the scheduler queues the job for execution and returns its job ID.

For example, if a user submitted jobscript.sh using MSUB:

[psi391@quser03 Cone]$ msub jobscript.sh

If the job was submitted successfully, a job ID will be returned:

13870586

You can use this job ID later to monitor the job.
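
To reuse the job ID in later commands, it can help to capture it in a shell variable at submission time. The sketch below is illustrative only: the helper name extract_job_id is hypothetical, and it assumes that msub prints the numeric job ID on a line of its own, possibly surrounded by blank lines or whitespace.

```shell
# Hypothetical helper (not part of Moab): read msub output on stdin and
# print the first line that consists only of digits, which is taken to be
# the job ID. Surrounding whitespace is stripped.
extract_job_id() {
    grep -E '^[[:space:]]*[0-9]+[[:space:]]*$' | head -n 1 | tr -d '[:space:]'
}

# On a Quest login node, the helper could be used like this:
#   job_id=$(msub jobscript.sh | extract_job_id)
#   echo "Submitted job $job_id"
#   checkjob -v "$job_id"
```

If the variable ends up empty, the submission most likely failed and the msub error message should be checked instead.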

Job Status

After submitting a job, you can execute the showq command or the checkjob command to check the status of your job. On Quest, submitted jobs are analyzed and queued by the scheduler. When a job is sent to the scheduler, it is first checked by a resource manager. The resource manager ensures that you have enough resources, such as compute hours or storage, on the system in order to run your job.

If enough resources exist in your allocation, the job is forwarded to the scheduler to be put in queue. It is important to note that if there is a typo in your job submission script, it may be flagged by the resource manager and your job will be rejected and placed on BatchHold.

When the scheduler receives a job, it will prioritize your job relative to other jobs currently in the queue. The accounting system assumes that your job will run with the amount of time and number of cores that you specified in your job submission script. If your job requires less time than you specified, the accounting system only charges you for the time used on the system.

If you lack enough resources to run your job, it will be placed on BatchHold or SystemHold. Jobs in a BatchHold or SystemHold state will remain in that state until you cancel the job or a system administrator intervenes, either by adding enough resources for your job to run or by redirecting your job to another account that you can access. If your job is under a BatchHold or SystemHold and you need assistance from a system administrator, please contact quest-help@northwestern.edu.

Generally, the more resources a job requires, the longer it may sit in the queue until the necessary resources become free and the job can be scheduled. Full access nodes are dedicated resources, so the access criteria, queues, job duration, and job size limits for these nodes are different.

Commonly Used Commands

The showq Command

The showq command (without any options) displays the job queues for all users on Quest.  To quickly access information about your specific job(s), there are options to filter the results:

Command                    Description
-------------------------  ---------------------------------------------
showq -r                   Show running jobs
showq -i                   Show idle jobs
showq -b                   Show blocked jobs
showq -w acct=accountID    Show only jobs belonging to account specified
showq -w user=userID       Show only jobs belonging to user specified

For more information, type showq --help.
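
The user filter from the table above can be wrapped in a small shell function so that it does not have to be retyped. This is a minimal sketch under two assumptions: that your Quest user ID matches the shell's $USER variable, and that showq accepts a state flag such as -r together with -w. The function name my_jobs is hypothetical.

```shell
# Hypothetical wrapper (the name my_jobs is illustrative): list only your
# own jobs, optionally narrowed by one of the state flags from the table.
my_jobs() {
    # "$@" passes any extra showq options through unchanged (-r, -i, -b).
    showq -w "user=${USER}" "$@"
}

# Usage on Quest:
#   my_jobs        # all of your jobs
#   my_jobs -r     # only your running jobs
```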

The checkjob Command

The checkjob command displays detailed information about a submitted job’s status and diagnostic information that can be useful for troubleshooting submission issues. It can also be used to obtain useful information about completed jobs such as the allocated nodes, resources used, and exit codes. Northwestern IT recommends using the flag ‘-v’ to gather additional diagnostic information.  Example usage:

checkjob -v 123483384

where 123483384 should be replaced with your job ID. See the examples that demonstrate checkjob on a successfully running job and on a blocked job.

The mjobctl Commands

The Moab job control command (mjobctl) is used for holding, releasing, and canceling jobs, or changing the parameters of a submitted job. You can place your job in a “user hold” state after it has been submitted by using the mjobctl -h jobID command.

Jobs placed in a “user hold” state will appear in the output of showq and checkjob commands. You can then release your job by using the mjobctl -r jobID command.

Moab permits modification of some job parameters after job submission and before the job starts running. These parameters include:

  • Account
  • Queue
  • Job name
  • Wall clock limit

In general, the syntax for modifying an attribute is mjobctl -m attr=value jobID. Some examples are provided in the list below:

Command                          Description
-------------------------------  ----------------------------------
mjobctl -m reqawduration+=600    Add 10 minutes to walltime
mjobctl -h                       Place job on hold
mjobctl -r                       Release hold
mjobctl -m                       Modify job parameters
mjobctl -c                       Delete job
mjobctl -m account=ACCT          Change account to ACCT
mjobctl -m queue=short           Change queue to short
mjobctl -m depend=1000           Change job to depend upon job 1000
mjobctl -m userprio-=100         Reduce priority by 100
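
The walltime example above gives the increment in seconds (600 seconds for 10 minutes). The sketch below wraps that conversion in a helper so the extension can be given in minutes. The function name extend_walltime is hypothetical, and the sketch assumes the job has not started running yet, since modification is only permitted before that point.

```shell
# Hypothetical helper: add a number of minutes to a queued job's walltime.
# The increment is converted to seconds, matching reqawduration+=600 above.
extend_walltime() {
    job_id=$1
    minutes=$2
    mjobctl -m "reqawduration+=$((minutes * 60))" "$job_id"
}

# Usage on Quest, with the example job ID from this page:
#   extend_walltime 13870586 10   # runs: mjobctl -m reqawduration+=600 13870586
```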

For a complete listing of mjobctl options, see the official Moab documentation.

Last Updated: 5 September 2017
