The ICAT Job Portal supports the submission of jobs to run on datasets and datafiles selected via the ICAT Data Service.
A Job Type defines the properties of a particular kind of job that the IJP can submit. Each Job Type is defined in an XML file in the IJP job_types configuration folder. The key properties of a Job Type are:
In outline, you use the IJP like this:
When you load the IJP, or return to it after your session has expired, you will be asked to log in using an ICAT user account.
The initial display shows:
The Dataset Types list and the Submit Job button are disabled at this point. The IJP is "job-directed": you must first select a Job Type. If the Job Type specifies any Dataset Types for its jobs, these will be added to the Dataset Types list.
Some Job Types can be (and should be) run without any dataset or datafile selections (for example, a job that reports the current status of the IJP or the target platform(s)). If you select such a Job Type, the job details message indicates that this job runs without datasets or datafiles. The Dataset Types list remains disabled, but the Submit Job button is now enabled; clicking on it will proceed to the Job Options Panel (see later).
If the Job Type specifies one or more Dataset Types, these will be added to the Dataset Types selection list, and it will be enabled.
Select a Dataset Type from the list. The display will change, showing:
The Search subpanel shows a number of predetermined search options (taken from the IJP configuration), a "+" button to add further search options (which are based on the Dataset Type) and a Search button. Pressing Search finds datasets that are of the selected Dataset Type and that satisfy the search filters, and adds the results to the Matching Datasets table.
The first result of a search is selected by default; you can de-select it or add other datasets to the selection.
When a single dataset is selected, the Info panel displays more of its details, and the Download, Download URL and Show Info buttons can be used.
When one or more datasets are selected, and if the Job Type accepts datasets, the Add To Datasets Cart and Submit Job for Selected Datasets buttons are enabled. The Submit Job for Selected Datasets button submits a job (or multiple jobs) for the selected datasets.
If the Job Type does not accept datasets, it should still be possible to search for and select datafiles from within each matching dataset (see later).
The Add Selected Datasets To Cart button will add the selected datasets to the "global" datasets cart; this allows you to build up a "shopping cart" of datasets, in any of the following ways:
When datasets are added to the global cart, a separate Datasets Cart table appears, showing the cart contents, and with a "Submit job for these datasets" button. When the button is pressed, *all* of the datasets in the cart will be used in the job submission. It is possible to select one or more datasets in the cart, but only so that they can be removed using the "Remove selected datasets" button. "Remove All" can be used to empty the datasets cart, in which case it will disappear from the display.
Note that each dataset will only be added to the cart once, even if it is selected multiple times.
Also note that choosing a different Job Type will clear the datasets cart, even if the job type accepts the same dataset type(s).
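The cart rules described above (each dataset held once, the cart hidden when empty, the cart cleared when the Job Type changes) can be modelled as in the following minimal sketch. This is an illustration only, not the IJP's own code: the class name `DatasetsCart`, its methods, and the use of `(id, name)` pairs to represent datasets are all assumptions made for the example.

```python
class DatasetsCart:
    """Hypothetical model of the global datasets cart (not IJP source code)."""

    def __init__(self):
        # Maps dataset id -> dataset name; a dict keeps insertion order
        # and guarantees that each id appears at most once.
        self._items = {}

    def add(self, selected):
        """Add (dataset_id, name) pairs; ids already in the cart are skipped."""
        for dataset_id, name in selected:
            self._items.setdefault(dataset_id, name)

    def remove(self, dataset_ids):
        """Model of 'Remove selected datasets'."""
        for dataset_id in dataset_ids:
            self._items.pop(dataset_id, None)

    def clear(self):
        """Model of 'Remove All', and of choosing a different Job Type."""
        self._items.clear()

    def contents(self):
        """All datasets in the cart; the whole cart is used on submission."""
        return list(self._items.items())

    @property
    def visible(self):
        """The cart is only displayed while it has contents."""
        return bool(self._items)
```

In this model, adding the same dataset from two different searches leaves a single entry in the cart, and calling `clear()` makes `visible` false, matching the cart disappearing from the display.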
There is a global datafiles cart, similar to the datasets cart. It only appears when one or more datafiles have been added to it. Datafiles are added to the cart using a separate search panel.
If the Job Type accepts datafiles, then the table of Matching Datasets will have a column titled "Datafiles Selection", and each row will have a button of the form "Add (N in cart)", where N shows the number of datafiles from that dataset that are in the global datafiles cart.
Pressing this button for a particular dataset opens the Datafiles Selection Panel. This is a separate dialog that is somewhat similar to the datasets search / choice panels, containing:
The process here is similar to that for datasets:
Once the Current Selection list is as desired, pressing Add to Main Cart will add these datafiles to the global datafiles cart and close the dialog. Cancel will close the dialog without updating the global cart.
Back in the main IJP page, the Datafiles Cart will now appear, together with buttons to remove selected datafiles, to empty it, and to submit job(s) for its contents. (As with the Datasets cart, the full cart contents are used, regardless of any selection.)
If the Submit Job For These Datafiles button is pressed, no datasets will be passed to the job(s), even if there are datasets in the datasets cart, or if datasets are selected in the Matching Datasets list. Similarly, pressing the Submit Job For These Datasets button below the datasets cart will ignore the contents of the datafiles cart.
When both the datasets cart and the datafiles cart have contents, an extra button will appear at the bottom of the form (i.e. below the datafiles cart), titled "Submit Job for Cart (datasets and datafiles)". This is the only way to submit both datasets and datafiles to the same job run. The Job Type has to accept both datasets and datafiles before this will be possible.
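The relationship between the submit buttons and the inputs passed to the job can be summarised in a short sketch. The enum `SubmitButton`, its member names and the function `inputs_for` are hypothetical names introduced for this illustration; only the behaviour is taken from the description above.

```python
from enum import Enum, auto


class SubmitButton(Enum):
    """Hypothetical labels for the submit buttons on the main IJP page."""
    SELECTED_DATASETS = auto()  # Submit Job for Selected Datasets
    DATASETS_CART = auto()      # Submit job for these datasets
    DATAFILES_CART = auto()     # Submit Job For These Datafiles
    FULL_CART = auto()          # Submit Job for Cart (datasets and datafiles)


def inputs_for(button, selected_datasets, datasets_cart, datafiles_cart):
    """Return the (datasets, datafiles) pair that would be passed to the job."""
    if button is SubmitButton.SELECTED_DATASETS:
        return list(selected_datasets), []
    if button is SubmitButton.DATASETS_CART:
        # The datafiles cart is ignored.
        return list(datasets_cart), []
    if button is SubmitButton.DATAFILES_CART:
        # Datasets in the cart or selected in the table are ignored.
        return [], list(datafiles_cart)
    if button is SubmitButton.FULL_CART:
        # This button only appears when both carts have contents.
        if not (datasets_cart and datafiles_cart):
            raise ValueError("both carts must have contents")
        return list(datasets_cart), list(datafiles_cart)
    raise ValueError(f"unknown button: {button!r}")
```

Only the `FULL_CART` branch returns both datasets and datafiles, which reflects the statement that the combined button is the only way to pass both kinds of input to the same job run.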
Clicking on any of the Submit buttons on the main IJP page launches the JobOptions dialog.
If multiple inputs (datasets and/or datafiles) are selected, and if the job-type allows multiple inputs per job, this option allows you to choose to run one job for all the inputs, or separate jobs for each input.
Selection of a single dataset and a single datafile counts as "multiple inputs selected", as there will be two inputs.
Note that it may not be sensible to run a "multiple" job-type once per input. For example, the demo copy_datafile job requires precisely one dataset and one datafile to be selected; so the job-type specifies that it accepts multiple inputs, but the job won't run unless the inputs consist of a single dataset and a single datafile.
(By the time you read this, copy_datafile may have been expanded to take one dataset and multiple datafiles, and to copy all of the datafiles into the dataset; but it will still make no sense to run one job per input, as one instance will receive the dataset but no datafiles, and the other instances will receive one datafile but no target dataset!)
If multiple inputs are selected, but the selected job-type does not support multiple inputs per job, then the option will ask you to confirm (via a checkbox) that you would like to run multiple instances of the job, one per selected dataset or datafile.
If only a single input (dataset or datafile) is selected, this option does not appear.
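The rules for turning a set of selected inputs into one or more jobs can be sketched as follows. The function `plan_jobs` and its parameter names are hypothetical. Raising an error when the confirmation checkbox is left unticked is an assumption, since the text above does not say how the IJP reacts in that case.

```python
def plan_jobs(inputs, accepts_multiple, one_job_for_all=True, confirmed=False):
    """Return a list of jobs, each job being the list of inputs it receives.

    inputs           -- the selected datasets and/or datafiles, in any mix
    accepts_multiple -- whether the job type allows multiple inputs per job
    one_job_for_all  -- the user's choice when accepts_multiple is true
    confirmed        -- the confirmation checkbox when it is false
    """
    inputs = list(inputs)
    if len(inputs) <= 1:
        # With zero or one input there is nothing to choose: one job.
        return [inputs]
    if accepts_multiple:
        if one_job_for_all:
            return [inputs]
        return [[item] for item in inputs]
    # The job type takes one input per job, so several jobs are needed.
    if not confirmed:
        # Assumption: an unconfirmed request is rejected.
        raise ValueError("running multiple jobs must be confirmed")
    return [[item] for item in inputs]
```

For example, `plan_jobs(["dataset-1", "datafile-7"], accepts_multiple=True)` yields a single job with both inputs, whereas passing `one_job_for_all=False` yields two jobs with one input each. The second form is the case the copy_datafile discussion warns about.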
The remainder of the options in the dialog are determined by the job type specification, which defines the set of options and their input types. Each option type has its own input elements; the currently-supported option types and their input forms are:
If the option has default, min or max values, these are shown on the form. The form does not prevent you from entering a value outside the min/max range, but an error will be reported when you try to submit the job.
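A range check applied at submission time, rather than while typing, might look like the sketch below. The function name `validate_option` and the error message wording are invented for the example, and treating the min and max bounds as inclusive is an assumption.

```python
def validate_option(name, value, minimum=None, maximum=None):
    """Return a list of error messages for one numeric job option.

    An empty list means the value is acceptable. Bounds that are None
    are not checked. Bounds are assumed to be inclusive.
    """
    errors = []
    if minimum is not None and value < minimum:
        errors.append(f"{name}: {value} is below the minimum of {minimum}")
    if maximum is not None and value > maximum:
        errors.append(f"{name}: {value} is above the maximum of {maximum}")
    return errors
```

A submission step would collect the errors for every option and refuse to submit the job if any list is non-empty.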
In the job type specification, each job option can have a condition that depends on the dataset parameters; the option will only be made available if the condition is true for *all* selected datasets. Datafiles are not considered at all; one consequence of this is that if *no* datasets are selected (only datafiles), then any options that have a condition will not be made available.
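The availability rule can be expressed compactly, as in this sketch. Here a condition is modelled as a Python callable over a dictionary of dataset parameters; this is purely an assumption for illustration and does not reflect how conditions are actually written in a job type specification. The function name `option_available` and the parameter name `channels` are likewise hypothetical.

```python
def option_available(condition, selected_datasets):
    """Decide whether a job option should be offered.

    condition         -- None, or a callable taking a dict of dataset
                         parameters and returning True or False
    selected_datasets -- one parameter dict per selected dataset;
                         datafiles play no part in the decision
    """
    if condition is None:
        # Options without a condition are always offered.
        return True
    datasets = list(selected_datasets)
    if not datasets:
        # No datasets selected (for example, datafiles only):
        # conditional options are not offered. This check is explicit
        # because all() over an empty sequence would return True.
        return False
    return all(condition(params) for params in datasets)


def multi_channel(params):
    """Example condition using a hypothetical 'channels' parameter."""
    return params.get("channels", 0) > 1
```

With `multi_channel` as the condition, an option is offered when every selected dataset has more than one channel, and withheld if any dataset fails the test or if no datasets are selected.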
The IJP is configured with details of one or more batch servers to which it can submit jobs.
Pressing the Submit button will send the job (or multiple jobs, if so requested) to the batch server(s). For each job, the IJP will request an estimate from each batch server, and will choose one of the servers that returns the best estimate.
A dialog shows the job-id obtained from the batch server. The id should be visible in the Job Status Panel.
Note that users cannot directly select a particular batch server. However, some batch servers may not be able to run some job types, in which case they should return an estimate that is effectively "infinity".
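The server selection step can be illustrated with the sketch below. It assumes that a lower estimate is better, which the use of "infinity" for servers that cannot run a job suggests but the text does not state outright. The function name `choose_server`, the dictionary of estimates, the random tie-break, and the error raised when no server is usable are all assumptions.

```python
import math
import random


def choose_server(estimates):
    """Pick a batch server given a mapping of server name -> estimate.

    Servers that report an infinite estimate are treated as unable to
    run the job. Among the rest, one of the servers sharing the lowest
    estimate is returned.
    """
    usable = {name: est for name, est in estimates.items() if math.isfinite(est)}
    if not usable:
        # Assumption: the real IJP may handle this case differently.
        raise RuntimeError("no batch server can run this job")
    best = min(usable.values())
    candidates = [name for name, est in usable.items() if est == best]
    # Any server with the best estimate is acceptable.
    return random.choice(candidates)
```

For instance, given estimates of 12.0 for one server and `math.inf` for another, the first server is always chosen. If two servers both report 12.0, either may be returned.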
On the main IJP panel, clicking the "Show job status panel" button opens the Job Status panel in an overlay. This lists all jobs submitted by the current user, ordered by time of submission. For each job, the list shows the job's ID, name, submission date/time and status. The status can be one of the following:
The panel is automatically refreshed; the IJP checks the status of each uncompleted job with the batch servers at regular intervals.
At the top of the panel are a number of buttons:
Refresh Job Status checks the status of each uncompleted job (though the auto-refresh interval is small enough that this should rarely be required).
The remaining buttons apply to the currently-selected job.
Display Job Output shows the (standard) output from the job. If the job status is Queued, there will be no output yet. A job that is Executing may have output, if the batch server supports it (and if the job has produced any output at that point). A job that has Completed may have produced standard output.
The nature of the output depends on the batch server. For example, the unixbatch server wraps a small amount of output (timestamps, returned status value) around whatever output the job itself produces; on Scarf, Platform LSF wraps the job output with more detailed information.
The job output display is refreshed at regular intervals, so it can be left open to monitor the progress of a job (assuming that the job and the batch server support it).
Display Job Error displays any error output (in unix terms, stderr output) from the job. As with non-error output, there will be no output for a job that is Queued, and output for other states depends on the batch system and the job itself.
Cancel Job can be used to cancel a job, but only if it has not yet Completed. The behaviour in other cases depends on the batch server, but typically a Queued job will be removed before it starts execution, and an Executing job will be halted. Any standard or error output produced will be visible via the Display Job Output / Error buttons.
It is possible that a job whose status appears as Queued has started execution by the time the cancel request reaches the batch server.
Delete Job removes the job details (including its standard and error outputs) from the IJP. It is only possible to delete jobs once they are Completed or Cancelled. Jobs remain visible in the Status Panel until they are Deleted.
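The rules for when Cancel Job and Delete Job apply can be captured as a pair of status checks. The sketch uses only the four status names mentioned in this section (Queued, Executing, Completed, Cancelled); the actual set of statuses may be larger. The function names are hypothetical, and treating an already Cancelled job as not cancellable is an assumption.

```python
# Status names as they appear in this section; the real list may differ.
CANCELLABLE = frozenset({"Queued", "Executing"})
DELETABLE = frozenset({"Completed", "Cancelled"})


def can_cancel(status):
    """Cancel Job applies only to jobs that have not yet finished."""
    return status in CANCELLABLE


def can_delete(status):
    """Delete Job applies only once a job is Completed or Cancelled."""
    return status in DELETABLE
```

Note that a cancel request for a job shown as Queued may still reach the batch server after the job has started executing, so a check like `can_cancel` reflects the status the IJP last saw rather than the job's state on the server.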
Job outputs and other details are stored in the IJP's filespace. It is possible that a future policy may be to remove "very old" jobs from the system to release space.