The Dashboard application gives a bird's-eye view of the currently running and queued jobs, along with the overall status of the JobServer scheduling engine and related resources such as Queues and Agents. It quickly shows you which Partitions and Agents are being used and lets you drill down to see which jobs are running and where they are running.
From this tool you can create, edit, and view the Partitions available in JobServer. Partitions are a way of managing the resources available to JobServer. A Partition defines a set of resources, including the maximum number of jobs that can run at any one time. When a job is ready to run, it is placed into the Partition's corresponding Queue. Once in the Queue, the job waits until the Partition has free resources, that is, until a thread is available to begin processing it. Jobs are moved from the Queue into a running state based on a priority scheme that uses a first-in, first-out algorithm. Jobs with higher priority are queued and run ahead of jobs with lower priority.
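The queueing behavior described above can be illustrated with a minimal sketch. It is not JobServer's actual implementation: the class and method names are hypothetical, and it assumes that a larger priority number means a more urgent job and that jobs of equal priority leave the Queue in arrival order.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical model of a Partition's Queue; not JobServer's actual implementation.
final class PartitionQueueSketch {

    // One waiting job. Assumption: a larger priority value means "more urgent".
    private static final class QueuedJob {
        final String name;
        final int priority;
        final long sequence; // arrival order within this Queue

        QueuedJob(String name, int priority, long sequence) {
            this.name = name;
            this.priority = priority;
            this.sequence = sequence;
        }
    }

    // Highest priority first; within the same priority, earliest arrival first (FIFO).
    private static final Comparator<QueuedJob> RUN_ORDER = (a, b) ->
            a.priority != b.priority
                    ? Integer.compare(b.priority, a.priority)
                    : Long.compare(a.sequence, b.sequence);

    private final PriorityQueue<QueuedJob> queue = new PriorityQueue<>(RUN_ORDER);
    private long nextSequence = 0;

    // Places a job that is ready to run into the Queue.
    void enqueue(String name, int priority) {
        queue.add(new QueuedJob(name, priority, nextSequence++));
    }

    // Returns the name of the next job to run, or null if the Queue is empty.
    String dequeue() {
        QueuedJob job = queue.poll();
        return job == null ? null : job.name;
    }

    public static void main(String[] args) {
        PartitionQueueSketch q = new PartitionQueueSketch();
        q.enqueue("nightly-report", 1);
        q.enqueue("billing-run", 5);
        q.enqueue("cache-warmup", 1);
        String next;
        while ((next = q.dequeue()) != null) {
            System.out.println(next);
        }
        // prints billing-run, nightly-report, cache-warmup (one per line)
    }
}
```

The explicit sequence number is what keeps equal-priority jobs in arrival order; a plain priority heap does not guarantee that on its own.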
Please note that if you have view-only permissions for this module, some of the advanced configuration functions in this tool will not be available to you. Also, some features, such as Agents, are available only to JobServer Professional users. Here are the major features available from this tool:
Scheduling Engine Status
The Dashboard shows high-level status of all available Partitions and Agents along with
the status of the JobServer Scheduling Engine. Here are the possible values of the Scheduling Engine:
Show Partition Details
If this checkbox option is selected, you can see how Agents are being used and which
Partitions they are associated with. Partitions can use one or more Agents in order to run jobs on remote
machines. Using Agents allows a Partition to increase its job processing capacity. Selecting this option
lets you see where the jobs are running. Jobs can be running on remote Agents or on the
main JobServer host machine. If jobs are running on the main JobServer host, you will see the "Agents Used"
rows marked as "primary" or "secondary".
"primary/secondary" means that jobs are running on the main JobServer engine and not on
a remote Agent. If you are not using Agents and are not using JobServer Pro edition, you will not
have access to this feature.
This Dashboard allows the user to view the current status of all available Partitions and their processing state. It shows the Partition's name, the current number of running jobs, maximum number of jobs allowed, whether the Partition's Job processing is enabled or disabled, and how many jobs are in the Queue waiting to be run.
Each row in the table shows an available Partition. If you show the Agent details, you will also see each Agent associated with the Partition and the state of each Agent relative to that Partition. Each row contains the following columns/cells and related information:
Partition Name
Click on this column/cell for a Partition to go to the "Edit Partition" screen, where you can
edit the Partition's properties and resources. You can edit the
Partition's job processing status (enabled or disabled) and set the maximum number
of running jobs allowed, along with other features.
If you are using the JobServer Pro version, the Partition edit screen allows you to add Agents to and remove Agents from the Partition. Agents are a way to add processing capacity to a Partition and allow you to distribute load across multiple machines. You can enable/disable an Agent for a Partition. A disabled Agent will not receive new jobs to run. You can also set the maximum number of concurrent jobs that the Agent is allowed to run.
Running Jobs
This column/cell shows the number of currently running jobs. A value of "NA"
means that JobServer is not running and is in an idle state. If JobServer is running,
it will show something like "2 / 6", for example. The first number is the number of jobs actively running
in the Partition. The
second number is the maximum number of jobs allowed to run on the Partition. With JobServer Professional edition,
you may see something slightly different that looks like
this: "2 / 6 (10)". The last number in parentheses, "(10)", is the hard upper limit on the
number of jobs that can run in the Partition. The user sets this limit as the Partition's
maximum capacity.
The second number in this example, "6", is
the effective maximum number of jobs allowed and is a size "calculated" by the Partition.
This effective maximum value is "calculated" because when you are using remote Agents with a Partition, the
actual job processing capacity may vary depending on whether Agents are available. For example, an Agent
host computer may become unavailable due to many factors. If this happens, the Partition will detect
this and no longer send jobs to the Agent until the Agent is available again.
So the second number, "6" in this example, is the actual effective size and
can be different from the user-entered maximum size because remote Agents may not always be
available for job processing. The effective maximum size is the sum of the job processing capacity of all
available Agents and of the "primary/secondary" engine,
and it indicates the true upper limit on the number of jobs allowed to run on a Partition at any
given point in time. As mentioned, this effective maximum size is dynamic and can change as Agent
status changes (Agents go offline and come back online). But
this number can never exceed the hard maximum capacity shown in parentheses. Note that
you will only see the Partition's maximum capacity limit (the number shown in parentheses) if it is
different from the effective maximum size. If they are equal, you will just see something like "3 / 6", for
example.
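As a rough illustration of how the three numbers in this cell relate, the sketch below computes an effective maximum and formats the cell text. It is an assumption-laden model, not JobServer code: the class, the `Source` type, and the method names are hypothetical, and "available" is taken to mean that a capacity source is both enabled and reachable.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the "Running Jobs" cell; names are illustrative only.
final class PartitionCapacitySketch {

    // One capacity source: a remote Agent or the local "primary/secondary" engine.
    static final class Source {
        final int maxJobs;
        final boolean available; // assumption: enabled and currently reachable

        Source(int maxJobs, boolean available) {
            this.maxJobs = maxJobs;
            this.available = available;
        }
    }

    // Sum of the capacity of all available sources, capped by the hard maximum.
    static int effectiveMax(int hardMax, List<Source> sources) {
        int sum = 0;
        for (Source s : sources) {
            if (s.available) {
                sum += s.maxJobs;
            }
        }
        return Math.min(sum, hardMax);
    }

    // The hard maximum is shown in parentheses only when it differs from the effective maximum.
    static String runningJobsCell(int running, int effectiveMax, int hardMax) {
        String cell = running + " / " + effectiveMax;
        return effectiveMax == hardMax ? cell : cell + " (" + hardMax + ")";
    }

    public static void main(String[] args) {
        List<Source> sources = Arrays.asList(
                new Source(2, true),   // local "primary/secondary" engine
                new Source(4, true),   // Agent that is online
                new Source(4, false)); // Agent that is currently unavailable
        int effective = effectiveMax(10, sources);
        System.out.println(runningJobsCell(2, effective, 10)); // prints 2 / 6 (10)
    }
}
```

When the unavailable Agent comes back online in this example, the sum reaches 10 and the cell collapses to "2 / 10".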
If there are running jobs, the first number will show a count greater than zero, and you can see the details of which jobs are running by clicking on the cell for the given Partition. This launches a popup that shows which jobs are running and gives some basic information on when and how the jobs were started. You also have the option of requesting that a running job be killed/terminated from this tool. JobServer terminates the job if and when it reaches a safe check point, so the job may not respond immediately to the kill request. The job is not guaranteed to be killed if it does not reach a safe check point before it completes normally. However, if the job was run in its own JVM, it will be killed immediately.
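The "safe check point" behavior resembles the general cooperative-cancellation pattern, sketched below. This is not JobServer's mechanism or API: the class and its methods are hypothetical, and the sketch only shows why a kill request takes effect at the next check point and has no effect if it arrives after the last one.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of cooperative cancellation; not JobServer's API.
final class SafeCheckpointSketch {

    private final AtomicBoolean killRequested = new AtomicBoolean(false);

    // Called by the operator (for example, from a dashboard) to ask the job to stop.
    void requestKill() {
        killRequested.set(true);
    }

    // Runs up to totalSteps units of work and returns how many were completed.
    // requestKillAtStep simulates a kill request arriving right after that step
    // finishes; pass 0 for "no kill request".
    int run(int totalSteps, int requestKillAtStep) {
        int completed = 0;
        for (int step = 1; step <= totalSteps; step++) {
            // Safe check point: between units of work, never in the middle of one.
            if (killRequested.get()) {
                break;
            }
            completed++; // one unit of work
            if (step == requestKillAtStep) {
                requestKill();
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        System.out.println(new SafeCheckpointSketch().run(5, 2)); // prints 2
        System.out.println(new SafeCheckpointSketch().run(5, 5)); // prints 5
    }
}
```

In the second call the request arrives after the final check point, so the job completes normally, which mirrors the caveat in the paragraph above.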
If you are using Agents, you can also see the "Agents Used" row for each Partition. From this you can see how many jobs from this Partition are running on a particular Agent. If you click on the link, you will be taken to a popup that shows you the details of the jobs running on that Agent/Partition combination. From there you can also kill running jobs if you like.
Job Processing Status
This column shows whether the Scheduler associated with this Partition
is capable of scheduling jobs (enabled or disabled).
When disabled, jobs that are ready to run will not run until the Scheduler is enabled again. This lets you enable
and disable job scheduling on a Partition-by-Partition basis. For the Agent rows, this column indicates
whether the Agent is allowed to accept jobs for this Partition.
This column shows the number of queued jobs associated with the Partition. To view the details of the queued jobs, click on the highlighted link in the cell. This brings up a popup window that lets you see the specific jobs that are in the Queue for this given Partition (and all Partitions). You also have the option of deleting queued jobs and editing their ordering in the Queue.
The Edit Partition screen lets you edit the major attributes and resources associated with a Partition. If you are using JobServer Standard edition, you can edit the maximum number of concurrent jobs along with the job processing status for this Partition.
Max Jobs Allowed
Controls the maximum number of jobs that the Partition can run concurrently. Changing this
value increases or decreases the Partition's concurrent job limit.
Job Processing Status
If a Partition's Job Processing is disabled, jobs that are in the Queue will remain in the Queue.
Even when a Partition is disabled, jobs can still get scheduled and placed into the Queue;
however, they will remain waiting in the Queue until the Partition is enabled again.
If a Partition is disabled while jobs are already running in it, the jobs that are already running
will continue to run until completion, but newly scheduled jobs will remain in the Queue until the Partition
is enabled again.
If you are running JobServer Professional edition, you have additional options. With the Pro edition, you can assign any number of available Agents to a Partition. This allows a Partition to run jobs on multiple remote Agent servers, along with running jobs locally on the "primary/secondary" Partition machine. By default you will always have a "primary/secondary" resource to run jobs on. Using Agents is optional. To use Agents, you must enable them by selecting the check box "Use remote agents". If this feature is not available or not selected, you can only run jobs on the main JobServer processing engine. You can enable/disable and set the maximum concurrent jobs for each Agent/Partition combination. Note that you can allocate more concurrent job capacity across Agents than you can use, but your actual maximum size is limited by the value set by "Max Concurrent Jobs" at the Partition level. For redundancy, it is recommended to allocate more Agent job processing capacity than you need; this way, if a single Agent goes down, the Partition will have backup capacity to use on other Agents. This is just an example of a strategy you can use and is not required.
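One way to reason about the redundancy strategy above is to check whether the Partition could still reach its "Max Concurrent Jobs" value after losing any single Agent. The helper below is a hypothetical planning aid, not part of JobServer; the class name, method name, and parameters are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical planning helper; not part of JobServer.
final class AgentRedundancySketch {

    // Returns true if the combined capacity still covers maxConcurrentJobs
    // after any one Agent becomes unavailable.
    static boolean survivesSingleAgentLoss(int maxConcurrentJobs,
                                           int localCapacity,
                                           List<Integer> agentCapacities) {
        int total = localCapacity;
        for (int capacity : agentCapacities) {
            total += capacity;
        }
        if (total < maxConcurrentJobs) {
            return false; // not enough capacity even with every Agent online
        }
        for (int capacity : agentCapacities) {
            if (total - capacity < maxConcurrentJobs) {
                return false; // losing this Agent would drop below the Partition limit
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Partition limit 10, local capacity 2, three Agents of 4 jobs each.
        System.out.println(survivesSingleAgentLoss(10, 2, Arrays.asList(4, 4, 4))); // prints true
        // Same limit with only two Agents: losing one leaves 6 slots.
        System.out.println(survivesSingleAgentLoss(10, 2, Arrays.asList(4, 4)));    // prints false
    }
}
```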
Alert Emails
Alerts are sent to the email addresses listed when a job that is part of a Partition
encounters any kind of unexpected failure. This allows
a person or group of people to be notified if anything exceptional goes wrong with any job
within a Partition.
Job alerts notify users when a job failure occurs during processing. These are typically
failures associated with the Job/Tasklet throwing an unexpected exception that may result in the Tasklet
or job failing to continue processing. For example, an uncaught
out-of-memory exception or SQL exception would constitute such a situation. An alert is also
sent when a Job/Tasklet throws TaskletFailureException.
Note that errors and warnings logged via Log4J or the Java Logging API do not trigger an email alert.
The email alerts use a cascading mechanism. An alert is first sent to the email
address listed at the system level. It is then sent to the email addresses defined for the
job's Partition, then to the job's Group alert addresses, and finally
to the alert email addresses defined for the specific job. With this design you can
set up a hierarchy of email alerts. For example, you can set it up so that you
receive emails only when a specific job fails, or when any job in a particular Partition fails, and so on.
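The cascade order can be modeled as a simple ordered merge of the four address lists. The sketch below is illustrative only: the class and method are hypothetical, and the removal of duplicate addresses is an assumption, since the text above does not say how JobServer treats an address that appears at more than one level.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the alert cascade; not JobServer's implementation.
final class AlertCascadeSketch {

    // Collects recipients in cascade order: system, Partition, Group, then job.
    // Assumption: an address listed at several levels is alerted only once.
    static List<String> recipients(List<String> system,
                                   List<String> partition,
                                   List<String> group,
                                   List<String> job) {
        Set<String> ordered = new LinkedHashSet<>(); // keeps first-seen order
        ordered.addAll(system);
        ordered.addAll(partition);
        ordered.addAll(group);
        ordered.addAll(job);
        return new ArrayList<>(ordered);
    }

    public static void main(String[] args) {
        List<String> result = recipients(
                Arrays.asList("ops@example.com"),
                Arrays.asList("batch-team@example.com"),
                Collections.<String>emptyList(),
                Arrays.asList("ops@example.com", "job-owner@example.com"));
        System.out.println(result);
        // prints [ops@example.com, batch-team@example.com, job-owner@example.com]
    }
}
```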
A Tasklet may also programmatically trigger alerts by using the SOAFaces API. Refer to the
API TaskletOutputContext.sendAlert().
Partition JVM Configuration Options
In a Partition, you have the option to configure jobs to run in their
own dedicated, external JVM. Jobs in the Partition can
either run in an external JVM (each job runs in its own dedicated JVM, separate from the main JobServer process) or
run in the same JVM as the JobServer process. Isolating jobs in their own JVM can be useful in
situations where you need to limit the possibility of a misbehaving job negatively impacting the rest
of the shared system.
Also, jobs running in their own external JVM are easier to kill and destroy, but keep in mind, if you have a large
number of jobs running concurrently, having each one run in its own JVM can consume a lot of system memory.
If you have enough memory and related database resources, this will not be a problem.
You also have the option to let the individual job designer decide where to run jobs (shared or external JVM).
The job designer can then create the job and decide
where it should run. Alternatively, you can
force all jobs in the Partition to use only one of the possible JVM options (external or shared).
If you choose to use the external JVM option,
you can also limit
the maximum memory that the job and JVM can use, and you can pass additional custom JVM options to the JVM.
If you set the maximum memory of the JVM at the Partition level, the job designer will
not be able to increase the maximum JVM memory at the job level. Leaving the maximum JVM memory blank
at the Partition level allows the user editing the job to set any JVM maximum memory setting
they wish.
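The interaction between the Partition-level and job-level memory settings can be sketched as follows. This is a hypothetical model, not JobServer code: it assumes that a job may request less memory than the Partition limit but never more, represents a blank setting as null, and uses the standard JVM `-Xmx` option to express the resulting maximum heap size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of how an external JVM's options might be resolved.
final class ExternalJvmSketch {

    // Returns the maximum heap in MB, or null if neither level sets a value.
    // A null partitionMaxMb stands for a blank Partition-level setting.
    static Integer effectiveMaxMemoryMb(Integer partitionMaxMb, Integer jobMaxMb) {
        if (partitionMaxMb == null) {
            return jobMaxMb; // blank at the Partition level: the job setting is used as-is
        }
        if (jobMaxMb == null) {
            return partitionMaxMb;
        }
        // Assumption: the job may lower the limit but can never raise it.
        return Math.min(partitionMaxMb, jobMaxMb);
    }

    // Builds the JVM arguments: the heap limit (if any) followed by the custom options.
    static List<String> jvmArgs(Integer partitionMaxMb, Integer jobMaxMb, List<String> customOptions) {
        List<String> args = new ArrayList<>();
        Integer maxMb = effectiveMaxMemoryMb(partitionMaxMb, jobMaxMb);
        if (maxMb != null) {
            args.add("-Xmx" + maxMb + "m");
        }
        args.addAll(customOptions);
        return args;
    }

    public static void main(String[] args) {
        // The Partition caps memory at 512 MB; the job asks for 1024 MB.
        System.out.println(jvmArgs(512, 1024, Arrays.asList("-Dexample.flag=true")));
        // prints [-Xmx512m, -Dexample.flag=true]
    }
}
```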
You can add any number of Agents to a Partition. This allows you to distribute job processing capacity to remote Agent servers. You have the option to set the maximum number of jobs allowed to run for each Agent and Partition combination, and you can enable or disable each Agent for a particular Partition. By default there will always be a "primary/secondary" Agent that runs jobs on the local JobServer host machine. If you do not want to use Agents or do not have access to remote Agents, you don't need to concern yourself with this feature.
An Agent can't be removed from a Partition while it is running jobs or while it is still enabled. The Agent must be disabled, and must not have any jobs running on it, before you can disassociate it from a Partition.
Partitions can be added and removed through this screen. The user can create as many Partitions as allowed by their environment and available resources. Existing Partitions can be deleted only when JobServer is not running (in Idle state) and there are no jobs assigned to the Partition. The "RootPartition", however, can't be deleted because it is the default Partition.
The advanced options screen lets you configure some of the more advanced scalability options available in JobServer. You will only be able to access these options if you are using JobServer Professional. Note that some of these features require JobServer to be in an "Idle" state for the settings to take effect. This means performing a "jsshutdown" followed by a "jsstartup".
If you are using JobServer Professional, you have additional optional settings to configure. JobServer Professional has advanced features that allow an administrator to configure high-end scalability settings. By default, a single Scheduler resource is shared among all the Partitions. JobServer Professional, however, can be configured such that each Partition has its own private Scheduler. If your environment has a large number of jobs that run concurrently (e.g. thousands of jobs), this feature can extend the scalability of JobServer and allows for more fine-grained control over a Partition's configuration. Go to the "Advanced Options" screen to configure these advanced options.
Scheduler Scan Paths
You can set the number of scan threads that the main Scheduler uses to find and run jobs that are ready
to be scheduled. Increasing this number can improve the response times of the Scheduler, especially
when you have a large number of jobs that run at or around the same time. Note that the more
scan threads you use, the more system resources will be consumed. On a single-processor system, setting
the scan threads above a value of "2" may not gain you anything; however, on SMP and multi-core hardware it can
significantly improve scheduling response times and throughput. Under normal conditions you do not
need to concern yourself with this feature.
Do not edit this particular Scan Threads property unless you know what you are doing. This field controls the number of scheduler scan threads that will be assigned to a Partition's Scheduler. Increasing the scan threads can improve the Scheduler's response times, especially when there are a large number of concurrently scheduled jobs. Note that this does NOT increase the number of concurrently running jobs allowed; it only controls the number of internal threads that try to put jobs in a ready-to-run state. Consult the JobServer Support Team with questions about this advanced feature.
Scheduler Thread Per Partition
If you have a large number of jobs and Partitions, turning on this setting allows each Partition to have
its own dedicated Scheduler thread. This also allows each Partition to be controlled and managed
individually: each Partition/Scheduler pair has its own dedicated set of scan threads, and
its Scheduler can be enabled/disabled separately from other Scheduler/Partition pairs.
Database Resources Per Scan Thread
This feature essentially gives each scan thread its own database
connection with which to talk to the database.
This feature can improve Scheduler concurrency but can also
consume significant amounts of database resources, especially if you have a large number of
Scheduler scan threads and Partitions. With this feature turned off, all the scan
threads share the database connection of their parent Scheduler/Partition. This is the
default. Consult the JobServer Support Team with questions about this feature.
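To get a feel for how these advanced settings multiply database usage, the following back-of-the-envelope sketch estimates the number of Scheduler database connections. It is only an estimate under stated assumptions: the class and method names are hypothetical, every Scheduler is assumed to use the same number of scan threads, and connections used by other parts of JobServer are ignored.

```java
// Hypothetical estimator; not part of JobServer.
final class SchedulerResourceSketch {

    // One shared Scheduler by default, or one Scheduler per Partition when that option is on.
    static int schedulerCount(int partitions, boolean schedulerPerPartition) {
        return schedulerPerPartition ? partitions : 1;
    }

    // With per-scan-thread database resources, every scan thread holds its own connection;
    // otherwise the scan threads share their parent Scheduler's single connection.
    static int schedulerDbConnections(int partitions,
                                      int scanThreadsPerScheduler,
                                      boolean schedulerPerPartition,
                                      boolean dbResourcePerScanThread) {
        int schedulers = schedulerCount(partitions, schedulerPerPartition);
        return dbResourcePerScanThread ? schedulers * scanThreadsPerScheduler : schedulers;
    }

    public static void main(String[] args) {
        // 8 Partitions, 4 scan threads each, both options turned on.
        System.out.println(schedulerDbConnections(8, 4, true, true));   // prints 32
        // Same setup with the default shared connection per Scheduler.
        System.out.println(schedulerDbConnections(8, 4, true, false));  // prints 8
    }
}
```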