DQS_ERROR_0002 dqs_add_del
APPENDIX - A
DQS 3.1.3 ERROR MESSAGES
This appendix contains a list of all error messages which can
appear in the DQS "err_file" as well as in messages,
responses and warnings sent to the user or the system administrator.
Each error message is identified with its DQS_ERROR number and
the source line in the DQS 3.1.3 release code where the message
is emitted. A brief extraction of the message is given preceded
by an indication of the severity of the error:
INFO: Messages are for information purposes only. They are
sent to the "err_file" to
assist the administrator in tracing anomalous conditions which
might arise in operations.
WARNING: Messages may be sent to either the user or the administrator.
These flag possible cases where a change may need to be made to
system operating parameters or to the user's submitted job.
ERROR: Messages are sent to either the user or the administrator.
They indicate a condition has occurred which required the abnormal
termination of a DQS process, job or user request.
CRITICAL: Messages are sent only to the administrator and indicate
a serious DQS system condition which may require intervention
by system management personnel. In the most egregious cases the
qmaster or dqs_execd daemons may have to be restarted.
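The severity labels above lend themselves to mechanical filtering. The following Python sketch assumes an entry heading has the shape used in this appendix (an error number, a severity word, an optional colon, then the message extract). The names parse_entry and administrator_only and the regular expression are illustrative only; they are not part of DQS, and the real "err_file" line layout may differ.

```python
import re

# Severity levels defined in this appendix, least to most severe.
SEVERITIES = ("INFO", "WARNING", "ERROR", "CRITICAL")

# Assumed heading shape: "<number> <SEVERITY>[:] <message extract>".
_ENTRY = re.compile(r"^(\d{4}[a-z]?)\s+(INFO|WARNING|ERROR|CRITICAL)\s*:?\s*(.*)$")


def parse_entry(heading):
    """Split an appendix heading into (number, severity, extract).

    Returns None when the heading does not match the assumed shape.
    """
    match = _ENTRY.match(heading.strip())
    if match is None:
        return None
    number, severity, extract = match.groups()
    return number, severity, extract.strip()


def administrator_only(severity):
    """CRITICAL messages are sent only to the administrator, per this appendix."""
    return severity == "CRITICAL"
```

For example, parse_entry("0003 ERROR: invalid hostname") yields the number "0003", the severity "ERROR" and the extract "invalid hostname".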
Please accept our apologies for
the incompleteness of this appendix. It is a "work in progress"
and will be completed soon.
Error Number
Error Message Extract
0002 ERROR: NULL ptr passed to dqs_add_queue()
An internal error has occurred wherein the qconf utility
has passed a NULL pointer for the add host operation. If this
error occurs consistently the err_file and acct_file should be
saved for post-mortem examination.
0003 ERROR: invalid hostname
The host name provided has failed the gethostbyname
test. This error can only be resolved by the system administrator.
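An administrator can reproduce the name lookup outside DQS. This is a minimal Python sketch, assuming the failure corresponds to an ordinary resolver failure on the qmaster's host. socket.gethostbyname is the standard library lookup; the host_resolves helper and its resolver parameter are hypothetical and exist only for illustration.

```python
import socket


def host_resolves(hostname, resolver=socket.gethostbyname):
    """Return True if the resolver can map hostname to an address."""
    try:
        resolver(hostname)
    except OSError:
        # socket.gaierror and socket.herror are subclasses of OSError.
        return False
    return True
```

Run the check on the machine where the qmaster runs, since name resolution can differ from host to host.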
0004 INFO: adding to Host_has
as an alias for
The host name being added to the host file is an
alias for another host name.
0005 ERROR: NULL ptr passed
to dqs_add_queue()
An internal error has occurred in DQS 3.1.3. The
qmaster has failed to adequately check the input queue name.
0006 ERROR: invalid hostname
associated with queue
The host named in the queue configuration cannot
be reached by the qmaster while adding or modifying a queue.
0007 ERROR: the queue
already exists
The queue name already exists in the cell for another
queue.
0008 ERROR: invalid hostname
The host name cannot be found by the qmaster using
gethostbyname.
0009 CRITICAL: error: Host_hash
is screwed
The internal host list has been corrupted. Try restarting
the qmaster. If the error persists, shut down all daemons and remove
the "host_file" from ../DQS/common/conf/qmaster/<qmaster
name>/common directory.
0010 CRITICAL: error: Queue_hash
is screwed
The internal queue lists have been corrupted. Restart
the qmaster. If this problem persists contact the DQS support team.
0011 ERROR: NULL ptr passed
to dqs_modify_queue()
An internal error has occurred. The qmaster has failed
to adequately check the queue being submitted.
0012 ERROR: cannot locate queue
A delete queue function has submitted an erroneous
queue name.
0013 ERROR: NULL ptr passed
to dqs_add_job()
An internal error has occurred. The qmaster has failed
to adequately check the job name being submitted.
0014 ERROR: the job already
exists
Job sequence numbers have become corrupted. This
can occur if the "seq_num_file" is accidentally deleted
by the administrator.
0015 CRITICAL Job_hash is screwed
The internal job tables have become corrupted. Restart
the qmaster to regenerate this table. If the problem persists
contact the DQS support team.
0016 ERROR: NULL ptr passed
to dqs_add_complex()
An internal error has occurred. The qmaster has failed
to adequately check the complex name being submitted.
0017 ERROR: the complex
already exists
The name of the complex being added is already in
the cell's complex lists.
0018 CRITICAL:error: Complex_hash
is screwed
An internal error has occurred in the qmaster. Restart
the qmaster. If the problem persists contact the DQS support team.
0019 ERROR: NULL ptr passed
to s_modify_complex()
An internal error has occurred. The qmaster has failed
to adequately check the complex name being submitted.
0020 ERROR: the complex
doesn't exist
An attempt has been made to delete a complex which
does not exist
0021 ERROR: couldn't locate
job
An attempt to delete a job has given a name not in
the qmaster's job list. This may be due to a transient condition
where the job has terminated since the latest status display.
It may also occur in some cases when the dqs_execd and qmaster
status gets out of sync. If a job appears to be running on a host
but has disappeared from the qmaster's queue list, the job will
have to be aborted manually.
0022 ERROR: NULL ptr passed
to dqs_add_complex()
An internal error has occurred. The qmaster has failed
to adequately check the complex name being submitted.
0023 ERROR: the consumable
already exists
An attempt to add a consumable to the qmaster's list
has failed because that name is already in the list.
0024 CRITICAL: error: Consumable_hash
is screwed
An internal error has occurred. Restart the qmaster.
If the problem persists contact the DQS support team.
0025 ERROR: NULL ptr passed
to modify_consumable
An internal error has occurred. The qmaster has failed
to adequately check the consumable name being submitted.
0026 ERROR: the consumable
doesn't exist
An attempt to delete a consumable has submitted a
name not in the qmaster's list.
0027 ERROR: sending ck in
list to
The qmaster has received a "check-on" message
from a new dqs_execd. An attempt to send a response to that dqs_execd
has failed. Check to see if the dqs_execd is still running.
0028 ERROR: sending loadavg
ACK
The dqs_execd has transmitted a load average message
to the qmaster but fails to respond to an acknowledgement by the
qmaster. Check to see if the qmaster is still running.
0029 ERROR: cannot locate
host
The qmaster cannot find the host name for the dqs_execd
in its hosts table.
0030 ERROR: Host_hash is
screwed
An internal error has occurred in the qmaster. Restart
the qmaster to regenerate this table. If the problem persists
contact the DQS support team.
0031 ERROR: sending ACK
A job termination message has been received by
the qmaster but the dqs_execd fails to respond to an acknowledgement
from the qmaster. Check to see that the dqs_execd is still running.
0032 ERROR: illegal action
request from deq_execd
The dqs_execd has made a request of the qmaster which
is unrecognizable or not permitted to a dqs_execd.
0033 INFO: CASE unknown list
type
A message from a dqs_execd has arrived but is garbled
and its type and contents cannot be discerned.
0034 INFO: TRANSACTION
ALREADY OCCURRED
This is for system information only. The dqs_execd
has sent a transaction earlier and is repeating it because of
some interruption.
0035 ERROR: Could not
locate the jid
The qalter command has requested a job of which the
qmaster has no knowledge.
0036 ERROR: you are not the
owner
An attempt to qalter a job failed because the command
is being executed by someone other than the owner of the job or
the DQS manager.
0037 ERROR: "cannot alter
a runnig job"
A qalter command has attempted to modify a job definition
while the job is in execution.
0038 INFO: just qaltered
A qalter command has been completed successfully.
This information message is sent to the error file as a record
of the change action.
0039 CRITICAL:botched tables
An internal error has occurred in the qmaster. Restart
the qmaster and check the status of the job which was altered.
0040 INFO: CASE unknown
list type
A qalter command has submitted a message to the qmaster
which has become garbled. Re-execute the qalter command.
0041 WARNING: complexes may
only be added by managers
An attempt to add a complex using qconf -ac was performed
by a non-manager.
0042 WARNING: Complex
already exists
An attempt to add a complex failed because the complex
already exists.
0043 INFO: added complex
A complex has been successfully added by the DQS
manager.
0044 WARNING: consumable can
only be added by managers
An attempt to add a consumable resource by a non-manager
has failed.
0045 WARNING:
An attempt to add a consumable failed because an
identically named one already exists.
0046 INFO: added consumable
A consumable has been successfully added to the system.
0047 WARNING: hosts may only
be added by managers
An attempt to add a host to the system has failed
because it was performed by a non-manager.
0048 WARNING: host already
exists
An attempt to add a host has failed because one with
that name already exists in the system.
0049 WARNING: invalid host
name
An attempt to add a host name has failed because
the name is an invalid cell member. This error message should
be accompanied by other messages which may give a more detailed
reason for the invalidation of the name.
0050 INFO: host added
A host has been successfully added to the system
0051 WARNING: only a manager
can add another manager
A non-manager attempt to add a manager failed.
0052 WARNING: already a manager
The named manager is already in the manager's list
0053 WARNING: invalid user
name
The submitted name is not valid as a login to the
qmaster's system.
0054 INFO: added manager
A manager has been successfully added to the system.
0055 WARNING: only a manager
can add an operator
A non-manager attempt to add an operator failed
0056 WARNING: is already an
operator
An attempt to add an operator has failed because
the submitted name is already in the operators list.
0057 WARNING: invalid user
name
The submitted name is not valid as a login to the
qmaster's system.
0058 INFO: added operator
An operator has been successfully added to the system
0059 WARNING: only managers
can add queues
A non-manager attempt to add a queue has failed.
0060 WARNING: queue already
exists
An attempt to add a queue has failed because one
with that name already exists.
0061 WARNING: invalid host
name
The host name supplied in the queue configuration
is not valid. See other error messages prior to this message for
the reasons for invalidation.
0062 INFO: added queue
A queue has been successfully added to the system
0063 WARNING: adding acl requires
an operator or manager
A non-operator or non-manager attempt to add a user
to the queue access control list has failed.
0064 WARNING: user already
in acl
An attempt to add a user to the acl has failed because
the name already exists in the list.
0065 WARNING: invalid user
name
The name submitted by an add user request has failed
because the login name doesn't exist in the system.
0066 INFO: user added to acl
A user name has been successfully added to the access
control list.
0067 WARNING: cleaning
a queue requires a manager
A non-manager attempt to clear a queue of data has
failed.
0068 WARNING: cannot
locate the queue
An attempt to clean a queue has failed because its
name is not a valid queue name.
0069 ERROR cleaned the
queue
The named manager has cleaned out the specified queue.
This is recorded as an ERROR message since it is an extraordinary
action which should only be taken due to some inconsistency in
the running DQS system.
0070 ERROR t-unlinking
For every job present in the queue's own job list
an attempt is made to remove that job and all of its temporary
files from the system. As each job is removed this message will
appear.
0071 WARNING: deleting a
complex requires a manager's permissions
A non-manager attempt to delete a complex has failed.
0072 WARNING: cannot locate
the complex
An attempt to delete a complex has failed because
its name could not be found in the complex lists.
0073 INFO: complex has been
deleted
A complex has been successfully deleted.
0074 WARNING: deleting a
consumable requires a manager
A non-manager attempt to delete a consumable has
failed.
0075 WARNING: cannot locate
the consumable
An attempt to delete a consumable has failed because
the name given cannot be found in the qmaster's consumable lists.
0076 INFO: consumable has been
deleted
A consumable resource has been successfully deleted
from the system
0077 WARNING: deleting a host
requires a manager's permissions
A non-manager attempt to delete a host has failed.
0078 WARNING: invalid host
name
An attempt to delete a host has failed because the
name given cannot be validated within the qmaster's network.
0079 WARNING: cannot locate
host
An attempt to delete a host has failed because the
name cannot be found in the qmaster's host list.
0080 WARNING: cannot delete
the host
An attempt to delete a host has failed because the
name submitted is that of the qmaster's own host.
0081 WARNING: the
host has an active queue
An attempt to delete a host has failed because an
active queue exists for that host. The queue itself must be deleted
before the host can be deleted.
0082 INFO: deleting Host
has as an alias
An attempt to delete a host has succeeded,
however the manager is warned that the host and any aliases have
all been deleted from the system.
0083 INFO: host deleted
A host has been successfully deleted from the system
0084 WARNING: deleting a manager
requires manager privileges
A non-manager attempt to delete a manager has failed.
0085 WARNING: cannot locat
the manager
An attempt to delete a manager has failed because
the name submitted is not in the qmaster's manager list.
0086 INFO: manager has been
deleted
A manager has been successfully deleted from the
system
0087 WARNING: deleting an operator
requires manager privileges
A non-manager attempt to delete an operator has failed.
0088 WARNING: cannot locate
the operator
An attempt to delete an operator has failed because
the name submitted is not in the qmaster's operator list.
0089 INFO: operator has been
deleted
An operator has been successfully deleted from the
system
0090 WARNING: deleting a
queue requires manager permissions
A non-manager attempt to delete a queue has failed.
0091 WARNING: cannot locate
the queue
An attempt to delete a queue has failed because the
submitted name is not in the qmaster's directory of queues.
0092 INFO: queue has been
deleted
A queue has been successfully deleted from the system
0093 WARNING: deleting acl
requires operator privileges
A non-manager or non-operator attempt to delete a
user from an access control list has failed.
0094 WARNING: user not in acl
An attempt to delete a user from an access control
list has failed
0095 INFO: user has been deleted
from acl
A user name has been successfully removed from the
qmaster's access control lists.
0096 WARNING: cannot
locate the complex
An attempt to get a complex (qconf -gc) has failed
because the name submitted is not in the qmaster's complex list.
0097 WARNING: cannot
locate the consumable
An attempt to get a consumable resource (qconf -gcons)
has failed because the name submitted is not in the qmaster's
consumable list.
0098 WARNING: killing
a queue requires manager privileges
An attempt to kill a queue operation within the qmaster
AND the dqs_execd failed because a non-manager submitted the request.
0099 WARNING: kill queue
is not implemented yet
DQS 3.1.3 has disabled the "kill queue" function.
0100 WARNING: killing multiple
queues requires manager privileges
An attempt to kill multiple queues failed because
the command was not submitted by a DQS manager.
0101 WARNING: -kqs is not
implemented yet
DQS 3.1.3 has disabled the kill queues command.
0102 WARNING: modifying
a complex requires manager
A non-manager attempt at modifying a complex has failed.
0103 ERROR: the complex
hash is trashed
An internal error has occurred and the complex tables
have been corrupted. Restart the qmaster. If this problem persists
contact the DQS support team.
0104 INFO: complex
has been modified
A complex has been successfully modified.
0105 WARNING: qconf -mconf
is not implemented yet
The dynamic modification of the conf_file, avoiding
restarts of the dqs_execd and qmaster, is not yet operational in
DQS 3.1.3.
0106 WARNING: modifying
a consumable requires a manger
A non-manager attempt to modify a consumable resource
definition has failed.
0107 WARNING: the
consumable hash is trashed
An internal error has occurred and the consumable
tables have been corrupted. Restart the qmaster. If this condition
persists contact the DQS support team.
0108 INFO: the consumable
has been modified
A consumable resource has been successfully modified.
0109 WARNING: modifying
a queue requires manager
A non-manager attempt at modifying a queue has failed.
0110 WARNING: cannot
modify queue the definition is trashed
An internal error has occurred, the memory resident
queue definitions have been corrupted. Restart the qmaster. If
the problem persists, contact the DQS support team.
0111 INFO: the queue has been
modified
A queue has been successfully modified.
0112 WARNING: cannot locate
complex
An attempt to show a complex failed because its name
is not in the qmaster's complex list.
0113 WARNING: cannot locate
consumable
An attempt to show the definition of a consumable
resource failed because the name cannot be found in the qmaster's
consumable list.
0114 WARNING: cannot
locate the queue
An attempt to show a queue definition failed because the
name cannot be found in the qmaster's queue lists.
0115 CRITICAL: terminal
dqs_read_in_qconf() failed
An attempt to show a named queue failed. The queue
configuration file for this queue cannot be read by the qmaster.
This is a serious error. Restarting the qmaster will not usually
correct the problem. The manager should check the ../DQS/common/conf/qmaster/<qmaster
name>/common_dir/queue_dir for the existence of a file with
this queue name. If the file does not exist, it has to have been
deleted since the qmaster was started. If the file exists check
that it is non-zero length. Look at other error messages near
this one for additional clues. Contact the DQS support team.
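The two file checks described for error 0115 (does the file exist, and is it non-zero length) can be scripted. This Python sketch assumes, as the text suggests, that the configuration file is named after the queue and lives directly in the queue directory. The function name queue_file_status and its return values are hypothetical.

```python
import os


def queue_file_status(queue_dir, queue_name):
    """Classify the queue configuration file as 'missing', 'empty' or 'ok'."""
    path = os.path.join(queue_dir, queue_name)
    if not os.path.isfile(path):
        # Per the text: the file must have been deleted since qmaster start.
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    return "ok"
```

A result of "ok" only means the file is present and non-empty; it does not show that the qmaster can parse it.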
0116 WARNING: CASE
unknown list type
An internal error has occurred. The qconf utility
has sent an illegal message to the qmaster. Retry the command.
If this condition persists contact the DQS support team.
0117 INFO: TRANSACTION
ALREADY OCCURRED
A queue delete operation has been repeated within
the same command execution of "qdel". This usually occurs
when network congestion has blocked an acknowledgment of the action
from the qmaster to the DQS utility.
0118 ERROR: forcing a
queue state
An attempt to force a queue state change by a user
other than a manager or operator failed.
0119 ERROR: cannot locate
the job
An attempt to delete a job failed because the job
identifier cannot be found in the qmaster's job list.
0120 ERROR: you do not
have the necessary permissions
An attempt to delete a job has failed because a user
other than the manager, operator or job owner has executed qdel.
0121 ERROR: cannot
locate the queue associated with this job
An attempt to delete a running job has failed because
the qmaster cannot locate the queue which the job is allegedly
operating in. This is a serious error and the manager should check
to see:
Is the job running on the target host?
Does the qstat status show the queue in RUNNING state?
Does the qstat display show the job running?
If the answer to all these questions is yes, restart
the qmaster. If any answer is no, shut down the dqs_execd for that
queue and then shut down the qmaster. Restart the qmaster and then
the dqs_execd.
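The decision procedure for error 0121 can be written down as a small function. This Python sketch only encodes the three questions and the two outcomes given in the text. The function name, argument names and step wording are illustrative, and nothing here invokes DQS itself.

```python
def recovery_steps(job_on_host, queue_running, job_shown_running):
    """Return the ordered recovery actions described for error 0121."""
    if job_on_host and queue_running and job_shown_running:
        # All three checks passed: a qmaster restart is enough.
        return ["restart qmaster"]
    # Any "no": stop both daemons, then bring them back qmaster first.
    return [
        "shut down dqs_execd for the queue",
        "shut down qmaster",
        "restart qmaster",
        "restart dqs_execd",
    ]
```

The order of the second list matters: the dqs_execd is stopped before the qmaster and started after it.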
0122 WARNING: forced the
deletion of job
The manager has used the "force" option
to delete the job which overrides any DQS safeguards.
0123 ERROR: unable to sync
state with remote host
The attempt to delete a job has failed because the
job's alleged host refuses to kill the job. This occurs when the
dqs_execd is not actually executing the named job.
0124 INFO: job deleted
A job has been successfully deleted.
0125 INFO: job deleted
A job has been successfully deleted.
0126 INFO: unknown message
A qdel command has sent a garbled message to the
qmaster. Retry the command.
0127 INFO: TRANSACTION ALREADY
OCCURRED
A queue hold operation has been repeated within the
same command execution of "qhold". This usually occurs
when network congestion has blocked an acknowledgment of the action
from the qmaster to the DQS utility.
0128 ERROR using SYSTEM
or OTHER holds requires a manager
An attempt to place a qhold of SYSTEM or OTHER rather
than USER, by a non-manager has failed.
0129 ERROR cannot locate the
job
An attempt to place a hold on a job has failed because
its identifier cannot be found in the qmaster's job list.
0130 ERROR Cannot HOLD a
job already running
An attempt to place a hold on a job has failed because
the job is already in the running state.
0131 ERROR you do not have
the necessary permissions
An attempt to place a hold on a job has failed because
the qhold command was invoked by someone other than a manager,
operator or the job's owner.
0132 ERROR cannot locate the
queue
An attempt to place a hold on a job has failed because
the queue associated with this job is not in the qmaster's queue
list.
0133 INFO: hold set for the
job
A job hold has been successfully set.
0134 INFO: unknown list
type
A qhold utility has sent a garbled message to the
qmaster. Retry the command.
0135 INFO: TRANSACTION ALREADY
OCCURRED
A queue idle operation has been repeated within the
same command execution of "qidle". This usually occurs
when network congestion has blocked an acknowledgment of the action
from the qmaster to the DQS utility. If this error reoccurs restart
the qmaster and try again. If this condition persists contact
the DQS support team.
0136 WARNING: unable to force
a SUSPENDED state
An attempt to suspend a queue failed. The target
dqs_execd was unable to signal running jobs to stop. This error
usually arises when the qmaster and dqs_execd get "out of
sync". The symptoms of this situation are that the dqs_execd
is not actually running the job whilst the qmaster thinks it is.
If this is true the manager should use the clean queue function
(qconf -cq) to align the qmaster's tables with reality.
0137 WARNING: , forced a SUSPENDED
state
The qidle utility has forced a queue suspension of
a running job, overriding system interlocks.
0138 WARNING: unable
to force a RUNNING state
An attempt to set a queue to RUNNING failed. The
target dqs_execd was unable to signal running jobs to un-suspend.
This error usually arises when the qmaster and dqs_execd get "out
of sync". The symptoms of this situation are that the dqs_execd
is not actually running the job whilst the qmaster thinks it is.
If this is true the manager should use the clean queue function
(qconf -cq) to align the qmaster's tables with reality.
0139 WARNING: forced a RUNNING
state
The dqs_execd has acknowledged a request to SIGCONT
a job it is managing. This signal is sent to the job despite DQS
system interlocks.
0140 INFO: unknown request
A qidle command has sent an unknown request to the
qmaster. Retry the command.
0141 INFO: unknown list
type
A qidle command has sent an almost completely garbled
message to the qmaster. Retry the qidle command.
0142 CRITICAL: unable to
open for <job file > writing
The dqs_execd is unable to open the execution
(or script) file where it will place the job information sent
by the qmaster. This is the "exec_dir" specified in
the "conf_file" and MUST be fully accessible to the
dqs_execd. If this is the first attempt at using this dqs_execd
check for possible NFS cross-mounting problems. If the dqs_execd
has been working correctly and now fails, some insidious file system
error has occurred.
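A quick way to rule out a permissions or mount problem for error 0142 is to test the directory directly. This Python sketch assumes it is run as the same user as the dqs_execd and is given the "exec_dir" value from the "conf_file". The helper name exec_dir_usable is hypothetical.

```python
import os


def exec_dir_usable(exec_dir):
    """Return True if exec_dir exists and the current user can create files in it."""
    # Creating a file needs write and search (execute) permission on the directory.
    return os.path.isdir(exec_dir) and os.access(exec_dir, os.W_OK | os.X_OK)
```

On NFS mounts os.access can be optimistic, so a False result is conclusive but a True result is only a first indication.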
0143 INFO: CASE unknown list
type
The dqs_execd has received a garbled message
from the qmaster. If this condition persists restart both the
qmaster and dqs_execd.
0144 ERROR: signal delivery
The qmaster has requested that the dqs_execd deliver
a SIGKILL or SIGSTOP to the job. This may or may not be an error,
depending on whether an intentional "qdel" was requested by an
authorized manager/operator.
0145 INFO: NOTIFIABLE SIGNAL
JID and setting
The qmaster has requested that a SIGKILL be sent
to a job. This is reported as an error to ensure that all such
signals are always recorded in the "err_log".
0146 INFO: NOTIFIABLE SIGNAL
JID
The qmaster has requested that a SIGSTOP be sent
to a job. This is reported as an error to ensure that all such
signals are always recorded in the "err_log".
0147 INFO: NON-NOTIFIABLE SIGNAL
JID
The qmaster has requested that a signal other than
a SIGSTOP or SIGKILL be sent to a job. This is reported as an error
to ensure that all such signals are always recorded in the "err_log".
0148 INFO: delivering signal
to pid
A signal has actually been sent to the indicated
process.
0149 INFO: TRANSACTION ALREADY
OCCURRED
The qmaster has received a duplicate qmod request
and will ignore it. This may occur when the qmod utility has not
received an acknowledgment of the original request due to network
congestion or a restart of the qmaster. If this condition persists
check the ALARM values for adequate communications windows. Check
the "err_log" for recent qmaster problems.
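The "TRANSACTION ALREADY OCCURRED" behaviour is a form of duplicate-request suppression. How DQS identifies a repeated request is not described in this appendix, so the following Python sketch is only an illustration of the general idea. The class TransactionLog and the request_id values are assumptions.

```python
class TransactionLog:
    """Remember request identifiers that have already been processed."""

    def __init__(self):
        self._seen = set()

    def handle(self, request_id, action):
        """Run action once per request_id; report repeats instead of rerunning."""
        if request_id in self._seen:
            return "TRANSACTION ALREADY OCCURRED"
        self._seen.add(request_id)
        action()
        return "OK"
```

A utility that resends a request after a lost acknowledgment then receives the informational reply, and the action itself is not performed twice.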
0149a ERROR: Forcing a queue
state requires manager
An attempt to use the "-f" option for a
qmod request was rejected because the command was not issued by
a manager.
0150 error: forced actions
require manager
An attempt to use the "-f" option
for a qmod request was rejected because the command was not issued
by a manager.
0151 INFO: TRACE
If the INFO level has been selected for logging,
this message will appear whenever qmaster is processing a hard-resource
request for a qmod command.
0152 INFO: TRACE
If the INFO level has been selected for logging,
this message will appear whenever qmaster is processing a hard-resource
request for a qmod command if at least one queue exists with that
resource.
0153 WARNING: You do not have
permission to modify
An attempt to modify a queue has been rejected for
insufficient permissions.
0154 WARNING: You do not have
permission to enable
An attempt to enable a queue has been rejected
for insufficient permissions.
0155 WARNING: Queue is already
enabled
An attempt to enable a queue was ignored because
the queue was already enabled.
0156 INFO: Queue has been
enabled
An attempt to enable a queue has succeeded.
0157 WARNING: You do not have
permission to diable
An attempt to disable a queue has failed for insufficient
permissions.
0158 WARNING: Queue is already
disabled
An attempt to disable a queue has failed because
the queue is already disabled.
0159 INFO: Queue has been disabled
An attempt to disable a queue has succeeded.
0160 WARNING: You do not have
permission to soc
An attempt to suspend a queue upon its completion
has failed for insufficient permissions.
0161 WARNING: Queue is already
marked soc
An attempt to suspend a queue upon completion has
been ignored because the queue is already in that state.
0162 INFO: Queue has been
marked as soc
A queue has been successfully set to the suspend
on completion state.
0163 WARNING: You do not have
permissions to remove xsoc
An attempt to remove the suspend on completion flag
has failed for insufficient permissions.
0164 WARNING: Queue is not
masked as xsoc
An attempt to remove the suspend on completion state
has failed because the queue does not have that flag set.
0165 INFO: has canceled suspend_on_comp
on the queue
The suspend on completion flag has been successfully
removed from the queue
0166 WARNING: You do not have
permission to suspend
An attempt to suspend a queue has failed for insufficient
permissions.
0167 WARNING: Unable to force
a suspended state
The qmaster failed in an attempt to get a dqs_execd
to suspend its queue. This is a serious but not a fatal
problem, however it is a symptom that the qmaster and the dqs_execd
are no longer synchronized in their respective queue information.
0168 WARNING: forced a suspended
state
A queue has been forced to the suspend state and
all jobs have been stopped.
0169 WARNING: Already suspended
An attempt to suspend a queue has failed because
the queue is already suspended.
0170 WARNING: Unable to force
a suspended state
The qmaster failed in an attempt to get a dqs_execd
to suspend its queue. This is a serious but not a fatal
problem, however it is a symptom that the qmaster and the dqs_execd
are no longer synchronized in their respective queue information.
0171 INFO: forced a suspended
state
A queue has been forced to the suspend state and
all jobs have been stopped.
0172 WARNING: Unable to sync
suspend of the queue
The qmaster failed in an attempt to get a dqs_execd
to suspend its queue. This is a serious but not a fatal
problem, however it is a symptom that the qmaster and the dqs_execd
are no longer synchronized in their respective queue information.
0173 INFO: has suspended the
queue
A queue has been forced to the suspend state and
all jobs have been stopped.
0174 WARNING: You do not have
permissions to unsuspend
The qmaster failed in an attempt to get a dqs_execd
to set its queue to running. This is a serious but
not a fatal problem, however it is a symptom that the qmaster
and the dqs_execd are no longer synchronized in their respective
queue information.
0176 WARNING: forced a RUNNING
state
The qmaster has successfully set a queue to RUNNING
state and restarted all the suspended jobs.
0177 WARNING: queue is already
running
An attempt to unsuspend a queue has failed because
it is already in the RUNNING state.
0178 dqs_c_qmod.c 414 WARNING:
Unable to sync the RUNNING state of the queue
The qmaster failed in an attempt to get a dqs_execd
to set its queue state to running. This is a serious
but not a fatal problem, however it is a symptom that the qmaster
and the dqs_execd are no longer synchronized in their respective
queue information.
0179 INFO: Forced a RUNNING
state
The qmaster has successfully caused the dqs_execd
to set the state of its queue to RUNNING.
0180 WARNING: Unable to sync
unsuspended state
The qmaster failed in an attempt to get a dqs_execd
to set its queue state to unsuspended. This is a serious
but not a fatal problem, however it is a symptom that the qmaster
and the dqs_execd are no longer synchronized in their respective
queue information.
0181 INFO: has unsuspended the
queue
The qmaster has successfully set the queue status
to unsuspended.
0182 INFO: unknown action request
The qmaster has received a request for an action
(like suspend, unsuspend) which is not legal for the qmod command.
This should not occur since the qmod command itself is edited
before bothering the qmaster. Try the qmod command again. If the
problem persists, try restarting the qmaster. Report this case
to the DQS support team.
0183 INFO: unknown list type
The qmaster has received a garbled message from the
qmod utility. Try the qmod again. If the problem reoccurs, restart
the qmaster. Report this problem to the DQS support team.
0184 ERROR: unable to update
remote queue
The qmaster was unable to contact the dqs_execd to
change its queue status. Check to see if the dqs_execd is still
running. If the dqs_execd is running and retrying this command
results in the same error message, restart the qmaster. Report
this problem to the DQS support team.
0185 INFO: TRANSACTION ALREADY
OCCURRED
The qmaster has received a duplicate qmove request
and will ignore it. This may occur when the qmove utility has
not received an acknowledgment of the original request due to
network congestion or a restart of the qmaster. If this condition
persists check the ALARM values for adequate communications windows.
Check the "err_log" for recent qmaster problems.
0186 ERROR: cannot locate
the job
The qmaster was unable to locate the job requested
by a "qmove" command. The job is not in this cell's
job list. Check to make sure that the cell name given is correct.
0187 ERROR: you do
not have the necessary permissions
The qmaster has rejected a qmove request because
it was submitted by someone other than the job owner or the DQS
manager.
0188 ERROR: cannot locate the
queue
The qmaster could not locate the queue associated
with the job name given by a "qmove" request. It is
possible that two identical job names exist in two separate cells,
each in a different queue. A "qmove" command with the
wrong cell name could cause this error. Otherwise there is an
internal error. Restart the qmaster to re-synchronize the queue
and job status.
0189 INFO: has moved the job
The qmaster has successfully moved the job to another
cell.
0190 INFO: unknown list type
The qmaster has received a garbled message from a
"qmove" command. Retry the command. If the problem recurs,
restart the qmaster.
0191 INFO: TRANSACTION ALREADY
OCCURRED
The qmaster has received a duplicate qrls request
and will ignore it. This may occur when the qrls utility has not
received an acknowledgment of the original request due to network
congestion or a restart of the qmaster. If this condition persists
check the ALARM values for adequate communications windows. Check
the "err_log" for recent qmaster problems.
0192 ERROR: using SYSTEM
or OTHER requires manager
The qmaster has rejected a "qrls" request
because the type of job release was either SYSTEM or OTHER which
requires manager permission.
0193 ERROR: cannot
locate job
The qmaster has been unable to locate the job requested
by a "qrls" command.
0194 ERROR: cannot release a
job already RUNNING
The qmaster has rejected a qrls request because the
job is already in the RUNNING state.
0195 ERROR: you do not have
the necessary permissions
The qmaster has rejected a qrls request because the
command was submitted by someone other than the job owner or DQS
manager.
0196 ERROR: cannot locate the
queue
An internal error has been detected by the qmaster
in response to a qrls command. This is a serious problem since
the qmaster has the job in its internal list but not the queue.
(A queue can be deleted, but any jobs running there should also
be deleted.) Restart the qmaster.
0197 INFO: user has removed
a HOLD for job
The qmaster has successfully removed a designated
HOLD (USER, SYSTEM, OTHER) from the job requested by a "qrls"
command.
0198 INFO: unknown list type
The qmaster has received a garbled message from a
"qrls" command. Retry the command. If the problem recurs,
restart the qmaster.
0199 INFO: CASE unknown list
type
The qmaster has received a garbled message from a
"qstat" command. Retry the command. If the problem recurs,
restart the qmaster.
0200 INFO: TRANSACTION ALREADY
OCCURRED
The qmaster has received a duplicate qsub request
and will ignore it. This may occur when the qsub utility has not
received an acknowledgment of the original request due to network
congestion or a restart of the qmaster. If this condition persists
check the ALARM values for adequate communications windows. Check
the "err_log" for recent qmaster problems.
0201 INFO: CASE unknown list
type
The qmaster has received a garbled message from a
"qsub" command. Retry the command. If the problem recurs,
restart the qmaster.
0202 ERROR: could not open
for appending
The qmaster has been unable to open the "stat_file".
This file must be accessible by the qmaster at all times. Check
for an NFS failure. Restart the qmaster.
0203 ERROR: ###PENDING SIGNAL
delivering
The dqs_execd has sent a SIGUSR to a job using the
"-notify" option. After waiting the specified number
of seconds the job is still in execution, so the dqs_execd is
now delivering the SIGSTOP or SIGKILL. This is not an error but
is recorded in the err-log as an incident which should be logged.
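The notify sequence described for 0203 (warn, wait, then deliver the final signal if the job is still running) can be sketched as follows. This is not the dqs_execd implementation. The function names are hypothetical, and the use of SIGUSR1 as the warning signal is an assumption, since the text above only says "SIGUSR".

```python
import os
import signal
import time


def _pid_exists(pid):
    """Return True if a signal could be delivered to pid."""
    try:
        os.kill(pid, 0)          # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True              # exists, but owned by another user
    return True


def notify_then_signal(pid, final_signal, notify_seconds,
                       warn_signal=signal.SIGUSR1,   # assumed warning signal
                       send=os.kill, is_running=_pid_exists,
                       sleep=time.sleep):
    """Warn the job, wait, then deliver final_signal if it is still running.

    Returns the list of signals that were actually sent.
    """
    sent = [warn_signal]
    send(pid, warn_signal)
    sleep(notify_seconds)
    if is_running(pid):
        send(pid, final_signal)
        sent.append(final_signal)
    return sent
```

The send, is_running and sleep parameters exist only so the sequence can be exercised without touching a real process.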
0204 ERROR: ###now lp->job->hard_wallclock_gmt
This is the first of three messages (204, 205, 206)
triggered by a job exceeding the hard wall clock limits established
by the user or system defaults. This message contains the GMT
system time and the GMT before which the job should have terminated.
0205 ERROR: ###exceeded hard_wallclock
This is the second of three messages (204, 205, 206)
triggered by a job exceeding the hard wall clock time. This message
contains the job name and the DQS job sequence number.
0206 ERROR: ###delivering pid
This is the third of three messages (204, 205, 206)
triggered by a job exceeding the hard wall clock limits established
by the user or system defaults. This message contains the signal
number being sent to the job and the job's UNIX process identifier
(pid).
0207 ERROR: ### exceeded soft_wallclock
The dqs_execd has determined that the job has exceeded
the soft wall clock time specified as a limit. A SIGUSR1 signal
will be sent to the job.
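The wall clock checks behind messages 0204 through 0207 amount to comparing the current GMT time with two deadlines. The sketch below is a guess at the shape of that logic, not code from dqs_execd. The function name is hypothetical, and SIGKILL as the hard-limit signal is an assumption, because the text above only says that a signal number is reported.

```python
import signal


def wallclock_signal(now_gmt, soft_deadline_gmt, hard_deadline_gmt,
                     hard_signal=signal.SIGKILL):   # assumed hard-limit signal
    """Return the signal to send at now_gmt, or None if no limit is exceeded.

    All times are seconds since the epoch, in GMT.
    """
    if now_gmt > hard_deadline_gmt:
        return hard_signal           # hard limit exceeded (0204, 0205, 0206)
    if now_gmt > soft_deadline_gmt:
        return signal.SIGUSR1        # soft limit exceeded (0207)
    return None
```

The hard limit is tested first so that a job past both deadlines receives the terminating signal rather than another warning.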
0208 CRITICAL: error: could
not open
An attempt to open the ASCII documentation file requested
failed, because the file could not be found.
0209 CRITICAL: error:
could not open
An attempt to open a MAN file has failed because
the file could not be found.
0210 CRITICAL: error: JID
setpag failed
A dqs_execd in an AFS based system was unable to
perform the lsetpag function in the child process.
0211 ERROR: (process leader)
cannot make pipe
The dqs_execd was unable to open a pipe between the
process "shepherd" and the job itself in order to instantiate
the COPY_FILES option of system operation. This is fatal for the
user's job and requires that the manager examine the errno and
determine if pipe usage is in violation of local system limits.
0212 ERROR: (process leader)
waitpid ERROR
The "process shepherd" within the
dqs_execd has received an error signal while waiting for the executing
job to terminate. This is a serious internal error. If it recurs
try restarting the dqs_execd. Notify the DQS support staff of
this problem.
0213 ERROR: ### exceeded
hard_cpulimit
The "process shepherd" within
the dqs_execd has determined that the current process for the
job has exceeded the hard cpu time limits set for the queue. A
SIGKILL will be sent to the job to terminate any subsequent processes.
0214 ERROR: ### exceeded
soft_cpulimit
The "process shepherd" within
the dqs_execd has determined that the current process for the
job has exceeded the soft cpu time limits set for the queue. A
SIGUSR2 will be sent to the job to terminate any subsequent processes.
0215 ERROR: stdout
to <filename>
The "process shepherd" has established a COPY_FILES
process to move stdout and stderr files from the working directory
to some other location.
0216 ERROR: execve failed
The dqs_execd "process shepherd" failed
in its attempt to initiate execution of the job. This is a serious
and fatal error and should be looked into immediately. The error
message itself contains the actual execve command attempted and
the UNIX errno value. If a single dqs_execd is reporting this
problem, restart the dqs_execd. If this doesn't correct the problem,
look at the system resource availability on that host; one possibility
is that the "process limits" of that host are being
exceeded.
0217 ERROR: NULL path passwd
to dqs_am_chdir()
The dqs_execd failed in an attempt to change the
user's directories to their automounted form because a NULL automount
name string was provided. This is an internal error which should
never occur. Try restarting the dqs_execd. Check the site's automounter
to make sure it is operating correctly.
0219 CRITICAL: Bad service?
The dqs_execd has attempted to open its socket designated
by the entry in the /etc/services file. This occurs during the
initialization of the dqs_execd. If any of these initial steps
fail the dqs_execd will abort. Check the service name being searched
for to make sure the spelling is correct in both the conf_file
and /etc/services.
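A quick way to test whether a service name resolves is to ask the system's services database directly. The sketch below uses Python's standard socket.getservbyname. The service name shown is a hypothetical placeholder, so substitute the name configured in the conf_file.

```python
import socket


def lookup_service_port(service_name, protocol="tcp"):
    """Return the port for service_name, or None if it is not defined."""
    try:
        return socket.getservbyname(service_name, protocol)
    except OSError:
        # Raised when the name/protocol pair is not in the services database.
        return None


port = lookup_service_port("dqs_execd_example")   # hypothetical service name
if port is None:
    print("service not found; check spelling in conf_file and /etc/services")
```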
0220 CRITICAL: socket creation
ERROR
The dqs_execd failed to obtain a socket during its
initialization. This can occur if all socket resources are exhausted;
a frequent mischief-maker is the local NIS. This error is fatal
and the dqs_execd will abort.
0221 CRITICAL: socket option
ERROR
The dqs_execd failed to set default socket options
during its initialization. This can occur if the socket resources
get confused with other executing processes; a frequent mischief-maker
is the local NIS. This error is fatal and the dqs_execd will abort.
0222 CRITICAL: bind failure
check for duplicate port
The dqs_execd failed in its attempt to bind the socket
provided to the port named for dqs_execd services. The most common
source of this error is duplicate port numbers in the /etc/services
file. This can occur in lengthy /etc/services files and may not
arise if the duplicate port is not being used.
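Duplicate ports are easy to miss by eye in a long services file. The following sketch scans text in the usual "name port/proto" layout and reports any port/protocol pair claimed by more than one service. The service names and port numbers in the sample are invented for illustration.

```python
from collections import defaultdict


def find_duplicate_ports(services_text):
    """Map each 'port/proto' used by more than one service to those names."""
    by_port = defaultdict(list)
    for line in services_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        fields = line.split()
        if len(fields) < 2:
            continue                           # blank or malformed line
        name, port_proto = fields[0], fields[1]
        by_port[port_proto].append(name)
    return {pp: names for pp, names in by_port.items() if len(names) > 1}


# Invented sample data; real input would be the text of /etc/services.
sample = """\
# name        port/proto
dqs_qmaster   6001/tcp
dqs_execd     6002/tcp
other_daemon  6002/tcp   # clashes with dqs_execd
"""
duplicates = find_duplicate_ports(sample)
```

To check a real host, pass the contents of /etc/services to the function instead of the sample string.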
0223 ERROR: Error reporting
reaped children
When a dqs_execd is restarted it checks to see if
any DQS managed jobs have terminated while the dqs_execd was
asleep. If so, the qmaster is sent a record of each of the "children"
whose job information has been "reaped" by the signal
handler at termination. This is an informational message which
we choose to add to the err_file for logging purposes.
0224 ERROR: unable to
check in with qmaster
At startup the dqs_execd MUST check in with the qmaster
before commencing any other activities. The first step consists
of sending a simple STARTING_UP message. This error message occurs
when the qmaster fails to receive the first packets. A frequent
cause of this problem is starting dqs_execds before the qmaster
has been started. The actual send-receive subroutine will have
reported more detailed information as to the error, most often
a bad port number assigned for qmaster services. Any initialization
failure such as this will cause the dqs_execd to abort.
0225 ERROR: unable get checkin
list from qmaster
The dqs_execd was able to send the initial STARTING_UP
message to the qmaster but the acknowledgment of this message
was not returned. This could be due to a time-out during communications
due to ALARM values which are too low in the conf_file. This error
at this point in initialization is fatal and the dqs_execd will
abort.
0226 ERROR: Cannot Check-in
with Qmaster
Any error which prevents the dqs_execd from completing
its initial handshaking with the qmaster will result in the dqs_execd
aborting. This error confirms this condition. Note that once the
dqs_execd gets past this point it can survive the shutting down
of the qmaster at any future time.
0227 dqs_execd.c 498 ERROR: No rusage stats to
report
On initial startup the dqs_execd attempts to send
the qmaster any information about its previous operations. This
error message often occurs when there has been no previous instance
of the dqs_execd running on this host or when no jobs were in
execution when the dqs_execd was shut down. If this error occurs
frequently there may be a problem with the dqs_execd. Restart
the dqs_execd.
0228 ERROR: TID ERROR... no
tid
The dqs_execd has received an acknowledgment to an
rusage message, but the response is missing the task identifier
used when sending the original message to the qmaster or the dqs_execd
has no record of sending an rusage message "recently".
The best solution is to restart the dqs_execd at the next opportunity.
0229 ERROR: TID ERROR
The dqs_execd has received an acknowledgment to an
rusage message, but the response is not in sync with the task
identifier used when sending the original message to the qmaster.
The best solution is to restart the dqs_execd at the next opportunity.
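The difference between 0228 and 0229 can be shown with a small sketch of the acknowledgment check. It is an illustration under assumed names (check_rusage_ack, pending_tid, ack_tid), not the dqs_execd code.

```python
def check_rusage_ack(pending_tid, ack_tid):
    """Classify an rusage acknowledgment.

    pending_tid: tid recorded when the rusage message was sent, or None
                 if no rusage message is on record.
    ack_tid:     tid carried by the acknowledgment, or None if missing.
    Returns the matching message extract, or None if the ack is valid.
    """
    if ack_tid is None or pending_tid is None:
        return "ERROR: TID ERROR... no tid"   # 0228
    if ack_tid != pending_tid:
        return "ERROR: TID ERROR"             # 0229
    return None
```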
0230 CRITICAL: error:
couldn't exec jid
The dqs_execd was unable to start a "process
shepherd" for the job. The only possible cause of this would
be that a system limit on active processes has been exceeded.
0231 INFO: ***************************
This is the header line for a list of all hosts known
by this dqs_execd.
0232 INFO: dqs_execd on building
Host list GMT
This is the informational line for each host
name shown.
0233 ERROR:*************************
This is the header line to flag the following message.
0234 ERROR: NULL Host_head
There is no list of cell hosts available to the dqs_execd
at this time.
0235 CRITICAL:calloc() failure
A fatal error has occurred when attempting to malloc
space for internal hash tables.
Check memory resources available on this host.
0236 ERROR: NULL string passed
to dqs_hash_add()
The calling program has erroneously provided no string
to be added to an internal hash table. This is an internal error
which should not occur.
0237 ERROR: NULL hash_table
passed
The calling program has erroneously provided no internal
hash table. This is an internal error which should not occur.
0238 ERROR: element already
exists
A string being added to a hash table already has
a matching entry in the table. This is an internal error which
should not occur.
0239 ERROR: NULL queue passed
The calling program has erroneously provided no queue
pointer to be added to an internal hash table. This is an internal
error which should not occur.
0240 ERROR: NULL queue->name
passed
The calling program has erroneously provided no queue
name to be added to an internal hash table. This is an internal
error which should not occur.
0241 ERROR: the queue already
exists
A string identifying a queue being added to a hash
table already has a matching entry in the table. This is an internal
error which should not occur.
0242 ERROR: could not resolve
The queue name was found in the hash table but does
not match any queue name in the internal queue list. This is a
fatal internal error and the calling qmaster or dqs_execd will
be aborted and must be restarted. This is one of the few cases
where the daemons will abort after they have been started, and
this "silent death" will be repaired in the next release.
0243 ERROR: Queue_list is screwed
The queue name was found in the hash table and
matches a queue name in the internal queue list. However, the queue
pointer has been corrupted. This is a fatal internal error and
the calling qmaster or dqs_execd will be aborted and must be restarted.
This is one of the few cases where the daemons will abort after
they have been started, and this "silent death" will
be repaired in the next release.
0244 ERROR: NULL string passed
to dqs_hash_del()
The calling program has erroneously provided no string
to be deleted from an internal hash table. This is an internal
error which should not occur.
0245 ERROR: NULL hash_table
passed
The calling program has erroneously provided no internal
hash table. This is an internal error which should not occur.
0246 ERROR: hash table is screwed
The hash table passed to delete hash is invalid.
There are no string entries in the first entry in the table. This
is a critical error and the daemon will be aborted.
0247 CRITICAL:error: hash
table screwed
The hash table passed to delete hash is invalid.
There are no string entries after the first entry in the table.
This is a critical error and the daemon will be aborted.
0248 CRITICAL:error: Job_hash
is screwed
The job name was found in the hash table but its
internal pointer to the job structure has vanished. This is a
critical error and the daemon will be aborted.
0249 CRITICAL:error: Queue_hash
is screwed
The queue name was found in the hash table but its
internal pointer to the queue structure has vanished. This is
a critical error and the daemon will be aborted.
0250 CRITICAL:error: Complex_hash
is screwed
The complex name was found in the hash table but
its internal pointer to the complex structure has vanished. This
is a critical error and the daemon will be aborted.
0251 CRITICAL:error: Consumable_hash
is screwed
The consumable name was found in the hash table but
its internal pointer to the consumable structure has vanished.
This is a critical error and the daemon will be aborted.
0252 WARNING: dqs_open_tcp:
cannot resolve host
An attempt to open a TCP socket to the named host has
failed because the name is not recognized as known within the
cell's network. Check the hosts table and name server for this
host.
0253 INFO: dqs_open_tcp: using
port %d
This is an informational message which should appear
at the startup of the qmaster and any dqs_execds.
0254 WARNING: dqs_open_tcp:
cannot get service
The named service cannot be found in /etc/services.
This name originates in conf_file so check that the conf_file
which the program has access to is up to date and the service
entries match those in the /etc/services file.
0255 dqs_io.c 127 WARNING:
dqs_open_tcp: rresvport() failed
0256 dqs_io.c 137 WARNING:
dqs_open_tcp: unable to create socket
0257 dqs_io.c 157 WARNING:
dqs_open_tcp: cannot connect to peer
0258 dqs_io.c 274 WARNING:
MAX_STRING_SIZE exceeded
0259 dqs_job_exit.c 140 ERROR:
could not locate job
0260 dqs_job_exit.c 157 ERROR:
could not locate queue
0261 dqs_job_exit.c 187
ERROR: could not locate queue
0262 dqs_job_exit.c 235 ERROR:
WRITING JOB TO DISK
0263 dqs_job_exit.c 260 INFO:
not found - making ACT_FILE
0264 dqs_job_exit.c 267 ERROR:
opening for writing ACT_FILE
0265 dqs_job_exit.c 281 ERROR:
writing to ACT_FILE
0266 dqs_job_exit.c 288 ERROR:
INTERNAL ERROR
0267 dqs_job_exit.c 329 ERROR:
0268 dqs_job_exit.c 335 ERROR:
the queue cannot be located
0269 dqs_job_exit.c 502
ERROR: INTERNAL ERROR
0270 dqs_list.c 195 CRITICAL:string
too long
0271 dqs_list.c 259 ERROR:
NULL head passed in
0272 dqs_list.c 265 ERROR:
NULL head->str0 passed in
0273 dqs_list.c 285 CRITICAL:error:
list screwed
0274 dqs_list.c 333 ERROR:
NULL head passed in
0275 dqs_list.c 572 CRITICAL:unknown
insertion ordering in insert
0276 dqs_list.c 982 ERROR: string
>< - too long
0277 dqs_list.c 1128 CRITICAL:can't
open for writing
0278 dqs_list.c 1134 CRITICAL:error
writing list_to_disk()
0279 dqs_list.c 1251 ERROR:
error reading
0280 dqs_list.c 1268 CRITICAL:error
reading
0281 dqs_load_avg.c 174 ERROR:
error getting ack from qmaster
0282 dqs_load_avg.c 184 INFO:
rebuilding trusted host list
0283 dqs_mail.c 117 ERROR:
NULL "user" passed to dqs_send_mail()
0284 dqs_mail.c 136 CRITICAL:error:
pipe() failed
0285 dqs_mail.c 141 CRITICAL:error:
fork() failed
0286 dqs_mail.c 156 CRITICAL:error:
dup() failed
0287 dqs_mail.c 166 CRITICAL:error:
mail failed
0288 dqs_mail.c 194 CRITICAL:mail
failed
0289 dqs_mail.c 210 CRITICAL:
mail failed
0290 dqs_mail.c 221 CRITICAL:
mail failed
0291 dqs_parse.c 357
WARNING: option has already been set
0292 dqs_parse.c 447
ERROR: getwd() failed
0293 dqs_parse.c 459 ERROR:
unable to stat directory
0294 dqs_parse.c 507
WARNING: suspend_enable has already been set
0295 dqs_parse.c 527 WARNING:
suspend_enable has already been set
0296 dqs_parse.c 565
WARNING: option has already been set
0297 dqs_parse.c 590 WARNING:
option has already been set
0298 dqs_parse.c 608
WARNING: option has already been set
0299 dqs_parse.c 717
ERROR: invalid option argument
0300 dqs_parse.c 751
WARNING: option has already been set
0301 dqs_parse.c 899
WARNING: option has already been set
0302 dqs_parse.c 972
ERROR: AFS support not compiled in
0303 dqs_parse.c 991
ERROR: AFS support not compiled in
0304 dqs_parse.c 1033
WARNING: option has already been set
0305 dqs_parse.c 1051
WARNING: option has already been set
0306 dqs_parse.c 1060
ERROR: AFS reauth time must be gtr than 600
0307 dqs_parse.c 1081
WARNING: option has already been set
0308 dqs_parse.c 1093
ERROR: invalid option argument
0309 dqs_parse.c 1112
WARNING: option has already been set
0310 dqs_parse.c 1135 WARNING:
suspend_enable has already been set
0311 dqs_parse.c 1167
WARNING: suspend_enable has already been set
0312 dqs_parse.c 1302
WARNING: option has already been set
0313 dqs_parse.c 1323
WARNING: suspend_enable already been set
0314 dqs_parse.c 1351 ERROR:
invalid option argument
0315 dqs_parse.c 1456 ERROR:
is not a valid option
0316 dqs_parse.c 1478 ERROR:
no option argument provided to
0317 dqs_parse.c 1528 ERROR:
invalid date_time string
0318 dqs_parse.c 1538 ERROR:
invalid date_time string
0319 dqs_parse.c 1568 ERROR:
invalid date_time string
0320 dqs_parse.c 1583 ERROR:
invalid date_time string
0321 dqs_parse.c 1596 ERROR:
invalid date_time string
0322 dqs_parse.c 1608 ERROR:
invalid date_time string
0323 dqs_parse.c 1618 ERROR:
invalid date_time string
0324 dqs_parse.c 1628 ERROR:
invalid date_time string
0325 dqs_parse.c 1752 ERROR:
invalid hold_list
0326 dqs_parse.c 1785 ERROR:
invalid keep_list
0327 dqs_parse.c 1821 ERROR:
invalid mail_option_list
0328 dqs_parse.c 2349 ERROR:
invalid priority, must be less than 1024
0329 dqs_parse.c 2356 ERROR:
invalid priority, must be gtr than -1023
0330 dqs_parse.c 2400 ERROR:
invalid state_list "%s"
0331 dqs_parse.c 2424 ERROR:
invalid signal
0332 dqs_parse.c 2544 ERROR:
invalid signal
0333 dqs_parse.c 2615 ERROR:
invalid variable string
0334 dqs_parse_qconf.c 393
ERROR: no list_name provided to
0335 dqs_parse_qconf.c 549
ERROR: no list_name provided to
0336 dqs_parse_qconf.c 1110
ERROR: invalid option argument
0337 dqs_parse_qconf.c 1280
ERROR: edit file does not exist
0338 dqs_parse_qconf.c 1297 ERROR:
editor exited with ERROR
0339 dqs_parse_qconf.c 1306
ERROR: edit file no longer exists
0340 dqs_parse_qconf.c 1326
ERROR: editor was terminated by a signal
0341 dqs_parse_qconf.c 1339
ERROR: could not exec default_editor
0342 dqs_parse_qconf.c 1446
ERROR: error generating tempoary file name
0343 dqs_parse_qconf.c 1453
ERROR: error opening for writing
0344 dqs_parse_qconf.c 1546 ERROR:
error opening for reading
0345 dqs_parse_qconf.c 1650 ERROR:
error opening for reading
0346 dqs_queue.c 89 CRITICAL:error:
generating tmpnam()
0347 dqs_queue.c 101
CRITICAL:error writing
0348 dqs_queue.c 254 CRITICAL:error
writing
0349 dqs_queue.c 325 ERROR:
unable to open for reading
0350 dqs_queue.c 334 ERROR:
reading conf file: no queue name
0351 dqs_queue.c 343 ERROR:
reading conf file: no hostname specified
0352 dqs_queue.c 352 ERROR:
reading conf file: no seq_no specified
0353 dqs_queue.c 359 ERROR:
reading conf file: no load_masg specified
0354 dqs_queue.c 366 ERROR:
reading conf file: no load_alarm
0355 dqs_queue.c 373 ERROR:
reading conf file: no priority specified
0356 dqs_queue.c 380 ERROR:
reading conf file: no type specified
0357 dqs_queue.c 394 ERROR:
reading conf file: invalid queue type
0358 dqs_queue.c 401 ERROR:
reading conf file: no rerun specified
0359 dqs_queue.c 416 ERROR:
reading conf file: invalid rerun option
0360 dqs_queue.c 423 ERROR:
reading conf file: no quantity specified
0361 dqs_queue.c 430 ERROR:
reading conf file: no tmpdir specified
0362 dqs_queue.c 439 ERROR:
reading conf file: no shell specified
0363 dqs_queue.c 447 ERROR:
reading conf file: no klog specified
0364 dqs_queue.c 455 ERROR:
reading conf file: no reauth_time
0365 dqs_queue.c 462
0366 dqs_queue.c 472 ERROR:
reading conf file: no max_user_jobs
0367 dqs_queue.c 479 ERROR:
reading conf file: notify specified
0368 dqs_queue.c 492 ERROR:
reading conf file: no owner_list specified
0369 dqs_queue.c 507 ERROR:
reading conf file: no user_acl specified
0370 dqs_queue.c 521 ERROR:
reading conf file: no xacl specified
0371 dqs_queue.c 535 ERROR:
reading conf file: no subordinate_list
0372 dqs_queue.c 549 ERROR:
reading conf file: no complex_list
0373 dqs_queue.c 564
ERROR: reading conf file: no consumables
0374 dqs_queue.c 578 ERROR:
reading conf file: no s_rt specified
0375 dqs_queue.c 585 ERROR:
reading conf file: no h_rt specified
0376 dqs_queue.c 592 ERROR:
reading conf file: no s_cpu specified
0377 dqs_queue.c 599 ERROR:
reading conf file: no h_cpu specified
0378 dqs_queue.c 606 ERROR:
reading conf file: no s_fsize specified
0379 dqs_queue.c 613 ERROR:
reading conf file: no h_fsize specified
0380 dqs_queue.c 620 ERROR:
reading conf file: no s_data specified
0381 dqs_queue.c 628 ERROR:
reading conf file: no h_data specified
0382 dqs_queue.c 635 ERROR:
reading conf file: no s_stack specified
0383 dqs_queue.c 642 ERROR:
reading conf file: no h_stack specified
0384 dqs_queue.c 649 ERROR:
reading conf file: no s_core specified
0385 dqs_queue.c 656 ERROR:
reading conf file: no h_core specified
0386 dqs_queue.c 663 ERROR:
reading conf file: no s_rss specified
0387 dqs_queue.c 670 ERROR:
reading conf file: no h_rss specified
0388 dqs_queue.c 755 ERROR:
cannot locate complex
0389 dqs_reauth.c 172
ERROR: REAUTHING=========== job
0390 dqs_reauth.c 192
CRITICAL:error: unable to reauth - invalid Passwd
0391 dqs_reauth.c 206
CRITICAL:error: pipe() failed
0392 dqs_reauth.c 213
CRITICAL:error: fork() failed
0393 dqs_reauth.c 226 CRITICAL:error:
dup() failed
0394 dqs_reauth.c 233
CRITICAL:error: JID klog -principal -cell pipe
0395 dqs_reauth.c 243
CRITICAL:error: JID fdopen() in dqs_do_reauth()
0396 dqs_reauth.c 271 CRITICAL:error:
JID klog -principal -cell failed
0397 dqs_reauth.c 279
CRITICAL:error: JID klog -principal cell timed out
0398 dqs_reauth.c 288
CRITICAL:error: JID klog -principal- wifstopped
0399 dqs_reauth.c 299 CRITICAL:error:
JID klog -principal -cell returned
0400 dqs_reauth.c 305
INFO: JID klog -principal -cell returned
0401 dqs_reauth.c 337 ERROR:
AFS support not compiled in
0402 dqs_reauth.c 449 CRITICAL:error:
NULL key ptr passed in
0403 dqs_reauth.c 457 ERROR:
opening KEY_FILE
0404 dqs_reauth.c 465 ERROR:
reading KEY_FILE
0405 dqs_reauth.c 497 CRITICAL:error:
unable to reauth - invalid key file
0406 dqs_reauth.c 535 CRITICAL:error:
unable to reauth - invalid key file
0407 dqs_reauth.c 574 CRITICAL:error:
NULL file name passed in
0408 dqs_reauth.c 582 ERROR:
opening fname
0409 dqs_reauth.c 601 ERROR:
invalid entry at line
0410 dqs_reauth.c 610 ERROR:
invalid entry at line
0411 dqs_reauth.c 619 ERROR:
invalid entry at line
0412 dqs_resolve.c 123 ERROR:
stating RESOLVE_FILE
0413 dqs_resolve.c 139 ERROR:
opening RESOLVE_FILE
0414 dqs_resolve.c 157 ERROR:
invalid entry at line
0415 dqs_resolve.c 166 ERROR:
invalid entry at line
0416 dqs_resolve.c 175 ERROR:
invalid entry at line
0417 dqs_resolve.c 188 ERROR:
invalid entry at line
0418 dqs_resolve.c 201 ERROR:
invalid entry at line
0419 dqs_schedule.c 813
ERROR: -NULL-granted_destin_identifier_list
0420 dqs_schedule.c 818 ERROR:
ERROR-NULL-master_queue-ERROR
0421 dqs_schedule.c 830
CRITICAL:unable to locate queue job->master
0422 dqs_schedule.c 839
ERROR: unable to handoff job to queue
0423 dqs_sec.c 169 WARNING:
Illegitimate request origin
0424 dqs_sec.c 175 WARNING:
Illegitimate request != AF_INET from
0425 dqs_sec.c 186 WARNING:
Non-Reserved port origin
0426 dqs_sec.c 203 WARNING:
Couldn't gethostbyaddr() origin
0427 dqs_sec.c 218 WARNING:
Couldn't gethostbyname() origin
0428 dqs_sec.c 229
WARNING: addr not listed for
0429 dqs_sec.c 239 WARNING:
Couldn't gethostbyname() origin
0430 dqs_sec.c 314 WARNING:
Illegitimate request != AF_INET
0431 dqs_sec.c 325 WARNING:
Non-Reserved port
0432 dqs_sec.c 367
WARNING: Couldn't find addr for origin
0433 dqs_sec.c 379 WARNING:
Couldn't gethostbyname() origin
0434 dqs_sec.c 389 ERROR: Illegitimate
host tried to connect to
0435 dqs_sec.c 420 CRITICAL:gethostname()
failed
0436 dqs_sec.c 433 CRITICAL:gethostbyname()
failed
0437 dqs_sec.c 662 CRITICAL:NULL
username passed to valid_queue
0438 dqs_sec.c 728 CRITICAL:NULL
username passed to set_uid_gid()
0439 dqs_sec.c 738 ERROR: getpwnam()
failed
0440 dqs_sec.c 745 ERROR: gid
less than minimum allowed
0441 dqs_sec.c 752 ERROR: setgid()
failed
0442 dqs_sec.c 776 ERROR: initgroups()
failed
0443 dqs_sec.c 782 ERROR: initgroups()
failed
0444 dqs_sec.c 790 ERROR: user
gid less than min specified in conf
0445 dqs_sec.c 797 ERROR: setuid()
failed
0446 dqs_select_queue.c 105
ERROR: the complex does not exist
0447 dqs_select_queue.c 117
ERROR: the consumable does not exist
0448 dqs_select_queue.c 515 CRITICAL:error:
unable to locate queue
0449 dqs_select_queue.c 541 CRITICAL:error:
unable to locate queue
0450 dqs_select_queue.c 574
ERROR: the consumable does not exist
0451 dqs_select_queue.c 580
ERROR: consumable chain corrupt
0452 dqs_select_queue.c 606
ERROR: the consumable does not exist
0453 dqs_select_queue.c 631
ERROR: the consumable does not exist
0454 dqs_select_queue.c 666
ERROR: the consumable does not exist
0455 dqs_select_queue.c 784
ERROR: the complex does not exist
0456 dqs_select_queue.c 809
ERROR: the consumable does not exist
0457 dqs_send_receive.c 145
ERROR: NULL service passed to dqs_send_list_()
0458 dqs_send_receive.c 154
ERROR: unable to connect to host
0459 dqs_send_receive.c 169
ERROR: error writting in dqs_send_list()
0460 dqs_send_receive.c 217
ERROR: unable to resolve cell
0461 dqs_send_receive.c 223
ERROR: bogus host associated with cell cell
0462 dqs_send_receive.c 237
ERROR: dqs_get_tid() failed
0463 dqs_send_receive.c 400
ERROR: reading in dqs_get_list()
0464 dqs_send_receive.c 411
ERROR: reading in dqs_get_list() bogus packet
0465 dqs_send_receive.c 417
ERROR: reading in dqs_get_list()
0466 dqs_send_receive.c 426
ERROR: error reading in dqs_get_list()
0467 dqs_send_receive.c 493
ERROR: NULL cell passed to send_receive_list()
0468 dqs_send_receive.c 500 ERROR:
NULL service passed to send_receive_
0469 dqs_send_receive.c 507 ERROR:
NULL list passed to send_receive_list()
0470 dqs_send_receive.c 542
ERROR: max_retries encountered - bailing out
0471 dqs_setenv.c 110
ERROR: realloc() failure
0472 dqs_setup.c 672 CRITICAL:
CONF_FILE could not be opened
0473 dqs_setup.c 682 CRITICAL:ERROR
in the configuration file
0474 dqs_setup.c 732 CRITICAL:invalid
configuration line
0475 dqs_setup.c 743 CRITICAL:invalid
configuration line
0476 dqs_setup.c 754 CRITICAL:invalid
configuration line
0477 dqs_setup.c 765 CRITICAL:invalid
configuration line
0478 dqs_setup.c 776 CRITICAL:invalid
configuration line
0479 dqs_setup.c 789 CRITICAL:invalid
configuration line
0480 dqs_setup.c 802 CRITICAL:invalid
configuration line
0481 dqs_setup.c 823 CRITICAL:invalid
configuration line
0482 dqs_setup.c 871 CRITICAL:unknown
descriptor in conf_file
0483 dqs_setup.c 913 CRITICAL:could
not open for writing CONF_FILE
0484 dqs_setup.c 1104 INFO:
not found - making HOST_FILE
0485 dqs_setup.c 1109 ERROR:
not found in HOST_LIST -
0486 dqs_setup.c 1118 ERROR:
not found in HOST_LIST -
0487 dqs_setup.c 1138 INFO:
not found - making MAN_FILE
0488 dqs_setup.c 1146 INFO:
not found in MAN_LIST -
0489 dqs_setup.c 1156 INFO:
not found in MAN_LIST -
0490 dqs_setup.c 1179 INFO:
not found - making OP_FILE
0491 dqs_setup.c 1205 ERROR:
error reading in QUEUE_DIR
0492 dqs_setup.c 1252 ERROR:
error reading in JOB_DIR
0493 dqs_setup.c 1295 ERROR:
error reading in RUSAGE_DIR
0494 dqs_setup.c 1313 INFO:
writing generic queue configuration
0495 dqs_setup.c 1320 CRITICAL:dqs_read_in_qconf()
failed
0496 dqs_setup.c 1436 ERROR:
job reaped at machine startup
0497 dqs_setup.c 1452 ERROR:
error reading in RUSAGE_DIR
0498 dqs_setup.c 1470 ERROR:
error reading in RUSAGE_DIR
0499 dqs_setup.c 1506 ERROR:
cannot opendir
0500 dqs_setup.c 1541 CRITICAL:cannot
chdir() to
0501 dqs_setup.c 1566 CRITICAL:null
path passed to dqs_mkdir()
0502 dqs_setup.c 1609 CRITICAL:unable
to mkdir(
0503 dqs_setup.c 1619
CRITICAL:unable to chownr(
0504 dqs_shutdown.c 43
CRITICAL:Controlled shutdown
0505 dqs_sig_handlers.c 288
/* ERROR: ALARM_CLOCK-ALARM_CLOCK-
0506 dqs_sig_handlers.c 291
ERROR: ALARM_CLOCK shutdown
0507 dqs_start_generic.c 127
ERROR: cannot open stdin file
0508 dqs_start_generic.c 140
ERROR: cannot open output file
0509 dqs_start_generic.c 149
ERROR: cannot open stdout
0510 dqs_start_generic.c 167
ERROR: JID execl'ing(
0511 dqs_start_generic.c 172
ERROR: JID execl failed
0512 dqs_start_p4.c 154
ERROR: cannot open stdin file
0513 dqs_start_p4.c 167
ERROR: cannot open output file
0514 dqs_start_p4.c 176
ERROR: cannot open stdout
0515 dqs_start_p4.c 193
INFO: JID execl'ing(
0516 dqs_start_p4.c 198 ERROR:
JID execl( - failed
0517 dqs_tid.c 118 INFO: SAVING(
0518 dqs_tid.c 144 INFO +++dqs_tid_garbage_collector()+++++
0519 dqs_tid.c 149 INFO: GARBAGE_COLLECTOR
0520 dqs_tid.c 192 INFO:
TID_GARBAGE_COLLECTOR nuking
0521 dqs_tid.c 206 INFO:
TID_GARBAGE_COLLECTOR nuking
0522 dqs_tid.c 306 INFO:
=====dqs_tid_del_x_host
0523 dqs_tid.c 311 INFO: dqs_tid_del_x_host
0524 dqs_tid.c 336 /* INFO:
dqs_tid_del_x_host nuking
0525 dqs_tid.c 345 /* INFO:
dqs_tid_del_x_host
0526 dqs_tid.c 354 /* INFO:
dqs_tid_del_x_host nuking
0527 dqs_tmpdir.c 71 ERROR:
getpwnam() failed
0528 dqs_tmpdir.c 91 ERROR:
chown() failed
0529 dqs_tmpdir.c 135 ERROR:
couldn't stat
0530 dqs_tmpdir.c 142 ERROR:
couldn't stat
0531 dqs_tmpdir.c 177 ERROR:
opendir() failed
0532 dqs_tmpdir.c 184 ERROR:
getcwd() failed
0533 dqs_tmpdir.c 191 ERROR:
chdir() failed
0534 dqs_tmpdir.c 203
ERROR: stat() failed
0535 dqs_tmpdir.c 217 ERROR:
unlink() failed
0536 dqs_tmpdir.c 230 ERROR:
chdir() failed
0537 dqs_tmpdir.c 237 ERROR:
rmdir() failed
0538 dqs_utility.c 278 CRITICAL:NULL
path_str in dqs_dequalify_path()
0539 dqs_utility.c 312 CRITICAL:NULL
host_str in dequalify_hostname()
0540 dqs_utility.c 416
CRITICAL:error: string too long
0541 dqs_utility.c 1422 CRITICAL:unable
to convert string
0542 dqs_utility.c 1454 CRITICAL:malloc()
failure
0543 dqs_utility.c 1490 CRITICAL:realloc()
failure
0544 dqs_utility.c 2032 INFO:
file not found - making SEQ_NUM_FILE
0545 dqs_utility.c 2043 CRITICAL:error:
opening SEQ_NUM_FILE
0546 dqs_utility.c 2049 CRITICAL:error:
reading SEQ_NUM_FILE
0547 dqs_utility.c 2161 ERROR:
cannot locate host
0548 dqs_utility.c 2285
CRITICAL:error writing
0549 dqs_utility.c 2327
CRITICAL:error writing
0550 dqs_utility.c 2429 ERROR:
NULL suffix passed to dqs_unlink()
0551 dqs_utility.c 2443 ERROR:
unlink() returned
0552 dqs_utility.c 2494 CRITICAL:strlen()
exceeds MAX_STRING_SIZE
0553 qalter.c 85 ERROR:
error opening
0554 qalter.c 166 ERROR:
You must request some resources
0555 qalter.c 173 ERROR: You
must request a jid
0556 qalter.c 180 ERROR: You
must request a jid
0557 qmaster.c 185 ERROR:
parsing options
0558 qmaster.c 209 CRITICAL::
Bad service?
0559 qmaster.c 214 CRITICAL:socket
creation ERROR
0560 qmaster.c 223 CRITICAL:bind
failure
0561 qmaster.c 267 ERROR:
accept ERROR
0562 qmaster.c 348 INFO:
0563 qstat.c 120 CRITICAL:dqs_parse_job()
returned NULL
0564 qstat.c 297 ERROR:
cannot locate job
0565 qstat.c 391 ERROR:
cannot locate job
0566 qsub.c 127 ERROR:
error opening
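When searching this appendix or an err_file for entries of a given severity, it can help to split an entry into its parts. The sketch below parses entries in the layout used above (error number, optional source file and line, severity, message extract). The regular expression and function name are assumptions based on that layout, not a DQS utility.

```python
import re

# number, optional "file.c line", severity keyword, then the message extract
ENTRY_RE = re.compile(
    r"^(?P<number>\d{4})\s+"
    r"(?:(?P<file>\S+\.c)\s+(?P<line>\d+)\s+)?"
    r"(?P<severity>INFO|WARNING|ERROR|CRITICAL):\s*(?P<text>.*)$"
)


def parse_entry(entry):
    """Split one appendix entry into its fields, or return None."""
    # Entries are often wrapped over two lines; join them first.
    match = ENTRY_RE.match(" ".join(entry.split()))
    if match is None:
        return None
    fields = match.groupdict()
    if fields["line"] is not None:
        fields["line"] = int(fields["line"])
    return fields


example = parse_entry("0553 qalter.c 85 ERROR:\nerror opening")
```

Entries whose layout differs (for example those with comment markers or no severity keyword) return None and need to be handled by hand.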