What's New
View new features for Accelerator Plus 2024.1.0.
2024.1.0 Release Notes
New Features
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-14441 | Accelerator | None | All config.tcl and related *.tcl source files that used syntax such as FOO(slave,bar) have been updated to use FOO(tasker,bar). |
VOV-14812 | Accelerator | None | Added functionality to nc info for displaying I/O profiling performance statistics. Note: The command displays statistics after the job completes. If the job is still running, it displays a message instead. |
VOV-15409 | Accelerator | None | The Accelerator User Guide has been updated to reflect the new Job I/O Profiling feature. |
VOV-15610 | All | CS0384959 | Administrators can configure $VOVDIR/local/equivalences to map symbolic names to physical paths. The template file $VOVDIR/etc/equivalences now includes comment lines that describe how to use this configuration file, along with a note informing admins that the $VOVDIR/local/equivalences file should not be removed, or unpredictable results could occur. |
VOV-15803 | Accelerator, Accelerator Plus | None | Accelerator Plus now supports configuring multiple instances of the vovwxd daemon to dynamically request taskers from other job schedulers or base queues. Daemons can be started, configured, and managed via SWD/autostart/start_wx_daemon.tcl, the vovdaemonmgr utility, and vovwxconnect.tcl, the same as for a single vovwxd daemon. For example, vovdaemonmgr start -v -v -v -v vovwxd vovwx2d would start the vovwxd daemons configured in SWD/vovwxd and SWD/vovwx2d, respectively. |
VOV-15825 | All | CS0411539, CS0433472 | Added the config(tasker.minWaitToReconnect) parameter in policy.tcl, which specifies how long a tasker should wait and try to reconnect to the server before initiating the failover election process. |
VOV-15956 | Accelerator, Monitor | None | Improved logging messages for registry permission errors and RDS startup with an invalid Altair Monitor name configured. |
VOV-16009 | Accelerator | None | Added the ability to create reservations for user groups. |
VOV-16041 | Accelerator | None | The capability for REST authentication via VOV security keys has been added. The new authorizeWithKey() function added to the vov_rest_v3.py Python module issues a REST JWT Access Token, and takes as input a private user VOV key and the public vovserver VOV key. See vovsecurity -h for more details. |
VOV-16140 | Accelerator | CS0448878 | A systemd script, altair-vovtasker.service, was added to $VOVDIR/etc/boot. It starts a tasker named mytaskername in the vnc project as user vovadmin at startup. Previously, Accelerator shipped only an initd version of this script, with no systemd support. |
VOV-16169 | Allocator, Monitor | None | SSL/TLS is now enabled by default for the VOV products LM and LA. These products will open webports by default, and the web UI will have URLs beginning with "https:". |
VOV-16190 | All | None | A new trace parameter has been added, named trustUserReportedByClient, which is set to 0 by default. Actual client uid and gid values are sent by vov clients and are checked server side on protocol startup if the flag is set to zero. If the values do not match, an error is issued to prevent rootless container mode usage under vovserver. |
VOV-16222 | All | None | Added a new scheduler policy parameter, taskerBusyUponDispatch, to control whether the scheduler sets the tasker state to "BUSY" upon dispatching a job to it. Short job scheduling performance may be improved by setting this parameter to 0. Additionally, system taskers are no longer added to the "recent taskers" list. This avoids an unwanted delay in job dispatching for queues that do not have many taskers connected to them. |
VOV-16259 | Accelerator, Monitor | None | A secure communication mode is implemented for communication between Accelerator's RDS service and Monitor's event port. The secure communication is enabled by setting the rds.secure configuration parameter to 1 and following the documented steps for proper configuration. |
VOV-16264 | Accelerator Plus | CS0464266 | Implemented the -forcedequeue option on nc run. Like the -force option, it will place prior jobs that use the same -l output log file into sleeping state. If the prior job is currently running, the -force option will have no effect, but the new -forcedequeue option will stop the prior running job and put it into sleeping state. See nc run -h for details. The use of this new option is not recommended. |
VOV-16266 | All | None | Added FSRANK, which is the FairShare rank of the FairShare group to which a job belongs, to SDS metrics for buckets. |
VOV-16303 | Accelerator | None | The vovsecurity command is added to manage VOV security keys. See the vovsecurity -h help screen for details. |
VOV-16338 | Accelerator | CS0474595 | A PTY port range for the run -I commands can be specified in ${VOVDIR}/local/vncConfig/${VOV_PROJECT_NAME}.tcl. For example, setenv VOV_PTY_PORT_RANGE 13300:14299 would specify the default range. If a run -I job is submitted and no PTY port in this range is available, the submission fails and no job is added. |
VOV-16341 | All | None | Added a new SDS topic, jobstats, containing cpu, ram, and io usage stats for jobs. |
VOV-16343 | Accelerator | None | Enhanced the scheduler to skip jobs that conflict with a future tasker or resource reservation so that subsequent non-conflicting jobs can run. This behavior can be controlled by the server config and policy parameters skipConflictingJobsInDispatchLoop and unsetSkipFlagOnConfictingJobs. |
VOV-16355 | All | None | Support for RHEL 9 and equivalent Linux distributions (Rocky, Alma, and OEL) has been added to the Altair Accelerator products. |
VOV-16364 | Accelerator, Monitor | None | Recent feature improvements caused the LM DB configuration to fail upon restart after an early server failure. This has been fixed. |
VOV-16367 | Accelerator, FlowTracer | None | When RDS is active, matching of jobs to license checkouts is now able to utilize PID information for software licensing systems that track and report the PIDs of the processes that make checkouts. |
VOV-16409 | Accelerator | None | The maximum allowed value of the schedMaxEffort policy parameter has been increased to 90. |
VOV-16472 | Monitor | None | The ldap.cfg LDAP configuration file in SWD/config supports a new option to require that an LDAP service that Monitor binds to uses CA-signed SSL/TLS certificates. This mode is off by default. To activate it, set the following in ldap.cfg: set LDAP(validCertificate) 1 |
VOV-16485 | Accelerator, Accelerator Plus | None | The VOV Reference Guide has been updated to include information regarding the new VOV Security Keys feature. |
VOV-16490 | Monitor | CS0499980 | The procedures for upgrading a database in Monitor using Windows have been added to the Installation Guide online help. |
VOV-16577 | FlowTracer | None | The FlowTracer web UI has a new flow graph viewing and control page. This web page is a modernized alternative to using the vovconsole thick client for interaction with FlowTracer flowgraphs. |
VOV-16583 | Accelerator | None | Separated read and write results. Added Effective BW and Latency/Op to the output of nc info -ioprofile. |
VOV-16636 | Monitor | None | Fixed an issue with download icons missing in the HTML format of the resulting batch report. The HTML format of the batch report is now in line with the style in the browser. |
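
As a rough illustration of the LOW:HIGH format used by VOV_PTY_PORT_RANGE in the VOV-16338 entry above, the following Python sketch parses such a value and reports how many ports it spans. The helper name parse_port_range is hypothetical and not part of the product, and treating both ends of the range as inclusive is an assumption.

```python
def parse_port_range(spec: str) -> range:
    """Parse a 'LOW:HIGH' string into a range of ports.

    Hypothetical helper for illustration only. It assumes both
    bounds are inclusive, which the release note does not state.
    """
    low_text, sep, high_text = spec.partition(":")
    if not sep:
        raise ValueError(f"expected LOW:HIGH, got {spec!r}")
    low, high = int(low_text), int(high_text)
    if not (0 < low <= high <= 65535):
        raise ValueError(f"invalid port range: {spec!r}")
    return range(low, high + 1)


ports = parse_port_range("13300:14299")
print(len(ports))      # 1000 ports under the inclusive assumption
print(13300 in ports)  # True
```

A check like this can help size the range against the expected number of concurrent run -I jobs before setting the variable.
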
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-14506 | Accelerator | None | Removed the non-functional "Authorize" button from the REST API interactive documentation page, aka the "Swagger" page. |
VOV-14788 | FlowTracer | None | Fixed a problem in FlowTracer whereby an attempt to kill a job would result in multiple bkill requests to the LSF base queue. |
VOV-15649 | Accelerator | None | Fixed the diagnostic error messages reported by Accelerator products when an Altair standalone license file is in use that is valid on a different hostid. The reported error now correctly says "HostID does not match license". |
VOV-15756 | All | None | Improved the documentation about properly configuring and running the Altair License Manager server with Accelerator products. |
VOV-15817 | All, Accelerator | CS0332038, CS0475775 | Fixed an issue that prevented secure LDAP connections from the Monitor vovserver. Secure, SSL/TLS-enabled connections are attempted if LDAP(ssl) is set to 1 in the ldap.cfg configuration file. |
VOV-15818 | Accelerator | CS0413116 | Fixed a bug in the Windows installer that caused an incomplete installation with a missing tcl/vtcl/ package. |
VOV-15968 | Accelerator, Accelerator Plus | CS0435337 | Added support for controlling the minhw server parameter in policy.tcl. When calling the vovservermgr config enableWaitReasons command, old waitreasons will now be deleted when disabling, and waitreason counts will be updated when enabling. |
VOV-16006 | Accelerator | CS0418446 | Running and rerunning a job with nc run and nc rerun no longer causes vovserver to perform a stat call. The same applies to vovconsole and the node editor. |
VOV-16027 | Accelerator Plus | CS0424970 | Wx taskers will now update fields such as memory, RAM free, swap free, tmp space, load averages, and idle time after the tasker's maxlife has been reached. |
VOV-16139 | Monitor | CS0449822 | Fixed issue that caused the name of a monitor agent to be ignored, resulting in the default name (the host name) being used. |
VOV-16184 | FlowTracer | CS0449650 | Suppressed the bash function parsing messages that were appearing when using the ves command. |
VOV-16191 | Hero | CS0457702 | Significantly reduced the run time taken by the vovemulmgr config command on large configurations. Added the emul.cfg Zebu emulator-specific option CONNECTIVITY = "extended". Updated vovemulmgr config to allow scheduling to continue during the config operation. |
VOV-16199 | Monitor | CS0421670 | Monitor customer group names longer than 80 characters are now allowed in reports. |
VOV-16206 | Accelerator, Accelerator Plus | CS0461223 | Fixed a vovserver crash that happened when multiple parallel REST job run requests (with different resources) were submitted. |
VOV-16215 | Allocator | CS0461778 | If a site is present in the Allocator config file and the user edits the site nickname, the result is now visible on the Allocator Resource plot page without stopping and restarting the project. |
VOV-16221 | Accelerator, Accelerator Plus | CS0463085 | Fixed incorrect vovselect tool value when using vwi & hero_adapter. |
VOV-16224 | All | None | The TCL/TK language interpreter package that is packaged with and used by Accelerator products has been upgraded from version 8.6.5 to 8.6.13. |
VOV-16230 | Monitor | CS0462741 | In Monitor, when adding a new monitor via the UI (Admin > Monitors), a value wrapped in double quotes in any of the available fields no longer breaks the confil_aux.tcl file where the values are stored. |
VOV-16236 | Accelerator, Accelerator Plus | CS0464846 | Fixed a potential log error when bogus vov protocol packets are sent to vovserver via a security scanning tool. |
VOV-16239 | Accelerator, Accelerator Plus | CS0464909, CS0464910 | Fixed issue that prevented DNS lookups from succeeding when using the portable architecture on CentOS 6. |
VOV-16254 | Monitor | None | Fixed issue that prevented tasker load graphs from being rendered on the Machine Load page that is located under the Network tab. |
VOV-16263 | FlowTracer | None | Fixed issue that prevented jobs from being dispatched when a base-queue-only taskerlist was requested. |
VOV-16273 | Accelerator | None | To address a potential SSL vulnerability, SSL/TLS renegotiation was disabled in the internal webserver when support for older TLS/SSL versions than TLS1.3 is enabled. |
VOV-16274 | Accelerator | None | Made guards against malicious HTTP GET file requests more strict. |
VOV-16289 | Monitor | CS0467401 | Fixed issue with the Checkout Statistics report with cost reporting enabled where report numbers were elevated. |
VOV-16290 | Accelerator | None | When a user browsed the NC dashboard page URL before having logged in, a blank dashboard page was displayed instead of sending the browser to the login page. This has been fixed. |
VOV-16315 | Accelerator | None | Enhanced the scheduler to skip jobs that conflict with a future tasker reservation so that subsequent non-conflicting jobs get dispatched. This can be controlled using the policy parameter skipConflictingJobsInDispatchLoop. |
VOV-16328 | Accelerator Plus | None | When a WX queue was connected to a base NC queue that has SSL enabled (default in recent releases), some spurious "Connection failed" error messages were printed in SWD/vovwxd/*.log. This was fixed. |
VOV-16340 | All | None | The ROLLOVERTS field has been added to property events in the vov-jobdata topic. |
VOV-16371 | Accelerator, Accelerator Plus, Hero | None | Single-slot taskers will now report their status as "full" instead of "working" when a task is running on them. |
VOV-16381 | Accelerator | None | Fixed an issue where specifying an incorrect value for a PERCENT resource in an nc run command, for example nc run -r PERCENT/abc -- sleep 1, would give an inconsistent error message. |
VOV-16387 | Monitor | None | When webserver=internal is set and webport is enabled, files such as batch reports that are placed in SWD/html would fail to be served, resulting in a NOT FOUND error when requested. This has been fixed. |
VOV-16413 | All | None | The version of the TCL TCLLIB library included in the Accelerator products package has been updated to version 1.21. |
VOV-16432 | FlowTracer | None | Fixed issue that prevented splines from being rendered in vovconsole on ARM hosts. |
VOV-16439 | Accelerator, Accelerator Plus | CS0473552 | On Linux, system primitives that are interrupted with EINTR are now retried and should no longer cause a disconnect from the server. |
VOV-16440 | Accelerator | None | Added support for relative paths in the installation utility install.sh. |
VOV-16441 | Accelerator | None | Fixed an issue in the installer script install.sh that caused the -platforms option to honor only 1 of multiple values passed in. |
VOV-16446 | Monitor | CS0483193 | Fixed an issue with the ftlm_batch_report utility that would produce two different representations of the tree map for the same feature. The tree map graph for a feature plotted on its own differed from the graph for the same feature when all features of the tag were plotted. |
VOV-16448 | Accelerator | None | The robustness of complex Distributed Parallel jobs with many components was improved. The nc rerun command now supports the -after <time> option. Jobs that exit with one of the special exit status values between 201 and 215, which cause automatic rescheduling, now have their logs appended to rather than overwritten. Added vtk_prop_decr_and_get to provide symmetry with vtk_prop_incr_and_get. |
VOV-16470 | Accelerator | None | The "http" or "https" part of the URL displayed on the REST documentation page was wrong in some circumstances. This was fixed. |
VOV-16479 | Allocator | None | Removed the requirement that the LA project be enabled before using the lamgr reset command with the -name PROJECT option. |
VOV-16484 | Monitor | CS0419423 | In the LM product, the display of all text-edit boxes has changed. Text-edit boxes are now displayed at the maximum possible width (full screen width) and text lines are not wrapped. If lines are longer than the space available, a horizontal scroll bar appears to show all the content. |
VOV-16494 | Accelerator | None | On systems where the hostname command is configured to print the long fully qualified host name, the nc info -ioprofile JOB command could not successfully find the I/O profiling results that had been generated by a job launched with the -ioprofile option. This has been fixed. |
VOV-16497 | Accelerator | CS0500300 | If the default vovtriggerd trigger callback is used (triggerCallBack), the TRIGGER property must be set to a Tcl proc defined in vovtriggerd/config.tcl. This is to improve security. |
VOV-16501 | Accelerator | None | The following improvements were made to the nc gui -ioprofile JOB display: |
VOV-16502 | Accelerator | None | The following improvements were made to the nc gui -ioprofile JOB display: |
VOV-16506 | All | None | Fixed an issue with the new ROLLOVER_TS implementation where existing jobs might get a new ROLLOVER_TS in the Kafka event. |
VOV-16541 | All | None | Fixed issue that caused the scheduler to ignore tasker HW resources that have both letters and numbers in their name. |
VOV-16550 | Accelerator, Accelerator Plus, Monitor | None | Fixed issue that prevented Monitor non-admin users from generating and viewing historical license usage reports. |
VOV-16573 | Accelerator | None | Changed the -ioprofile data labels to use more appropriate terminology. |
VOV-16591 | Allocator | None | The lamgr reset command is supposed to reset the LA project enabled in the current shell, but instead it tried to reset LA project "la". This has been fixed. |
VOV-16599 | Accelerator | None | Added -l compatibility to nc run -ioprofile. A log file name can be specified using the -l option when submitting a job with the I/O profiling option. |
VOV-16606 | Accelerator | None | Fixed a crash that occurred when resources for REST job create requests were longer than 1024 characters. |
VOV-16628 | Accelerator | None | Fixed an issue where RDS could cause a server crash when removing/restarting old jobs. |
Previous Releases
2023.1.2 Release Notes
New Features
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-14464 | Accelerator | None | Removed PERCENT/1 from the minimum resource requirements for a job. From this release onward, the PERCENT resource will need to be used explicitly to limit the number of jobs running on a Tasker. Jobs will no longer request the PERCENT resource by default. This change allows more than 100 jobs to be run concurrently on a single Tasker. |
VOV-14728 | Accelerator | None | Fixed issue which could result in RDS missing some checkin/checkouts that occur during startup and changes in configuration in SWD/resources.cfg. |
VOV-14810 | Accelerator | None | Preview feature: Added functionality to deliver Mistral results to Accelerator. A job can be run with the -ioprofile option, which generates a log file. Visualization has been added in the NC GUI for a few of the measured labels from the log file. Below are the steps: |
VOV-15253 | Accelerator Plus | CS0351320 | Multiple "Deleting failed tasker" alerts now increase the alert count rather than generating distinct alerts. |
VOV-15508 | Accelerator | CS0390432, CS0438406 | Added support for VOV user groups in reservations. The vtk_reservation_create Tcl procedure now accepts a -usergroup option for specifying the name of a VOV user group. |
VOV-15770 | All | None | When RDS is active, Monitor now uses a new port, the "event port". The lmmgr start command has a new option -eventport. The option -upport is no longer available. |
VOV-15889 | Accelerator | None | Improved error diagnostic messages for syntax errors in RDS configuration file resources.cfg by including the line number where the error is detected. |
VOV-15909 | FlowTracer | None | Implemented callback to customize functionality for vovconsole status 'Force Validate...', 'Skip...' and 'Waive Exit Code...'. |
VOV-15919 | Hero | None | The FREE_RESOURCES preemption rule type for Hero has been added to the online help. |
VOV-15946 | Accelerator, Accelerator Plus | None | The live_keepfor_jobs.tcl file has been deleted, and the implementation that cleans up the keepfor jobs has been moved to vovserver. It can be controlled by the following policy parameters: keptJobsCleanupChunkSize and keptJobsCleanupInterval. |
VOV-15982 | Accelerator, Monitor | None | RDS no longer uses the init port. The lmmgr -initport option and the INIT_PORT parameter in resources.cfg are deprecated. |
VOV-16007 | Accelerator, Accelerator Plus | CS0439498 | A new callback has been added that can be used in vnc_policy.tcl. The procedure VncPolicyValidateOptions has been added which takes a sub-command name and a list of options to verify. This procedure is expected to return a modified list of options. |
VOV-16022 | All | None | Added a field 'ROLLOVERTS' in the trace object which can be used in the query to address the jobid uniqueness issue. |
VOV-16031 | FlowTracer | None | Implemented callback to customize functionality for vovconsole status 'Invalidate...'. |
VOV-16036 | Accelerator Plus | None | The online help has been updated to include Azure output parameters for Streaming Data Services. |
VOV-16046 | Accelerator | None | With a Tasker set to autokillmethod=direct, autokill will honor signal specifications in NC_STOP_SIGNALS and NC_STP_SIG_DELAY as well as VOV_STOP_SIGNALS. Multiple signals are comma separated. Each "signal" can be a signal name such as "USR1", or an EXT-like signal specification using the EXT:SIGNAL:includerx:excluderx:skiptop format. The EXT-like signal specification can also omit the leading "EXT" so long as it begins with a colon. The default signal list for autokillmethod=direct is TERM,HUP,INT,KILL, but the default can also be controlled by defaultStopSignalCascade in policy.tcl. defaultStopSignalCascade does NOT support the EXT signal format. It supports only a comma-separated list of signal names, and it has been this way for quite some time. |
VOV-16134 | All | None | The batch_install.csh installation script for Linux is no longer provided with Accelerator product installation media. The new CLI-based installation method on Linux is to use the install.sh script with its -batch option. The new install.sh -batch method is compatible with batch_install.csh, with the following 2 exceptions: |
VOV-16140 | Accelerator | CS0448878 | A systemd script, altair-vovtasker.service, was added to common/etc/boot. It starts a Tasker named mytaskername in the vnc project as user vovadmin at startup. |
VOV-16141 | All | None | Implemented an event to Kafka on project stop. |
VOV-16143 | Accelerator, Accelerator Plus | CS0448778 | Procedure for enabling client side logging using RabbitMQ has been added to the online help. |
VOV-16147 | Monitor | None | Display a success indicator when assigning a feature alias or cost in the web UI. |
VOV-16150 | Accelerator, Monitor | None | The online help has been updated to include the removal of the requirement of the init and update ports for RDS. The event service port is now being used. |
VOV-16176 | All | None | Updated the Altair License Server that is included with Accelerator Products packages to version 15.2.0. In addition, the Accelerator software has been built with version 15.2.0, and requires License Servers to be version 15.2.0 or higher. If you are using Altair floating licenses with Accelerator products, then this new version of Accelerator will require that the Altair license server be stopped, upgraded to the version 15.2.0 software included with Accelerator, and restarted. |
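
To make the signal-list format described in the VOV-16046 entry above more concrete, here is a minimal Python sketch that splits a comma-separated list into plain signal names and EXT-like specifications. It is not the product's parser. The StopSignal class and parse_stop_signals function are hypothetical names, the field values in the second example are made up, and the sketch assumes the regular expression fields contain no colons or commas.

```python
from dataclasses import dataclass


@dataclass
class StopSignal:
    """One entry of a stop-signal list (hypothetical structure)."""
    name: str
    extended: bool = False
    include_rx: str = ""
    exclude_rx: str = ""
    skip_top: str = ""


def parse_stop_signals(spec: str) -> list[StopSignal]:
    """Split a comma-separated signal list into entries.

    Entries that start with 'EXT:' or with a bare ':' are treated
    as EXT:SIGNAL:includerx:excluderx:skiptop specifications.
    Everything else is treated as a plain signal name.
    """
    signals = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if item.startswith("EXT:") or item.startswith(":"):
            fields = item.split(":")
            # Pad missing trailing fields with empty strings.
            fields += [""] * (5 - len(fields))
            _, name, include_rx, exclude_rx, skip_top = fields[:5]
            signals.append(
                StopSignal(name, True, include_rx, exclude_rx, skip_top)
            )
        else:
            signals.append(StopSignal(item))
    return signals


# The default cascade quoted in the release note.
print([s.name for s in parse_stop_signals("TERM,HUP,INT,KILL")])

# A made-up EXT-like entry with the leading "EXT" omitted.
print(parse_stop_signals(":USR1:foo:bar:1")[0])
```

Because defaultStopSignalCascade accepts only plain signal names, a list intended for that parameter should produce no entries with extended set to True in a check like this.
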
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-12474 | Accelerator | None | Fixed an issue that caused an incorrect start time and duration for resource reservations. |
VOV-13992 | Accelerator | None | An issue was found when using the webserver with SSL enabled and when using host-specific certs, where self signed cert files were also generated even though they were not needed. vovserver would still use the correct cert files but creating the self-signed files added some confusion. This has been fixed. |
VOV-14462 | Accelerator, Accelerator Plus | CS0413928 | PostgreSQL and SMTP notification passwords are returned in obfuscated form by vtk_prop_get if the user is not a project owner. |
VOV-15733 | Accelerator | None | Fixed some cases where the incorrect HTTP status code was returned for codes other than success (200). Specifically, with "webprovider=internal", some incorrect HTTP requests would return code 500 rather than 400, or other appropriate codes. |
VOV-15740 | Accelerator | None | In recent releases, Accelerator jobs launched with an explicit "-r percent/0" resource specification for "percent" were flagged as errors. Starting in this release, 0 of the "percent" resource will be the default. The "-r percent/0" specification on job launch will continue to be flagged as a command syntax error. Users should simply remove "-r percent/0" from the job launch if 0 percent resources are desired. |
VOV-15757 | FlowTracer | CS0404497 | Added the buckettime and maxswap job fields to the list of fields that are persistent across restarts. |
VOV-15806 | Accelerator Plus | CS0410831 | The fields REQSWAP, REQSLOTS and REQRAM now report a sum of all requested resources rather than the amount of the first request, e.g. "SWAP/10 SWAP/20" will report a REQSWAP field of 30, where it previously would have reported 10. |
VOV-15822 | Accelerator | CS0414459 | Calls to vovprop set and vovprop del will now check object ACLs for proper authorization before modifying object properties. |
VOV-15828 | FlowTracer | CS0416354 | Fixed issues related to the automatic zipping and unzipping of FILE, FILEX, and PHANTOM databases. |
VOV-15836 | Monitor | None | Fixed an issue where the expiration of served licenses was not being displayed by Monitor in the web UI. |
VOV-15851 | All | CS0187053 | On Linux-based platforms, vovdaemonmgr now configures some environment variables internally to speed up interactions with the vovps utility in situations where a UID exists on the system without an associated user entry in the database. |
VOV-15856 | Monitor | CS0419423 | In Monitor, when you click any config file to edit at Admin/System/Configuration Information, it opens a text box that now is again resizable in both dimensions (vertically and horizontally). |
VOV-15877 | Accelerator | None | Fixed a Monitor life-support issue that occurred when RDS is active. The symptom was the zeroing out of out-of-queue (OOQ) usage numbers for license resources when Monitor went offline or went down. The problem only occurred for Altair License Manager (aka LMX) license servers being monitored by Monitor. |
VOV-15885 | Accelerator, Accelerator Plus | CS0418925 | A DEQUEUE capability, similar to the STOP capability, has been added to the ACL system. The DEQUEUE capability is needed for permission to use nc stop to change a queued job into idle state. |
VOV-15886 | Accelerator Plus | CS0453249 | It was possible for two vovwxd daemons to be started for an Accelerator Plus queue, with unpredictable results. This is now fixed, and enforcement is now in place to prevent a second vovwxd daemon from being started. |
VOV-15893 | Monitor | None | Fixed an issue in Monitor in which the "View Widget Alone" button and the "Help" button were overlapping with the text placed above the graphs. |
VOV-15913 | Monitor | CS0428850 | Updated the FTLM parser to parse the server version in all formats. |
VOV-15926 | All | None | Fixed error message that was displayed in license violation messages. |
VOV-15947 | Accelerator | None | Fixed a bug that caused a Tasker to remain in the BUSY state when it reconnected after a network failure. |
VOV-15962 | Monitor | None | Fixed a memory leak in Altair licensing libraries that had caused the vovserver process associated with Monitor projects to grow by as much as 750 MB per month when licensed with Altair License floating licenses. This fix was accomplished by updating the Altair License management software SDK used to build Accelerator products software to version 15.2.0. |
VOV-15968 | Accelerator, Accelerator Plus | CS0435337 | Added support for controlling the minhw server parameter in policy.tcl. |
VOV-15978 | Allocator | CS0440853 | In the Allocator Overview page: Added an ordinal number to each row in the first column. When the text is longer than the space available, each table cell can be scrolled to see the whole content. By default, the Resource column is ordered as per the order in the config file. In the Allocator Resources Summary page: Added an ordinal number to each row in the first column. The Unassigned row has been fixed to always be the last one. Added ascending and descending sorting. By default, the Nickname column is ordered as per the order in the config file. |
VOV-15999 | FlowTracer | CS0432304 | Fixed an issue in .gz file reading that resulted in an empty file in the node editor. |
VOV-16001 | Accelerator Plus | CS0439073 | Added a VOVDIR parameter in vovnc.tcl, which allows you to specify the installation location for wxagent. This can be used in multi-platform setups. |
VOV-16005 | Allocator | None | Resolved a Tcl stack trace in lamgr reset due to earlier restructuring of the Monitor data stream. |
VOV-16010 | Accelerator | None | The command vovtaskermgr reserveshow and related commands, for example vovshow -reservations, now support user group reservations and will display the code S=<groupname> in the "For" column of their output. |
VOV-16013 | Monitor | None | Fixed an issue in Monitor in which a changed database path location was not being saved. |
VOV-16015 | All | None | Added a post-installation script to automate the creation of a "portable" architecture directory for linux64, named linux64p. This portable architecture contains required system libraries and modified binaries that point to them instead of relying on those on the host. The intent of this is to provide an architecture that will run on Red Hat 6 or equivalent. The script, $VOVDIR/../scripts/install-portable-arch.sh, requires the patchelf utility to be installed on the host running it. The reference configuration is based upon RHEL 7.9 or equivalent, and thus the script will generate and utilize a configuration file (portable-arch.cfg) in the CWD that contains paths to required system libraries specific to that distribution. The configuration file can be adjusted to reflect differences in library locations and/or versions, but note that significant differences in versions may result in the portable architecture being unusable. The configuration file contains a full library path and an optional symlink name on each line. The optional symlink name is unused for the first entry and follows Linux library naming conventions for the remaining entries. To utilize the new linux64p architecture, you must set the VOV_PORTABLE_ARCH environment variable to 1 prior to sourcing the vovrc script that sets up your shell to work with a VOV installation. To summarize the process: |
VOV-16016 | Accelerator | None | A job now needs ATTACH permission on its fsgroup ACL in order to run. If the -G parameter of the run command results in an fsgroup being created, the current user will be the owner of the fsgroup. |
VOV-16025 | FlowTracer | None | Fixed vovproject destroy error "can't read "vncConfigRegistry": no such variable". |
VOV-16033 | Accelerator | None | When an existing NC queue is upgraded from a 2020.x or prior version to a newer version, the doTestHealthCheckDownSlaves and doTestHealthCheckSlaveset vovnotifyd health checks will not be automatically replaced with the corresponding health checks in the "Tasker" lexicon. To enable these health checks with the new names doTestHealthCheckDownTaskers and doTestHealthCheckTaskerst, disable each health check in the web UI, wait a few seconds, and then re-enable it. |
VOV-16055 | All | None | Fixed handling of the "create" provisioning method for the "local" directory when installing in batch mode. |
VOV-16059 | All | None | Fixed a potential crash when running a vovselect, vtk_select_loop, or related query that selects from a named set that is subsequently deleted before the query finishes processing, for example "vovselect id,status from temporarySet". |
VOV-16144 | Accelerator Plus | CS0435310 | Accelerator Plus now ensures that the Accelerator Plus' version of the vov utilities are used in the job pipeline when dispatching to a different version Accelerator queue. This prevents errors such as features being used that are not present in an older version base queue. |
VOV-16146 | Monitor | None | Fixed an issue on a Monitor web UI page whereby the Cost column was incorrectly calculated in some cases. The correct value of Cost is the "Total" column value multiplied by the hourly cost per token. |
VOV-16154 | All | None | Fixed an issue where the login link on the NC guest page did not honor the SSL setting and always used HTTP. |
VOV-16160 | Accelerator Plus | CS0453463 | Taskers will no longer consider themselves "idle" for the purposes of max idle calculations when they only have suspended jobs. |
VOV-16172 | Accelerator | None | Fixed an issue that caused vovresourced to exit when configured to interface with a non-existent instance of Monitor. |
VOV-16195 | Monitor | None | The URL shown in batch reports has been fixed to point to the correct host and port per the web server configuration. |
VOV-16211 | FlowTracer | None | The FlowTracer web UI login page is kept as is. A button in the header now links the old UI with the new one, similar to how the NC dashboard is linked. |
VOV-16220 | Accelerator | None | Renamed -mistral to -ioprofile. To run a job with
Mistral or use the GUI to see Mistral results, add the
-ioprofile option. Example:
|
VOV-16254 | Monitor | None | Fixed an issue that prevented Tasker load graphs from being rendered on the Machine Load page that is located under the Network tab. |
VOV-16261 | FlowTracer | None | Fixed an issue that prevented arrows from being rendered in vovconsole node graphs. |
VOV-16281 | Accelerator, Accelerator Plus | CS0468254 | Fixed a bug that caused fatal errors in vovserver when servicing older versions of vovwxd clients configured for DirectDrive. This happened when FlowTracer projects or Accelerator Plus queues were configured with Direct Drive and an Accelerator base queue. |
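
The Cost rule stated in VOV-16146 can be illustrated with a short arithmetic sketch. This is not Monitor code; the function and parameter names are hypothetical, and the sample values are made up for illustration.

```python
from decimal import Decimal


def checkout_cost(total, hourly_cost_per_token):
    """Return Cost as described in VOV-16146.

    Cost = "Total" column value x hourly cost per token.
    Decimal is used so that currency-like values do not pick up
    binary floating-point rounding noise.
    """
    return Decimal(str(total)) * Decimal(str(hourly_cost_per_token))


if __name__ == "__main__":
    # Hypothetical inputs: a Total of 12.5 and a cost of 0.40 per token per hour.
    print(checkout_cost("12.5", "0.40"))
```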
2023.1.2-p1 Release Notes
New Features
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-16264 | Accelerator Plus | CS0464266 | Implemented dequeue/stop of jobs contributing to the same file. This feature can be enabled
using the nc run -forcedequeue option. |
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-16191 | Hero | CS0457702 | Significantly reduced the run time taken by the vovemulmgr config command
on large configurations. Added the emul.cfg
Zebu emulator specific option CONNECTIVITY =
"extended" . Updated vovemulmgr
config to allow scheduling to continue during the
config operation. |
VOV-16206 | Accelerator, Accelerator Plus | CS0461223 | Fixed a crash that happened when multiple parallel REST job run requests (with different resources) were submitted. |
VOV-16236 | Accelerator, Accelerator Plus | CS0464846 | Fixed a potential log error when bogus vov protocol packets are sent to vovserver via a security scanning tool. |
VOV-16340 | All | None | The ROLLOVERTS field has been added to property events in the vov-jobdata topic. |
VOV-16419 | Allocator | CS0485605 | Fixed issue that caused raw checkout data to be printed in the server log. |
VOV-16433 | Accelerator, Accelerator Plus | CS0487100 | The vovforget -allemptysets option has been restored for project admins. |
VOV-16439 | Accelerator, Accelerator Plus | CS0473552 | On Linux, system primitives interrupted by EINTR are now retried and should no longer cause a disconnect from the server. |
VOV-16493 | Accelerator Plus | None | Fixed a bug when displaying a job's wait reason in WX connected to a base queue for Classic and DirectDrive modes. The vovsh binary, vtcl directory, as well as vovnc.tcl or vovaccel.tcl used by vovwxd must be updated in swd in order to apply the fix. Fixed a bug of a missing autoforget flag in DD agent jobs. The base queue server must be updated for the fix to take place. |
VOV-16550 | Accelerator, Accelerator Plus, Monitor | None | Fixed issue that prevented non-admin users from generating and viewing historical reports. |
2023.1.2-p2 Release Notes
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-16191 | Hero | CS0457702 | Significantly reduced the run time taken by the vovemulmgr config command on large configurations. Added the emul.cfg Zebu emulator specific option CONNECTIVITY = "extended". Updated vovemulmgr config to allow scheduling to continue during the config operation. |
2023.1.1 Release Notes
New Features
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-14464 | Accelerator | None | Removed PERCENT/1 from the minimum resource requirements for a job. From this release onward, the PERCENT resource will need to be used explicitly to limit the number of jobs running on a tasker. Jobs will no longer request the PERCENT resource by default. This change allows more than 100 jobs to be run concurrently on a single tasker. |
VOV-14703 | All | None | SSL/TLS is enabled by default for VOV products Accelerator, Accelerator Plus, Hero, and FlowTracer. These products will open webports by default. |
VOV-14705 | Allocator | None | Added the ability for Allocator product instances to enable SSL, and enhanced lamgr
start to support the -webport and
-webprovider options. |
VOV-15052 | Monitor | None | Reworked the utilization plot tooltip to display the date/time and value at the mouse cursor's location. |
VOV-15200 | Accelerator | None | Implemented functionality to handle multiple active reservations, ordered by dominance. |
VOV-15289 | Accelerator | None | Modified SDS to handle the Azure schema ID (a mix of digits and letters). |
VOV-15346 | Accelerator | None | Added RDS match events to the stream. |
VOV-15545 | FlowTracer | CS0377941 | The vovconsole welcome splashscreen has been eliminated. |
VOV-15558 | Accelerator | None | Improved help message for the vovacl command to reflect all available agents and actions, and clarify some commands. |
VOV-15595 | Accelerator, Monitor | None | AVS formatted config files, such as resources.cfg, now accept C-style multi-line comments delimited by /* and */ . |
VOV-15601 | All | None | Streaming Data Service now includes bucket metrics in the event data that is published to Kafka. |
VOV-15613 | Accelerator | None | The use of Access Control Lists (ACLs) for resource maps is expanded to allow VOV user group (USERGROUP) agents. See the vovacl -h help screen for details. |
VOV-15648 | Accelerator, Accelerator Plus | None | Fixed issue in the create/edit job class form in Accelerator web based UI. Names entered in the form are stripped of leading and trailing white space. |
VOV-15657 | Accelerator | None | The following CLI commands with "slave" lexicon are removed. Use the corresponding commands
in the "tasker" lexicon instead. For example, instead of
vovslavelaunch, use
vovtaskerlaunch.
|
VOV-15691 | Accelerator | None | Implemented partial tasker reservations and partial lookahead reservations. |
VOV-15761 | Accelerator | None | Resource updates made through the tasker definition and vovtaskermgr configure are now handled. |
VOV-15765 | All | None | For some types of licenses, FlowTracer and Accelerator projects will want to communicate their license usage to an LM instance over the LM secure HTTPS web port. If the project cannot communicate with LM, a "Warning" alert will be posted and can only be cleared when LM communication has been established. |
VOV-15777 | Monitor | CS0404521 | Added a new checkbox option at the treemap level on the Detailed Plots page of the Monitor web UI. The option is called 'Hide Legend' and is checked by default, which hides the treemap legend; uncheck it to display the legend again. The purpose of this option is to reduce the treemap height so that the Detailed Plot section is visible without scrolling. |
VOV-15783 | Accelerator | None | Implemented partial tasker reservation creation based on multiple parameters like RAM, CORES,
SWAP, and SLOTS. The vtk_reservation_create
API has been updated with an option -resources
to reserve a given number of resources on a tasker. |
VOV-15801 | Monitor | None | Added the ability to assign each named license feature an hourly, per-token cost value that will be shown in the Checkout Statistics report if the Show Cost option is selected. The cost can be specified via the page, or via a new ftlm_feature_admin utility, which also provides the ability to delete and rename features. |
VOV-15815 | Monitor | None | The lmmgr loaddb command can now take an option to supply a time that is used as the start date for loading the database. The default is equivalent to lmmgr loaddb -start 1y, which loads data for the past year until now. |
VOV-15925 | All | None | Release 2023.1.1 of Accelerator products has dropped support for CentOS 6, SLES 12, Ubuntu 16.04, and Windows 7; and added support for Ubuntu 22.04. See the OS support matrix in the Release Notes for an up-to-date list of supported OSes. |
VOV-15939 | All | None | vtk_preemptrule_create_or_modify has been split into two commands: vtk_preemptrule_create and vtk_preemptrule_modify. The online help has been updated to reflect this change. |
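
As an illustration of the C-style comment delimiters mentioned in VOV-15595, the following minimal sketch removes `/* ... */` blocks from a piece of text. It is not the AVS parser, and the sample text is a placeholder rather than real resources.cfg syntax. The sketch also makes a simplifying assumption: it does not treat delimiters that appear inside quoted strings specially.

```python
import re

# Non-greedy match so that each comment ends at the first closing "*/".
# DOTALL lets "." span newlines, which is what makes the comment multi-line.
_BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_block_comments(text: str) -> str:
    """Return text with every /* ... */ block removed."""
    return _BLOCK_COMMENT.sub("", text)


if __name__ == "__main__":
    sample = "first entry\n/* this block\n   spans two lines */\nsecond entry\n"
    print(strip_block_comments(sample))
```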
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-8797 | Accelerator, FlowTracer | None | Previous versions of the software may have given conflicting information about full and overloaded taskers when requesting Why related information. This has been resolved in this release. |
VOV-13284 | Monitor | None | A syntax error was causing previous versions of the product to throw an error when reporting by feature/User in "usage trends." This has been resolved in this release. |
VOV-13801 | Accelerator | CS0204915 | The RDS syntax for controlling License resources and SUM resources will not lead to SUM resources being generated with maps to nonexistent feature resources. This issue is fixed in the new RDS-based resource management. |
VOV-14682 | Monitor | CS0295704 | When running in "Altair Monitor Basic" mode, no longer check out "user_licmon" features. |
VOV-15063 | FlowTracer | None | Fixed a FlowTracer issue on Windows, whereby the vovconsole tool was occasionally freezing when displaying a FlowTracer project. |
VOV-15185 | FlowTracer | None | Fixed some problems in vovconsole that were exposed by the use of site customizations to the vovconsole status bar. Added more comments so that ::VovGUI::configJobStatusBar is used appropriately. |
VOV-15470 | Monitor | CS0358260 | Improved the performance of inserting data into the database from CHK and DEN files, and of trimming the database. |
VOV-15508 | Accelerator | CS0438406 | Adds support for VOV user groups in reservations. The
vtk_reservation_create Tcl procedure now
accepts a -usergroup option for specifying the
name of a VOV user group. |
VOV-15536 | Monitor | None | Fixed an issue where version numbers were not parsed correctly by the LMX parser, and the License Version column for Altair License Manager checkouts was not properly displayed in the Monitor web UI. |
VOV-15553 | Accelerator | CS0376234, CS0404054 | Previous releases did not properly incorporate ACLs when applying transition modifications, with the effect that only the owner could modify most properties regardless of which ACLs were set for a given object. Now with proper ACLs set, all the properties with the exception of USER, OSGROUP, NAME or command line, ENV, DIR, and SUBMITHOST are available for modification with appropriate ACL permission levels. |
VOV-15562 | FlowTracer | CS0380536 | Modified vovproject create so that the default project setup.tcl is sourced before the vovserver is started. |
VOV-15605 | All | None | Windows now supports a command-line interface (CLI) installer for Accelerator products via
the new -batch option on
install.bat. The corresponding Linux
install.sh command has also been
enhanced to accept the -batch option.
Invoking install.sh ... -batch is now
equivalent to invoking the CLI installer script
batch_install.csh. The
install.bat and
install.sh utilities now have an
-h option to display complete usage
syntax. |
VOV-15617 | Accelerator Plus | CS0322667 | Fixed an issue where invalid launcher jobs (with an ID of 000000000) would get added to the user and jobclass sets, and cause the WX console to crash. |
VOV-15619 | Monitor | CS0388358 | A syntax error was causing previous versions of the product to throw an error when reporting by feature/User in "usage trends." This has been resolved in this release. |
VOV-15643 | Monitor | None | Fixed issue in Heatmap view in Monitor, where the show numbers option was not taken under consideration. If the box is unchecked, numbers are not displayed inside the rectangles. |
VOV-15664 | Monitor | None | Fixed a critical error that caused vovdb_util upgrade ... to malfunction on Windows. The root cause was related to locale. |
VOV-15667 | Monitor | None | Monitor help now opens in a new tab. |
VOV-15669 | All | None | RLM is no longer supported as a license management system for the Altair Accelerator product line. |
VOV-15670 | All | None | The XYNTService wrapper is no longer supported as a means of running Accelerator products as a Windows service. The only supported method for running as a Windows service is via the Single File Distributable (SFD) model. |
VOV-15677 | Accelerator Plus | CS0392770 | Added a new configuration parameter, taskerDisconnResvCleanupTimeout, to postpone the cleanup
of reservations on a tasker that was disconnected due to a connection
error, so that the reservations can be restored if the tasker reconnects
within the timeout. |
VOV-15720 | Accelerator Plus | None | vovwxd notifies you when the driver script or config.tcl is deleted. |
VOV-15721 | Hero | None | The list of Zebu placements generated now includes all combinations of units as opposed to
just contiguous placements as was the case previously. The
previous behavior can be restored by updating the relevant
section of the emul.cfg with the line
CONNECTIVITY = "sequential" . |
VOV-15725 | All | None | Fixed an issue that prevented burst licensing from working with ALM. |
VOV-15732 | Monitor | None | Fixed a server response error when editing file content on the cvs.cgi page. |
VOV-15736 | Accelerator | None | In the past, Accelerator jobs launched via nc run -r cresource#N for
a consumable resource "cresource" were allowed
to use the "#" character even though the use of "#" is
inappropriate for consumable resources like "cores" or "cpus".
The "/" character should be used to specify consumable
resources, for example "-r cpus/2". In this release the job
submit commands enforce the use of the "/" character in job
submissions with consumable resources. |
VOV-15775 | Accelerator | None | Developer related environment variable was removed from product documentation. |
VOV-15781 | Accelerator, Accelerator Plus | CS0403873 | Fixed an issue that caused vovreconciled to incorrectly grab extra license resources in response to "also" matches, increasing over time. |
VOV-15790 | Allocator | CS0406786 | Fixed issue that prevented subsequent addSites configuration lines from being processed if an invalid site was specified in a previous line. |
VOV-15799 | Accelerator | None | RDS Feature rules now accept a TOTAL attribute that overrides the total available for the corresponding feature resource map rather than using the total available from Monitor. This value will also be used in the sum resource map total calculation unless that is also overridden. |
VOV-15807 | Accelerator | None | Fixed issue with server-side tasker startup requests, prompted by the
-server option to vovtaskermgr
start. |
VOV-15811 | All | None | Call vovwait4server as part of the startup process for FlowTracer projects to ensure the vovserver is up and running before returning to the shell. |
VOV-15819 | Monitor | CS0392344 | Fixed issues with lmmgr loaddb that would skip some subdirectories when loading denial data. |
VOV-15827 | All | CS0412949 | Added robustness to code dealing with REST and web session key management. |
VOV-15837 | Accelerator | None | The source files information is removed from the binaries. |
VOV-15838 | Monitor | None | lmmgr loaddb will now pass a starting timestamp value to its internal call
to ftlm_capacity, instead of defaulting to "1
year ago", but the default time frame is still equivalent to "1
year ago", unless -start is explicitly passed
by the user to lmmgr loaddb . |
VOV-15852 | Accelerator | None | Fixed issue causing vovserver to hang with multiple partial tasker reservations. |
VOV-15871 | FlowTracer | None | Fixed an issue whereby some complex REST object queries were crashing vovserver. |
VOV-15886 | Accelerator Plus | None | Only one instance of vovwxd daemon can be launched for an AAP queue. Other
attempts to start a vovwxd process will fail
with an error. |
VOV-15916 | Accelerator | None | Fixed an issue with NC removal of license-based resources when the originating license server
(e.g., FlexNet) goes down. The NC license-based resources should
disappear automatically after the license server goes down, but
they persisted. The fix is in the new Resource Data Service (RDS)
resource management service, which ensures automatic deletion of
these resources. With classic resource management, a workaround
for the issue is to restart the vovresourced
daemon to effect the deletion of these resources. |
VOV-15921 | Monitor | CS0429534 | The treemap plots now include an Export button that allows you to export the plot in png, csv, or svg format, aligning the functionality with the plots available in the Feature Detailed Plots. In contrast with other plots, the Export buttons are located at the top of the plot instead of on the right. This is because only treemaps are placed in a row next to each other, whereas all the other plots are stacked on top of one another; placing the Export buttons at the top keeps them from interfering with sibling plots. The resulting csv file contains the following two columns: - 'group': the name of the group according to the option under 'Usage treemap report by' and 'Denial treemap report by'. This value may be truncated since the maximum length is set to 300 characters. - 'value': the value of the group. |
VOV-15979 | Accelerator Plus | None | Fixed issue that prevented autoreschedule from automatically activating for jobs that fail because they were dispatched to a tasker that was in the process of being shut down. |
VOV-15991 | All | None | Fixed issue that caused vovresourced to run with an elevated verbosity level, resulting in larger-than-normal log files. |
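
The separator rule described in VOV-15736 can be sketched as a small check. This is an illustration only, not the validation performed by the job submit commands. The set of consumable resource names is hypothetical and taken from the examples in the note ("cores", "cpus"), and the case-insensitive comparison is an assumption.

```python
# Hypothetical list of consumable resources, based on the examples in VOV-15736.
CONSUMABLE_RESOURCES = {"cores", "cpus"}


def uses_wrong_separator(spec: str) -> bool:
    """Return True if a consumable resource is requested with '#' instead of '/'.

    spec is a single resource request such as "cpus/2" or "cpus#2".
    """
    name, separator, _amount = spec.partition("#")
    return bool(separator) and name.lower() in CONSUMABLE_RESOURCES


if __name__ == "__main__":
    for request in ("cpus/2", "cpus#2", "cores#4"):
        verdict = "rejected" if uses_wrong_separator(request) else "accepted"
        print(f"{request}: {verdict}")
```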
Version-Specific Patch Releases
2023.1.0 Release Notes
New Features
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-3411 | Accelerator | None | vovselect TURBOBOOST from taskers is now available. Set to 1 if Intel Turbo
Boost is enabled on this tasker host, or 0 if disabled (Linux
only), otherwise set to NA. |
VOV-7894 | Accelerator | AAP21491 | Added Jobclass and Autokill columns to the Running Jobs table in the web UI. |
VOV-8024 | Accelerator | AAP21343 | Added the ability to specify a stop reason when stopping all running jobs for all users via the Running Jobs page. |
VOV-8642 | Monitor | None | The ftlm_tag_admin command line utility for Monitor is now described in the documentation. |
VOV-9900 | None | AAP24167 | An enhancement has been made for the NC family of managers (ncmgr, vncmgr, wxmgr and hemgr) to show the queue name and command action in the log file name if logging to the $VOVDIR/local/logs/* location. |
VOV-12011 | Accelerator | CS0120886 | A new ncmgr rehost subcommand is added to migrate an Accelerator vovserver from one host to another. See ncmgr rehost for details. |
VOV-14032 | All | CS0223251 | vncmgr/ncmgr stop action now supports a -freeze_nocpr option to save the PR file without compression for a freeze to potentially save time when working with very large PR files. |
VOV-14383 | Accelerator Plus | None | The Direct Drive documentation topic has been clarified with more information regarding formatting of the driver script. |
VOV-14395 | Accelerator | None | The vtk_server_config Tcl command has a new rds.reinitialize parameter to allow reinitialization of the RDS system from the command line. This parameter requires no additional arguments. |
VOV-14488 | Accelerator Plus | CS0260943 | Added new configuration parameters for fine-grained control of vovwxd. |
VOV-14514 | Accelerator, Accelerator Plus | CS0268091 | Clients that interact with VOV subsystems are "immediately" terminated if they do not have sufficient security privileges (as defined in security.tcl). The ANYBODY / READONLY security level is documented for users with minimum privilege. |
VOV-14577 | All | None | The default HTTP server, or webprovider, used by vovserver is changing to the internal webprovider in place of "nginx". This applies to the URL used to access the vovserver "web port". The -webprovider nginx option is available on ncmgr and other vovserver start commands to select the nginx webprovider if that is preferred. |
VOV-14701 | Accelerator | CS0292715 | You can now use vovselect to list tasker lists created with vovtaskerlist, and get information about tasker lists via the v3 REST API with the URL http://hostname/api/v3/taskerlists. |
VOV-14718 | Accelerator | None | Implemented backfilling resource reservation to handle conflicting jobs. This can be controlled by server config backfillResReservation, disabled by default. |
VOV-14774 | Accelerator, Monitor | None | Four new subcommands are added to the vovdb_util command for managing the Monitor or Accelerator PostgreSQL database: exportconfig, exportpasswords, importconfig, and importpasswords. See vovdb_util for details. |
VOV-14789 | Accelerator | AAP22106 | The use of access control lists (ACLs) for resource maps is expanded to allow specified
users or user groups to edit, reserve, and forget (delete) the
resource map object. An example of an ACL-setting command that
would be used for this objective is:
where
$id is the VOV id number for a resourcemap
object. Supported only if the resourcemap is created with
rank=0. |
VOV-14806 | Accelerator | None | Rapid Scaling 3.0 implemented in vovwxd and the Altair NavOps cloud connector. Supports in-cloud deployments of a cluster scheduled by the Accelerator workload manager with adaptive compute node allocation to adjust to job load. |
VOV-14863 | Accelerator Plus, Monitor | CS0312583 | Enabled the doTestHealthFailoverServerCandidates health check for Accelerator Plus and Monitor. |
VOV-15120 | Accelerator | None | Implemented backfilling resource reservation with non-conflicting jobs. This can be controlled by server config backfillResReservation, disabled by default. |
VOV-15352 | Accelerator | None | Implemented dispatch of deadline jobs on lookahead reservation activation. Supported argument -deadline in nc run to submit the job with an expected duration of completion. |
VOV-15404 | Monitor | None | Migrated the replacement of images in Monitor batch reports to Node. Only D3-based plots, which are part of the new look and feel of Monitor, are now supported. |
VOV-15429 | All | None | The default license manager for Accelerator products is changed to Altair License Manager in
the 2023.1.0 release. To revert to RLM, set
alm.enable to 0 in
policy.tcl. This change does not affect
licensing by the RTDA legacy keyfile license files, which are
still supported. |
VOV-15439 | Accelerator | None | Implemented the tasker properties hardbound and softbound to
restrict the scheduler to dispatching only autokill jobs on a
hardbound tasker, while autokill and xdur jobs may run on a
softbound tasker. Implemented the reservation properties hardfill and softfill to restrict the scheduler to backfilling a hardfill reservation with only autokill jobs, while a softfill reservation may be backfilled with autokill and xdur jobs. Modified the vovtaskermgr utility to support all four options. |
VOV-15548 | Allocator | None | Implemented Allocator Resource Summary page column 'Allocated' to show the "Number Of Resource Tokens Allocated Including OOQ Tokens To Each Site". |
VOV-15588 | Accelerator | None | An online help topic to describe AVS syntax has been added. |
VOV-15618 | Accelerator | None | DP component jobs now inherit the expected duration (-xdur) setting of the main job. |
VOV-15653 | Monitor | None | The PostgreSQL software included with Monitor and Accelerator products is upgraded to version 14.4. A database conversion to the new version is needed for projects that are upgrading and retaining an existing database. See Monitor and software installation documentation for database upgrade instructions. |
VOV-15656 | Accelerator | None | The "tasker" lexicon versions of Accelerator CLI commands were added in a recent software
release to replace the "slave" lexicon versions. The "slave"
lexicon commands are still available, but be advised of their
deprecation. The following CLI commands are deprecated and will
be removed in an upcoming release:
|
VOV-15661 | Accelerator | None | The following new server configuration parameters are added to control the new lookahead and
backfill scheduling policies. See
policy.tcl for descriptions:
|
VOV-15694 | Accelerator | None | Resource Data Service (RDS) is a replacement for vovresourced. RDS is provided as an opt-in preview feature. When the optional RDS operation mode is selected, Accelerator and Monitor communicate on new pub/sub channels. Resource management and license matching is performed by a new RDS thread of the vovserver process. |
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-11585 | Accelerator Plus | AAP25265 | Fixed a bug in vovwxd that caused incorrect "taskers max reached" alert. |
VOV-12137 | Accelerator | CS0120974 | nc modify now accepts the syntax to change the FairShare group of a queued job, and will allow the user to be different than the job's original FairShare user so long as the client has the correct ATTACH ACL on the new fsgroup. |
VOV-12766 | FlowTracer | None | Changes in functionality of vovforget. In Accelerator, Accelerator
Plus, Hero:
|
VOV-13525 | Accelerator, Accelerator Plus | CS0288823 | Fixed an inconsistency in tasker status returned by nc hosts and vovselect. |
VOV-13784 | Monitor | CS0191792 | Client systems may have a clock skew, so timestamps reported by clients should not be used for calculations. For consistency, client-reported timestamps are now avoided for job checkouts, checkins, matching, etc., and server-side timestamps are prioritized. |
VOV-14321 | Accelerator | CS0246573 | Resource requests attempted with #0 or /0 will now throw an error throughout the CLI, API, and UI, with the exception of CORES/0 or CPUS/0. |
VOV-14356 | Monitor | CS0247184 | The display of incorrect status of LM Control Center Agents in Web UI has been fixed. |
VOV-14501 | All | CS0262252 | vovserver config parameter autoLogout now allows admins
to configure the maximum duration for which a given Web/REST
session stays valid. |
VOV-14505 | Accelerator, Accelerator Plus | CS0263183 | The bjobs implementation has been improved to significantly reduce the number of vovserver inquiries. |
VOV-14628 | Allocator | CS0270617 | A tuning parameter was added to the LA distribution algorithm. It can be modified using the
command LA::SetUnhappyReserveFraction
<fraction> in <swd/vovlad/config.tcl. The
fraction must be a number between zero and one and defaults to
zero. A value of zero corresponds to the behavior before this
change. |
VOV-14824 | All | None | Fixed a denial-of-service (DoS) vulnerability in vovserver's base webserver that could be triggered by maliciously large HTTP requests. |
VOV-14886 | Accelerator Plus | CS0266302 | Added support for a new server configuration param called wx.setQueueEnv
to control how NC_QUEUE may be set in the job environment. If not set, NC_QUEUE will be set on the job environment only if it's explicitly set in the job submission environment and the user is using SNAPPROP or SNAPSHOT. If wx.setQueueEnv is set to 1 and NC_QUEUE is not set in the submission environment, NC_QUEUE in the job environment will be the WX project name. If NC_QUEUE is set in the submission environment and SNAPPROP or SNAPSHOT is used, NC_QUEUE will be propagated from the submission environment to the job environment. |
VOV-14959 | Accelerator Plus | CS0267206 | Fixed a bug in vovwxd that caused an infinite loop due to a deleted "top job" of a bucket. |
VOV-14985 | FlowTracer | CS0324179 | Introduced vtk_set_operation MOVECONTENT to transfer content from one set to another. |
VOV-14990 | Accelerator Plus | CS0336870 | Fixed a bug causing a vovDeprecated Tcl error in vovwxd. |
VOV-15105 | FlowTracer | 00000 | The output files of autoflow jobs are no longer marked missing by vovcheckfiles. Moreover, direct descendants of autoflow jobs are always considered valid provided the parent node is valid. |
VOV-15112 | All | None | Updated Kafka Job events with additional data. |
VOV-15188 | Monitor | None | The code management system used to manage license files has changed from CVS to fossil. New installations will use the fossil based system by default; existing installations will continue to use the CVS based system by default. If the user wishes to switch from the CVS based system to the fossil based system, some manual steps will be required. The user can choose to start afresh and delete the existing CVS history, or choose from two recipes to migrate the CVS history to the fossil based system. |
VOV-15195 | FlowTracer | None | Fixed job name issue with IFDEF FDL procedure that prevented it from running successfully. |
VOV-15257 | All | None | The OpenSSL version on Windows and Linux platforms is upgraded to 1.1.1p. |
VOV-15351 | FlowTracer | CS0352183 | vovwxd added support for the CONFIG(slave,env) parameter in DirectDrive mode. |
VOV-15387 | FlowTracer | CS0360469 | The default timeouts on Linux are now higher: 2 minutes for the connection timeout and 5 minutes for the send/receive timeouts (VOV_SEND_TIMEOUT_MS and VOV_RECV_TIMEOUT_MS). With these defaults, protocolErrorInBuf type errors should be mitigated under most conditions. If higher or lower values are desired, you can still override the defaults by setting VOV_SEND_TIMEOUT_MS and VOV_RECV_TIMEOUT_MS as needed for any particular scenario. |
VOV-15416 | FlowTracer | CS0362761 | The vovproject archive command no longer truncates the hostname. |
VOV-15417 | Monitor | None | Fixed issue that prevented the lmmgr reset operation from completing without error. |
VOV-15482 | Monitor | CS0332038 | By default, Monitor's batch reports now use the same version of the charts as the ones shown in the browser. The use of the command line utility remains the same. However, extracting the charts and converting them to static images introduces new requirements: Node v14 or greater must be installed on the machine and available in the PATH environment variable. In addition, manually reverting the LM configuration parameter "lookAndFeel" to its previous value will result in image extraction failure. |
VOV-15498 | Monitor | None | Removed the "usage plot" and "plot queued results" checkboxes. Their functionality is replaced by the "plot usage details" and "plot usage capacity" checkboxes for the usage plot, and the "plot queued request details" checkbox for the queued plot. The default behavior of the form remains the same: visiting the page and submitting the form without editing any of the options results in exactly the same plots. A new "plot average queued" checkbox enables the plotting of the average queued requests, so that the "show average" checkbox no longer controls both plots. Any component of the two plots can be plotted irrespective of the selections made in the other checkboxes. For example, for the usage plot, the average can be turned on without having the details or the capacity on the plot. The checkboxes have been reordered and grouped according to the plot they control. |
VOV-15515 | Accelerator | CS0373997 | Fixed a bug in tracking of resource usage by taskers that are in "DONE" state. |
VOV-15541 | Accelerator Plus | CS0376655 | Using containers could cause naming conflicts in scripts that build names from the timestamp and pid alone, because that combination is not guaranteed to be unique with current container configuration options. Therefore, a true UUID has been added to commonly used utility scripts, where such ambiguity could cause issues when the scripts are run frequently inside multiple containers at the same time. |
VOV-15560 | Allocator | None | Fixed an issue in the Allocator Resource Summary page that resulted in a zero-size CSV file. |
VOV-15620 | Accelerator Plus | None | A -dd option was added to vovwxconnect that uses Direct Drive when connecting to NC queues. The -dd option can also be passed to wxmgr start to connect to existing queues, for example, wxmgr start -basequeue vnc -dd. |
VOV-15724 | Accelerator, Accelerator Plus | CS0398311 | Fixed a bug that caused slow processing of foreign buckets in WX DirectDrive mode. |
VOV-15735 | Accelerator, Accelerator Plus | None | Fixed issue that resulted in permanently-used resource maps when the top job of a foreign bucket can not be created. The issue commonly showed up in NC base queues with a WX meta-scheduler using direct drive. |
VOV-15741 | Accelerator | None | Reject Accelerator jobs that consume no slots. The fix detects this case: nc run -r slots/0 ... and rejects the attempted job submission. |
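
The naming conflict described in VOV-15541 can be illustrated with a minimal sketch. It is not the code used by the Altair utility scripts; the function names and the `vovtmp` prefix are hypothetical. It only shows why a name built from a timestamp and pid can repeat across containers, which typically have separate pid namespaces, while a random UUID is not expected to.

```python
import time
import uuid


def timestamp_pid_name(prefix: str, now: int, pid: int) -> str:
    """Build a name from a whole-second timestamp and a pid (collision-prone)."""
    return f"{prefix}.{now}.{pid}"


def uuid_name(prefix: str) -> str:
    """Build a name from a random (version 4) UUID."""
    return f"{prefix}.{uuid.uuid4()}"


# Two containers with separate pid namespaces can both run a script as pid 7
# within the same second, so both produce the same timestamp+pid name.
now = int(time.time())
name_in_container_a = timestamp_pid_name("vovtmp", now, 7)
name_in_container_b = timestamp_pid_name("vovtmp", now, 7)
print("timestamp+pid names collide:", name_in_container_a == name_in_container_b)

# UUID-based names differ even when generated back to back.
print("uuid names differ:", uuid_name("vovtmp") != uuid_name("vovtmp"))
```

The pid value 7 is chosen only to make the point that low pids recur inside containers; any identical timestamp and pid pair gives the same result.
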
2022.1.1 Release Notes
New Features
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-7819 | Accelerator | 21323 | Add to vovselect the capability to specify more FIELD.X items. Added
RESOURCES.<RESNAME> to estimate the
requested resource value of RESNAME . Added
GRABBEDRESOURCES.<RESNAME> which returns
the current value of RESNAME in a job's grabbed
resources. Similarly, SOLUTION.<RESNAME>
returns the value of RESNAME in a job's
solution. |
VOV-9389 | Accelerator | AAP23445 | The doTestHealthTooManyOutOfQueueJobs health check was enhanced to allow
@USER@ as a valid recipient. |
VOV-11252 | Accelerator Plus | AAP25064 | vovwxd was modified to use the CONFIG(slave,setName)
parameter instead of a constant "WXTaskers:<wxQueue>" set
name. |
VOV-13421 | All | None | In an upcoming major release of Accelerator Products, the vovselect *
wildcard select feature will be dropped. To prepare for this
change, users should update scripts and REST requests to issue
vovselect requests using a specified list
of field names. For example:
An easy way to find out what fields are in an object type is
by using vovselect fieldname. For
example:
|
VOV-13912 | All | CS0202834 | Fixed an issue with daily log rotation of auto-started daemons. |
VOV-14232 | All | CS0238414 | Clarified log messaging when the saving of trace database and/or metrics data is taking longer than expected. |
VOV-14449 | Accelerator Plus | CS0222621 | Added CONFIG(failedAgentsCooldownPeriod) parameter for
vovwxd that allows you to continue to
request agents after the specified period for a bucket that had
failed agent jobs. Format is a time specification, a value of 0
disables this feature. |
VOV-14451 | Allocator | CS0255747 |
|
VOV-14978 | Hero | None | The Licensing section of the online help has been updated and improved with recent Altair License Manager (ALM) licensing changes. |
VOV-15010 | Hero | None | The Altair Hero online documentation has been updated to reflect the changes in functionality and features. |
VOV-15085 | Monitor | None | Sorting the File Systems & Process Summary table in Monitor by clicking the header of each column, will sort the table in place in the current working tab, instead of creating a new browser tab with the table sorted. |
VOV-15104 | Accelerator, Accelerator Plus | None | The SDS Configuration documentation has been updated to reflect the newly added "enable_jobdata" feature. |
VOV-15149 | Hero | None | The following targets were added to the Palladium emulators: BRD144, BRD72, BRD48, BRD36, BRD24, BRD18, BRD16, BRD12, BRD9, BRD8, BRD6, BRD5, BRD4, BRD3, BRD2 and BRD1. These targets correspond to placements with the corresponding number of boards. For example, if the emulator is configured with a group named PZ1, then a job for a Palladium with 8 racks would use the resource HERO:PZ1_BRD144. The placement rules are restrictive and will evolve with experience. |
VOV-15150 | Accelerator, Accelerator Plus | None | The Accelerator and Accelerator Plus subserver feature has been retired. This feature provided one way of optimizing load on vovserver. Other modern optimization methods will be used for vovserver performance and scalability going forward. |
VOV-15206 | All | None | Added description of Security Level = ANYBODY in the online documentation. |
VOV-15254 | All | None | Implemented a new event to Kafka on job delete based on topic 'vov-jobdata' allowing these events to be correctly ordered with others for the same job. |
VOV-15326 | All | None | Bundled the Altair License Manager server components with the Accelerator Products installation media. |
VOV-15408 | Accelerator | None | FAIRSHARE_WEIGHTS within job class definitions are no longer supported. Use
vovfsgroup modify FSGROUP.user weight W
instead. |
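
As a rough illustration of the VOV-13421 guidance, the sketch below builds a vovselect command line from an explicit field list and refuses the `*` wildcard. It assumes the `vovselect FIELDS from OBJECT` form with comma-separated field names; the helper name `build_vovselect` and the example field and object names are illustrative and should be checked against the fields your release reports. The command is only assembled, not run.

```python
from typing import List, Sequence


def build_vovselect(fields: Sequence[str], object_type: str) -> List[str]:
    """Return an argv list for 'vovselect FIELDS from OBJECT' using explicit fields."""
    if not fields:
        raise ValueError("at least one field name is required")
    for field in fields:
        if "*" in field:
            # The wildcard select is slated for removal, so reject it early.
            raise ValueError("wildcard fields are not allowed; list field names explicitly")
    return ["vovselect", ",".join(fields), "from", object_type]


# Example: an explicit field list, including one of the FIELD.X forms (assumed names).
argv = build_vovselect(["id", "GRABBEDRESOURCES.RAM"], "jobs")
print(" ".join(argv))
```

Scripts that already go through a helper like this have a single place to update if field names change in a later release.
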
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-13165 | Accelerator | CS0127745, CS0172968 | Fixed cases in which, with vovreconciled active, license usage counts grew during a job's execution but the correct license grab counts were not set for the job. |
VOV-14407 | Accelerator, Accelerator Plus | None | Fixed issue that prevented the preemption of wxagent jobs when using Direct Drive. |
VOV-14533 | Accelerator | None | Add more documentation about multiphase jobs and their dependency on autoRescheduleCount. |
VOV-14557 | Monitor | CS0282491 | Fixed issue with removing a license checkout from within the web UI. |
VOV-14586 | Accelerator | CS0285073 | The default values of the FairShare group "/" weight & window values were restored to the 2019.01 values. |
VOV-14663 | Accelerator, Accelerator Plus | None | Add documentation for the TLS 1.2 and 1.3 configuration parameters that affect the protocols supported by the internal webserver for REST and HTTP requests. |
VOV-14822 | Accelerator, Accelerator Plus | CS0309335, CS0314259 | Fixed issue causing nc wait -dir to wait for jobs in other
directories. |
VOV-14824 | All | None | Fix a DOS vulnerability in vovserver's base webserver from nefarious large HTTP requests. |
VOV-14874 | FlowTracer | None | Fixed issue that affected the ability to stop jobs running on an indirect tasker:
|
VOV-14959 | Accelerator Plus | CS0267206 | Fixed a bug in vovwxd that caused an infinite loop due to a deleted "top job" of a bucket. |
VOV-14969 | All | None | Change description of fields in Admin licensing page when ALM is used. |
VOV-14976 | Accelerator | None | Fixed issue that caused the primary license feature to be blank on the licensing web UI page for the first hour after starting up a vovserver that is using node-locked ALM licensing. |
VOV-15056 | Accelerator, Accelerator Plus, FlowTracer | None | Fixed issue that prevented the -m option from being used to request a specific host in the bsub emulation command. |
VOV-15064 | All | None | Improved the structure and clarity of the documentation about vovserver configuration parameters. |
VOV-15088 | Accelerator | CS0330199 | Taskers that lose connection with their vovserver will no longer treat a hostname lookup error as critical and exit, but will continue trying to reconnect to the vovserver on a periodic basis. This adds resilience in networks where vovserver is hosted on a VM or a container that has been restarted, and where dynamic DNS removes hostname entries when the VM or container are down. |
VOV-15092 | Accelerator | None | All phases of multiphase jobs now correctly run at the same priority. |
VOV-15093 | Accelerator | None | A problem parsing -mpres1 XXX -mpres2 YYY -mpresN ZZZ was fixed when running
multiphase jobs. |
VOV-15099 | Monitor | None | New feature introduced in Monitor Administrator page. In the actions column of the "Edit Monitors" table, a new action has been introduced. This action is only available for monitor definitions prescribed in the config.tcl file. Clicking the button with the icon file will bring up the config.tcl file in a browser environment. Any modifications made to this file through the browser will be persisted to the actual file as well. Monitors that were originally prescribed in the config.tcl can only be edited or removed, either by directly modifying the config.tcl through a text editor, or by using the "Edit config.tcl" action in the actions column of the table. |
VOV-15100 | Monitor | None | The new Monitor web UI front page is enhanced to display the Altair Monitor product name. |
VOV-15106 | Accelerator Plus | CS0325511 | Fixed issue causing nc wait to exit with error "Failed subcommand wait: Illegal object id". |
VOV-15112 | All | None | Updated Kafka Job events with additional data. |
VOV-15122 | Monitor | None | Fix a problem in the Monitor web UI in the | page whereby sorting by some of the columns in the table would display an error.
VOV-15140 | Monitor | None | The processes table in the LM web UI can now be sorted by the time-related column. |
VOV-15144 | Monitor | None | Change the look and feel of the Host details page in LM web UI to match the recent re-skin. |
VOV-15148 | Accelerator, Accelerator Plus | None | Fixed issue causing nc wait to exit with error can't read
"jInfo(exit)": no such element in array . |
VOV-15151 | Accelerator, Allocator | None | Fixed issue with SetMinQuantity causing jobs to run despite no allocated licenses on NC site. |
VOV-15152 | FlowTracer | None | In some extraordinary circumstances, if a dialog box was closed by pressing "x" button, the entire vovconsole would crash. This has been fixed. |
VOV-15172 | Monitor | None | The number of columns in the grid view in Current Utilization Overview page will always be compliant with the preferred number of columns that the user has selected. |
VOV-15174 | Accelerator | None | Documentation is added in the REST Tutorial to explain when TLS 1.2 support must be enabled for REST python client programs running on CentOS 7. |
VOV-15190 | Allocator | CS0334264 | Fixed an issue in which an update to the server config 'maxResMap' was not reflected in the resource map,
resulting in the following error messages:
|
VOV-15194 | Accelerator Plus | CS0341461 | Job limits can now have periods in them. For example, if a user has a username of
"test.user", a command such as this is now allowed:
|
VOV-15219 | All | None | Auto-trimming of PostgreSQL database is disabled by default. |
VOV-15222 | All | None | Fixed the vovshow -licenses output when ALM is enabled. |
VOV-15245 | Accelerator, Accelerator Plus | CS0325511 | Updated nc wait to filter events by jobid when waiting on 10 or fewer jobs. |
VOV-15283 | Accelerator | CS0351032 | Support was added for FIPS enabled hosts on CentOS and other RHEL based hosts. |
VOV-15292 | All | None | Some issues dealing with HTTP requests with headers larger than 4K have been addressed. The internal HTTP server now accepts HTTP requests with a total header size up to 16K. |
VOV-15305 | Accelerator | CS0349956 | HTTP x-www-form-urlencoded POST data that's passed to CGI pages via temp files is now encrypted with a random key that changes for each request. |
VOV-15310 | Accelerator Plus | CS0355127 | Fixed a crash where an ambiguous "vovquery select id from 9" results in an object lookup that vovquery doesn't support. |
VOV-15315 | Monitor | None | Fixed an incorrect description of an action on the alerts page of LM. |
VOV-15332 | Accelerator | CS0318230 | The maximum raw power value calculated for taskers has been increased from 2.1 million to 2147483647. If the calculated raw power is greater than this number, it is clipped to 2147483647 rather than being allowed to become negative due to an integer overflow. A warning appears in the tasker's log indicating that the raw power exceeded this value and was "clipped" to the maximum allowable value. |
VOV-15334 | All | None | Updated Kafka Job events with additional data. |
VOV-15398 | Monitor | None | In LM Detailed Plots fixed:
|
VOV-15482 | Monitor | CS0332038 | Monitor's batch reports now use the same chart version as the browser by default. Use of the command line utility remains the same. However, extracting the charts and converting them to static images has new requirements: Node v14 or greater must be installed on the machine and available in the PATH environment variable. In addition, manually reverting the LM configuration parameter "lookAndFeel" to its previous value will cause image extraction to fail. |
VOV-15497 | All | None | Fixed issue in single-mode license accounting. |
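
The clipping behavior described in VOV-15332 can be sketched as follows. This is an illustration rather than the tasker's actual implementation: the function name `clip_raw_power` and the log wording are hypothetical. The limit 2147483647 is the largest signed 32-bit integer, and the `ctypes` line shows the wraparound to a negative value that clipping avoids.

```python
import ctypes
import logging

INT32_MAX = 2**31 - 1  # 2147483647, the limit quoted in VOV-15332


def clip_raw_power(raw_power: int) -> int:
    """Clamp a computed raw power value to the signed 32-bit maximum."""
    if raw_power > INT32_MAX:
        # Hypothetical log wording; the real tasker message may differ.
        logging.warning("rawpower %d exceeded %d and was clipped", raw_power, INT32_MAX)
        return INT32_MAX
    return raw_power


# Without clipping, storing INT32_MAX + 1 in a signed 32-bit integer wraps negative.
wrapped = ctypes.c_int32(INT32_MAX + 1).value
print("unclipped 32-bit value:", wrapped)
print("clipped value:", clip_raw_power(INT32_MAX + 1))
```
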
2023.1.0 Release Notes
New Features
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-3411 | Accelerator | None | vovselect TURBOBOOST from taskers is now available. Set to 1 if Intel Turbo
Boost is enabled on this tasker host, or 0 if disabled (Linux
only), otherwise set to NA. |
VOV-7894 | Accelerator | AAP21491 | Added Jobclass and Autokill columns to the Running Jobs table in the web UI. |
VOV-8024 | Accelerator | AAP21343 | Added the ability to specify a stop reason when stopping all running jobs for all users via the Running Jobs page. |
VOV-8642 | Monitor | None | The ftlm_tag_admin command line utility for Monitor is now described in the documentation. |
VOV-9900 | None | AAP24167 | An enhancement has been made for the NC family of managers (ncmgr, vncmgr, wxmgr and hemgr) to show the queue name and command action in the log file name if logging to the $VOVDIR/local/logs/* location. |
VOV-12011 | Accelerator | CS0120886 | A new ncmgr rehost subcommand is added to migrate an Accelerator vovserver from one host to another. See ncmgr rehost for details. |
VOV-14032 | All | CS0223251 | vncmgr/ncmgr stop action now supports a -freeze_nocpr option to save the PR file without compression for a freeze to potentially save time when working with very large PR files. |
VOV-14383 | Accelerator Plus | None | The Direct Drive documentation topic has been clarified with more information regarding formatting of the driver script. |
VOV-14395 | Accelerator | None | The vtk_server_config Tcl command has a new rds.reinitialize parameter to allow reinitialization of the RDS system from the command line. This parameter requires no additional arguments. |
VOV-14488 | Accelerator Plus | CS0260943 | Added new configuration parameters for fine-grained control of vovwxd. |
VOV-14514 | Accelerator, Accelerator Plus | CS0268091 | Clients that interact with VOV subsystems are "immediately" terminated if they do not have sufficient security privileges (as defined in security.tcl). The ANYBODY / READONLY security level is documented for users with minimum privilege. |
VOV-14577 | All | None | The default HTTP server, or webprovider, used by vovserver is changing to the internal webprovider in place of "nginx". This applies to the URL used to access the vovserver "web port". The -webprovider nginx option is available on ncmgr and other vovserver start commands to select the nginx webprovider if that is preferred. |
VOV-14701 | Accelerator | CS0292715 | You can now use vovselect to list tasker lists created with vovtaskerlist, and get information about tasker lists via the v3 REST API with the URL http://hostname/api/v3/taskerlists. |
VOV-14718 | Accelerator | None | Implemented backfilling resource reservation to handle conflicting jobs. This can be controlled by server config backfillResReservation, disabled by default. |
VOV-14774 | Accelerator, Monitor | None | Four new subcommands are added to the vovdb_util command for managing the Monitor or Accelerator PostgreSQL database: exportconfig, exportpasswords, importconfig, and importpasswords. See vovdb_util for details. |
VOV-14789 | Accelerator | AAP22106 | The use of access control lists (ACLs) for resource maps is expanded to allow specified
users or user groups to edit, reserve, and forget (delete) the
resource map object. An example of an ACL-setting command that
would be used for this objective is:
where
$id is the VOV id number for a resourcemap
object. Supported only if the resourcemap is created with
rank=0. |
VOV-14806 | Accelerator | None | Rapid Scaling 3.0 implemented in vovwxd and the Altair NavOps cloud connector. Supports in-cloud deployments of a cluster scheduled by the Accelerator workload manager with adaptive compute node allocation to adjust to job load. |
VOV-14863 | Accelerator Plus, Monitor | CS0312583 | Enabled the doTestHealthFailoverServerCandidates health check for Accelerator Plus and Monitor. |
VOV-15120 | Accelerator | None | Implemented backfilling resource reservation with non-conflicting jobs. This can be controlled by server config backfillResReservation, disabled by default. |
VOV-15352 | Accelerator | None | Implemented dispatch of deadline jobs on lookahead reservation activation. Supported argument -deadline in nc run to submit the job with an expected duration of completion. |
VOV-15404 | Monitor | None | Migrated image generation for Monitor batch reports to Node. Only D3-based plots, which are part of the new look and feel of Monitor, are now supported. |
VOV-15429 | All | None | The default license manager for Accelerator products is changed to Altair License Manager in
the 2023.1.0 release. To revert to RLM, set
alm.enable to 0 in
policy.tcl. This change does not affect
licensing by the RTDA legacy keyfile license files, which are
still supported. |
VOV-15439 | Accelerator | None | Implemented tasker property hardbound and softbound to
restrict the scheduler to dispatch only autokill jobs on
hardbound tasker, while autokill & xdur jobs may run on
softbound tasker. Implemented reservation property hardfill and softfill to restrict the scheduler to backfill hardfill reservation with only autokill jobs, while softfill reservation may be backfilled with autokill and xdur jobs. Modified vovtaskermgr utility to support all 4 options. |
VOV-15548 | Allocator | None | Implemented Allocator Resource Summary page column 'Allocated' to show the "Number Of Resource Tokens Allocated Including OOQ Tokens To Each Site". |
VOV-15588 | Accelerator | None | An online help topic to describe AVS syntax has been added. |
VOV-15618 | Accelerator | None | DP component jobs now inherit the expected duration (-xdur) setting of the main job. |
VOV-15653 | Monitor | None | The PostgreSQL software included with Monitor and Accelerator products is upgraded to version 14.4. A database conversion to the new version is needed for projects that are upgrading and retaining an existing database. See Monitor and software installation documentation for database upgrade instructions. |
VOV-15656 | Accelerator | None | The "tasker" lexicon versions of Accelerator CLI commands were added in a recent software
release to replace the "slave" lexicon versions. The "slave"
lexicon commands are still available, but be advised that they are
deprecated. The following CLI commands are deprecated and will
be removed in an upcoming release:
|
VOV-15661 | Accelerator | None | The following new server configuration parameters are added to control the new lookahead and
backfill scheduling policies. See
policy.tcl for descriptions:
|
VOV-15694 | Accelerator | None | Resource Data Service (RDS) is a replacement for vovresourced. RDS is provided as an opt-in preview feature. When the optional RDS operation mode is selected, Accelerator and Monitor communicate on new pub/sub channels. Resource management and license matching is performed by a new RDS thread of the vovserver process. |
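
The hardbound and softbound tasker properties from VOV-15439 can be read as a simple eligibility rule, sketched below. This is an interpretation of the release note, not the scheduler's code: the function name `may_dispatch` is hypothetical, "autokill & xdur jobs" is taken to mean jobs that have either setting, and a tasker with neither property is assumed to accept any job.

```python
def may_dispatch(tasker_bound: str, has_autokill: bool, has_xdur: bool) -> bool:
    """Decide whether a job may run on a tasker, per the VOV-15439 description."""
    if tasker_bound == "hardbound":
        # Hardbound taskers accept only autokill jobs.
        return has_autokill
    if tasker_bound == "softbound":
        # Softbound taskers accept autokill jobs and xdur jobs (assumed: either one).
        return has_autokill or has_xdur
    if tasker_bound == "":
        # Assumption: a tasker with no bound property is unrestricted.
        return True
    raise ValueError(f"unknown tasker bound property: {tasker_bound!r}")


for bound in ("hardbound", "softbound", ""):
    print(repr(bound), may_dispatch(bound, has_autokill=False, has_xdur=True))
```

The same shape of rule applies to the hardfill and softfill reservation properties, with backfilling in place of dispatching.
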
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-11585 | Accelerator Plus | AAP25265 | Fixed a bug in vovwxd that caused incorrect "taskers max reached" alert. |
VOV-12137 | Accelerator | CS0120974 | nc modify now accepts the syntax to change the FairShare group of a queued job, and will allow the user to be different than the job's original FairShare user so long as the client has the correct ATTACH ACL on the new fsgroup. |
VOV-12766 | FlowTracer | None | Changes in functionality of vovforget. In Accelerator, Accelerator
Plus, Hero:
|
VOV-13525 | Accelerator, Accelerator Plus | CS0288823 | Fixed an inconsistency in tasker status returned by nc hosts and vovselect. |
VOV-13784 | Monitor | CS0191792 | Client systems may have clock skew, so timestamps reported by clients should not be used for calculations. For consistency, client-reported timestamps are now avoided for job checkouts, checkins, matching, etc., and server-side timestamps are prioritized. |
VOV-14321 | Accelerator | CS0246573 | Resource requests attempted with #0 or /0 will now throw an error throughout the CLI, API, and UI, with the exception of CORES/0 or CPUS/0. |
VOV-14356 | Monitor | CS0247184 | The display of incorrect status of LM Control Center Agents in Web UI has been fixed. |
VOV-14501 | All | CS0262252 | vovserver config parameter autoLogout now allows admins
to configure the maximum duration for which a given Web/REST
session stays valid. |
VOV-14505 | Accelerator, Accelerator Plus | CS0263183 | The bjobs implementation has been improved to significantly reduce the number of vovserver inquiries. |
VOV-14628 | Allocator | CS0270617 | A tuning parameter was added to the LA distribution algorithm. It can be modified using the
command LA::SetUnhappyReserveFraction
<fraction> in <swd>/vovlad/config.tcl. The
fraction must be a number between zero and one and defaults to
zero. A value of zero corresponds to the behavior before this
change was implemented. |
VOV-14824 | All | None | Fix a denial-of-service (DOS) vulnerability in vovserver's base webserver from nefarious large HTTP requests. |
VOV-14886 | Accelerator Plus | CS0266302 | Added support for a new server configuration param called wx.setQueueEnv
to control how NC_QUEUE may be set in the job environment. If not set, NC_QUEUE will be set on the job environment only if it's explicitly set in the job submission environment and the user is using SNAPPROP or SNAPSHOT. If wx.setQueueEnv is set to 1 and NC_QUEUE is not set in the submission environment, NC_QUEUE in the job environment will be the WX project name. If NC_QUEUE is set in the submission environment and SNAPPROP or SNAPSHOT is used, NC_QUEUE will be propagated from the submission environment to the job environment. |
VOV-14959 | Accelerator Plus | CS0267206 | Fixed a bug in vovwxd that caused an infinite loop due to a deleted "top job" of a bucket. |
VOV-14985 | FlowTracer | CS0324179 | Introduced vtk_set_operation MOVECONTENT to transfer content from one set to another. |
VOV-14990 | Accelerator Plus | CS0336870 | Fixed a bug causing a vovDeprecated TCL error in vovwxd. |
VOV-15105 | FlowTracer | 00000 | The output files of autoflow jobs are no longer marked missing by vovcheckfiles. Moreover, direct descendants of autoflow jobs are always considered valid provided the parent node is valid. |
VOV-15112 | All | None | Updated Kafka Job events with additional data. |
VOV-15188 | Monitor | None | The code management system used to manage license files has changed from CVS to fossil. New installations will use the fossil based system by default; existing installations will continue to use the CVS based system by default. If the user wishes to switch from the CVS based system to the fossil based system, some manual steps will be required. The user can choose to start afresh and delete the existing CVS history, or choose from two recipes to migrate the CVS history to the fossil based system. |
VOV-15195 | FlowTracer | None | Fixed job name issue with IFDEF FDL procedure that prevented it from running successfully. |
VOV-15257 | All | None | OpenSSL version on Windows and Linux platforms is upgraded to 1.1.1p |
VOV-15351 | FlowTracer | CS0352183 | vovwxd added support for the CONFIG(slave,env) parameter in DirectDrive mode. |
VOV-15387 | FlowTracer | CS0360469 | The default timeouts on Linux have been raised to 2 minutes for connections and 5 minutes for send/receive operations (VOV_SEND_TIMEOUT_MS and VOV_RECV_TIMEOUT_MS). With these defaults, protocolErrorInBuf type errors should be mitigated under most conditions. Users can still override the defaults by setting VOV_SEND_TIMEOUT_MS and VOV_RECV_TIMEOUT_MS to higher or lower values as needed for any particular scenario. |
VOV-15416 | FlowTracer | CS0362761 | The vovproject archive command no longer truncates the hostname. |
VOV-15417 | Monitor | None | Fixed issue that prevented the lmmgr reset operation from completing without error. |
VOV-15482 | Monitor | CS0332038 | Monitor's batch reports now use the same chart version as the browser by default. Use of the command line utility remains the same. However, extracting the charts and converting them to static images has new requirements: Node v14 or greater must be installed on the machine and available in the PATH environment variable. In addition, manually reverting the LM configuration parameter "lookAndFeel" to its previous value will cause image extraction to fail. |
VOV-15498 | Monitor | None | Removed the "usage plot" and "plot queued results" checkboxes. Their functionality is replaced by the "plot usage details" and "plot usage capacity" checkboxes for the usage plot, and the "plot queued request details" checkbox for the queued plot. The default behavior of the form remains the same: visiting the page and submitting the form without editing any of the options results in the exact same plots. A new "plot average queued" checkbox enables plotting of the average queued requests, decoupling the "show average" checkbox from controlling both plots. Any component of the two plots can be plotted irrespective of the other checkbox selections. For example, for the usage plot, the average can be turned on without having the details or the capacity on the plot. The checkboxes have been reordered and grouped according to the plot they control. |
VOV-15515 | Accelerator | CS0373997 | Fixed a bug in tracking of resource usage by taskers that are in "DONE" state. |
VOV-15541 | Accelerator Plus | CS0376655 | Using containers could cause naming conflicts in scripts that build names from the timestamp and pid alone, because that combination is not guaranteed to be unique with current container configuration options. Therefore, a true UUID has been added to commonly used utility scripts, where such ambiguity could cause issues when the scripts are run frequently inside multiple containers at the same time. |
VOV-15560 | Allocator | None | Fixed an issue in the Allocator Resource Summary page that resulted in a zero-size CSV file. |
VOV-15620 | Accelerator Plus | None | A -dd option was added to vovwxconnect that uses Direct Drive when connecting to NC queues. The -dd option can also be passed to wxmgr start to connect to existing queues, for example, wxmgr start -basequeue vnc -dd. |
VOV-15724 | Accelerator, Accelerator Plus | CS0398311 | Fixed a bug that caused slow processing of foreign buckets in WX DirectDrive mode. |
VOV-15735 | Accelerator, Accelerator Plus | None | Fixed issue that resulted in permanently-used resource maps when the top job of a foreign bucket can not be created. The issue commonly showed up in NC base queues with a WX meta-scheduler using direct drive. |
VOV-15741 | Accelerator | None | Reject Accelerator jobs that consume no slots. The fix detects this case: nc run -r slots/0 ... and rejects the attempted job submission. |
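
The zero-quantity checks described in VOV-14321 and VOV-15741 can be approximated with a small pre-submission check. This is a sketch under stated assumptions, not Accelerator's parser: the helper name `find_zero_requests` is hypothetical, resource requests are assumed to be whitespace-separated tokens, and the CORES and CPUS exceptions are matched case-insensitively.

```python
import re
from typing import List

# Resources for which a zero request remains legal, per VOV-14321.
ZERO_ALLOWED = {"CORES", "CPUS"}

# Matches tokens such as "slots/0" or "License:abc#0" (name, then /0 or #0).
_ZERO_TOKEN = re.compile(r"^(?P<name>[^#/\s]+)[#/]0$")


def find_zero_requests(resources: str) -> List[str]:
    """Return the resource tokens that request a quantity of zero and are not exempt."""
    rejected = []
    for token in resources.split():
        match = _ZERO_TOKEN.match(token)
        if match and match.group("name").upper() not in ZERO_ALLOWED:
            rejected.append(token)
    return rejected


print(find_zero_requests("slots/0"))
print(find_zero_requests("CORES/0 RAM/100"))
```

A wrapper script could call such a check before invoking nc run and report the offending tokens, rather than waiting for the submission to be rejected.
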
2022.1.0 Release Notes
New Features
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-7252 | Accelerator | AAP20379 | Added an email icon to the Job/Node Information page in the Accelerator web UI so the user can email the job description and execution information details to the owner of the job. The email action is by a "mailto:" link, so your browser needs to have "mailto:" link handling enabled. |
VOV-9004 | All | 23042 | The system health and notification daemon, vovnotifyd, now has the ability
to control the timing at which license expiration and other
emails are sent. In vovhealthlib.tcl, you
can define a TIMEVAR such as:
Where calling suppressMail with
a "1" for the specified health check will suppress the mailings
for that TIMEVAR. suppressMail accepts either
the name of a specific health check, or "ALL" to control all
defined health check routines. |
VOV-9417 | Allocator | 23693 | Added Totals row to Allocator Resource Summary Report page. |
VOV-10250 | All | AAP24412 | If nc wait cannot find related job info the related message now more accurately describes the job as forgotten instead of vanished. |
VOV-10545 | All | 24546 | An additional option -verify has been added to the existing vovversion CLI
which checks current client and server versions for mismatches.
If a discrepancy is discovered, an error message is displayed
and a non-zero exit value is returned, otherwise a verification
message is displayed indicating the current version number and a
zero exit value is returned. |
VOV-11001 | All | 24808 | The cron.csh scripts used by vovcrontab now log through vovdailylog. |
VOV-11202 | All | 24995 | Enhanced ? tip help message on filter inputs and added clarification on filter inputs for 'Taskers' and 'Resource Statistics' UI. |
VOV-11536 | Accelerator | None | The Container Configuration help page has been updated for clarity and now provides newer, better examples. |
VOV-11975 | Accelerator | CS0120839 | There is a new entry in cleanup.config.tcl to control when tasker
startup logfiles get removed when running vovcleanup.
set config(cleanup,taskerStartup) 180d . This
controls clean-up of the tasker startup logs separate from the
tasker logs. Users can use vovcleanup to
prune tasker logs without accidentally removing the startup log
if they wish to do so. |
VOV-12006 | Accelerator | CS0120876 | Added auto-trim database option to Database Configuration web page. |
VOV-12242 | Accelerator | None | Added a new "shrink-to-fit" job submission option -reconcilemem to
nc run. In shrink-to-fit mode, a job's
memory resource usage will be reduced after a time if actual
memory usage is less than requested RAM usage that was specified
with "-r ram/NN". |
VOV-12400 | Monitor | None | Updated HASP monitors to support the Sentinel LDK Admin version 8 API. |
VOV-12858 | Accelerator | CS0142679 | Added ability to pass the -cmp_remotefile option when using Calibre. |
VOV-13227 | Accelerator, Accelerator Plus | AAP25139 | Distributed Parallel support now allows -dpres+ to append resources to
either -r or -dpres resource
specifiers on the command line or in jobclasses via
VOV_JOB_DESC(). |
VOV-13505 | FlowTracer | CS0182760 | Enhanced LSF driver script to allow jobs submission on execution host(s)/cluster(s) based on
the following configurations in LSF config.tcl.
|
VOV-13788 | Accelerator, Accelerator Plus | AAP24926 | Added cleanup of /data/jobs for Accelerator and Accelerator Plus. The default cleanup interval is 10 years. |
VOV-13795 | Accelerator, Monitor | CS0186684 | Eliminated unnecessary HTTP-requests made by ftlm_lmproject utility to Monitor since the non-FQDN-version of the hostname is always included in the FQDN request. |
VOV-13851 | Accelerator, Accelerator Plus | CS0208070 | Added RabbitMQ logs for nc wait and vovselect command. Added config 'VOV_CLIENT_LOG_TLS' in ncConfig/<queue name>.tcl to enable TLS for RabbitMQ connection. |
VOV-13900 | Accelerator | None | A change was made to the upper right pane within the Accelerator dashboard page, whereby the server daemons status squares are removed. This pane will henceforth display only alerts. |
VOV-13911 | All | None | When stopping a job with nc stop, you can now specify -skiptop
[1|0] to indicate whether or not to send the signal
to the top process. The top process is usually a wrapper such as
vw, unless the job was started without a wrapper. |
VOV-13983 | Accelerator | CS0121174, CS0209718 | Changed the pre and post job environments to include the following variables: VOV_JOBID, VOV_JOBPROJ, VOV_JOBSLOT, VOV_GRABBED_RESOURCES, and for Accelerator specifically, NC_JOBID. |
VOV-14017 | Accelerator Plus | CS0212504 | Improved the procedure for dequeuing Accelerator Plus jobs in case of repeated agent failures. The vovnc.tcl script now finds the jobs of the current bucket more efficiently. |
VOV-14026 | All | None | Introduced a feature called VovScope, which provides insights into network activities (read/writes) of vovserver and clients. |
VOV-14033 | Monitor | None | Implemented a more modern look-and-feel for the Monitor web UI. |
VOV-14098 | Monitor | None | Added ability to generate Checkout Statistics report by user and host. |
VOV-14139 | Accelerator | None | Preemption Plans can now take the form:
|
VOV-14177 | Accelerator | None | nc stop now supports setting either NC_STOP_SIGNALS or VOV_STOP_SIGNALS
with an EXT like format that doesn't involve invoking
vovjobctrl for its implementation. If
you reference NC_STOP_SIGNALS, the EXT format is :
You can now use
|
VOV-14178 | Accelerator | None | When starting a job via the REST API, an additional field named nowrapper
has been added. This is equivalent to the -n
option on the CLI command nc run . The default
value is False . If nowrapper
is False and the wrapper field
is blank or unset, it gets set by the API to
vw. You must set
When
preempting a job via the REST API with a "PUT" request with
action='preempt', the API now accepts a syntax for the
method field of:
This is similar to the updated preemption plan format. |
VOV-14274 | Accelerator, FlowTracer | CS0241168 | Default value of -singleuser tasker option can be set with
vtk_tasker_set_defaults. |
VOV-14289 | Monitor | CS0243425 | Enhanced ftlm_parse_flexlm parser to handle additional data on checkout in
the form [user_data=<value>] , and display
the same under the project in Monitor reports. |
VOV-14311 | All | None | The new -alm option is added to the batch_install.csh
script to allow the host and port of Altair License Manager to
be specified at install time. This is useful when the
Accelerator products are to be licensed via Altair License
Manager licenses. |
VOV-14315 | Monitor | None | Increased performance of matching license checkouts to jobs for NRU cases. |
VOV-14419 | All | None | Multiphase support is provided by two additional command arguments to nc
run: -multiphase [1|0] and -mpres "resource string".
-multiphase 1 enables multiphase jobs. -mpres sets the resources that will be used for each phase. The '%' character is used as a delimiter for the resources of each phase, e.g. -mpres "linux64 foo%linux64 bar:linux64 baz".
By specifying the resources of each phase and designating that certain resources are only allocated to certain taskers, one can run different phases of a job on different taskers. For example, suppose there are two taskers named tasker1 and
tasker2, and phases 1 and 3 should run on tasker1 and phase 2
on tasker2. The resources may look like:
A multiphase job could then be run as:
A multiphase job will have two new Job Properties set:
MPRESOURCES: Contains the same resources passed in
MPCURRENTPHASE: Contains an integer indicating the current job phase. It starts at one, and has a max value of 9.
What the job script sees: The running job script will see an environment variable named VOV_JOB_PHASE which is set to the current phase. The script writer will need to use that to decide what work to do for that phase. If the script exits with an exit code of 216, Accelerator will increment the job phase, change the job resources, and reschedule the job to run again. If the script exits with an exit code of 0, the job is considered "Done", and MPCURRENTPHASE is reset to 1.
Failed jobs: If a job fails during a phase with a code other than 0 or 216, it is considered FAILED and MPCURRENTPHASE will not increment. If the job is invalidated and re-run (e.g. nc rerun -f JOBID), the job will re-run starting at MPCURRENTPHASE and further phases will run if the job exits with code 216, as described above.
Logging: After the first phase is run, subsequent phases of the job will have the command rewritten so that the wrappers are passed "-a -A", telling the wrappers to append to the job log. This is so that all phases of the job get their stdout and stderr logged to the same file. If this were not done, each phase of the job would overwrite the log, and the user would only see the output from the last phase that was run. If Accelerator does not detect one of the standard vov wrappers at the beginning of the command line, it will assume the command is not using a wrapper. In this case, it will look for the standard ">" redirect symbol in the command and replace it with ">>".
REST Support: In the payload for submitting a job via REST, two new fields are allowed: multiphase and mpres. Setting multiphase = True enables multiphase job support. Setting the mpres field behaves the same as the command line argument described above.
Re-running a failed multiphase job via the REST re-run API behaves similarly to re-running a failed multiphase job from the command line, as described above. |
VOV-14420 | Accelerator, Accelerator Plus | None | Updated the tclrmq library from 1.3.8 to 1.4.5 to support TLS connections for RabbitMQ. |
VOV-14477 | All, Accelerator, Accelerator Plus | CS0257852 | Taskers running as non-root will no longer get sent jobs unless the job's user matches the non-root tasker's userid. This is to address a situation where a job running on a non-root tasker gets access to the user's data on the filesystem. This policy can be disabled by setting the allowForeignJobsOnUserTaskers configuration parameter to 1. |
VOV-14488 | Accelerator Plus | CS0260943 | Added new configuration parameters for fine-grained control of vovwxd. |
VOV-14490 | All | None | The newly added VovScope feature has been added to the online help documentation. |
VOV-14511 | All | AAP25265 | Increased maximum number of normal clients per user limit to 250K from the existing 100K. |
VOV-14536 | Accelerator Plus | None | When using wx, the EXTLINKS property on the project will contain information about each base
queue. Previously, the format of EXTLINKS was "queueName1 URL1
queueName2 URL2 ..." This has been changed so that the queueName also contains a status of whether or not vovwxd sees the status of the basequeue as ONLINE or DOWN. The new format of EXTLINKS is: "queueName1:status=STAT1 URL1 queueName2:status=STAT2 URL2 ..." When visiting the web UI for the wx project, the link for a base queue that is down will be shown in red instead of the customary gray. |
VOV-14561 | Accelerator | CS0267219 | Reduced process tree log frequency from every ~30 secs to once in 10 minutes. |
VOV-14571 | Accelerator | CS0266023 | In order to provide more control over the creation of FairShare subgroups, a new acl type named CREATE was added as well as a new server config param called fairshare.strictNodeCreationChecks. The default for fairshare.strictNodeCreationChecks is 0, which means a user only needs the ATTACH acl to create sub-groups or to run a job on an existing group. If fairshare.strictNodeCreationChecks is set to 1, however, a user will need the CREATE acl in order to create sub-groups. They still need the ATTACH acl to run jobs under an existing group. If you set the debug flags FairShareGroups and acl, server.log will contain messaging about the acl checks being made to allow or deny creating a FairShare sub-group. |
VOV-14587 | Accelerator Plus | None | In WX DirectDrive setup, the onBucketProcess Tcl procedure is enhanced to support parameters
for the agent job in the base queue. The BUCKETINFO array will
be initialized with the following fields:
BUCKETINFO(AUXRESOURCES), BUCKETINFO(BUCKETID), BUCKETINFO(COUNT), BUCKETINFO(FSGROUP), BUCKETINFO(JOBCLASS), BUCKETINFO(JOBPROJ), BUCKETINFO(OSGROUP), BUCKETINFO(PRIORITY), BUCKETINFO(RESOURCES), BUCKETINFO(USER), BUCKETINFO(XDUR), BUCKETINFO(maxidle), BUCKETINFO(maxlife), BUCKETINFO(taskerAutokill), BUCKETINFO(taskerCmd), BUCKETINFO(taskerReservation), BUCKETINFO(taskerResources).
The user can set values in vovaccel.tcl. |
VOV-14589 | All | None | For products other than Hero 2.0, parameters to the resource expression comma operator are required to be Monitor features. The resource expression processing no longer allows resource loops in resource maps. |
VOV-14616 | Accelerator | None | The output of nc hosts is changed slightly to make the information about running jobs per tasker clearer. The "JOBS" column has been replaced with 2 columns: "RUN/SWP" and "SLOTS". |
VOV-14629 | Hero | None | The command passed to hero_adapter is wrapped by vtool -f
Emul:<emul-name> <leaf-count> so that
the corresponding feature checkouts and checkins can be tracked
using Monitor. |
VOV-14704 | Monitor | None | Improved the ability for vovresourced to auto-discover Monitor metadata, such as the SSL and web port settings. |
VOV-14853 | All | None | Added new Kafka events for job status updates, property changes, and job deletions. This functionality is controlled by the 'enable_jobdata' setting in sds.cfg and is not supported for array jobs. |
VOV-14871 | Monitor | None | If there are alerts present on the system, the favicon of the tab will cycle between a red dot and the default icon. Additionally, the presence of alerts is indicated to the user by a "triangle alert" icon that is prepended to the label of the Home tab in the main navigation bar. The color of the icon matches the alert level, and the icon pulsates. |
VOV-14888 | All | None | The Supported Platforms online help page has been updated to reflect the changes implemented for 2022.1.0. |
VOV-14992 | Accelerator | None | In the 2021.2.0 release and later, the processing of jobclass initialization has been taken over by the new autostart script start_init_jobclasses.tcl in SWD/autostart, and the new liveness script live_init_jobclasses.tcl in SWD/tasks. |
VOV-15000 | Accelerator | None | The -reconcilemem option of the nc run command was updated in the CLI help. |
VOV-15002 | Accelerator, FlowTracer | None | The parameter allowForeignJobsOnUserTaskers has been added to the online help. |
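
The multiphase exit-code protocol described for VOV-14419 can be sketched as a small job script. This is a minimal sketch under stated assumptions, not Altair's implementation: the VOV_JOB_PHASE variable, the exit codes 216 and 0, and the maximum of 9 phases come from the release note, while the names exit_code_for_phase, run_phase, and main and the three-phase default are hypothetical and chosen only for illustration.

```python
import os
import sys

ADVANCE_PHASE_EXIT_CODE = 216  # per the release note: asks Accelerator to schedule the next phase
MAX_PHASES = 9                 # per the release note: MPCURRENTPHASE has a max value of 9


def exit_code_for_phase(phase, total_phases):
    """Return 216 while more phases remain, and 0 after the final phase."""
    if not 1 <= total_phases <= MAX_PHASES:
        raise ValueError("total_phases must be between 1 and 9")
    if not 1 <= phase <= total_phases:
        raise ValueError("phase must be between 1 and total_phases")
    return ADVANCE_PHASE_EXIT_CODE if phase < total_phases else 0


def run_phase(phase):
    """Hypothetical placeholder for the real per-phase work."""
    print(f"running phase {phase}")


def main(environ, total_phases=3):
    """Read the current phase from the environment, do the work, pick the exit code."""
    phase = int(environ.get("VOV_JOB_PHASE", "1"))
    run_phase(phase)
    return exit_code_for_phase(phase, total_phases)


# In a real job script the last line would be:
#     sys.exit(main(os.environ))
```

Keeping the exit-code decision in a separate function makes it easy to check the phase logic without submitting a job. Any other non-zero exit code would mark the job as FAILED, as the release note describes.
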
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-8079 | All | 21734 | Fixed issue with using Environment Modules from within a sh/bash shell. The "module unload" operation was failing. |
VOV-8648 | FlowTracer | 22241 | Added -running option to vovstop to provide the ability
to stop running jobs without dequeueing queued jobs. |
VOV-8752 | Monitor | 22547 | Fixed use of LDAP Email addresses in Monitor reports when VOVLM(ldapEmail) is set to 1 in <SWD>/config/web.cfg. |
VOV-8871 | Monitor | AAP22766 | Fixed legend spillover on Monitor usage comparison plot with many items. |
VOV-9166 | All | 23318 | Fixed issue where an invalid slave resource expression would result in a permanent error message on the Slaves web page. |
VOV-9638 | All | CS0288154 | Added a -nolog option to the vovgetnetinfo utility that will prevent the creation of a log file in the SWD. |
VOV-9776 | Accelerator | 24061 | The form for submitting a slave reservation now checks that the data is valid before actually submitting the reservation. |
VOV-9897 | All | AAP24162 | Initialize VOV_JOB_DESC variable to avoid errors when listing job classes. |
VOV-10568 | All | CS0120645 | HyperThreading fieldname 'HT' on taskers previously reported incorrectly if HT was available. The field now properly reports whether SMT threading (either Intel HyperThreading or AMD SMT) is currently available and enabled by directly checking the value(s) of either the /sys/devices/system/cpu/smt/control, /sys/devices/system/cpu/smt/active pair or /sys/devices/system/cpu/cpu0/topology/thread_siblings_list (or equivalent). For /sys/devices/system/cpu/smt/control based files, the values must now be both "on" and "1" or the HT field will be reported as disabled. If you are having problems with tasker HT field values on a particular system, please contact Altair support with details on your system such as distribution name, release version, kernel version, method of HT/SMT configuration, and expected values. |
VOV-10904 | Monitor | None | Fixed issue that prevented the report from displaying the processes for the specified host(s) and/or user(s). Also fixed issue on this report that caused the UI to jump to the batch reporting page if table sorting/filtering was used. |
VOV-11292 | Accelerator Plus | AAP25068 | When submitting Accelerator Plus jobs that specify the name of the base queue,
vovwxd would schedule an
incorrect number of base queue jobs based on
minQueuedPerBucketPerNCQueue and
maxQueuedPerBucketPerNC , often scheduling
more than allowed. This has been addressed. |
VOV-11971 | Allocator | None | New Altair Allocator procedures LA::MarkVqAsOoq and LA:MarkResourceVqAsOoq have been added to the online help. |
VOV-12137 | Accelerator | CS0120974 | nc modify now accepts the syntax -G /group.user to change
the FairShare group of a queued job, and will allow the user to
be different than the job's original FairShare user so long as
the client has the correct ATTACH ACL on the new fsgroup. |
VOV-12253 | Monitor | CS0120851, CS0203350 | Enhanced MathLM parser to handle space in feature name. |
VOV-12510 | Monitor | CS0265568 | Initialization of the crypto signing key for web auth tokens was moved earlier in the vovserver initialization to address possible issues with trying to log into web pages while vovserver was still starting. |
VOV-12852 | Monitor | CS0132965 | Fixed parser error "can't read "feature": no such variable" and incorrect capacity value due to usage info not available. |
VOV-13088 | Allocator, Monitor | CS0155322 | Added an alert on server buffer overflow for all products. The alert message contains the client name if one is defined; otherwise it reports an unnamed client with its fd. |
VOV-13094 | Monitor | CS0193004 | Adjusted the Sentinel parser to handle features with unlimited capacity. |
VOV-13142 | Allocator | CS0160431 | If a daemon is already running, starting a daemon from web UI is a no-op (in previous releases daemon was stopped and started). |
VOV-13150 | Accelerator, Accelerator Plus | CS0160883 | When visiting a URL in the web UI (such as for a newly created job), a user who has not been authenticated would sometimes not get properly redirected to their original URL after logging in. This has been fixed. |
VOV-13165 | Accelerator | CS0127745, CS0172968 | vovreconciled kept on repeatedly adding "NRU resources with same handles" back to the job. This has been fixed. |
VOV-13219 | Monitor | CS0165626 | Updated embedded LM widgets URL in lm_widgets.php. |
VOV-13294 | All | CS0170164 | Updated support email address. |
VOV-13348 | FlowTracer | CS0172882 | Fixed issue where the vovconsole -fontsize option did not change the node label font. |
VOV-13389 | All | None | Fixed an error when clicking the Cleanup All Cached Files Used for Tasker Load button on the web page for a tasker. |
VOV-13418 | Monitor | CS0175623 | Fixed a custom time range issue in Monitor Batch Reporting that resulted in a zero-size report. |
VOV-13422 | All | CS0175205 | The vovdaemonmgr command must be run on the vovserver host, and if not, will result in an error message. |
VOV-13548 | Accelerator | CS0185445 | Fixed issue that caused the equal sign to be lost when copying negated resource requests from a preempted job to its resumer job. |
VOV-13579 | Accelerator | CS0187071, CS0213289 | Fixed a web UI bug when viewing multiqueue preemption rules with watched resources. |
VOV-13595 | Allocator | CS0188280 | Suppressed the log "Changing site ID of host..." which can be controlled by using the flag MicroCode. |
VOV-13669 | Monitor | CS0192634 | Added support for renaming custom groups and custom group types that contain SQL-sensitive characters. Also added the ability to create and rename custom groups and custom group types via the ftlm_accounts CLI utility. |
VOV-13671 | Monitor | CS0193959, CS0244917 | The ftlm_deobfuscate utility now resets the obfuscation count in the database for specific files that have been requested to be processed. |
VOV-13698 | Accelerator Plus | None | Fixed tasker premature idleness exit. |
VOV-13704 | Monitor | CS0190737, CS0237022 | In this release, TLS 1.3 support has been added for connections to the vovserver webport when the "internal"
webprovider is active, and TLS 1.2 is disabled by
default. TLS 1.2 and specific cipher suites can be enabled
through four new configuration parameters, which allow tuning
of the SSL/TLS versions the internal webserver supports. You
can view and modify these via vovservermgr
config and make them permanent inside policy.tcl:
http.minSSLVersion - Sets the minimum TLS version supported. Defaults to "TLSv1.3". Valid values are TLSv1, TLSv1.1, TLSv1.2, and TLSv1.3.
http.maxSSLVersion - Sets the maximum TLS version supported. Defaults to "TLSv1.3". Valid values are TLSv1, TLSv1.1, TLSv1.2, and TLSv1.3.
http.tls12Ciphers - A list of ciphers that TLS 1.2 and earlier will be restricted to. The default is an empty string, but it can be set to a string such as "ECDH+AESGCM:ECDH+AES256:ECDH+AES128:DH+3DES:!ADH:!AECDH:!MD5". For an explanation of the string format and what is acceptable, see "Cipher List Format" at https://www.openssl.org/docs/manmaster/man1/openssl-ciphers.html.
http.tls13Ciphers - A list of cipher SUITES used by TLS 1.3 (and presumably above, if newer TLS versions are introduced in the future). See the description of SSL_set_ciphersuites() at https://www.openssl.org/docs/manmaster/man3/SSL_CTX_set_cipher_list.html for an explanation. By default the value is an empty string, which tells OpenSSL to use its default, which according to the documentation is "TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256".
Changing any of these four values via vovservermgr config while the server is running will cause the internal webserver to stop and restart with the new configuration. We recommend using a tool such as sslscan, which is part of RHEL 8, to verify your changes. |
VOV-13708 | Accelerator | None | The Queued jobs numbers format has been modified to integer format. |
VOV-13709 | Accelerator | None | A fix has been added to address the browser refresh issue of the Set Details page. |
VOV-13728 | Accelerator | None | In the web UI, the following changes have been made:
|
VOV-13729 | Accelerator | None | The Sets names listed in the Recently Visited and Favorites lists in the Set Browser at the left side were made into hot links so that it would be easy to see the corresponding Set Detail page. Added fix to the Jobs Running and Jobs Queued numbers, for the Sets names listed in the Recently Visited and Favorites lists in the Set browser at the left side. |
VOV-13733 | Accelerator, Accelerator Plus | CS0196065 | Fixed Running jobs CGI error when taskers are missing. |
VOV-13734 | Accelerator | CS0189976 | Fixed issue that prevented the VncPolicyValidateEnvironment procedure's results from being applied to the job environment when using the -f option to process a command file. |
VOV-13738 | Accelerator Plus | None | Fixed a bug that caused a vovwxd error when adding nonexistent base queues. |
VOV-13783 | Accelerator, Monitor | None | When selecting ports for the web port, vov port, and read-only port, vovserver now does additional checking to ensure the same port number isn't inadvertently used for multiple services. Also, vov client connections now have a default send and receive timeout of 30s, which can be controlled by the environment variables VOV_SEND_TIMEOUT and VOV_RECV_TIMEOUT. These environment variables were supported on Windows in an earlier release and are now also honored on Linux. This should prevent commands like vovwait4server from hanging indefinitely if they accidentally try to connect to a listening port that is NOT the vov port. |
VOV-13790 | Accelerator | CS0203368 | The PTY_ERROR property is now set to YES for an interactive job when its terminal has been killed. |
VOV-13798 | Accelerator | CS0204909 | Fixed the error "USER ERROR: Failed subcommand info: can't read
"compatInfo(685614164,info)": no such element in array" for
'nc info -c <jobid associated with taskerlist>'
. |
VOV-13843 | Monitor | CS0208377 | Added a new option (-removelockfile) to lmmgr start. This instructs the script to remove the server lock file if it already exists. |
VOV-13846 | Accelerator | CS0202837 | Added new -confirmafter option to the vovtaskermgr start
operation that waits for a specified amount of time, then
confirms whether each tasker that was requested to start did, in
fact, start. |
VOV-13848 | Accelerator Plus | None | The bottom line of the table was updated to indicate the actual number of taskers displayed and the total in the current selection. |
VOV-13855 | Accelerator | None | The http.workerthreads parameter has been marked as Obsolete in the online documentation. |
VOV-13868 | Accelerator | CS0209055 | Fixed usage of VOV_MAX_WAIT_TO_RECONNECT and VOV_MAX_WAIT_AFTER_CRASH environment variables so that one does not override/affect the other. |
VOV-13869 | All | None | vovtaskermgr/vovslavemgr configure now supports
-maxwaittoreconnect . |
VOV-13874 | Accelerator | None | The cycle length bullet has been modified from showing 'scheduler time' value to the inverse of 'cycle frequency' value. |
VOV-13877 | Accelerator, Accelerator Plus | None | nc hosts -ALL no longer shows duplicate consumables and extras. We added
-rl option to show the legacy resource
output. Note that -rl only shows the partial
list of resources. nc hosts -r now shows a
complete list of resources, including their status. |
VOV-13910 | Accelerator Plus | CS0213212, CS0303516 | If a job is dispatched to a tasker that is in the process of exiting, the job will be refused by the tasker and automatically rescheduled for execution up to the maximum number of times allowed by the autoRescheduleCount server configuration parameter. |
VOV-13914 | Accelerator Plus | None | Added documentation of the startWXLauncher configuration parameter to all vovwxd configuration file examples. |
VOV-13925 | Allocator | None | Corrected the resource map UI title to have only the resource name without extra parameters (like quantity). |
VOV-13930 | Allocator | None | Fixed error "no such element in array" on Allocator Resource Overview page. |
VOV-13945 | Accelerator, Accelerator Plus | CS0212259, CS0222585 | Interactive jobs using the vwi script no longer change the SHELL environment variable. |
VOV-13954 | Accelerator | None | The footer of nc -h previously contained unsubstituted Tcl expressions. This has been fixed. |
VOV-13961 | Accelerator | CS0204211 | nc run -e "<option>" now supports arbitrary length option, constrained
only by the value of the maxEnvLength
policy variable. |
VOV-14005 | Monitor | CS0219301 | Fixed Expire column content for feature with version pools. Will show:
Will NOT show: "some expired" - deprecated |
VOV-14007 | Accelerator, Accelerator Plus | CS0213217 | The SNAPPROP environment now prepends relevant VOVDIR paths to PATH in a manner similar to the SNAPSHOT environment. |
VOV-14011 | FlowTracer | None | Indirect tasker improvements:
|
VOV-14012 | Accelerator Plus | CS0221575 | Accelerator Plus in direct drive mode will now detect wait reason changes from the base queues for buckets without incoming jobs. |
VOV-14052 | All | None | The URL shown in the lmmgr, ncmgr, and vsi outputs now reflect the value of VOV_HOST_HTTP_NAME, if set. |
VOV-14073 | Accelerator | None | Fixed issues with bjobs -o option that prevented the specified output format from being generated. |
VOV-14077 | Accelerator | None | The "Use Altair Accelerator's REST API to Submit and List Jobs" tutorial has been updated and improved. |
VOV-14091 | Accelerator | CS0227548, CS0296810 | A queued job is now properly dispatched when one of the running jobs has been suspended with the
-manualresume flag. |
VOV-14095 | Hero | None | New implementation of Hero leveraging the Accelerator job submission engine. Native support for calendar based reservations and a connector for metrics reporting via Monitor. |
VOV-14112 | Accelerator | CS0229221 | Fixed logic to properly preempt jobs with no requested license resources. |
VOV-14128 | All | None | Some minor typos were fixed in product help. |
VOV-14153 | Accelerator, Accelerator Plus | None | Fixed issue that prevented vovtaskermgr stop -sick from stopping sick
taskers that still have a client attached to the server. |
VOV-14156 | All | CS0233956 | The vovcleanup utility now removes empty directories which are not
expected to be persistent. This includes job profile and wave
directories under SWD/data. Additionally, a
new configuration item has been added to independently control
the cleanup of resource-based wave data files: set
config(cleanup,waves,resources) 90d |
VOV-14158 | Accelerator | None | Fixed truncated output from vovversion -clients. |
VOV-14182 | Monitor | None | LM checkout usage, denial statistics, usage comparison, and usage trends along with ftlm_batch_report with same options can be done now "by user/host" in addition to other available options. |
VOV-14217 | Accelerator Plus | CS0237217 | An issue that prevented dp jobs from successfully being run via Accelerator Plus has been resolved. |
VOV-14228 | Accelerator | None | The documentation for logging interactive jobs has been improved to clarify which options are best suited for specific scenarios. |
VOV-14290 | All | None | The output format of the nc hosts command has changed. Instead of showing suspended jobs as part of the count of running jobs, we now show suspended jobs as a separate category. For example, if a host has 8 running jobs and 1 suspended job with 8 total slots, the "JOBS" section of the output for that host would previously read "9/8". It will now read "1/8/8", formatted as <suspended count>/<running count>/<total slots>. This change also pertains to relevant sections of the HTML UI. |
VOV-14307 | Accelerator | CS0245432 | Fixed bug where various web pages were sending lists of items with the wrong delimiter. Also added delayed auto refresh to certain job and project based pages when starting/stopping/removing listed items. |
VOV-14359 | Accelerator, Accelerator Plus, FlowTracer | CS0160432 | Fixed issue that could cause jobs that are dispatched to an indirect tasker to be spawned with /tmp as their working directory following a restart of the base queue to which the indirect tasker is connected. Note that this fix applies to indirect taskers defined by vtk_tasker_nc only, with taskers defined by the legacy vtk_slave_nc call remaining the same. |
VOV-14362 | Accelerator, Accelerator Plus, Monitor | CS0248090 | Fixes a bug where versions of LM prior to 2021.1.0 could not communicate properly with 2021.1.0 and later versions of NC, and versions of NC prior to 2021.1.0 could not communicate properly with 2021.1.0 and later versions of LM. |
VOV-14369 | Monitor | None | Fixed issue that caused ftlm_batch_report to hang when generating reports with static images or when extracting static images from a previously generated report that contains dynamic images. |
VOV-14372 | Accelerator | None | Fixed issue on Windows that prevented the specified job environment from being applied for execution. |
VOV-14430 | Accelerator, Accelerator Plus | CS0254035 | VOV_JOB_DESC(jobclass) global variable is set for every job class when sourcing its TCL config file. |
VOV-14435 | FlowTracer | None | Fixed a timestamp issue with skipped jobs that caused their valid outputs to turn invalid. |
VOV-14454 | Accelerator | None | Fixed an issue where an SSL cert that contains a CA chain was not handled properly by the internal webserver, causing some SSL clients and libraries (curl, Python) to issue an error about the validity of the SSL certificate being offered by the webserver. |
VOV-14455 | FlowTracer | CS0257984 | Fixed local resource accounting error when using array submission with FlowTracer and vovwxd with an LSF base queue that resulted in lingering tasker definitions that consumed local limits. |
VOV-14459 | All | None | Fixed issue where specifying an instance name was not being handled correctly by the vovreadlic utility, resulting in the instance not being found in the VOV registry. Also added support for ALM licensing. |
VOV-14463 | Accelerator Plus | CS0259047 | Fixed error when tasker cannot reconnect to Accelerator Plus after server was frozen and restarted on different port |
VOV-14468 | Accelerator, Accelerator Plus | None | A defect was fixed where stopping and starting a daemon through the web UI could cause the web page to show an error rather than correctly reload the daemons status page. This only occurs when webserver is set to internal. The error was cosmetic in that the user could reload the daemons page or any other page, and vovserver also continued to operate normally. |
VOV-14470; VOV-14581; VOV-14639 | Accelerator | None | Fixed a crash in the internal webserver when certain URI requests were received. |
VOV-14487 | FlowTracer | None | Fixed a problem with FlowTracer when used with the vovwxd daemon. In certain cases, an insufficient number of demand jobs were launched to the base queue for a bucket. |
VOV-14492 | Allocator | CS0284342 | Fixed issue causing double counting of NRU matched tokens, which resulted in underutilization of licenses. |
VOV-14494 | Accelerator, Accelerator Plus, Allocator, Monitor | CS0259047 | Set the VOV_SO_REUSEADDR environment variable to allow the VOV server port to be reused when the VOV server is restarted. |
VOV-14495 | Accelerator | None | Restored an alert that is generated by vovresourced when there is no match to a license feature in LM that has been requested specifically by vtk_flexlm_monitor. |
VOV-14497 | Accelerator, Accelerator Plus | CS0262302 | Fixed issue that caused the array SNAPPROP property error 'SNAPPROP environment: continue sentinel missing'. |
VOV-14498 | Accelerator | CS0261544 | Handled temporary directory write permission error by creating a temporary script in the current directory when /tmp, /var/tmp, /usr/tmp don't have write permission. |
VOV-14503 | Accelerator, Accelerator Plus | CS0261567 | Enhanced job modification through the web UI or NC GUI to append the NC_MODIFY_LOG property and log the modification details. |
VOV-14535 | Accelerator Plus | None | Fixed a bug in vovwxd that caused it not to process new buckets when a base queue was inaccessible. |
VOV-14563 | All | CS0278895 | Fixes an issue where large queries made using vovselect, vtk_select_loop or related commands may attempt to refresh their data infinitely, causing a slowdown in the responsiveness of the server. |
VOV-14564 | All | CS0276473 | Removed an unwanted message from the server log that is printed when a license key file is processed. |
VOV-14566 | Monitor | None | Fixed an issue with the fatal error handler in ftlm_parse_flexlm that caused the parser to exit uncleanly instead of printing the error and generating an alert. |
VOV-14572 | Accelerator | CS0265596 | A bug was fixed that was responsible for errors being generated when running vovnotifyd. |
VOV-14599 | Accelerator | None | A bug was fixed that prevented job classes from being created via the web interface. |
VOV-14611 | Accelerator | | In prior releases users were able to specify the "-r slots/N" resource spec on nc run with values of N that were not equal to 1. This behavior was not recommended in the past, and is no longer supported in this release. The SLOTS resource spec in -r will be ignored (set to SLOTS/1) going forward. If a job will use more than one processor, use the CPUS or CORES resource instead of SLOTS. |
VOV-14614 | FlowTracer | CS0277985 | Added null pointer safety within the "vrt" job wrapper's instrumentation library. |
VOV-14640 | FlowTracer | CS0294142, CS0323287 | Fixed a bug in vovwxd that was causing improper deletion of sick tasker objects |
VOV-14683 | All | None | Fixed issues with the vsi, vovbrowser, and nc run commands that prevented them from showing the correct URL to the web UI under certain configurations. |
VOV-14693 | All | None | Some web security vulnerabilities were fixed. |
VOV-14712 | All | None | The nginx webserver, which is used when the "webserver=nginx" configuration parameter is set, has been updated to the latest version 1.21.0. |
VOV-14722 | Accelerator | 123 | An issue with partialTool for DP jobs was identified, where it could incorrectly bind to just the IPV4 or the IPV6 port. It has been fixed to bind to both IPV4 and IPV6 ports simultaneously if the kernel has them enabled. |
VOV-14724 | FlowTracer | CS0301482 | Updated handling of the job state reported by bjobs to include PSUSP, SSUSP, USUSP, and UNKWN. |
VOV-14725 | Hero | None | Hero 2.0 configuration updated to deal with Palladium Z2. |
VOV-14735 | Accelerator, Accelerator Plus | CS0220480 | Changed the output of the nc hosts command to improve clarity. Fixed a bug where a tasker with more than (max capacity - capacity) suspended jobs would cause the server to erroneously mark queued jobs as running. |
VOV-14737 | Accelerator | None | Corrected some documentation about a distributed parallel job query command example. |
VOV-14745 | Monitor | None | Fixed issue that caused permanent server licenses to be reported as expired. |
VOV-14790 | All | None | The openssl library used by various parts of Accelerator was upgraded to the latest version to incorporate the latest security patches. |
VOV-14822 | Accelerator, Accelerator Plus | CS0309335, CS0314259 | Fixed issue causing nc wait -dir to wait for jobs in other directories. |
VOV-14824 | All | None | Fixed a denial-of-service (DoS) vulnerability in vovserver's base web server that could be triggered by large malicious HTTP requests. |
VOV-14830 | FlowTracer | CS0310567 | Removed the environment variable listing at the beginning of the tasker log file for WX taskers. |
VOV-14840 | Accelerator | None | The "group" array field for VOV_JOB_DESC erroneously listed both "-g" and "-G" as options. Only "-g" is appropriate. The "-G" option has been updated to reflect the appropriate field, "group,final." |
VOV-14870 | Monitor | CS0312659 | If the output of batch reporting is set to multiple files, the tag filter is now taken into consideration. |
VOV-14906 | All | None | Fixed Altair License Manager license checkout failure due to version mismatch. |
VOV-14907 | Accelerator Plus, Hero | None | A crash was fixed where vovserver may refer to a bucket ID that's not a valid bucket. |
VOV-14911 | Accelerator | CS0304721 | Changed the use of the disable file access parameter in Tcl to be in line with the cpp implementation. When file access is set to 1, no one has access to the contents of a file, including the owner of the queue. If the file access is set to 2, then only the owner of the queue has access to contents of files. |
VOV-14960 | Accelerator | None | The environment variable LD_LIBRARY_PATH is no longer set by the BASE environment. |
VOV-14997 | Monitor | None | Added a rule in the LM CSS that hides any img child element. If the LM CSS is not loaded, which is the case for all other products, the wrapping element has no effect and, as a result, the img .gif is displayed. In effect, the old icons are used as a fallback if Bootstrap is not loaded. |
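
As a rough illustration of the VOV-14611 change above, the sketch below models how a SLOTS/N token in a resource spec could be normalized to SLOTS/1. It is not the product's implementation: the function name `normalize_slots` and the assumption that resource tokens are whitespace-separated are hypothetical and used only for illustration.

```python
import re

# Hypothetical model of the VOV-14611 behavior: any SLOTS/N token is
# treated as SLOTS/1. Assumes whitespace-separated resource tokens.
def normalize_slots(resource_spec: str) -> str:
    """Rewrite every slots/N token to SLOTS/1, leaving other tokens as-is."""
    normalized = []
    for token in resource_spec.split():
        if re.fullmatch(r"slots/\d+", token, flags=re.IGNORECASE):
            normalized.append("SLOTS/1")
        else:
            normalized.append(token)
    return " ".join(normalized)


if __name__ == "__main__":
    # A job that needs several processors should request CORES (or CPUS)
    # rather than SLOTS/N.
    print(normalize_slots("slots/4 CORES/4"))
```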
2022.1.0-p1 Release Notes
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-14822 | Accelerator, Accelerator Plus | CS0309335, CS0314259 | Fixed issue causing nc wait -dir to wait for
jobs in other directories. |
VOV-15106 | Accelerator Plus | CS0325511 | Fixed issue causing nc wait to exit with
error "Failed subcommand wait: Illegal object id". |
VOV-15148 | Accelerator, Accelerator Plus | | Fixed issue causing nc wait to exit with error "can't read "jInfo(exit)": no such element in array". |
VOV-15149 | Hero | | The following targets were added to the Palladium emulators: BRD144, BRD72, BRD48, BRD36, BRD24, BRD18, BRD16, BRD12, BRD9, BRD8, BRD6, BRD5, BRD4, BRD3, BRD2 & BRD1. These targets correspond to placements with the corresponding number of boards. For example, if the emulator is configured with a group named PZ1, then a job for a Palladium with 8 racks would use the resource HERO:PZ1_BRD144. The placement rules are restrictive and will evolve with experience. |
VOV-15245 | Accelerator, Accelerator Plus | CS0325511 | Updated nc wait to filter events by job ID when waiting on 10 or fewer jobs. |
2021.2.0 Release Notes
New Features
Products | Internal Number | Case Number | Description |
---|---|---|---|
Accelerator | VOV-12728 | CS0133915 | Added new usage options to nc modify to allow an Admin to increase/decrease grabbed resourcemaps for running jobs and to allow selection of jobs for all nc modify use cases by specifying a selection rule. |
Accelerator | VOV-12822 | None | The web UI dashboard page has a Counters section that shows the number of active Users for an NC queue. This counter had been 0 in past releases. This has been fixed to show the correct value. |
Monitor | VOV-12955 | None | Jobclass initialization has been moved into the liveness script live_init_jobclasses.tcl instead of vovresourced. |
Accelerator | VOV-12956 | None | Timevar definitions are now processed in two configuration files. Timevar
definitions may now be placed in the new configuration file
SWD/config/timevars.tcl, where they are processed by the
new VOV liveness script. This new config file is the preferred
place for Timevar definitions going forward. Timevar definitions in
SWD/resources.tcl will also continue to be processed by
vovresourced for compatibility with prior releases. |
Accelerator | VOV-13090 | CS0141521 | Support has been added to allow custom values to be used for PIPELOG related ports and range via VOV_PIPELOG_FIRST and VOV_PIPELOG_RANGE. The VOV_CONTAINER_NETWORK_PROXY environment variable has also been added to better support nested container resources. See the example container config file containers/c3-enter.sh for more details. |
Accelerator Plus, Hero | VOV-13136 | None | Integrated Accelerator Plus and Hero with Altair License Manager for both node-locked and floating licenses. |
Allocator, Accelerator | VOV-13137 | None | Integrated Allocator and Accelerator with Altair License Manager for both node-locked and floating licenses. |
Accelerator | VOV-13625 | None | Added new NUMUSERS field to the SERVER object. The new field contains the count of users who are currently connected to vovserver with a web or CLI client or a running job. Vovserver updates this field every 10s. |
Accelerator | VOV-13654 | None | In Dashboard UI, the server vital signs widget will have the donut under the bullet graph bars. |
Accelerator | VOV-13774 | None | By setting the new server configuration option with vovservermgr config
slave.childProcessCleanupExclusions someChildDaemon, in conjunction with
vovservermgr config slave.childProcessCleanup 1, users
can now exclude processes by name from child process cleanup. When
slave.childProcessCleanup is set, slaves
kill all of a job's child processes when that job exits, except for those
named in the comma-separated slave.childProcessCleanupExclusions
list. The default value, if not set, is the empty string, "". |
All | VOV-13806 | None | The web server used to provide Accelerator products' web UI interface and
HTTP interface to the main server is changed from the internal web server to nginx
in this release. If the internal web server is preferred, it can be selected via
the -webprovider option on the ncmgr start
command. The impact to users will be that with the new default nginx web server,
the REST v3 interface and the Accelerator administrator web UI dashboard page will
not be available for use. If either of these capabilities is needed, you should
select the internal web server option when the Accelerator queue is
started. |
Accelerator | VOV-13839 | None | With the webport enabled and the webprovider set to "internal" to use the
REST service, worker threads dedicated to servicing REST requests have a label of
either "RESTService" or "RESTRequestHandler". This can be seen by calling
ps -T -p PID with PID being the process id of
vovserver. |
All | VOV-13841 | None | Added new liverecorder.mode configuration parameter for taskers that can be used to specify whether the main tasker process, the subtasker process, or both processes should generate a LiveRecorder recording file. Note that recording files will be generated per job that is executed if subtasker recording is enabled. The default mode is for the main tasker process to be the only one to generate a recording. |
All | VOV-13947 | None | The Installation Guide has been updated to include the information for the Altair License Manager. |
FlowTracer | VOV-13372 | CS0182759 | Enhanced Job status bar to show the colors for the valid and failed jobs
based on their exit status. This functionality can be controlled by
::VovGUI::configJobStatusBar in the
gui.tcl file. All valid jobs with assigned color will
appear before default valid (green color), same for failed jobs. |
FlowTracer | VOV-13135 | None | Integrated Altair License Manager with FlowTracer for both node-locked and floating licenses. |
Hero | VOV-13138 | None | Integrated Hero with Altair License Manager for both node-locked and floating licenses. |
Monitor | VOV-13133 | None | Integrated Altair License Manager with Monitor for both node-locked and
floating licenses. Added a config key config(alm.enable) in
policy.tcl to enable the ALM licensing. The default license
manager for Monitor remains the Reprise License Manager (RLM). |
Monitor | VOV-13776 | None | Implemented Grace Period for Altair License Manager. |
Accelerator, Accelerator Plus | VOV-13778 | None | Implemented licensing modes 'Full' and 'N' for Altair License Manager for
Accelerator and Accelerator Plus. This can be set using config key
config(enterpriselicense) in
policy.tcl. |
Monitor | VOV-13777 | None | Enhanced license UI to show Altair License Manager status and make changes to the current license environment. |
Allocator, Monitor, Accelerator | VOV-13134 | None | Updated Allocator, Monitor, and Accelerator with the actual (new) features names for Altair License Manager. |
All | VOV-13617 | None | The SSL implementation has been upgraded using the latest third party libraries, OpenSSL version 1.1.1. |
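
The exclusion behavior described for VOV-13774 above can be pictured with a small model. This is only a sketch under stated assumptions: the function `children_to_kill` and the idea of matching children by exact process name are hypothetical, and the real tasker logic may differ. Only the comma-separated list format and the empty-string default come from the note itself.

```python
# Hypothetical model of slave.childProcessCleanupExclusions: the value is a
# comma-separated list of process names; the default is the empty string.
def children_to_kill(child_names, exclusions=""):
    """Return the child process names that cleanup would still terminate."""
    excluded = {name.strip() for name in exclusions.split(",") if name.strip()}
    return [name for name in child_names if name not in excluded]


if __name__ == "__main__":
    children = ["helper.sh", "someChildDaemon", "worker"]
    # With the default (empty) exclusion list, every child is cleaned up.
    print(children_to_kill(children))
    # With an exclusion configured, the named daemon is left running.
    print(children_to_kill(children, "someChildDaemon"))
```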
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-12564 | None | Changed default LiveRecorder log directory from . to /tmp. This mainly affects the default behavior for vovserver, whose working directory is the parent of the SWD, which is often stored in an NFS-based location. |
Accelerator | VOV-12822 | None | The web UI dashboard page has a Counters section that shows the number of active Users for an NC queue. This counter had been 0 in past releases. This has been fixed to show the correct value. |
All | VOV-13180 | None | Robustness changes to mitigate vovsh backtraces linked to query memory management. |
All | VOV-12963 | CS0143849 | Fixed issue that could cause object IDs to improperly recycle after multiple rollovers. |
All | VOV-13950 | None | Fixed test.check_TIMEVAR.sh, which was failing primarily due to clock
skew. |
All | VOV-14075 | None | Fix the internal web provider's HTTP responses to include security headers. Previously, the internal web provider did not correctly return the security headers in the following case: when HTTP requests were made to the vovserver web port, responses did not include the standard HTTP security headers for URLs beginning with "/doc". |
All | VOV-7887 | 21377 | Clarified documentation of VOV_LIMIT_vmemoryuse. |
All | VOV-14143 | None | Added missing vovservermgr.bat and vovclientmgr.bat scripts to the Windows package. |
All | VOV-13183 | None | Fixed a potential memory leak when a client running a long-running query is unexpectedly terminated, causing vovserver to permanently mark a query as "in-use". |
All | VOV-14061 | None | Fixed an HTTP security header setting that prevented some icons from appearing in the Altair online documentation. |
Accelerator | VOV-13921 | None | In the past, when you changed an initJobClass procedure, you had to restart vovresourced to put the change into effect. Restarting vovresourced is no longer necessary. The online help has been updated to reflect this. |
Accelerator | VOV-13764 | None | Changing http.proxytimeout requires the REST service be restarted if it is already running. This can be done by setting the webport to 0, waiting 30 seconds, and then setting it back to the desired webport number. |
Accelerator | VOV-13380 | None | If the vovserver web interface is enabled by setting the web port, then under
some high load conditions the web port interface would hang up. In the 2021.2.0
release, the default web server reverts to nginx, which is immune to this issue.
As a result, the default vovserver configuration will not support REST v3 or the
Accelerator dashboard UI page. To enable REST v3 and the dashboard UI page, a new
web server option may be specified by the -webprovider internal
option on the ncmgr start or vovproject start
commands. |
Accelerator | VOV-13167 | None | Memory reporting within the tools has changed on Linux to reflect what's reported by the Linux kernel in VmRSS rather than VmSize. The vovserver memory usage information from commands such as vsi, vovselect memorystats from server as well as the memory web page at "http://host:port/server?page=memory" will also report consistent memory use totals based on VmRSS. Also, vovselect memorystats from server and the memory statistics webpage have been enhanced to account for more of the "chunk" based memory pool allocations used within vovserver. |
Accelerator | VOV-13531 | None | A minor appearance improvement to the web dashboard UI page was made within the Capacity sub-window. |
Accelerator | VOV-13326 | None | Changed the VOV_DISABLE_SHARED_MEMORY_LOOKUP behavior to return the RSS. |
Accelerator | VOV-13769 | None | When using the internal REST server (as opposed to the nginx server), the
vendor library used to implement it was changed from cpprestsdk to Oat++. The back
end no longer allocates a static pool of worker threads to service requests,
controlled by the variable http.workerthreads . That parameter is
now ignored. The Oat++ backend creates a new thread to service each request and
terminates the thread after sending the response to the HTTP/REST client. |
Accelerator | VOV-13638 | CS0191754 | The RESD(typeList) parameter in the
vovreconciled/config.tcl file can be used to modify the
license types handled by vovreconciled. The parameter value is
a list of names, by default the value is {License}. The following types are not
supported and will be ignored if present: Limit, Policy, User, Group and Priority.
The type of License will be added if not specified. |
Accelerator | VOV-13549 | None | Fixed a script execution issue with message.cgi that occurred when the user did not have proper permissions to modify the underlying file. The UI now displays a message indicating the issue and disables the submit button. |
Accelerator | VOV-13561 | None | An internal bug that may have caused some set statistics to be reported inaccurately has been addressed in this release. |
Accelerator | VOV-11780 | AAP24453 | This release contains OpenSSL 1.1.1j, which does not exhibit the warning message of the previously packaged version, OpenSSL 1.0.2q. |
Accelerator | VOV-13720 | None | The header section now stays fixed at the top and is always visible to the user. |
Accelerator | VOV-13672 | None | In the Scheduler Vital Signs widget, the values under the horizontal bars are now in sync with the values shown in tooltips, respectively (tooltips appear on hover over the horizontal bars). |
Accelerator | VOV-13791 | None | The REST HTTP server has a new threading model that no longer uses a pool of
worker threads to service client connections. It now creates a new thread for each
connection and that thread terminates after transmitting a response to the client.
The vov variable http.workerthreads is deprecated. It is visible
but not changeable. |
Accelerator | VOV-13739 | None | The Set Browser link has been modified to point to the classic UI's Set Browser page. |
Accelerator | VOV-13816 | CS0205113 | Addressed an issue where license resources sometimes became unavailable when on life support. |
Accelerator | VOV-14110 | AAP24923 | Fixed a bug where TaskerClass.table based resources did not display properly in the web UI's extra resources column. |
Accelerator | VOV-10345 | 24403, 24469, 24648 | The following system taskers no longer consume a license: vovdbd, WXLauncher & maintainer. |
Accelerator | VOV-12107 | CS0120865 | Added cleanup of unknown process IDs which also fixes the flooding of tasker logs with the following error messages: "Must kill late child Pid...", "rakeChildren: Child process...", "does not exist anymore: assuming it is done..." |
Accelerator | VOV-13510 | CS0182762 | Fixed an issue causing license checkout with an empty shared (ISV) string, which resulted in duplicate license checkout. |
Accelerator | VOV-13880 | None | In the Scheduler Vital Signs widget, the values shown in the tooltip of the buckets bullet graph are now in sync with the number of jobs submitted. |
Accelerator | VOV-13860 | CS0208413, CS0208823 | Fixed issue that caused the tasker to overload vovserver with messages when a job execution attempt failed due to not being able to successfully fork out the subtasker process that is used to shepherd the job. |
Accelerator | VOV-9031 | 23103 | A description on how to set up a tasker in Windows has been added to the online help. |
Accelerator | VOV-13861 | CS0210064 | Fixed issue in which SIGALRM interrupted communications on interactive jobs using VOV_INTERACTIVE_PING keep alive method |
Accelerator | VOV-13771 | None | When making job related REST API calls, helpful error information is included in the REST response. If using the vov_rest_v3 Python API wrapper, the content of the error will be thrown inside VovRestException. |
Accelerator | VOV-13849 | CS0208895, CS0218919 | Fixed a bug where interactive (-I/-Ir) root privileged container jobs could result in a process group SIGINT being captured and accidentally sent to systemd, which could have adverse effects such as a system reboot on the subtasker host. |
Accelerator | VOV-14116 | None | Fixed a problem with changing vovserver's web server from "nginx" to "internal" using the vovservermgr config webprovider internal command. Previously, it was not possible to make this change without restarting vovserver. With this fix, the transition from nginx to the internal web server can be accomplished in three steps: 1) shut down nginx with the command vovdaemonmgr stop vovnginxd; 2) wait 5 seconds with sleep 5; 3) start the internal web server with vovservermgr config webprovider internal. |
Accelerator Plus | VOV-13956 | None | Fixed race condition with Accelerator Plus that caused jobs to fail due to placement on taskers reserved for different buckets. |
Accelerator Plus | VOV-13785 | CS0142115 | Fixed an issue that prevented jobs using a jobclass with VOV_JOB_DESC(interactive,useXdisplay) from successfully launching agents when run via Altair Accelerator. |
Accelerator Plus | VOV-13872 | CS0211355 | Fixed issue where SICK status Accelerator taskers were not removed after an
appropriate amount of time. The underlying cause was that there were still related
jobs running in the base queue, and was repaired by passing the
-forcerunning option to the NC base queue forget command for
taskers with a SICK status. |
Accelerator Plus | VOV-14020 | None | Fixed issue with Accelerator Plus in DirectDrive mode that prevented jobs from running when added to a bucket that was empty during a vovwxd daemon restart. |
Accelerator Plus | VOV-13731 | None | Fixed issue which prevented Accelerator Plus in Direct Drive mode from launching taskers for preexisting jobs when a base queue is restarted and no further jobs are incoming in the Accelerator Plus queue. |
Accelerator Plus | VOV-13917 | None | Improved logging to identify when Altair Accelerator Direct Drive feature is in use. On vovwxd startup, the vovwxd.log will contain: Initializing vovwxd with Direct Drive... ... vovwxd with Direct Drive initialization successful. During operation, the vovwxd.log will also contain the thread identifier "APPluginAccel" when running in Direct Drive mode. |
Accelerator Plus | VOV-13890 | None | Fixed issue which prevented WX taskers from reconnecting after server freeze/failover with fastexit enabled. |
Accelerator Plus | VOV-13650 | None | vovserver has a new config parameter
tasker.authorization.delay that specifies the time in seconds
that the server waits before authorizing new taskers. This parameter can be useful
for WX, where it can reduce the latency of dispatching a job to a newly requested
tasker. |
FlowTracer | VOV-13749 | None | Fixed issue in vov_lsf_agent that prevented it from launching a tasker. |
Monitor | VOV-8715 | 22305 | Warnings about a nonexistent user are now sent to stderr instead of stdout when ftlm_batch_report is run with a user filter. |
Monitor | VOV-13464 | CS0174914 | Added ADJUST_CAPACITY periodic maintenance task. |
Monitor | VOV-13633 | CS0188846 | Logs containing "Queued Programs" were not being recorded properly as queued requests for ftlm_parse_lstc. |
Monitor | VOV-13712 | CS0195003 | A denial plot showed incorrect data when the data binning size was 30s and the time span was several months. |
Monitor | VOV-13927 | None | Fixed issue that prevented ControlCenter jobs from executing on hosts that have an upper/mixed-case name. |
Monitor | VOV-13929 | None | Fixed issue that prevented the process monitoring facility from recording incoming running processes as checkouts. |
Monitor | VOV-7800 | 21302, 23839 | Fixed issue that caused the default bin interval for the denial plot to default to 30s instead of a dynamically calculated optimum value for the report time range. This caused the denials to be binned incorrectly for reports with time ranges that would result in more than 100k bins unless an explicit interval was specified in the report options. Protections were also added to prevent the acceptance of a bin interval that is too short for the report time range (any value that would result in more than 100k bins). |
Monitor | VOV-9665 | 23904 | Fixed issue that prevented the ftlm_capacity load operation from finding the data files necessary for loading feature capacity information into the database. |
None | VOV-10568 | CS0120645 | The HyperThreading field 'HT' on taskers previously reported incorrectly whether HT was available. The field now properly reports whether SMT (either Intel HyperThreading or AMD SMT) is currently available and enabled by directly checking the value of either /sys/devices/system/cpu/smt/control or /sys/devices/system/cpu/cpu0/topology/thread_siblings_list (or equivalent). |
All | VOV-13611 | CS0120932 | Resolved issue with parsing and filtering via selection rules values that may contain uint64 fields compared with hardcoded integer values. Additionally, users may now explicitly declare uint64 hardcoded values such as vovselect name,totalspace,freespace -from filesystems -where 'totalspace>1U AND freespace>1U' |
Accelerator | VOV-13689 | None | The Running-jobs axis scale numbers have been placed with an even space. |
Accelerator | VOV-14209 | CS0236878 | Fixed an issue with temporary loss of key file license registration by
vovserver when the enterpriselicense configuration parameter was
not explicitly set in policy.tcl. In these cases a
vovproject sanity resulted in a temporary switch to RLM
licensing. Some specific low level changes made were as follows:
|
Accelerator, Monitor | VOV-14180 | | Fixed a bug that arose when SSL certificate files were added by the admin with file names correctly derived from the fully qualified host name from VOV_HOST_HTTP_NAME. The Accelerator products would not initialize the webport and the web UI URL properly when the "internal" webprovider was activated. |
Accelerator | VOV-14220 | None | Fixed issue that could result in a leak of a file descriptor in the tasker for an interactive job that has ended. |
Accelerator | VOV-14221 | CS0221756, CS0238663 | Fixed issue that could result in a leak of a file descriptor in the tasker for an interactive job that has ended. The resulting build up of old file descriptors was making the tasker (vovtaskerroot process) go into a "sick" state. |
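
The 100k-bin limit described for VOV-7800 above comes down to simple arithmetic, sketched below. This is an illustration only: the helper names are hypothetical, and the note does not say how Monitor computes its "dynamically calculated optimum" interval, so the code only derives the shortest interval that stays within the limit and checks whether a requested interval would be accepted.

```python
import math

# Limit taken from the release note: a report may not exceed 100k bins.
MAX_BINS = 100_000


def min_bin_interval(span_seconds: int) -> int:
    """Shortest whole-second interval that keeps the report within MAX_BINS."""
    return max(1, math.ceil(span_seconds / MAX_BINS))


def is_interval_acceptable(span_seconds: int, interval_seconds: int) -> bool:
    """True if the interval yields no more than MAX_BINS bins for the span."""
    return math.ceil(span_seconds / interval_seconds) <= MAX_BINS


if __name__ == "__main__":
    ninety_days = 90 * 24 * 3600  # 7,776,000 seconds
    # A 30 s interval over 90 days would need 259,200 bins, over the limit.
    print(is_interval_acceptable(ninety_days, 30))
    print(min_bin_interval(ninety_days))
```

Under these assumptions, a report spanning several months cannot use a 30-second bin, which matches the symptom described in VOV-13712 and VOV-7800.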
2021.2.0-p5 Release Notes
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-15827 | All | CS0412949 | Added robustness to code dealing with REST and web session key management. |
VOV-15886 | Accelerator Plus | None | Only one instance of vovwxd daemon can be launched for an AAP queue. Other attempts to start a vovwxd process will fail with an error. |
VOV-15617 | Accelerator Plus | CS0322667 | Fixed an issue where invalid launcher jobs (with an ID of 000000000) would get added to the user and jobclass sets, and cause the WX console to crash. |
2021.2.0-p4 Release Notes
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-15310 | Accelerator Plus | CS0355127 | Fixed a crash where an ambiguous "vovquery select id from 9" results in an object lookup that vovquery doesn't support. |
2021.2.0-p3 Release Notes
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-15245 | Accelerator, Accelerator Plus | CS0325511 | Updated nc wait to filter events by job ID when waiting on 10 or fewer jobs. |
2021.2.0-p2 Release Notes
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-14362 | Accelerator, Accelerator Plus, Monitor | CS0248090 | Fixes a bug where versions of LM prior to 2021.1.0 could not communicate properly with 2021.1.0 and later versions of NC, and versions of NC prior to 2021.1.0 could not communicate properly with 2021.1.0 and later versions of LM. |
2021.2.0-p1 Release Notes
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-13910 | Accelerator Plus | CS0213212, CS0303516 | If a job is dispatched to a tasker that is in the process of exiting, the job will be refused by the tasker and automatically rescheduled for execution up to the maximum number of times allowed by the autoRescheduleCount server configuration parameter. |
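
The retry limit described for VOV-13910 above can be modeled as a bounded loop. The sketch below is a simplified illustration, not the scheduler's code: the function `dispatch_with_reschedule`, the outcome labels, and the way refusals are represented as booleans are all hypothetical. Only the idea that automatic rescheduling stops at autoRescheduleCount comes from the note, which does not say what happens to the job after the limit is reached.

```python
# Hypothetical model: each entry in attempt_outcomes is True if the tasker
# accepted the job, or False if an exiting tasker refused it.
def dispatch_with_reschedule(attempt_outcomes, auto_reschedule_count):
    """Return (result, reschedules_used) for a sequence of dispatch attempts."""
    reschedules = 0
    for accepted in attempt_outcomes:
        if accepted:
            return ("accepted", reschedules)
        if reschedules >= auto_reschedule_count:
            # The limit is exhausted; no further automatic reschedule.
            return ("gave_up", reschedules)
        reschedules += 1
    return ("pending", reschedules)


if __name__ == "__main__":
    # Two refusals followed by an acceptance, with a limit of 2 reschedules.
    print(dispatch_with_reschedule([False, False, True], 2))
```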
2021.2.1 Release Notes
New Features
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-13249 | All | CS0120865 | A configuration parameter "logs,tasker,compress" (a boolean that defaults to 0) was added that, if set, will cause the daily logs to be compressed at the end of the day. This is similar to the existing "logs,server,compress" parameter. This parameter should be set in the policy.tcl configuration file. |
VOV-14345 | Accelerator Plus | None | Direct Drive now supports the vovwxd/config.tcl parameters slave,max and client,derate by pausing base queues when limits are reached. Note that due to the asynchronous nature of Direct Drive, these are soft limits and some overrun is expected. |
VOV-14171 | Accelerator Plus | None | vovwxd in direct drive mode can now use a driver script for bucket filtering. vovaccel.tcl is an example script and it can be specified in CONFIG(drive_script). |
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-14253 | All | CS0145466, CS0240541 | Fixes an issue where in some circumstances, vtk_select_loop would cause a crash in vovsh when requesting large result sets. |
VOV-14279 | Accelerator, Accelerator Plus | CS0241809 | Fixed issue causing error 'child process exited abnormally' for forgotten jobs. |
VOV-14287 | Accelerator | None | Corrected the script name that is called by the vovclientmgr.bat wrapper script. |
VOV-14286 | Accelerator | None | Fixed vovlicensemgr errors with ALM licensing. |
VOV-14218 | Accelerator | None | Fixed a crash that occurred when the Internal webprovider's SSL configuration and/or webport was changed. |
VOV-13510 | Accelerator | CS0182762 | Fixed an issue causing license checkout with an empty shared (ISV) string, which resulted in duplicate license checkout. |
VOV-13732 | Accelerator | CS0134814 | Fixed an issue with vovclientmgr closedeadinteractive where an error was thrown if jobs were forgotten while the command was running. Also added a -dry-run option to the above command. |
VOV-14118 | Accelerator | CS0230408 | An issue was fixed where vovgetgroups would fail if VOV_USE_VOVGETGROUPS was set to 1, and the user in question belonged to more than 128 groups. |
VOV-14262 | Accelerator | None | Some compatibility issues were found between the openssl library we use and the one provided on CentOS 8 that negatively affected our ability to validate user information on CentOS 8 when NIS is configured. The compatibility issue has been addressed. |
VOV-14302 | FlowTracer | None | Fixed a bug causing a Tcl error in the vovlsfd daemon when, for any reason, it cannot parse the output of the bsub command. |
VOV-14068 | Monitor | CS0219474 | Fixed incorrect summary calculation in Monitor tables. |
VOV-13693 | Monitor | CS0183043 | Fixed license parser issue causing the error - Illegal number of args: vtk_feature_get_or_create daemon name total version isv |
2021.2.1-p3 Release Notes
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-14822 | Accelerator, Accelerator Plus | CS0309335, CS0314259 | Fixed issue causing nc wait -dir to wait for
jobs in other directories. |
VOV-15106 | Accelerator Plus | CS0325511 | Fixed issue causing nc wait to exit with error "Failed subcommand wait: Illegal object id". |
VOV-15148 | Accelerator, Accelerator Plus | | Fixed issue causing nc wait to exit with error "can't read "jInfo(exit)": no such element in array". |
VOV-15245 | Accelerator, Accelerator Plus | CS0325511 | Updated nc wait to filter events by job ID when waiting on 10 or fewer jobs. |
VOV-15310 | Accelerator Plus | CS0355127 | Fixed a crash where an ambiguous vovquery select id
from 9 results in an object lookup that vovquery
doesn't support. |
2021.2.1-p2 Release Notes
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-14362 | Accelerator, Accelerator Plus, Monitor | CS0248090 | Fixes a bug where versions of LM prior to 2021.1.0 could not communicate properly with 2021.1.0 and later versions of NC, and versions of NC prior to 2021.1.0 could not communicate properly with 2021.1.0 and later versions of LM. |
2021.2.1-p1 Release Notes
New Features
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-14477 | All | CS0257852 | Taskers running as non-root will no longer get sent jobs unless the job's user matches the
non-root tasker's userid. This is to address a situation where a
job running on a non-root tasker gets access to the user's data
on the filesystem. This policy can be disabled by setting
allowForeignJobsOnUserTaskers to 1. |
VOV-14419 | All | None | Multiphase support is provided by two additional command arguments to nc run: -multiphase [1|0] and -mpres "resource string".
By specifying the resources of each phase and designating that certain resources are only allocated to certain taskers, you can run different phases of a job on different taskers. For example, suppose there are two taskers named tasker1 and tasker2, and you want to run phases 1 and 3 on tasker1 and phase 2 on tasker2. The resources may look like:
You could then run a multiphase job as:
A multiphase job will have two new Job Properties set:
If the script exits with an exit code of 216, nc will increment the job phase, change the job resources, and reschedule the job to run again. If the script exits with an exit code of 0, the job is considered "Done", and MPCURRENTPHASE is reset to 1.
Failed jobs: If a job fails during a phase with a code other than 0 or 216, it is considered FAILED and MPCURRENTPHASE will not increment. If the job is invalidated and re-run (for example,
Logging: After the first phase is run, subsequent phases of the job will have the command rewritten so that the wrappers are passed "-a -A", telling the wrappers to append to the job log. This ensures that all phases of the job get their stdout and stderr logged to the same file. Otherwise, each phase of the job would overwrite the log, and the user would only see the output from the last phase that was run. If nc does not detect one of the standard vov wrappers at the beginning of the command line, it will assume the command is not using a wrapper. In this case, it will look for the standard ">" redirect symbol in the command and replace it with ">>".
REST Support: In the payload for submitting a job via REST, two new fields are allowed:
|
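The exit-code rules described for VOV-14419 can be pictured as a small state machine. The Python sketch below only models those rules for illustration and is not Accelerator code: the function names and the status labels are hypothetical, while the exit codes 0 and 216 and the MPCURRENTPHASE property come from the note above.

```python
# Illustrative model of the multiphase exit-code rules (not Accelerator code).
ADVANCE_PHASE_EXIT_CODE = 216  # from the release note: move on to the next phase


def next_phase_state(current_phase, exit_code):
    """Return (status, MPCURRENTPHASE) after one phase of a job exits.

    The status strings are hypothetical labels chosen for this sketch.
    """
    if exit_code == ADVANCE_PHASE_EXIT_CODE:
        # The job is rescheduled to run again in the next phase.
        return ("RESCHEDULED", current_phase + 1)
    if exit_code == 0:
        # The job is done and the phase counter is reset to 1.
        return ("DONE", 1)
    # Any other exit code: the job fails and the phase does not increment.
    return ("FAILED", current_phase)


def run_phases(exit_codes):
    """Apply a sequence of per-phase exit codes, stopping at DONE or FAILED."""
    phase = 1
    history = []
    for code in exit_codes:
        status, phase = next_phase_state(phase, code)
        history.append((status, phase))
        if status != "RESCHEDULED":
            break
    return history
```

For a three-phase script that exits with 216 twice and then 0, `run_phases([216, 216, 0])` walks through phases 2 and 3 and ends with the phase counter back at 1.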
Resolved Issues
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-14465 | All | CS0257852 | Taskers running as non-root will no longer get sent jobs unless the job's user matches the
non-root tasker's userid. This is to address a situation where a
job running on a non-root tasker gets access to the user's data
on the filesystem. This can be disabled by setting
allowForeignJobsOnUserTaskers to 1. |
VOV-14217 | Accelerator Plus | CS0237217 | An issue that prevented DP jobs from successfully being run via Accelerator Plus has been resolved. |
VOV-13910 | Accelerator Plus | CS0213212, CS0303516 | If a job is dispatched to a tasker that is in the process of
exiting, the job will be refused by the tasker and automatically
rescheduled for execution up to the maximum number of times
allowed by the autoRescheduleCount server
configuration parameter. |
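The dispatch rule described for VOV-14465 can be summarized in a few lines. This Python sketch is a simplified model rather than the scheduler's implementation: the function and parameter names are hypothetical, and treating a tasker that runs as root as unaffected by the rule is an assumption drawn from the note's wording, which only restricts non-root taskers.

```python
def tasker_accepts_job(tasker_user, job_user, allow_foreign_jobs=False):
    """Model of the non-root tasker rule (hypothetical helper, not a VOV API).

    allow_foreign_jobs stands in for allowForeignJobsOnUserTaskers set to 1.
    """
    if tasker_user == "root":
        # Assumption: the rule in the note applies to non-root taskers only.
        return True
    if allow_foreign_jobs:
        # The policy has been disabled by the administrator.
        return True
    # A non-root tasker is only sent jobs that belong to its own user.
    return job_user == tasker_user
```

With the default policy, a tasker running as user alice accepts alice's jobs but is not sent jobs owned by bob.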
2021.1.0 Release Notes
New Features and Enhancements
The following new features and enhancements were introduced in this software release:
Product | Issue Number | Case Number | Description |
---|---|---|---|
All | VOV-9454 | 23741 | Most of the VOV Tcl files from the installation package now contain the proper Altair copyright statement and version number. |
All | VOV-12801 | | Introduced the new "tasker" lexicon for Accelerator product environment variables. Environment variable names containing the old term "SLAVE" get new names containing "TASKER" in place of "SLAVE". The old environment variable name is honored unless the new name is being used. This compatibility measure eases the transition for administrators. |
All | VOV-12797 | | New Tcl VTK function names have been added to move to the new "tasker" lexicon. Old VTK function names containing the string "slave" are deprecated; new VTK function names containing the string "tasker" have been added and should be used from this release forward. |
All | VOV-9297 | | Support for SuSE Linux Enterprise Server (SLES) 15 has been added to the Altair Accelerator products. |
All | VOV-12111 | | Support for CentOS and RHEL 8 has been added to the Altair Accelerator products. |
All | VOV-13321 | | Support for Ubuntu 14.04 has been dropped. |
All | VOV-13306 | | This release discontinues support for the SLES 11 operating system. |
All | VOV-13127 | | All references to the term "slave" have been replaced with the new term "tasker" throughout the online help documentation. |
Accelerator | VOV-12708 | | A REST API guide and tutorial document has been added to the Accelerator documentation bookshelf reader. |
Accelerator | VOV-12537 | | SlaveLists are deprecated and replaced with TaskerLists with the following additional functionality: |
Accelerator | VOV-12458 | | Implemented "Dialpad" or "Waffle" menu for mobile screens and at high zoom levels on the dashboard UI page. |
Accelerator | VOV-12743 | | Implemented Subsets table in Set Detailed View page. |
Accelerator | VOV-12747 | | An actions dropdown menu has been added, enabling the actions (delete, run with priority) to be performed on selected jobs. Added a search bar, allowing the user to filter the jobs by entering search strings. |
Accelerator, Accelerator Plus | VOV-12150 | | The documentation shown by "ncmgr start -h" has been expanded to explain some additional features that require an Accelerator queue to have the webport enabled. In the 2020.1.0 release, the following new features require the webport: 1) REST v3 API and 2) the new administrator dashboard UI page. |
Accelerator | VOV-12964 | | Fixed CSS issues in bulk actions drop down. |
Accelerator | VOV-12812 | | Enabled client activity logging for nc cmd commands. |
Accelerator | VOV-12744 | | Added a Details section for the selected set in the Set Detailed View page. |
Accelerator | VOV-12739 | Added UI functionality in the | screen. Users can now perform actions like retrace, and delete on sets, and also filter the displayed list of sets by text string.|
Accelerator | VOV-12460 | | Added NC queue color to the dashboard user interface. |
Accelerator | VOV-12762 | | Implemented storybooks for the Table component user interface. |
Accelerator | VOV-9778 | 23068, 23767, 23914, 24923 | Irrelevant alerts are no longer generated. Addressed some implementation issues with vtk_flexlm_exclude_tags. Note that calls to vtk_flexlm_exclude_tags are cumulative and override any tags added with vtk_flexlm_monitor and vtk_flexlm_monitor_all. The -noooq parameter for vtk_flexlm_monitor has no impact at present, please use vovresSetFlags instead. The -order parameter to vtk_flexlm_monitor and vtk_flexlm_monitor_all only orders any specified tags, it no longer adds tags (use -tags to add tags). The optional parameters vovResource, vovMap to vtk_flexlm_monitor are now handled correctly. |
Accelerator | VOV-12277 | | A new command option for nc run has been added called -dpinitialport N, which allows the user to specify the starting port that partialTool will use to find an open port to communicate among the subtasks in the cohort. This is reflected in a new job property named DP_INITIAL_PORT that is set on the job. |
Accelerator | VOV-12736 | | The React dashboard has now implemented a Sets List view. |
Accelerator | VOV-12742 | | Implemented Jobs table in Set Detailed View page. |
Accelerator | VOV-13292 | | Two new server configuration parameters have been added: http.workerthreads and http.proxytimeout. http.workerthreads specifies the number of worker threads that the new REST HTTP server will start when vovserver starts with a valid webport. http.proxytimeout enables you to specify the timeout in seconds, used when the main webserver forwards some requests, like CGI pages, to the older http server listening on the VOV port. |
Accelerator | VOV-11930 | CS0120821 | A new command line parameter was added for nc
run for dp jobs called
-nocohortwait . This instructs partialTool
for each cohort task to finish when its subtask process has
finished rather than wait for the primary job to complete (which
is the default behavior). Passing -nocohortwait
to nc run sets a new property named
DP_COHORTWAIT to 0. By default, this is set to 1 when
-nocohortwait is not passed, and
partialTool will behave like it always has. |
Accelerator, FlowTracer, Monitor | VOV-10198 | | Added support for Windows Server 2019. |
Accelerator | VOV-12947 | | Added breadcrumb navigation to the Sets page, through which the user can navigate to the hierarchical sets. |
Accelerator | VOV-12452 | | Added a sub-window for scheduler health monitoring and vital signs in the Accelerator admin dashboard web UI. |
Accelerator | VOV-12733 | | Support is added for Accelerator on ARM64 systems running CentOS 7, CentOS 8, or Amazon Linux 2. This support is for execution hosts and submit hosts only. The "armv8" hardware resource name is added for this architecture. |
Accelerator Plus | VOV-12295 | CS0121114 |
|
FlowTracer | VOV-12556 | | Added support for FlowTracer on Windows. |
Hero | VOV-12891 | | Added a -P <NAME=VALUE> parameter to the hero submit command (similar to the NC -P parameter). |
Hero | VOV-12932 | | Added the -modules, -stagein, and -stageout parameters to the hero -zebu submit command. The DeclareEmulator specification now includes the following parameters: -type (for future use), -environment, -zebu_system_dir, and -zebu_root. The commands specified in the -stagein and -stageout parameters depend on the resources Limit:zebu_stagein_load and Limit:zebu_stageout_load. |
Monitor | VOV-5671 | | The vtk_feature_add_or_create API now expects an additional parameter for the associated ISV string. See the documentation for the new syntax. The vtk_featureuser_* APIs have had their names changed to vtk_checkout_*. The old vtk_featureuser_* names are still supported but vtk_checkout_* will be the official documented names. |
Accelerator | VOV-6572 | | This should be fixed as a side effect of implementing the mutator API. See release notes for VOV-8899. |
All | VOV-9298 | | Support for Ubuntu 18.04 and 20.04 is added. |
All | VOV-13364 | | Starting with the 2021.1.0 release, the Accelerator Products images come with digitally signed certificates that can be used to reliably confirm authenticity of the installation media images. |
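The environment variable compatibility rule from VOV-12801 (the old "SLAVE" name is honored unless the new "TASKER" name is being used) amounts to a lookup with a fallback. The Python sketch below illustrates that rule only; the helper name and the variable names EXAMPLE_TASKER_OPTS and EXAMPLE_SLAVE_OPTS are hypothetical and do not refer to real product variables.

```python
import os


def lookup_tasker_variable(new_name, environ=None):
    """Return the value of new_name, falling back to its legacy SLAVE name.

    Hypothetical helper: the legacy name is derived by putting "SLAVE"
    back in place of "TASKER", mirroring the renaming described in the note.
    """
    env = os.environ if environ is None else environ
    legacy_name = new_name.replace("TASKER", "SLAVE")
    if new_name in env:
        # The new name wins whenever it is set.
        return env[new_name]
    # Otherwise the old name is still honored (None if neither is set).
    return env.get(legacy_name)
```

Passing a plain dictionary as `environ` makes the behavior easy to check without touching the real process environment.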
Resolved Issues
The following issues were resolved in this release.
Product | Issue Number | Case Number | Description |
---|---|---|---|
All | VOV-13252 | | Web server improvements (when the web port is configured as non-zero): |
All | VOV-13181 | | Some stability improvements were made in the Webserver code to avoid potential crashes of vovserver. |
All | VOV-12989 | CS0145649 | Fixed an issue where stopping more than 1 vovtasker by name (vovtaskermgr stop <tasker1> <tasker2>...) was renaming only the last named tasker to <taskername>_stopped_<timestamp>. |
All | VOV-9560 | 23740 | Fixed an issue with vovnotifyd using only the first RAM value for jobs with multiple RAM requests (e.g.: -r+ RAM/100 -r+ RAM/200) to determine if the job is exceeding requested RAM usage (health check of requested RAM). |
Accelerator | VOV-11388 | 25153 | Fixed an issue with the vovserver failing to start when epoll
is enabled (set config(useepoll) 1) in
policy.tcl. |
All | VOV-12582 | | The vovtaskermgr start command will now only utilize the configured rshcmd (one of: inetd/rsh/ssh/vovtsd) for starting remote taskers. Prior to this change, the inetd method was always attempted, and the vovtsd method would be attempted if the configured vovtsd port was non-zero. |
All | VOV-12512 | | All references to PBS Works support have been updated to direct the user to the new Altair One website. |
All | VOV-6287 | 20738 | Fixed the issue in the error message "too many elements in
array" where the max array was not getting updated as per the
config(maxJobArray) . |
All | VOV-13247 | | Fixed an issue where network security testing port scans could, in certain cases, cause vovserver to hang in an infinite loop. |
Accelerator, Accelerator Plus, Monitor | VOV-13009 | CS0133888 | In the help information displayed by nc cmd
vovdaemonmgr -h, a note was added indicating that
the -f (force) option applies only to the start
subcommand, and only when a daemon list is specified. |
Accelerator | VOV-12418 | CS0121215 | By default, interactive jobs will also write to a logfile
just like normal jobs do. You can also specify the log file with
the -l parameter to nc run
like normal jobs. If you do not want an interactive job to write
to the log file, use -nolog as a parameter to
nc run. |
Accelerator | VOV-12135 | | Fixed an issue where the axis labels in the jobs histogram on the dashboard UI page showed repeated "1" labels with a small number of running or queued jobs. |
Accelerator | VOV-13000 | CS0128274 | Fixed an issue where incrementing grabbed resources was not incrementing the count of used resources in some cases. |
Accelerator | VOV-13039 | CS0146315 | The network data sent as a result of nc info was made more compact, which will make running the command more efficient. |
Accelerator | VOV-13113 | | Fixed CSS issues in Set Browser page. |
Accelerator | VOV-13026 | CS0149221 | Handle window/weight inheritance for new FairShare groups that are being created during job submission. The window will be inherited from the parent. Both the window and weight will be inherited from a sibling group named "default". |
Accelerator | VOV-12714 | | Fixed the following issues with job container support: |
Accelerator | VOV-13108 | | Updated the footer version text to match the new API response. |
Accelerator | VOV-13272 | | Increased virtual memory limit for nc run. |
Accelerator, FlowTracer | VOV-12908 | | The new Accelerator dashboard UI for administrators, when accessed, will increase the vovserver memory "Size" metric printed by the vsi command. The large reported memory size is virtual memory address space size, with only a modest associated increase in actual memory usage. The number of worker threads used by the web server can be controlled with config(http.workerthreads) N in policy.tcl. Changing this value requires a vovserver restart, because it can only be set once before the multithreaded webserver is initialized. Also, a timeout value for when the multithreaded webserver has to delegate some requests, such as CGI pages, to the old vovserver web server can be configured by setting config(http.proxytimeout). This value can be changed at any time. |
Accelerator | VOV-12977 | CS0143428 | Fixed an issue with interactive jobs (nc run
-I ) failing with the error message "Job has
problems with PTY. Bad pipes". |
Accelerator | VOV-12811 | CS0129987 | vtk_resourcemap_set now requires that the user either own the resource, or the user have ADMIN security rights for it to take effect. |
Accelerator | VOV-10921 | 24781, 24803 | To better clarify jobs that have been queued due to reserved taskers, additional information has been added to the output of the nc why command. Under the "Per-slave/per-tasker analysis" section, a count of taskers that would have been compatible but are reserved will be shown as: 'n is currently reserved by others' |
Accelerator | VOV-5980 | 21105 | The nc modify command has been modified to exit with a status of 1 if any part of the modification request fails. |
Accelerator | VOV-12892 | | With certain types of product install methods, the Accelerator documentation bookshelf link in the Web UI had not been functional. This only impacted customers who download and un-tar both common.tar and win64.tar with the intent of installing both linux64 and win64 into the same master installation directory. The workaround was to un-tar and install win64 first, then go back and un-tar common.tar and linux64.tar, and then install linux64 only. If you had an existing installation, the workaround was to un-tar common.tar and reinstall linux64 only. |
Accelerator | VOV-12547 | | Fixed an issue with the -Il option for interactive jobs that prevented the user from typing in the terminal window and interacting with the job. |
Accelerator | VOV-11452 | | Added the -orphanreservations option to the vovforget command for forgetting reservations not attached to any tasker. Behavior is modified to allow overlapping reservations in the system, but an overlapping reservation will never be in effect unless the dominant reservation is deleted. Fixed an issue where the tasker reservation gets duplicated after server restart. Also, changed the tasker instance reservation (created using vtk_tasker_define (-reserve option) or by passing the -e option to vovtasker) to be non-persistent by default. No change in behavior for tasker reservations created using vtk_reservation_create. |
Accelerator | VOV-11662 | 29869 | Fixed "no such variable 'killTimePP'" alerts when health
checks are enabled for stuck jobs with
-stuckKillTime . |
Accelerator | VOV-9254 | 23430 | Fixed issue that prevented child FairShare groups from being displayed when viewing the top-level group via vovfsgroup show. |
Accelerator | VOV-11261 | | Addressed issue where delays were encountered due to vovserver not being immediately notified of an update. |
Accelerator | VOV-7490 | 20070, 24363 | Fixed an issue leading to "URGENT vovnotifyd Cannot send mail. can't read "code": no such variable" alerts. Reduced the severity to WARN, in case of failures to send mail. Also, added an alert if the list of recipients for notification emails is empty. |
Accelerator | VOV-12464 | CS0122942 | Requests for CGROUP:RAM with more than 1 RAM specification
will now limit RAM usage to the total amount requested by all
RAM specifications rather than the last one. For example the
command: nc run -r CGROUP:RAM RAM/60 RAM/40 -- sleep
0 will limit RAM usage to 100 megabytes rather than
40. |
Accelerator, Accelerator Plus | VOV-13030 | CS0149277 | Fixed an issue with Ctrl-C not working as expected with
interactive jobs (nc run -I/-Il/-Ir ). |
Accelerator, Accelerator Plus, FlowTracer | VOV-4998 | CS0143832 | For all products, strict job name checking has been enabled
and invalid job name characters will cause an error. For
Accelerator and Accelerator Plus, this can be overridden by
putting the following in
$VOVDIR/local/vncrun.config.tcl
or
Legacy will
use the more lax job naming rules from earlier releases. Replace
will identify invalid characters in the job name, replace them
with "_", and issue a warning to the console. An issue with
vsm being enabled to handle some invalid
job name characters was addressed. |
Accelerator, Accelerator Plus | VOV-13051 | CS0121039 | Fixed an issue with interactive jobs (nc run -I) failing with error messages similar to "Error=98: Address already in use [vovttyserver2:244]" and "FATAL ERROR: Cannot open PTY port (with remote signal handling): Cannot open pty server sockets [vncrun.tcl:2257]". This is accompanied by job errors similar to "Cannot connect to PTY server on submission host lava1 13316 Z@:x=XGa56cT_Hd6 from lava5". |
Accelerator | VOV-12519 | | For consistency across CLI and web UI, the default values for the following VovPreemptRule options have changed in some cases from previous versions. |
Accelerator | VOV-10558 | 23924 | Empty job class sets are not deleted, thereby preserving all properties for future submissions. |
Accelerator | VOV-12807 | | When you hover the mouse over the job graph line in the dashboard UI window, a small pop-up displays the Y unit, the Y number, and a time. The Y number is actually an average over a surrounding time window, and not an instantaneous value as implied by the information shown. |
Accelerator | VOV-13169 | | When vovserver was configured with webport and failover was configured, it was found that vovserver could lose access to the webport and get restarted with the webport disabled. This has been fixed. Also, if Accelerator was configured to use License Monitor, it was observed that failover could leave an extra copy of the voveventmon process launched by vovresourced running every time vovserver crashed and restarted. vovresourced has been modified to properly shut down voveventmon in the case where vovserver has crashed. |
Accelerator | VOV-13036 | | Fixed a server crash caused by memory corruption when running queries from a 2016.09 client. |
Allocator | VOV-10997 | CS0156459 | Suppressed the log, "Could not add FTResJob...", as it is not impacting the functionality. |
Allocator | VOV-12724 | CS0130000 | Fixed issues with the allocation of resource groups in Allocator. Prior implementations based the allocation on demand for the component resources only. The new implementation bases the allocation on demand for the resource group and all of its component resources. |
Allocator | VOV-11883 | CS0120726 | Fixed an issue where -ExcludeTagRx was ignored when the same feature was serviced by different daemons with different tags. |
Allocator | VOV-11894 | CS0120781 | Fixed CSV export of the | and tabular reports.
Allocator | VOV-11224 | 24981 | Fixed an issue with Allocator showing incorrect "Distributable" values when 'SetReserverForUser' is used with * (all users). |
Allocator | VOV-13025 | CS0145466 | Added a config key MQ(pjProbeKillTimeout)
for the maximum time that the vovlad daemon
should wait for existing probes to be killed at startup. |
FlowTracer | VOV-7956 | 21595 | Addressed issue that prevented alert text from being displayed in the vovconsole alerts window. |
FlowTracer | VOV-12960 | CS0143848 | Fixed issue with keyword substitution for array job submissions that caused arbitrary matches to the array reference job's ID and IDINT values to be substituted with the ID and IDINT values from the individual array jobs in the job metadata. For example, an array job submission of "echo 000001070" where the reference job was coincidentally job 1070 would result in the command being changed in each individual array job to reflect its own ID, such as "echo 000001072" for job 1, "echo 000001074" for job 2, and so on. |
FlowTracer | VOV-12815 | CS0137268 | Evaluation of resources when used with an indirect tasker (taskerVNC) now applies the jobclass followed by the resource list which is the opposite of what was done previously. |
FlowTracer | VOV-12203 | CS0120999 | Re-evaluation of a job class to compute the union of resources when used with an indirect tasker(taskerVNC) is no longer done. This is typically relevant for FlowTracer integration with either Accelerator (NC) or Accelerator Plus (WX). To restore the old behavior, please contact Altair support. |
FlowTracer | VOV-12901 | | Fixed an issue where the user may see PIPELOG related errors in the console on Windows, when running a FlowTracer job directly from the command line, such as: vov cmd.exe /c echo "Hello" |
FlowTracer | VOV-12918 | CS0142609 | Improved the behavior of the vovwxd vovconsole menu option, which configures and starts the vovwxd daemon. The daemon will be configured to use the default queue name (vnc) unless the NC_QUEUE environment variable is present. |
FlowTracer | VOV-10189 | | Schedule priority and execution priority are now saved in the persistent representation. |
FlowTracer | VOV-12813 | CS0137660 | systemjob state is now saved in the representation so that it's persistent across FlowTracer restarts. |
Hero | VOV-12931 | | Wrapper daemon now runs on the emulator vovtasker associated with the emulator. Previously it ran from wherever the autostart command was executed. Fixed an issue that prevented the command hero -zebu stop_all_wrappers from working correctly in some instances. |
Monitor | VOV-9774 | 24058, CS0121121 | Fixed problem parsing MathLM licenses when one of the HH, MM, or SS time values starts with 08 or 09. |
Monitor | VOV-12253 | CS0120851 | Fixed MathLM parser for features with "Sub" and space prefix. |
Monitor | VOV-12634 | CS0126701 | Fixed Altair Monitor GUI to correctly show expiration date if one of the licenses expired. |
Monitor | VOV-12324 | CS0121132, 21139 | The output format for more recent versions of Sentinel RMS has changed. The new format caused ftlm_parse_sentinel to incorrectly calculate capacity. It now recognizes the new format and only counts instances of capacity appearing inside a feature block. Support for older formats has been retained. |
Monitor | VOV-9100 | 23225 | Fixed LM report plotting with "Breakdown By Feature" option. |
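The arithmetic behind VOV-12464 (multiple RAM specifications with CGROUP:RAM) is easy to check by hand. The Python sketch below is an illustration only, not the scheduler's parser: the helper names are hypothetical, and the simple whitespace tokenizing is an assumption that covers only the form of resource string shown in the note's example.

```python
def requested_ram_values(resource_string):
    """Collect the numeric value of every RAM/<n> token (hypothetical helper)."""
    values = []
    for token in resource_string.split():
        if token.startswith("RAM/"):
            values.append(int(token.split("/", 1)[1]))
    return values


def cgroup_ram_limit_mb(resource_string):
    """Behavior described in the note: the limit is the total of all requests."""
    return sum(requested_ram_values(resource_string))


def previous_cgroup_ram_limit_mb(resource_string):
    """Earlier behavior per the note: only the last RAM request was used."""
    values = requested_ram_values(resource_string)
    return values[-1] if values else 0
```

For the resource string from the note, `cgroup_ram_limit_mb("CGROUP:RAM RAM/60 RAM/40")` gives 100, whereas the earlier behavior modeled by `previous_cgroup_ram_limit_mb` gives 40.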
2021.1.1 Release Notes
New Features
Products | Internal Number | Case Number | Description |
---|---|---|---|
Accelerator | VOV-13210 | None | Added functionality to perform actions on sets in the Set Browser page. |
Accelerator | VOV-12810 | None | The dashboard UI job graph shows actual Y values when the user hovers the mouse over the graphed lines. At the intersection points of the graph lines in the job plot graph, the tooltip shows the Y-axis values that intersect. |
Accelerator | VOV-12780 | 24397 | The commands nc run, vovset resources, and nc modify -res support binary unit conversion for all memory-based resources as a convenience, from petabytes (PB), terabytes (TB), or gigabytes (GB) to megabytes (MB), which is still used internally and reported by all commands. The input accepts either decimal or integer values, and unit suffixes are case-insensitive; for example, both nc run -r SWAP/1GB sleep 0 and nc run -r RAM/0.1Tb sleep 0 are supported. The parameter names for which this conversion is currently supported are RAM/, RAM#, RAMFREE#, RAMFREE/, RAMTOTAL#, RAMTOTAL/, SWAP/, SWAP#, SWAPFREE#, SWAPFREE/, SWAPTOTAL#, SWAPTOTAL/, and TMP# or TMP/. By default the unit is MB (megabytes), where 1 MB is 1<<20 bytes. |
Accelerator Plus, FlowTracer | VOV-12409 | None | Elastic taskers launched via vovwxd will detect and exit more quickly when their designated bucket is empty or deleted. Accelerator Plus queues using Direct Drive will detect empty queues and stop launching taskers for those buckets more quickly. This functionality is on by default and can be disabled by setting the vovwxd.fastexit server parameter to 0. |
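The binary unit conversion described for VOV-12780 can be worked through as follows. This Python sketch is a stand-alone illustration, not the product's parser: the function name is hypothetical, accepting an explicit "MB" suffix is an assumption, and returning an unrounded float is a choice made here because the note does not say how fractional results are rounded.

```python
import re

# Binary multipliers relative to 1 MB (1 << 20 bytes), per the release note.
_UNIT_TO_MB = {"MB": 1, "GB": 1024, "TB": 1024 ** 2, "PB": 1024 ** 3}
_AMOUNT = re.compile(r"^(\d+(?:\.\d+)?)([A-Za-z]*)$")


def to_megabytes(amount):
    """Convert strings such as '1GB' or '0.1Tb' to megabytes (hypothetical helper)."""
    match = _AMOUNT.match(amount.strip())
    if match is None:
        raise ValueError("unrecognized amount: %r" % amount)
    number, unit = match.groups()
    unit = unit.upper() or "MB"  # no suffix means the default unit, MB
    if unit not in _UNIT_TO_MB:
        raise ValueError("unsupported unit: %r" % unit)
    return float(number) * _UNIT_TO_MB[unit]
```

Under these assumptions, `SWAP/1GB` corresponds to 1024 MB and `RAM/0.1Tb` to roughly 104857.6 MB before any rounding the real commands may apply.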
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-13517 | None | Fixed issue that prevented the Property Editor GUI utility in vovconsole from launching successfully. |
All | VOV-13183 | None | Fixed a potential memory leak when a client running a long-running query is unexpectedly terminated, causing vovserver to permanently mark a query as "in-use". |
All | VOV-7887 | 21377 | Clarified documentation of VOV_LIMIT_vmemoryuse. |
All | VOV-13523 | None | Fixed an issue that caused vovshow -queries to fail when trying to access a non-existent array element. |
All | VOV-13568 | None | Fixed an issue where comma list resource maps were not selectable in the Web UI. |
All | VOV-13743 | None | Corrected format of output for vtk_time_psp that was returning hh:mm format when the given date/time was on a different day/year. Now returns hh:mm only for current day, otherwise month abbreviation and day (Apr 14) for the same year, else Year month abbreviation (Dec 2020). |
All | VOV-13706 | None | The vovtasker binary was missing from 2021.1.0. As a workaround, in previous product versions where it may be missing in the installation, copy vovslave to vovtasker. |
All | VOV-13610 | None | Fixed issue with displaying working taskers in Altair Accelerator 2021.1.0 |
All | VOV-13351 | CS0173855 | Fixed some cases where the Server Working Directory (SWD)
contained slaveClass.table or
taskerClass.table, and these
configuration files were ignored by taskers/slaves. Scenarios
that had been broken were:
|
All | VOV-13439 | CS0175205, CS0187053 | Code related to vovps has been modified to be more robust in handling non-fatal errors from the vovps command and to be more in line with typical ps command output. |
All | VOV-12897 | CS0140687, CS170776 | On some network configurations, a warning about IPV6 could be issued when running an INTERACTIVE job that could not be filtered out by turning down the verbosity. This has been fixed. The -v switch to nc run on an interactive job will turn off the warning. |
All | VOV-13014 | None | With multi-platform product installs, the Accelerator documentation bookshelf link in the Web UI was nonfunctional. This only impacted customers who download and un-tar both common.tar and win64.tar with the intent of installing both linux64 and win64 into the same master installation directory. |
All | VOV-13171 | CS0164654 | The timeout duration for PR saves can now be controlled via a server configuration parameter. |
All | VOV-13837 | None | Fixed issue that caused some vtx-wrapper links to point to an incorrect absolute path. |
All | VOV-13502 | None | This fix provides for mitigation of a hang in the http(s) service. It reinstates the nginx service found in earlier releases. The use of the patch is required for production systems using the https service - typically Accelerator. The patch prevents the use of the REST v3 API and the new web based Dashboard, which is dependent on the REST v3 API. A subsequent release will address this shortcoming. The use of nginx should be seen as temporary and a subsequent release is expected to provide integrated https within vovserver. To start nginx, pass -webprovider nginx to ncmgr start, lmmgr start, etc. You should see vovnginxd start as one of the vov daemons. vovservermgr config will also show the webprovider setting as being either 'internal' or 'nginx' depending on how you have configured the system. |
All | VOV-13416 | None | Added a configuration section to the sds.cfg file to allow inclusion of Kafka producer configuration properties, such as those needed to enable SSL communication. |
Accelerator | VOV-13162 | None | The "Match Jobs to Handles" HTML topic now reflects the code colors that coincide with the software. |
Accelerator | VOV-9353 | 23568 | In the past, a stopped tasker and a newly started tasker were not aware of each other's NUMA usage, and so could assign CPU or Node affinity that overlaps. Taskers using NUMA on the same machine, with the same vovhost and queue name, will now share NUMA usage to avoid over allocating NUMA resources on the same machine. |
Accelerator | VOV-7487 | 20455, CS0120837 | Passing bash functions through snapprop is a fragile
operation that only works when the following conditions hold:
Note: Bash encodes functions in two ways (subsequent to the 2014
Shellshock vulnerability):
|
Accelerator | VOV-7736 | 21176 | Fixed building of resource maps from resources with OR and AND words in resource names. |
Accelerator | VOV-13629 | CS0178114, CS0186671, CS0192772, CS0194045, CS0196466 | Fixed an issue where a failed PTY connection for a job would cause subsequent jobs on the tasker to fail as long as the original job was still running, and in some cases, the tasker could become unresponsive. |
Accelerator, Accelerator Plus | VOV-13651 | None | Fixed issue in node.cgi which resulted in the CPU Time displayed for job being multiplied by 1000. |
Accelerator, Accelerator Plus | VOV-13152 | CS0159375 | Fixed spurious error message when receiving (RESMAP,CHANGE) events in some clients. |
Accelerator, Accelerator Plus | VOV-13293 | CS0169911 | Added a check for ADMIN privilege that blocks regular users from stopping jobs with the command nc stop -allusers if the requesting user is not ADMIN. |
Accelerator | VOV-13161 | CS0163181 | Fixed issue where setting a project or site message in /cgi/messages.cgi would not result in a message being registered. |
Accelerator | VOV-13324 | None | Tasker based support has been added for the following
vovselect fieldnames: CHOSENTASKERID, LASTTASKERID, LASTTASKERNAME, TASKERGROUP, TASKERID, TASKERLIST, TASKERNAME, TASKERSLOTSSUSPENDABLE, TASKERSLOTSSUSPENDED, TASKERSLOTSUSED, TASKERSTATUS for jobs, TASKERID for clients, TASKERGROUP, TASKERHOST, TASKERNAME, TASKERSLOTSSUSPENDABLE, TASKERSLOTSSUSPENDED, TASKERSLOTSUSED, TASKERTYPE for slaves or taskers. The same are available for use as symbolics such as @LASTTASKERID@ or @TASKERNAME@, etc. |
Accelerator | VOV-13398 | None | An error in the online help regarding
vtk_server_config suddenshutdown
<server-pid> has been addressed. |
Accelerator | VOV-13388 | None | Fixed issue that can cause vovserver to crash upon receipt of a REST request when thread.service.max and thread.service.enable.query are both greater than zero in the vovserver policy. |
Accelerator | VOV-13363 | None | Fixed ncupgrade abort by changing the vovserver stdout message to Vovmessage ( stderr ) |
Accelerator | VOV-13346 | CS0120637, CS0164333, CS0186238 | Fixed an issue where a redirect in the nginx configuration would cause vovresourced to crash |
Accelerator | VOV-13413 | CS0175672 | Fixed issue where vwn incorrectly attempted to contact the server after the VOV_VW_PING interval. |
Accelerator | VOV-13438 | None | Removed vovproject enable command from ncupgrade so that it can read from stdin and can be used for testing automation. |
Accelerator, Accelerator Plus | VOV-13424 | CS0176272 | The handling of the resource parameter to vtk_flexlm_monitor has been improved. If a resource name is specified, that name is used as the actual resource name (in previous releases it was always prefixed with License:). If a resource name is not specified, it defaults to License:<feature>. This is consistent with vtk_flexlm_monitor_all behavior. |
Accelerator, Accelerator Plus | VOV-13465 | CS0180449 | Resolved issue when using vtk_tasker_define with -tsdport. |
Accelerator | VOV-13458 | CS0180796 | During jobclass initialization the VovUserError proc does not exit or generate any output. |
Accelerator Plus | VOV-13563 | CS0186685 | Fixed an issue which resulted in stalled WX buckets reporting a waitreason of License:xyz when really just waiting on HW in the base queue. Fixed an issue which prevented the setting of resmap.sw.types in the policy.tcl file. |
Accelerator Plus | VOV-13560 | None | The custom vnc_policy.tcl file for PBS integration is no longer required and should be removed upon upgrading to 2021.1.1. This file is located in $SWD/vnc_policy.tcl and was originally copied from $VOVDIR/../common/etc/config/vovwxd/vnc_policy_pbs.tcl. |
Accelerator Plus | VOV-13484 | None | Fixed an issue which could result in some slaves not being recognized as vovwxd slaves resulting in them not being counted toward max,slaves. Optimized scheduling for WX/PBS jobs by enabling the bucket shortcut and removing per slave resources. |
Accelerator Plus | VOV-13772 | None | The Accelerator Plus online help has been updated to reflect the addition of the Direct Drive functionality. |
Allocator | VOV-13562 | CS0187051 | Fixed a race-condition in Allocator that resulted in random crashes in complex configurations. |
FlowTracer | VOV-10426 | 24499 | NodeEditor has renamed 'In Queue' to 'Queue' and times shown are now based upon buckettime to give more accurate breakdown of the job's timeline. |
FlowTracer | VOV-12348 | None | A threshold of 4 is now applied before issuing warnings that WXLauncher is not running. |
FlowTracer | VOV-13419 | CS0172898 | Fixed vovconsole performance degradation for drawing the sets and switching between horizontal and vertical view. |
Monitor | VOV-13323 | None | Fixed issue that caused remote LM parser to fail and return no data. |
Monitor | VOV-13066 | CS0151123 | The registering of multiple hosts defined through the environment variable VOV_LICMON is now handled correctly. However, its use should be minimized due to the additional overhead involved and the consequent impact on job startup time. |
Monitor | VOV-13390 | None | Fixed an issue that prevented licenses provided by Altair license key files from being monitored. |
Monitor | VOV-13446 | None | Fixed rare issue that caused an "Unexpected return -90" message in the vovserver log when the top job of a bucket cannot be dispatched to a tasker at the time the dispatch function is called. |
Monitor | VOV-13440 | None | Fixed issue that prevented licensing detail tables from being displayed on the licensing administration web UI page on Windows. |
Monitor | VOV-13426 | CS0154262 | Some valid Accelerator and Monitor license key files with vovversion set for the early part of year 2021 were not working because of a bug in license keyfile validation code. |
Monitor | VOV-12980 | CS0144723 | Fixed ftlm_batch_report for checkouts that have not been moved yet to Altair Monitor database. |
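
The resource naming rule described for VOV-13424 above can be illustrated with a short sketch. This is not the vtk_flexlm_monitor implementation; `resolve_resource_name` is a hypothetical helper that only mirrors the rule as stated in the note.

```python
from typing import Optional


def resolve_resource_name(feature: str, resource: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the naming rule stated for VOV-13424."""
    if resource:
        # An explicitly specified resource name is used as-is;
        # no "License:" prefix is added.
        return resource
    # Without an explicit name, the default is License:<feature>.
    return f"License:{feature}"
```

Under this rule, a caller that wants the License: prefix on an explicit name has to include it in the name it passes.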
2021.1.1-p1 Release Notes
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
Accelerator | VOV-13861 | CS0210064 | Fixed issue in which SIGALRM interrupted communications on interactive jobs using VOV_INTERACTIVE_PING keep alive method. |
Accelerator | VOV-13860 | CS0208413, CS0208823 | Fixed issue that caused the tasker to overload vovserver with messages when a job execution attempt failed due to not being able to successfully fork out the subtasker process that is used to shepherd the job. |
Accelerator | VOV-13816 | CS0205113 | Addressed an issue where license resources sometimes became unavailable when on life support. |
Accelerator | VOV-13849 | CS0208895, CS0218919 | Fixed bug where interactive (-I/-Ir) root privileged container jobs potentially resulted in a process group SIGINT being captured and accidentally being sent to systemd, following which bad things may happen, such as a system reboot on subtasker host. |
Accelerator Plus | VOV-13890 | None | Fixed issue which prevented wx taskers from reconnecting after server freeze/failover with fastexit enabled. |
Accelerator Plus | VOV-13872 | CS0211355 | Fixed issue where SICK status Accelerator taskers were not removed after an appropriate amount of time. The underlying cause was that there were still related jobs running in the base queue, and was repaired by passing the -forcerunning option to the NC base queue forget command for taskers with a SICK status. |
2021.1.0-rs1 Patch Release Notes
The following new features and resolved issues were introduced this software release:
Product | Issue Number | Case Number | Description |
---|---|---|---|
Accelerator | VOV-13399 | Added Rapid Scaling - a feature that provides high-throughput, cost-conscious scheduling in the cloud. | |
Accelerator | VOV-13470 | The Rapid Scaling PDF has been updated to reflect the changes relative to the 2021.1.0-rs1 patch. |
2021.1.0-p1 Patch 1 Release Notes
New Features
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-13502 | None | This fix provides for mitigation of a hang in the http(s) service. It reinstates the nginx service found in earlier releases. The use of the patch is required for production systems using the https service - typically Accelerator. The patch prevents the use of the REST v3 API, a subsequent release will address this shortcoming. The use of nginx should be seen as temporary and a subsequent release is expected to provide integrated https within vovserver. To start nginx, pass -webprovider nginx to ncmgr start, lmmgr start, etc. You should see vovnginxd start as one of the vov daemons. vovservermgr config will also show the webprovider setting as being either 'internal' or 'nginx' depending on how you have configured the system. |
2020.1.0 Release
New Features and Enhancements
The following new features and enhancements were introduced this software release:
Product(s) | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-11377 | A new document viewer for the Accelerator product family is provided in the web UI. This document viewer provides a modernized interface with new client-side search capability. | |
All | VOV-11059 | Field descriptions have been populated for all supported fields. These can be queried via the "fieldesc" metadata field, available for each object. | |
Accelerator, Accelerator Plus | VOV-12279 | CS0121103 | The output of nc info and wx info now includes the project/queue name. |
All | VOV-11323 | vovdoc CLI utility is retired | |
All | VOV-11454 | vov_rest_v3.py is the new Python module used to make v3 REST API requests against vovserver. | |
All | VOV-11251 | Accelerated processing of Crash Recovery file. | |
All | VOV-10844 | Provide a REST API addition to allow job control. The following operations can be performed via the v3 REST API: 1. Dispatch 2. Forget 3. Preempt 4. Rerun 5. Resume 6. Suspend | |
All | VOV-10964 | Job attributes can now be modified via the v3 REST API in ways that are also possible via the command line with nc modify. |
Resolved Issues
The following issues were resolved in this release.
Product(s) | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-5570 | Fixed behavior of the ROWCOUNT field used by vovselect and related commands; vovselect will print "0" instead of an empty string when there are no rows in the query, and will print a correct count instead of 0 for "vovselect rowcount from objects". | |
All | VOV-12721 | Fixed an issue with the 2020.1.0 beta that caused a user to be logged out of a web session for one product when the user logged into a different product session in the same browser. | |
All | VOV-9853 | 24114 | When determining if a shell is configured for a project, the environment variables VOV_PROJECT_NAME and VOV_HOST_NAME must be set. New checks were added to ensure the values of these variables may not be empty strings and may not be set to "unknown". |
All | VOV-12027 | CS0120819 | Add the previously missing documentation for the vovlicensemgr command. |
All | VOV-12583 | Fixed an issue where querying for "maxnumacores" was returning the total number of cores in the system instead of the maximum number of cores in a NUMA node. | |
Accelerator, Accelerator Plus | VOV-12305 | CS0120716 | Fixed issue that prevented the wxagent job in an Accelerator base queue from reflecting the job placement policy and priority of the user's job in an Accelerator Plus queue. |
Accelerator, Accelerator Plus, Monitor | VOV-10682 | 24282 | Features names such as set via vtkle_feature_set can now include the '+' character and will be handled properly via the web UI. |
Accelerator, Accelerator Plus | VOV-12403 | CS0121177 | Fixed an issue where NUMA jobs that span multiple NUMA nodes would not return all cores used by the job to the free pool on job completion. |
None | VOV-11221 | 25011 | Monitor email notifications set in the Admin->Notifications UI page using legacy mode email delivery had failed to successfully deliver email to the recipient. |
All | VOV-10844 | Provide a REST API addition to allow job control. The following operations can be performed via the v3 REST API: 1. Dispatch 2. Forget 3. Preempt 4. Rerun 5. Resume 6. Suspend | |
All | VOV-10913 | Fixed a bug that caused vovselect to issue an error when requesting the field "env" in all lowercase. | |
All | VOV-9988 | Made the WHY property more prominent in the Main Reasons section of the output of vsy and related commands for FAILED jobs. |
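
The checks described for VOV-9853 above can be pictured as follows. This is a minimal sketch, not the product's code: `shell_is_configured` is a hypothetical name, and treating "unknown" as a case-sensitive match is an assumption.

```python
from typing import Mapping

# Variables named in the VOV-9853 note.
REQUIRED_VARS = ("VOV_PROJECT_NAME", "VOV_HOST_NAME")


def shell_is_configured(env: Mapping[str, str]) -> bool:
    """Return True only if both variables are set, non-empty, and not "unknown"."""
    for name in REQUIRED_VARS:
        value = env.get(name)
        # Assumption: the comparison with "unknown" is case-sensitive.
        if value is None or value == "" or value == "unknown":
            return False
    return True
```

Passing `os.environ` as `env` would apply the same check to the current shell.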
2019.01 Release
Enhancements
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-8981 | 22923 | Slave objects have a new hardware resource HT that specifies
whether hyper-threading is enabled on the host where the slave
is running (query example: vovselect ht from
slaves ). |
All | VOV-8823 | 22590 | The -O option on nc hosts has been enhanced to support display of the @RAMUSED@ hardware resource. |
All | VOV-8808 | | Added the field "SLOTSAVAIL" to the SLAVES table for use with vovselect and the vtk_select APIs. |
All | VOV-8809 | The following queryable fields have been added to the RESOURCEMAPS object: PREEMPTION, OTHERS, RESRES, NOLOG, NOOOQ, NOMATCH, MATCHRECENT, LEFTOEXPIRE, UTILIZATION, OOQDUMP, RESDUMP, and JOBDUMP. | |
All | VOV-8320 | | The lmmgr and ncmgr/wxmgr CLIs now allow you to specify the database port option when starting a new project (consistent with the web UI database configuration tool and the CLI vovdb_util configure tool). |
All | VOV-8810 | | Several new queryable fields have been added to the JOBS and FILES objects; see the output of vovselect fieldname from JOBS and vovselect fieldname from FIELDS. In addition, there is a new queryable object named "IOS". This object allows querying the inputs and/or outputs of a specific node. For example, to get the ID field from all of the inputs of node 12345, the query would be vovselect id from ios.12345 where isinput. |
All | VOV-7917 | 20080 | Important: The license grace period functionality has been removed due to a technical limitation. The software issues alerts 30/14/7/1d prior to the earliest expiration detected (in case there are multiple lines of the same license feature in the same license file). Licensing robustness and network fault-tolerance have been significantly improved. The software can operate in "disconnect mode", which means that it connects, gets a checkout, then disconnects from the license server. The checkout is refreshed every hour. The software will not fail until 5 full days have passed since the last successful checkout. Alerts are issued for this as well. As before, license servers may be specified as a colon-separated list via the RLM_LICENSE environment variable. |
All | VOV-8766 | When copying the <swd>/vovnginxd/conf/nginx.conf.template file to an nginx.conf file, a message is output to the server log indicating that a copy is being made, and also specifying the source and destination files. | |
Accelerator Plus | VOV-8838 | Add support for PBS as a base scheduler to Accelerator Plus. |
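
The licensing behavior described for VOV-7917 above (alerts at fixed intervals before the earliest expiration, and a 5-day tolerance after the last successful checkout) can be sketched as below. All function names are hypothetical, and the handling of the exact 5-day boundary is an assumption; the product's internal logic is not documented here.

```python
from datetime import date, datetime, timedelta
from typing import Iterable

ALERT_DAYS = (30, 14, 7, 1)            # alert thresholds from the release note
CHECKOUT_TOLERANCE = timedelta(days=5)  # time allowed since the last successful checkout


def days_to_earliest_expiry(expirations: Iterable[date], today: date) -> int:
    """Days until the earliest expiration among duplicate feature lines."""
    return (min(expirations) - today).days


def expiry_alert_due(days_left: int) -> bool:
    """True on the days when an expiration alert would be issued."""
    return days_left in ALERT_DAYS


def can_keep_running(last_checkout: datetime, now: datetime) -> bool:
    """True while fewer than 5 full days have passed since the last checkout.

    Assumption: failure begins once exactly 5 full days have elapsed.
    """
    return now - last_checkout < CHECKOUT_TOLERANCE
```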
Resolved Issues
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-9363 | | Fixed server crash when the "Detach slave from server" link was clicked on the Slave Details page in the web UI. |
All | VOV-9333 | Fixed an issue with vovserver failover that could lead to a corrupt VOV_HOST_HTTP_NAME setting if that value did not already contain a fully-qualified domain name. | |
All | VOV-9333 | 23318 | Fixed issue where an invalid slave resource expression would result in a permanent error message on the "slaves" web page. |
All | VOV-8578 | 21858 | The following queryable fields have been added to the SLAVES table for consistency with the vovslavemgr configure command: RAMSENTRYFLAG, MINRAMFREE, EFFTOTALRAM, RETRYCHDIR, RETRYCHDIRSLEEP, RETRYCHDIRSLEEPBACKOFF, MAXWAITNOSTART, ALLOWCOREDUMP, DEBUGJOBCONTROL, AUTOKILLMETHOD, MINDISKSPACE. |
All | VOV-8685 | 22253 | Fixed issues in D and U environments (which are used to Define and Undefine specific env vars) when they are both used, or they are nested (such as having multiple D calls in the env). Previously, it was possible to define something in the environment but have that item undefined in the "END" environment script. Now, it preserves the state using temporary environment variables so that each "START" and "END" environment script has consistent environment variables defined. |
All | VOV-9501 | 23821 | Rebuilt binaries dependent on OpenSSL to remove a bad default OpenSSL configuration path. |
All | VOV-9485 | 23695 | Fix a rare crash during slave startup caused by an intermittent failure in the Linux proc filesystem. |
Accelerator Plus | VOV-9216 | 23382 | Fix an error in the "Why" information for a job that occurred when the associated job bucket was no longer present. |
Accelerator Plus | VOV-9579 | | Fixed a bug that caused vovwxd to dequeue old launcher jobs, which could result in some buckets being skipped when there is a large number of slave requests and the WXLauncher slave is not able to process them quickly. |
Accelerator Plus | VOV-8598 | 22485 | Pass the bucket priority to the launcher so that the priority of the resulting slave job in Accelerator will match that of the originating bucket in Accelerator Plus. This enables the slave jobs to be selected by the job modulation preemption rule in Accelerator. When the base queue is saturated, job modulation preemption will close and gracefully stop Accelerator Plus slaves that are running with lower priority so that queued higher priority slave jobs can run, in turn servicing the higher priority buckets in Accelerator Plus. |
Accelerator Plus | VOV-8074 | 21654, 21997 | Resource expressions with | operator in job classes should be combined without spaces. e.g. set VOV_JOB_DESC(resources) "(general|PD) Limit:..." |
Accelerator | VOV-9498 | 23813 | Fixed behavior of vovstop and related commands with regard to EXT:KILL - include and exclude specifications should work as expected now. |
Errata
The following issues and defects are known to exist in this software release.
Product | Internal Issue | Description |
---|---|---|
All | VOV-9759 | The config subcommand of vovservermgr misprints confirmation messages |
All | VOV-9827 | The vtk_slave_define Tcl command no longer supports automatic resources based on the vovslave name. |
All | VOV-9419 | Setting config(useepoll) 1 in the
policy.tcl config file has no effect.
|
All | VOV-9771 | The vtk_select_get command is not honoring the case of field names. |
All | VOV-9742 | There is a new vovservermgr command that has been added for system administrators. This command has several subcommands that provide an easier way to set vovserver configuration and environment variables, and interface to memory chunking and scheduler tuning controls. |
All | VOV-9623 | eventserver and epoll() are not starting with vovservermgr |
All | VOV-9748 | Elaborate the vovservermgr -h help screen with more complete usage information. |
Accelerator Plus | VOV-9784 | qdell misspelling makes the Accelerator Plus cleanup process fail. |
Accelerator Plus | VOV-9785 | qdel path in vovpbs.tcl is not completely specified. |
2019.01 Update 7 Release Notes
Product(s) | Internal Issue | Case Number | Description |
---|---|---|---|
All | VOV-12584 | | Session and process group IDs 0 and 1 will no longer cause all processes with those IDs to be included in the process graph used for job statistics. |
All | VOV-13023 |
|
|
Accelerator | VOV-12707 | CS0131356 | Curly braces can now be used in environment specifications as a means of supporting characters that would normally be sensitive to the processing of the specification. For example, to pass a comma into an environment variable value using the D environment, use D(FOO={bar,baz}), or via the alternative syntax of D,FOO={bar,baz}. |
Accelerator | VOV-12628 | CS0127402 | Fixed a bug where License: was prepended to the resource name if the resource parameter was specified in vtk_flexlm_monitor, even if the resource name already started with License:. |
Accelerator | VOV-12726 | CS0133891 | The vovcleanup utility has been updated to work with resource data files using the .res suffix. |
Accelerator | VOV-12489 | Fix for cleaning up the files created today when vovcleanup is run before noon and cleantime is < 24h. Previously, these files were not being removed by vovcleanup. | |
Accelerator | VOV-12977 | CS0143428 | Fixed an issue with interactive jobs (nc run -I) failing with the error message "Job has problems with PTY. Bad pipes". |
Accelerator | VOV-12030 | CS0120906,CS0121020 | Fixed issue that caused slaves to be killed with the message "Slave instructed to exit brutally". This also fixes server messages like "Cannot find slave rdc-cad-svr12 (illegal id 365667285) pid=32830" |
Accelerator | VOV-12396 | CS0121165 | Multiple CORES, RAM, and TMP resources requested for an NC job are now properly combined when passed to the container. For example, "nc run -r+ RAM/100 -r+ RAM/200 ..." will result in "VOV_CONTAINER_RAM=300" being available to Container Hook scripts. |
Accelerator | VOV-9776 | 24061 | The form for submitting a slave reservation now checks that the data is valid before actually submitting the reservation. |
Accelerator | VOV-8012 | 21662, CS0121063 | VOV_LM_VARNAMES functionality is now available for interactive jobs and supports multiple license servers in a colon-separated or (on Windows) semicolon-separated list instead of a space-separated list. |
Accelerator,Accelerator Plus | VOV-13051 | CS0121039 | Fixed an issue with interactive jobs (nc run -I) failing with error messages similar to "Error=98: Address already in use [vovttyserver2:244]" and "FATAL ERROR: Cannot open PTY port (with remote signal handling): Cannot open pty server sockets [vncrun.tcl:2257]". This is accompanied by job errors similar to "Cannot connect to PTY server on submission host lava1 13316 Z@:x=XGa56cT_Hd6 from lava5". |
Accelerator,Accelerator Plus | VOV-13038 | CS0145428 |
|
Accelerator,Accelerator Plus | VOV-13030 | CS0149277 | Fixed an issue with Ctrl-C not working as expected with interactive jobs (nc run -I/-Il/-Ir). |
Accelerator | VOV-11226 | 24205, CS0128626 | Fixed issue that prevented pre and post commands from running as the job user in containerized jobs that make use of container hooks that run as root. |
Accelerator, Accelerator Plus | VOV-12629 | CS0127516 | Removed duplicate detection logic from -r+ and -dpres+ submission options. As a result, the -r+ option now properly handles more complex resource specifications. |
Accelerator | VOV-11000 | 24853 | This change allows DP jobs to assign a separate jobclass to the master and component jobs. It works by setting VOV_JOB_DESC(dp,jobclasses) to a comma separated list of jobclass names the same way that VOV_JOB_DESC(dp,resources) can be set to specify resources. The jobclasses are treated as strings. They are not evaluated as you are already setting these from within a jobclass definition. See the updated Accelerator documentation for further explanation and examples. |
Accelerator Plus | VOV-12635 | CS0128164 | Fixed a bug causing a memory access violation in vovserver when multiple slaves are being stopped with running jobs. |
Accelerator Plus | VOV-12658 | CS0124441 | vovwxd will no longer print warnings about slaves being in "state 1" (pending). |
Accelerator Plus | VOV-11836 | CS0120725 | nc forget using the options -mine, -dir, or -subdir will no longer forget system jobs by default. A new option, -system, will include system jobs to match the old behavior. In general, the option -system should not be used when forgetting user jobs in a WX setting, because forgetting launcher jobs for slave agents that are not runnable in the batch system before they are processed by vovwxd could result in lingering queued agent jobs. Using nc forget -mine without the -system flag will forget the user-submitted jobs and allow vovwxd to more efficiently clean up slave agents. |
Accelerator Plus | VOV-12316 | AAP25172 | Fixed an issue that caused vovserver memory to grow over time as "nc wait" and "nc run -w" commands were issued. |
Accelerator Plus | VOV-13037 | CS0147601 |
|
FlowTracer | VOV-12571 | | vovwxd configuration parameters CONFIG(launchers,autoForgetSuccessful) and CONFIG(launchers,autoForgetFailed) will now work regardless of the CONFIG(log,level) setting. The defaults for these settings have been changed to "0s". To avoid additional server overhead, set either value to non-zero only when debugging slave launch issues. |
FlowTracer | VOV-12570 | Using local resources in FlowTracer with an LSF Base queue will now correctly account for local resources when using array launcher submission. | |
FlowTracer | VOV-12569 | FlowTracer Local resources will now be properly released in the case of an LSF launch failure. | |
FlowTracer | VOV-12455 | CS0121195 | This fix addresses issues handling files larger than 2G in size in the flow. |
FlowTracer | VOV-12620 | vovwxd now supports LSF array jobs when getting the status of launcher jobs submitted to the queue. | |
FlowTracer | VOV-12261 | | License:* resources will now be properly passed to the base queue when using FlowTracer with vovwxd. |
FlowTracer | VOV-12203 | CS0120999 | Re-evaluation of a job class to compute the union of resources when used with an indirect slave (slaveVNC) is no longer done. This is typically relevant for FlowTracer integration with either Accelerator (NC) or AcceleratorPlus (WX). To restore the old behavior, please contact Altair support. |
FlowTracer | VOV-11873 | To enable resources to be managed locally by FlowTracer when using an LSF backend via vovwxd enable the vovwxd.localresources parameter in the policy file and create the resources in FlowTracer using the vtk_resourcemap_set -local parameter. | |
Monitor | VOV-12575 | | Fixed handling of mixed-case user names on Windows to always honor the case reported by the OS. On Windows, the case used at login time will be the case used when obtaining the active user from the OS and when applying security. It is therefore recommended that users always use the same case when logging into Windows. Otherwise, a separate vtk_security entry will be required for each case used (such as for joe, Joe, and JOE), and those user names will be considered different users. Prior to this change, for a mixed-case user name, the default security entry was for the user name in lower case only, resulting in a security mismatch if the user logged into the web UI with the same case used to log in to Windows itself. |
Monitor | VOV-12294 | CS0121092 | Suppressed the following server log message for loopback IP addresses: vovserver(3956) ERROR Jun 12 00:04:28 Found host with different ip: 127.0.0.1 instead of 10.10.1.34 [host:903] |
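
The curly-brace syntax described for VOV-12707 above, as in D(FOO={bar,baz}), can be pictured with a brace-aware splitter. This is only a sketch of the idea and not the product's parser; `split_env_params` is a hypothetical function, and stripping exactly one pair of enclosing braces from a value is an assumption.

```python
from typing import Dict, List


def split_env_params(spec: str) -> Dict[str, str]:
    """Split 'A=1,B={x,y}' on commas that are outside curly braces."""
    parts: List[str] = []
    current: List[str] = []
    depth = 0
    for ch in spec:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth = max(depth - 1, 0)
        if ch == "," and depth == 0:
            # A top-level comma ends the current NAME=VALUE pair.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        parts.append("".join(current))

    result: Dict[str, str] = {}
    for part in parts:
        name, _, value = part.partition("=")
        # Assumption: one pair of enclosing braces is dropped,
        # so the value keeps its comma.
        if value.startswith("{") and value.endswith("}"):
            value = value[1:-1]
        result[name] = value
    return result
```

With this sketch, the text inside D(...) from the note, `FOO={bar,baz}`, yields a single variable FOO whose value contains the comma.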
2019.01 Update 7 Patch 4
Product(s) | Internal Issue | Case Number | Description |
---|---|---|---|
All | VOV-14220 | FIFO read pipes accumulate in vovtasker root when interactive jobs terminate. |
2019.01 Update 7 Patch 3
Product(s) | Internal Issue | Case Number | Description |
---|---|---|---|
All | VOV-13872 | CS0211355 | Fixed issue where SICK status Accelerator slaves were not removed after an appropriate amount of time. The underlying cause was that there were still related jobs running in the base queue, and was repaired by passing the -forcerunning option to the NC base queue forget command for slaves with a SICK status. |
All | VOV-13861 | CS0210064 | Fixed issue in which SIGALRM interrupted communications on interactive jobs using VOV_INTERACTIVE_PING keep alive method. |
All | VOV-13816 | CS0205113 | Address issue where license resources sometimes became unavailable when on life support. |
All | VOV-13860 | CS0208413, CS0208823 | Fixed issue that caused the slave to overload vovserver with messages when a job execution attempt failed due to not being able to successfully fork out the subslave process that is used to shepherd the job. |
All | VOV-12989 | CS0145649 | Fixed an issue where stopping more than 1 vovslave by name (vovslavemgr stop <slave1> <slave2>...) was renaming only the last named tasker to <slavename>_stopped_<timestamp>. |
All | VOV-12812 | Enabled client activity logging for nc cmd commands. |
2019.01 Update 7 Patch 2
Product(s) | Internal Issue | Case Number | Description |
---|---|---|---|
All | VOV-13629 | CS0186671, CS0192772, CS0194045 | Fixed an issue where a failed PTY connection for a job would cause subsequent jobs on the slave to fail as long as the original job was still running, and in some cases, the slave could become unresponsive. |
All | VOV-13346 | CS0164333, CS0186238 | Fixed an issue where a redirect in the nginx configuration would cause vovresourced to crash. |
All | VOV-13553 | CS0185082, CS0191010 | Removed the need to call vtk_flexlm_monitor_all -reset. Redundant calls to process license data are no longer made, and voveventmon is no longer started when vovresourced is called with the -initjobclass parameter. |
2019.01 Update 7 Patch 1
Product(s) | Internal Issue | Case Number | Description |
---|---|---|---|
All | VOV-12714 | Fixed the following issues with job container support:
|
|
All | VOV-13025 | CS0145466 | Added a config key "MQ(pjProbeKillTimeout)" for the maximum time that the vovlad daemon should wait for existing probes to be killed at startup. |
2019.01 Update 6 Release
New Features and Enhancements
The following new features and enhancements were introduced this software release:
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-9777 | 24039 | Added "Last Dispatch" into the page jobqueue?page=buckets&. Prior to fix, it was showing "age", which is confusing as it gets updated on submission and job dispatch. |
Resolved Issues
The following issues were resolved in this release.
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-12236 | Fixed an issue leading to the following server
logs:
|
|
All | VOV-11253 | 25045 | Improved the error messaging for the vtk_resourcemap_forget API command. |
Accelerator, Accelerator Plus | VOV-12200 | Fixed issue with job arrays that are dependent upon a job or set, where the array jobs will not run due to an invalid input. | |
Accelerator Plus | VOV-11928 | CS0120712 | Added a safeguard mechanism to prevent a base queue job status query from blocking vovwxd operation. The mechanism is configurable by setting CONFIG(jobstat,timeout) in the SWD/vovwxd/config.tcl file to a valid timespec (e.g., 30s, 30m, 30h) or a number of seconds. The default timeout is 5m. |
Accelerator Plus | VOV-12220 | CS0121046 | Fixed a bug in vovwxd that was causing it to lose track of the slaves list and print redundant "Metrics for..." log messages |
Accelerator Plus | VOV-11715 | AAP29909 | Fixed an issue where an interactive job's (-I, -Ir, etc.) TERM environment variable would be incorrectly set to the user's TERM environment variable when the job's output is redirected (for example, piped). The job's TERM environment variable will now always be set to 'network' if the interactive job's output is redirected. This issue manifested itself only in jobs run by vovtaskerroot (not vovslave). |
Accelerator Plus | VOV-12267 | AAP29909 | Fixed an issue with interactive jobs (nc run -I/-Ir etc.) where the user is unable to interact with the job (including arrow keys, control key, etc. not working) when the output is piped to tee. Also added an option (-forceterm) to disable setting the interactive job's TERM environment variable to "network" when the output is piped. This fix requires an update of vovsh and vovslave and a restart of any running vovslaves after the update. |
Accelerator Plus | VOV-11910 | CS0120797 | Fixed issue with WX slaves that prevented them from starting with a max life override of unlimited. Added log messages that denote when the override is activated during startup, and also the adjustment of unlimited max life to the greater of: a) the expected duration of the first job routed to the slave or b) the default slave max life as specified per the vovwxd configuration file. |
Accelerator Plus | VOV-11868 | CS0120760 | Improved the performance of checking the status of all pending ("requested") slaves when vovwxd is querying the status from the base queue(s). |
Accelerator Plus | VOV-11236 | 24875 | Stack trace generation reworked to provide correct symbols and core generation thus providing the correct source of a fatal error. |
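
The CONFIG(jobstat,timeout) values mentioned for VOV-11928 above (a timespec such as 30s, 30m, or 30h, or a plain number of seconds, with a default of 5m) can be converted to seconds as in the sketch below. It covers only the forms named in the note; `timespec_to_seconds` is a hypothetical helper, and falling back to the default for an empty value is an assumption.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}
DEFAULT_TIMEOUT_SECONDS = 5 * 60  # the documented default of 5m


def timespec_to_seconds(value: str) -> int:
    """Convert '30s', '30m', '30h', or a bare number of seconds to seconds."""
    text = value.strip()
    if text == "":
        # Assumption: an unset value falls back to the documented default.
        return DEFAULT_TIMEOUT_SECONDS
    match = re.fullmatch(r"(\d+)([smh]?)", text)
    if match is None:
        raise ValueError(f"unsupported timespec: {value!r}")
    number, unit = match.groups()
    # A bare number has no unit suffix and is taken as seconds.
    return int(number) * _UNIT_SECONDS.get(unit, 1)
```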
2019.01 Update 6 Patch 1
Internal Number | Products | Case Number | Description |
---|---|---|---|
VOV-12261 | FlowTracer | License:* resources will now be properly passed to the base queue when using FlowTracer with vovwxd. | |
VOV-12620 | FlowTracer | vovwxd now supports LSF array jobs when getting the status of launcher jobs submitted to the queue. | |
VOV-12570 | FlowTracer | Using local resources in FlowTracer with an LSF Base queue will now correctly account for local resources when using array launcher submission. | |
VOV-12569 | FlowTracer | FlowTracer Local resources will now be properly released in the case of an LSF launch failure. | |
VOV-11873 | FlowTracer | | To enable resources to be managed locally by FlowTracer when using an LSF backend via vovwxd, enable the vovwxd.localresources parameter in the policy file and create the resources in FlowTracer using the vtk_resourcemap_set -local parameter. |
VOV-12030 | Accelerator | 0120906, 0121020 | Fixed issue that caused slaves to be killed with the message "Slave instructed to exit brutally". This also fixes server messages like "Cannot find slave rdc-cad-svr12 (illegal id 365667285) pid=32830". |
VOV-12628 | Accelerator | 0127402 | Fixed a bug where License: was prepended to the resource name if the resource parameter was specified in vtk_flexlm_monitor, even if the resource name already started with License:. |
VOV-12647 | Accelerator | 0128385 | Fixed issues with interactive jobs failing with the following log message "Timed out waiting forauthentication request from pty server". Also, added a new environment variable, VOV_INTERACTIVE_AUTH_TIMEOUT, to configure the authentication timeout on the vovslave. |
VOV-12679 | Accelerator | 0124318, 0128385 | Fixed an issue with interactive jobs failing with the incorrect message "Client has responded to authentication request... with key ' '". This happens when the client has closed the connection. This is also accompanied with the following vovslave log message "Timed out waitingfor authentication request from pty server". |
VOV-11836 | Accelerator Plus | 0120725 | nc forget using the options -mine, -dir, or -subdir will no longer forget system jobs by default. A new option, -system, will include system jobs to match the old behavior. In general, the option -system should not be used when forgetting user jobs in a WX setting, because forgetting launcher jobs for slave agents that are not runnable in the batch system before they are processed by vovwxd could result in lingering queued agent jobs. Using nc forget -mine without the -system flag will forget the user-submitted jobs and allow vovwxd to clean up slave agents more efficiently. |
VOV-12635 | Accelerator Plus | 0128164 | Fixed a bug causing a memory access violation in vovserver when multiple slaves are being stopped with running jobs. |
VOV-12316 | Accelerator Plus | Fixed an issue that caused vovserver memory to grow over time as "nc wait" and "nc run -w" commands were issued. | |
VOV-12647 | Accelerator Plus | 0128385 | Fixed issues with interactive jobs failing with the following log message "Timed out waiting for authentication request from pty server". Also, added a new environment variable, VOV_INTERACTIVE_AUTH_TIMEOUT, to configure the authentication timeout on the vovslave. |
VOV-12679 | Accelerator Plus | 0124318, 0128385 | Fixed an issue with interactive jobs failing with the incorrect message "Client has responded to authentication request... with key ' '". This happens when the client has closed the connection. This is also accompanied with the following vovslave log message "Timed out waiting for authentication request from pty server". |
2019.01 Update 5 Release
New Features and Enhancements
The following new features and enhancements were introduced in this software release:
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-11299 | 25102 | Fixed error that resulted in a "Server is operating on a non-internal object" error to be printed in the server log. This error is linked to querying for the "why" status of a job that has an input dependency. |
All | VOV-11350 | Fixed statistics for some hierarchical sets. | |
Accelerator Plus | VOV-11260 | 24568, 24890, 25001, 25249 | Added policy parameter fairshare.overshoot.damping 0/1; 1=enabled, 0=disabled, controls whether or not FairShare restricts the number of jobs scheduled for groups that are over budget. |
Accelerator Plus | VOV-11337 | Added accounts option (-A) for PBS Pro resource list for Accelerator Plus. | |
Accelerator Plus, FlowTracer | VOV-11188 | | A configuration value for the Accelerator Plus configuration file, SWD/vovwxd/config.tcl, has been added to allow the user to specify a limit on how many consecutive failures of a slave job in the base queue will be allowed before no further attempts are made to create slaves for a bucket. The default value is 0 (no limit). This is to prevent a malformed job from causing churn in the system. |
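
As an illustration of VOV-11260 above, FairShare overshoot damping could be switched on along the following lines. This is only a sketch: the parameter name and its 0/1 values come from the note, while the `set config(...)` form and its placement in policy.tcl are assumptions that should be checked against the site's own policy file.

```tcl
# policy.tcl fragment (sketch; the "set config(...)" syntax is an assumption)
# 1 = enabled, 0 = disabled
set config(fairshare.overshoot.damping) 1
```
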
Resolved Issues
The following issues were resolved in this release.
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-10110 | vovwxd cleaner log files will be preserved for the time spec specified by the delCleanerLog,older config parameter in vovwxd/config.tcl. | |
All | VOV-11326 | Slave slot licenses will be released when a slave exits in Auto Licensing mode. | |
All | VOV-11294 | | The /local/registry/system-accelerator folder may not always have been writable because it was created with the user's umask permissions. It is now created with 777 permissions. |
All | VOV-11350 | Fixed statistics for some hierarchical sets. | |
Accelerator, Accelerator Plus | VOV-11338 | 25079 | Fixed issues with job resource usage reporting by including detached processes with unique gpids and session ids by matching VOV_JOBID and VOV_SLAVE_PID. The VOV_JOBID to be matched will be taken from the transaction object rather than depending on the subslave environment. Also added NC_JOBID and NC_SLAVE_PID env variables so that WX and NC slaves can both correctly track processes. |
Accelerator Plus | VOV-11276 | 24834, 25080 | Fixed an issue with array submission in WX that would lead to "Illegal set id" errors. This also fixes an issue that resulted in log file conflicts with the error messages "Error: OnLaunchError for <queue>,time: <timestamp>, err: Launcher job failed:" and "FATAL ERROR: Cannot use FILEX <log_filename>" |
Accelerator Plus | VOV-11234 | Fixed issue with core file generation on signals SIGSEGV and SIGBUS | |
Accelerator Plus | VOV-11115 | Internal optimization of the WX slave creation process. | |
Accelerator Plus | VOV-11191 | vovwxd will no longer create extraneous slave objects and/or processes when launching slaves using the vovlsf.tcl driver. | |
Accelerator Plus, FlowTracer | VOV-11646 | vovwxd should no longer attempt to provision extra slaves when the number of pending slaves is sufficient to handle the currently queued load. | |
Accelerator Plus | VOV-11677 | Fixed issue which prevented vovwxd from launching more slaves when the limit was increased in the SWD/vovwxd/config.tcl file without requiring a vovwxd daemon restart. | |
Accelerator Plus | VOV-11645 | Fixed issue that caused the PBS_JOBID environment variable to be modified to contain the numeric part of the job ID only. | |
Accelerator Plus | VOV-11630 | Modified PBS driver script to use the -V submission option for launcher jobs to ensure that all environment variables required for slave operation are set in the slave's environment. Also added a new configuration item, CONFIG(pbsBin), in the vovwxd configuration file that can be used to specify the location of the PBS binaries (default: /opt/pbs/bin). |
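
For VOV-11630 above, the new item might be set in the vovwxd configuration file roughly as shown here. The `set CONFIG(...)` form follows the way other vovwxd parameters are written elsewhere in these notes; the alternative path is purely hypothetical.

```tcl
# vovwxd configuration file fragment (sketch)

# Default location of the PBS binaries, as stated in the note above
set CONFIG(pbsBin) "/opt/pbs/bin"

# Hypothetical site-specific override; this path is illustrative only
# set CONFIG(pbsBin) "/usr/local/pbs/bin"
```
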
2019.01 Update 5 Patch 1
Resolved Issues
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-12236 | | Fixed an issue leading to the following server logs: |
Accelerator Plus | VOV-11868 | 0120760 | Improved the performance of checking the status of all pending ("requested") slaves when vovwxd is querying the status from the base queue(s). |
Accelerator Plus | VOV-11910 | 0120797 | Fixed issue with WX slaves that prevented them from starting with a max life override of unlimited. Added log messages that denote when the override is activated during startup, and also the adjustment of unlimited max life to the greater of: a) the expected duration of the first job routed to the slave or b) the default slave max life as specified per the vovwxd configuration file. |
Accelerator Plus | VOV-12220 | 0121046 | Fixed a bug in vovwxd, that was causing it to lose track of the slaves list and print redundant "Metrics for..." log messages. |
Accelerator Plus | VOV-10557 | 24557 | The Linux priority/"nice level" of jobs running via Accelerator Plus will now have the same priority as jobs running directly on Accelerator for the same Accelerator/Accelerator Plus designated execution priority. Example: Use nc/wx run -p <scheduling priority>.<execution priority> ... to set the execution priority. (u5-1) The wxagent job will no longer carry the LauncherClass job class, so that the user's job class prevails. |
Accelerator Plus | VOV-12200 | Fixed issue with job arrays that are dependent upon a job or set, where the array jobs will not run due to an invalid input. |
2019.01 Update 4 Release
New Features and Enhancements
The following new features and enhancements were introduced in this software release:
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-10969 | The command vovclientmgr show no longer incorrectly labels clients without "nicknames" as HTTP clients. In addition, the nc command now properly sets a client nickname in all scenarios to allow it to be more easily identified in the output of both vovclientmgr show and vovshow -clients. | |
All | VOV-10864 | | Added a new trace parameter, enterpriselicense.burst, to enable burst licensing for NC/WX in Auto mode (1=enabled, 0=disabled; defaults to disabled). Cleaned up various presentation errors in the web UI license page when switching modes, and the page now shows more details on current usage/availability for all modes. Disabled the Full and N choices for the licensing mode in the web UI for non-NC/WX servers. |
Accelerator Plus | VOV-9520 | 23779 | wxmgr stop -freeze will now force shutdown of WXLauncher if it does not complete a graceful shutdown within 60s to support upgrade operations which require WXLauncher to restart. |
Accelerator Plus | VOV-10444 | | The behavior of crash recovery timing has changed. In previous updates, a single server parameter, crashRecoveryPeriod, dictated the crash recovery period: crash recovery completed after the crashRecoveryPeriod and the server began normal operation. Three changes were made: (1) At any stage, if all jobs are recovered, crash recovery will end. (2) If a vovslave reconnects during a 'quiet time' before crash recovery ends, the crash recovery deadline will be extended by this 'quiet time'. The quiet time is specified by the crashRecoveryQuietTime server parameter. (3) The crashRecoveryMaxExtension server parameter specifies an upper limit on the amount by which the deadline is extended. The parameters can be set in policy.tcl. The ranges and default values are as follows: VovServerConfig crashRecoveryPeriod 60 (min 30s, max 1800s); VovServerConfig crashRecoveryQuietTime 30 (min 0, max 300); VovServerConfig crashRecoveryMaxExtension 60 (min 0s, max 1800s). If desired, the original crash recovery behavior can be restored by setting the crashRecoveryMaxExtension parameter to zero. Appropriate settings for these parameters will depend on the particular site configuration and needs. |
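
The crash recovery settings described for VOV-10444 above can be laid out as a policy.tcl fragment. The values shown are the defaults given in the note; the commented-out last line shows the setting that the note says restores the original behavior.

```tcl
# policy.tcl fragment: crash recovery timing (defaults from the note above)

# min 30s, max 1800s
VovServerConfig crashRecoveryPeriod 60

# min 0, max 300
VovServerConfig crashRecoveryQuietTime 30

# min 0s, max 1800s
VovServerConfig crashRecoveryMaxExtension 60

# To restore the original crash recovery behavior, disable deadline extension:
# VovServerConfig crashRecoveryMaxExtension 0
```
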
Resolved Issues
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-10999 | Fixed issue that prevented the show all rows link from working on the buckets web UI page. Previously, using this link would result in an empty table as opposed to showing all available rows. | |
All | VOV-10350 | All installers/SFDs now reject installation paths that contain spaces. | |
All | VOV-10427 | 24510 | This ticket addressed three issues that affected crash recovery. The first was a race condition that occurred when a vovslave connected to a restarted server. If the vovslave license authorization happened to be checked during a very small interval, the result was that the vovslave was destroyed. The second was that 'hog protection' was inadvertently applied to vovslaves during crash recovery, with the result that reconnection of vovslaves after a server restart could be delayed until the crash recovery period had ended. (This compounded the first issue during crash recovery.) The third issue was cosmetic and resulted in a Tcl stack trace if the vovserver took too long to respond while restarting. The database queries (vtk_select_loop) parameters were adjusted to lengthen the response period. Also see the release notes for VOV-10444 for pertinent crash recovery parameters. |
All | VOV-10221 | 24265, 24417 | vovserver failover recovery has been enhanced to try for the recovery on all the configured server candidates. |
All | VOV-9902 | Prevent vovserver and child processes from exiting when Ctrl-C is pressed in the Windows command prompt from which the server was started. | |
All | VOV-11126 | 24961 | The description of the RAMUSED slave resource was updated for better clarity on usage. |
Accelerator Plus | VOV-10862 | Behavioral change; remaining slaves in base queues that have been removed will not be filtered from wait reasons. | |
Accelerator Plus | VOV-10557 | 24557 | The Linux priority/"nice level" of jobs running via Accelerator Plus will now have the same priority as jobs running directly on Accelerator for the same Accelerator/Accelerator Plus designated execution priority. Use nc/wx run -p <scheduling priority>.<execution priority> ... to set the execution priority. |
Accelerator Plus | VOV-10117 | 24291 | Fixed race condition when a job arrives while a slave is shutting down due to exceeding its maxIdle setting. The job will now be rescheduled instead of failing. |
Accelerator Plus | VOV-10273 | | If the server configuration parameter failover.usefailoverslavegrouponly is set (default 0), then only failover slaves participate in server election. By default, all slaves participate, which may cause excessive file traffic with many slaves (a problem particularly exacerbated by Accelerator Plus). The server election 'voting' period in seconds can be overridden by the server configuration parameter failover.maxdelaytovote (default 120). |
Accelerator Plus | VOV-10033 | 24060 | Jobs using shared memory should no longer see incorrect ram usage spikes when child processes terminate. |
Accelerator Plus | VOV-10705 | 24632 | Fixed bug that masked the number of queued slave requests when Accelerator Plus was calculating how many more slaves to request and under some conditions resulted in more slaves requested than there were jobs in the bucket. Also fixed the use of quota with slave launching via arrays so that the array parameter correctly applies the quota. |
Accelerator Plus | VOV-11113 | vovwxd will now log the time for a service loop at log level 3. The time of the latest loop will be updated in the property WXLoopTime. | |
Accelerator Plus | VOV-11112 | | The WX_BUCKET_SERVICE_TS property will be updated more frequently to show activity on a heavily loaded Accelerator Plus queue. |
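
The failover parameters introduced by VOV-10273 above might be set as sketched below. The parameter names and defaults come from the note; the `set config(...)` form, the policy.tcl location, and the value 60 are assumptions made for illustration.

```tcl
# policy.tcl fragment (sketch; the "set config(...)" syntax is an assumption)

# Restrict server election to failover slaves (default 0: all slaves participate)
set config(failover.usefailoverslavegrouponly) 1

# Server election voting period in seconds (default 120); 60 is illustrative
set config(failover.maxdelaytovote) 60
```
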
2019.01 Update 4 Patch 2
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-12679 | CS0124318, CS0128385 | Fixed an issue with interactive jobs failing with the incorrect message "Client has responded to authentication request... with key ''". This happens when the client has closed the connection. This is also accompanied with the following vovslave log message "Timed out waiting for authentication request from pty server". |
All | VOV-12647 | CS0128385 | Fixed issues with interactive jobs failing with the following log message "Timed out waiting for authentication request from pty server". Also, added a new environment variable, VOV_INTERACTIVE_AUTH_TIMEOUT, to configure the authentication timeout on the vovslave. |
All | VOV-12316 | AAP25172 | Fixed an issue that caused vovserver memory to grow over time as "nc wait" and "nc run -w" commands were issued. |
2019.01 Update 4 Patch 1
Resolved Issues
2019.01 Update 3 Release
New Features and Enhancements
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-10550 | Added inline documentation for the default vovcleanup config file, cleanup.config.tcl. | |
All | VOV-6577 | | A new configuration parameter (liverecorder.logdir) has been added to allow the Live Recorder recording file directory to be specified. This can be used with both the vovservermgr configure and vovslavemgr configure utilities. The default location for the server remains $SWD/../. The default location for the slave remains /tmp. |
All | VOV-9639, VOV-9640 | Improved the functional relationship between the refresh operation and the reporting operations of the vovprocessmgr utility, improved the help text for all options that needed clarification, added warnings about timing and accuracy, and added host filter support to the refresh operation for orphans. | |
All | VOV-9454 | 23741 | Most of the VOV Tcl files from the installation package now contain a proper Altair copyright statement. |
Accelerator Plus | VOV-9862 | 24090 | Fixed a bug that was causing waitreasons to be empty for buckets that are being updated but their jobs cannot be dispatched to slaves. |
Accelerator Plus | VOV-10019 | | Added a new parameter in vovwxd/config.tcl, CONFIG(limit,mode), that allows Accelerator Plus to properly handle the -limit option of wx run. Refer to config.tcl for more information. |
Accelerator Plus | VOV-10569 | 23888 | Removed call to vovtcpkill from within the vovserver failover script. |
Resolved Issues
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | 24362 | Fixed a bug where vovserver incorrectly reported license expiration for a perpetual license. | |
All | 23689 | Added a new time value trace parameter: preemption.rule.cooldown. Preemption rules that are disabled due to exceeding the max preemption processing time (preemption.max.time.rule) will observe a cooldown period equal to the value specified by preemption.rule.cooldown before being automatically re-enabled. The minimum value is 0 which indicates that an alert should be generated but the rule will not be disabled. The default and max value is 1y = off (rules will not be automatically re-enabled). | |
All | VOV-10317 | Fixed an error that corrupted non-default values for the allowUidForSecurityFile parameter, which in turn could prevent the vovserver from honoring settings in the security file. | |
All | VOV-10665 | Fixed an issue in the deprecated vtk_resourcemap_reserve Tcl command that caused an error when attempting to cancel existing reservations by updating the duration to 0 seconds. | |
All | VOV-10239 | Simplified switches to avoid user confusion. | |
All | VOV-10726 | | Fixed an issue that caused the "show all rows" link to not appear in some web UI tables that were being limited. The behavior of this link has also changed slightly: when showing all rows, a limit of 0 is passed instead of passing the total number of rows the table is aware of back into the table as the new limit. This triggers the table functionality to show all rows available. To prevent confusion, an empty string is displayed as the current limit value instead of 0. Before this change, the total number of rows was only current at the time the table was built, so the limit passed was often stale and still did not show all rows. |
Accelerator Plus | VOV-10462 | 24488 | Fixed a bug in vovwxd daemon that was causing it to request only 1 slave at a time for jobs with NC:<queue> resource. |
Accelerator Plus | VOV-10530 | 24543 | Fixed a bug in vovwxd that was causing it to create only 1 launcher job per buckets processing cycle. The impact of the bug was slow Accelerator Plus job starts due to delays in launching the host job in the base NC queue. |
Accelerator Plus | VOV-10414 | 24497 | Fixed a bug that was causing the vovwxd daemon to run into an infinite loop. |
Accelerator Plus | VOV-10678 | 24572 | Fixed an issue that caused user-based security rules to eventually stop working on a heavily loaded server. |
Accelerator Plus | VOV-10205 | | In Accelerator Plus, a Tcl error that occurs in the vovnc driver script for one bucket will not impact the processing of other buckets. However, the error will be reported as an alert, and the user should take action accordingly. The failed bucket will be recovered after the vovnc.tcl file is updated. |
Accelerator Plus | VOV-10182 | 24321 | In Accelerator Plus, users will not be able to submit jobs into a locked FairShare group. The FairShare group must be locked on the Accelerator Plus side as well as on the Accelerator queue. |
Accelerator Plus | VOV-10355, VOV-10353 | Job data files have been enabled for Accelerator Plus. | |
Accelerator Plus | VOV-10161 | vovresourced is disabled in Accelerator Plus. | |
Accelerator Plus | VOV-9728 | 24019 | wx run can now specify -G parameter for FairShare group. The wxagent job in the base queue will run with the specified group. |
Accelerator Plus | VOV-10275 | 24422 | Removed syntax error warning from the slave log file |
2019.01 Update 3 Patch 2
Resolved Issues
The following issues were resolved in this software release:
Product | Case Number | Internal Number | Description |
---|---|---|---|
Accelerator | 24897 | VOV-11046 | Jobs are failing with fork errno 0, filling up the disk with error logs. |
2019.01 Update 3 Patch 1
Resolved Issues
Product | Internal Issue | Description |
---|---|---|
FlowTracer | VOV-10990 | "Generic Error" in vovlsfd log file |
2019.01 Update 2 Release
New Features and Enhancements
Product | Issue Number | Case Number | Description |
---|---|---|---|
All | VOV-10021 | The Bookshelf page has been simplified for easier reading. | |
All | VOV-9742 | | A new vovservermgr command has been added for system administrators. This command has several subcommands. Three of the subcommands provide an easier syntax to set vovserver configuration and environment variables than vtk_server_config, et al. Other subcommands interface to memory chunking and scheduler tuning controls. |
Accelerator Plus | VOV-9792 | | Added a new parameter in vovwxd/config.tcl, set CONFIG(slave,AutoKillMethod) "", that specifies the autokill method for all slaves. |
Accelerator Plus | VOV-9200 | 22867 | If the directories ${VOVDIR}/local/logs/lm or ${VOVDIR}/local/logs/nc exist, then the output of lmmgr and ncmgr respectively will be logged in those directories with a timestamped filename. |
Accelerator Plus | VOV-8840 | vovdaemonmgr stop/start/restart of vovwxd will also stop/start/restart WXLauncher slave. |
Resolved Issues
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-9998 | | Eliminated the appearance of errors similar to the following in the logs: "7,275 vovserver(19661) ERROR May 20 15:59:55 Ending a visit that is already ended (searchId=81179) [object:411]" and "7,277 vovserver(19661) ERROR May 20 16:14:01 Nested visit with searchId=82117 [object:388]". |
All | VOV-10103 | Fixed issue in calculating the total memory used by jobs, where the reported amount did not take into account memory that is shared between multiple processes in a job's process hierarchy. | |
All | VOV-9827 | Fixed a bug in the handling of resource names that were the same as the beginning of a slave name. For example a slave named "foo1" with a resource "foo" would see its "foo1" resource disappear and be replaced by a second "foo" resource. Both the disappearing slave name and the resource duplication have been fixed. | |
All | VOV-10003 | Fixed a bug involving missing field names in result arrays returned by vtk_select_get when specifying parameterized fields, e.g. properties. | |
All | VOV-10071 | Relaxed the restrictions around the user-specified slave health check script. | |
All | VOV-9852 | | vovserver will not start any configured daemon as part of the server startup process; this task will be handled by the auto start scripts. One exception to this behavior is vovnginxd; the vovserver will explicitly start vovnginxd if the web port is changed from zero to a valid port while the server is running. |
Accelerator Plus | VOV-10171 | Fixed error in Accelerator Plus that could cause non-ADMIN users' jobs to fail due to security errors during slave startup. | |
Accelerator Plus | VOV-10067 | Fixed error that prevented vovwxconnect -test script from successfully running. | |
Accelerator Plus | VOV-9378 | Multiple instances of the same daemon will no longer be allowed to run when the "-f" flag is used. If the specified daemon is not running, "-f" will attempt to start the daemon, but if there is one already running, another instance will not be started. | |
Accelerator Plus | VOV-9357 | Multiple instances of the same daemon will no longer be allowed to run when the "-f" flag is used. If the specified daemon is not running, "-f" will attempt to start the daemon, but if there is one already running, another instance will not be started. | |
Accelerator Plus | VOV-9735 | The fields GRABBEDRESOURCES and GRABBEDRESOURCESO are now visible for Accelerator Plus jobs. In addition, the VOV_GRABBED_RESOURCES environment variable now propagates properly in Accelerator Plus jobs. Note that the value of GRABBEDRESOURCES differs from that in Accelerator; for Accelerator Plus jobs, its value will be derived from that of GRABBEDRESOURCESO of the associated slave job in the base queue. | |
Accelerator Plus | VOV-10094 | nc/wx cmd ... now supports interactive scripts. | |
Accelerator Plus | VOV-9753 | 24035 | Accelerator Plus will initialize environment variables specified in VOV_LM_VARNAMES variable of the job that requests a License:* resource |
2019.01 Update 1 Release
New Features and Enhancements
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-9448 | Cached queries created via vtk_select_create or related commands can no longer expire while the results are still being processed. | |
All | VOV-8080 | 22230 | Startup scripts for use with systemd are now included in $VOVDIR/etc/boot. |
Accelerator Plus | VOV-9798 | 24022 | Added a new parameter in vovwxd/config.tcl, set CONFIG(agent,PreemptMethod) "", that specifies the preemption method for all wxagent jobs running in the base NC queue. |
Accelerator | VOV-8699 | Added support for cgroups v2. |
Resolved Issues
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-10040 | This fixes some issues with vov services on Windows not listening to both IPV4 and IPV6 network ports. Without this fix, the user may experience hangs while starting and stopping Monitor or other projects, and with nginx being unresponsive. | |
All | VOV-9771 | Fixed an issue where all field names specified by vtk_select_create and related commands would be converted to upper-case in the returned Tcl array. The field names will now be returned with the same capitalization as requested by the user. | |
All | VOV-9749 | Changed the behavior of the stop button in the vovssd SFD to stop the vovssd daemon and all child processes to prevent orphaned child processes from being present after exiting the SFD. | |
Accelerator Plus | VOV-9379 | 23650 | vovwxd will show an alert when the deletion of a slave object is unsuccessful. The new alert title will have the format "vovwxd could not destroy slave". |
Accelerator Plus | VOV-9745 | 24004 | Fixed a bug in vovwxd that was causing it to request slaves from the same base Accelerator queue when configured for multiple queues. |
Accelerator Plus | VOV-9744 | 24001 | Improved the method of retrieving the slave status in Accelerator Plus from the base Accelerator queue. The new method will help to avoid the blocking of vovwxd caused by the nc getfield command. |
Accelerator Plus | VOV-9797 | | A configuration parameter was added that can change how wxagent responds to an nc stop command when running with an NC base queue. The parameter is named CONFIG(nc_stop_signals) and can be found in vovwxd/config.tcl. The default value is "" (empty string). If this parameter is set to a non-empty string (for example, "USR1"), then the wxagent will be started with the additional parameter NC_STOP_SIGNALS set to the value of the string. |
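
For VOV-9797 above, the setting in vovwxd/config.tcl might look like the fragment below. It is a minimal sketch using the example value from the note; the `set CONFIG(...)` form follows the way other vovwxd parameters are written in these notes.

```tcl
# vovwxd/config.tcl fragment (sketch)

# Default: empty string
# set CONFIG(nc_stop_signals) ""

# Example value from the note above: wxagent is then started with
# NC_STOP_SIGNALS set to USR1
set CONFIG(nc_stop_signals) "USR1"
```
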
2016.09 Release Notes
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 6258 | | There is now a Python interface for the vovserver REST API. The Python interface returns a JSON formatted string as a result of a posted query. For more information about request URLs and response format, please refer to the REST API documentation. As vovserver requires the user to be authenticated, the Python interface will automatically prompt the user for username and password if the current login session is expired. Example scripts: |
All | 2272 | | Selection rule syntax now supports several additional features. Logical "or" operations in selection rules are available via the | operator. Logical operations may be grouped with parentheses. AND operations always take precedence over OR operations in the absence of parentheses. The words and, or and not can now be used in place of &, | and ! respectively; this makes shell-level scripting easier. These words are case-insensitive; they may also be typed as AND, OR and NOT. The = operator is now supported, which is equivalent to the == operator. Spaces are now allowed in non-quoted string values if they are preceded by a backslash '\' character. Example: command^sleep\ 60 is now a valid rule. |
All | 4218 | | Improved protocol for more compact packing. To send a 16-bit integer, the packing size is changed from 8 bytes to 3 bytes. To send a 32-bit integer, the packing size is changed from 8 bytes to 5 bytes. To send a 64-bit integer, the packing size is changed from 16 bytes to 10 bytes. By removing 4-byte alignment packing and reducing data type encoding method, encoding of string and double types is also more efficient. |
All | 4980 | | Daily log files generated by system daemons are now automatically compressed. Note: This applies only to non-Windows platforms. |
All | 5349 | | vovselect now supports the '*' wildcard to signify all fields of a particular object. Note: Ensure the '*' character is quoted as required by your shell. Example: vovselect '*' from jobs where idint==12345 |
All | 5595 | 13053 | The increased web server security now prevents local file inclusion via a path relative to an open URL, such as /gif. |
All | 5935 | | Added waitreasons as a from source to vovselect, which provides access to the data available in vovshow -waitreasons. |
All | 5954 | Jobs submitted to NetworkComputer by the administrator can no longer be accidentally dispatched to the VOV database support slave. | |
All | 5985 | | Added coresused and corestotal fields to vovselect when querying slaves. |
All | 6071 | The interface to the vovversion shell command has been changed.
The new command interface:
|
|
All | 6107 | | Added timestamp for FILE type and start and end time for JOB type in vovconsole Navigator. This allows users to easily see the dependencies of a timestamp. |
All | 6119 | For easier editing, Cut/Copy/Paste menus have been added to Text fields | |
All | 6168 | Added support for doing daily maintenance on the VOV database, which is configurable through the database administration web interface. | |
All | 6196 | A new configuration parameter,
liverecorder.logsize , now controls the size of
Live Recording log files. The parameter, liverecorder.logsize,
can be added to the policy.tcl file, or set on-demand for the
server
via:
or
for the slave via: |
|
All | 6400 | | The protection against deleting a resource when a job is queued against it is now improved; it no longer slows the vovresourced-based expiration extender. |
LicenseMonitor | 5394 | Integrated licensing capabilities now support server lists. This feature is useful for unlicensed tool wrapping. | |
NetworkComputer | 5766 | | New features have been added to the vovprocessmgr utility for finding processes that are descendants of vovslave, orphans of vovslave, and external processes. There is an option to create foster jobs for discovered orphans; orphans can now be accounted for by a slave on the same host, and tracked for the rest of their lifetime. |
NetworkComputer | 5904 | On Linux slaves, a job can now be requested to run in one or more
cgroups. The syntax is similar to requesting any other resource, with
the resource name consisting of the prefix CGROUP: followed by the
path to the cgroup on the filesystem. Example, to use
/sys/fs/cgroup/cpuset/my_cgroup1
/sys/fs/cgroup/memory/my_cgroup2:
will
assign a "sleep 120" job to the cgroups
If the user specifies multiple conflicting cgroups (such as 2 cgroups under the /memory hierarchy), the cgroup that is specified last is the one that the process will be assigned to. The special resource CGROUP:RAM can be used to limit memory usage of a job within a cgroup. Example: nc run -r CGROUP:RAM -r
RAM/2000 -- sleep 120 will assign the job to a
default cgroup and limit that cgroup to 2000 megabytes of RAM.
Since we only place one job in each default cgroup, we can
effectively limit RAM usage on a per-job level. The path to this
default cgroup will be:
Note:
CGROUP:RAM cannot be used with a non-default cgroup; if both
CGROUP:RAM and a non-default cgroup are specified, the job will
be placed in the specified cgroup without changing that cgroup's
RAM usage limit. We strongly recommend specifying a RAM resource
when using CGROUP:RAM, as the default value is low (currently 20
megabytes).
To see slave resource for cgroups, use
|
|
NetworkComputer | 5909 | The new dynamic server tuning feature enhances performance when the server is under heavy load conditions: a unique maximum value is set per bucket. | |
NetworkComputer | 5080 | Slaves now account for jobs running on a stopped slave on the same host. When a slave is started, if there is a matching slave in the stopped condition (waiting on its jobs to finish), the new slave will adopt any jobs on the stopped slave by using foster jobs. This feature helps prevent host overloading. | |
NetworkComputer | 5472 | 12544 | There is a new server configuration parameter for statistics:
cpuprogressWindowSize . This parameter can be set in the policy.tcl configuration file. The default value (1) provides identical behavior to
previous releases. Increasing the parameter increases the number
of samples to be used in the calculation of the
The accepted range of values is 1-1440. 1440 signifies one full day, assuming the sample time is 60 seconds. |
NetworkComputer | 5689 | The fields REQCORES, REQCPUS, REQPERCENT, REQRAM,
REQSLOTS , and REQSWAP are now valid on
SCHEDULED jobs. This can be useful in writing
-preempting clauses on preemption rules. |
|
NetworkComputer | 6144 | 14414 | Validation is improved with two new options and a new health
check. Summary: The new Passing in the new
In addition, a new health check ensures that a
slave is running on each host listed in the
servercandidates.tcl file. By default, the
health check is on; it can be disabled in the health check
configuration of the web UI. Note: The health check features and
options are listed on the health check configuration of the
web UI.
|
NetworkComputer | 6195 | Update calls to ftlm_lmproject from vw every
20m. The period is controlled by the variable
VOV_VW_PING . |
|
NetworkComputer | 6204 | The PERCENT slave resource can now be configured in the slave resource specification. This is specified in consumable form, such as PERCENT/50. Previously, PERCENT was always initialized to 100 and could not be changed. | |
6312 | New parameter resuserDisableMatchingThreshold
allows matching to be disabled for license resources from a license
server: in such cases, the matching process can take a very long
time.
The range is 0 - 10000; the default value is 1000. This
parameter can be specified in Example:
This parameter can also be set using the VTK API. Example:
|
||
NetworkComputer | 6722 | 14704 | The vovserver now responds to multiple incoming client
connections during the same cycle. This capability is enabled by
default, and can be controlled with the following configuration
parameters via the policy.tcl
file:
The mode can be "single" or "multi" (default=multi). The size can be any integer in the range of 25-1024 (default=512). |
Workload Accelerator | 6558 | The ncmgr and wxmgr utilities
now check for a minimum number of file descriptors on Unix-based
platforms. Summary: The default value can be overridden
with the An attempt is made to raise the default limit if needed. If the target's limit is higher than the system's hard limit, an error is displayed and the product start will be aborted. |
|
LicenseAllocator | 5311 | 12150 | License resources can now be allocated from multiple servers or
excluded. Summary: Resource groups can be defined that are a logical OR of components that are hosted on different license servers. Licenses from specific servers can be excluded from distribution to specific sites. |
FlowTracer | 6358 | A SNAPSHOT of the server environment is now automatically
captured on FlowTracer vovproject start . This new
feature enables users to run a job in the same environment as the
server. A shell environment snapshot can also be saved on demand
from any project enabled shell. Summary: The automatically captured environment is saved as a file under the server working directory, which can be used to set the SNAPSHOT environment at any time. The new standalone utility
|
|
FlowTracer | 4879 | A periodic job can be paused by setting the property PERIOD_PAUSE
to 1. This can be done through the web interface, the vovconsole NodeEditor,
the vtk_prop_set Tcl function, or the vovprop CLI. If the
job is running when this property is set, the current run
will continue to completion, but a new run will not be started at
the next period; the job remains paused until the
PERIOD_PAUSE property is reset to 0. |
|
FlowTracer | 5428 | If EventOverflow occurs from server, vovconsole will empty all events, wait, reconnect and refresh all set viewers. When this occurs, a message is displayed: Updating ... please wait | |
FlowTracer | 5617 | 13082 | Upcone Set and Downcone set menus are now added to the popup menu of vovconsole when a set is selected. If no node is selected, Connectivity popup menus of Downcone, Upcone and Expand will apply to the current set displayed in the SetViewer. |
FlowTracer | 6018 | Details about the condition of a schedule job that is stuck due
to barrier invalid are now available.
|
|
FlowTracer | 6122 | The Navigator column now sorts per the severity of the node status. | |
FlowTracer | 6156 | SERIAL and PARALLEL commands in FDL now work with any level of nesting: you can nest parallel in serial and vice versa, for as many levels deep as desired. SERIAL and PARALLEL commands now work with S (set), T (task), and J (job) commands, which can be nested under one another in any order. | |
FlowTracer | 6440 | vovlsfd now utilizes an agent script
(vov_lsf_agent ) in the installation to launch
vovslave in the batch system. Previously, shell scripts had to be
written to the launchers/<hourly sub-directory>. |
|
FlowTracer | 6571 | 15109 | The vw command now has the option to apply a delay that allows
latency on outputs as well as inputs. Previously, if the
environment variable In this release, in
addition to the above behavior, if
|
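For item 5472 above, the relationship between cpuprogressWindowSize and the time span it covers is simple arithmetic. The following Python sketch is illustrative only: the helper name is hypothetical, and the 60-second sample time is the assumption stated in the note itself.

```python
def window_span_seconds(window_size: int, sample_seconds: int = 60) -> int:
    """Return the time span covered by `window_size` samples.

    Hypothetical helper, not a product API. Only the 1-1440 range and
    the 60-second sample time come from the release note.
    """
    if not 1 <= window_size <= 1440:
        raise ValueError("cpuprogressWindowSize must be in the range 1-1440")
    return window_size * sample_seconds


print(window_span_seconds(1))     # a single sample (the default)
print(window_span_seconds(1440))  # 1440 samples of 60 s each: one full day
```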
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 3120 | 13056 | Increased web server security, which prevents cross-site scripting. |
All | 6132 | Added arch field support for slave objects to VovQuery:
vovselect, vtk_select . |
|
All | 6193 | An issue has been fixed with field-only selection rules for string fields. By passing the field name without an operator or a value, you can now query for objects that have a non-empty value in the specified field. For integer fields, this form of selection rule queries for objects that have a non-zero value in the specified field. | |
All | 6253 | Scrolling has been added to the alerts dialog, which makes long lists of alerts easier to access and consumes less space on the monitor. | |
All | 6394 | Viewing the files of sets has been made easier:
Summary: For default sets that only contain files, such as System:files, System:filesToCheck, System:zippable, Predefined:missing files, and Predefined:blocking files, files are always displayed regardless of the setting of the flag. However, for user defined sets, Show files must be turned on to view the files. |
|
LicenseMonitor | 6103 | 13577 | A new format is now recognized, and support has been added for an additional date format in the Feature line. |
LicenseMonitor | 6268 | Detailed plots are now easier to read. The visibility of plot lines was increased for checkouts and queued requests with shorter durations compared to the report time range. | |
NetworkComputer | 5253 | The vovfsgroup create command now copies the
parent ACL when creating a subgroup. Example: vovfsgroup
create /abc/def will create a new group
/abc/def , with ACL permissions copied from the
group /abc . If there is no applicable parent group,
the default ACLs will be used. |
|
NetworkComputer | 5699 | 14154 | fairshare.cgi now takes fstokens into account.
Previously, fstokens were ignored. |
NetworkComputer | 5715 | 13420 | The autokill function now takes suspension time into account when determining if a job has exceeded the autokill time threshold. |
NetworkComputer | 5993 | 13705 | License resources are no longer overbooked. Previously, the configured threshold was honored the first time the overbooking procedure was called for a specific feature. |
NetworkComputer | 6125 | 13164 | vovset list can now display more than 600k
sets. |
NetworkComputer | 6617 | The failover process now works correctly. Note: It can take over 2
minutes for the failover process to complete.
|
|
NetworkComputer | 6618 | The error message is improved for the case where ncmgr is used to start or stop a NetworkComputer instance, but the specified port matches that of a NetworkComputer instance other than the one being controlled. | |
NetworkComputer | 6619 | 15159 | The DISPLAY environment variable is unset during the failover process, which prevents the process from failing due to an invalid value. |
LicenseAllocator | 6414 | ADMIN permission is no longer needed to read LicenseMonitor or NetworkComputer data. That data is now accessible by all USERs. However, updating the NetworkComputer instances with new allocations still requires ADMIN privileges. | |
LicenseAllocator | 6601 | 12150 | The minimum number of tokens to run any queued job on a site are allocated when possible. This ensures jobs will be run. |
LicenseAllocator | 6752 | 15243 | When a new resource is declared in the LA config.tcl file, restarting LicenseAllocator or vovlad is no longer needed to get the resource tokens. |
LicenseAllocator | 6753 | 15245 | LicenseAllocator now uses the expression specified through
LA::SetMapForResourceInSite to set the resource
map expression in NetworkComputer; components are in the resource
map in the correct order. |
LicenseAllocator | 6756 | 15244 | LicenseAllocator now sets the weight of the components of a resource group from the weight of the resource group, unless the weight of the component has been set explicitly. |
FlowTracer | 4826 | 11903 | When a new set is created in the Set Browser, the sets can now be placed in alphabetical order: right-click any node and select the Update and Sort option. Previously, new sets were appended to the bottom, with no option to update the order. |
FlowTracer | 4901 | 10461 | vovproject enable now returns a non-zero exit
status when it fails. |
FlowTracer | 4945 | Detailed information of "why" is now provided if a job fails with no outputs. | |
FlowTracer | 5198 | Improved Slave Monitor: Slave LED Monitor on vovconsole and
Floating Slave Monitor.
|
|
FlowTracer | 5231 | 3186 4562 5409 | Predefined sets on the console can now be deleted or modified;
they are not permanently deleted.
|
FlowTracer | 5424 | Manually overriding vovslave cores and capacity is now improved.
|
|
FlowTracer | 5537 | Sticky attachment attributes on inputs are now preserved when creating job arrays. Previously, sticky attributes were not copied for inputs. | |
FlowTracer | 6101 | 14221 | vovconsole no longer crashes when running in READ-ONLY mode. Menus and buttons that are not applicable for read-only security are now disabled. |
FlowTracer | 6129 | 13269 | A race condition that occurred between resource reservations and resource grabbing by a job has been fixed. Previously, this occurred when a job was dispatched by the server and resource was reserved between the time of the dispatch and the time the job started running on a slave. |
FlowTracer | 6262 | The Disconnect popup menu of the Navigator is now fixed. Previously, error messages were displayed. | |
FlowTracer | 6364 | In vovconsole, jobs in a set are no longer deleted when the set is flattened then rebuilt from FDL. | |
FlowTracer | 6438 | The following commands that were previously deprecated are no
longer available: vovstopjobs
vovleader -K retraces |
|
FlowTracer | 6668 | The number of projects for other users shown on the registry.cgi page has been corrected. | |
FlowTracer | 6683 | Files no longer appear in sets that should not contain files such as "Predefined:stuff to do" and "System:jobs". |
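The field-only selection rule described in item 6193 can be read as a simple predicate. The sketch below is a Python illustration of that semantics, not the VovQuery implementation; the record layout and field names are hypothetical.

```python
def field_only_match(obj: dict, field: str) -> bool:
    """Emulate a selection rule that names a field with no operator or value.

    String fields match when non-empty; integer fields match when non-zero.
    Missing fields never match (an assumption made for this sketch).
    """
    value = obj.get(field)
    if isinstance(value, str):
        return value != ""
    if isinstance(value, int):
        return value != 0
    return False


# Hypothetical objects, used only for illustration.
jobs = [
    {"name": "job1", "group": "", "priority": 0},
    {"name": "job2", "group": "/test", "priority": 4},
]
print([j["name"] for j in jobs if field_only_match(j, "group")])
print([j["name"] for j in jobs if field_only_match(j, "priority")])
```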
2016.09 Update 20
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
LicenseMonitor | VOV-8169 | 21933 | The vovdb_util 'showcfg' sub-command now shows information about DB backups, if configured. |
LicenseMonitor | VOV-9133 | 23261 | The Altium log file parser now handles dates of a few new formats, such as yyyy-mm-dd, in addition to those using slash and dot as separators. |
LicenseMonitor | VOV-9045 | 23136 | ftlm_parse_altiumlog can now parse log records in the new time log format (Product Name;License Name;User;Role;Action;Action Time) and in the usage log format (Product Name;Activation Code;User;Role;Version;Start Time;Returned Time). |
NetworkComputer | VOV-9075 | 23176 | For debugging preemption rules, a separate preemption log
file is written if enabled. The preemption log is turned on
through server configuration parameters in policy.tcl: set
preemption.log.verbosity to a value from 1 to 10. For example, the line
set preemption.log.verbosity 3 in <swd>/policy.tcl selects verbosity level 3.
Then turn on the debug flag for each preemption rule of interest, either
through the web UI or in the preemption rule definition in
<swd>/vovpreemptd/config.tcl. To enable debugging for all preemption
rules, add the line set preemption.log.allrules 1 to
<swd>/policy.tcl. Choose the verbosity level carefully: a higher
verbosity generates many messages and may cause the log file to
grow quickly. If the current preemption log file is removed, it can
be recreated with: vovsh -x
'vtk_server_config rotate_server_preemption_log
1' |
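The two Altium record layouts named in VOV-9045 are both semicolon-separated, so they can be told apart by field count. The Python sketch below illustrates that idea only; it is not the ftlm_parse_altiumlog implementation, the dispatch-by-count rule is an assumption, and the sample line is invented.

```python
# Field names copied from the release note.
TIME_LOG_FIELDS = [
    "Product Name", "License Name", "User", "Role", "Action", "Action Time",
]
USAGE_LOG_FIELDS = [
    "Product Name", "Activation Code", "User", "Role", "Version",
    "Start Time", "Returned Time",
]


def parse_altium_record(line: str) -> dict:
    """Split one log line on ';' and label the values by layout.

    Assumption: a 6-field line is a time log record and a 7-field line
    is a usage log record.
    """
    values = [v.strip() for v in line.rstrip("\n").split(";")]
    for fields in (TIME_LOG_FIELDS, USAGE_LOG_FIELDS):
        if len(values) == len(fields):
            return dict(zip(fields, values))
    raise ValueError(f"unrecognized record with {len(values)} fields")


# Invented sample line, for illustration only.
sample = "Altium Designer;Standalone;alice;Engineer;Checkout;2018-05-01 10:00:00"
print(parse_altium_record(sample)["User"])
```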
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | VOV-8018 | 23130 | Removed unnecessary "Load" button for config.tcl. config.tcl is loaded automatically by vovpreemptd daemon. |
VOV-8347 | 23028 | The "slave.childProcessCleanup" parameter in policy.tcl no longer requires cgroups; it does still require a Linux platform. Additionally the method of process cleanup has been changed to be more thorough. This setting may cause vovslave to be less responsive in heavy usage scenarios. | |
VOV-9096 | 23174 | Fixed a bug where previously listed slaves were still reserved after the slave names in RESERVE_SLAVES preempt rules were updated. | |
VOV-9097 | 23175 | FREE_SLAVES rule type preempts slaves only in IDLE(READY), WORKING, or FULLLOAD. | |
VOV-9118 | vovserver doesn't check file existence with lstat for nc run -l <logfilename> . | ||
VOV-9025 | 23101 | Initialize job classes in a separate vovresourced process by default. | |
VOV-9026 | 23102 | Initialize job classes in a separate vovresourced process by default. | |
VOV-9036 | 23149 | Prevent misleading message in node.cgi about the job being executed on an invalid slave when the job has been forgotten from server memory and job data is being loaded from the database. | |
VOV-8923 | 22882 | Added an environment variable, VOV_BJOBS_JOBID_WIDTH, that can be used to override the default job ID column width in the bjobs output. | |
VOV-8986 | 22899 | Fixed incorrect value of cputime field displayed at job completion. |
2016.09 Update 19
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 7453 | 20684 | The vovbrowser command now shows the value of VOV_HOST_HTTP_NAME, if this is set, in the project URL. |
NetworkComputer | 8973 | 22958 | NetworkComputer now correctly generates an alert if the LicenseMonitor instance being tracked by vovresourced is not available at first contact attempt. Previously this alert would appear only if contact was made and then lost. |
NetworkComputer | 7108 | 20078 | nc hosts will show the full list of
reservations for a slave in the following format: U:<list
of users> C:<list of job classes> G:<list of groups>
etc... |
NetworkComputer | 7126 | 20116 | vtk_slave_define has a new argument
-expiredate that specifies the date and time
after which the definition of this slave is expired, and it cannot
be started with vovslavemgr command. The format
of this parameter is year_month_day_hour_min_sec. Example:
2018_12_31_23_59_00
|
NetworkComputer | 7970 | 21621 | The job output URL protocol (http or https) now matches that shown in the output of vovbrowser; that is, it is determined by the project's sslenabled setting. |
NetworkComputer | 8014 | 21705 | Added CLOCKTURBO field to slave objects in VovQuery that represents the CPU turbo speed of the host. |
NetworkComputer | 8028 | 21725 | The command vovslavemgr stop -force with an
empty slave list now prints a warning and does not stop any
vovslaves. To stop all vovslaves, add the -all option.
|
NetworkComputer | 8996 | The FreeStyle preemption rule type no longer preempts more jobs than necessary to run the jobs in the preempting bucket. |
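The -expiredate format from item 7126 (year_month_day_hour_min_sec) maps directly onto a strptime pattern. The Python sketch below shows how such a value could be produced and checked outside the product; the helper names are hypothetical, and treating a definition as expired strictly after the given time is an assumption based on the wording "after which".

```python
from datetime import datetime

# year_month_day_hour_min_sec, as described in the release note.
EXPIREDATE_FORMAT = "%Y_%m_%d_%H_%M_%S"


def parse_expiredate(text: str) -> datetime:
    """Convert an -expiredate string into a datetime."""
    return datetime.strptime(text, EXPIREDATE_FORMAT)


def format_expiredate(moment: datetime) -> str:
    """Convert a datetime into the -expiredate string form."""
    return moment.strftime(EXPIREDATE_FORMAT)


def is_expired(text: str, now: datetime) -> bool:
    """True when `now` is later than the expiry time (sketch assumption)."""
    return now > parse_expiredate(text)


print(parse_expiredate("2018_12_31_23_59_00"))
print(format_expiredate(datetime(2018, 12, 31, 23, 59, 0)))
```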
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
LicenseMonitor | 8967 | 22940 | The CSV export function now works properly on the following pages:
|
NetworkComputer | 8911 | 22868 | Fixed premature termination of the nc hosts command
when a slave exits during the running of the command. This may have
been encountered with large or volatile slave counts and is seen
when the -O format option is used. Fixed in 2016.09
Update 19 for the case of volatile slave counts. |
NetworkComputer | 8917 | Jobs that reach the WITHDRAWN state later than the preemption plan specifies are no longer left as WITHDRAWN; they are rescheduled. | |
NetworkComputer | 8925 | 22887 | Added accounting for RAM used by jobs that are submitted with NUMA support requested. The total RAM for each socket will be decremented by a job's RAM request when the job is bound to the socket. This allows the next job requesting NUMA support to be placed according to the RAM that is expected to be available instead of the RAM that is currently available. Additionally, the NUMA_LAYOUT slave property has been augmented to show the memory currently used by all jobs on the slave that requested NUMA support (e.g. "Socket: 0 RAM= 500/32089 oooooooooo"). |
NetworkComputer | 8964 | 22922 | Interactive jobs no longer consume 100% CPU when attempting to reconnect to a downed NetworkComputer queue. |
NetworkComputer | 4453 | 20991 | Added sorting, filtering, and searching capabilities to legacy FairShare page. |
NetworkComputer | 8730 | 22315 | Fixed an error where editing a queued interactive job in the NetworkComputer web interface could potentially cause the job to fail. |
NetworkComputer | 8978 | 22965 | Removed redundant "Excluded from being declared as input" warnings on nc run -f command. |
NetworkComputer | 9047 | The URL shown at the end of the NC startup routine now takes into account the VOV_HOST_HTTP_NAME value and whether SSL is enabled. |
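The per-socket RAM accounting described in item 8925 can be pictured as a reservation counter on each socket. The Python sketch below illustrates the bookkeeping only: the class, the function, and the first-fit placement policy are assumptions, and the 32089 figure is borrowed from the NUMA_LAYOUT example in the note.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Socket:
    index: int
    total_ram_mb: int
    reserved_mb: int = 0  # RAM already requested by jobs bound to this socket

    @property
    def expected_free_mb(self) -> int:
        # RAM expected to be available, regardless of what is used right now.
        return self.total_ram_mb - self.reserved_mb


def bind_job(sockets: List[Socket], ram_request_mb: int) -> Optional[Socket]:
    """Bind a job to the first socket with enough expected-free RAM.

    First-fit is an assumption of this sketch, not documented behavior.
    """
    for socket in sockets:
        if socket.expected_free_mb >= ram_request_mb:
            socket.reserved_mb += ram_request_mb
            return socket
    return None


sockets = [Socket(0, 32089), Socket(1, 32089)]
first = bind_job(sockets, 500)
print(f"Socket: {first.index} RAM= {first.reserved_mb}/{first.total_ram_mb}")
```

A second job asking for more than the remaining expected-free RAM on socket 0 would land on socket 1, even if the first job has not yet touched its memory.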
2016.09 Update 18
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 8766 | 22020 | When copying the <swd>/vovnginxd/conf/nginx.conf.template file to an nginx.conf file, a message is output to the server log indicating that a copy is being made, and also specifying the source and destination files. |
LicenseMonitor | 8574 | 22192 | LicenseMonitor now includes support for monitoring licenses of QF-Test software. |
LicenseMonitor | 8954 | 22878 | Moved the icons for exporting png/csv up into a separate row, so that the chart area can be highlighted without those icons for screenshots. |
LicenseMonitor | 8959 | 22897 | The FLEXlm parser has been enhanced to correctly handle version 11.14+ lmstat output. |
NetworkComputer | 8789 | The ncupgrade script has been implemented at beta level. This script allows customers to upgrade their NC installation to a different (newer) version of software without taking down the queue. The script prompts users for decisions (such as queue name, location of the new version, etc.) and is intended to perform the upgrade without losing any jobs, and without having client job submissions fail (they just take longer to register). Please provide feedback on this feature during the beta period. | |
NetworkComputer | 7466 | 20598 | Reservations may now be negated to specify who or what is NOT allowed to use a slave. For example a reservation of "USER !john,mary" will allow all users except "john" and "mary" to run jobs on the slave for the specified duration. |
NetworkComputer | 7748 | 21236 | Generate both an event and a server log entry when a FairShare
group is changed. Event example:
Log example: vovserver(9805) Oct 09 13:26:20 FairShare change: /test, by
joe@localhost SERVER: Window 1h00m to 2h00m |
NetworkComputer | 8695 | A slave reservation created by the FreeSlave preemption type now reserves the preempted slave by bucket ID. The reservation duration is 5 seconds by default and can be configured with the preempt rule field reservetime. | |
NetworkComputer | 8735 | 22516 | Speeds up jobclasses.cgi by showing detailed job statistics per jobclass only on demand, by clicking the basic job count initially displayed. |
NetworkComputer | 6199 | 20959 | Documented the boolean jobclass variable "classEditable", which
controls whether a jobclass can be edited using the jobclass web UI
page. Example: set classDescription "my jobclass" set
classEditable false ...rest of jobclass script...
|
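The negated reservation from item 7466 ("USER !john,mary") can be read as an exclusion list. The Python sketch below illustrates that reading; it is not the scheduler's parser, it handles only the USER form quoted in the note, and it assumes that a non-negated list permits only the listed users.

```python
def user_allowed(reservation: str, user: str) -> bool:
    """Decide whether `user` may run on a slave with this reservation.

    Handles only the "USER <list>" and "USER !<list>" forms.
    """
    kind, _, names = reservation.partition(" ")
    if kind != "USER":
        raise ValueError("this sketch only understands USER reservations")
    negated = names.startswith("!")
    listed = {n.strip() for n in names.lstrip("!").split(",") if n.strip()}
    return (user not in listed) if negated else (user in listed)


print(user_allowed("USER !john,mary", "john"))   # excluded by the negation
print(user_allowed("USER !john,mary", "alice"))  # everyone else is allowed
```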
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 8826 | 22666 | Prevent vovslave from caching empty group lists returned from OS. |
All | 8947 | The AGE field of the SETS object will now return the correct age, defined as time in seconds since the set was last updated. In addition, the LASTUPDATE field will always return a valid timestamp instead of 0 for non-smart sets. | |
FlowTracer | 8854 | 22728 | Fixed vovconsole crash that could occur in flow graphs when using Orthogonal/Manhattan arrows. |
FlowTracer | 8984 | 22999 | Fixed bug which caused a Tcl error in node editor when the expected duration field had a time value with a leading zero, and then changes were made to the node and saved. The time spec parser was failing to trim the leading zeroes and producing an error. Now, the time spec parser does it correctly. An example time spec that could cause the error is "1h05m" due to the "05" value for the minutes. |
LicenseMonitor | 8133 | 21860 | The "shift combo" was fixed in a previous 2016.09 release. A test case has been added to ensure this error is covered in our regression tests. |
LicenseMonitor | 9011 | 23060 | Removed the double html-encoding to prevent license editing corruption. |
LicenseMonitor | 8834 | 22690 | ftlm_batch_report now honors switches with a "Hide" parameter (e.g. -UtilPlotHideAverage) that are given an empty value (the default value is 1). |
LicenseMonitor | 8852 | Configuring group access to a tab no longer prevents guests from viewing the page through the read-only port. | |
LicenseMonitor | 8859 | 22682 | Several robustness improvements to vovresourced with respect to
obtaining data from LicenseMonitor:
|
LicenseMonitor | 8615 | 22208 | ftlm_parse_flexlmlic now handles nodelocked licenses (e.g. MATLAB:ID=720099) with the correct capacity. |
NetworkComputer | 7242 | 20265 | FairShare groups now have a "flatten" setting. When a group is flattened, all of its child groups will be treated as though they were on the same level of hierarchy; that is, all leaf-level groups will be assigned weights as though they were direct children of the top-level flattened group, ignoring weights assigned to any non-leaf-level groups in between. This setting can be enabled by the vovfsgroup modify command. Note that disabling this setting for a group that has a flattened parent will have no effect; the group will still be flattened. |
NetworkComputer | 7806 | 21608 | Fixed the FreeSlave preemption type so that it no longer preempts slaves that are incompatible with the HW requests (such as CORES or RAM) of the preempting jobs. |
NetworkComputer | 8824 | 22655 | Do not process commas as option delimiters in SNAPSHOT environment calls so that a snapshot file path can include commas. |
NetworkComputer | 8865 | 22693 | Improved error messages for wrong parameters in nc preempt command. |
NetworkComputer | 8686 | Improved the preemption rule view page to include all of the fields that are displayed in the new rule page. This mainly includes improvements to the FREE_SLAVES, RESERVE_SLAVES, and RESERVE_RESOURCES rule views. | |
NetworkComputer | 8697 | 21608 | Improved the performance and reliability of the FREE_SLAVES preemption rule behavior. Hosts with non-preemptable jobs can be preempted if the machine can still run preempting jobs. |
NetworkComputer | 8914 | 22857 |
|
WorkloadXelerator | 8878 | 22773 | Fixed problem with SNAPSHOT environments that caused environment variables to get added in subsequent jobs but never removed. Now it uses the correct environment from the shell that launched the job, plus the environment specific to each job. The bug impacted vovslave but not vovslaveroot. |
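The time spec fix in item 8984 concerns values such as "1h05m", where a component has a leading zero. The Python sketch below shows one way to parse such a spec while trimming leading zeroes; it is not the product's Tcl parser, and support for an "s" unit is an assumption (the notes only show "h" and "m").

```python
import re

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}
_TOKEN = re.compile(r"(\d+)([hms])")


def parse_timespec(spec: str) -> int:
    """Convert a spec such as "1h05m" into seconds."""
    total = 0
    pos = 0
    for match in _TOKEN.finditer(spec):
        if match.start() != pos:
            raise ValueError(f"unexpected text in time spec: {spec!r}")
        digits, unit = match.groups()
        # Trim leading zeroes explicitly, so "05" is read as 5.
        value = int(digits.lstrip("0") or "0")
        total += value * _UNIT_SECONDS[unit]
        pos = match.end()
    if pos == 0 or pos != len(spec):
        raise ValueError(f"invalid time spec: {spec!r}")
    return total


print(parse_timespec("1h05m"))  # one hour plus five minutes, in seconds
```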
2016.09 Update 17
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 8720 | 22285 | Added patchlevel display and patch sorting for various options of vovversion command. |
FlowTracer | 8756 | 22556 | Added 2 config parameters for vovlsfd:
|
LicenseMonitor | 7006 | 5636 | Added a pair of new configuration variables in the
vovlmd configuration file to control the
threshold of elapsed time without an update at which point a license
server is to be considered down. That is, you can set the following
in the licmon.swd/vovlmd/config.tcl file:
|
LicenseMonitor | 8743 | 22692 | Added "Show capacity with queued" and "Show peak with queued" options in the utilization plot batch report UI and corresponding options for CLI. |
NetworkComputer | 8701 | 22294 | The bjobs LSF emulation utility has been modified to support the
-o option to control the job listing output
format. Column width and delimiter specifiers are also supported.
See bjobs usage syntax (bjobs -h ) for details.
|
NetworkComputer | 8796 | 22674 | A new subcommand of vovslavemgr, rotatelog, creates new log files and new log directories if missing. Startup log files are not recreated. |
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 7809 | 21345 | Fixed issue that caused the patch status link in certain web UI pages to be rendered as plain text instead of HTML. |
All | 8630 | 22215 | Improved vovnginxd robustness when reconfiguring the web port, and corrected the documentation on how to modify the web port while the server is still running. |
LicenseMonitor | 8580 | 22181 | LicenseMonitor can now serve batch reports greater than 1GB in size. |
LicenseMonitor | 5898 | License server emulation through vtkle_feature_set can now take y (year) as a time unit. | |
LicenseMonitor | 8451 | 22083 | Remove the VIEW (and other relevant) permissions for EVERYBODY when restricting tag access to specific users. |
LicenseMonitor | 8846 | 22701 | Fixed "extra switch pattern with no body" Tcl error in the vovdb_util dump function. |
NetworkComputer | 8805 | 22639 | Fixed issue where the highest possible value for the maxload
setting of a vovslave was silently capped at 100.0. This prevented
slaves configured with more than 100 job slots from running more
than 100 jobs. The new highest possible value for the maxload
setting is 10000.0. Along with this change, jobs will need to be
submitted with PERCENT/0, or the default minimum HW request will
need to be modified so that PERCENT/0 is requested, or no PERCENT
resource at all. The default minimum HW request can be configured in
the SWD/policy.tcl file, using the "minhw"
setting (default is shown):
If this setting is changed, the policy.tcl file
will need to be reread by the vovserver, which can be accomplished
with the following command:
|
NetworkComputer | 8075 | 21678 | The server parameter schedskip is now obsolete.
We recommend using the parameter schedMaxEffort to
limit the time usage of the scheduler. See the documentation for
schedMaxEffort for more details. |
NetworkComputer | 8244 | 22004 | A new server configuration parameter is added to limit the
maximum number of ORs in a resource expression. For example in
policy.tcl:
|
NetworkComputer | 8373 | 22020 | Added documentation that describes vovnginxd configuration files
and procedure. Note: This ticket is composed of 2 parts. The first
part has been addressed. The second part, "nginx sometimes holds
up the port indefinitely when the back end NC vovserver has
failed over", will be addressed in a future
release.
|
NetworkComputer | 8475 | 22006 | Jobs with a negated soft resource containing a colon or underscore character were unexpectedly dispatched instead of queueing. This has been corrected. |
NetworkComputer | 7864 | 21442 | Fixed the fail codes that are set for jobs stopped by the Ext method. |
NetworkComputer | 8833 | 22689 | The SOLUTION property on jobs will now use "!" to indicate negated resources rather than "!=". |
NetworkComputer | 8872 | 22713 | Fixed a problem where one of the preempt rules was not sorted by order. Improved the error message. |
2016.09 Update 16
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 8641 | 22235 | When a job is modified, do not mark the job unsafe when the host has been changed. Also do not invalidate the job when the resources, aux resources, jobclass, job project, or FairShare group changed. |
NetworkComputer | 7878 | 21366 | Added option -leaf to vovfsgroup
genconfig to also include weights for the leaf
nodes of the complete FairShare tree. |
NetworkComputer | 8644 | 22239 | To prevent excessive server load due to too many OR clauses
in resource map sums, calls to
vtk_resourcemap_sum from
<swd>/vovresourced/config.tcl are now
limited to expressions of up to 5 ORs (by default). The default
limit can be adjusted by setting
RESD(maxORsInResourceSum) in
<swd>/vovresourced/config.tcl. If a
resource sum with excessive ORs is seen, an alert is generated
and an error message is output once per day in the vovresourced
logfile. |
NetworkComputer | 8749 | A new option, -writeprdir
<directory_path> has been added to the
ncmgr stop command. This permits the PR
file to be written, uncompressed, to the specified directory
instead of the trace.db directory upon
server shutdown. |
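The limit described in item 8644 amounts to counting OR clauses in a resource expression before accepting it. The Python sketch below illustrates such a check; it is not how vovresourced evaluates expressions, the whitespace-separated "OR" token is an assumed syntax, and the resource names are hypothetical. Only the default limit of 5 comes from the note.

```python
DEFAULT_MAX_ORS = 5  # default limit stated in the release note


def count_ors(expression: str) -> int:
    """Count OR operators, assuming they appear as standalone 'OR' tokens."""
    return sum(1 for token in expression.split() if token == "OR")


def within_or_limit(expression: str, max_ors: int = DEFAULT_MAX_ORS) -> bool:
    """True when the expression uses no more than `max_ors` OR clauses."""
    return count_ors(expression) <= max_ors


# Hypothetical resource names, for illustration only.
small = "License:a OR License:b OR License:c"
large = " OR ".join(f"License:r{i}" for i in range(7))
print(count_ors(small), within_or_limit(small))
print(count_ors(large), within_or_limit(large))
```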
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 8658 | 22287 | Fixed issue that randomly caused a "page too large" error to be displayed for certain pages in the web UI that contain a significant amount of data. |
FlowTracer | 8709 | Fixed vovfileready bug that existed in 2016.09u12-2016.09u15. The bug caused the vovfileready job to become invalid instead of becoming valid and running the downcone immediately. Now, it behaves correctly. This is a 1-line Tcl change and is a trivial patch to any release. | |
FlowTracer | 8626 | An alert for not enough file descriptors from vovlsfd.tcl was confusing users and has been removed. Warnings about this are still issued in the logfile. | |
FlowTracer | 8773 | Fixed issue in Tcl procedures that attempt to validate nodes (waive exit code, force run, force validate). These procedures were broken in recent 2016.09 versions (validation did not always occur and error messages were seen). This fix allows these functions to successfully validate the requested node, which also includes the output of jobs. | |
LicenseMonitor | 8162 | 21794 | Made sure the data series has the same order as the legend series so that there is no mismatch between them. |
LicenseMonitor | 8610 | | Pie charts now use a bold font for the label and legend, and bar/histogram charts use a bold font for ticks and the legend. This change provides an improved look for presentation views. |
LicenseMonitor | 8611 | | Adopted a new pie chart color selection algorithm that produces a range of lighter colors that look good with black fonts on top of them in most cases. |
LicenseMonitor | 8209 | | Fixed issue that prevented the lmmgr reset function from working. |
LicenseMonitor | 8571 | | Ensured that the correct WHERE clause is used in the SQL when retrieving data from the database to produce reports. |
NetworkComputer | 8492 | 22142 | Fixed issue with vovslaveroot reconnecting after a 30+ minute network interruption. Now, the slaves will properly reconnect and recover the jobs when the network connection to the server is re-established even after a delay of over 30 minutes. Fixed issue with vovslaveroot when the process exits with an error to make sure that the message gets logged (previously the "fatal error" message was not shown or logged anywhere). Fixed issue with the sleep time between reconnection attempts to be accurate based on wall clock time. Previously, when a vovslave was reconnecting, the timeout would sometimes be very far off from the requested wall clock time. |
NetworkComputer | 8739 | 22515 | In certain cases, when a job finished at the same time as an internal alarm, the job could hang or be auto-killed, and so appear to fail. This problem has been resolved. |
NetworkComputer | 8753 | 22291 | Fixed error in jobclass.cgi, where a jobclass using the variable VOV_JOB_DESC(jobclass) would trigger a Tcl error. |
NetworkComputer | 7343 | 20113 | When there is an error opening, writing, or closing a dailylog file, such as the log used for vovresourced, catch the error and report it. The error message, including the decoded string from C++, will get printed, as well as the message that was being written. In addition, generate an alert when such an error occurs with the same information (except for the message being written). |
NetworkComputer | 8744 | 22529 | Format the value of the WX_BUCKET_LINK property when viewed from within node.cgi. |
NetworkComputer | 7562 | 20821 | Made some fixes to input and output declarations to address "Server is operating on a non-internal object" error. |
NetworkComputer | 7911 | 21526 | The command vovslavemgr restart now behaves identically to vovslavemgr stop followed by vovslavemgr start, allowing vovslavemgr restart to properly restart busy slaves. |
NetworkComputer | 8623 | 22166 |
|
NetworkComputer | 8625 | 22226 | Queries for requested-resource fields will now return correct values for jobs with non-running statuses. |
NetworkComputer | 8673 | 22267 | Added "USERXDUR" and "USERXDURPP" fields to vovselect (for FlowTracer) and nc getfields (for NetworkComputer) to reflect the expected duration of the job as specified by the user. The existing fields "XDUR" and "XDURPP" will continue to be updated to reflect actual duration when the job completes successfully. |
NetworkComputer | 8678 | 22273 | A slave that is in the process of stopping but is still running jobs will now have _stopped_<timestamp> appended to its name, whether stopped from the command line or through the server's browser-based UI. This allows a slave to be restarted multiple times. In addition, a request to a slave to stop can only be canceled if a replacement slave hasn't already started. |
NetworkComputer | 8697 | 21608 | Improved the performance and reliability of the FREE_SLAVES preemption rule behavior. |
2016.09 Update 16.1
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | 8833 | 22689 | Have vovfosterjob detect the malformed negated resource and translate it into the proper syntax (work-around prior to server-side fix). |
NetworkComputer | 8753 | 22291 | Fix error in jobclasses.cgi, where a jobclass using the variable VOV_JOB_DESC(jobclass) would trigger a Tcl error. |
2016.09 Update 15
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 8529 | 22165 | Attempts to use vtk_resourcemap_set to create a self-referential resource map are now rejected with an error message and an alert. |
All | 8570 | | NetworkComputer and WorkloadXelerator queue configuration files are no longer corrupted when enabling server failover. |
All | 8592 | | Timespec values are now supported in selection rules, in all places that accept integer values. |
FlowTracer | 8261 | 22036 | Fixed a problem with editing jobs from the browser UI when the command line uses single quote characters. |
FlowTracer | 8569 | 22173 | We have added a new environment variable, VOV_SLAVE_SID_DISABLE. If this variable is set to 1, vovslave and vovslaveroot will not create new sessions (and consequently will not create new process groups) on startup. |
FlowTracer | 8595 | | When calling ForceValidate, queued jobs that are validated are removed from the buckets. |
FlowTracer | 8146 | 21921 | Fixed the occasional failure of vovlsfd to start when run in single user mode for multi-user FT usage. Removed redundant checks for whether it is safe to run or not. |
LicenseAllocator | 8585 | 22193 | Critical jobs check will now work with wait reasons that include number of tokens in the resource specification. |
LicenseAllocator | 8635 | | Improved presentation of "group matching" in LA user interface. |
LicenseMonitor | 8561 | 22112 | Fix issues with LM SFD that prevented it from running as a Windows service under the "system" user. Running under the "system" user requires the user name AND password to be "system" in the account information portion of the SFD GUI. The "system" user is now added to the default security configuration in security.tcl to allow it to run normally. Because the "system" user has no password, at least one real user will need to be granted ADMIN privileges in the security.tcl file as well. After doing so, restart the service for the change to take effect. Please allow for 30 seconds between stop and start of the service. Note that the service must be uninstalled and reinstalled or it will fail to start with this version of the SFD. This can be done using the Install/Delete controls in the service portion of the SFD GUI. |
LicenseMonitor | 8580 | 22181 | Fixed error in LicenseMonitor that prevented the viewing of very large batch reports. |
LicenseMonitor | 8620 | 22225 | The ftlm_parse_flexlm program now correctly outputs the parsed version of the queued-for-license records to the *.chk file. |
LicenseMonitor | 8526 | | Clarified messaging regarding unrecognized slave configuration parameters; this is now a warning rather than an error, as it can occur for example when using a newer server with an older slave. |
NetworkComputer | 8468 | 22121 | NetworkComputer no longer "forgets" a resource map that has been in use for longer than the cutoff provided or is currently in use by a preempted job. |
NetworkComputer | 7643 | 21098 | Fixed issue that caused jobs submitted through the LSF emulation layer to result in a SLEEPING state due to an output conflict caused by the use of the -o option to bsub. |
NetworkComputer | 8038 | 21752 | Added the ability to grant a user the privilege to stop another
user's job. This can be accomplished by granting the STOP ACL
privilege to the other user by using the vovacl
command line utility.
|
NetworkComputer | 8089 | 21856 | Ensure VOV_JOBCLASS_DIRS is honored in CLI and web UI, regardless of whether it is defined in the queue setup file in vncConfig or the vnc_policy.tcl file. This includes a new line in the output of the nc jobclass -ll command that shows the location of the jobclass script. For the web UI specifically, jobclasses that are located in a user-defined directory are also marked as editable. |
NetworkComputer | 8233 | 21983 | Messaging in vsy, nc info and related commands has been changed to clarify which resources are actually missing when the user specifies OR clauses in the resource request. |
NetworkComputer | 8347 | 22087 | The slave.childProcessCleanup parameter in
policy.tcl no longer requires cgroups; it
does still require a Linux platform. Additionally the method of
process cleanup has been changed to be more thorough; this setting
may cause vovslave to be less responsive in heavy usage scenarios. |
NetworkComputer | 8439 | 22088 | The snapshot environment used by the nc run -ep option is now protected against stale system environment variables, such as VOV_PORT_NUMBER. |
NetworkComputer | 8587 | 22194 | Fixed handling of selection rules for preemptable jobs in MULTIQUEUE preemption rules. |
NetworkComputer | 8631 | | Fixed resource leak caused by hyper-active preemption of the same job. |
NetworkComputer | 8412 | | NetworkComputer now reconnects properly to a LicenseMonitor instance that has been restarted. |
NetworkComputer | 8588 | | Improved reliability of nc stop -dir xxx which now makes sure all stops have been performed. |
NetworkComputer | 8629 | | A job preempted with EXT and then killed correctly ends up in the FAILED status rather than WITHDRAWN. |
NetworkComputer | 8636 | | Allow editing of mqthresh parameter in MULTIQUEUE preemption rules. |
HERO | 8164 | | Fixed problem with exit-status-based auto-rescheduling that prevented the job from running upon being auto-rescheduled. |
2016.09 Update 14
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | 8248 | | NetworkComputer now updates license information from LicenseMonitor at an interval equal to the smaller of 5 minutes or half the configured expiration time. |
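The refresh rule for item 8248 amounts to taking the smaller of two intervals. A minimal sketch in Python, assuming the expiration time is expressed in seconds; the function name is hypothetical and is not part of NetworkComputer:

```python
def license_refresh_interval(expiration_s: float, cap_s: float = 300.0) -> float:
    """Return the smaller of the 5-minute cap and half the expiration time."""
    return min(cap_s, expiration_s / 2.0)


# Half of 1200 s is 600 s, so the 300 s cap applies.
print(license_refresh_interval(1200))
# Half of 240 s is 120 s, which is below the cap.
print(license_refresh_interval(240))
```

With a long expiration time the 5-minute cap governs; with a short one, half the expiration time does.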
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
FlowTracer | 8479 | | Fixed "Force Validate", which uses a procedure called MakeDashT to emulate the "-t" option to Make. This feature updates the start and end times of jobs, which was being prevented due to tightened protections that were added. This change also fixes a similar problem with "RECONCILE_WITH_FILE_SYSTEM", which also needs to update start and end times of jobs. |
FlowTracer | 8519 | | Added back addflow.cgi and removed the broken web link to it in the web interface. This script is not expected to be used but the broken web link needed to be removed. |
LicenseMonitor | 8487 | | Correctly handles the "No checkouts found for the specified report period." case in the division calculation when excluding idle time. |
NetworkComputer | 8401 | 22101 | An incorrect "why" message produced by "nc info" when jobs had been stopped was corrected. |
NetworkComputer | 8503 | 22148 | Added support for the entire list of legacy field names in "nc hosts -O", which in turn fixes issues with the named groups, such as "nc hosts -SLOTS". |
NetworkComputer | 8187 | 21130 | Fixed issue with auto-rescheduling of failed jobs so that the downcone dependencies do not get descheduled. Now, the failed job can re-run and the downcone will run if and when the job passes on the second (or later) attempt. |
NetworkComputer | 8505 | 22150 | Restored sensitivity to NC_STOP_SIGNALS and NC_STOP_SIG_DELAY job properties. |
NetworkComputer | 7952 | 21588 | Reverted FairShare history graph to original width of 600px in order to restore x-axis label visibility for graph windows down to 5d. |
WorkloadXelerator | 8334 | 22085 | vovelasticd will now use auxiliary resources of WX jobs when submitting slaves to NC. In case of a previously failed slave, using auxiliary resource, WX will try to submit a slave on a different host in NC. |
WorkloadXelerator | 8074 | 21997 | Resource expressions with | operator in job classes should be combined without spaces. e.g. set VOV_JOB_DESC(resources) "(general|PD) Limit:..." |
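The guidance for item 8074 (write the | operator without surrounding spaces) can be applied mechanically before a resource string is placed in a job class. A minimal sketch in Python; the helper name and the sample resource string are hypothetical, and this is not part of the product:

```python
import re


def compact_or_operators(resources: str) -> str:
    """Remove whitespace on either side of each '|' in a resource string."""
    return re.sub(r"\s*\|\s*", "|", resources)


# The resource names below are made up for illustration.
print(compact_or_operators("(general | PD) RAM/100"))
```

Strings that already follow the convention pass through unchanged, so the helper is safe to apply repeatedly.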
2016.09 Update 13
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 8346 | 22069 | Changed default value of thread.service.max from 2 to 0, per customer request. |
All | 8420 | 22111 | Allow control of the maximum number of maps in a job resource expression, previously hard-coded at 300. Now this can be set with the parameter resources.max.maps. |
NetworkComputer | 5947 | 22069 | Added ability to shut down threads. This can be done now
with % vovsh -x 'vtk_server_config thread.service.max
0' . Other parameters called
thread.service.... are accessible to control
when to use threads. |
NetworkComputer | 7535 | 20833 | The Tcl procedure VovGetRevokeDelay {} can now
be added and customized by redefining it in
vovresourced/config.tcl under the SWD
directory to allow users to customize the revoke delay to be used in
vovreconciled. This allows users to have the
revoke delay from their job classes override the default value of
RESD(revokeDelay) . The proc definition has been
added to the documentation. In addition, the verbosity levels of
various messages have been modified per customer request. |
NetworkComputer | 7455 | 20674 | The optional live_keepfor_jobs.tcl task script has been improved to reduce the load on the NC vovserver. |
NetworkComputer | 7959 | 21585 | The -maxload option of
vtk_slave_define now accepts simple
expressions relative to the capacity value, represented by the
keyword 'CAPACITY'. For example, -maxload
CAPACITY*1.5 would set the maxload to 1.5 times the
number of slots. Supported operators are: +-/* . |
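To make the -maxload arithmetic for item 7959 concrete, the following Python sketch evaluates a simple expression in which CAPACITY stands for the number of slots. It is an illustration only: the function name is hypothetical, the operator precedence is Python's, and it does not show how vtk_slave_define actually parses the option.

```python
import ast
import operator

# Only the four operators named in the release note are accepted.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def eval_maxload(expr: str, capacity: float) -> float:
    """Evaluate an arithmetic expression where CAPACITY is the slot count."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant):
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return float(node.value)
        if isinstance(node, ast.Name) and node.id == "CAPACITY":
            return float(capacity)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported element in expression: {expr!r}")

    return walk(ast.parse(expr, mode="eval"))


# A host with 8 slots and -maxload CAPACITY*1.5
print(eval_maxload("CAPACITY*1.5", 8))
```

For a host with 8 slots, CAPACITY*1.5 yields a maxload of 12. Anything other than numbers, CAPACITY, and the four operators is rejected with a ValueError.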
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 8031 | 21729 | As of 2016.09 update 11, there is a new environment variable VOV_PAUSE_CHILD_SIGNAL, which controls the signalling of child processes on Linux platforms when a slave enters the PAUSED state (usually this happens when a slave receives a SIGTSTP signal). The valid values of this variable are: "STOP" - send SIGSTOP to all child processes. This is the default behavior. "TSTP" - send SIGTSTP to all child processes. SIGTSTP is not guaranteed to suspend child processes. "NONE" or ""(empty string) - do not signal child processes. Any value other than the valid values will result in the default behavior (SIGSTOP). As of 2016.09 Update 13, the environment variable VOV_PAUSE_VW_ON_TSTP can be set to 1 to cause vw jobs to pause when they receive the SIGTSTP signal. The vw jobs will continue if they receive either SIGCONT or SIGALRM. |
All | 8486 | | Fixed a bug that was introduced in 2016.09u10 wherein the server would crash if it failed to obtain a lock on the server.info file (on server startup). |
FlowTracer | 8294 | 22059 | Prevent Tcl errors for FDL procedures that apply to the most recent job, set, or file when the FDL has not yet declared the necessary item. Issue a warning that the item was ignored for properties, annotations, and job names. For inputs, outputs, capsules, adding items to sets, IFJOB, and X11_DISPLAY, stop the vovbuild with a fatal error when this problem occurs. |
LicenseMonitor | 7863 | 1099 | vovsql_load_checkouts now is able to process
checkout log with handle# > limit of INTEGER data type in one of the
following ways:
|
LicenseMonitor | 7977 | 21594 | The efficiency stats report now has an "Include idle time" option. If selected, the percentile calculation is based on duration, matching the right-side (Exactly Used) chart in the "Feature Efficiency Histogram". |
LicenseMonitor | 8054 | | The .png export of report charts now works correctly, even for a high-DPI canvas. |
LicenseMonitor | 8251 | 21817 | Updated LicenseMonitor's FlexLM parser to combine reservations for the same feature and user or group that are spread over multiple declarations. |
LicenseMonitor | 8173 | 21934 | The vovserver policy.tcl file options
checkoutHostLowerCase and
checkoutUserLowercase were inoperative, but now
correctly control case-sensitivity. |
LicenseMonitor | 8201 | 21958 | Fixed issue in the usage comparison plot report where an error is produced when dropped queued requests are present. |
NetworkComputer | 8187 | 21130 | Added a post_retrace_downcone post-processing
script to schedule the down cone of a job, to cover the case in
which a job with dependencies may be automatically resubmitted
multiple times. |
NetworkComputer | 8241 | 21985 | Fixed a problem with @JOBID@ not being expanded in the environment specification. This was causing a problem with cross queue submission due to environment not being captured correctly. |
NetworkComputer | 8362 | 22086 | Fixed issue with new dependencies for existing jobs in NC that caused the job that is a dependency for another job to turn invalid, which also invalidated the downcone. Now, it does not turn invalid. |
NetworkComputer | 7982 | 21646 | Changed documentation to suggest usage of the option -rule rather than -hw for multiple constraints in an nc hosts command. Corrected the help for nc hosts. An issue was also fixed in the nc hosts -hw command that prevented multiple elements from being evaluated. |
NetworkComputer | 8188 | 21251 | Previously, when vovgetgroups timed out or did not return groups info correctly, the job would run with the incorrect groups. Following this change, if vovgetgroups fails or times out, the job will fail. |
NetworkComputer | 8219 | 22004 | No longer perform detailed slave analysis unless there is a confirmed HW resource wait reason. |
NetworkComputer | 8244 | | A new server configuration parameter has been added to limit the
maximum number of ORs in a resource expression. The parameter can be
set in policy.tcl or from the shell. If a job
is submitted and the number of ORs in its resource expression
exceeds the limit, the job fails with a message in the Why info. The
default value for the limit is 20. |
NetworkComputer | 8458 | | Synchronized the nc hosts utility with supported slave fields so that all fields are supported by the -O option. |
NetworkComputer | 8089 | 21856 | NC jobclasses defined via VOV_JOBCLASS_DIRS now are visible in web and command line. |
NetworkComputer | 8111 | 21869 | vovserver no longer issues incorrect license status GRACE messages in its log file. |
NetworkComputer | 8218 | | Fixed error message concerning undefined errorCode variable in the nc why help message. |
NetworkComputer | 8330 | 22081 | Fixed a bug in which nc hosts -f did not return an accurate list of field names. |
NetworkComputer | 8397 | 22079 | The fork time metric now shows 0 if threading has not been turned on, and also falls to 0 when threading gets turned off after being on. |
WorkloadXelerator | 8074 | 21997 | Fixed resource parsing that contains | operator in vovelasticd, vovwxd. |
WorkloadXelerator | 8214 | 21986 | WX no longer leaves stranded processes after the vovslave in the base NC setup has become SICK and subsequently recovers. |
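The VOV_PAUSE_CHILD_SIGNAL values listed under item 8031 can be summarized as a small lookup. The Python sketch below is a model of the documented behavior, not the vovslave implementation; the function name is hypothetical, and treating an unset variable as the default is an assumption.

```python
import signal
from typing import Optional


def pause_child_signal(value: Optional[str]) -> Optional[signal.Signals]:
    """Map a VOV_PAUSE_CHILD_SIGNAL value to the signal sent to child processes."""
    if value is None:
        # Assumption: an unset variable gives the default behavior.
        return signal.SIGSTOP
    if value in ("NONE", ""):
        # Child processes are not signalled at all.
        return None
    if value == "TSTP":
        # SIGTSTP is not guaranteed to suspend child processes.
        return signal.SIGTSTP
    # "STOP" and any unrecognized value fall back to SIGSTOP.
    return signal.SIGSTOP


for setting in ("STOP", "TSTP", "NONE", "", "bogus"):
    print(repr(setting), "->", pause_child_signal(setting))
```

The sketch relies on SIGSTOP and SIGTSTP being available in the signal module, which holds on Linux, the platform the variable applies to.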
2016.09 Update 13.1
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | 8505 | 22150 | Restored sensitivity to NC_STOP_SIGNALS and NC_STOP_SIG_DELAY job properties. |
2016.09 Update 12
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | 8174 | | Improved performance of vtk_resourcemap_set_limit. |
NetworkComputer | 8125 | | Added timing control for preemption with two new parameters:
preemption.max.time.overall which limits the
time spent by the preemption code in any given preemption iteration
(normally 0.3 seconds at most once every 3 seconds) and
preemption.max.time.rule which limits the
maximum time for each rule. Improved performance of RESERVE_SLAVES
rules. RESERVE_SLAVES lists can now be specified with
SlaveList:NAMEOFLIST . Slaves can now be reserved
to a bucket. |
NetworkComputer | 6022 | 21107 | Removed the confusing -q option to nc
forget, introduced in 2016.09 Update 9, since it could
be confused with the -q (queue) option. Replaced
it with the -quiet option to nc
forget. Note that the command is not quiet in case of
errors. To make it completely silent, use: nc forget ...
>& /dev/null . |
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 8184 | | Fixed an error case in which vovconsole printed unknown color name "" when there was no visible error. |
FlowTracer | 8052 | 21619 | Fixed bugs with barriers. Prevent jobs above a valid barrier from being retraced when a job below the barrier is retraced with the aggressive retrace flag. Prevent invalidation of a valid barrier that is the output of an invalid job during vovbuild. Added protection from illegal status changes (for example, a job can not be MISSING). Allow only INVALID to propagate to the entire downcone of a node. |
FlowTracer | 8083 | 21752 | vtk_set_get_elements now accepts -selrule as a synonym for -rule. Error handling for invalid options has been improved. |
FlowTracer | 8161 | | Fixed bug that sometimes caused deleted sets to not be removed from the set browser in FlowTracer after being forgotten. This was most visible when doing vovforget -allsets and depended on the order that sets were forgotten. |
LicenseAllocator | 8123 | 21866 | LA will check allocations against min restriction at every step of allocation calculation. |
LicenseAllocator | 8088 | | LA will check out at least 1 token of jobs_la, even if no jobs are running. |
LicenseMonitor | 8113 | 21882 | Fixed an issue where the override timezone was not being honored for log parsing jobs. |
LicenseMonitor | 8176 | 21952 | In some cases, the LM 'convert to batch report' link generated command lines that produced reports different from those in the browser UI. This is now fixed. |
NetworkComputer | 7958 | 21591 | Modified both nc info and node.cgi to show the CHOSENSLAVEID if it is set. Also added the -sameslave option to vovresreq to control this behavior. |
NetworkComputer | 8027 | 21723 | Prevent changes to a job while running if the changes would invalidate the job. This fixes a bug when resources are modified with nc modify on a running job, causing it to turn INVALID/Idle even though the processes are still running. |
NetworkComputer | 8126 | 21884 | A new debug environment variable, VOV_DEBUG_NO_START has been created. The script vov_diagnostic_no_start will be run only if this environment variable is set to a non-zero value on the slave. The script contains a vovselect query, which can increase server load significantly under certain conditions. The query will be run only if VOV_DEBUG_NO_START is set to 2 on the slave. |
NetworkComputer | 5624 | 20532 | Changed reconciliation behavior: when reconciling a resource R, the previous resource in the grabbed list is now examined to check whether it is a summary resource for R. |
NetworkComputer | 7962 | 21605 | Fixed calculation of most recent job in the nc info ! command. |
NetworkComputer | 8043 | 21762 | There are minor changes to the wording of the output of vsy, vovwhy, nc why, wx why and vtk_explain_status. |
NetworkComputer | 8090 | 21778 | Corrected incorrect wording in why-waiting analysis that misreported the job's bucket rank in FairShare as the job's order in the bucket. |
NetworkComputer | 8093 | 21849 | Fixed incorrect reference to:
NC:AlsoRemovePreviousSummaryResources in
recursive call. |
NetworkComputer | 8171 | | Reduced the rate at which vovresourced checks for LM to be up when using a hard-coded LM location and LM is currently down. |
NetworkComputer | 6939 | 21653 | Fixed nc gui timeout restart so you do not have to go through the additional steps of clicking in the set bar and pressing enter. |
NetworkComputer | 8044 | 21764 | If a slave becomes a "stopped" slave, the symlink slave.log is not updated by the slave. |
NetworkComputer | 8050 | 21772 | The behavior of the nc info command has been fixed to provide better information in cases where a FAILED or INVALID job has an invalid return code. |
NetworkComputer | 8159 | | The nc who command was nonfunctional starting in 2016.09u9. This has been fixed. |
NetworkComputer | 7944 | 21576 | Fixed issue where NC wrapper fails to exit when job is complete. |
NetworkComputer | 7978 | 21596 | There is a new optional policy.tcl parameter on Linux platforms named slave.childProcessCleanup. Setting this parameter to 1 causes slaves to kill all child processes when a job exits. This parameter will implicitly use cgroups. Additionally the old method of using vovprocessmgr has been fixed. |
NetworkComputer | 8098 | 21828 | The time window was missing from the URL of the FairShare web page. This prevented it from being shared. The URL now again has name=value pairs, which should once again allow sharing. |
NetworkComputer | 8136 | 21918 | There was a race condition which could result in a deadlock when a signal handler was called. This manifested variously in the traceback as a hung call to futex() or readSocket(), and possibly others. This problem has been fixed. |
NetworkComputer | 8156 | 21915 | When reading environments, if there is an environment variable with an empty value, call unsetenv on it instead of logging an error with a backtrace in the slave log. |
NetworkComputer | 8157 | 21928 | The emulated bsub command now correctly uses the job placement policy defined in the NC jobclass that is mapped to the bsub "-q" option. |
WorkloadXelerator | 8085 | 21803 | WX now correctly handles SlaveList requests by not attempting to process them in the front-end, passing them on to the NC back-end for processing instead. |
WorkloadXelerator | 7772 | 21266 | Failover slaves now use the original vovserver's VOVDIR as opposed to their own when starting a failover server. |
WorkloadXelerator | 8097 | | Catch error caused by the removal of a non-existent slave. |
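The two thresholds described for VOV_DEBUG_NO_START under item 8126 can be read as a pair of yes/no decisions. A minimal Python sketch of that reading; the function name is hypothetical, and treating empty or non-numeric values as 0 is an assumption:

```python
from typing import Tuple


def no_start_diagnostics(value: str) -> Tuple[bool, bool]:
    """Return (run_script, run_query) for a VOV_DEBUG_NO_START value."""
    try:
        level = int(value)
    except ValueError:
        # Assumption: empty or non-numeric values behave like 0.
        level = 0
    run_script = level != 0  # script runs for any non-zero value
    run_query = level == 2   # the heavier vovselect query runs only at 2
    return run_script, run_query


for setting in ("0", "1", "2"):
    print(setting, no_start_diagnostics(setting))
```

Setting the variable to 1 runs the diagnostic script without the vovselect query, which avoids the extra server load the release note warns about.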
2016.09 Update 11
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | 8017 | | Improve visibility on unused slaves. |
NetworkComputer | 6151 | 21656 | Added @JOBLOGDIR@ field for JOBS to allow pre- and post-command logfiles to go to the same directory as the job logfile. Checked if the pre- and post-command output logs are zero length, in which case they are auto-deleted. |
NetworkComputer | 7455 | 20674 | The optional live_keepfor_jobs.tcl task script has been improved to reduce the load on the NC vovserver. |
NetworkComputer | 7604 | 20904 | Added an explanation to the discussion of nc why in vncqueue.html in trunk that the main 'why' reason may not be the only one. |
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | 7978 | 21596 | There is a new optional policy.tcl parameter on Linux platforms named "slave.childProcessCleanup". Setting this parameter to 1 causes slaves to kill all child processes when a job exits. |
NetworkComputer | 7999 | 21674 | Fixed issue where keyword substitution for the command line can inject curly braces if the command line includes a quoted string. |
NetworkComputer | 8011 | 21693 | Increase the max capacity of a running slave to match the new capacity if the slave is being reconfigured with a capacity that exceeds the current max capacity. |
WorkloadXelerator | 7772 | 21266 | Improve visibility on unused slaves. |
WorkloadXelerator | 7996 | 21706 | Ensure vovelasticd daemon clears the queue error state to allow launcher submissions to resume after its 2-minute error waiting period. |
WorkloadXelerator | 8025 | 21663 | Disable vovresourced LM interconnect for WX. |
2016.09 Update 11.5
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | 8644 | 22239 | To prevent excessive server load due to too many OR clauses in resource map sums, calls to vtk_resourcemap_sum from <swd>/vovresourced/config.tcl are now limited to expressions of up to 5 ORs (by default). The default limit can be adjusted by setting RESD(maxORsInResourceSum) in <swd>/vovresourced/config.tcl. If a resource sum with excessive ORs is seen, an alert is generated and an error message is output once per day in the vovresourced logfile. |
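The limit described for item 8644 is a count of OR operators compared against a threshold. The Python sketch below illustrates that check under the assumption that each OR is written as a | character; the function names and the sample expressions are hypothetical, and this is not the vovresourced implementation.

```python
def count_ors(expression: str) -> int:
    """Count OR operators, assumed here to be written as '|'."""
    return expression.count("|")


def exceeds_or_limit(expression: str, max_ors: int = 5) -> bool:
    """Return True when the expression has more ORs than the limit allows."""
    return count_ors(expression) > max_ors


# Six alternatives contain five ORs, which is still within the default limit.
print(exceeds_or_limit("a|b|c|d|e|f"))
# Seven alternatives contain six ORs, which is over the default limit.
print(exceeds_or_limit("a|b|c|d|e|f|g"))
```

The max_ors argument plays the role that RESD(maxORsInResourceSum) plays in the release note.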
2016.09 Update 11.4
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | 8644 | 22239 | To prevent excessive server load due to too many OR clauses in resource map sums, calls to vtk_resourcemap_sum from <swd>/vovresourced/config.tcl are now limited to expressions of up to 5 ORs (by default). The default limit can be adjusted by setting RESD(maxORsInResourceSum) in <swd>/vovresourced/config.tcl. If a resource sum with excessive ORs is seen, an alert is generated and an error message is output once per day in the vovresourced logfile. |
2016.09 Update 11.3
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 8500 | | Provided a workaround to enable the patch URL at the top of the project page to be rendered properly as a link instead of displaying HTML code. This has been fixed in 2016.09u12 and above. |
NetworkComputer | 7982 | 21646 | Changed documentation to suggest usage of the option -rule rather than -hw for multiple constraints in an nc hosts command. Corrected the help for nc hosts. Also fixed an issue in the nc hosts -hw command that prevented multiple elements from being evaluated. |
NetworkComputer | 8187 | 21130 | Provided a workaround that enables auto-rescheduling of failed jobs so that the downcone dependencies do not get descheduled. Now, the failed job can re-run and the downcone will run if and when the job passes on the second (or later) attempt. Requires the workaround script $VOVDIR/etc/post/post_retrace_downcone which must be run at the end of the job as a post cmd. |
NetworkComputer | 8241 | 21985 | Provided a workaround for a problem where @JOBID@ was not being expanded in the environment specification. This was causing a problem with cross queue submission due to the environment not being captured correctly. The workaround requires use of VncCallbackAction (see NC Admin Guide) to call a custom callback script in the jobclass to perform the expansion. |
2016.09 Update 11.2
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | 8136 | 21918 | There was a race condition which could result in a deadlock when a signal handler was called. This manifested variously in the traceback as a hung call to futex() or readSocket(), and possibly others. This problem has been fixed. |
WorkloadXelerator | 8073 | 21803 | Minor cleanup of errors in vovelasticd.tcl. |
2016.09 Update 11.1
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | 8159 | | The nc who command was nonfunctional starting in 2016.09u9. This has been fixed. |
NetworkComputer | 8093 | 21849 | Fixed incorrect reference to: NC:AlsoRemovePreviousSummaryResources in recursive call. |
WorkloadXelerator | 8073 | 21803 | Improve handling of job’s expected duration with respect to maximum slave lifetime. Also, appropriately handle error caused by the removal of a non-existent slave so that the vovelasticd daemon does not enter the 2-minute wait state for back-end queue errors. More debug output at verbosity level 5 if slave not started for bucket. Improved web based visibility into vovelasticd. |
WorkloadXelerator | 8074 | 21654 | Fixed issue with handling resource specifications that contain boolean OR conditions. |
2016.09 Update 10
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 7925 | | Symbolic links to log files, such as server.log and slave.log, now use relative paths. |
FlowTracer | 7827 | 21382 | The VOV_STDOUT_SPEC environment variable now supports @JOBNAME@
as part of the stdout and stderr filename formats. The
jobname portion of these filenames is limited to alphanumeric
characters, the underscore, the hyphen, and the period. Other characters
in the jobname are filtered out. For jobs without a jobname,
NOJOBNAME is used. Note that the vovserver must have the
VOV_STDOUT_SPEC environment variable defined with @JOBNAME@ for this to
take effect. |
FlowTracer | 7928 | 21547 | No longer bind to the X display by default in vovsh. To bind, pass the -d option. |
LicenseMonitor | 7762 | 21280 | ftlm_batch_report can now take input fields from a
file: % ftlm_batch_report -inputFile <INFILE>. A
template <INFILE> can be found at
$VOVDIR/etc/config/lm/ftlm_batch_report.tm. |
LicenseMonitor | 7849 | 21424 | LA will cut back unused tokens to maximize utilization. It will also try hard not to reduce allocations below currently running for higher weightage sites, thus reducing the likelihood of pre-emption on NC. |
NetworkComputer | 7959 | 21585 | The -maxload option of
vtk_slave_define now accepts simple
expressions relative to the capacity value, represented by the
keyword ’CAPACITY’. For example, -maxload
CAPACITY*1.5 would set the maxload to 1.5 times the
number of slots. Supported operators are: +-/* . |
NetworkComputer | 7535 | 20833 | The Tcl procedure VovGetRevokeDelay can now be added and customized by redefining it in vovresourced/config.tcl under the SWD directory to allow users to customize the revoke delay to be used in vovreconciled. This allows users to have the revoke delay from their job classes override the default value of RESD(revokeDelay). The proc definition has been added to the documentation. |
NetworkComputer | 7714 | 21125 | Enabled the @USER@ recipient keyword for the LongJobs health check notification procedure. |
NetworkComputer | 7715 | 21199 | A limited-release capability, nc info -legacy, is provided to generate nc info output in the older format (prior to 2016.09). The behavior of this command can be customized via a Tcl file at .../local/vncinfo.config.tcl based on environment variable, username, project name, etc. by doing ’set NCINFO(legacy) 1’ under the desired conditions. Important: This legacy feature will not be available in releases beyond 2016.09 (and updates). |
NetworkComputer | 7881 | | Fixed the ROWCOUNT field in REST API output so that it does not report double the correct row count. |
NetworkComputer | 7810 | 21343 | The Stop all running jobs button has been renamed Stop jobs for all users. The default button in the confirmation dialog is now "Cancel" instead of "OK", and "OK" has been relabeled "Yes, Stop All". |
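
The @JOBNAME@ filtering rule described for VOV_STDOUT_SPEC (entry 7827) can be illustrated with a small sketch. This is not the product's implementation: the function names, the ASCII-only reading of "alphanumeric", and the example format string `logs/@JOBNAME@.out` are assumptions made for illustration.

```python
import re

# Characters allowed in the jobname portion: alphanumerics, underscore,
# hyphen, and period (ASCII-only is an assumption of this sketch).
_DISALLOWED = re.compile(r"[^A-Za-z0-9_.\-]")


def sanitize_jobname(jobname):
    """Filter out disallowed characters; use NOJOBNAME when no jobname is set."""
    if not jobname:
        return "NOJOBNAME"
    return _DISALLOWED.sub("", jobname)


def expand_stdout_spec(spec, jobname):
    """Substitute the @JOBNAME@ keyword in a (hypothetical) filename format."""
    return spec.replace("@JOBNAME@", sanitize_jobname(jobname))
```

For example, `expand_stdout_spec("logs/@JOBNAME@.out", "my job/1.0")` drops the space and the slash from the jobname before substitution. How a jobname that becomes empty after filtering is handled is not stated in the release note, so the sketch leaves it empty.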
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 7807 | | Improved signal handling in the vov/vw/vrt wrapper when waiting for a job to finish, to increase stability. |
All | 7931 | | Fixed conflict with the -platform and -nocommon options in the batch installer so that the option order does not matter. |
FlowTracer | 7912 | 21517 | LSF jobs can now have multiple "LSFmopts:*" parameters in the resource string. |
FlowTracer | 7953 | 21590 | The capsule for Questa vcom is enhanced to handle files containing both entity and package declarations. |
FlowTracer | 7749 | 21260 | Documentation for PJ has been updated to describe the new method of indicating a PJ is a system job. |
FlowTracer | 7776 | 21297 | When an input or output is disconnected from a running job, do not invalidate the job. If a file was declared as an output of a running job during the job, and it is disconnected, propagate INVALID to the downcone of the disconnected output. |
FlowTracer | 7829 | | Prevent md5 barriers from invalidating the downcone when the timestamp changes but the content (md5sum) is still the same. |
FlowTracer | 7837 | 21356 | When the FDL commands T_FINAL or J_FINAL are called, do not disconnect dependencies other than the ones previously declared in FDL (no runtime dependencies such as instrumented tools, stderr, etc, should be detached). |
LicenseAllocator | 7861 | 21445 | LA will prevent reconciliation of licenses by vovreconciled if matching has been turned off for this license resource. |
LicenseAllocator | 7984 | | Fixed the problem with targets being set to 1 even when there is no demand. |
LicenseMonitor | 7601 | 20454 | Added the ability to specify a replacement pattern and string for host names obtained by LM parsers. This is configured by adding a line to the SWD/config/parser.cfg file that gives PATTERN, a case-sensitive regular expression pattern, and REPLACEMENT, the string with which to replace the pattern. |
LicenseMonitor | 6962 | 20151 | Use the parser time for checkouts in the remote LM parser instead of the remote checkout time, to prevent overlapping of checkout records. |
LicenseMonitor | 7863 | 21099 | vovsql_load_checkouts can now process checkout logs whose handle numbers exceed the limit of the integer data type. |
LicenseMonitor | 7775 | 21279 | Unchecking an option on a batch report in the web UI now disables the plot for that option, matching the behavior of the online history report. The batch report CLI now accepts a boolean (1|0) value for options that previously took no value, and remains backward compatible if no value is provided. |
LicenseMonitor | 7868 | 21447 | The report UI can now display users or any other drop-down list in supported versions of all browsers on Windows (Chrome, Firefox, IE), even if the list contains binary characters. |
LicenseMonitor | 4216 | 21593 | Added average usage statistic to the Efficiency Statistics report table. |
NetworkComputer | 7906 | 21516 | Fixed issue that caused global namespace variables in the vnc_policy.tcl file to not be seen by the Tcl interpreter without adding a global call. |
NetworkComputer | 7383 | 20576 | Start time, end time, and duration values are now validated in a call to vtk_slave_reserve to prevent values of 0 from being applied to a reservation. This prevents confusing reservation property entries in the /system/slaves/reservations FairShare group, if enabled. Slave reservation expirations and cancellations are now logged in the server log, even if the reservation expires while the slave is down. Slave properties, which store reservations for persistence, are now enabled by default for NetworkComputer instances. |
NetworkComputer | 7487 | 20455 | Fixed issue where bash functions were not being defined correctly in snapprop. |
NetworkComputer | 7739 | 20807 | Resumer jobs for jobs preempted via preemption rules that are waiting for HW now request the same slave and HW resources as the preempted job. |
NetworkComputer | 7937 | 21560 | Certain environment variables are no longer carried over from the job submission environment to the execution environment when using environment snapshots. This prevents conflicts in a mixed-version environment. |
NetworkComputer | 7866 | | Clean up stderr files generated by the SNAPPROP job environment when an error is caught by the Tcl interpreter when fetching a system property. |
NetworkComputer | 7880 | | Fixed the format of JSON output in the REST API to include commas between list elements. |
NetworkComputer | 7919 | 21520 | The following changes have been made to non-legacy output of vsy, vovwhy, nc why, wx why, vtk_explain_status and related commands: slavelists will not appear as SLAVELIST-NOT-AVAILABLE with 0 slaves when running wx. For non-wx products, empty or non-existent slavelists will now appear under "Main Reason" in the first section of the output. This is an additive change; slavelist information will still appear in the "Additional Information" section as well. The redundant line "Analyzing job 12345" has been removed from non-verbose output. This makes the "Main Reason" section easier to read. |
NetworkComputer | 7958 | 21591 | A slave spawned by vovelasticd should use the trigger job’s expected duration for its max life if the duration is longer than the slave max life configuration. This will prevent a deadlock situation where a slave is available to run a job, but cannot because it will exit before the job is expected to be finished. |
NetworkComputer | 7973 | | Modified job why-waiting analysis to report on cases where the job’s expected duration exceeds a slave’s maximum lifetime. |
NetworkComputer | 7975 | 21634 | Enabled pre/post commands with use of -f option to nc run. |
NetworkComputer | 7982 | 21646 | Changed documentation to suggest usage of the option "-rule" rather than "-hw" for multiple constraints in a "nc hosts" command. |
NetworkComputer | 5725 | 20830 | The "Seamless Transition To A Cycle-Based Scheduler" of the docs has been modified to provide added description for scheduler parameters. |
NetworkComputer | 7644 | 21092 | Improved error handling in LSF emulation’s bsub. |
NetworkComputer | 7930 | 21518 | Do not attempt to kill a non-existent subslave (with a PID of 0) after a job fails to start due to an unknown user error. |
NetworkComputer | 7631 | 21020 | If a job class description contains the HTML tag <em>, the enclosed text is now rendered as italic/emphasized, as intended. |
NetworkComputer | 7682 | 21153 | Fixed so that all NC jobclasses from $VOVDIR/local/jobclass show up in the web GUI. |
WorkloadAccelerator | 7890 | 21478 | Enabled initialization of jobclasses so that jobclass limit resources are defined correctly. |
WorkloadAccelerator | 7910 | 21532 | Made vovelasticd sensitive to bucket FairShare groups for determining the number of job launchers to submit to the back-end queue. |
WorkloadAccelerator | 6281 | | Added "VOVELASTICD(maxQueueErrors)" setting, defaulting to 10, that will cause the vovelasticd daemon to stop attempting launcher submissions when the number of consecutive launcher submission failures to the back-end queue exceeds the setting. Both a system alert and a vovelasticd log file entry will be generated during this condition. Most often, this is related to a misconfigured queue. The daemon will resume normal operation once the configuration files have changed to correct the issue. |
WorkloadAccelerator | 6314 | | Documented the requirement to include LSFqueue:<queueName> in the resource specification in order to direct jobs to a specific LSF queue when using LSF as the base scheduler. |
WorkloadAccelerator | 7744 | 20975 | Enabled additional health check procedures, which can be used to monitor for errant conditions in the compute environment or the workload, such as jobs that have been queued or running for longer than desired. |
WorkloadAccelerator | 7772 | 21266 | Failover slaves now use the original vovserver’s VOVDIR as opposed to their own when starting a failover server. |
WorkloadAccelerator | 7820 | 21173 | Fixed case in WX where users' first jobs do not run when they are submitted with "-f file". |
WorkloadAccelerator | 7969 | 21614 | Slaves spawned by vovelasticd should be reserved for the trigger job’s bucket instead of its FairShare group. This will ensure that the slave will execute the trigger job first, then any subsequent jobs in the same bucket instead of allowing unrelated jobs to be executed. |
WorkloadAccelerator | 7976 | 21633 | Elastic slave launchers should reflect the job placement policy of the trigger job. |
WorkloadAccelerator | 7989 | 21661 | Elastic slaves should be initialized with actual RAMTOTAL instead of using the trigger job’s requested RAM. |
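
The host-name replacement described in entry 7601 amounts to a case-sensitive regular-expression substitution. The sketch below shows the idea only; the function name and the example pattern are hypothetical, and Python's regex dialect may differ from the one used by the LM parsers.

```python
import re


def replace_host_pattern(host, pattern, replacement):
    """Apply a case-sensitive regex substitution to a parsed host name.

    `pattern` plays the role of PATTERN and `replacement` the role of
    REPLACEMENT from the release note; Python's `re` is case-sensitive
    unless re.IGNORECASE is passed, which matches the documented behavior.
    """
    return re.sub(pattern, replacement, host)


# Hypothetical example: strip a domain suffix so hosts are reported by short name.
short = replace_host_pattern("node17.example.com", r"\.example\.com$", "")
```

Because matching is case-sensitive, a host reported as `NODE17.EXAMPLE.COM` would be left unchanged by the pattern above.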
2016.09 Update 10.1
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
WorkloadAccelerator | 7995, 7996 | 21655, 21706 | Fixed issue that caused vovelasticd daemon to enter a dead-locked state upon reaching the max allowed number of back-end queue errors. |
WorkloadAccelerator | 8073 | 21803 | Improve handling of job’s expected duration with respect to maximum slave lifetime. Also, appropriately handle error caused by the removal of a non-existent slave so that the vovelasticd daemon does not enter the 2-minute wait state for back-end queue errors. More debug output at verbosity level 5 if slave not started for bucket. Improved web based visibility into vovelasticd. |
WorkloadAccelerator | 8074 | 21654 | Fixed issue with handling resource specifications that contain boolean OR conditions. |
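
The relationship between a job's expected duration and the maximum slave lifetime (entries 7958 and 8073) can be sketched as follows. The function names and the use of seconds are assumptions for illustration, not the vovelasticd code.

```python
def elastic_slave_max_life(configured_max_life_s, trigger_job_expected_s):
    """Lifetime to give a slave spawned for a trigger job.

    Per entry 7958, the trigger job's expected duration is used when it is
    longer than the configured slave max life.
    """
    return max(configured_max_life_s, trigger_job_expected_s)


def job_fits(slave_remaining_life_s, job_expected_s):
    """A slave that would exit before the job finishes cannot run it."""
    return job_expected_s <= slave_remaining_life_s
```

With a configured max life of 3600 seconds and a trigger job expected to run 7200 seconds, the slave is given 7200 seconds, which avoids the deadlock in which a slave is available but can never accept the job.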
2016.09 Update 9
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 7697 | 21191 | On the Admin/Database page or in vovdbd_util, the user can now specify the port used by the VOV Postgres DB. |
All | 7657 | | The slave will now print a warning message to its log when the server appears to be a different version from the slave. |
All | 7728 | | Allow colons in property names. |
FlowTracer | 7788 | | Added option to vovbuild to prevent building when jobs are running. |
FlowTracer | 7743 | 21229 | When vov/vw/vrt is used to launch jobs from a client, and an illegal option is passed, make sure to update the "WHY" property to tell the user exactly why it failed. Previously it would exit with error code 2 and not provide any reason why. |
FlowTracer | 7609 | 21025 | Display run and suspend times in UIs that show job duration. |
FlowTracer | 6852 | 15427 | In FlowTracer and NC GUI, added subsets to the list of nodes that "quick find" highlights and selects. Previously it would only search for jobs and files. |
FlowTracer | 7667 | | In the FlowTracer and NC GUI, the quick find search field previously searched every time you typed or clicked the mouse in the search field, provided there were at least 2 characters. For large node counts, this made it feel like it was hanging between keystrokes. Quick find now initiates a search request when the search text changes, but does not start the search until a small amount of time has passed, which prevents the search from starting too early. It can also interrupt the previous search if more characters are typed. In addition, a bug that caused each node type to be searched twice has been fixed. For large node counts (100k+), white status messages are printed on the screen to show the search progress. The result is a faster and more responsive quick find feature. |
FlowTracer | 7689 | | When NFS protection is enabled using the nfsDelay setting in policy.tcl, do not wait for the NFS delay time to pass when checking if a file exists if the file's status is MISSING. |
FlowTracer | 7688 | | When using vovbuild, if an input or output dependency is excluded due to exclusion rules such as those in the server working directory file "exclude.tcl", print a warning message. |
LicenseAllocator | 7105 | 20054 | Site nicknames will no longer be truncated in the column headers of the overview page - the full nickname will be displayed. |
LicenseAllocator | 7576 | 20897 | LA overview page will now show the sum of all running jobs for each resource, and all queued jobs for each resource, across all sites. |
LicenseAllocator | 7581 | 20901 | Clicking running or queued jobs in Overview page or Resource Summary page will take you to an NC page that only shows jobs that are running with or queued for this specific resource. |
LicenseAllocator | 7279 | 20452 | The resourcemap owner name will now include the name of the LA project which is currently controlling this resource. |
LicenseMonitor | 7706 | | Added CLI version of the LM agent SFD for Windows. The new agent, lmagent-win64-cli.exe, can interact with the Windows SCM (requires UAC elevation). Run with -h to get the full usage syntax. |
NetworkComputer | 6022 | 21107 | Added option -q to nc forget. Note that the command is not quiet in case of errors. To make it completely silent, use: nc forget ... >& /dev/null . |
NetworkComputer | 7452 | 20720 | The vovreconciled daemon now includes the name of the license when notifying that reconciliation is skipped for a job. |
NetworkComputer | 7652 | 21013 | Added new nc info -dep command to easily list job dependencies. |
HERO | 7103 | 20071 | Completed implementation of vtk_fs_stat by supporting the field "fsid". |
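
The quick find change in entry 7667 is a form of debouncing: a search is requested on every text change but only started once the text has been stable for a short delay. The sketch below models that behavior with an explicit clock so it can be run deterministically. The class name, the 300 ms delay, and the `tick` polling method are assumptions; only the 2-character minimum comes from the release note.

```python
class DebouncedQuickFind:
    """Start a search only after the text has been unchanged for delay_ms."""

    def __init__(self, delay_ms=300, min_chars=2):
        self.delay_ms = delay_ms      # assumed value; the note says "a small amount of time"
        self.min_chars = min_chars    # from the note: at least 2 characters
        self._pending = None          # text waiting to be searched
        self._due_at = None           # time at which the pending search may start
        self.started = []             # searches that were actually started

    def on_text_changed(self, text, now_ms):
        """Record a new request; a newer request replaces the pending one."""
        if len(text) < self.min_chars:
            self._pending, self._due_at = None, None
            return
        self._pending = text
        self._due_at = now_ms + self.delay_ms

    def tick(self, now_ms):
        """Called periodically; starts the pending search once the delay has elapsed."""
        if self._due_at is None or now_ms < self._due_at:
            return None
        text, self._pending, self._due_at = self._pending, None, None
        self.started.append(text)
        return text


finder = DebouncedQuickFind()
finder.on_text_changed("ab", now_ms=0)
finder.on_text_changed("abc", now_ms=100)   # replaces the pending "ab" request
early = finder.tick(now_ms=350)             # too early: due at 400 ms
late = finder.tick(now_ms=450)              # delay elapsed: search starts
```

Only the final text is searched, which is what keeps the field responsive while typing against large node counts.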
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 6966 | | If a server log file cannot be opened at server startup, the server will exit with a fatal error. If a new server log file cannot be opened during daily rotation and the previous log file is not accessible, a new log file will be opened under the tmp directory. |
All | 7381 | 20552 | Added a column to show resource reservations in the monitor GUI. |
All | 6921 | | Slaves will time out predictably after attempting to connect or reconnect to the server. VOV_RELIABLE_TIMEOUT can be used to specify the timeout period. The command line option -t can be used to override the setting of VOV_RELIABLE_TIMEOUT. |
All | 7524 | 20809 | The database host can now be configured using a fully qualified name, alias, or IP address, in addition to the canonical short host name. If the host provided cannot be pinged, an error message will be displayed. |
All | 7809 | 21345 | Fixed issue that caused the patch status link in certain web UI pages to be rendered as plain text instead of HTML. |
All | 7698 | | The vovconsole -geometry option was ignored in 2016.09u8. The option is now respected. |
All | 7801 | | No longer show utilization percentage in monitor GUI for resources with unlimited capacity. |
FlowTracer | 7777 | 21297 | Added option to vovbuild to prevent building when jobs in the flow are running. The new option is vovbuild -c which is short for "cautious" mode. |
FlowTracer | 5423 | 21190 | vovlsfd will generate a warning if it detects jobs with expected duration longer than MaxLife parameter. Slaves will not be submitted for these jobs. |
FlowTracer | 7814 | | Fixed a bug that caused an incorrect count of overloaded slaves to be printed in the output of vsy, nc why, wx why and vtk_explain_status. The formatting of the output of the vsy, nc why, wx why, and vtk_explain_job_status commands has also changed. |
FlowTracer | 7694 | | Sending a SIGTSTP signal to an indirect slave no longer kills the slave. Instead the slave process and all of its jobs will be suspended until another signal is received. |
FlowTracer | 7707 | | Fixed vtk_server_config protocol error in readonly security level. |
FlowTracer | 7724 | | Redirected indirect slave log from stdout to the slave daily log file. |
FlowTracer | 7725 | | Fixed grid mode drawing bug that made FlowTracer seem to go into an infinite loop of drawing when the setting to move nodes to the end in grid mode was enabled and the graph was already maxed out (autofit overflowed the bottom due to too many nodes to fit in the canvas). |
FlowTracer | 7590 | 20930 | Corrected vovps documentation, and added example. |
LicenseAllocator | 7222 | 21220 | LA now correctly drops overbooking to zero when overbooking is turned off. |
LicenseAllocator | 7794 | 21306 | LA will try to meet demand first before distributing extra tokens. Also, LA will not take tokens away from critical sites when adjusting for uncertains. |
LicenseAllocator | 7796 | 21311 | LA will use wave based OOQ values for computing available tokens for distribution. |
LicenseMonitor | 7647 | 20906 | Fixed issue with loading remotely-generated data via the lmmgr loadremotedata command caused by the utility not connecting to the LicenseMonitor server. |
LicenseMonitor | 7676 | 21269 | On Windows, one of the file path modules attempted shortcut expansion on a regular file and reported that the file did not exist even though it did. Shortcut expansion is now attempted only when the file really is a shortcut, and recovery code has been added for cases where the expansion fails. |
LicenseMonitor | 7665 | 20884 | Fixed an issue in the detailed plot report when showing reservations that caused currently-active reservations to be displayed as normal checkouts in the plot. |
LicenseMonitor | 7704 | 21178 | Fixed issue with batch reporting of denial plot for multiple features that resulted in the carryover of previous plot data for subsequent features that have no denials. |
NetworkComputer | 5624 | 20532 | Changed reconciliation behavior: when reconciling a resource R, the previous resource in the grabbed list is now checked to see whether it is a summary resource for R. |
NetworkComputer | 7680 | | Fixed bug where exporting preemption rules wrote out incomplete rules in some situations. |
NetworkComputer | 7699 | 20879 | Fixed bug where removing resources from slaveClass.table would not remove them from the server without a restart. |
NetworkComputer | 7770 | 20883 | Output files are no longer tracked for jobs submitted via the LSF emulation layer. This prevents a queue bucket from being created for each job that uses the bsub -o option to specify the job log file. |
NetworkComputer | 7607 | 20982 | LSF bsub emulator uses VOV_JOBPROJ environment variable for the project name. Specifying a project explicitly with -P in bsub command overrides the project specification in the VOV_JOBPROJ variable. |
NetworkComputer | 7636 | 20603 | Fixed a bug that caused VOV_PORT_NUMBER not to be applied correctly after a connection timeout. Previously, only the last port listed would be used after VOV_RELIABLE_TIMEOUT had expired. |
NetworkComputer | 7641 | | Preempting and preemptable selection rules are now validated at preemption daemon start time. Preemptable rules are also validated upon each update of the rule as job queue buckets change. |
NetworkComputer | 7682 | 21153 | Fixed behavior so that all NC jobclasses from $VOVDIR/local/jobclass show up in the web GUI. |
NetworkComputer | 7708 | | Fixed issue with slave-reservation preemption that treated this HW-based rule as one that is SW-based, preventing all related preemption attempts from succeeding. |
NetworkComputer | 7822 | 21365 | Corrected the vovslavemgr reserveshow output format for unending reservations. |
NetworkComputer | 6498 | | NC and WX now use the 'any' port mode by default, to help prevent deadlock situations when the port chosen by the previous default of 'automatic' mode is already in use. |
NetworkComputer | 7588 | 20923 | Fixed issue that prevented auto-rescheduled jobs from avoiding the previously used host. Also added a new configuration parameter in the policy.tcl file, autoRescheduleOnNewHost, to control whether autorescheduled jobs will avoid an entire host, or just the same slave. Set to 1 (default) for entire host, set to 0 for same slave. |
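
The effect of the autoRescheduleOnNewHost parameter (entry 7588) can be sketched as a filter over candidate slaves. The data layout (pairs of slave name and host name) and the function name are hypothetical; only the meaning of the values 1 and 0 comes from the release note.

```python
def eligible_slaves(slaves, previous, auto_reschedule_on_new_host=1):
    """Return the slaves an auto-rescheduled job may use.

    slaves:   list of (slave_name, host_name) pairs (hypothetical layout)
    previous: the (slave_name, host_name) pair the job last ran on
    auto_reschedule_on_new_host: 1 (default) avoids the entire host,
                                 0 avoids only the same slave.
    """
    prev_slave, prev_host = previous
    if auto_reschedule_on_new_host:
        return [s for s in slaves if s[1] != prev_host]
    return [s for s in slaves if s[0] != prev_slave]


candidates = [("s1", "hostA"), ("s2", "hostA"), ("s3", "hostB")]
whole_host = eligible_slaves(candidates, ("s1", "hostA"), 1)
same_slave = eligible_slaves(candidates, ("s1", "hostA"), 0)
```

With the default setting, a second slave on the same host is also excluded, which matters when the failure was caused by the host rather than by the slave process.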
2016.09 Update 9.4
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
LicenseMonitor | 6962 | 20151 | Use the parser time for checkouts in the remote LM parser instead of the remote checkout time, to prevent overlapping of checkout records. |
2016.09 Update 9.3
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
WorkloadAccelerator | 7910 | 21532 | Made vovelasticd sensitive to bucket FairShare groups for determining the number of job launchers to submit to the back-end queue. |
2016.09 Update 9.2
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | 7822 | 21365 | Corrected the vovslavemgr reserveshow output format for unending reservations. |
NetworkComputer | 7906 | | Fixed issue that caused global namespace variables in the vnc_policy.tcl file to not be seen by the Tcl interpreter without adding a global call. |
WorkloadAccelerator | 7820 | 21173 | Fixed case in WX where users' first jobs do not run when they are submitted with "-f file". |
WorkloadAccelerator | 7890 | 21478 | Fixed issue where WX limits were not being defined correctly. |
WorkloadAccelerator | 6314 | | Documented the requirement to include LSFqueue:<queueName> in the resource specification in order to direct jobs to a specific LSF queue when using LSF as the base scheduler. |
WorkloadAccelerator | 7744 | 20975 | Enabled additional health check procedures, which can be used to monitor for errant conditions in the compute environment or the workload, such as jobs that have been queued or running for longer than desired. |
WorkloadAccelerator | 6281 | | Added VOVELASTICD(maxQueueErrors) setting, defaulting to 10, that will cause the vovelasticd daemon to stop attempting launcher submissions when the number of consecutive launcher submission failures to the back-end queue exceeds the setting. Both a system alert and a vovelasticd log file entry will be generated during this condition. Most often, this is related to a misconfigured queue. The daemon will resume normal operation once the configuration files have changed to correct the issue. |
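
The VOVELASTICD(maxQueueErrors) behavior (entry 6281) is a consecutive-failure cutoff. The following sketch models it under stated assumptions: the class and method names are hypothetical, and resetting the counter on a successful submission is inferred from the word "consecutive" rather than stated outright.

```python
class LauncherSubmissionGuard:
    """Stop submitting launchers after too many consecutive queue errors."""

    def __init__(self, max_queue_errors=10):   # 10 is the documented default
        self.max_queue_errors = max_queue_errors
        self.consecutive_errors = 0

    @property
    def suspended(self):
        # Submissions stop when the failure count exceeds the setting.
        return self.consecutive_errors > self.max_queue_errors

    def record_submission(self, succeeded):
        """Track the outcome of one launcher submission to the back-end queue."""
        if succeeded:
            self.consecutive_errors = 0        # assumption: a success breaks the streak
        else:
            self.consecutive_errors += 1

    def on_config_changed(self):
        """Per the note, normal operation resumes once the configuration files change."""
        self.consecutive_errors = 0


guard = LauncherSubmissionGuard()
for _ in range(10):
    guard.record_submission(False)
at_limit = guard.suspended          # 10 failures do not exceed a limit of 10
guard.record_submission(False)
over_limit = guard.suspended        # the 11th consecutive failure does
guard.on_config_changed()
after_fix = guard.suspended
```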
2016.09 Update 9.1
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
LicenseMonitor | 7867 | 21451 | Enabled project tracking in remote LM parser. |
2016.09 Update 8
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 7474 | | Fixed instances where replacing/following symlinks is not supported in Windows. |
All | 7387 | 20602 | Added an updatecrontab_vovdir.csh script to the project.swd/autostart directory to automatically update the scripts/vovdir.csh file when the server starts. Upgrading to a new release now automatically updates the version of crontab used to be consistent. |
All | 7088 | 20035 | Added logic to check if the .stdstart (headLog) and .stdend (footLog) files are in use. If so, delete if size 0. |
All | 5525 | 12768 | The environment variable HOSTNAME was previously being transferred to slaves via SNAPSHOT or SNAPPROP. This was causing the HOSTNAME on the slave to show an incorrect value. The transfer is now suppressed. |
FlowTracer | 7553 | 20870 | 2016.09 update 8 adds PAUSED state for slaves that receive SIGTSTP; receiving this signal suspends all child jobs and the slave itself. Resuming slaves will resume all child jobs. Additionally, we have added a queryable HEARTBEAT field to the "slaves" table. |
FlowTracer | 7673 | | In FlowTracer, changed the default value of nfsdelay from 0 to 60 seconds to protect against invalidation of jobs due to filesystem caching, such as NFS attribute and directory caching. Also changed the default value of timeTolerance from 0 to 1 second to protect against invalidation of jobs due to clocks not being synchronized across hosts in the network. Both changes are made in the policy.tcl file in the server working directory. |
LicenseAllocator | 6499 | 20229 | Historical metrics can be loaded upon server restart by adding the command "LA::ReloadHistoricalMetrics" to LA's "config.tcl" AFTER all the sites and resources have been declared. The command requires a parameter specifying a duration (Vov time specification) for which to load the metrics. The duration is the time going back from "now". |
LicenseMonitor | 7507 | 20235 | Added new -replaceImages option to ftlm_batch_report to replace the dynamic image elements of a batch report HTML file with static PNG images. If the OUTFILE option is not passed, the utility will generate a new file named INFILE-static.html. OUTFILE can be the same as INFILE, but this is not recommended. |
LicenseMonitor | 7472 | 20744 | When available, make pid visible in certain LM reports. |
NetworkComputer | 7455 | 20674 | The optional live_keepfor_jobs.tcl task script has been improved to reduce the load on the NC vovserver. |
NetworkComputer | 7433 | | Added field XDURPERCENT for jobs, which can be used in preemption rules. |
NetworkComputer | 5579 | 20837 | An entry is now logged in the server log when a slave fails to send its required heartbeat to the server and enters a sick state. The server log also contains a message when the slave is healthy again. |
NetworkComputer | 7535 | 20833 | The proc VovGetRevokeDelay can now be added and customized by redefining it in vovresourced/config.tcl under the SWD directory to allow users to customize the revoke delay to be used in vovreconciled. This allows users to have the revoke delay from their job classes override the default value of RESD(revokeDelay). The proc definition has been added to the documentation. |
NetworkComputer | 7537 | 20834 | Provide the ability to specify slot count as an adjustment to the core count. The following capacity specification forms have been added: CORES+N, CORES-N, CORES*N, CORES/N. The word CORES is required, followed by a single-character operator, then a whole or decimal number. These new forms work in addition to the traditional numerical capacity setting and are supported in the following: vtk_slave_define -capacity XXXXX, vtk_slave_set_defaults -capacity XXXXX, vovslave -T XXXXX, vovslavemgr configure -capacity XXXXX, where XXXXX is a capacity specifier as described previously. Note: Capacity cannot be less than 0 slots, nor can it exceed 1000 slots. |
NetworkComputer | 7561 | | Improved debuggability of preemption rules with a DEBUG property. |
NetworkComputer | 7642 | | Slave grouping was added in 2016.09, and nc gui -slaves was showing groups when there were more than 20 slaves. This change restores the 2016.03 behavior: nc gui -slaves shows individual slaves. |
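
The CORES-relative capacity forms in entry 7537 can be illustrated with a small resolver. This is a sketch, not the vovslave parser: the function name is hypothetical, and truncating a fractional result to a whole slot count is an assumption, since the release note does not say how fractions are rounded.

```python
import re

# CORES, one single-character operator, then a whole or decimal number.
_CORES_SPEC = re.compile(r"^CORES([+\-*/])(\d+(?:\.\d+)?)$")


def resolve_capacity(spec, cores):
    """Turn a capacity specifier into a slot count for a host with `cores` cores."""
    match = _CORES_SPEC.match(spec)
    if match is None:
        value = float(spec)                    # traditional numerical capacity
    else:
        op, n = match.group(1), float(match.group(2))
        if op == "+":
            value = cores + n
        elif op == "-":
            value = cores - n
        elif op == "*":
            value = cores * n
        else:
            if n == 0:
                raise ValueError("division by zero in capacity spec")
            value = cores / n
    # Documented bounds: not less than 0 slots, not more than 1000 slots.
    if value < 0 or value > 1000:
        raise ValueError("capacity out of range: %r" % spec)
    return int(value)                          # assumption: truncate fractions
```

On an 8-core host, `resolve_capacity("CORES*1.5", 8)` yields 12 slots and `resolve_capacity("CORES-2", 8)` yields 6, while `CORES-10` is rejected because it would fall below 0.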
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 7328 | 20517 | The sets page now reloads after invalidating the set via the invalidate icon. |
All | 7456 | 20713 | The frequency of calls made by vovslave to the w command has been reduced to once per minute across all slaves for a given host. |
All | 7473 | | Fixed instances where symlink depth in Linux was limited to 5 or 8 in some cases. |
All | 7506 | 20517 | The sets page now reloads after invalidating the set via the invalidate icon. |
All | 7503 | 20635 | Fixed a possible memory corruption issue that occurred when defining equivalences. |
All | 7557 | | Enhanced sanity to create resources for any FairShare groups that do not have a corresponding resource, such as "Group:time_users". They can be deleted by doing vovforget -allresources, so this provides a mechanism to get them back. |
FlowTracer | 7459 | | vovconsole can now be resized to a smaller size. Previously the minimum size was set to an optimal size. |
FlowTracer | 7467 | 20481 | Allow vovfileready to work with paths that did not exist or symlinks that point to paths that did not exist at build time. |
FlowTracer | 7511 | | Fixed instances of random invalidation caused by the rejection of a good timestamp update. |
FlowTracer | 7003 | 20731 | Flush NFS directory cache when starting a job on a vovslave to prevent chdir failure. |
FlowTracer | 7463 | | Fixed bug in vsx output that would cause a job with a name to appear next to the command without a space in between. |
FlowTracer | 7611 | | Status color is properly updated on Navigator and Alert window after a row is removed. |
FlowTracer | 7659 | | Ignore bkill failure when removing slave object. |
LicenseAllocator | 7450 | 20230 | An issue was uncovered wherein significant memory bloat occurred resulting in large process size and gradual slowdown over a period of days or weeks. This problem has been fixed. |
LicenseAllocator | 7451 | 20691 | LA will now catch errors in stopping and forgetting old probes before creating new ones. It will wait for up to 60 seconds for the probes to stop, and if they still don't stop, it will raise an alert and not create new probes. |
LicenseAllocator | 7494 | | Reset allocations before beginning new distribution, instead of upon receiving a new NC sample. |
LicenseAllocator | 7265 | 20460 | LA will now convert fully qualified host names into short names before performing matching. |
LicenseMonitor | 7523 | 20803 | Fixed issue that caused the rlmstat parser to fail if the license server host was changed in the configuration for a live monitor. |
LicenseMonitor | 7565 | 20878 | It is now possible to choose whether or not to plot the average usage line on the usage-over-time graph of the Feature Detailed Plots page. |
LicenseMonitor | 7527 | | ftlm_agent on Linux and MacOS has been fixed to prevent an error trying to change into a non-existent LMSWD directory when attempting to execute ControlCenter jobs. |
LicenseMonitor | 7533 | | Fixed issue that caused tag renaming and site assignment to use an empty string value. |
NetworkComputer | 7632 | 21004 | Addressed issue with NC starting jobs from 2016.09u7 vovserver to 2015.03 clients. |
NetworkComputer | 7383 | 20576 | Start time, end time, and duration values are now validated in a call to vtk_slave_reserve to prevent values of 0 from being applied to a reservation. This prevents confusing reservation property entries in the /system/slaves/reservations FairShare group, if enabled. In 2016.09, this and the /system/slaves/messages FairShare groups/properties are disabled unless the corresponding configuration item is added to the policy.tcl file. |
NetworkComputer | 6042 | | Fixed balloon error in nc gui -slaves. |
NetworkComputer | 7461 | | Fixed formatting issue on system recovery setup web UI page for failover server candidates information. |
NetworkComputer | 5276 | 11584 | Improved life support mode for license-based resources when the connection to LM is interrupted. Life support is now activated when an HTTP update fails, in addition to when the event monitor is closed. External resource data, such as capacity and used-by-others numbers, will be held at the value last obtained from LM, and will be updated immediately upon reconnection to LM. |
NetworkComputer | 7566 | 20787 | Improved the performance and configurability of the nc list utility when listing by job names: Modified the -J option to not use a smart set. Added help text to clarify the impact of the -J option. Provided the ability for the administrator to disable -J usage. Added documentation for vnclist.config.tcl. |
NetworkComputer | 7562 | 20821 | Made some fixes to input and output declarations to address "Server is operating on a non-internal object" error. |
NetworkComputer | 7505 | 20774 | Added non-Admin visibility to NC fair share graphs showing running and queued job totals. |
NetworkComputer | 7528 | Fixed behavior where the preemption method was lost at some point after a server restart. | |
NetworkComputer | 7442 | The LSF emulation scripts, bjobs and bsub, were modified to allow bsub -J jobname to work reliably. Job names with embedded blanks are no longer allowed. | |
NetworkComputer | 7638 | 21054 | Fixed HW resource accounting issue that caused slaves to report higher-than-actual numbers when suspended jobs were stopped instead of being resumed. |
NetworkComputer | 7547 | 20475 | Added ability for the administrator to configure the maximum
environment size for job submissions. This is done via the
$VOVDIR/local/vncrun.config.tcl file,
using the following configuration variable:
The
value is specified in bytes. |
NetworkComputer | 7633 | 20983 | Added space between job name and command to correct format in NC. |
NetworkComputer | 7677 | The vovfsgroup create command is now more efficient when copying parent group ACLs. | |
NetworkComputer | 7678 | 21174 | Previously, if there was no vnc_logs directory and nc run commands were run with the -l log file.log option, the snapshot was saved to a file named "vnc_logs". Now, if there is no vnc_logs directory, the snapshot capturing module tries to create one. If it cannot create the directory, it saves the snapshot to an env$hashcode.env file instead of a vnc_logs file. |
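The bounded wait described for LicenseAllocator item 7451 above (wait up to 60 seconds for old probes to stop, otherwise raise an alert and skip creating new probes) can be pictured as a poll loop with a deadline. The following Python sketch is an illustration only: `wait_for_stop`, `is_stopped`, and the poll interval are hypothetical names and values, not part of LicenseAllocator.

```python
import time

def wait_for_stop(is_stopped, timeout_s=60.0, poll_s=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Return True if is_stopped() becomes true within timeout_s seconds.

    is_stopped is a caller-supplied callable (hypothetical here) that
    reports whether all old probes have stopped.
    """
    deadline = clock() + timeout_s
    while True:
        if is_stopped():
            return True
        if clock() >= deadline:
            return False  # caller should raise an alert, not create probes
        sleep(poll_s)

# Usage: the probes are already stopped, so the wait returns at once.
if wait_for_stop(lambda: True):
    print("old probes stopped; new probes may be created")
else:
    print("timeout; raise an alert and do not create new probes")
```

Checking the condition before the deadline means a probe that stops exactly at the timeout is still treated as stopped.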
2016.09 Update 8.1
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | 7755 | Improved the NC GUI responsiveness when 50k jobs are submitted. | |
NetworkComputer | 7547 | Added ability for the administrator to configure the maximum
environment size for job submissions. This is done via the
$VOVDIR/local/vncrun.config.tcl file, using
the following configuration
variable:
The value is specified in bytes. Note that the maximum property length limit, prop.maxStringSize, that can be defined in the SWD/policy.tcl file, can also act as a limit when using the snapshot property for job submissions (nc run -ep). |
|
NetworkComputer | 7715 | A limited-release capability, nc info -legacy,
is provided to generate nc info output in the
older format (prior to 2016.09). The behavior of this command can be
customized via a Tcl file at
.../local/vncinfo.config.tcl based on
environment variable, username, project name, etc. by doing 'set
NCINFO(legacy) 1' under the desired conditions.
Example
|
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
LicenseAllocator | 7751 | Added NRU job detection in LA, so this information can be used by vovreconciled to add the used tokens back to the job's grabbed tokens list. | |
NetworkComputer | 7383 | Start time, end time, and duration values are now validated in a call to vtk_slave_reserve to prevent values of 0 from being applied to a reservation. This prevents confusing reservation property entries in the /system/slaves/reservations FairShare group, if enabled. Slave reservation expirations and cancellations are now logged in the server log, even if the reservation expires while the slave is down. Slave properties, which store reservations for persistence, are now enabled by default for NetworkComputer instances. | |
NetworkComputer | 7455 | Documentation for live_keepfor_jobs.tcl (job persistency) has been updated and improved. | |
NetworkComputer | 7601 | Added ability to specify a replacement pattern and string for
host names obtained by LM parsers. This is configured by adding the
following line to the SWD/config/parser.cfg
file: where PATTERN is a case-sensitive regular
expression pattern, and REPLACEMENT is the string with which to
replace the pattern. |
|
NetworkComputer | 7741 | Fixed SNAPSHOT property expansion in web pages for cases where the environment was large enough to be split across multiple SNAPSHOT properties. |
2016.09 Update 8.1-1
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
LicenseMonitor | 7746 | 21218 | Fixed issue with GreenHills license parser for checkout records that have a handle that is greater than a 32-bit integer. Also added protection in the database loader to handle such records. |
NetworkComputer | 7715 | 21199 | A limited-release capability, nc info
-legacy, is provided to generate nc
info output in the older format (prior to 2016.09).
The behavior of this command can be customized via a Tcl file at
.../local/vncinfo.config.tcl based on
environment variable, username, project name, etc. by doing 'set
NCINFO(legacy) 1' under the desired conditions. Example:
Important: This legacy feature will not be available
in releases beyond 2016.09 (and updates).
|
NetworkComputer | 7594 | 20944 | Added -failover constraint option for
vovslavemgr operations. Also ensure that
failover slaves are the first to start when starting multiple
slaves. Note: These functionalities require that failover slaves be
defined in slaves.tcl using
vtk_slave_define with the
-failover option.
|
NetworkComputer | 7664 | 21289 | Fixed a case where nc stop -reason did not add the reason to Why. |
2016.09 Update 8.1-2
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | 7715 | 21199 | nc info compatibility enhancement backported from 2016.09u9. |
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | 7786 | Support FairShare groups that contain a dash in their name in the vovfsgroup loadconfig command. |
2016.09 Update 7
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
LicenseMonitor | 5871 | 20081 | LicenseMonitor's FlexLM monitor will now report the PID associated with a checkout, if that support is enabled in FlexLM. Also NetworkComputer will provide PID information to LicenseAllocator for matching. |
LicenseMonitor | 7432 | 20688 | Added pie slice limit control for detailed plot report. |
NetworkComputer | 5599 | Enabled ranking of FairShare groups using weighted sum of excess_running and excess_history. | |
NetworkComputer | 6022 | 13969 | Added option -q to nc forget. |
LicenseAllocator | 7332 | 20081 | LA can now perform matching between LM checkouts and NC jobs based on the UNIX PID information, whenever available. Such matches are clearly identified by the keyword "pid" in the matches page in LA. |
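Item 5599 above ranks FairShare groups by a weighted sum of excess_running and excess_history. As a rough sketch of that idea, the Python below computes such a score and sorts by it. The weights, the function name `rank_groups`, and the ordering direction (lowest score first) are assumptions made for illustration; the release note does not specify them.

```python
def rank_groups(groups, w_running=1.0, w_history=1.0):
    """Order group names by w_running*excess_running + w_history*excess_history.

    groups maps a group name to an (excess_running, excess_history) pair.
    Assumption: a lower score ranks first, so a group that is below its
    target share (negative excess) is placed ahead of one that is above it.
    """
    def score(item):
        _, (excess_running, excess_history) = item
        return w_running * excess_running + w_history * excess_history

    return [name for name, _ in sorted(groups.items(), key=score)]

# Hypothetical group names and excess values.
example = {"/a": (0.2, 0.1), "/b": (-0.3, 0.0), "/c": (0.0, -0.1)}
print(rank_groups(example))
```

Setting one weight to zero reduces the ranking to a single criterion, which makes it easy to compare the two inputs in isolation.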
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 6121 | Added a compound tab with an alert icon next to "Why", instead of automatically raising why tab on failed jobs and failed/invalid files. | |
All | 6745 | The path of the slave startup log is printed in every daily log, which is rotated at midnight. Symbolic links are created as server.log, subserve.log, slave.log, and startup.slave.log. These symbolic links point to the latest corresponding log files. vovcleanup rules can be customized by editing cleanup.config.tcl located in the swd (server working directory). | |
All | 6966 | If the daily log file of vovserver cannot be opened in swd/logs directory at startup time due to permission or other failures, the daily log is created in /tmp directory. | |
All | 7255 | 20389 | As of 2016.09 Update 7, the Trace object can be queried for
PARAMETERS, PARAMDUMP and individual trace parameters via
vovselect and all related vtk commands (e.g.
vtk_select_loop). The PARAMETERS field
will return a list of all trace parameters and their values,
separated by '=', for example:
The PARAMDUMP field
will return a list of parameters and their default and current
values, for example: The PARAM field
will retrieve a single parameter by name, similar to the way
PROP currently works. For example:
All 3 fields are
case-insensitive. |
All | 7377 | Fixed a bug that caused entry fields in the node editor to lose focus after double or triple clicking, making it impossible to backspace over the selected text. Now the focus is retained. | |
LicenseMonitor | 7230 | 20323 | Fixed vovsh -s clock offset to send vovserver the correct IP address instead of 0. The message "Found host with different ip: 10.132.26.30 instead of 0.0.0.0" observed when running "vovslavemgr start" has been eliminated. |
LicenseMonitor | 7318 | 20529 | Restored missing filter negation control in batch report web UI. |
LicenseMonitor | 7350 | 20589 | Fixed issue with feature filter select menu when feature aliases have been configured. |
LicenseMonitor | 7329 | 20458 | Increased vovnginxd web page processing timeout to allow long-running reports to finish. |
LicenseMonitor | 7351 | 20665 | Fixed BATCH_OPTIONS variable errors for pie chart reports in the batch reporting facility. |
LicenseMonitor | 7413 | Fixed issue with detecting database status when running within a Windows service. | |
LicenseMonitor | 7428 | Fixed issue with peak statistics showing 0 usage for the others object in the legend of the usage comparison plot. | |
NetworkComputer | 5308 | 20141 | The preemption facility now retries creating the resumer job for a preempted job until it succeeds. |
NetworkComputer | 7385 | 20591 | Modified LSF bsub emulation to allow multiple -R directives. |
NetworkComputer | 7219 | 20296 | Addressed instances where negative CPU progress was being reported. |
NetworkComputer | 7221 | 20299 | Added -cwd option to bsub LSF emulation utility. This option specifies the run directory for the job being submitted. |
NetworkComputer | 7231 | 20297 | Added support for -cwd option to specify the job execution directory and defaulted VOV_JOB_DESCRIPTION(rundir) to "." so that the variable is present when the NC policy is processed. Also fixed issue with -n option that requests multiple CPUs for a job, as well as added sensitivity to the "span[hosts=1]" resource string that constrains such jobs to a single host. |
NetworkComputer | 7262 | 20509 | The vtk_resourcemap_set command has a new
optional argument, -sum. Specify
-sum to indicate that a resource is a sum
resource, i.e. that it is composed of a boolean OR/AND
combination of other resources. For example, to create a
resource "License:a" that requires "License:b" or "License:c",
the command would be:
|
NetworkComputer | 7321 | 20493 | Fixed JavaScript error in IE 11 on slave resources page. |
NetworkComputer | 7410 | 20615 | Fixed instances where LSF bsub.config.tcl custom settings are overwritten. |
NetworkComputer | 7418 | Improved snapshot property processing to handle a malformed property value that is missing the continue sentinel for large environments. | |
LicenseAllocator | 7259 | 20398 | LicenseAllocator will now remove all allocations of a resource to a site if it has been marked as DO_NOT_SHARE. |
LicenseAllocator | 7265 | 20460 | LA will now convert fully qualified host names into short names before performing matching. |
LicenseAllocator | 7339 | 20564 | LicenseAllocator UI no longer has buttons for disallowed operations for users when logged in as a non-admin user. |
FlowTracer | 7341 | 20542 | Duration reporting behavior has been fixed for duration > 10000. |
FlowTracer | 7360 | 20551 | Corrected the "duration" field behavior in FlowTracer Node Editor. |
FlowTracer | 7395 | 20627 | Fixed vovconsole crash with READONLY permission. |
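Item 7265 above states that LA converts fully qualified host names into short names before matching. A minimal Python sketch of that kind of normalization follows; the helper names and the lowercasing step are assumptions for illustration and do not describe LicenseAllocator's actual code.

```python
def short_host(name):
    """Return the part of a host name before the first dot, lowercased.

    Lowercasing is an assumption of this sketch, not stated in the notes.
    """
    return name.split(".", 1)[0].lower()

def hosts_match(a, b):
    """Compare two host names after reducing both to their short form."""
    return short_host(a) == short_host(b)

# Hypothetical host names.
print(short_host("node17.example.com"))
print(hosts_match("node17.example.com", "NODE17"))
```

A helper like this should only be applied to host names: an IP address such as 10.132.26.30 would be truncated at its first dot.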
2016.09 Update 6
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
NetworkComputer | 7146 | 15496 | VOV_JOB_DESC(xdur) in FlowTracer bsub emulation. |
FlowTracer | 7254 | Check out separate FT license for each user in multi-user FT. |
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
LicenseMonitor | 7217 | ftlm_lmproject and SSL are not compatible. | |
LicenseMonitor | 7285 | ControlCenter issues. | |
LicenseMonitor | 7303 | Fixed issue with LM ControlCenter license file deployment jobs that run on a remote machine that does not have access to the LM SWD. | |
LicenseMonitor | 7102 | Fixed issue with ftlm_batch_report when using the -breakdownByFeature option and there is a feature in the database that has an empty tag name. This is a rare corner case, since empty tag names are not allowed in normal configuration and operation. | |
NetworkComputer | 7219 | 20296 | Negative percentages in job progress. |
NetworkComputer | 7247 | 20363 | nc gui -slaves does not work. |
NetworkComputer | 7260, 7261 | 20338 | NC bad CPU utilization; NC negative CPU progress. |
LicenseAllocator | 7040 | 15647 | LA possible 201609 jobs_la license issue. |
LicenseAllocator | 7265 | Host name discovery broken in licensing matching. | |
FlowTracer | 7225 | Statistics View should not be editable. |
Known Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
LicenseAllocator | If there is a syntax error in the LA::AddResource command in config.tcl, it can cause the vovlad daemon to exit. In this case, please recheck the syntax of the LA::AddResource command and restart the vovlad daemon. |
2016.09 Update 5
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
LicenseMonitor | 5871 | 20081 | LicenseMonitor's FLEXlm monitor will now report the process ID (PID) associated with a checkout, if that support is enabled in the specific vendor's FLEXlm license server. |
LicenseMonitor | 7025 | 20259 | Added ability to specify custom pie chart dimensions in batch reporting utility. This can be used to increase the width, so that long labels in the legend will not overlap the pie. |
NetworkComputer | 7167 | As of 2016.09 update 5, cgroups support has changed. Individual cgroups are no longer displayed as extra resources by vovslavemgr or the web UI. Pre-existing cgroups can no longer be individually requested via nc run -r. CGROUP:RAM is still available as an extra resource, and its behavior is unchanged. | |
LicenseAllocator | 6499 | Historical metrics can be loaded upon server restart by adding the command reloadHistoricalMetrics to LA's config.tcl. The command requires a parameter specifying a duration (Vov time specification) for which to load the metrics. The duration is the elapsed time going back from "now" to the desired event. |
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
LicenseMonitor | 6114 | 20002 | Support has been added for multiple wildcard patterns in the wildcard input of the LM filters box. You can now specify a space-separated list of wildcard patterns to match, such as "A* B* C?". |
LicenseMonitor | 7209 | For home page utilization and wait status widgets, if user and/or wait analysis takes longer than 10s, abort the analysis and return. This can occur when a large number of checkouts are being tracked. | |
LicenseMonitor | 7195 | 20256 | Prevent overlap of pie slice labels for skinny slices by disabling labels for slices that represent less than 3% of the total pie. |
LicenseMonitor | 7190 | 20234 | Added ability to relocate the utilization plot legend to the right-hand (east) side of the plot to allow for a lower custom height when using ftlm_batch_report. The new option is: -utilPlotLegendLocation. Possible values are s and e, with s (south) being the default. |
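Item 6114 above allows a space-separated list of wildcard patterns such as "A* B* C?". The Python sketch below shows one way such a list can be evaluated, using the standard `fnmatch` module. It is an illustration only: the function name `matches_any` and the case-sensitive comparison are assumptions, and LicenseMonitor's own matching rules may differ.

```python
from fnmatch import fnmatchcase

def matches_any(name, patterns):
    """True if name matches at least one space-separated wildcard pattern.

    "*" matches any run of characters and "?" matches exactly one.
    """
    return any(fnmatchcase(name, pattern) for pattern in patterns.split())

# Hypothetical feature names tested against the pattern list from the note.
for feature in ("Abaqus", "C1", "C12", "Delta"):
    print(feature, matches_any(feature, "A* B* C?"))
```

Here "C12" does not match because "C?" requires exactly one character after the C.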
2016.09 Update 4
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
LicenseMonitor | 7194 | 20235 | Added ability to extract images from batch report HTML files. |
FlowTracer | 7187 | Full support for single-user mode (multiuser FT project) in vovlsfd. |
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 7162 | Accumulation job fields are not reset on job start. | |
All | 7114 | 2016.09u3_54562_Dec15: installation report broken pipe error in win64. | |
NetworkComputer | 7185 | 20226 | NC preemption does not work when "-waitingfor" in a rule has the following syntax: License:name#N. |
NetworkComputer | 7121 | Fully implemented matching disable threshold. |
2016.09 Update 3
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 7085 | Previously, accessing the READONLY port of vovserver caused the vovserver to hang. This issue has been fixed. | |
All | 7059 | The env-var VOV_LICENSE_KEY can be used to designate a non-standard location for the Runtime license file. (This applies to installing Runtime products.) | |
LicenseMonitor | 7077 | 20010 | The feature selection of LicenseMonitor report filters is now remembered correctly when navigating between report types. |
LicenseMonitor | 7089 | 20025 | Fixed the issue that caused tag-based notification POCs to be compounded, which resulted in unwanted notification mails going to the POCs. |
2016.09 Update 2
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 6261 | 14562 | Project rehosting is now allowed at start time. New options,
-rehost and -block, now
separate the rehosting functionality from the blocking behavior. The
following envvars have been renamed, which makes them scheduler
agnostic (this feature is not backwards compatible):
VOV_BSUB_VOVCONSOLE has been renamed VOVCONSOLE_SUBMIT_CMD
VOV_BSUB_VOVSERVER has been renamed VOVPROJECT_SUBMIT_CMD. |
All | 7017 | PostgreSQL 9.6 is now included as the VOV database engine for new instances and as an optional upgrade for existing instances. PostgreSQL 9.6 improves performance for historical reports in LicenseMonitor. | |
LicenseMonitor | 7016 | Information specified by the combination of the license tag and feature name can now be excluded from being loaded into the VOV database. This prevents overloading the database with unnecessary information. | |
FlowTracer | 7012 | The new API vtk_transition_chown_to_me supports multiuser FT
projects. This API allows a user to "take" a job. It is only enabled
when the seat_ft_mu_* feature is used. |
|
FlowTracer | 7013 | The new option -u for vovlsfd
supports multiuser FlowTracer projects. |
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 7050 | 15698 | Fixed an error that caused corruption in LicenseMonitor checkout and denial data files. |
All | 7064 | vovusergroup -unix now iterates over all
entries in the UNIX groups database. Previously, only the users
from the first entry of the given group were populated. |
|
LicenseAllocator | 6994 | Fixed custom-defined resource map expressions. A resource in a site is no longer overwritten by the automatic recomputation of the resource map expression that occurs when resource weights change. | |
LicenseAllocator | 7071 | LicenseAllocator correctly detects vendor queued tokens. Previously, vendor-queued data was lost when the same feature is served by multiple license servers. | |
LicenseAllocator | 7075 | Overbooking boost values are cleared when overbooking is turned OFF. When overbooking is turned OFF, matching is not automatically turned ON. On the resources page, the matching button now shows ON/OFF instead of ENABLE/DISABLE. |
2016.09 Update 1
New Features and Enhancements
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 6918 | 15448 | The initial vovslave connection timeout for elastic vovslaves can now be configured via the config.tcl file. |
FlowTracer | 7012 | A new API, vtk_transition_chown_to_me supports
multiuser FlowTracer projects: a user can "take" a job.
vtk_transition_chown_to_me is only enabled when
a seat_ft_mu_* feature is used. |
|
FlowTracer | 7013 | The option -u was added for
vovlsfd to support multiuser FlowTracer
projects. |
Resolved Issues
Products | Internal Number | Case Number | Description |
---|---|---|---|
All | 6926 | vovset list -all has been fixed; all objects
are displayed. |
|
All | 7014 | The ACL mechanism now includes a new ACL role type called LEADER. Default ACL for LEADER role for jobs is VIEW, EXISTS, CHOWN. | |
All | 6968 | 15252 | With large token capacities, checkouts appeared to use a very large number of tokens. The C/O Stats page sums (duration * tokens) for each feature, and for such a feature the sum is much larger than a 32-bit integer can hold. Added a ::bigint cast to the query in the checkout stats report. |
NetworkComputer | 5699 | 14154 | fairshare.cgi takes fstokens into account
for visualization. |
NetworkComputer | 6720 | The in-command documentation for the nc run
-keepfor option has been clarified: nc run
-keepfor <time> : Disables auto-forget. Requires
the NetworkComputer administrator to copy the
live_keepfor_jobs.tcl script into the tasks directory. |
|
NetworkComputer | 6821 | 15317 | Indirect slaves no longer crash during failover. |
NetworkComputer | 6930 | Prevent the generation of an error concerning LiveRecorder not being enabled when shutting down vovserver or vovslave. For vovserver, this also resulted in an additional log file, generated at shutdown, that contained the error. | |
LicenseAllocator | 6767 | 15265 | Resource tokens are now counted correctly. Previously, resource tokens were double counted when license servers (FlexLM servers) were restarted. |
LicenseAllocator | 6684 | In grid mode, nodes are now visible at any level. Previously, with many nodes displayed in grid mode (about 40k), when the nodes were drawn very small, the files appeared to vanish. | |
LicenseAllocator | 6912 | A crash no longer occurs when using certain features when the vovserver version is older than 2016.09, but the vovsh version is 2016.09. The features that previously caused crashes: vsy, vovwhy, nc why, wx why, or the "Why" tab in vovconsole. | |
LicenseAllocator | 6913 | Sorting results and the status column color are now correct. Previously, the sorting and colors were incorrect when there were more rows than were visible in the table for inputs or outputs. | |
LicenseAllocator | 6964 | Floating Slave Monitor menu is now available for "group" in Slave Monitor. |
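Item 6968 above concerns a sum of (duration * tokens) that exceeds the range of a 32-bit integer. The arithmetic can be checked with a short Python sketch. The checkout values below are invented for illustration, and the function names are hypothetical; the sketch only demonstrates why a wider integer type is needed for such a sum.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1  # 2,147,483,647

def token_seconds(checkouts):
    """Sum duration * tokens over (duration_seconds, tokens) pairs."""
    return sum(duration * tokens for duration, tokens in checkouts)

def fits_int32(value):
    """True if value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX

# One day (86,400 s) of a hypothetical checkout holding 50,000 tokens.
total = token_seconds([(86400, 50000)])
print(total, fits_int32(total))
```

Python integers have arbitrary precision, so the sum itself is exact here; the check only shows that the result would not fit in a signed 32-bit value.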