2019.01 Update 4 Release
New Features and Enhancements
The following new features and enhancements were introduced in this software release:
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-10969 | | The command vovclientmgr show no longer incorrectly labels clients without "nicknames" as HTTP clients. In addition, the nc command now properly sets a client nickname in all scenarios to allow it to be more easily identified in the output of both vovclientmgr show and vovshow -clients. |
All | VOV-10864 | | Added a new trace parameter, enterpriselicense.burst, to enable burst licensing for NC/WX in Auto mode (1=enabled, 0=disabled; defaults to disabled). Cleaned up various presentation errors in the web UI license page when switching modes, and added more details on current usage/availability for all modes. Disabled the Full and N choices for the licensing mode in the web UI for non-NC/WX servers. |
Accelerator Plus | VOV-9520 | 23779 | wxmgr stop -freeze will now force shutdown of WXLauncher if it does not complete a graceful shutdown within 60s to support upgrade operations which require WXLauncher to restart. |
Accelerator Plus | VOV-10444 | | The behavior of crash recovery timing has changed. In previous updates, a single server parameter, crashRecoveryPeriod, dictated the crash recovery period: crash recovery completed after crashRecoveryPeriod and the server began normal operation. Three changes were made: (1) At any stage, if all jobs are recovered, crash recovery will end. (2) If a vovslave reconnects during a 'quiet time' before crash recovery ends, the crash recovery deadline will be extended by this 'quiet time'. The quiet time is specified by the crashRecoveryQuietTime server parameter. (3) The crashRecoveryMaxExtension server parameter specifies an upper limit on the amount by which the deadline is extended. The parameters can be set in policy.tcl. The ranges and default values are as follows: VovServerConfig crashRecoveryPeriod 60 (min 30s, max 1800s); VovServerConfig crashRecoveryQuietTime 30 (min 0s, max 300s); VovServerConfig crashRecoveryMaxExtension 60 (min 0s, max 1800s). If desired, the original crash recovery behavior can be restored by setting the crashRecoveryMaxExtension parameter to zero. Appropriate settings for these parameters will depend on the particular site configuration and needs. |
Allocator | VOV-10360 | 24466 | Improved performance of NRU matching. Also, changed the NRU bailout message to clarify the numbers in the message. |
Allocator | VOV-9411 | 23688 | Added support for hierarchical Altair Allocators. This is an experimental feature. Please contact support for details. |
Accelerator | VOV-10520 | | Accelerator will now support the use of burst licenses. If the license file is provisioned with nc_slots_burst type licenses, Accelerator will allocate licenses first out of the base nc_slots licenses, and then allocate additional slots as required from the nc_slots_burst license pool. |
Accelerator | VOV-4900 | 21070 | TIMEVAR time slot specifications are expanded to allow the second time item in the range HH:MM-HH:MM to be a prior time, as is the case when spanning an overnight time. For example, 6 PM to 6 AM may now be specified using a 24 hour clock range as follows: 18:00-6:00. |
Accelerator | VOV-8188 | 21251 | Previously, when vovgetgroups timed out or did not return groups info correctly, the job would run with the incorrect groups. Following this change, under those conditions, the job will fail. Also previously, the VOV_ALARM timeout for vovgetgroups was limited to not exceed 60 seconds. The 60 second limit has been eliminated. |
Accelerator | VOV-10812 | | The show/hide cgroups link on the Slave Resources web UI page is no longer required to show the CGROUP:RAM slave resource, and therefore has been removed. |
Accelerator | VOV-9555 | | Modified vovslave log messages to be clearer and more actionable. |
Accelerator | VOV-10905 | 24801 | nc why output for DP jobs will omit the confusing internal DP:SLOTS_N resource and show subjob IDs and statuses. |
Accelerator | VOV-5294 | | A new capability to improve job RAM and CPUTIME accounting for jobs with detached processes is implemented on Linux systems. In addition to collecting PIDs that share a PGID or are within the process tree for a job, various types of detached processes are found if they are in the same Session ID or if the VOV_JOBID environment variable matches the values for the running job. |
Accelerator | VOV-10795 | | Support for hourly charging in the cloud has been implemented. If a slave is started with the environment variable VOV_INSTANCE_LAUNCH_TS set to the launch time of the instance on which the slave is running, then the slave will be kept alive until within a few minutes of the hour boundary. This is an unsupported feature. |
Allocator | VOV-9737 | 24038 | The VOV_LICMON environment variable now supports a comma-separated list of hosts rather than a single host. |
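
The crash recovery timing described for VOV-10444 in the table above can be illustrated with a small simulation. This is only a sketch of one reading of the release note, not the vovserver implementation: the class and function names are hypothetical, and it assumes that "during a quiet time" means a vovslave reconnects within crashRecoveryQuietTime seconds of the current deadline, and that crashRecoveryMaxExtension caps the total extension beyond crashRecoveryPeriod.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class CrashRecoveryConfig:
    """Hypothetical container; defaults mirror the documented values (seconds)."""
    period: float = 60          # crashRecoveryPeriod
    quiet_time: float = 30      # crashRecoveryQuietTime
    max_extension: float = 60   # crashRecoveryMaxExtension


def recovery_end_time(cfg: CrashRecoveryConfig,
                      reconnect_times: Iterable[float],
                      all_recovered_at: Optional[float] = None) -> float:
    """Return the time (seconds after server restart) at which recovery ends.

    reconnect_times: times at which vovslaves reconnect.
    all_recovered_at: time at which every job has been recovered, if known.
    """
    deadline = cfg.period
    latest_allowed = cfg.period + cfg.max_extension  # cap on the extended deadline

    for t in sorted(reconnect_times):
        if t >= deadline:
            break  # recovery has already ended; later reconnects change nothing
        # ASSUMPTION: a reconnect counts as "during the quiet time" when it
        # happens within quiet_time seconds of the current deadline.
        if deadline - t <= cfg.quiet_time:
            deadline = min(deadline + cfg.quiet_time, latest_allowed)

    # Change (1): recovery ends early once all jobs are recovered.
    if all_recovered_at is not None:
        return min(all_recovered_at, deadline)
    return deadline


if __name__ == "__main__":
    defaults = CrashRecoveryConfig()
    print(recovery_end_time(defaults, [40]))            # one late reconnect
    print(recovery_end_time(defaults, [40, 70, 95]))    # repeated extensions hit the cap
    print(recovery_end_time(CrashRecoveryConfig(max_extension=0), [40]))
```

Under these assumptions, setting max_extension to zero makes the deadline equal to crashRecoveryPeriod regardless of reconnects, which matches the note that a zero crashRecoveryMaxExtension restores the original deadline behavior.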
Resolved Issues
Product | Internal Number | Case Number | Description |
---|---|---|---|
All | VOV-10999 | | Fixed issue that prevented the show all rows link from working on the buckets web UI page. Previously, using this link would result in an empty table as opposed to showing all available rows. |
All | VOV-10350 | | All installers/SFDs now reject installation paths that contain spaces. |
All | VOV-10427 | 24510 | This ticket addressed three issues that affected crash recovery. The first was a race condition that occurred when a vovslave connected to a restarted server. If the vovslave license authorization happened to be checked during a very small interval, the result was that the vovslave was destroyed. The second was that 'hog protection' was inadvertently applied to vovslaves during crash recovery, with the result that reconnection of vovslaves after a server restart could be delayed until the crash recovery period had ended. (This compounded the first issue during crash recovery.) The third issue was cosmetic and resulted in a Tcl stack trace if the vovserver took too long to respond while restarting. The database query (vtk_select_loop) parameters were adjusted to lengthen the response period. Also see the release notes for VOV-10444 for pertinent crash recovery parameters. |
All | VOV-10221 | 24265, 24417 | vovserver failover recovery has been enhanced to attempt recovery on all configured server candidates. |
All | VOV-9902 | | Prevent vovserver and child processes from exiting when Ctrl-C is pressed in the Windows command prompt from which the server was started. |
All | VOV-11126 | 24961 | The description of the RAMUSED slave resource was updated for better clarity on usage. |
Accelerator Plus | VOV-10862 | | Behavioral change: remaining slaves in base queues that have been removed will not be filtered from wait reasons. |
Accelerator Plus | VOV-10557 | 24557 | Jobs running via Accelerator Plus will now have the same Linux priority ("nice level") as jobs running directly on Accelerator for the same Accelerator/Accelerator Plus designated execution priority. Use nc/wx run -p <scheduling priority>.<execution priority> ... to set the execution priority. |
Accelerator Plus | VOV-10117 | 24291 | Fixed race condition when a job arrives while a slave is shutting down due to exceeding its maxIdle setting. The job will now be rescheduled instead of failing. |
Accelerator Plus | VOV-10273 | | If the server configuration parameter failover.usefailoverslavegrouponly is set (default 0), then only failover slaves participate in server election. By default all slaves participate, which may cause excessive file traffic with many slaves (a problem particularly exacerbated by Accelerator Plus). The server election 'voting' period in seconds can be overridden by the server configuration parameter failover.maxdelaytovote (default 120). |
Accelerator Plus | VOV-10033 | 24060 | Jobs using shared memory should no longer see incorrect RAM usage spikes when child processes terminate. |
Accelerator Plus | VOV-10705 | 24632 | Fixed bug that masked the number of queued slave requests when Accelerator Plus was calculating how many more slaves to request and under some conditions resulted in more slaves requested than there were jobs in the bucket. Also fixed the use of quota with slave launching via arrays so that the array parameter correctly applies the quota. |
Accelerator Plus | VOV-11113 | | vovwxd will now log the time for a service loop at log level 3. The time of the latest loop will be updated in the property WXLoopTime. |
Accelerator Plus | VOV-11112 | | The WX_BUCKET_SERVICE_TS property will be updated more frequently to show activity on heavily loaded Accelerator Plus queues. |
FlowTracer | VOV-10990 | | Fixed an issue that could cause vovlsfd to fail due to errors updating reservations. |
FlowTracer | VOV-7913 | 21527 | Fixed issue that caused the login link to be shown even after logging into the web UI for users possessing the READONLY security level. |
FlowTracer | VOV-10654 | 24609 | Fixed unflattening of sets (Unflatten Sets in the context menu) that were flattened recursively in vovconsole. |
FlowTracer | VOV-10114 | 24290 | Job (transition) may now have "Failed to get user" error code. |
Accelerator | VOV-10833 | 24749 | Fixed an issue where vovreconciled did not revoke a component resource according to the revocation delay set for the summary resource when the component resource's own revocation delay is not set. |
Accelerator | VOV-9123 | 23250 | Issuing "stop" from the Accelerator web interface now sends the correct exit signals. |
Accelerator | VOV-9194 | 24593, 24922 | Interactive jobs now use the fully-qualified domain name, if available, of the submission host to ensure the execution host can find and connect to the submission host. |
Accelerator | VOV-9557 | 23850 | Added a new vovslavemgr stop -sick <TIMESPEC> function that can be used to forget slaves that are older than the specified timespec-based threshold. |
Accelerator | VOV-10028 | 24238 |
|
Accelerator | VOV-7862 | 21237 | Corrected some edge cases in the job CPU utilization graph in which phantom CPU usage spikes were being seen. |
Accelerator | VOV-11046 | 24897 | Fixed issue where the vovslave would continuously log "Killing subslave with pid = <pid>" leading to eventual exhaustion of disk space. This fix requires a restart of all vovslaves. |
Accelerator | VOV-10393 | | Slaves now start automatically on Windows. |
Accelerator | VOV-9794 | | Fixed issue where the terminal appears to freeze when the output of an interactive Accelerator job (NC -Ir) is piped to tee (tee, for example, would report "tee:write error"), cat, etc. |
Accelerator | VOV-10108 | 24207 | A new API, containerHooksRunDir, is available to specify the location in which to run container hook scripts. The requested job running directory will be passed to the Enter hook script as env(VOV_CONTAINER_JOB_RUNDIR). Please see sample files at /etc/config/containers. |
Accelerator | VOV-10341 | | Added a description of the RESV_<license> job property, which is a counter of how many times <license> has been revoked by the vovreconciled daemon (if configured). |
Monitor | VOV-10908 | 24235 | When monitoring a remote instance of Monitor, ensure that a remote dropped feature is detected and results in the deletion of the local feature. This allows the local feature's capacity to be set to 0 upon the next capacity snapshot. |
Monitor | VOV-7082 | 20029 | Parsers for MathLM, LMX HASP enhanced. Green Hills error for large IDs fixed. |
Monitor | VOV-11062 | 24869 | Improved help for vovslavemgr config setenv to instruct Windows users to quote the "name=value" parameter. |