.. _20230602_2_4_0_release:

Release Notes (v2.4.0)
======================

`June 2, 2023`

The v2.4.0 release includes major new features along with numerous fixes and improvements.

- User-defined tags
- Walltime limits on tasks
- Client timeouts for automatic shutdown
- Autoscaling cluster (fixed and dynamic)
- New logging style
- NO_COLOR environment variable support

-----

Features
--------

|

Tags
^^^^

We've added custom tag support for tasks. This feature allows users to organize both
individual tasks and groups of tasks in new and interesting ways.

Tags are stored with both a `key` and a `value`, e.g., ``site:xy``. When specifying one or more
tags to the group ``submit`` command or the individual ``task submit`` command, omitting
the `value` stores it as blank (all tag values are accepted as plain strings, never as numbers).

When querying for tasks by tag, omitting the value will match any task with that tag
regardless of value. Including the value will target only tasks that have the tag with
that particular value.

For example, we can submit two sets of tasks with tags ``x`` and ``y``.

.. admonition:: Submit group of tasks with custom tag
    :class: note

    .. code-block:: shell

        seq 2 | hs submit -t 'echo {}' --tag x:0

.. admonition:: Submit individual tasks with alternate tag
    :class: note

    .. code-block:: shell

        hs task submit -t y:0 -- echo foo
        hs task submit -t y:1 -- echo bar

Now we can search for tasks based on these tags.

.. admonition:: Search by tag
    :class: note

    .. code-block:: shell

        hs task search --count

    .. details:: Output

        .. code-block:: none

            4

    .. code-block:: shell

        hs task search --count --tag x

    .. details:: Output

        .. code-block:: none

            2

    .. code-block:: shell

        hs task search --count --tag x:1

    .. details:: Output

        .. code-block:: none

            0

    .. code-block:: shell

        hs task search -t y:0

    .. details:: Output

        .. code-block:: none

            ---
            id: befbb239-8d91-42dc-b1bc-2170b61a7a50
            command: echo foo (echo foo)
            exit_status: 0
            submitted: 2023-06-02 10:56:29.847831
            scheduled: 2023-06-02 10:56:44.651948
            started: 2023-06-02 10:56:44.673857 (waited: 0:00:14)
            completed: 2023-06-02 10:56:44.678160 (duration: null)
            submit_host: macbook.local (5b676c02-80b0-4d07-9d2f-07367bbd23d1)
            server_host: macbook.local (a6f8b3fa-f635-428f-bd93-96fbbeebb5b3)
            client_host: macbook.local (a6f8b3fa-f635-428f-bd93-96fbbeebb5b3)
            attempt: 1
            retried: false
            outpath: /Users/me/.hypershell/lib/task/befbb239-8d91-42dc-b1bc-2170b61a7a50.out
            errpath: /Users/me/.hypershell/lib/task/befbb239-8d91-42dc-b1bc-2170b61a7a50.err
            previous_id: null
            next_id: null
            tags: y:0

Updating a tag on an existing task works like updating other fields, except that it is additive.
Specifying a new tag does not remove previous tags.

.. admonition:: Update tags on existing task
    :class: note

    .. code-block:: shell

        hs task update befbb239-8d91-42dc-b1bc-2170b61a7a50 tag mark:false

    .. code-block:: shell

        hs task search -t y:0

    .. details:: Output

        .. code-block:: none

            ---
            id: befbb239-8d91-42dc-b1bc-2170b61a7a50
            command: echo foo (echo foo)
            exit_status: 0
            submitted: 2023-06-02 10:56:29.847831
            scheduled: 2023-06-02 10:56:44.651948
            started: 2023-06-02 10:56:44.673857 (waited: 0:00:14)
            completed: 2023-06-02 10:56:44.678160 (duration: null)
            submit_host: macbook.local (5b676c02-80b0-4d07-9d2f-07367bbd23d1)
            server_host: macbook.local (a6f8b3fa-f635-428f-bd93-96fbbeebb5b3)
            client_host: macbook.local (a6f8b3fa-f635-428f-bd93-96fbbeebb5b3)
            attempt: 1
            retried: false
            outpath: /Users/me/.hypershell/lib/task/befbb239-8d91-42dc-b1bc-2170b61a7a50.out
            errpath: /Users/me/.hypershell/lib/task/befbb239-8d91-42dc-b1bc-2170b61a7a50.err
            previous_id: null
            next_id: null
            tags: y:0 mark:false

|

Task timeout
^^^^^^^^^^^^

Previously, unless the program being executed had a built-in timeout feature,
there was no way to preempt a task.
Once a task began execution, we would wait indefinitely for it to complete.
The new task-level timeout feature now provides this functionality.

If not specified via configuration file (``task.timeout``), environment variable
(``HYPERSHELL_TASK_TIMEOUT``), or command-line argument (``-W``, ``--task-timeout``),
the default behavior is still to wait indefinitely. If given, after the specified number
of seconds has elapsed, a signal is sent to the running program. ``SIGINT``, ``SIGTERM``,
and ``SIGKILL`` are sent in escalating order, with a brief wait in between, until the
program halts. If the program has still not halted (some programs are pathological),
the task executor thread itself will halt.

|

Client timeout
^^^^^^^^^^^^^^

In anticipation of the new `autoscaling` feature, we've implemented a client-level timeout.
Many deployment scenarios launch clients on some sort of timer or trigger, but would
require a hard termination of the client instances.

Previously we implemented a robust mechanism for recovering tasks from evicted clients;
however, this is not a graceful or preferred means to intentionally scale down.
It interrupts tasks and requires a waiting period for the server to evict the client.

Instead, you can now specify a client-level timeout via configuration (``client.timeout``),
environment variable (``HYPERSHELL_CLIENT_TIMEOUT``), or command-line option
(``-T``, ``--timeout``). If not given, the client will persist indefinitely, as before.
If given, the client will shut down gracefully and send a disconnect signal to the server
after the specified period in seconds has elapsed without any new task bundles arriving.

|

Autoscaling
^^^^^^^^^^^

We have added another `mode` to the ``hs cluster`` command, enabled with the ``--autoscaling`` option.
This mode combines some behavioral ideas of all three of the previous modes.
The use of ``--launcher`` previously implied a single subprocess responsible for bringing up
all client instances (like an ``mpirun``). This is in contrast to the ``--ssh`` mode, which
brought up a distinct subprocess for each of the included hosts. The ``--autoscaling`` mode
incorporates a new local thread that dynamically brings up new clients with the ``--launcher``
as a prefix.

There are two scaling *policies*, ``fixed`` and ``dynamic``. In both cases, there is an
*initial size*, *minimum size*, and *maximum size* for the cluster.

The ``fixed`` policy is simple: we launch the initial number of clients, and if or when the
*minimum* size is reached, we add a client.

The ``dynamic`` policy incorporates a scaling ``factor``, a dimensionless quantity that
expresses some multiple of the average task duration in seconds. When the expected time to
completion of all currently submitted tasks, given the currently running clients, exceeds
this period, a new client is launched.

See the detailed description under ``--autoscaling`` for the :ref:`command-line ` interface.

|

New logging style
^^^^^^^^^^^^^^^^^

With the previous release we expanded the set of attributes available for use within
logging messages, like elapsed time instead of absolute time, and shortened versions of
the hostname and module. We've now incorporated these in a new predefined logging *style*.

To enable this new style, just set it in your configuration.

.. admonition:: Configure logging style
    :class: note

    .. code-block:: shell

        hs config set logging.style detailed-compact --user

|

Proper support for ``NO_COLOR`` environment variable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This release adds proper support for the `NO_COLOR `_ convention.
We previously looked for ``HYPERSHELL_NO_COLOR``; however, it is better that this option
not be specific to this software and instead respect the more general configuration.

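
As an illustration of the convention itself, the following is a minimal sketch in Python.
It assumes the common statement of the convention, that color is disabled whenever
``NO_COLOR`` is present and not empty, regardless of its value. The ``color_enabled``
helper is hypothetical and is not HyperShell's actual implementation.

.. code-block:: python

    import os
    from typing import Mapping, Optional


    def color_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
        """Hypothetical helper: False when NO_COLOR is set to any non-empty value."""
        env = os.environ if environ is None else environ
        # A missing variable or an empty string leaves color output enabled.
        return not env.get("NO_COLOR")


    print(color_enabled({}))                  # True
    print(color_enabled({"NO_COLOR": "1"}))   # False
    print(color_enabled({"NO_COLOR": ""}))    # True

Passing the environment as an argument keeps the check easy to exercise without
modifying the real process environment.
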

|

-----

Fixes
-----

|

Issue `#18 `_
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Incorrect type inferred for task search filters.

When using the ``task search`` command with ``-w``, ``--where`` filters, the type of the
value is inferred 'smartly' to make values simple to use. For example, ``exit_status == null``
and ``exit_status == 1`` yield a Python ``None`` and an integer ``1`` instead of their
string counterparts.

This works as expected until you have a field that expects a string and happens to hold
values that could be coerced to integers. As a minimal example:

.. admonition:: Run cluster with integer-like command arguments
    :class: note

    .. code-block:: shell

        seq -w 100 | hs cluster -N2 -t 'echo {}'

The task ``args`` are ``001``, ``002``, etc. But now you cannot issue a command like the
following and achieve the expected results:

.. admonition:: Search for task by integer-like argument
    :class: note

    .. code-block:: shell

        hs task search -w args==001
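
The pitfall can be reproduced in isolation. The following is a minimal sketch in Python,
assuming a hypothetical ``smart_coerce`` function that mimics the kind of inference
described above; HyperShell's actual inference logic is not shown in these notes and may
differ. The point is only that an integer-like string such as ``001`` becomes the
integer ``1``, which no longer equals the stored string.

.. code-block:: python

    from typing import Union

    Value = Union[None, bool, int, float, str]


    def smart_coerce(text: str) -> Value:
        """Hypothetical 'smart' inference of a filter value from its text."""
        lowered = text.lower()
        if lowered in ("null", "none"):
            return None
        if lowered in ("true", "false"):
            return lowered == "true"
        # Try numeric types first; fall back to the original string.
        for cast in (int, float):
            try:
                return cast(text)
            except ValueError:
                continue
        return text


    stored_args = "001"  # the task's args field is stored as a string

    print(repr(smart_coerce("null")))          # None
    print(repr(smart_coerce("1")))             # 1
    print(repr(smart_coerce("001")))           # 1
    print(smart_coerce("001") == stored_args)  # False
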