.. _20260705_2_8_0_release:

Release Notes (v2.8.0)
======================

`July 5, 2026`

The v2.8.0 release includes major features and improvements.

- Built-in TLS encryption (enabled by default)
- Resource-aware task scheduling
- Resource monitoring
- Task groups for dependency management
- Queue-only task submission
- Rate limiting task execution
- File-based logging
- Functional (Python) API
- Bash and Zsh shell completions
- Python 3.11–3.14 support (PostgreSQL via psycopg v3)
- Major bug fixes and improvements

-----

Features
--------

|

Secure queue transport (TLS enabled by default)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The most consequential change in this release is that queue traffic between the server and
its clients is now encrypted with TLS **by default**. In previous releases the distributed
queue communicated in the clear, protected only by a shared authentication key; HyperShell
now wraps that same queue in a TLS channel with no configuration required.

.. note::

   On single-host clusters (``hs cluster`` / ``hsx``) and on any deployment sharing a
   filesystem this is completely transparent: there are no certificates to generate and
   nothing to change.

**Automatic self-signed certificates**

The first time a server starts with TLS enabled it generates a self-signed certificate and
private key (RSA-3072, SHA-256, ten-year validity) under the site TLS directory:
``server.crt`` and ``server.key`` in ``/lib/tls`` (``~/.hypershell/lib/tls`` on Linux).
The private key is written owner-only (``0600``). These materials are reused on subsequent
starts, so generation happens exactly once.

**Command-line Options**

The ``--no-tls``, ``--tls-cert``, ``--tls-key``, and ``--tls-ca`` options are available on
the ``server``, ``client``, ``submit``, and ``cluster`` commands:

.. list-table:: TLS Options
   :header-rows: 1
   :widths: 20 60 20

   * - Option
     - Purpose
     - Config Source
   * - ``--no-tls``
     - Disable TLS entirely (encryption off; not recommended)
     - ``server.tls.enabled``
   * - ``--tls-cert``
     - Path to the certificate file
     - ``server.tls.cert``
   * - ``--tls-key``
     - Path to the private key file
     - ``server.tls.key``
   * - ``--tls-ca``
     - CA bundle used to verify the peer certificate
     - ``server.tls.cafile``

**Configuration**

TLS can also be controlled entirely through the ``[server.tls]`` configuration namespace
(or the matching ``HYPERSHELL_SERVER_TLS_*`` environment variables), following the usual
precedence:

.. code-block:: toml

   [server.tls]
   enabled     = true       # default; queue traffic is encrypted out of the box
   cert        = ""         # path to server cert, or '' to self-sign
   key         = ""         # path to server key, or '' to self-sign
   cafile      = ""         # trust anchor; '' = server's own cert
   fingerprint = ""         # 'SHA256:AB:CD:...' pin (overrides cafile verification)
   insecure    = false      # encrypt but skip peer verification (logs a warning)
   min_version = "TLSv1.2"  # or 'TLSv1.3'
   ciphers     = ""         # OpenSSL cipher string
   servername  = ""         # SNI / hostname check override

**Peer Verification**

The client side of every connection decides how to validate the server certificate.
Four modes are supported, in decreasing strength: full **CA verification** (``cafile``
plus ``servername``); certificate **fingerprint pinning** (set ``fingerprint`` to the
``SHA256:...`` value logged when the certificate is generated, which is the recommended
choice for self-signed certificates); the **system CA bundle** (for real, publicly-issued
certificates); and an **insecure** mode that encrypts traffic but does not authenticate
the peer.

**Multi-host Clusters**

For distributed clusters (SSH, MPI, SLURM, autoscaling) the launcher no longer copies the
server's certificate material onto each client command line.
Instead every client resolves its own TLS material from its own configuration and site
directory (identical to the server's on a shared filesystem, so no operator action is
needed there). When the filesystem is *not* shared, either pin the server's fingerprint
(``HYPERSHELL_SERVER_TLS_FINGERPRINT``) or distribute the certificate and set
``HYPERSHELL_SERVER_TLS_CAFILE`` / ``HYPERSHELL_SERVER_TLS_SERVERNAME`` on the clients.
Only the disabled state (``--no-tls``) is propagated automatically.

.. tip::

   A new :ref:`Security guide ` documents the full threat model, the built-in TLS
   architecture, all four verification modes, known limitations, and deployment recipes
   for single-host, internet-exposed, and Kubernetes environments.

|

Resource-aware task scheduling
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

HyperShell now supports resource-aware scheduling with task-level CPU core, memory, and
time requirements. These requirements can be specified for all tasks en masse or on an
individual basis. The client-side scheduler manages task execution based on resource
availability.

.. note::

   For existing workflows or simpler cases where no resource requirements are given for
   tasks, none of the following mechanisms are engaged and task parallelism behaves
   exactly as it has in previous releases of the software. Lack of a resource constraint
   is interpreted as 0 cores and 0 memory, and tasks may run indefinitely as before.

**Command-line Options**

The following options control resource management across the CLI:

.. list-table:: Resource Management Options
   :header-rows: 1
   :widths: 10 20 50 20

   * - Short
     - Long
     - Purpose
     - Config Source
   * - ``-c``
     - ``--cores``
     - CPU cores required per task
     - ``task.cores``
   * - ``-m``
     - ``--memory``
     - Memory required per task
     - ``task.memory``
   * - ``-W``
     - ``--task-timeout`` (``--timeout`` on ``submit``)
     - Walltime limit per task (seconds)
     - ``task.timeout``
   * - ``-C``
     - ``--client-cores``
     - Limit cores available to client
     - ``client.cores``
   * - ``-M``
     - ``--client-memory``
     - Limit memory available to client
     - ``client.memory``
   * - ``-T``
     - ``--timeout``
     - Client idle timeout (seconds)
     - ``client.timeout``

These options are available across the ``submit``, ``client``, and ``cluster`` commands
where appropriate. The ``cluster`` command supports all six options because it both
submits tasks (with requirements via ``-c``/``-m``/``-W``) and launches clients (with
limits via ``-C``/``-M``/``-T``). The ``submit`` command supports task-level options
(``-c``/``-m``/``-W``), while the ``client`` command supports client-level options
(``-C``/``-M``/``-T``).

Note that ``-W``/``--task-timeout`` sets a per-task walltime limit and is used by the
scheduler for backfilling decisions, while ``-T``/``--timeout`` controls how long a client
waits idle before shutting down when no tasks are available.

In prior releases of the software the ``client`` also supported a global
``-W``/``--task-timeout`` (it still does), which lets the caller define a uniform timeout
for all tasks executed by the client. In this release this is combined with individual
task-level timeouts. The effective timeout for a given task (if there is one) is the
shorter of the two.

**Inline Resource Specification**

Resource requirements can also be specified inline using the ``#HYPERSHELL:`` comment
syntax. This is a special case for the existing inline tag annotations.
This allows per-task resource heterogeneity within a single input file, overriding any
command-line defaults.

.. admonition:: Inline comment directives specify per-task resource requirements
   :class: note

   .. code-block:: shell

      stress -c 4 -t 10s  #HYPERSHELL: cores:4 memory:2GB timeout:60
      stress -c 8 -t 60s  #HYPERSHELL: cores:8 memory:4GB timeout:120

**Executor Thread Mechanism**

Each client runs multiple executor threads (configured via ``-N``/``--num-threads``), with
each thread capable of running one task at a time. When a task is ready to execute:

1. The executor thread attempts to *acquire* the required resources (cores and memory)
2. If resources are available, the task starts immediately
3. If resources are insufficient, the task enters a priority-based wait queue
4. When the task completes, resources are *released* back to the pool

This design allows clients to oversubscribe executor threads relative to available cores,
enabling efficient resource utilization through intelligent backfilling.

**Priority-based Scheduling with Backfilling**

Tasks waiting for resources are tracked with increasing priority. Higher-priority tasks
(those waiting longer) are scheduled first. However, the scheduler implements intelligent
*backfilling* where smaller tasks with shorter timeouts can "jump ahead" if they can
complete before higher-priority tasks would be able to start.

Consider this example task file:

.. admonition:: Task list example demonstrates backfill scheduling
   :class: note

   .. code-block:: shell

      stress -c 4 -t 5s   #HYPERSHELL: cores:4 timeout:60 n:1
      stress -c 4 -t 10s  #HYPERSHELL: cores:4 timeout:60 n:2
      stress -c 8 -t 10s  #HYPERSHELL: cores:8 timeout:60 n:3
      stress -c 4 -t 10s  #HYPERSHELL: cores:4 timeout:15 n:4
      stress -c 4 -t 10s  #HYPERSHELL: cores:4 timeout:60 n:5
      stress -c 4 -t 10s  #HYPERSHELL: cores:4 timeout:60 n:6
      stress -c 4 -t 10s  #HYPERSHELL: cores:4 timeout:60 n:7
      stress -c 4 -t 10s  #HYPERSHELL: cores:4 timeout:60 n:8

On a client with 8 cores and 3 executor threads:

1. Tasks ``n:1`` and ``n:2`` start immediately (4+4=8 cores used)
2. Task ``n:3`` arrives and waits (needs 8 cores, gets priority 1)
3. Task ``n:1`` completes after ~5 seconds, freeing 4 cores
4. Task ``n:4`` arrives and needs 4 cores, gets priority 2
5. Task ``n:4`` backfills ahead of ``n:3`` because:

   * Task ``n:4`` has a shorter timeout of 15s
   * Task ``n:2`` will have a worst case timeout of 60s (~55s from now)

6. Task ``n:3`` starts when both ``n:2`` and ``n:4`` complete

This backfilling strategy significantly improves throughput for heterogeneous workloads
with mixed resource requirements and execution times.

**Adaptive Sleep for Priority-based Wakeup**

When multiple tasks are waiting for resources, HyperShell uses an adaptive sleep mechanism
to ensure tasks wake up and check for availability in priority order. Instead of all
waiting tasks checking simultaneously (by having equal sleep periods), each task sleeps
for a duration proportional to its priority ratio:

* High-priority tasks (ratio=1.0) sleep ~0.5 seconds
* Low-priority tasks (ratio approaching 0.0) sleep ~1.0 seconds
* Small random jitter (±0.05s) prevents exact collisions

This creates a natural "wake-up cascade" where the highest-priority task checks first,
followed by progressively lower-priority tasks. This minimizes lock contention on the
resource scheduler while ensuring fair, priority-based access.
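
As a rough illustration of the adaptive sleep described above, the sketch below maps a
priority ratio to a sleep duration. The function name ``adaptive_sleep`` and the linear
interpolation between the two endpoints are assumptions made for illustration; only the
endpoint values (~0.5 s and ~1.0 s) and the ±0.05 s jitter come from these notes, and
HyperShell's internal formula may differ.

.. code-block:: python

   import random


   def adaptive_sleep(ratio: float, jitter: float = 0.05,
                      rng: random.Random | None = None) -> float:
       """Return a sleep duration in seconds for a waiting task.

       ratio=1.0 (highest priority) gives ~0.5s and ratio=0.0 gives ~1.0s.
       Linear interpolation is an assumption, not a documented formula.
       """
       rng = rng or random.Random()
       ratio = min(max(ratio, 0.0), 1.0)  # clamp to the valid range
       base = 1.0 - 0.5 * ratio
       return base + rng.uniform(-jitter, jitter)


   if __name__ == '__main__':
       rng = random.Random(0)
       for ratio in (1.0, 0.5, 0.0):
           print(f'ratio={ratio:.1f} sleep={adaptive_sleep(ratio, rng=rng):.2f}s')

Because the jitter (0.05 s) is much smaller than the 0.5 s spread between the endpoints,
tasks with clearly different ratios still wake in priority order.
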
As tasks complete and resources become available, waiting tasks are efficiently promoted
and scheduled without unnecessary polling overhead. Without this mechanism we would
experience O(n^2) waiting times each time a large task (all slots) completed, instead of
the O(1) produced here.

.. admonition:: Warning - Rename command-line option
   :class: warning

   The ``--num-tasks`` option has been renamed to ``--num-threads`` to better reflect its
   purpose as the number of executor threads per client. The old ``--num-tasks`` name is
   maintained for backwards compatibility but may be deprecated in future releases.

.. admonition:: Warning - Reassigned short option ``-c``
   :class: warning

   The ``-c`` short option now means ``--cores`` (``--task-cores`` on ``hs server``).
   It was previously the short form of ``--capture``, which is now available only in its
   long form. Existing scripts that used ``-c`` to enable output capture must be updated
   to ``--capture``.

|

Resource monitoring
^^^^^^^^^^^^^^^^^^^

HyperShell can now monitor the actual CPU and memory usage of running tasks and their
child processes using the ``--monitor`` option. This provides visibility into resource
consumption and helps identify tasks that may be under- or over-provisioned.

**Enabling Monitoring**

Add the ``--monitor`` flag to the ``cluster`` (``hsx``) or ``client`` commands:

.. admonition:: Enable resource monitoring for workflow
   :class: note

   .. code-block:: shell

      hsx tasks.in --monitor -c 4 -m 2GB

When monitoring is enabled, HyperShell continuously tracks CPU core utilization and memory
consumption for each task and all its child processes using the ``psutil`` library.
For each task, resource usage is sampled on the ~1 second cycle during which the executor
thread waits for task completion.

**Data Collection and Storage**

* **Peak usage**: Maximum observed values are stored in the database as ``cores_max`` and
  ``memory_max`` for each task. These are stored with decimal precision.
* **High-resolution telemetry**: Full time-series data is written to a CSV file alongside
  the ``.out`` and ``.err`` files (see ``--capture``). These are stored in the
  ``$HYPERSHELL_SITE`` library (``~/.hypershell/lib`` by default on Linux).
  There are only three columns: ``time``, ``cores``, ``memory``.

The recorded time-series for a task can be viewed with ``hs info --perf`` (its location is
also reported as the ``csvpath`` field), mirroring ``--stdout`` / ``--stderr`` for captured
output. Monitoring is built on ``psutil`` (>= 7.0.0), which is now a required dependency.

**Resource Limit Warnings**

When both monitoring and resource requirements are specified, HyperShell automatically
detects when tasks exceed their allocated resources and emits a warning:

.. admonition:: Resource limit warnings when monitoring enabled
   :class: note

   .. code-block:: text

      ...
      Resource limit exceeded (...): cores 1.41 (used) > 1.00 (allocated)
      Resource limit exceeded (...): memory 1.62GB (used) > 1.00GB (allocated)

These warnings:

* Are emitted **once per task** to avoid log spam
* Help identify tasks that need resource requirement adjustments
* Do not terminate the task; they serve as informational warnings only
* Include **tolerances** to avoid false positives from measurement fluctuations or
  rounding errors:

  * CPU: 0.05 cores tolerance
  * Memory: 5 MB tolerance

**Use Cases**

* **Profiling**: Understand actual resource consumption to right-size task requirements
* **Optimization**: Identify resource bottlenecks and opportunities for parallelization
* **Validation**: Verify that tasks respect their resource allocations
* **Debugging**: Diagnose memory leaks or unexpected CPU usage patterns

.. tip::

   Use monitoring during development and testing to establish baseline resource
   requirements, then apply those requirements (``-c``/``-m``) in production runs for
   optimal scheduling.

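
The tolerance rule for resource limit warnings described in this section can be sketched
as follows. This is only an illustration under stated assumptions: ``LimitChecker`` and
its ``check`` method are hypothetical names rather than HyperShell APIs, the message
format is loosely modelled on the sample output, 1 MB is taken to be ``1024 ** 2`` bytes,
and "once per task" is interpreted here as once per task and resource kind.

.. code-block:: python

   CORES_TOLERANCE = 0.05            # cores, as stated in these notes
   MEMORY_TOLERANCE = 5 * 1024 ** 2  # 5 MB in bytes (1 MB = 1024**2 is an assumption)


   class LimitChecker:
       """Remember which tasks were already warned about (hypothetical helper)."""

       def __init__(self) -> None:
           self._warned: set[tuple[str, str]] = set()

       def check(self, task_id: str, kind: str, used: float,
                 allocated: float, tolerance: float) -> str | None:
           """Return a warning the first time a limit is exceeded, else None."""
           if allocated <= 0:                   # no requirement was given
               return None
           if used <= allocated + tolerance:    # within tolerance
               return None
           if (task_id, kind) in self._warned:  # already warned for this task
               return None
           self._warned.add((task_id, kind))
           return (f'Resource limit exceeded ({task_id}): {kind} '
                   f'{used:.2f} (used) > {allocated:.2f} (allocated)')


   if __name__ == '__main__':
       checker = LimitChecker()
       print(checker.check('task-1', 'cores', 1.41, 1.00, CORES_TOLERANCE))
       print(checker.check('task-1', 'cores', 1.50, 1.00, CORES_TOLERANCE))
       print(checker.check('task-2', 'cores', 1.03, 1.00, CORES_TOLERANCE))

Only the first call produces a message; the second is suppressed because the task was
already warned, and the third falls within the 0.05 core tolerance.
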
|

Queue-only task submission
^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``hs submit`` command now supports direct submission to a live server queue, bypassing
the database entirely. This provides a lightweight alternative for transient workflows or
environments where a database is not configured or desired.

Use the ``-q``/``--queue`` option along with ``-H``/``--host``, ``-p``/``--port``, and
``-k``/``--auth`` to submit tasks directly to a running server:

.. admonition:: Submit tasks directly to live queue
   :class: note

   .. code-block:: shell

      # Generate a strong authentication key (the server now requires >= 16 characters)
      KEY="$(openssl rand -base64 24)"

      # Start a server
      hs server --forever --auth "$KEY" &

      # Submit tasks directly to the queue (use -H to reach a remote server)
      hs submit tasks.in -q -H localhost -p 50001 -k "$KEY"

When using queue mode, tasks are sent directly to the server's in-memory queue for
immediate scheduling by connected clients. Without ``--queue``, the traditional
database-backed workflow is used, providing persistence, recovery, and search capabilities.

.. note::

   Because TLS is enabled by default, submitting to a *remote* server requires the client
   to trust the server's certificate. On a shared filesystem this is automatic; otherwise
   pin the server's fingerprint or distribute its certificate (see :ref:`security`), or
   pass ``--no-tls`` on both ends. The ``--auth`` key must be at least 16 characters.

.. note::

   The ``-b``/``--bundlesize`` option controls how tasks are bundled and sent to clients.
   In queue mode, this directly affects the size of task bundles conveyed to connected
   clients, which can impact throughput and responsiveness. Coordinate bundle size with
   the number of executor threads (``-N``/``--num-threads``) on your clients for optimal
   performance.

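
For scripts that prefer Python over ``openssl``, a key of comparable strength to the one
in the shell example earlier in this section can be generated with the standard library.
This is a sketch: ``generate_key`` is a hypothetical helper, not part of HyperShell, and
it relies only on the requirement stated in these notes that the key be at least 16
characters long.

.. code-block:: python

   import base64
   import secrets

   MIN_KEY_LENGTH = 16  # minimum length stated in these release notes


   def generate_key(num_bytes: int = 24) -> str:
       """Return a random Base64 key, similar in spirit to `openssl rand -base64 24`."""
       key = base64.b64encode(secrets.token_bytes(num_bytes)).decode('ascii')
       if len(key) < MIN_KEY_LENGTH:
           raise ValueError(f'key too short: {len(key)} < {MIN_KEY_LENGTH}')
       return key


   if __name__ == '__main__':
       print(generate_key())

The resulting string can be passed to ``--auth`` in place of ``$KEY``.
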
|

Rate limiting task execution
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Rate limiting has been added (available as ``-R``/``--ratelimit``) to both the cluster and
client commands as a per-client limit on tasks executed per second. For example,
specifying ``-R5`` restricts each client to a maximum throughput of 5 tasks per second
(300 per minute). This is applied by computing a *minimum* task walltime and entering a
waiting cycle if a task completes in less than this time.

This can be useful in a number of scenarios; for example, data-processing workflows where
the tasks pull files over an API and we want to ensure we do not exceed the rate limit
of that API.

|

Task groups for dependency management
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Up until now, HyperShell has not concerned itself with task dependencies and has instead
focused purely on delivering high-throughput execution of independent, homogeneous
collections of tasks. This has been incredibly productive for most researchers using the
software. With the addition of resource-aware scheduling of heterogeneous tasks, one might
now consider other scenarios that naturally call for some kind of dependency management.
For example, a large-scale pipeline that schedules relatively small inferencing tasks
within some time window with some larger final integration/combination step.
Or similarly, frame rendering in a visualization with a final processing step to build the
video file from the frames.

Traditional DAG-based frameworks (e.g., `GNU Make `_, `Airflow `_, `Nextflow `_,
`Snakemake `_) focus on the user-facing *definition* of the workflow with explicit
task-to-task dependencies, for which an ephemeral group association is typically assigned
for the execution phase using within-group parallelism.
For some of these frameworks, the actual execution system is rudimentary with only local
threads (`GNU Make`) while other frameworks abstract this responsibility by allowing for
externally implemented execution through some plugin system (`Nextflow`).

.. note::

   This feature is the last step on the road to becoming a backend to something like
   `Nextflow`.

We have inverted the relationship between tasks and their dependencies by skipping the
graph and directly exposing the task execution groups at submit time. That is, task group
ID values are persisted in the database with tasks belonging to an explicit task group and
remain in that group forever. This design simplifies how users conceptualize this problem
and doesn't require any complicated changes to the data model or the introduction of any
new syntax. **In fact, for high-throughput workflows with billions of tasks in the
database this is altogether preferable in every regard!**

This simple model:

- **Eliminates graph solving**: The scheduler only needs to track the current group number
- **Minimizes metadata**: A single integer per task instead of arbitrary dependency edges
- **Enables efficient queries**: Database indexes on ``(group, schedule_time)`` provide
  O(log n) lookups
- **Maintains high throughput**: Scheduling decisions remain constant-time operations

HyperShell's scheduler thread now includes this concept of a task group by only scheduling
task bundles for the current group. All tasks in some group `N` must complete before any
tasks in group `N+1` may be scheduled. If there are failed tasks in the active group, the
scheduler remains in that group until every task has exhausted its retries up to the
maximum retry limit. If the server is operating in `forever` mode it remains there
indefinitely; otherwise the server triggers a shutdown with a critical message indicating
that the remaining tasks cannot proceed.

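
The gating rule can be illustrated with a small in-memory model. This is not HyperShell's
scheduler (which works against the database); the ``Task`` record and the ``schedulable``
function are hypothetical names introduced for this sketch, and retries are left out for
brevity.

.. code-block:: python

   from dataclasses import dataclass


   @dataclass
   class Task:
       command: str
       group: int = 0
       scheduled: bool = False
       completed: bool = False


   def schedulable(tasks: list[Task], active_group: int) -> tuple[int, list[Task]]:
       """Return the (possibly advanced) active group and its unscheduled tasks.

       The active group only advances once every task in it has completed.
       """
       while True:
           current = [t for t in tasks if t.group == active_group]
           if any(not t.completed for t in current):
               return active_group, [t for t in current if not t.scheduled]
           later = [t.group for t in tasks if t.group > active_group]
           if not later:
               return active_group, []  # nothing left to schedule
           active_group = min(later)    # all done here; move to the next group


   if __name__ == '__main__':
       tasks = [Task('render 1', 0), Task('render 2', 0), Task('encode', 1)]
       group, ready = schedulable(tasks, 0)
       print(group, [t.command for t in ready])

In this model the ``encode`` task in group 1 is never returned while any group 0 task is
still incomplete, which mirrors the frame-rendering example given earlier.
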
Task groups are exposed through a new ``-g``/``--group`` option in the ``hs submit``
command:

.. admonition:: Submit tasks to different groups
   :class: note

   .. code-block:: shell

      # Submit batch of tasks with group 0
      hs submit tasks-0.in -g 0

      # Submit batch of tasks with group 1
      hs submit tasks-1.in -g 1

Groups may also be assigned per task inline with the ``#HYPERSHELL:`` comment directive,
which overrides the command-line ``-g`` for that task:

.. admonition:: Assign task groups inline
   :class: note

   .. code-block:: shell

      preprocess.sh  #HYPERSHELL: group:0
      analyze.sh     #HYPERSHELL: group:1

The same ``-g``/``--group`` filter, along with a new ``--retries`` filter for tasks that
have been retried, is also available on ``hs list`` and ``hs update`` (e.g.
``hs list -g 1`` or ``hs list --retries``).

The default task group is 0 for a fresh database
(see :const:`~hypershell.submit.DEFAULT_TASK_GROUP`). When no task groups are given the
software behaves in a manner indistinguishable from previous releases.

The scheduler pre-selects the active task group prior to selecting tasks from the database
using one of the following rules (in order):

#. The most recently scheduled task group (if there are any),
#. The default task group if the database is empty,
#. The lowest group if nothing has been scheduled yet.

The same rules select the active group as a dynamic default when task groups have been
used previously and new tasks are submitted without specifying a group.

.. warning::

   Submitting tasks with a group value lower than the active group for a running workflow
   is considered an error and these tasks will never be scheduled.

It is possible to modify the group of an already submitted task using ``hs update``.
Depending on the severity of the changes it might be necessary to restart the cluster.

.. note::

   Autoscaling is task-group aware. Scale-out pressure is computed only from tasks in the
   currently active group, so a backlog waiting in later groups will not spin up clients
   that would sit idle until the active group completes.

|

File-based logging
^^^^^^^^^^^^^^^^^^

In addition to the console, HyperShell can now persist log messages to disk with automatic
rotation and compression, which is useful for long-running servers and clusters and for
unattended batch pipelines. File-based logging is **opt-in** and operates independently of
the console: it has its own severity level and format, so you can keep a quiet console
while retaining a verbose, machine-parsable record on disk.

It is enabled the moment any ``logging.file`` parameter is set. The simplest form uses all
defaults:

.. admonition:: Enable file-based logging
   :class: note

   .. code-block:: toml

      [logging]
      file = "enabled"   # or true, or an explicit path

For full control, define the ``[logging.file]`` table:

.. admonition:: Rotating, compressed logs
   :class: note

   .. code-block:: toml

      [logging.file]
      level    = "debug"   # captured to disk independently of the console level
      rotate   = "512MB"   # size-like ('512MB'), cron-like ('@daily'), or 'never'
      compress = "gzip"    # gzip / bzip / lzma / zstd
      keep     = 2         # uncompressed rotations retained on disk

Rotation accepts a size threshold (``512MB``, ``2GB``; units are powers of 1024), a cron
expression (``@daily``, ``@midnight``, ``0 1 * * 0``; requires the optional ``cron``
extra), or ``never`` (the default). Sending ``SIGHUP`` to a process triggers an immediate
rotation on demand. Compression to ``gzip``/``bzip``/``lzma``/``zstd`` runs in the
background (``zstd`` requires the optional ``zstd`` extra).

**Per-process files.** In a distributed cluster many clients (potentially on shared
storage) would otherwise contend for a single file.
Each process therefore writes to its own role- and host-scoped file by default
(``server-.log``, ``cluster-.log``, ``client-.log``, ``submit-.log``, or ``main.log``);
concurrent same-role processes on a host claim numbered slots that are reclaimed when a
process exits or crashes. Because names never collide, the whole log directory can be
collected with ``rsync`` or merged into a single timeline.

See the :ref:`logging guide ` for the full reference.

|

Functional (Python) API
^^^^^^^^^^^^^^^^^^^^^^^^

HyperShell's functional API is now exposed directly at the top level of the package, so
the common entry points can be used without reaching into submodules:

.. admonition:: Drive HyperShell from Python
   :class: note

   .. code-block:: python

      import hypershell as hs

      # Run an in-process cluster over a collection of commands
      hs.run_local(['echo one', 'echo two', 'echo three'], num_threads=4)

The top-level namespace now includes ``run_local``, ``run_cluster``, ``run_ssh``,
``run_client``, ``submit_from``, ``submit_file``, ``serve_from``, ``serve_file``, and
``serve_forever``.

-----

Improvements
------------

|

Local SQLite database enabled by default
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In previous releases of the software we automatically disabled use of the database in
favor of a *live* queue if there was no database configuration. This behavior is still
possible using the explicit flags ``--no-db`` (and ``--no-confirm``), which suppress the
warning new users would otherwise get for running the software out-of-the-box.

With this change we have a smart preload as part of the configuration that injects
``main.db`` as the database file name within the default *site* library. Use of the
``HYPERSHELL_SITE`` environment variable alters this location. On Linux this would be
``~/.hypershell/lib/main.db`` (the file is named ``Main.db`` on macOS and Windows).

The motivation here is to enable the beneficial features more seamlessly, so that new
users running the software for the first time do not need to grapple with this behavior.
Instead of getting a warning message the first time they run the software, they get
powerful features. Any database configuration provided by the user or system disables
this new behavior, of course.

|

Default logging level set to INFO
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Similar to the above change with the database enabled by default, we have also updated the
default logging level to ``INFO`` instead of ``WARNING``. It is probably never the case
that users (particularly in a research workflow context) want zero messages from
HyperShell. Even with ``INFO`` level messages enabled there are relatively few messages
emitted. These messages are predominantly emitted by the client when tasks are started.
A few messages at the start of operations about the state of the database are typically
emitted as well.

Users can of course permanently change the logging level by setting their global user
configuration:

.. admonition:: Set logging level
   :class: note

   .. code-block:: shell

      hs config set logging.level debug --user

Relatedly, starting a server against a database whose tasks are already complete no longer
emits a warning: in ``--forever`` mode the server reports at ``INFO`` that it is waiting
for new tasks, and otherwise it reports completion and shuts down cleanly.

|

Authentication key requirements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The shared authentication key that gates the distributed queue is now subject to minimum
requirements. ``hs server`` **refuses to start** with the built-in placeholder key
(previously this was only a warning) and requires a key of at least 16 characters drawn
from an allowed character set.
That set includes the full Base64 alphabet (``+``, ``/``, ``=``), so keys from common
generators such as ``openssl rand -base64 48``, as well as hex and URL-safe tokens, can be
used directly. Clients and ``hs submit`` also warn when run with the default key or with
``--no-tls``.

In addition, the authentication key is now redacted from debug logs across *all* cluster
launch paths (the custom launcher, MPI, autoscaling, and SSH), so ``hsx --ssh`` debug
output no longer exposes the key.

|

Host and port binding
^^^^^^^^^^^^^^^^^^^^^

The ``cluster`` command (``hsx``) gains a ``-H``/``--bind`` option controlling the address
the server binds to. By default (from ``server.bind``) this is ``localhost`` for local
clusters and ``0.0.0.0`` for remote/managed launchers; a local cluster refuses to bind a
non-local address.

Local clusters now correctly honor the requested or configured port; previously the port
selection was silently ignored and a fixed default was always used. When no port is given,
managed clusters now probe for a free port starting at the default (50001), and the
``hs server --available-ports`` helper is evaluated lazily at run time rather than
scanning at import.

|

Shell completions
^^^^^^^^^^^^^^^^^

HyperShell now ships first-class **Zsh** completions alongside a rewritten **Bash**
implementation, both covering ``hs`` and ``hsx`` across the modern CLI: the top-level
subcommands (``info``/``wait``/``run``/``list``/``update``) and the new options
(``--no-tls``/``--tls-*``, ``-Q``/``--poll``, ``-g``/``--group``, ``-R``/``--ratelimit``,
``--monitor``, ``-N``/``--num-threads``), with dynamic completion of fields, tags, ports,
hosts, and SSH groups.

These completions and the manual pages are now installed automatically into the
environment's ``share`` prefix when installing from PyPI or ``uv``; previously they were
omitted from the wheel. See the :ref:`installation guide ` for activation instructions.
Completions shell out to ``hs``, which must be on your ``PATH``.

|

Dependencies and Python support
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The minimum supported Python is now **3.11** (3.9 and 3.10 have been dropped); the
supported range is 3.11 through 3.14. Installation is wheel-only on all supported versions
(no compiler required).

PostgreSQL support has migrated from ``psycopg2`` to **psycopg v3** (the SQLAlchemy
dialect is now ``postgresql+psycopg``), so users of PostgreSQL must reinstall the extra.
It is offered in three flavors so you can match your environment:

.. list-table:: PostgreSQL extras
   :header-rows: 1
   :widths: 25 75

   * - Extra
     - Use when
   * - ``postgres``
     - Self-contained binary wheels (``psycopg[binary]``); the simplest choice.
   * - ``postgres-system``
     - Pure-Python ``psycopg`` against your operating system's ``libpq``.
   * - ``postgres-c``
     - Compiled ``psycopg[c]`` for maximum performance (requires a build toolchain).

A missing PostgreSQL driver or system ``libpq`` now produces a clear message pointing to
the ``postgres`` extra rather than a raw traceback. Two further optional extras were
added: ``zstd`` (for zstd log compression) and ``cron`` (for cron-based log rotation).

The database can also be configured as a single connection string: ``database.url``
(or ``HYPERSHELL_DATABASE_URL``) accepts a full SQLAlchemy URL, and setting ``database``
to a bare string treats it as a local SQLite file path.

|

Other improvements
^^^^^^^^^^^^^^^^^^

- The default ``autoscale.size.max`` has been reduced from 2 to 1 for a more conservative
  out-of-the-box scale-out; raise it with ``hs config set autoscale.size.max N``.
- Task bundles are now serialized as a single JSON object per bundle, reducing per-task
  encode/decode overhead. The queue wire format changed as a result, so servers and
  clients must run matching versions of HyperShell.

-----

Bug Fixes
---------

|

Fixed client signalwait option name
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``hs client`` command now uses the correct long option name ``--signalwait``
(with short option ``-S``) for the task-level signal escalation wait period.

Previously, the code used ``--task-signalwait`` while the documentation and help text
specified ``--signalwait``. This has been corrected to match the documentation and align
with the ``hs cluster`` command, which also uses ``--signalwait``.

This is purely a bug fix for consistency. The functionality is unchanged: the option controls
how long the executor waits between sending escalating signals (SIGINT → SIGTERM → SIGKILL)
when terminating tasks that exceed their timeout.

|

Fixed server poll configuration parameter
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The :const:`~hypershell.server.DEFAULT_SERVER_POLL` (previously ``DEFAULT_QUERY_PAUSE``) parameter
defines the number of seconds the scheduler thread waits between polling the database when no
tasks are available. This parameter can now be configured on the command line (``-Q``/``--poll``),
in the configuration file (``server.poll``), or with an environment variable (``HYPERSHELL_SERVER_POLL``).

Previously, this configuration option was defined but never used by the server implementation,
so the hardcoded default always applied regardless of user settings. This release fixes the
implementation to respect the user's configuration. Because this setting was never functional
before now, we do not consider this an API change.

The scheduler now polls with exponential backoff. When no tasks are available it starts at a
fixed 0.5 second floor and doubles the wait after each empty poll, up to a maximum of
``-Q``/``--poll`` seconds (default 30); after any successful query the interval resets back to
the 0.5 second floor. The ``server.poll`` value therefore sets the *upper* bound (the cap) of
the backoff, not the base interval.

.. note::

    Although this configuration parameter now works as documented, most users will not
    need to change it in practice.

|

Fix config command path check bug
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The previous release included top-level flags to show the location of each
configuration file (i.e., ``--system``, ``--user``, ``--local``).

.. admonition:: Check for path of user-level configuration file
    :class: note

    .. code-block:: shell

        hs config --user

In an attempt to shorten the output as a *smart* feature, we hard-coded it per platform;
on `Windows`, for example, it would output something like ``%APPDATA%\HyperShell\Config.toml``.
This was clever but not very useful, because an unexpanded path does not help in scripts.
More importantly, it was broken: it did not correctly check for `macOS` and would output
the `Linux` path instead.

This release does the more sensible thing and directly outputs the path actually used in the code.

|

Eager mode now honored by the cluster command
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``--eager`` flag (and the ``server.eager`` configuration default) is now correctly
forwarded by ``hs cluster`` / ``hsx``. Previously the cluster command accepted ``--eager`` but
never passed it to the scheduler, so failed tasks were not preferentially retried ahead of
new tasks as documented.

|

``hs info`` searches across database partitions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``hs info`` now automatically searches across SQLite database partitions, consistent with
``hs list``, so it works against rotated or partitioned databases. Pass
``-i``/``--ignore-partitions`` to disable this.

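
For reference, the scheduler backoff described under *Fixed server poll configuration
parameter* can be modelled in a few lines. This is an illustrative sketch only, not
HyperShell's implementation: the ``PollBackoff`` class and its method names are hypothetical,
while the 0.5 second floor, the doubling, the reset on success, and the default cap of 30
seconds are taken from that description.

.. code-block:: python

    class PollBackoff:
        """Model of a capped exponential backoff between empty polls."""

        def __init__(self, cap=30.0, floor=0.5):
            self.floor = floor      # fixed starting interval, in seconds
            self.cap = cap          # upper bound, i.e. the ``server.poll`` value
            self.interval = floor

        def on_empty(self):
            """Return the wait after an empty poll, then double it for next time."""
            wait = self.interval
            self.interval = min(self.interval * 2, self.cap)
            return wait

        def on_success(self):
            """Reset to the floor after a query that returned tasks."""
            self.interval = self.floor


    backoff = PollBackoff(cap=30.0)
    waits = [backoff.on_empty() for _ in range(8)]
    backoff.on_success()
    after_reset = backoff.on_empty()

With the default cap, eight consecutive empty polls in this model wait 0.5, 1, 2, 4, 8, 16,
30, and 30 seconds, and the first empty poll after a successful query waits 0.5 seconds again.
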
|

Clients return completed tasks on idle-timeout shutdown
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A client that shut down on idle timeout with a partially filled result bundle could fail to
return its finished tasks, which under autoscaling caused launch thrashing (outstanding tasks
in the database with none eligible to schedule). Clients now flush all completed tasks before
shutting down.

|

Client heartbeat no longer crashes when its queue is full
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The client heartbeat thread caught the wrong exception (``QueueEmpty`` instead of
``QueueFull``), so a momentarily full heartbeat queue could crash the thread and make the
server consider an otherwise healthy client dead. Heartbeats are now retried instead.

|

Reject ``FILE`` together with ``--restart`` on the server
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``hs server`` now raises a clear error when an input ``FILE`` is given together with
``--restart``, mirroring the existing guard for ``FILE`` with ``--forever``; previously the
combination was silently accepted.

|

Miscellaneous fixes
^^^^^^^^^^^^^^^^^^^

- ``-M``/``--client-memory`` now accepts human-readable sizes (e.g. ``8G``, ``512MB``) via
  the same parser as ``-m``/``--memory``; previously it required a raw integer byte count.

- Multivalue options (``-t``/``--tag``, ``-w``/``--where``, ``--with-tag``, ``--remove-tag``)
  now require at least one value instead of silently accepting a bare flag.
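
The human-readable sizes now accepted by ``-M``/``--client-memory`` can be illustrated with
a small parser. This is a sketch under stated assumptions, not HyperShell's own parser: the
``parse_size`` name is hypothetical, and the sketch assumes binary multiples (``1K`` = 1024
bytes), which may not match the real behavior.

.. code-block:: python

    import re

    # Assumption: binary multiples; the real parser may use different units.
    _UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}
    _PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([KMGT]?)B?\s*$", re.IGNORECASE)


    def parse_size(text):
        """Convert a string such as '8G' or '512MB' into a byte count."""
        match = _PATTERN.match(text)
        if match is None:
            raise ValueError(f"unrecognized size: {text!r}")
        number, unit = match.groups()
        return int(float(number) * _UNITS[unit.upper()])


    eight_gib = parse_size("8G")
    half_gib = parse_size("512MB")
    raw_bytes = parse_size("1024")

A plain integer still parses as a byte count in this sketch, so values written for the old
integer-only behavior keep their meaning.
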