.. _20250504_2_7_0_release:

Release Notes (v2.7.0)
======================

`May 4, 2025`

The v2.7.0 release includes major database features and CLI improvements.

- Automatic database rotation for SQLite
- Optional UUIDv7 mode
- SQLite pragmas
- SQLite optimization
- Simplified command-line interface
- Version info
- Improved task submission (global tags)

-----

Features
--------

|

Automatic database rotation for SQLite
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When using `SQLite` as a backend for long-running persistent deployments, overall throughput
can suffer as the database grows. We would like to keep our long task history while
alleviating the burden of such a large database on task throughput.

To accomplish this we've added two new admin routines, ``vacuumdb()`` and ``rotatedb()``,
and exposed them through the command-line interface under ``hs initdb`` as
``--vacuum``, ``--backup``, and ``--rotate``.

For those unfamiliar with the concept, `VACUUM` is a database operation (PostgreSQL and SQLite)
that optimizes space and can improve performance in some cases. For SQLite, updated rows are
written to new pages, and deleted data is only flagged as such. Performing a vacuum copies all
active pages to a new database file and then replaces the original database file.
This operation is safe to perform while under load. On its own, this is a useful feature
for long-running deployments.

For SQLite, the vacuum operation also accepts a file path. Instead of copying the cleaned
pages back to the original file, the data is left at the given file path. This capability is what
drives the ``--backup`` feature, and in turn enables the ``--rotate`` feature.

With this release we now automatically add a ``part:0`` tag to all new tasks.
This ``part``, short for *partition*, is used to label all completed tasks in the active
database with ``part:N`` in a single transaction, where *N* is the N-th partition
chosen based on the path to the file.
With this tag as a means to safely distinguish the tasks completed as of that moment,
we can partition the active database using a vacuum operation to a new location,
essentially cloning the database. Afterwards, we drop any non-``part:N`` tasks from the
new clone and any ``part:N`` tasks from the active database. This can happen in real time
without stopping the server. Immediately following one of these partition events,
there will be few tasks in the active database.

When searching task history we want the full history to remain accessible even though
the database has been split. Now, when invoking task search, if SQLite is in use,
we automatically check for any existing partitions of the same database path and
`attach` each to the session. We build a temporary `view` that replaces the ``task`` table
with a series of ``union all`` on these partitions. All queries are applied to this
temporary view as if the entire history existed within the one table.

.. note::

    With most distributions of Python the `sqlite3` library is built with the compile-time
    option ``SQLITE_MAX_ATTACHED=10`` by default. While ``--rotate`` can be called any number
    of times, the automatic attachment of these partitions will be limited by this setting.
    While it's possible to re-compile SQLite with a higher value, it cannot be increased
    above 125. See `SQLite limits `_.

|

Optional UUIDv7 Mode
^^^^^^^^^^^^^^^^^^^^

A new package extra, ``hypershell[uuid7]``, installs a third-party dependency,
`uuid-utils `_, which provides lexicographically sortable version 7 UUIDs.
The format specification for UUID version 7 has essentially solidified, and library
support for this format is mature and widely used.
See `RFC 9562 (draft standard) `_ for reference.

In large-scale performance testing with HyperShell it is difficult to say precisely
what the net benefit of UUIDv7 has been on overall task throughput.
The increase in performance comes from fewer cache misses in page loads due to the time
ordering of task rows. It is likely that task bundles will consist of tasks submitted
close in time to one another. Updating these tasks is more efficient when they reside within
the same collection of pages, instead of being randomly distributed within the b-tree.

Generation of UUIDs within the project has been refactored into an internal routine.
This routine calls ``uuid.uuid4`` from the standard library and returns a string.
If ``uuid-utils`` is available in the installation we prefer ``uuid_utils.uuid7``.
This is install-time behavior and is not configurable at runtime.

.. note::

    On SQLite there is no real benefit to UUID v7. SQLite employs a hidden ``rowid`` column
    when using a non-integer primary key. The pages in the database are actually stored
    according to this hidden ``rowid``. The overhead of the second lookup is minimal compared
    to the huge performance improvement due to the page ordering.

    In testing, adding the ``WITHOUT ROWID`` table modifier results in a significant reduction
    in performance, and adding UUIDv7 on top of that entirely recovers this performance.
    The net effect of incorporating both the table modifier and UUIDv7 is relatively insignificant.

|

SQLite Pragmas
^^^^^^^^^^^^^^

It has always been technically possible to enable WAL mode for SQLite by connecting
outside of HyperShell and applying the setting. But this is inelegant and doesn't help
with settings that do not persist between connections.

Now you can include a ``pragmas`` section in your configuration to automatically apply
these settings to all database connections. The most useful change from the default behavior
is to enable the Write-Ahead Log (WAL) ``journal_mode`` for SQLite, which can improve
performance in high-concurrency applications.

.. admonition:: Set user-level configuration option
    :class: note

    .. code-block:: shell

        hs config set database.pragmas.journal_mode wal

There are many other pragmas we might set.
Not all configurations of SQLite have been tested for performance.

.. admonition:: Configuration with SQLite pragmas
    :class: note

    .. code-block:: toml

        [database]
        file = "/var/lib/hypershell/main.db"
        pragmas = {journal_mode = "wal", cache_size = 100000}

|

SQLite Optimization
^^^^^^^^^^^^^^^^^^^

On a related note, all ``vacuumdb()`` operations (used by ``--vacuum`` and ``--rotate``)
now also run an automatic ``pragma optimize`` operation. This is mostly harmless and
can be invoked safely on a regular basis. See `optimize `_ for details.

|

Simplified Command-line Interface
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As with previous changes, backwards compatibility has been maintained for existing workflows.

Hierarchy can be a good thing in command-line tools and is often necessary to manage complexity.
For HyperShell, though, there is no naming conflict among actions to justify the added ``task`` layer.
The following subcommands have been remapped for simplicity and brevity:

- ``hs task submit`` → ``hs submit`` *++*
- ``hs task info`` → ``hs info``
- ``hs task wait`` → ``hs wait``
- ``hs task run`` → ``hs run``
- ``hs task search`` → ``hs list``
- ``hs task update`` → ``hs update``

The exception here is ``hs submit`` and ``hs task submit``, which were distinct operations.
The new ``hs submit`` provides both interfaces in a single command.
Positional arguments are treated as a single command-line task.
If a single positional argument is provided and it is either ``-`` (stdin) or a valid
non-executable file path, it will be read as a task file, as before.
This can be made explicit using the new ``-f``/``--task-file`` option.

The new ``hs submit`` also includes better quoting behavior,
properly forwarding arguments with quoted white space.

A new entry-point, ``hsx``, is also included as shorthand for ``hs cluster``.
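
The rule that ``hs submit`` uses to decide between a task file and a single
command-line task can be sketched in Python. This is only an illustration of the rule
as stated above, not HyperShell's implementation; the function name
``interpret_submit_args`` and its return values are hypothetical.

.. code-block:: python

    import os
    import shlex


    def interpret_submit_args(args):
        """Classify positional arguments as ('file', path) or ('command', text).

        Hypothetical helper that only illustrates the rule described in the text.
        """
        if len(args) == 1:
            arg = args[0]
            if arg == '-':
                return 'file', '-'  # read tasks from stdin
            # A regular file that is not executable is treated as a task file
            if os.path.isfile(arg) and not os.access(arg, os.X_OK):
                return 'file', arg
        # Otherwise all positional arguments form one command-line task;
        # shlex.join re-quotes any argument that contains white space
        return 'command', shlex.join(args)

Under these assumptions, ``interpret_submit_args(['echo', 'hello world'])`` yields a
single command with the second argument still quoted as one word.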
|

Version Info
^^^^^^^^^^^^

The output of ``hs --version`` now includes more detailed information.

.. admonition:: Show version information
    :class: note

    .. code-block:: shell

        hs --version

    .. details:: Output

        .. code-block:: none

            HyperShell v2.7.0 (CPython 3.13.2)

|

Improved Task Submission
^^^^^^^^^^^^^^^^^^^^^^^^

In the :ref:`v2.6.0 release <20241115_2_6_0_release>` we added inline tag assignments.
A comment with the special ``# HYPERSHELL: ...`` syntax on any input line in a submission
file allows tags to be mapped to input arguments on an individual basis.

In this release we extend this behavior to allow processing of non-task lines.
If an input line is empty or comment-only we skip that line and do not emit a task.
If that comment contains the ``# HYPERSHELL: ...`` notation, the tags will take effect
on all future lines.

The following example input file would be processed as 4 tasks.

.. admonition:: Input task file with global tags
    :class: note

    .. code-block:: shell

        # HYPERSHELL: site:b group:1
        # HYPERSHELL: case:1
        echo 1
        echo 2
        # HYPERSHELL: case:2
        echo 3
        echo 4
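
The line-handling rules above can be sketched in Python. This is a minimal illustration,
not HyperShell's parser: the names ``parse_tags`` and ``parse_task_lines`` are hypothetical,
and the sketch assumes that tags are ``key:value`` tokens separated by white space, that a
later global tag line merges with (rather than replaces) earlier ones, and that inline tags
are stripped from the command text.

.. code-block:: python

    MARKER = '# HYPERSHELL:'


    def parse_tags(text):
        """Parse 'key:value' tokens separated by white space into a dict."""
        tags = {}
        for token in text.split():
            key, _, value = token.partition(':')
            tags[key] = value
        return tags


    def parse_task_lines(lines):
        """Return a list of (command, tags) pairs from the lines of a task file."""
        global_tags = {}
        tasks = []
        for line in lines:
            body, marker, tag_text = line.partition(MARKER)
            command = body.strip()
            if not command or command.startswith('#'):
                # Empty or comment-only line: no task is emitted, but any tags
                # it carries apply to all later lines (assumed here to merge)
                if marker:
                    global_tags.update(parse_tags(tag_text))
                continue
            # Task line: inline tags apply to this task only
            tags = {**global_tags, **parse_tags(tag_text)}
            tasks.append((command, tags))
        return tasks

Under these assumptions, the example file above yields four tasks. All four carry
``site:b`` and ``group:1``; ``echo 1`` and ``echo 2`` also carry ``case:1``, while
``echo 3`` and ``echo 4`` carry ``case:2``.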