What is HyperShell?

Release v2.8.0 (Getting Started)


HyperShell is an elegant, cross-platform, high-throughput computing utility for processing shell commands over a distributed, asynchronous queue. It is a highly scalable workflow automation tool for many-task scenarios.

Built on Python and tested on Linux, macOS, and Windows.

Other tools may offer similar functionality in places, but not within a single tool and not with the flexibility, ergonomics, and scalability that HyperShell provides.

Design elements include but are not limited to:

  • Client-server: Run the server in stand-alone mode with SQLite or PostgreSQL.
    Scale clients elastically as needed (even down to zero).

  • Cross-platform: Trivial to install; runs on any platform where Python runs.
    Mix platforms within a running cluster (Server on Linux, Clients on Windows).

  • Staggered launch: Come up gradually to balance the workload.
    Scale to 1000+ nodes, 250k+ workers without crashing the server.

  • Database in-the-loop: Persist task metadata across runs.
    Fault-tolerant by default. Automated retries. Task history.

  • User-defined tags: Annotate tasks with key:value tags.
    Manage catalogs of large collections of tasks with ease.
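
To make the "database in-the-loop" and "automated retries" items above more concrete, here is a minimal sketch of the general idea using Python's standard sqlite3 module. It is not HyperShell's implementation or schema: the task table, its columns, the MAX_ATTEMPTS limit, and the open_db, submit, and run_pending helpers are all hypothetical names chosen for illustration.

```python
import sqlite3
import subprocess

MAX_ATTEMPTS = 3  # hypothetical retry limit for this sketch


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) task table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS task ("
        " id INTEGER PRIMARY KEY,"
        " command TEXT NOT NULL,"
        " attempts INTEGER NOT NULL DEFAULT 0,"
        " exit_status INTEGER)"
    )
    return db


def submit(db: sqlite3.Connection, command: str) -> int:
    """Record a new task; it stays in the table whether or not it has run."""
    cursor = db.execute("INSERT INTO task (command) VALUES (?)", (command,))
    db.commit()
    return cursor.lastrowid


def run_pending(db: sqlite3.Connection) -> int:
    """Run each task that has not succeeded and still has attempts left."""
    rows = db.execute(
        "SELECT id, command FROM task"
        " WHERE (exit_status IS NULL OR exit_status != 0) AND attempts < ?",
        (MAX_ATTEMPTS,),
    ).fetchall()
    for task_id, command in rows:
        status = subprocess.run(command, shell=True).returncode
        db.execute(
            "UPDATE task SET attempts = attempts + 1, exit_status = ? WHERE id = ?",
            (status, task_id),
        )
    db.commit()
    return len(rows)  # number of tasks attempted in this pass
```

Because every attempt is written back to the table, a later pass (or a later process pointed at the same database file) can pick up where the previous one stopped, and the table doubles as a task history.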


Usage


HyperShell is primarily a command-line program. Most users will run the hs cluster command (hsx for short) in a start-to-finish workflow, much as they would with alternatives such as xargs or GNU Parallel, or with HPC-specific tools such as ParaFly, TaskFarmer (NERSC-only), or Launcher (TACC).

Basic usage

seq 1000000 | hsx -t 'echo {}' -N64 --ssh 'a[00-32].cluster' > task.out
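
For readers new to this style of tool, the following sketch illustrates the underlying pattern on a single machine: each input line is substituted into a command template at the {} placeholder, and the resulting commands run concurrently. It is an illustration only, not how HyperShell is implemented; the expand, run_task, and run_all functions are hypothetical names, and the sketch omits everything that makes HyperShell distributed.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def expand(template: str, arg: str) -> str:
    """Substitute the input argument for every '{}' placeholder."""
    return template.replace("{}", arg)


def run_task(command: str) -> str:
    """Run one shell command and return its captured standard output."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return result.stdout


def run_all(template: str, args: list[str], workers: int = 4) -> list[str]:
    """Expand the template for each argument and run the commands concurrently."""
    commands = [expand(template, arg) for arg in args]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # In this sketch, map() returns results in input order,
        # even when the commands finish out of order.
        return list(pool.map(run_task, commands))


if __name__ == "__main__":
    outputs = run_all("echo {}", [str(n) for n in range(1, 11)])
    print("".join(outputs), end="")
```

Threads are sufficient here because each worker spends its time waiting on a child process rather than executing Python code.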

See getting started for features and additional usage examples. Specific documentation is available for configuration management, database setup, logging, and using templates.

The HyperShell server can operate in standalone mode alongside the database. Zero or more client instances may come and go as available and process tasks. When deployed in this fashion, the cluster can scale out as necessary as well as scale down to zero. This strategy is appropriate for creating shared, autoscaling, high-throughput pipelines for facilities with multiple users.
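
The deployment model described above can be pictured with a toy, in-process analogy: a shared queue stands in for the server, and short-lived threads stand in for clients that join, process some tasks, and leave. This is only a sketch of the concept under those assumptions; the client and run_clients functions are hypothetical, and nothing here reflects HyperShell's actual networking or scheduling.

```python
import queue
import threading


def client(name: str, tasks: queue.Queue, done: list, limit: int) -> None:
    """Process up to `limit` tasks from the shared queue, then disconnect."""
    for _ in range(limit):
        try:
            item = tasks.get_nowait()
        except queue.Empty:
            return  # nothing left to do, so this client leaves early
        done.append((name, item))  # stand-in for actually running the task


def run_clients(tasks: queue.Queue, done: list, specs: list) -> None:
    """Start one thread per (name, limit) pair and wait for all of them to leave."""
    threads = [
        threading.Thread(target=client, args=(name, tasks, done, limit))
        for name, limit in specs
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


if __name__ == "__main__":
    tasks = queue.Queue()
    for n in range(10):
        tasks.put(n)
    done = []
    run_clients(tasks, done, [])                      # zero clients: tasks wait
    run_clients(tasks, done, [("a", 4)])              # one client takes 4, leaves
    run_clients(tasks, done, [("b", 10), ("c", 10)])  # two more finish the rest
    print(sorted(item for _, item in done))
```

The queue outlives every client, which is what allows the pool of workers to shrink to zero and grow again without losing tasks.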

HyperShell also provides a library interface for Python applications to embed components. Developers can add HyperShell to their project to provide all of this functionality within their own applications or Python-based workflows.


Domain Use Cases


HyperShell is designed for embarrassingly parallel workloads across many scientific and engineering domains. Whether processing millions of files, running massive parameter sweeps, or executing independent computational tasks, HyperShell provides the infrastructure to scale and manage your workflow with confidence.

Natural Sciences

  • Genomics / Proteomics: Sequence alignment, variant calling, genome assembly

  • Bioinformatics: Pipeline orchestration, batch analysis of biological datasets

  • Pharmacy / Drug Discovery: Molecular docking, virtual screening, clinical trial simulations

  • Climate Science / Weather Modeling: Ensemble forecasts, climate scenario analysis

  • Materials Science: Molecular dynamics simulations, high-throughput property screening

  • Computational Chemistry: Energy calculations, reaction pathway analysis

  • Geoscience / Seismology: Seismic data processing, geological survey analysis

  • Neuroscience: Brain imaging analysis, neural network simulations, connectome mapping

  • High-Energy Physics: Collider data processing, event reconstruction

  • Astronomy / Physics: Sky surveys, photon-transport simulations, particle physics analysis

  • Cosmology: Gravitational wave analysis

Engineering

  • Computational Fluid Dynamics (CFD): Parameter sweeps for design optimization

  • Finite Element Analysis (FEA): Structural analysis, stress testing, mesh refinement studies

  • Computer Vision / Image Processing: Batch image analysis, object detection pipelines

  • Rendering / Visual Effects (VFX): Distributed rendering, animation frame processing

  • Network Simulation / Cybersecurity: Traffic analysis, penetration testing, security audits

  • Financial Modeling / Risk Analysis: Monte Carlo simulations, portfolio optimization

Computer Science & Data Science

  • Artificial Intelligence: Inference, model evaluation, model training

  • Machine Learning: Hyperparameter tuning, feature engineering

  • Data Science: Benchmarking, data preprocessing, analysis

  • Monte Carlo Simulations: Statistical sampling, uncertainty quantification, stochastic modeling

Mathematics & Optimization

  • Numerical Methods: Parameter sweeps, sensitivity analysis, convergence studies

  • Optimization: Grid search, genetic algorithms, multi-objective optimization

  • Statistical Computing: Bootstrap resampling, permutation tests, computational inference


Support


Join the Discord server to post questions, discuss your project, share with the community, and keep up with announcements and upcoming events.

HyperShell is an open-source project developed on GitHub. If you find bugs or issues with the software, please create an issue. Contributions are welcome in the form of pull requests for bug fixes, documentation, and minor feature improvements.


License


HyperShell is released under the Apache License (v2.0).

Copyright 2019-2026 Geoffrey Lentner.

This program is free software: you can redistribute it and/or modify it under the terms of the Apache License (v2.0) as published by the Apache Software Foundation.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the Apache License for more details.

You should have received a copy of the Apache License along with this program.


Citation


If this software has helped facilitate your research, please consider citing us.

BibTeX citation

@inproceedings{lentner_2022,
    author = {Lentner, Geoffrey and Gorenstein, Lev},
    title = {HyperShell v2: Distributed Task Execution for HPC},
    year = {2022},
    isbn = {9781450391610},
    publisher = {Association for Computing Machinery},
    url = {https://doi.org/10.1145/3491418.3535138},
    doi = {10.1145/3491418.3535138},
    booktitle = {Practice and Experience in Advanced Research Computing},
    articleno = {80},
    numpages = {3},
    series = {PEARC '22}
}