What is HyperShell?
Release v2.8.0 (Getting Started)
HyperShell is an elegant, cross-platform, high-throughput computing utility for processing shell commands over a distributed, asynchronous queue. It is a highly scalable workflow automation tool for many-task scenarios.
HyperShell is built on Python and tested on Linux, macOS, and Windows.
Other tools offer some of this functionality, but not within a single tool, and not with the flexibility, ergonomics, and scalability that HyperShell provides.
Design elements include but are not limited to:
Client-server: Run the server in stand-alone mode with SQLite or PostgreSQL. Scale clients elastically as needed (even down to zero).
Cross-platform: Trivial to install, run on any platform where Python runs. Mix platforms within a running cluster (Server on Linux, Clients on Windows).
Staggered launch: Come up gradually to balance the workload. Scale to 1000+ nodes, 250k+ workers without crashing the server.
Database in-the-loop: Persist task metadata across runs. Fault-tolerant by default. Automated retries. Task history.
User-defined tags: Annotate tasks with key:value tags. Manage catalogs of large collections of tasks with ease.
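The persistence, retry, and tagging ideas in the list above can be sketched with Python's standard library. This is a minimal illustration under stated assumptions, not HyperShell's actual schema or API: the `task` table, its columns, and the helper names `open_db`, `submit`, `run_with_retries`, and `find_by_tag` are all hypothetical.

```python
from __future__ import annotations

import json
import sqlite3
import subprocess


def open_db(path: str = ':memory:') -> sqlite3.Connection:
    """Open a database and create the hypothetical task table if it is missing."""
    db = sqlite3.connect(path)
    db.execute(
        'CREATE TABLE IF NOT EXISTS task ('
        ' id INTEGER PRIMARY KEY,'
        ' args TEXT NOT NULL,'
        ' tags TEXT NOT NULL,'                   # key:value tags stored as JSON
        ' attempts INTEGER NOT NULL DEFAULT 0,'  # how many times the task has run
        ' exit_status INTEGER)'                  # NULL until the first attempt
    )
    return db


def submit(db: sqlite3.Connection, args: str, tags: dict[str, str]) -> int:
    """Record a new task with its tags and return the new row id."""
    cur = db.execute('INSERT INTO task (args, tags) VALUES (?, ?)',
                     (args, json.dumps(tags)))
    db.commit()
    return cur.lastrowid


def run_with_retries(db: sqlite3.Connection, task_id: int,
                     max_attempts: int = 3) -> int | None:
    """Run the task's command until it exits 0 or the attempts run out.

    Every attempt is written back, so the history survives across runs.
    Returns the last exit status (None if max_attempts is less than 1).
    """
    (args,) = db.execute('SELECT args FROM task WHERE id = ?', (task_id,)).fetchone()
    status = None
    for _ in range(max_attempts):
        status = subprocess.run(args, shell=True).returncode
        db.execute('UPDATE task SET attempts = attempts + 1, exit_status = ? '
                   'WHERE id = ?', (status, task_id))
        db.commit()
        if status == 0:
            break
    return status


def find_by_tag(db: sqlite3.Connection, key: str, value: str) -> list[int]:
    """Return the ids of tasks carrying the given key:value tag, in id order."""
    rows = db.execute('SELECT id, tags FROM task ORDER BY id').fetchall()
    return [task_id for task_id, tags in rows if json.loads(tags).get(key) == value]
```

Passing a file path instead of `':memory:'` to `open_db` keeps the records between runs, which is the point of having a database in the loop.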
Usage
HyperShell is primarily a command-line program.
Most users will operate the hs cluster command (hsx for short) in a start-to-finish workflow scenario, much as people do with alternatives such as xargs and GNU Parallel, or with HPC-specific tools such as ParaFly, TaskFarmer (NERSC-only), or Launcher (TACC).
Basic usage
seq 1000000 | hsx -t 'echo {}' -N64 --ssh 'a[00-32].cluster' > task.out
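To make the one-liner more concrete, here is a minimal Python sketch of the same idea on a single machine: each input line is substituted for `{}` in the command template, and the resulting commands run concurrently. This is not HyperShell's implementation or API; the function names `expand`, `run_one`, and `run_all` and the thread-pool approach are assumptions chosen purely for illustration.

```python
from __future__ import annotations

import subprocess
from concurrent.futures import ThreadPoolExecutor


def expand(template: str, arg: str) -> str:
    """Substitute one input line for every '{}' in the template."""
    return template.replace('{}', arg)


def run_one(command: str) -> tuple[int, str]:
    """Run one shell command and return (exit status, captured stdout)."""
    done = subprocess.run(command, shell=True, capture_output=True, text=True)
    return done.returncode, done.stdout


def run_all(template: str, args: list[str], workers: int = 4) -> list[tuple[int, str]]:
    """Expand the template for each argument and run the commands concurrently.

    Results come back in input order, although commands may finish in any order.
    """
    commands = [expand(template, arg) for arg in args]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, commands))


if __name__ == '__main__':
    # Rough local analogue of: seq 8 | <runner> -t 'echo {}'
    for status, output in run_all('echo {}', [str(n) for n in range(1, 9)]):
        print(status, output.strip())
```

The sketch runs on one machine only. It does not model distribution across the hosts named by the `--ssh` option in the command above, nor the queue and database that HyperShell places between the input and the workers.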
See getting started for features and additional usage examples. Specific documentation is available for configuration management, database setup, logging, and using templates.
The HyperShell server can operate in standalone mode alongside the database. Zero or more client instances may come and go as available and process tasks. When deployed in this fashion, the cluster can scale out as necessary as well as scale down to zero. This strategy is appropriate for creating shared, autoscaling, high-throughput pipelines for facilities with multiple users.
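The come-and-go behaviour described here can be pictured as a pull-based queue: the server holds the tasks, and whichever clients happen to be running take work from it. The sketch below uses threads inside one process as stand-ins for clients. It is an assumption-laden illustration, not HyperShell's protocol, and the names `client` and `run_clients` are hypothetical.

```python
from __future__ import annotations

import queue
import threading


def client(name: str, tasks: queue.Queue, done: list, lock: threading.Lock,
           limit: int | None = None) -> None:
    """Pull tasks until the queue is empty or this client reaches its limit.

    A limit models a client that leaves early, for example a short-lived job.
    """
    handled = 0
    while limit is None or handled < limit:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            return  # nothing left to do, so the client exits
        with lock:
            done.append((name, task))  # stand-in for actually running the task
        handled += 1


def run_clients(tasks: queue.Queue, done: list, lock: threading.Lock,
                specs: list[tuple[str, int | None]]) -> None:
    """Start one thread per (name, limit) spec and wait for all of them to exit."""
    threads = [threading.Thread(target=client, args=(name, tasks, done, lock, limit))
               for name, limit in specs]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


if __name__ == '__main__':
    tasks: queue.Queue = queue.Queue()
    for n in range(10):
        tasks.put(n)
    done: list = []
    lock = threading.Lock()

    run_clients(tasks, done, lock, [])                          # zero clients: tasks wait
    run_clients(tasks, done, lock, [('early', 3)])              # one client takes 3 and leaves
    run_clients(tasks, done, lock, [('a', None), ('b', None)])  # scale out and drain the rest
    print(len(done), 'tasks completed;', tasks.qsize(), 'remaining')
```

Because clients pull work rather than having it pushed to them, the queue does not need to know how many clients exist at any moment, which is what allows scaling down to zero.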
HyperShell also provides a library interface for Python applications to embed components. Developers can add HyperShell to their project to provide all of this functionality within their own applications or Python-based workflows.
Domain Use Cases
HyperShell is designed for embarrassingly parallel workloads across many scientific and engineering domains. Whether processing millions of files, running massive parameter sweeps, or executing independent computational tasks, HyperShell provides the infrastructure to scale and manage your workflow with confidence.
Natural Sciences
Genomics / Proteomics: Sequence alignment, variant calling, genome assembly
Bioinformatics: Pipeline orchestration, batch analysis of biological datasets
Pharmacy / Drug Discovery: Molecular docking, virtual screening, clinical trial simulations
Climate Science / Weather Modeling: Ensemble forecasts, climate scenario analysis
Materials Science: Molecular dynamics simulations, high-throughput property screening
Computational Chemistry: Energy calculations, reaction pathway analysis
Geoscience / Seismology: Seismic data processing, geological survey analysis
Neuroscience: Brain imaging analysis, neural network simulations, connectome mapping
High-Energy Physics: Collider data processing, event reconstruction
Astronomy / Physics: Sky surveys, photon-transport simulations, particle physics analysis
Cosmology: Gravitational-wave analysis
Engineering
Computational Fluid Dynamics (CFD): Parameter sweeps for design optimization
Finite Element Analysis (FEA): Structural analysis, stress testing, mesh refinement studies
Computer Vision / Image Processing: Batch image analysis, object detection pipelines
Rendering / Visual Effects (VFX): Distributed rendering, animation frame processing
Network Simulation / Cybersecurity: Traffic analysis, penetration testing, security audits
Financial Modeling / Risk Analysis: Monte Carlo simulations, portfolio optimization
Computer Science & Data Science
Artificial Intelligence: Inference, model evaluation, model training
Machine Learning: Hyperparameter tuning, feature engineering
Data Science: Benchmarking, data preprocessing, analysis
Monte Carlo Simulations: Statistical sampling, uncertainty quantification, stochastic modeling
Mathematics & Optimization
Numerical Methods: Parameter sweeps, sensitivity analysis, convergence studies
Optimization: Grid search, genetic algorithms, multi-objective optimization
Statistical Computing: Bootstrap resampling, permutation tests, computational inference
Support
Join the Discord server to post questions, discuss your project, share with the community, and keep in touch with announcements and upcoming events.
HyperShell is an open-source project developed on GitHub. If you find bugs or other problems with the software, please create an issue. Contributions are welcome in the form of pull requests for bug fixes, documentation, and minor feature improvements.
License
HyperShell is released under the Apache Software License (v2).
Copyright 2019-2026 Geoffrey Lentner.
This program is free software: you can redistribute it and/or modify it under the terms of the Apache License (v2.0) as published by the Apache Software Foundation.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the Apache License for more details.
You should have received a copy of the Apache License along with this program.
Citation
If this software has helped facilitate your research, please consider citing us.
BibTeX citation
@inproceedings{lentner_2022,
    author = {Lentner, Geoffrey and Gorenstein, Lev},
    title = {HyperShell v2: Distributed Task Execution for HPC},
    year = {2022},
    isbn = {9781450391610},
    publisher = {Association for Computing Machinery},
    url = {https://doi.org/10.1145/3491418.3535138},
    doi = {10.1145/3491418.3535138},
    booktitle = {Practice and Experience in Advanced Research Computing},
    articleno = {80},
    numpages = {3},
    series = {PEARC '22}
}