In some situations, using an online calculator such as the Green Algorithms one isn’t very practical, e.g. when many different jobs are run. Ideally, a tool would automatically collect the details of each job run and estimate the corresponding energy usage and carbon footprint. GA4HPC is a first step in this direction.
High Performance Computing (HPC) clusters tend to log information about all the jobs run on them for accounting purposes, and this information can be pulled.
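For example, with SLURM this accounting information is exposed by `sacct`, whose output can be parsed programmatically. The sketch below is a minimal, hypothetical parser: it assumes pipe-delimited `sacct --parsable2` output and an illustrative field list, not the exact query GA4HPC runs.

```python
# Sketch: parsing SLURM accounting records, as produced by e.g.
#   sacct --format=JobID,Elapsed,NCPUS,TotalCPU,State --parsable2
# The field list and the sample line are illustrative assumptions.

def parse_sacct_line(line: str) -> dict:
    """Turn one pipe-delimited sacct record into a dictionary."""
    fields = ["JobID", "Elapsed", "NCPUS", "TotalCPU", "State"]
    return dict(zip(fields, line.strip().split("|")))

def elapsed_to_seconds(elapsed: str) -> int:
    """Convert sacct's [DD-]HH:MM:SS elapsed format to seconds."""
    days = 0
    if "-" in elapsed:
        day_part, elapsed = elapsed.split("-")
        days = int(day_part)
    h, m, s = (int(x) for x in elapsed.split(":"))
    return ((days * 24 + h) * 3600) + m * 60 + s

# Illustrative record for a 2-core job that ran for 1h30m:
record = parse_sacct_line("123456|01:30:00|2|02:45:12|COMPLETED")
runtime_s = elapsed_to_seconds(record["Elapsed"])  # 5400 seconds
```

Converting the runtime and core counts into numbers like this is what makes it possible to estimate each job's energy use from the logs alone.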
Who is it for?
At this stage, the script works on any HPC cluster using SLURM as a workload manager. It can be adapted to other workload managers; see here for how to add one.
How to install it
It doesn’t require any particular permissions: you just need to clone the GitHub repository onto your HPC drive, enter some information about your data centre, and you’re good to go! Tutorial here
How to use it
Anyone with access to the shared directory where the script is located can run the calculator by running the same command, with various options available:
```
usage: myCarbonFootprint.sh [-h] [-S STARTDAY] [-E ENDDAY] [--filterCWD]
                            [--filterJobIDs FILTERJOBIDS]
                            [--filterAccount FILTERACCOUNT] [--reportBug]
                            [--reportBugHere] [--useCustomLogs USECUSTOMLOGS]

Calculate your carbon footprint on CSD3.

optional arguments:
  -h, --help            show this help message and exit
  -S STARTDAY, --startDay STARTDAY
                        The first day to take into account, as YYYY-MM-DD
                        (default: 2022-01-01)
  -E ENDDAY, --endDay ENDDAY
                        The last day to take into account, as YYYY-MM-DD
                        (default: today)
  --filterCWD           Only report on jobs launched from the current
                        location.
  --filterJobIDs FILTERJOBIDS
                        Comma-separated list of Job IDs you want to filter on.
  --filterAccount FILTERACCOUNT
                        Only consider jobs charged under this account.
  --customSuccessStates CUSTOMSUCCESSSTATES
                        Comma-separated list of job states. By default, only
                        jobs that exit with status CD or COMPLETED are
                        considered successful (PENDING, RUNNING and REQUEUED
                        are ignored). Jobs with states listed here will be
                        considered successful as well (best to list both
                        2-letter and full-length codes). Full list of job
                        states:
                        https://slurm.schedmd.com/squeue.html#SECTION_JOB-STATE-CODES
  --reportBug           In case of a bug, this flag logs job information so
                        that we can fix it. Note that this will write out some
                        basic information about your jobs, such as runtime,
                        number of cores and memory usage.
  --reportBugHere       Similar to --reportBug, but exports the output to your
                        home folder.
  --useCustomLogs USECUSTOMLOGS
                        This bypasses the workload manager and enables you to
                        input a custom log file of your jobs. This is mostly
                        meant for debugging, but can be useful in some
                        situations. An example of the expected file can be
                        found at `example_files/example_sacctOutput_raw.tsv`.
```
Limitations to keep in mind
- The workload manager doesn’t always log the exact CPU usage time; when this information is missing, we assume that all cores are used at 100%.
- For now, we assume that GPU jobs use only 1 GPU, at 100% usage, as the information needed for a more accurate estimate is not always available.
(Both of these assumptions may lead to slightly overestimated carbon footprints, although the order of magnitude is likely to be correct.)
- Conversely, the wasted energy due to memory overallocation may be largely underestimated, as the information needed is not always logged.
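To see why these assumptions matter, here is a rough sketch of the kind of estimate involved, loosely based on the published Green Algorithms model. All constants below (per-core and per-GB power draws, PUE, carbon intensity) are illustrative assumptions, not GA4HPC’s exact values:

```python
# Rough sketch of a Green Algorithms-style energy/carbon estimate.
# Every constant here is an illustrative assumption; the real tool uses
# hardware-specific power draws and the data centre's own PUE.

def estimate_footprint(runtime_h, n_cores, power_per_core_w, core_usage,
                       mem_gb, power_per_gb_w=0.3725, pue=1.67,
                       carbon_intensity_g_per_kwh=231.0):
    """Return (energy_kwh, carbon_g) for one job."""
    power_w = (n_cores * power_per_core_w * core_usage
               + mem_gb * power_per_gb_w)
    energy_kwh = runtime_h * power_w * pue / 1000.0
    return energy_kwh, energy_kwh * carbon_intensity_g_per_kwh

# A 10-hour, 8-core job (12 W/core, 100% usage) that requested 64 GB:
e_used, c_used = estimate_footprint(10, 8, 12.0, 1.0, 64)
# The same job, had it only requested the 8 GB it actually needed:
e_lean, c_lean = estimate_footprint(10, 8, 12.0, 1.0, 8)
# The difference is the energy wasted by memory overallocation.
```

The memory term scales with the memory requested, not the memory actually used, which is why overallocating memory wastes energy even when the job itself runs efficiently; when usage data is missing from the logs, that waste can’t be quantified.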
If you spot any bugs or would like new features, just open a new issue on GitHub.
How to modify the script for my cluster?
See the “Edit code and contribute” page on how to modify the code and share your improvements with other users.
This work is licensed under a Creative Commons Attribution 4.0 International License.