dask Clusters
Intro - coming soon
On-Demand Cluster - The Easy Way
Coming soon…
Persistent Cluster
For applications where multiple Reshapr processes need to be run,
setting up and managing a persistent dask cluster that your processes connect to and use avoids
the overhead of starting up a cluster for each Reshapr process.
An example of the kind of processing where this approach is used is running reshapr extract in
a bash loop to resample day-averaged datasets to get month-averaged datasets.
That is a post-processing step that is done as part of running a SalishSeaCast hindcast.
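As a concrete sketch of that kind of loop (the month-avg-YYYYMM.yaml config file names here are hypothetical, not a Reshapr naming convention, and echo makes this a dry run so the loop can be checked before pointing it at a cluster):

```shell
# Dry-run sketch: one reshapr extract run per month to resample
# day-averaged datasets into month-averaged datasets.
# NOTE: the month-avg-YYYYMM.yaml file names are hypothetical;
# remove the leading echo to actually run the extractions.
for yyyymm in 201901 201902 201903; do
    echo reshapr extract month-avg-${yyyymm}.yaml
done
```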
The cluster scheduler,
workers,
and bash loop are all run in separate terminals in a tmux session on salish.
An ssh tunnel can be set up to connect a browser session to the cluster dashboard for
monitoring and analysis of the processing.
Note
In contrast to the on-demand clusters that are created when you just run a reshapr extract
command,
persistent clusters must be managed.
If you create a persistent cluster,
it is your responsibility to shut it down when you are finished with it.
If you know that another group member is also using a persistent cluster, consider coordinating with them to use the same cluster instead of spinning up a new cluster.
Here is a step-by-step example of using a persistent cluster to run reshapr extract in
a bash loop to resample day-averaged datasets to get month-averaged datasets:
1. Create a new tmux session on salish:

   $ tmux new -s month-avg-201905

2. In the first tmux terminal, activate your reshapr conda environment and launch the dask scheduler:

   $ conda activate reshapr
   (reshapr)$ dask-scheduler

   Use Control-b , to rename the tmux terminal to dask-scheduler.

   Make a note of the IP address and port numbers for the scheduler and dashboard in the log output; e.g.

   2022-06-16 12:15:58 - distributed.scheduler - INFO - Scheduler at: tcp://142.103.36.12:8786
   2022-06-16 12:15:58 - distributed.scheduler - INFO - dashboard at: :8787

   8786 and 8787 are the default scheduler and dashboard port numbers, respectively, but you may see different port numbers if there are other clusters already running.

3. Start a second tmux terminal with Control-b c, activate your reshapr conda environment, and launch the first dask worker as a background process using the scheduler IP address and port number noted above:

   $ conda activate reshapr
   (reshapr)$ dask-worker --nworkers=1 --nthreads=1 142.103.36.12:8786 &

   Use Control-b , to rename the tmux terminal to dask-workers.

   Additional workers can be added to the cluster by repeating the same dask-worker command. The log output in the dask-scheduler terminal (Control-b 0) will show the workers joining the cluster.

4. Start a third tmux terminal with Control-b c and activate your reshapr conda environment there too. This is the terminal in which you will run reshapr extract commands.

   To run those commands on the persistent cluster, set the value of the dask cluster item in your extract Process Configuration File to the scheduler IP address and port number noted above; e.g.

   dask cluster: 142.103.36.12:8786

5. Optional: To monitor the cluster in your browser on your laptop or workstation, start a terminal session there and set up an ssh tunnel to the scheduler's dashboard port:

   $ ssh -N -L 8787:salish:8787 salish

   That command creates an ssh tunnel between port 8787 on your laptop/workstation and port 8787 on salish. You can use any number ≥1024 you want instead of 8787 as the local port number on your laptop/workstation. The number after :salish: has to be the scheduler's dashboard port number noted above. The command also assumes that you have an entry for salish in your ~/.ssh/config file.

   Open a new tab in the browser on your laptop/workstation and go to http://localhost:8787/ to see the cluster dashboard.
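When the processing is finished, remember the note above: it is your responsibility to shut the persistent cluster down. A minimal sketch, assuming the worker and scheduler processes and the tmux session name from this example (pkill -f matches against the full command line, and it only signals processes you own, but coordinate first if a group member is sharing the cluster):

```shell
# Sketch: shut the persistent cluster down when you are finished with it.
# Stop any dask-worker background processes you started:
pkill -f dask-worker || true
# Stop the scheduler (or use Control-c in the dask-scheduler terminal):
pkill -f dask-scheduler || true
# End the tmux session if it is still running:
if tmux has-session -t month-avg-201905 2>/dev/null; then
    tmux kill-session -t month-avg-201905
fi
```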