def __init__(self, dataPaths=None)
def getProjects(self)
def getClusterNodesGroupedByGPUCount(self)
def getSubmittedJobBatchesGroupedByJobBatchSize(self)
def selectProject(self, name)
def getProject(self)
def getRuns(self)
def getNumberOfSweepsByRun(self, run)
def getNumberOfSweepsByRuns(self, rundirs)
def getRunsFilteredByConfig(self, filters)
def getRunsWithMatchingConfiguration(self, configuration, pathTranslators=[])
def getTargetDataSpecifications(self)
def getTargetDataSpecificationStatus(self, specification, defaultTargetSweepCount=None)
def getTargetDataSpecificationsGroupedByStatus(self, defaultTargetSweepCount=None)
def createRundirByTargetDataSpecification(self, specification)
def createSlurmJobScripts(self, rundirs, executablePath, executableOptions, slurmOptions={}, srunOptions={}, chunkSize=1, excludedNodes=None, pathTranslator=None)
Provides information on the data that have been generated by OpenMPCD.
Throughout this class, a "config part generator" is understood to be a
generator in the Python sense (cf. the `yield` keyword), which yields a
list; each element of this list is a dictionary containing:
- `settings`:
A dictionary, with each key being a configuration setting name, and the
value being the corresponding value.
- `pathComponents`:
A list of all path components that the generator requests be added to the
run directory name.
- `targetSweepCount`:
The minimum number of sweeps this configuration must be simulated for,
possibly across multiple runs. Set to `None` if no such minimum is
desired.
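As an illustration of the structure described above, a config part generator might look like the following minimal sketch. The function name, the setting name `bulkThermostat.targetkT`, and all values are hypothetical and chosen only for this example.

```python
def temperature_config_parts():
    """Hypothetical config part generator: yields a single list of dictionaries."""
    parts = []
    for kT in (0.5, 1.0, 2.0):
        parts.append({
            # Maps configuration setting names to values (the name is illustrative).
            "settings": {"bulkThermostat.targetkT": kT},
            # Components the generator asks to have added to the run directory name.
            "pathComponents": ["kT-{}".format(kT)],
            # Minimum number of sweeps, or None if no minimum is desired.
            "targetSweepCount": 100000,
        })
    yield parts
```

Each dictionary in the yielded list describes one configuration variant.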
Definition at line 28 of file DataManager.py.
def MPCDAnalysis.DataManager.DataManager.__init__(self, dataPaths=None)
The constructor.
This requires the file returned by `getConfigurationPath` to be readable,
and to contain, under `dataPaths`, a list of OpenMPCD run directories.
Each entry in the list will be processed through Python's
`glob.glob` function, and as such, special tokens such as '*' may be
used to match a larger number of directories. If any of the matching
directories is found not to be an OpenMPCD run directory, it is ignored.
Furthermore, each entry in `dataPaths` may contain an initial '~'
character, which will be expanded to the user's home directory.
Alternatively, the list of data paths may be supplied as the `dataPaths`
argument, which takes precedence over the configuration file.
@param[in] dataPaths
A list of data paths, or `None` for the default (see function
description).
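The expansion of '~' and glob tokens described above can be sketched as follows. This is not the class's actual implementation: the helper name `expand_data_paths` is hypothetical, and the check for whether a match is an OpenMPCD run directory is replaced here by a plain directory check.

```python
import glob
import os.path


def expand_data_paths(data_paths):
    """Expand a leading '~' and glob tokens such as '*' in each entry."""
    expanded = []
    for entry in data_paths:
        pattern = os.path.expanduser(entry)
        for match in sorted(glob.glob(pattern)):
            # Assumption: stand-in for the "is this an OpenMPCD run directory" test.
            if os.path.isdir(match):
                expanded.append(match)
    return expanded
```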
Definition at line 61 of file DataManager.py.
def MPCDAnalysis.DataManager.DataManager.createSlurmJobScripts(self, rundirs, executablePath, executableOptions, slurmOptions={}, srunOptions={}, chunkSize=1, excludedNodes=None, pathTranslator=None)
For each of the given `rundirs`, creates a Slurm job script at
`input/job.slrm` (relative to the respective rundir) that can be used to
submit a job via `sbatch`, or alternatively, if the rundir is part of a
larger job controlled via a jobscript in another run directory, creates
the file `input/parent-job-path.txt`, which contains the absolute path
to the parent job.
The job script will assume that the OpenMPCD executable will reside at
`executablePath`, which should most probably be an absolute path. The
`--rundir` option, with the respective rundir specification as its
value, will be added to the string of program arguments
`executableOptions`.
`executableOptions` is a string that contains all options that are
passed to the executable upon invocation, as if specified in `bash`.
`slurmOptions` is a dictionary, with each key specifying a Slurm
`sbatch` option (e.g. "-J" or "--job-name") and its value specifying
that option's value. There, the special string "$JOBS_PER_NODE" will be
replaced with the number of individual invocations of the given
executable in the current Slurm job. See `chunkSize` below.
Furthermore, for each value, the special string "$RUNDIR" will be
replaced with the absolute path to the run directory that will contain
the `sbatch` job script.
`srunOptions` is a dictionary, with each key specifying an `srun` option
(e.g. "--gres") and its value specifying that option's value, or `None`
if there is no value.
For each value, the special string "$RUNDIR" will be replaced with the
absolute path to the run directory.
`chunkSize` can be used to specify that one Slurm job should contain
`chunkSize` many individual invocations of the given executable. If
the number of `rundirs` is not divisible by `chunkSize`, an exception is
raised.
If `excludedNodes` is not `None`, it is a string describing,
in Slurm's syntax (e.g. "n25-02[1-2]"), which compute nodes
should be excluded from executing the jobs.
If `pathTranslator` is not `None`, it is called on each absolute path
before writing its value to some file, or returning it from this
function.
This is useful if this program is run on one computer, but the resulting
files will be on another, where the root directory of the project (or
the user's home) is different.
The function returns a list of server paths to the created jobfiles.
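The chunking rule and the placeholder substitution described above can be illustrated with the sketch below. Both helper names are hypothetical and do not appear in `DataManager.py`, and the use of `ValueError` is an assumption, since the documentation only states that an exception is raised.

```python
def chunk_rundirs(rundirs, chunk_size):
    """Split rundirs into groups of chunk_size, one group per Slurm job."""
    if len(rundirs) % chunk_size != 0:
        # Assumption: the documentation does not name the exception type.
        raise ValueError("number of rundirs is not divisible by chunkSize")
    return [rundirs[i:i + chunk_size]
            for i in range(0, len(rundirs), chunk_size)]


def substitute_placeholders(options, rundir, jobs_per_node):
    """Replace "$JOBS_PER_NODE" and "$RUNDIR" in every option value."""
    result = {}
    for key, value in options.items():
        if value is not None:  # options without a value are passed through
            value = str(value).replace("$JOBS_PER_NODE", str(jobs_per_node))
            value = value.replace("$RUNDIR", rundir)
        result[key] = value
    return result
```

With `chunkSize=2` and four run directories, this yields two groups, and "$JOBS_PER_NODE" would be replaced with 2 in each job's options.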
Definition at line 560 of file DataManager.py.
def MPCDAnalysis.DataManager.DataManager.getClusterNodesGroupedByGPUCount(self)
Returns a dictionary, with each key corresponding to the number of GPUs
installed on the individual systems grouped in the corresponding
dictionary value. Each value is a dictionary, with the following
entries:
* "nodes": A list of nodes that fall into that category
* "SlurmNodeList": A string, compatible with the `Slurm` scheduler,
that collects all the nodes in entry `nodes`.
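The shape of the returned dictionary can be illustrated with the following sketch. It assumes, purely for illustration, that the GPU count per node is already known as a plain dictionary, and it builds the `SlurmNodeList` string as a comma-separated list, which Slurm accepts; how the class itself obtains the node data and formats that string is not shown in this documentation. The helper name and node names are hypothetical.

```python
from collections import defaultdict


def group_nodes_by_gpu_count(gpu_counts):
    """Group node names by GPU count; gpu_counts maps node name -> GPU count."""
    nodes_by_count = defaultdict(list)
    for node, count in sorted(gpu_counts.items()):
        nodes_by_count[count].append(node)
    return {
        count: {"nodes": nodes, "SlurmNodeList": ",".join(nodes)}
        for count, nodes in nodes_by_count.items()
    }
```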
Definition at line 135 of file DataManager.py.
def MPCDAnalysis.DataManager.DataManager.getRunsFilteredByConfig(self, filters)
Returns a list of runs that match the given criteria.
The argument `filters` is expected to be a function, or a list of
functions, each taking an object that represents the configuration for a
particular run, returning `False` if it should be filtered out of the
result set, or `True` otherwise (filters applied later might still
remove that configuration from the result set).
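The filter semantics described above (a single function or a list of functions, where a run is kept only if every filter returns `True`) can be sketched as follows. Plain dictionaries stand in for the configuration objects here, and the helper name `apply_filters` and the setting names are hypothetical.

```python
def apply_filters(configs, filters):
    """Keep only the configurations for which every filter returns True."""
    if callable(filters):
        filters = [filters]  # accept a single function as well as a list
    return [config for config in configs
            if all(f(config) for f in filters)]


# Illustrative stand-ins for per-run configuration objects.
configs = [
    {"kT": 1.0, "timestep": 0.1},
    {"kT": 2.0, "timestep": 0.1},
    {"kT": 1.0, "timestep": 0.05},
]
unit_temperature = lambda config: config["kT"] == 1.0
coarse_timestep = lambda config: config["timestep"] >= 0.1
```

A configuration that passes the first filter can still be removed by a later one, as with the third entry when both filters are applied.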
Definition at line 288 of file DataManager.py.
def MPCDAnalysis.DataManager.DataManager.getTargetDataSpecifications(self)
Returns, for the currently selected project, a list of dictionaries;
the latter each contain a key `config`, which contains a `Configuration`
instance, and a key `targetSweepCount`, which contains the number of
sweeps that are desired to be in the data gathered with this
configuration, or `None` if none is specified. Furthermore, the key
`pathComponents` contains a list of all path components that the
generators request be added to the run directory name.
Definition at line 346 of file DataManager.py.
def MPCDAnalysis.DataManager.DataManager.getTargetDataSpecificationsGroupedByStatus(self, defaultTargetSweepCount=None)
Takes the values returned by `getTargetDataSpecifications`, and groups
them into three categories in the returned dictionary: `completed`
contains all the target data specifications that have achieved their
target sweep counts, `pending` contains all the target data
specifications that are not yet completed, but have runs being executed
or being scheduled for execution, and `incomplete` contains the rest.
@param[in] defaultTargetSweepCount
This parameter is used as the sweep count for target data
specifications that do not have a target sweep count set.
If `defaultTargetSweepCount` is `None`, all target
data specifications must specify a target sweep count.
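The grouping itself can be sketched as below. The helper `group_by_status` is hypothetical; it takes the status lookup as a callable so that the example stays independent of how the class determines the status of a specification.

```python
def group_by_status(specifications, get_status):
    """Sort specifications into the three status categories."""
    grouped = {"completed": [], "pending": [], "incomplete": []}
    for specification in specifications:
        # get_status must return "completed", "pending", or "incomplete".
        grouped[get_status(specification)].append(specification)
    return grouped
```

All three keys are present in the result even when a category is empty.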
Definition at line 448 of file DataManager.py.
def MPCDAnalysis.DataManager.DataManager.getTargetDataSpecificationStatus(self, specification, defaultTargetSweepCount=None)
For the given target data `specification`, returns:
- `"completed"`
if the specification has achieved its target sweep count,
- `"pending"`
if it is not yet completed, but has runs being executed or being
scheduled for execution, or
- `"incomplete"` in any other case.
@param[in] defaultTargetSweepCount
This parameter is used as the sweep count for target data
specifications that do not have a target sweep count set.
If `defaultTargetSweepCount` is `None`, all target
data specifications must specify a target sweep count.
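The decision rule described above can be written out as a small sketch. The function `classify` and its parameters are hypothetical; it assumes that "achieved" means the completed sweep count is at least the target, and that a missing target with no default results in a `ValueError`, neither of which is confirmed by this documentation.

```python
def classify(target_sweep_count, completed_sweeps, has_active_runs,
             default_target_sweep_count=None):
    """Return "completed", "pending", or "incomplete" for one specification."""
    if target_sweep_count is None:
        target_sweep_count = default_target_sweep_count
    if target_sweep_count is None:
        # Assumption: the actual error behavior is not documented here.
        raise ValueError("no target sweep count and no default given")
    if completed_sweeps >= target_sweep_count:
        return "completed"
    if has_active_runs:  # runs being executed or scheduled for execution
        return "pending"
    return "incomplete"
```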
Definition at line 404 of file DataManager.py.