pyiron_base.jobs.script.ScriptJob#
- class pyiron_base.jobs.script.ScriptJob(project, job_name)[source]#
Bases:
GenericJob
The ScriptJob class allows submitting Python scripts and Jupyter notebooks to the pyiron job management system.
- Parameters:
project (ProjectHDFio) – ProjectHDFio instance which points to the HDF5 file the job is stored in
job_name (str) – name of the job, which has to be unique within the project
- Simple example:
- Step 1. Create the notebook to be submitted, e.g. 'example.ipynb', and save it. It can contain arbitrary code, for example:
```
import json

with open('script_output.json', 'w') as f:
    json.dump({'x': [0, 1]}, f)  # dump some data into a JSON file
```
- Step 2. Create the submitter notebook, e.g. 'submit_example_job.ipynb', which submits 'example.ipynb' to the pyiron job management system. It can have the following code (a complete submitter sketch, including running the job, follows Step 4 below):
```
from pyiron_base import Project

pr = Project('scriptjob_example')                 # save the ScriptJob in the 'scriptjob_example' project
scriptjob = pr.create.job.ScriptJob('scriptjob')  # create a ScriptJob named 'scriptjob'
scriptjob.script_path = 'example.ipynb'           # specify the path to the notebook you want to submit
```
- Step 3. Check the job table to get details about 'scriptjob' by using:
```
pr.job_table()
```
- Step 4. If the status of 'scriptjob' is 'finished', load the data from the JSON file into the 'submit_example_job.ipynb' notebook by using:
```
import json

with open(scriptjob.working_directory + '/script_output.json') as f:
    data = json.load(f)  # load the data from the JSON file
```
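For orientation, here is a minimal, hedged sketch of the complete submitter notebook for Steps 2-4. The names follow the simple example above, and scriptjob.run() is the standard pyiron call (documented under run() below) that actually executes the notebook:
```
# minimal sketch of 'submit_example_job.ipynb'; names follow the simple example above
import json
from pyiron_base import Project

pr = Project('scriptjob_example')
scriptjob = pr.create.job.ScriptJob('scriptjob')
scriptjob.script_path = 'example.ipynb'
scriptjob.run()                # execute 'example.ipynb' as a pyiron job

print(pr.job_table())          # Step 3: check the status of 'scriptjob'

# Step 4: once the status is 'finished', read the JSON output written by 'example.ipynb'
with open(scriptjob.working_directory + '/script_output.json') as f:
    data = json.load(f)
print(data['x'])
```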
- More sophisticated example:
The script in ScriptJob can also be more complex, e.g. running its own pyiron calculations. Here we show how it is leveraged to run a multi-core atomistic calculation.
- Step 1. 'example.ipynb' can contain pyiron_atomistics code like:
```
from pyiron_atomistics import Project

pr = Project('example')
job = pr.create.job.Lammps('job')                   # we name the job 'job'
job.structure = pr.create.structure.ase_bulk('Fe')  # specify the structure

# Optional: get an input value from 'submit_example_job.ipynb', the notebook which submits 'example.ipynb'
number_of_cores = pr.data.number_of_cores
job.server.cores = number_of_cores

job.run()  # run the job

# save a custom output, which can be used by the notebook 'submit_example_job.ipynb'
job['user/my_custom_output'] = 16
```
- Step 2. 'submit_example_job.ipynb' can then have the following code:
```
from pyiron_base import Project

pr = Project('scriptjob_example')                 # save the ScriptJob in the 'scriptjob_example' project
scriptjob = pr.create.job.ScriptJob('scriptjob')  # create a ScriptJob named 'scriptjob'
scriptjob.script_path = 'example.ipynb'           # specify the path to the notebook you want to submit;
                                                  # in this example, 'example.ipynb' is in the same
                                                  # directory as 'submit_example_job.ipynb'

# Optional: to submit the notebook to a queueing system
number_of_cores = 1                      # number of cores to be used
scriptjob.server.cores = number_of_cores
scriptjob.server.queue = 'cmfe'          # specify the queue to which the ScriptJob job is to be submitted
scriptjob.server.run_time = 120          # specify the runtime limit for the ScriptJob job in seconds

# Optional: save an input, such that it can be accessed by 'example.ipynb'
pr.data.number_of_cores = number_of_cores
pr.data.write()
```
- Step 3. Check the job table by using:
```
pr.job_table()
```
In addition to details on 'scriptjob', the job table also contains the details of the child job(s), if any, that were submitted within the 'example.ipynb' notebook.
- Step 4. Reload and analyse the child job(s). If the status of a child job is 'finished', it can be loaded into the 'submit_example_job.ipynb' notebook using (a sketch for checking the status programmatically follows this step):
```
job = pr.load('job')  # remember, in Step 1 we ran a job named 'job', which has now 'finished'
```
This loads 'job' into the 'submit_example_job.ipynb' notebook, where it can be used for analysis:
```
job.output.energy_pot[-1]     # via the auto-complete
job['user/my_custom_output']  # the custom output, directly from the HDF5 file
```
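A hedged sketch of how such a status check might look in the submitter notebook. It reuses the project and job names from the example above and assumes that the JobStatus object exposes boolean flags such as status.finished, which is not documented in this section:
```
from pyiron_base import Project

pr = Project('scriptjob_example')
scriptjob = pr.load('scriptjob')          # reload the ScriptJob submitted earlier

if scriptjob.status.finished:             # assumed boolean flag mirroring the status strings below
    job = pr.load('job')                  # the Lammps job created inside 'example.ipynb'
    print(job['user/my_custom_output'])   # custom output saved in Step 1
else:
    print(pr.job_table())                 # still queued or running
```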
- Attributes:
job_name – name of the job, which has to be unique within the project
status – execution status of the job; can be one of the following: [initialized, appended, created, submitted, running, aborted, collect, suspended, refresh, busy, finished]
job_id – unique id to identify the job in the pyiron database
parent_id – job id of the predecessor job - the job which was executed before the current one in the current job series
master_id – job id of the master job - a meta job which groups a series of jobs which are executed either in parallel or in serial
child_ids – list of child job ids - only meta jobs have child jobs - jobs which list the meta job as their master
project – Project instance the job is located in
project_hdf5 – ProjectHDFio instance which points to the HDF5 file the job is stored in
job_info_str – short string describing the job by its job_name and job ID - mainly used for logging
working_directory – working directory the job is executed in - outside the HDF5 file
path – path to the job as a combination of the absolute file system path and the path within the HDF5 file
version – version of the hamiltonian, which is also the version of the executable unless a custom executable is used
executable – executable used to run the job - usually the path to an external executable
library_activated – for job types which offer a Python library, pyiron can use the Python library instead of an external executable
server – Server object to handle the execution environment for the job
queue_id – the ID returned from the queuing system - most likely not the same as the job ID
logger – logger object to monitor the external execution and internal pyiron warnings
restart_file_list – list of files which are used to restart the calculation
job_type – Job type object with all the available job types: ['ExampleJob', 'ParallelMaster', 'ScriptJob', 'ListMaster']
script_path – the absolute path to the Python script
Methods
__init__(project, job_name)
check_if_job_exists([job_name, project]) – Check if a job already exists in a specific project.
check_setup() – Checks whether certain parameters (such as the plane wave cutoff radius in DFT) have been changed from the pyiron standard values to allow for physically meaningful results.
clear_job() – Convenience function to clear job info after suspend.
Compatibility function - but no log files are being collected.
collect_output() – Collect the output files of the external executable and store the information in the HDF5 file.
compress([files_to_compress, files_to_remove]) – Compress the output files of a job object.
convergence_check() – Validate the convergence of the calculation.
copy() – Copy the GenericJob object which links to the job and its HDF5 file.
copy_file_to_working_directory(file) – Copy a specific file to the working directory before the job is executed.
copy_template([project, new_job_name]) – Copy the content of the job including the HDF5 file, but without the output data, to a new location.
copy_to([project, new_job_name, input_only, ...]) – Copy the content of the job including the HDF5 file to a new location.
create_job(job_type, job_name[, ...]) – Create one of the available job types (see the full list under create_job() below).
db_entry() – Generate the initial database entry for the current GenericJob.
decompress() – Decompress the output files of a compressed job object.
Disables the usage of mpi4py for parallel execution.
drop_status_to_aborted() – Change the job status to aborted when the job was intercepted.
Enable the usage of mpi4py for parallel execution.
from_dict(obj_dict[, version]) – Populate the object from the serialized object.
from_hdf([hdf, group_name]) – Restore the GenericJob from an HDF5 file.
from_hdf_args(hdf) – Read arguments for instance creation from an HDF5 file.
get(name[, default]) – Internal wrapper function for __getitem__() - self[name].
get_calculate_function() – Generate the calculate() function.
get_from_table(path, name) – Get a specific value from a pandas.DataFrame.
get_input_parameter_dict() – Get a hierarchical dictionary of input files.
get_job_id([job_specifier]) – Get the job_id for the job named job_name in the local project path from the database.
get_output_parameter_dict()
inspect(job_specifier) – Inspect an existing pyiron object - most commonly a job - from the database.
instantiate(obj_dict[, version]) – Create a blank instance of this class.
interactive_close() – For jobs whose executables are available as a Python library, these can also be executed with a library call instead of calling an external executable.
interactive_fetch() – For jobs whose executables are available as a Python library, these can also be executed with a library call instead of calling an external executable.
interactive_flush([path, include_last_step]) – For jobs whose executables are available as a Python library, these can also be executed with a library call instead of calling an external executable.
is_compressed() – Check if the job is already compressed or not.
is_master_id(job_id) – Check if the job ID job_id is the master ID for any child job.
is_self_archived() – Check if the HDF5 file of the job is compressed as a tar archive.
job_file_name(file_name[, cwd]) – Combine the file name file_name with the path of the current working directory.
kill() – Kill the job.
list_all() – Returns a dictionary of list_groups() and list_nodes().
list_childs() – List child jobs as JobPath objects, without loading the full GenericJob objects for each child.
list_files() – List files inside the working directory.
list_groups() – Return a list of names of all nested groups.
list_nodes() – Return a list of names of all nested nodes.
load(job_specifier[, convert_to_object]) – Load an existing pyiron object - most commonly a job - from the database.
move_to(project) – Move the content of the job including the HDF5 file to a new location.
refresh_job_status() – Refresh the job status by updating it with the status from the database if a job ID is available.
relocate_hdf5([h5_path]) – Relocate the HDF5 file.
remove([_protect_childs]) – Remove the job - this removes the HDF5 file, all data stored in the HDF5 file, and the corresponding database entry.
remove_and_reset_id([_protect_childs]) – Remove the job and reset its ID.
remove_child() – Internal function that also removes child jobs.
rename(new_job_name) – Rename the job by changing the job name.
reset_job_id([job_id]) – Reset the job id: sets the job_id to None in the GenericJob as well as in all connected modules like JobStatus.
restart([job_name, job_type]) – Create a restart calculation from the current calculation - in the GenericJob this is the same as create_job().
run([delete_existing_job, repair, debug, ...]) – The main run function; depending on the job status ['initialized', 'created', 'submitted', 'running', 'collect', 'finished', 'refresh', 'suspended'] the corresponding run mode is chosen.
run_if_interactive() – For jobs whose executables are available as a Python library, these can also be executed with a library call instead of calling an external executable.
run_if_interactive_non_modal() – For jobs whose executables are available as a Python library, these can also be executed with a library call instead of calling an external executable.
Compatibility function - but library run mode is not available.
run_if_modal() – Called by run to execute the simulation while waiting for the output.
run_if_refresh() – Internal helper function called when the job status is 'refresh'.
run_if_scheduler() – Called by run if the user decides to submit the job to a queuing system.
run_static() – Called internally in pyiron to trigger the execution of the executable.
run_time_to_db() – Internal helper function to store the run_time in the database.
save() – Save the object by writing the content to the HDF5 file and storing an entry in the database.
save_output([output_dict, shell_output]) – Store the output of the calculate function in the HDF5 file.
self_archive() – Compress the HDF5 file of the job object to a tar archive.
self_unarchive() – Decompress the HDF5 file of the job object from a tar archive.
set_input_to_read_only() – Enforces read-only mode for the input classes; it has to be implemented in the individual classes.
show_hdf() – Iterate over the HDF5 data structure and generate a human-readable graph.
signal_intercept(sig) – Abort the job and log the signal that caused it.
suspend() – Suspend the job by storing the object and its state persistently in the HDF5 file, and exit it.
to_dict() – Reduce the object to a dictionary.
to_hdf([hdf, group_name]) – Store the GenericJob in an HDF5 file.
to_object([object_type]) – Load the full pyiron object from an HDF5 file.
transfer_from_remote() – Transfer the job from a remote location to the local machine.
update_master([force_update]) – After a job is finished, checks whether it is linked to any meta job - i.e. whether the master ID points to this job's job ID.
validate_ready_to_run() – Validates if the job is ready to run by checking if the script path is provided.
write_input() – Call routines that generate the code-specific input files.
Attributes
calculate_kwargs – Generate keyword arguments for the calculate() function.
child_ids – list of child job ids - only meta jobs have child jobs - jobs which list the meta job as their master
content
database_entry
exclude_groups_hdf – Get the list of groups which are excluded from storing in the HDF5 file.
exclude_nodes_hdf – Get the list of nodes which are excluded from storing in the HDF5 file.
executable – Get the executable used to run the job - usually the path to an external executable.
executor_type
files – Allows browsing the files in a job directory.
files_to_compress
files_to_remove
id – Unique id to identify the job in the pyiron database - use self.job_id instead.
job_id – Unique id to identify the job in the pyiron database.
job_info_str – Short string describing the job by its job_name and job ID - mainly used for logging.
job_name – Get the name of the job, which has to be unique within the project.
job_type – Job type object with all the available job types: ['ExampleJob', 'ParallelMaster', 'ScriptJob', 'ListMaster'].
logger – Get the logger object to monitor the external execution and internal pyiron warnings.
master_id – Get the job id of the master job - a meta job which groups a series of jobs executed either in parallel or in serial.
name – Get the name of the job, which has to be unique within the project.
parent_id – Get the job id of the predecessor job - the job which was executed before the current one in the current job series.
path – Absolute path of the HDF5 group starting from the system root - combination of the absolute system path plus the absolute path inside the HDF5 file starting from the root group.
project – Project instance the job is located in.
project_hdf5 – Get the ProjectHDFio instance which points to the HDF5 file the job is stored in.
queue_id – Get the queue ID, the ID returned from the queuing system - it is most likely not the same as the job ID.
restart_file_dict – A dictionary of the new names of the copied restart files.
restart_file_list – Get the list of files which are used to restart the calculation.
script_path – Python script path.
server – Get the server object to handle the execution environment for the job.
status – Execution status of the job; can be one of the following: [initialized, appended, created, submitted, running, aborted, collect, suspended, refresh, busy, finished].
version – Get the version of the hamiltonian, which is also the version of the executable unless a custom executable is used.
working_directory – Get the working directory the job is executed in - outside the HDF5 file.
- property calculate_kwargs: dict#
Generate keyword arguments for the calculate() function. A new simulation code only has to extend the get_input_parameter_dict() function, which by default specifies a hierarchical dictionary with files_to_write and files_to_copy.
Example:
>>> calculate_function = job.get_calculate_function()
>>> shell_output, parsed_output, job_crashed = calculate_function(**job.calculate_kwargs)
>>> job.save_output(output_dict=parsed_output, shell_output=shell_output)
- Returns:
keyword arguments for the calculate() function
- Return type:
dict
- check_if_job_exists(job_name: str | None = None, project: ProjectHDFio | pyiron_base.project.generic.Project | None = None)#
Check if a job already exists in a specific project.
- Parameters:
job_name (str) – Job name (optional)
project (ProjectHDFio, Project) – Project path (optional)
- Returns:
True / False
- Return type:
(bool)
- check_setup() None #
Checks whether certain parameters (such as the plane wave cutoff radius in DFT) have been changed from the pyiron standard values to allow for physically meaningful results. This function is called manually, or when the job is submitted to the queueing system.
- property child_ids: list#
list of child job ids - only meta jobs have child jobs - jobs which list the meta job as their master
- Returns:
list of child job ids
- Return type:
list
- clear_job() None #
Convenience function to clear job info after suspend. Mimics deletion of all the job info after suspend in a local test environment.
- collect_output() None #
Collect the output files of the external executable and store the information in the HDF5 file. This method has to be implemented in the individual hamiltonians.
- compress(files_to_compress: List[str] | None = None, files_to_remove: List[str] | None = None) None #
Compress the output files of a job object.
- Parameters:
files_to_compress (list) – files to compress (optional)
files_to_remove (list) – files to remove (optional)
- convergence_check() bool #
Validate the convergence of the calculation.
- Returns:
If the calculation is converged
- Return type:
(bool)
- copy() GenericJob #
Copy the GenericJob object which links to the job and its HDF5 file
- Returns:
New GenericJob object pointing to the same job
- Return type:
- copy_file_to_working_directory(file: str) None #
Copy a specific file to the working directory before the job is executed.
- Parameters:
file (str) – path of the file to be copied.
- copy_template(project: ProjectHDFio | JobCore | None = None, new_job_name: None = None) GenericJob #
Copy the content of the job including the HDF5 file but without the output data to a new location
- Parameters:
project (JobCore/ProjectHDFio/Project/None) – The project to copy the job to. (Default is None, use the same project.)
new_job_name (str) – The new name to assign the duplicate job. Required if the project is None or the same project as the copied job. (Default is None, try to keep the same name.)
- Returns:
GenericJob object pointing to the new location.
- Return type:
- copy_to(project: ProjectHDFio | JobCore | None = None, new_job_name: str | None = None, input_only: bool = False, new_database_entry: bool = True, delete_existing_job: bool = False, copy_files: bool = True)#
Copy the content of the job including the HDF5 file to a new location.
- Parameters:
project (JobCore/ProjectHDFio/Project/None) – The project to copy the job to. (Default is None, use the same project.)
new_job_name (str) – The new name to assign the duplicate job. Required if the project is None or the same project as the copied job. (Default is None, try to keep the same name.)
input_only (bool) – [True/False] Whether to copy only the input. (Default is False.)
new_database_entry (bool) – [True/False] Whether to create a new database entry. If input_only is True then new_database_entry is False. (Default is True.)
delete_existing_job (bool) – [True/False] Delete existing job in case it exists already (Default is False.)
copy_files (bool) – If True, also copy all files in the working directory of the job.
- Returns:
GenericJob object pointing to the new location.
- Return type:
- create_job(job_type: str, job_name: str, delete_existing_job: bool = False) GenericJob #
Create one of the following jobs: - ‘StructureContainer’: - ‘StructurePipeline’: - ‘AtomisticExampleJob’: example job just generating random number - ‘ExampleJob’: example job just generating random number - ‘Lammps’: - ‘KMC’: - ‘Sphinx’: - ‘Vasp’: - ‘GenericMaster’: - ‘ParallelMaster’: series of jobs run in parallel - ‘KmcMaster’: - ‘ThermoLambdaMaster’: - ‘RandomSeedMaster’: - ‘MeamFit’: - ‘Murnaghan’: - ‘MinimizeMurnaghan’: - ‘ElasticMatrix’: - ‘ConvergenceEncutParallel’: - ‘ConvergenceKpointParallel’: - ’PhonopyMaster’: - ‘DefectFormationEnergy’: - ‘LammpsASE’: - ‘PipelineMaster’: - ’TransformationPath’: - ‘ThermoIntEamQh’: - ‘ThermoIntDftEam’: - ‘ScriptJob’: Python script or jupyter notebook job container - ‘ListMaster’: list of jobs
- Parameters:
job_type (str) – job type can be [‘StructureContainer’, ‘StructurePipeline’, ‘AtomisticExampleJob’, ‘ExampleJob’, ‘Lammps’, ‘KMC’, ‘Sphinx’, ‘Vasp’, ‘GenericMaster’, ‘ParallelMaster’, ‘KmcMaster’, ‘ThermoLambdaMaster’, ‘RandomSeedMaster’, ‘MeamFit’, ‘Murnaghan’, ‘MinimizeMurnaghan’, ‘ElasticMatrix’, ‘ConvergenceEncutParallel’, ‘ConvergenceKpointParallel’, ’PhonopyMaster’, ‘DefectFormationEnergy’, ‘LammpsASE’, ‘PipelineMaster’, ’TransformationPath’, ‘ThermoIntEamQh’, ‘ThermoIntDftEam’, ‘ScriptJob’, ‘ListMaster’]
job_name (str) – name of the job
delete_existing_job (bool) – delete an existing job - default false
- Returns:
job object depending on the job_type selected
- Return type:
- db_entry() dict #
Generate the initial database entry for the current GenericJob
- Returns:
- database dictionary {“username”, “projectpath”, “project”, “job”, “subjob”, “hamversion”,
”hamilton”, “status”, “computer”, “timestart”, “masterid”, “parentid”}
- Return type:
(dict)
- decompress() None #
Decompress the output files of a compressed job object.
- drop_status_to_aborted() None #
Change the job status to aborted when the job was intercepted.
- property exclude_groups_hdf: list#
Get the list of groups which are excluded from storing in the hdf5 file
- Returns:
groups(list)
- property exclude_nodes_hdf: list#
Get the list of nodes which are excluded from storing in the hdf5 file
- Returns:
nodes(list)
- property executable: Executable#
Get the executable used to run the job - usually the path to an external executable.
- Returns:
executable path
- Return type:
(str/pyiron_base.job.executable.Executable)
- property files: FileBrowser#
Allows browsing the files in a job directory.
By default this object prints itself as a listing of the job directory and the files inside.
>>> job.files
/path/to/my/job:
pyiron.log
error.out
Access to the names of files is provided with list():
>>> job.files.list()
['pyiron.log', 'error.out', 'INCAR']
Access to the contents of files is provided by indexing into this object, which returns a list of the lines in the file:
>>> job.files['error.out']
["Oh no\n", "Something went wrong!\n"]
The tail() method prints the last lines of a file to stdout:
>>> job.files.tail('error.out', lines=1)
Something went wrong!
Files that have valid Python variable names can also be accessed by attribute notation:
>>> job.files.INCAR
File('INCAR')
- from_dict(obj_dict: dict, version: str = None)#
Populate the object from the serialized object.
- Parameters:
obj_dict (dict) – data previously returned from
to_dict()
version (str) – version tag written together with the data
- from_hdf(hdf: ProjectHDFio | None = None, group_name: str | None = None) None #
Restore the GenericJob from an HDF5 file
- Parameters:
hdf (ProjectHDFio) – HDF5 group object - optional
group_name (str) – HDF5 subgroup name - optional
- classmethod from_hdf_args(hdf: ProjectHDFio) dict #
Read arguments for instance creation from HDF5 file
- Parameters:
hdf (ProjectHDFio) – HDF5 group object
- get(name: str, default: Any | None = None) Any #
Internal wrapper function for __getitem__() - self[name]
- Parameters:
name (str, slice) – path to the data or key of the data object
default (any, optional) – return this if key cannot be found
- Returns:
data or data object
- Return type:
dict, list, float, int
- Raises:
ValueError – key cannot be found and default is not given
- get_calculate_function() callable #
Generate calculate() function
Example:
>>> calculate_function = job.get_calculate_function()
>>> shell_output, parsed_output, job_crashed = calculate_function(**job.calculate_kwargs)
>>> job.save_output(output_dict=parsed_output, shell_output=shell_output)
- Returns:
calculate() function
- Return type:
callable
- get_from_table(path: str, name: str) dict | list | float | int #
Get a specific value from a pandas.Dataframe
- Parameters:
path (str) – relative path to the data object
name (str) – parameter key
- Returns:
the value associated to the specific parameter key
- Return type:
dict, list, float, int
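A hedged usage sketch; the path and key below are purely illustrative placeholders, not part of the ScriptJob output layout:
```
# hypothetical path and key - adjust to the table actually stored in your job's HDF5 file
value = job.get_from_table(path='output/my_table', name='my_parameter')
print(value)
```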
- get_input_parameter_dict() dict [source]#
Get a hierarchical dictionary of input files. On the first level the dictionary is divided into file_to_create and files_to_copy. Both are dictionaries that use the file names as keys. In file_to_create the values are strings which represent the content that is going to be written to the corresponding file. In files_to_copy the values are the paths to the source files to be copied.
- Returns:
hierarchical dictionary of input files
- Return type:
dict
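A hedged sketch of the shape such a dictionary might take; the key names follow the description above, and the concrete file names and contents are illustrative only:
```
# illustrative only - the exact keys and file names depend on the job type
input_parameter_dict = {
    'file_to_create': {
        # file name -> content written into the working directory
        'my_input.json': '{"x": [0, 1]}',
    },
    'files_to_copy': {
        # file name -> path of the source file to be copied
        'example.ipynb': '/path/to/example.ipynb',
    },
}
```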
- get_job_id(job_specifier: str | int | None = None) int | None #
Get the job_id for the job named job_name in the local project path from the database.
- Parameters:
job_specifier (str, int) – name of the job or job ID
- Returns:
job ID of the job
- Return type:
int
- property id: int#
Unique id to identify the job in the pyiron database - use self.job_id instead
- Returns:
job id
- Return type:
int
- inspect(job_specifier: str | int) JobCore #
Inspect an existing pyiron object - most commonly a job - from the database
- Parameters:
job_specifier (str, int) – name of the job or job ID
- Returns:
Access to the HDF5 object - not a GenericJob object - use load() instead.
- Return type:
- classmethod instantiate(obj_dict: dict, version: str = None) Self #
Create a blank instance of this class.
This can be used when some values are already necessary for the object's __init__.
- Parameters:
obj_dict (dict) – data previously returned from
to_dict()
version (str) – version tag written together with the data
- Returns:
a blank instance of the object that is sufficiently initialized to call _from_dict() on it
- Return type:
object
- interactive_close() None #
For jobs whose executables are available as a Python library, these can also be executed with a library call instead of calling an external executable. This is usually faster than a single-core Python job. After the interactive execution, the job can be closed using the interactive_close function.
- interactive_fetch() None #
For jobs whose executables are available as a Python library, these can also be executed with a library call instead of calling an external executable. This is usually faster than a single-core Python job. To access the output data during the execution, the interactive_fetch function is used.
- interactive_flush(path: str = 'generic', include_last_step: bool = True) None #
For jobs whose executables are available as a Python library, these can also be executed with a library call instead of calling an external executable. This is usually faster than a single-core Python job. To write the interactive cache to the HDF5 file, the interactive_flush function is used.
- is_compressed() bool #
Check if the job is already compressed or not.
- Returns:
[True/False]
- Return type:
bool
- is_master_id(job_id: int) bool #
Check if the job ID job_id is the master ID for any child job
- Parameters:
job_id (int) – job ID of the master job
- Returns:
[True/False]
- Return type:
bool
- is_self_archived() bool #
Check if the HDF5 file of the Job is compressed as tar-archive
- Returns:
[True/False]
- Return type:
bool
- job_file_name(file_name: str, cwd: str | None = None) str #
combine the file name file_name with the path of the current working directory
- Parameters:
file_name (str) – name of the file
cwd (str) – current working directory - this overwrites self.project_hdf5.working_directory - optional
- Returns:
absolute path to the file in the current working directory
- Return type:
str
- property job_id: int#
Unique id to identify the job in the pyiron database
- Returns:
job id
- Return type:
int
- property job_info_str: str#
Short string to describe the job by its job_name and job ID - mainly used for logging
- Returns:
job info string
- Return type:
str
- property job_name: str#
Get name of the job, which has to be unique within the project
- Returns:
job name
- Return type:
str
- property job_type: str#
Job type object with all the available job types: ['ExampleJob', 'ParallelMaster', 'ScriptJob', 'ListMaster']
- Returns:
Job type object
- Return type:
- kill() None #
Kill the job.
This function is used to terminate the execution of the job. It checks if the job is currently running or submitted, and if so, it removes and resets the job ID. If the job is not running or submitted, a ValueError is raised.
- Returns:
None
- list_all()#
Returns a dictionary of list_groups() and list_nodes().
- Returns:
results of list_groups() under the key "groups"; results of list_nodes() under the key "nodes"
- Return type:
dict
- list_childs() list #
List child jobs as JobPath objects - not loading the full GenericJob objects for each child
- Returns:
list of child jobs
- Return type:
list
- list_files() list #
List files inside the working directory
- Parameters:
extension (str) – filter by a specific extension
- Returns:
list of file names
- Return type:
list
- list_groups()#
Return a list of names of all nested groups.
- Returns:
group names
- Return type:
list of str
- list_nodes()#
Return a list of names of all nested nodes.
- Returns:
node names
- Return type:
list of str
- load(job_specifier: str | int, convert_to_object: bool = True) pyiron_base.job.generic.GenericJob | JobCore #
Load an existing pyiron object - most commonly a job - from the database
- Parameters:
job_specifier (str, int) – name of the job or job ID
convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file - default=True. Accessing only the HDF5 file is about an order of magnitude faster, but only provides limited functionality. Compare the GenericJob object to the JobCore object.
- Returns:
Either the full GenericJob object or just a reduced JobCore object
- Return type:
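A hedged sketch contrasting the two access modes described above, assuming a finished child job named 'job' as in the class example:
```
full_job = scriptjob.load('job', convert_to_object=True)   # full GenericJob object - slower, full functionality
job_core = scriptjob.load('job', convert_to_object=False)  # reduced JobCore object - faster, limited functionality
```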
- property logger#
Get the logger object to monitor the external execution and internal pyiron warnings.
- Returns:
logger object
- Return type:
logging.getLogger()
- property master_id: int#
Get job id of the master job - a meta job which groups a series of jobs, which are executed either in parallel or in serial.
- Returns:
master id
- Return type:
int
- move_to(project: ProjectHDFio) None #
Move the content of the job including the HDF5 file to a new location
- Parameters:
project (ProjectHDFio) – project to move the job to
- property name: str#
Get name of the job, which has to be unique within the project
- Returns:
job name
- Return type:
str
- property parent_id: int#
Get job id of the predecessor job - the job which was executed before the current one in the current job series
- Returns:
parent id
- Return type:
int
- property path: str#
Absolute path of the HDF5 group starting from the system root - combination of the absolute system path plus the absolute path inside the HDF5 file starting from the root group.
- Returns:
absolute path
- Return type:
str
- property project: pyiron_base.project.generic.Project#
Project instance the job is located in
- Returns:
project the job is located in
- Return type:
- property project_hdf5: ProjectHDFio#
Get the ProjectHDFio instance which points to the HDF5 file the job is stored in
- Returns:
HDF5 project
- Return type:
- property queue_id: int#
Get the queue ID, the ID returned from the queuing system - it is most likely not the same as the job ID.
- Returns:
queue ID
- Return type:
int
- refresh_job_status() None #
Refresh job status by updating the job status with the status from the database if a job ID is available.
- relocate_hdf5(h5_path: str | None = None)#
Relocate the hdf file. This function is needed when the child job is spawned by a parent job (cf. pyiron_base.jobs.master.generic)
- remove(_protect_childs: bool = True) None #
Remove the job - this removes the HDF5 file, all data stored in the HDF5 file, and the corresponding database entry.
- Parameters:
_protect_childs (bool) – [True/False] by default child jobs can not be deleted, to maintain the consistency - default=True
- remove_and_reset_id(_protect_childs: bool = True) None #
Remove the job and reset its ID.
- Parameters:
_protect_childs (bool) – Flag indicating whether to protect child jobs (default is True).
- Returns:
None
- remove_child() None #
Internal function that also removes child jobs. Never use this command, since it will destroy the integrity of your project.
- rename(new_job_name: str) None #
Rename the job - by changing the job name
- Parameters:
new_job_name (str) – new job name
- reset_job_id(job_id: int | None = None) None #
Reset the job id: sets the job_id to None in the GenericJob as well as in all connected modules like JobStatus.
- restart(job_name: str | None = None, job_type: str | None = None) GenericJob #
Create a restart calculation from the current calculation - in the GenericJob this is the same as create_job(). A restart is only possible after the current job has finished. If you want to run the same job again with different input parameters, use job.run(delete_existing_job=True) instead.
- Parameters:
job_name (str) – job name of the new calculation - default=<job_name>_restart
job_type (str) – job type of the new calculation - default is the same type as the existing calculation
- Returns:
new job object (GenericJob)
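A hedged sketch of the two options mentioned above; the job names are illustrative and follow the class example:
```
# restart a finished calculation under a new name (defaults to '<job_name>_restart')
new_job = scriptjob.restart(job_name='scriptjob_restart')
new_job.run()

# or: rerun the same job with modified input instead of restarting
scriptjob.run(delete_existing_job=True)
```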
- property restart_file_dict: dict#
A dictionary of the new name of the copied restart files
- property restart_file_list: list#
Get the list of files which are used to restart the calculation from these files.
- Returns:
list of files
- Return type:
list
- run(delete_existing_job: bool = False, repair: bool = False, debug: bool = False, run_mode: str | None = None, run_again: bool = False) None #
This is the main run function; depending on the job status ['initialized', 'created', 'submitted', 'running', 'collect', 'finished', 'refresh', 'suspended'] the corresponding run mode is chosen.
- Parameters:
delete_existing_job (bool) – Delete the existing job and run the simulation again.
repair (bool) – Set the job status to created and run the simulation again.
debug (bool) – Debug Mode - defines the log level of the subprocess the job is executed in.
run_mode (str) – [‘modal’, ‘non_modal’, ‘queue’, ‘manual’] overwrites self.server.run_mode
run_again (bool) – Same as delete_existing_job (deprecated)
- run_if_interactive() None #
For jobs whose executables are available as a Python library, these can also be executed with a library call instead of calling an external executable. This is usually faster than a single-core Python job.
- run_if_interactive_non_modal() None #
For jobs whose executables are available as a Python library, these can also be executed with a library call instead of calling an external executable. This is usually faster than a single-core Python job.
- run_if_modal() None #
The run if modal function is called by run to execute the simulation, while waiting for the output. For this we use subprocess.check_output()
- run_if_refresh() None #
Internal helper function: the run_if_refresh function is called when the job status is 'refresh'. If the job was suspended previously, it is started again and continued.
- run_if_scheduler() None | int #
The run_if_scheduler function is called by run if the user decides to submit the job to a queuing system. The job is submitted to the queuing system using subprocess.Popen().
- Returns:
the queue ID of the job
- Return type:
int
- run_static() None [source]#
The run_static() function is called internally in pyiron to trigger the execution of the executable. This is typically divided into three steps: (1) the generation of the calculate function and its inputs, (2) the execution of this function and (3) storing the output of this function in the HDF5 file.
In the future the execution of the calculate function might be transferred to a separate process, so the separation into these three distinct steps is necessary to simplify the submission to an external executor.
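Conceptually, the three steps correspond to the public helpers shown in the calculate_kwargs example above; a hedged sketch (not the literal implementation):
```
calculate_function = job.get_calculate_function()                         # (1) generate the calculate function
kwargs = job.calculate_kwargs                                             # (1) ... and its inputs
shell_output, parsed_output, job_crashed = calculate_function(**kwargs)   # (2) execute the function
job.save_output(output_dict=parsed_output, shell_output=shell_output)     # (3) store the output in the HDF5 file
```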
- run_time_to_db() None #
Internal helper function to store the run_time in the database
- save() None #
Save the object, by writing the content to the HDF5 file and storing an entry in the database.
- Returns:
Job ID stored in the database
- Return type:
(int)
- save_output(output_dict: dict | None = None, shell_output: str | None = None) None [source]#
Store output of the calculate function in the HDF5 file.
- Parameters:
output_dict (dict) – hierarchical output dictionary to be stored in the HDF5 file.
shell_output (str) – shell output from calling the external executable to be stored in the HDF5 file.
- property script_path: str#
Python script path
- Returns:
absolute path to the python script
- Return type:
str
- self_archive() None #
Compress HDF5 file of the job object to tar-archive
- self_unarchive() None #
Decompress HDF5 file of the job object from tar-archive
- property server: Server#
Get the server object to handle the execution environment for the job.
- Returns:
server object
- Return type:
- set_input_to_read_only() None [source]#
This function enforces read-only mode for the input classes, but it has to be implemented in the individual classes.
- show_hdf() None #
Iterates over the HDF5 data structure and generates a human-readable graph.
- signal_intercept(sig) None #
Abort the job and log the signal that caused it.
Expected to be called from pyiron_base.state.signal.catch_signals().
- Parameters:
sig (int) – the signal that triggered the abort
- property status: str#
Execution status of the job; can be one of the following: [initialized, appended, created, submitted, running, aborted, collect, suspended, refresh, busy, finished]
- Returns:
status
- Return type:
(str/pyiron_base.job.jobstatus.JobStatus)
- suspend() None #
Suspend the job by storing the object and its state persistently in the HDF5 file, and exit it.
- to_dict() dict #
Reduce the object to a dictionary.
- Returns:
serialized state of this object
- Return type:
dict
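A hedged round-trip sketch combining to_dict() with the instantiate() and from_dict() methods documented above; whether this exact sequence is the intended public workflow is an assumption:
```
obj_dict = job.to_dict()                        # serialize the job to a plain dictionary
restored = job.__class__.instantiate(obj_dict)  # blank instance, sufficiently initialized
restored.from_dict(obj_dict)                    # populate it from the serialized state
```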
- to_hdf(hdf: ProjectHDFio | None = None, group_name: str | None = None) None #
Store the GenericJob in an HDF5 file
- Parameters:
hdf (ProjectHDFio) – HDF5 group object - optional
group_name (str) – HDF5 subgroup name - optional
- to_object(object_type: str | None = None, **qwargs) pyiron_base.job.generic.GenericJob #
Load the full pyiron object from an HDF5 file
- Parameters:
object_type – if the ‘TYPE’ node is not available in the HDF5 file a manual object type can be set - optional
**qwargs – optional parameters [‘job_name’, ‘project’] - to specify the location of the HDF5 path
- Returns:
pyiron object
- Return type:
- transfer_from_remote() None #
Transfer the job from a remote location to the local machine.
This method transfers the job from a remote location to the local machine. It performs the following steps:
1. Retrieves the job from the remote location using the queue adapter.
2. Transfers the job file to the remote location, with the option to delete the file on the remote location after transfer.
3. Updates the project database if it is disabled, otherwise updates the file table in the database with the job information.
- Parameters:
None
- Returns:
None
- update_master(force_update: bool = False) None #
After a job is finished, it checks whether it is linked to any meta job - meaning the master ID points to this job's job ID. If this is the case and the master job is in status suspended, the child wakes up the master job, sets its status to refresh, and executes run on the master job. During the execution the master job is set to status refresh. If another child calls update_master while the master is in refresh, the status of the master is set to busy, and if the master is in status busy at the end of the update_master process, another update is triggered.
- Parameters:
force_update (bool) – Whether to check run mode for updating master
- validate_ready_to_run() None [source]#
Validates if the job is ready to run by checking if the script path is provided.
- Raises:
TypeError – If the script path is not provided (i.e., is None).
- property version: str#
Get the version of the hamiltonian, which is also the version of the executable unless a custom executable is used.
- Returns:
version number
- Return type:
str
- property working_directory: str#
Get the working directory the job is executed in - outside the HDF5 file. The working directory equals the path, but it is represented by the filesystem:
/absolute/path/to/the/file.h5/path/inside/the/hdf5/file
- becomes:
/absolute/path/to/the/file_hdf5/path/inside/the/hdf5/file
- Returns:
absolute path to the working directory
- Return type:
str
- write_input() None #
Call routines that generate the code-specific input files.