pyiron_base.storage.hdfio.ProjectHDFio#

class pyiron_base.storage.hdfio.ProjectHDFio(project: pyiron_base.project.generic.Project, file_name: str, h5_path: str | None = None, mode: str | None = None)[source]#

Bases: FileHDFio, BaseHDFio

The ProjectHDFio class connects the FileHDFio and the Project class. It is derived from the FileHDFio class, but in addition a Project object instance is available at self.project, enabling direct access to the database and other project-related functionality, some of which is mapped onto the ProjectHDFio class as well.

Parameters:
  • project (Project) – pyiron Project the current HDF5 project is located in

  • file_name (str) – name of the HDF5 file - in contrast to the FileHDFio object where file_name represents the absolute path of the HDF5 file.

  • h5_path (str) – absolute path inside the HDF5 file - starting from the root group

  • mode (str) – file mode: {'a', 'w', 'r', 'r+'}, default 'a'. See the HDFStore docstring or tables.open_file for information about the modes.
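
A minimal construction sketch (the project name "hdf_demo" and the file name "example.h5" are illustrative placeholders, not part of the API):

    from pyiron_base import Project
    from pyiron_base.storage.hdfio import ProjectHDFio

    pr = Project("hdf_demo")  # pyiron project the HDF5 file lives in

    # file_name is only the file name; the project supplies the directory
    hdf = ProjectHDFio(project=pr, file_name="example.h5")
    print(hdf.file_name)  # absolute path to example.h5 on the file system
    print(hdf.h5_path)    # path inside the HDF5 file (root group by default)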

.. attribute:: project

Project instance the ProjectHDFio object is located in

.. attribute:: root_path

the pyiron user directory, defined in the .pyiron configuration

.. attribute:: project_path

the relative path of the current project / folder starting from the root path of the pyiron user directory

.. attribute:: path

the absolute path of the current project / folder plus the absolute path in the HDF5 file as one path

.. attribute:: file_name

absolute path to the HDF5 file

.. attribute:: h5_path

path inside the HDF5 file - also stored as absolute path

.. attribute:: history

previously opened groups / folders

.. attribute:: file_exists

boolean if the HDF5 file was already written

.. attribute:: base_name

name of the HDF5 file but without any file extension

.. attribute:: file_path

directory where the HDF5 file is located

.. attribute:: is_root

boolean if the HDF5 object is located at the root level of the HDF5 file

.. attribute:: is_open

boolean if the HDF5 file is currently opened - if an active file handler exists

.. attribute:: is_empty

boolean if the HDF5 file is empty

.. attribute:: user

current unix/linux/windows user who is running pyiron

.. attribute:: sql_query

an SQL query to limit the jobs within the project to a subset which matches the SQL query.

.. attribute:: db

connection to the SQL database

.. attribute:: working_directory

working directory the job is executed in - outside the HDF5 file

__init__(project: pyiron_base.project.generic.Project, file_name: str, h5_path: str | None = None, mode: str | None = None) None[source]#

Methods

__init__(project, file_name[, h5_path, mode])

clear()

close()

Close the current HDF5 path and return to the path before the last open.

copy()

Copy the ProjectHDFio object - copying just the Python object but maintaining the same pyiron path

copy_to(destination[, file_name, maintain_name])

Copy the content of the HDF5 file to a new location.

create_group(name[, track_order])

Create an HDF5 group - similar to a folder in the filesystem - the HDF5 groups allow the users to structure their data.

create_hdf(path, job_name)

Create a ProjectHDFio object to store project-related information - for testing aggregated data

create_project_from_hdf5()

Internal function to create a pyiron project pointing to the directory where the HDF5 file is located.

create_working_directory()

Create the working directory on the file system if it does not exist already.

file_size()

Get the size of the HDF5 file.

get(key[, default])

Get data from the HDF5 file.

get_from_table(path, name)

Get a specific value from a pandas.DataFrame.

get_job_id(job_specifier)

Get the job_id for the job matching job_specifier in the local project path from the database.

get_pandas(name)

Load a dictionary from the HDF5 file and display the dictionary as a pandas DataFrame.

get_size(hdf)

Get the size of the groups inside the HDF5 file.

groups()

Filter HDF5 file by groups.

hd_copy(hdf_old, hdf_new[, exclude_groups, ...])

Copy data from one HDF5 file to another.

inspect(job_specifier)

Inspect an existing pyiron object - most commonly a job - from the database

items()

List all keys and values as items of all groups and nodes of the HDF5 file.

keys()

List all groups and nodes of the HDF5 file - where groups are equivalent to directories and nodes to files.

list_all()

Returns a dictionary of :meth:`.list_groups()` and :meth:`.list_nodes()`.

list_dirs()

Equivalent to os.listdir (consider groups as equivalent to dirs).

list_groups()

Return a list of names of all nested groups.

list_h5_path([h5_path])

List all groups and nodes of the HDF5 file.

list_nodes()

Return a list of names of all nested nodes.

listdirs()

Equivalent to os.listdir (consider groups as equivalent to dirs).

load(job_specifier[, convert_to_object])

Load an existing pyiron object - most commonly a job - from the database

load_from_jobpath([job_id, db_entry, ...])

Internal function to load an existing job either based on the job ID or based on the database entry dictionary.

nodes()

Filter HDF5 file by nodes.

open(h5_rel_path)

Create an HDF5 group and enter this specific group.

pop(k[,d])

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem()

Remove and return some (key, value) pair as a 2-tuple; raise KeyError if D is empty.

put(key, value)

Store data inside the HDF5 file.

read_dict_from_hdf([group_paths, recursive])

Read data from HDF5 file into a dictionary - by default only the nodes are converted to dictionaries, additional sub groups can be specified using the group_paths parameter.

remove_file()

Remove the HDF5 file with all the related content.

remove_group()

Remove an HDF5 group if it exists.

remove_job(job_specifier[, _unprotect])

Remove a single job from the project based on its job_specifier.

rewrite_hdf5([job_name, info, ...])

Rewrite the entire HDF5 file.

setdefault(k[,d])

show_hdf()

Iterate over the HDF5 data structure and generate a human-readable graph.

to_dict([hierarchical])

Get the content of the HDF5 file at the current h5_path returned as a dictionary.

to_object([class_name])

Load the full pyiron object from an HDF5 file

update([E, ]**F)

If E is present and has a .keys() method, then for k in E.keys(): D[k] = E[k]. If E is present and lacks a .keys() method, then for (k, v) in E: D[k] = v. In either case, this is followed by: for k, v in F.items(): D[k] = v.

values()

List all values for all groups and nodes of the HDF5 file.

write_dict(data_dict[, compression])

Write a dictionary to the HDF5 file.

write_dict_to_hdf(data_dict)

Write a dictionary to HDF5

Attributes

base_name

The absolute path of the current pyiron project - the absolute path on the file system, not including the HDF5 path.

db

Get connection to the SQL database

file_exists

Check if the HDF5 file exists already.

file_name

Get the file name of the HDF5 file.

file_path

Get the directory where the HDF5 file is located.

h5_path

Get the path in the HDF5 file starting from the root group.

is_empty

Check if the HDF5 file is empty.

is_root

Check if the current h5_path is pointing to the HDF5 root group.

name

Get the name of the HDF5 group.

path

Absolute path of the HDF5 group starting from the system root - combination of the absolute system path plus the absolute path inside the HDF5 file starting from the root group.

project

Get the project instance the ProjectHDFio object is located in

project_path

the relative path of the current project / folder starting from the root path of the pyiron user directory

root_path

the pyiron user directory, defined in the .pyiron configuration

sql_query

Get the SQL query for the project

user

Get the current unix/linux/windows user who is running pyiron

working_directory

Get the working directory of the current ProjectHDFio object. The working directory equals the path but it is represented by the filesystem: /absolute/path/to/the/file.h5/path/inside/the/hdf5/file becomes: /absolute/path/to/the/file_hdf5/path/inside/the/hdf5/file.

property base_name: str#

The absolute path of the current pyiron project - the absolute path on the file system, not including the HDF5 path.

Returns:

current project path

Return type:

str

clear() None.  Remove all items from D.#
close() None#

Close the current HDF5 path and return to the path before the last open.

copy() ProjectHDFio[source]#

Copy the ProjectHDFio object - copying just the Python object but maintaining the same pyiron path

Returns:

copy of the ProjectHDFio object

Return type:

ProjectHDFio

copy_to(destination: Pointer, file_name: str = None, maintain_name: bool = True) Pointer#

Copy the content of the HDF5 file to a new location.

Parameters:
  • destination (Pointer) – The Pointer object pointing to the new location.

  • file_name (str, optional) – The name of the new HDF5 file. Defaults to None.

  • maintain_name (bool, optional) – Whether to maintain the names of the HDF5 groups. Defaults to True.

Returns:

The Pointer object pointing to a file which now contains the same content as the current file.

Return type:

Pointer

create_group(name: str, track_order: bool = False) FileHDFio#

Create an HDF5 group - similar to a folder in the filesystem - the HDF5 groups allow the users to structure their data.

Parameters:
  • name (str) – Name of the HDF5 group

  • track_order (bool) – If False, this group tracks its elements in alphanumeric order; if True, in insertion order

Returns:

FileHDFio object pointing to the new group

Return type:

FileHDFio
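
A short sketch of group creation combined with dict-style writes; the group and node names are illustrative, and hdf is a ProjectHDFio instance as constructed above:

    grp = hdf.create_group("input")  # comparable to mkdir inside the HDF5 file
    grp["n_steps"] = 100             # dict-style assignment stores a node
    print(hdf.list_groups())         # ['input']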

create_hdf(path: str, job_name: str) ProjectHDFio[source]#

Create a ProjectHDFio object to store project-related information - for testing aggregated data

Parameters:
  • path (str) – absolute path

  • job_name (str) – name of the HDF5 container

Returns:

HDF5 object

Return type:

ProjectHDFio

create_project_from_hdf5() Project[source]#

Internal function to create a pyiron project pointing to the directory where the HDF5 file is located.

Returns:

pyiron project object

Return type:

Project

create_working_directory() None[source]#

Create the working directory on the file system if it does not exist already.

property db: DatabaseAccess#

Get connection to the SQL database

Returns:

database connection

Return type:

DatabaseAccess

property file_exists: bool#

Check if the HDF5 file exists already.

Returns:

True if the file exists, False otherwise.

Return type:

bool

property file_name: str#

Get the file name of the HDF5 file.

Returns:

The absolute path to the HDF5 file.

Return type:

str

property file_path: str#

Get the directory where the HDF5 file is located.

Returns:

Directory where the HDF5 file is located

Return type:

str

file_size() float#

Get the size of the HDF5 file.

Returns:

The file size in bytes.

Return type:

float

get(key: str, default: object | None = None) Dict | List | float | int#

Get data from the HDF5 file.

Parameters:
  • key (str) – Path to the data or key of the data object

  • default (object) – Default value to return if key doesn’t exist

Returns:

Data or data object

Return type:

Union[Dict, List, float, int]
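
A sketch of reading values back, assuming the node written in the create_group() example above:

    n_steps = hdf.get("input/n_steps")                   # stored value
    missing = hdf.get("input/no_such_node", default=-1)  # falls back to default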

get_from_table(path: str, name: str) Dict | List | float | int#

Get a specific value from a pandas.DataFrame.

Parameters:
  • path (str) – Relative path to the data object

  • name (str) – Parameter key

Returns:

The value associated with the specific parameter key

Return type:

Union[Dict, List, float, int]

get_job_id(job_specifier: str | int) int[source]#

Get the job_id for the job matching job_specifier in the local project path from the database.

Parameters:

job_specifier (str, int) – name of the job or job ID

Returns:

job ID of the job

Return type:

int

get_pandas(name: str) DataFrame#

Load a dictionary from the HDF5 file and display the dictionary as a pandas DataFrame.

Parameters:

name (str) – HDF5 node name

Returns:

The dictionary as a pandas DataFrame object

Return type:

pd.DataFrame

get_size(hdf: FileHDFio) float#

Get the size of the groups inside the HDF5 file.

Parameters:

hdf (FileHDFio) – HDF5 file

Returns:

File size in bytes

Return type:

float

groups() FileHDFio#

Filter HDF5 file by groups.

Returns:

An HDF5 file which is filtered by groups

Return type:

FileHDFio

property h5_path: str#

Get the path in the HDF5 file starting from the root group.

Returns:

The HDF5 path.

Return type:

str

hd_copy(hdf_old: FileHDFio, hdf_new: FileHDFio, exclude_groups: List[str] | None = None, exclude_nodes: List[str] | None = None) None#

Copy data from one HDF5 file to another.

Parameters:
  • hdf_old (FileHDFio) – Source HDF5 file

  • hdf_new (FileHDFio) – Destination HDF5 file

  • exclude_groups (List[str]) – List of groups to exclude from the copy

  • exclude_nodes (List[str]) – List of nodes to exclude from the copy

inspect(job_specifier: str | int) JobCore[source]#

Inspect an existing pyiron object - most commonly a job - from the database

Parameters:

job_specifier (str, int) – name of the job or job ID

Returns:

Access to the HDF5 object - not a GenericJob object - use load() instead.

Return type:

JobCore
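
A sketch contrasting inspect() with load(); the job name "my_job" and the output path are illustrative placeholders for an existing job:

    jc = hdf.inspect("my_job")              # JobCore: fast, HDF5 access only
    print(jc["output/generic/energy_tot"])  # read stored output directly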

property is_empty: bool#

Check if the HDF5 file is empty.

Returns:

True if the file is empty, False otherwise.

Return type:

bool

property is_root: bool#

Check if the current h5_path is pointing to the HDF5 root group.

Returns:

True if the current h5_path is the root group, False otherwise.

Return type:

bool

items() List[Tuple[str, Dict | List | float | int]]#

List all keys and values as items of all groups and nodes of the HDF5 file.

Returns:

List of tuples (key, value)

Return type:

List[Tuple[str, Union[Dict, List, float, int]]]

keys() List[str]#

List all groups and nodes of the HDF5 file - where groups are equivalent to directories and nodes to files.

Returns:

All groups and nodes

Return type:

List[str]

list_all()#

Returns a dictionary of :meth:`.list_groups()` and :meth:`.list_nodes()`.

Returns:

results of :meth:`.list_groups()` under the key "groups"; results of :meth:`.list_nodes()` under the key "nodes"

Return type:

dict
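
A sketch; the output shown is illustrative:

    print(hdf.list_all())
    # {'groups': ['input'], 'nodes': []}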

list_dirs() List[str]#

Equivalent to os.listdir (consider groups as equivalent to dirs).

Returns:

List of groups in pytables for the path self.h5_path

Return type:

List[str]

list_groups()#

Return a list of names of all nested groups.

Returns:

group names

Return type:

list of str

list_h5_path(h5_path: str = '') Dict[str, List[str]]#

List all groups and nodes of the HDF5 file.

Parameters:

h5_path (str, optional) – The path to a group in the HDF5 file from where the data is read. Defaults to “”.

Returns:

A dictionary with keys “groups” and “nodes” containing lists of groups and nodes.

Return type:

Dict[str, List[str]]

list_nodes()#

Return a list of names of all nested nodes.

Returns:

node names

Return type:

list of str

listdirs() List[str]#

Equivalent to os.listdir (consider groups as equivalent to dirs).

Returns:

List of groups in pytables for the path self.h5_path

Return type:

List[str]

load(job_specifier: str | int, convert_to_object: bool = True) GenericJob | JobCore[source]#

Load an existing pyiron object - most commonly a job - from the database

Parameters:
  • job_specifier (str, int) – name of the job or job ID

  • convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file. Default is True. Accessing only the HDF5 file is about an order of magnitude faster but provides limited functionality. Compare the GenericJob object to the JobCore object.

Returns:

Either the full GenericJob object or just a reduced JobCore object

Return type:

GenericJob, JobCore
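
A sketch, assuming a job named "my_job" exists in the project:

    job = hdf.load("my_job")                                # full GenericJob
    job_fast = hdf.load("my_job", convert_to_object=False)  # reduced JobCore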

load_from_jobpath(job_id: int | None = None, db_entry: dict | None = None, convert_to_object: bool = True) GenericJob | JobCore[source]#

Internal function to load an existing job either based on the job ID or based on the database entry dictionary.

Parameters:
  • job_id (int, optional) – Job ID - optional, but either the job_id or the db_entry is required.

  • db_entry (dict, optional) – database entry dictionary - optional, but either the job_id or the db_entry is required.

  • convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file. Default is True. Accessing only the HDF5 file is about an order of magnitude faster but provides limited functionality. Compare the GenericJob object to the JobCore object.

Returns:

Either the full GenericJob object or just a reduced JobCore object

Return type:

GenericJob, JobCore

property name: str#

Get the name of the HDF5 group.

Returns:

The name of the HDF5 group.

Return type:

str

nodes() FileHDFio#

Filter HDF5 file by nodes.

Returns:

An HDF5 file which is filtered by nodes

Return type:

FileHDFio

open(h5_rel_path: str) FileHDFio#

Create an HDF5 group and enter this specific group. If the group exists in the HDF5 path, only the h5_path is set correspondingly, otherwise the group is created first.

Parameters:

h5_rel_path (str) – Relative path from the current HDF5 path - h5_path - to the new group

Returns:

FileHDFio object pointing to the new group

Return type:

FileHDFio
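
A sketch of entering and leaving a group; the group and node names are illustrative:

    h5 = hdf.open("output/generic")        # enter the (newly created) group
    h5["energy_tot"] = [-1.0, -1.5, -1.7]  # written relative to the new h5_path
    h5.close()                             # return to the path before the open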

property path: str#

Absolute path of the HDF5 group starting from the system root - combination of the absolute system path plus the absolute path inside the HDF5 file starting from the root group.

Returns:

absolute path

Return type:

str

pop(k[, d]) v, remove specified key and return the corresponding value.#

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem() (k, v), remove and return some (key, value) pair#

as a 2-tuple; but raise KeyError if D is empty.

property project: pyiron_base.project.generic.Project#

Get the project instance the ProjectHDFio object is located in

Returns:

pyiron project

Return type:

Project

property project_path: str#

the relative path of the current project / folder starting from the root path of the pyiron user directory

Returns:

relative path of the current project / folder

Return type:

str

put(key: str, value: DataFrame | Series | Dict | List | float | int) None#

Store data inside the HDF5 file.

Parameters:
  • key (str) – Key to store the data

  • value (Union[pandas.DataFrame, pandas.Series, Dict, List, float, int]) – Data to store
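
A sketch with illustrative keys, using the value types listed in the signature:

    import pandas as pd

    hdf.put("n_steps", 100)  # scalars, lists and dicts are stored directly
    hdf.put("results", pd.DataFrame({"step": [0, 1], "energy": [-1.0, -1.5]}))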

read_dict_from_hdf(group_paths: List[str] = [], recursive: bool = False) dict#

Read data from HDF5 file into a dictionary - by default only the nodes are converted to dictionaries, additional sub groups can be specified using the group_paths parameter.

Parameters:
  • group_paths (List[str]) – list of additional groups to be included in the dictionary, for example: ["input", "output", "output/generic"]. These groups are defined relative to the h5_path.

  • recursive (bool) – Load all subgroups recursively

Returns:

The loaded data. Can be of any type supported by write_hdf5.

Return type:

Dict
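
A sketch; the group names are illustrative:

    data = hdf.read_dict_from_hdf(group_paths=["input", "output/generic"])
    everything = hdf.read_dict_from_hdf(recursive=True)  # include all subgroups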

remove_file() None#

Remove the HDF5 file with all the related content.

remove_group() None#

Remove an HDF5 group if it exists. If the group does not exist, no error message is raised.

remove_job(job_specifier: str | int, _unprotect: bool = False) None[source]#

Remove a single job from the project based on its job_specifier.

Parameters:
  • job_specifier (Union[str, int]) – Name of the job or job ID.

  • _unprotect (bool) – [True/False] Delete the job without validating the dependencies to other jobs. Default is False.

rewrite_hdf5(job_name: str | None = None, info: bool = False, exclude_groups: List[str] | None = None, exclude_nodes: List[str] | None = None) None#

Rewrite the entire hdf file.

Parameters:
  • job_name (Optional[str]) – Deprecated argument, ignored.

  • info (bool) – Whether to give the information on how much space has been saved.

  • exclude_groups (Optional[List[str]]) – List of groups to exclude from the copy.

  • exclude_nodes (Optional[List[str]]) – List of nodes to exclude from the copy.

property root_path: str#

the pyiron user directory, defined in the .pyiron configuration

Returns:

pyiron user directory of the current project

Return type:

str

setdefault(k[, d]) D.get(k,d), also set D[k]=d if k not in D#
show_hdf() None#

Iterate over the HDF5 data structure and generate a human-readable graph.

property sql_query: str#

Get the SQL query for the project

Returns:

SQL query

Return type:

str

to_dict(hierarchical: bool = False) Dict[str, Any]#

Get the content of the HDF5 file at the current h5_path returned as a dictionary.

Parameters:

hierarchical (bool, optional) – Whether to convert the internal hierarchy of the HDF5 file to a hierarchical dictionary. Defaults to False.

Returns:

A dictionary with the content of the HDF5 file.

Return type:

Dict[str, Any]
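
A sketch of both flavors:

    flat = hdf.to_dict()                     # keys are slash-separated node paths
    nested = hdf.to_dict(hierarchical=True)  # nested dicts mirror the group tree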

to_object(class_name: str | None = None, **kwargs) object[source]#

Load the full pyiron object from an HDF5 file

Parameters:
  • class_name (str, optional) – if the 'TYPE' node is not available in the HDF5 file, a manual object type can be set; it must be given as reported by str(type(obj))

  • **kwargs – optional parameters to override init parameters

Returns:

pyiron object of the given class_name
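
A sketch, assuming the current h5_path points to a group holding a serialized pyiron object:

    obj = hdf.to_object()  # class resolved from the stored 'TYPE' node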

update([E, ]**F) None.  Update D from mapping/iterable E and F.#

If E is present and has a .keys() method, then for k in E.keys(): D[k] = E[k]. If E is present and lacks a .keys() method, then for (k, v) in E: D[k] = v. In either case, this is followed by: for k, v in F.items(): D[k] = v.

property user: str#

Get the current unix/linux/windows user who is running pyiron

Returns:

username

Return type:

str

values() List[Dict | List | float | int]#

List all values for all groups and nodes of the HDF5 file.

Returns:

List of all values

Return type:

List[Union[Dict, List, float, int]]

property working_directory: str#

Get the working directory of the current ProjectHDFio object. The working directory equals the path but it is represented by the filesystem:

/absolute/path/to/the/file.h5/path/inside/the/hdf5/file

becomes:

/absolute/path/to/the/file_hdf5/path/inside/the/hdf5/file

Returns:

absolute path to the working directory

Return type:

str

write_dict(data_dict: Dict[str, Any], compression: int = 4) None#

Write a dictionary to the HDF5 file.

Parameters:
  • data_dict (Dict[str, Any]) –

    Dictionary of data objects to be stored in the HDF5 file; the keys provide the path inside the HDF5 file and the values the data to be stored in those nodes. The corresponding HDF5 groups are created automatically:

        {
            "/hdf5root/group/node_name": {},
            "/hdf5root/group/subgroup/node_name": [...],
        }

  • compression (int, optional) – The compression level to use (0-9) to compress data using gzip. Defaults to 4.
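
A sketch mirroring the layout described above; the paths and values are illustrative:

    hdf.write_dict(
        data_dict={
            "input/n_steps": 100,
            "output/generic/energy_tot": [-1.0, -1.5, -1.7],
        },
        compression=4,  # gzip level 0-9
    )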

write_dict_to_hdf(data_dict: dict) None#

Write a dictionary to HDF5

Parameters:

data_dict (dict) – dictionary with objects which should be written to HDF5