pyiron_base.storage.hdfio.FileHDFio#

class pyiron_base.storage.hdfio.FileHDFio(file_name: str, h5_path: str = '/', mode: str = 'a')[source]#

Bases: HasGroups, Pointer

Class that provides all the information needed to access an HDF5 (h5) file. This class is based on h5io.py, which allows a large variety of jobs to be read from and written to h5 files.

Implements HasGroups. Groups are HDF groups in the file, nodes are HDF datasets.

Parameters:
  • file_name (str) – absolute path of the HDF5 file

  • h5_path (str) – absolute path inside the HDF5 file, starting from the root group

  • mode (str) – file mode, one of {'a', 'w', 'r', 'r+'}; default 'a'. See the HDFStore docstring or tables.open_file for information about modes.

file_name#
absolute path to the HDF5 file
h5_path#
path inside the HDF5 file - also stored as absolute path
history#
previously opened groups / folders
file_exists#
boolean indicating whether the HDF5 file has already been written
base_name#
name of the HDF5 file but without any file extension
file_path#
directory where the HDF5 file is located
is_root#
boolean indicating whether the HDF5 object is located at the root level of the HDF5 file
is_open#
boolean indicating whether the HDF5 file is currently open, i.e. whether an active file handler exists
is_empty#
boolean indicating whether the HDF5 file is empty
__init__(file_name: str, h5_path: str = '/', mode: str = 'a') None[source]#

Methods

__init__(file_name[, h5_path, mode])

clear()

close()

Close the current HDF5 path and return to the path before the last open.

copy()

Copy the Python object which links to the HDF5 file - in contrast to copy_to() which copies the content of the HDF5 file to a new location.

copy_to(destination[, file_name, maintain_name])

Copy the content of the HDF5 file to a new location.

create_group(name[, track_order])

Create an HDF5 group - similar to a folder in the filesystem - the HDF5 groups allow the users to structure their data.

create_project_from_hdf5()

Internal function to create a pyiron project pointing to the directory where the HDF5 file is located.

file_size()

Get the size of the HDF5 file.

get(key[, default])

Get data from the HDF5 file.

get_from_table(path, name)

Get a specific value from a pandas.DataFrame.

get_pandas(name)

Load a dictionary from the HDF5 file and display the dictionary as a pandas DataFrame.

get_size(hdf)

Get the size of the groups inside the HDF5 file.

groups()

Filter HDF5 file by groups.

hd_copy(hdf_old, hdf_new[, exclude_groups, ...])

Copy data from one HDF5 file to another.

items()

List all keys and values as items of all groups and nodes of the HDF5 file.

keys()

List all groups and nodes of the HDF5 file - where groups are equivalent to directories and nodes to files.

list_all()

Return a dictionary of the results of list_groups() and list_nodes().

list_dirs()

Equivalent to os.listdir (consider groups as equivalent to dirs).

list_groups()

Return a list of names of all nested groups.

list_h5_path([h5_path])

List all groups and nodes of the HDF5 file.

list_nodes()

Return a list of names of all nested nodes.

listdirs()

Equivalent to os.listdir (consider groups as equivalent to dirs).

nodes()

Filter HDF5 file by nodes.

open(h5_rel_path)

Create an HDF5 group and enter this specific group.

pop(k[,d])

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem()

Remove and return some (key, value) pair as a 2-tuple; raise KeyError if D is empty.

put(key, value)

Store data inside the HDF5 file.

read_dict_from_hdf([group_paths, recursive])

Read data from HDF5 file into a dictionary - by default only the nodes are converted to dictionaries, additional sub groups can be specified using the group_paths parameter.

remove_file()

Remove the HDF5 file with all the related content.

remove_group()

Remove an HDF5 group if it exists.

rewrite_hdf5([job_name, info, ...])

Rewrite the entire HDF5 file.

setdefault(k[,d])

show_hdf()

Iterate over the HDF5 data structure and generate a human-readable graph.

to_dict([hierarchical])

Get the content of the HDF5 file at the current h5_path returned as a dictionary.

update([E, ]**F)

If E is present and has a .keys() method: for k in E.keys(): D[k] = E[k]. If E is present and lacks a .keys() method: for (k, v) in E: D[k] = v. In either case, this is followed by: for k, v in F.items(): D[k] = v.

values()

List all values for all groups and nodes of the HDF5 file.

write_dict(data_dict[, compression])

Write a dictionary to the HDF5 file.

write_dict_to_hdf(data_dict)

Write a dictionary to the HDF5 file.

Attributes

base_name

Get the name of the HDF5 file without the file extension.

file_exists

Check if the HDF5 file exists already.

file_name

Get the file name of the HDF5 file.

file_path

Get the directory where the HDF5 file is located.

h5_path

Get the path in the HDF5 file starting from the root group.

is_empty

Check if the HDF5 file is empty.

is_root

Check if the current h5_path is pointing to the HDF5 root group.

property base_name: str#

Get the name of the HDF5 file without the file extension.

Returns:

Name of the HDF5 file without the file extension

Return type:

str

clear() None#

Remove all items from D.

close() None[source]#

Close the current HDF5 path and return to the path before the last open.

copy() FileHDFio[source]#

Copy the Python object which links to the HDF5 file - in contrast to copy_to() which copies the content of the HDF5 file to a new location.

Returns:

New FileHDFio object pointing to the same HDF5 file

Return type:

FileHDFio

copy_to(destination: Pointer, file_name: str = None, maintain_name: bool = True) Pointer#

Copy the content of the HDF5 file to a new location.

Parameters:
  • destination (Pointer) – The Pointer object pointing to the new location.

  • file_name (str, optional) – The name of the new HDF5 file. Defaults to None.

  • maintain_name (bool, optional) – Whether to maintain the names of the HDF5 groups. Defaults to True.

Returns:

The Pointer object pointing to a file which now contains the same content as the current file.

Return type:

Pointer

create_group(name: str, track_order: bool = False) FileHDFio[source]#

Create an HDF5 group - similar to a folder in the filesystem - the HDF5 groups allow the users to structure their data.

Parameters:
  • name (str) – Name of the HDF5 group

  • track_order (bool) – If False, this group tracks its elements in alphanumeric order; if True, in insertion order

Returns:

FileHDFio object pointing to the new group

Return type:

FileHDFio

create_project_from_hdf5() Project[source]#

Internal function to create a pyiron project pointing to the directory where the HDF5 file is located.

Returns:

pyiron project object

Return type:

Project

property file_exists: bool#

Check if the HDF5 file exists already.

Returns:

True if the file exists, False otherwise.

Return type:

bool

property file_name: str#

Get the file name of the HDF5 file.

Returns:

The absolute path to the HDF5 file.

Return type:

str

property file_path: str#

Get the directory where the HDF5 file is located.

Returns:

Directory where the HDF5 file is located

Return type:

str
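
The three path-related attributes (file_name, file_path and base_name) can all be derived from the absolute file name. The following is a minimal sketch using only the standard library; describe_hdf_path is a hypothetical helper for illustration and is not part of pyiron_base, and it strips only the last extension, which may differ from how the class handles names with several dots.

```python
import posixpath


def describe_hdf_path(file_name: str) -> dict:
    """Split an absolute file name into the three path-related attributes.

    Hypothetical helper, not part of pyiron_base.
    """
    # Directory part and file part of the absolute path
    directory, full_name = posixpath.split(file_name)
    # Drop the last extension (for example ".h5") from the file part
    base_name, _extension = posixpath.splitext(full_name)
    return {
        "file_name": file_name,
        "file_path": directory,
        "base_name": base_name,
    }


info = describe_hdf_path("/home/user/project/job.h5")
print(info)
```

posixpath is used instead of os.path so that the sketch behaves the same on every platform.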

file_size() float#

Get the size of the HDF5 file.

Returns:

The file size in bytes.

Return type:

float
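
As an illustration of the unit of the return value, the sketch below measures a file in bytes with the standard library. It assumes that the reported size corresponds to the on-disk size of the file; file_size_in_bytes is a hypothetical helper, and the temporary file is filled with placeholder bytes rather than valid HDF5 content.

```python
import os
import tempfile


def file_size_in_bytes(file_name: str) -> float:
    """Return the on-disk size of a file in bytes (hypothetical helper)."""
    return float(os.path.getsize(file_name))


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.h5")
    # Placeholder content: 2048 zero bytes, not a real HDF5 file
    with open(path, "wb") as handle:
        handle.write(b"\0" * 2048)
    size = file_size_in_bytes(path)

print(size, "bytes =", size / 1024, "KiB")
```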

get(key: str, default: object | None = None) Dict | List | float | int[source]#

Get data from the HDF5 file.

Parameters:
  • key (str) – Path to the data or key of the data object

  • default (object) – Default value to return if key doesn’t exist

Returns:

Data or data object

Return type:

Union[Dict, List, float, int]

get_from_table(path: str, name: str) Dict | List | float | int[source]#

Get a specific value from a pandas.DataFrame.

Parameters:
  • path (str) – Relative path to the data object

  • name (str) – Parameter key

Returns:

The value associated with the specific parameter key

Return type:

Union[Dict, List, float, int]

get_pandas(name: str) DataFrame[source]#

Load a dictionary from the HDF5 file and display the dictionary as a pandas DataFrame.

Parameters:

name (str) – HDF5 node name

Returns:

The dictionary as a pandas DataFrame object

Return type:

pd.DataFrame

get_size(hdf: FileHDFio) float[source]#

Get the size of the groups inside the HDF5 file.

Parameters:

hdf (FileHDFio) – HDF5 file

Returns:

File size in Bytes

Return type:

float

groups() FileHDFio[source]#

Filter HDF5 file by groups.

Returns:

An HDF5 file which is filtered by groups

Return type:

FileHDFio

property h5_path: str#

Get the path in the HDF5 file starting from the root group.

Returns:

The HDF5 path.

Return type:

str

hd_copy(hdf_old: FileHDFio, hdf_new: FileHDFio, exclude_groups: List[str] | None = None, exclude_nodes: List[str] | None = None) None[source]#

Copy data from one HDF5 file to another.

Parameters:
  • hdf_old (FileHDFio) – Source HDF5 file

  • hdf_new (FileHDFio) – Destination HDF5 file

  • exclude_groups (List[str]) – List of groups to exclude from the copy

  • exclude_nodes (List[str]) – List of nodes to exclude from the copy

property is_empty: bool#

Check if the HDF5 file is empty.

Returns:

True if the file is empty, False otherwise.

Return type:

bool

property is_root: bool#

Check if the current h5_path is pointing to the HDF5 root group.

Returns:

True if the current h5_path is the root group, False otherwise.

Return type:

bool

items() List[Tuple[str, Dict | List | float | int]][source]#

List all keys and values as items of all groups and nodes of the HDF5 file.

Returns:

List of (key, value) tuples

Return type:

List[Tuple[str, Union[Dict, List, float, int]]]

keys() List[str][source]#

List all groups and nodes of the HDF5 file - where groups are equivalent to directories and nodes to files.

Returns:

All groups and nodes

Return type:

List[str]

list_all()#

Return a dictionary of the results of list_groups() and list_nodes().

Returns:

Results of list_groups() under the key "groups"; results of list_nodes() under the key "nodes"

Return type:

dict

list_dirs() List[str][source]#

Equivalent to os.listdir (consider groups as equivalent to dirs).

Returns:

List of groups in pytables for the path self.h5_path

Return type:

List[str]

list_groups()#

Return a list of names of all nested groups.

Returns:

group names

Return type:

list of str

list_h5_path(h5_path: str = '') Dict[str, List[str]]#

List all groups and nodes of the HDF5 file.

Parameters:

h5_path (str, optional) – The path to a group in the HDF5 file from where the data is read. Defaults to “”.

Returns:

A dictionary with keys “groups” and “nodes” containing lists of groups and nodes.

Return type:

Dict[str, List[str]]
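
The shape of the returned dictionary can be illustrated without an HDF5 file. In the sketch below a nested Python dict stands in for the file: values that are dicts play the role of groups and all other values play the role of nodes. list_level and the example tree are hypothetical and only mimic the documented return format.

```python
def list_level(tree: dict) -> dict:
    """Return the names at one level, split into groups and nodes.

    Hypothetical stand-in: dict values act as groups, everything else as nodes.
    """
    groups = sorted(k for k, v in tree.items() if isinstance(v, dict))
    nodes = sorted(k for k, v in tree.items() if not isinstance(v, dict))
    return {"groups": groups, "nodes": nodes}


tree = {
    "input": {"temperature": 300},
    "output": {"energy": [-1.0, -1.2]},
    "status": "finished",
}

listing = list_level(tree)
# Groups followed by nodes, comparable to what keys() is described to list
all_keys = listing["groups"] + listing["nodes"]
print(listing)
print(all_keys)
```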

list_nodes()#

Return a list of names of all nested nodes.

Returns:

node names

Return type:

list of str

listdirs() List[str][source]#

Equivalent to os.listdir (consider groups as equivalent to dirs).

Returns:

List of groups in pytables for the path self.h5_path

Return type:

List[str]

nodes() FileHDFio[source]#

Filter HDF5 file by nodes.

Returns:

An HDF5 file which is filtered by nodes

Return type:

FileHDFio

open(h5_rel_path: str) FileHDFio[source]#

Create an HDF5 group and enter this specific group. If the group exists in the HDF5 path, only the h5_path is set correspondingly, otherwise the group is created first.

Parameters:

h5_rel_path (str) – Relative path from the current HDF5 path - h5_path - to the new group

Returns:

FileHDFio object pointing to the new group

Return type:

FileHDFio
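
The interplay of open(), close(), h5_path and history can be pictured with a small stand-in that only tracks paths and never touches a file. PathCursor is a hypothetical class written for this illustration; it assumes that open() returns a new object with an extended h5_path and that close() removes as many path components as the last open() added.

```python
import posixpath


class PathCursor:
    """Hypothetical stand-in that tracks only h5_path and history."""

    def __init__(self, h5_path: str = "/"):
        self.h5_path = h5_path
        self.history = []

    @property
    def is_root(self) -> bool:
        return self.h5_path == "/"

    def open(self, h5_rel_path: str) -> "PathCursor":
        # Return a new cursor pointing at the sub-group; remember the step taken
        new = PathCursor(posixpath.join(self.h5_path, h5_rel_path))
        new.history = self.history + [h5_rel_path]
        return new

    def close(self) -> None:
        # Undo the last open(): strip as many components as it added
        last = self.history.pop()
        depth = len(last.split("/"))
        parts = self.h5_path.split("/")
        self.h5_path = "/".join(parts[:-depth]) or "/"


root = PathCursor()
group = root.open("input").open("generic/params")
path_after_open = group.h5_path

group.close()
path_after_first_close = group.h5_path

group.close()
path_after_second_close = group.h5_path
print(path_after_open, path_after_first_close, path_after_second_close)
```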

pop(k[, d])#

Remove the specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem()#

Remove and return some (key, value) pair as a 2-tuple; raise KeyError if D is empty.

put(key: str, value: DataFrame | Series | Dict | List | float | int) None[source]#

Store data inside the HDF5 file.

Parameters:
  • key (str) – Key to store the data

  • value (Union[pandas.DataFrame, pandas.Series, Dict, List, float, int]) – Data to store

read_dict_from_hdf(group_paths: List[str] = [], recursive: bool = False) dict[source]#

Read data from HDF5 file into a dictionary - by default only the nodes are converted to dictionaries, additional sub groups can be specified using the group_paths parameter.

Parameters:
  • group_paths (List[str]) – list of additional groups to be included in the dictionary, for example ["input", "output", "output/generic"]. These groups are defined relative to the h5_path.

  • recursive (bool) – Load all subgroups recursively

Returns:

The loaded data. Can be of any type supported by write_hdf5.

Return type:

Dict

remove_file() None[source]#

Remove the HDF5 file with all the related content.

remove_group() None[source]#

Remove an HDF5 group if it exists. If the group does not exist, no error message is raised.

rewrite_hdf5(job_name: str | None = None, info: bool = False, exclude_groups: List[str] | None = None, exclude_nodes: List[str] | None = None) None[source]#

Rewrite the entire HDF5 file.

Parameters:
  • job_name (Optional[str]) – Deprecated argument, ignored.

  • info (bool) – Whether to give the information on how much space has been saved.

  • exclude_groups (Optional[List[str]]) – List of groups to exclude from the copy.

  • exclude_nodes (Optional[List[str]]) – List of nodes to exclude from the copy.

setdefault(k[, d])#

Return D.get(k, d), and also set D[k] = d if k is not in D.

show_hdf() None[source]#

Iterate over the HDF5 data structure and generate a human-readable graph.

to_dict(hierarchical: bool = False) Dict[str, Any]#

Get the content of the HDF5 file at the current h5_path returned as a dictionary.

Parameters:

hierarchical (bool, optional) – Whether to convert the internal hierarchy of the HDF5 file to a hierarchical dictionary. Defaults to False.

Returns:

A dictionary with the content of the HDF5 file.

Return type:

Dict[str, Any]
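
To illustrate what a hierarchical dictionary means here, the sketch below converts a flat dictionary into a nested one. It assumes that the flat form uses slash-separated paths as keys; nest is a hypothetical helper and the example data is made up.

```python
def nest(flat: dict) -> dict:
    """Convert a flat dict with slash-separated keys into a nested dict.

    Hypothetical helper, not part of pyiron_base.
    """
    nested = {}
    for path, value in flat.items():
        parts = [part for part in path.split("/") if part]
        level = nested
        # Walk (and create) one nested dict per intermediate path component
        for part in parts[:-1]:
            level = level.setdefault(part, {})
        level[parts[-1]] = value
    return nested


flat = {
    "input/temperature": 300,
    "output/generic/energy": [-1.0, -1.2],
    "status": "finished",
}
nested = nest(flat)
print(nested)
```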

update([E, ]**F)#

Update D from mapping/iterable E and F.

  • If E is present and has a .keys() method: for k in E.keys(): D[k] = E[k]

  • If E is present and lacks a .keys() method: for (k, v) in E: D[k] = v

  • In either case, this is followed by: for k, v in F.items(): D[k] = v
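
The docstrings of clear(), pop(), popitem(), setdefault() and update() match those of the mixin methods of Python's collections.abc.MutableMapping. Assuming that is where they come from, the sketch below shows how a class that defines only the five abstract methods receives the others automatically. DictBackedStore is a hypothetical stand-in backed by a plain dict, not pyiron code.

```python
from collections.abc import MutableMapping


class DictBackedStore(MutableMapping):
    """Hypothetical mapping that implements only the abstract methods."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


store = DictBackedStore()
store.update({"a": 1}, b=2)           # E is a mapping, F supplies b=2
existing = store.setdefault("a", 99)  # key exists, stored value is returned
created = store.setdefault("c", 3)    # key is missing, so it is set to 3
popped = store.pop("b")               # removes "b" and returns its value
fallback = store.pop("missing", None) # default is returned instead of KeyError
remaining = dict(store)
store.clear()
```

All calls after the class definition use methods that MutableMapping builds on top of __getitem__, __setitem__, __delitem__, __iter__ and __len__.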

values() List[Dict | List | float | int][source]#

List all values for all groups and nodes of the HDF5 file.

Returns:

List of all values

Return type:

List[Union[Dict, List, float, int]]

write_dict(data_dict: Dict[str, Any], compression: int = 4) None#

Write a dictionary to the HDF5 file.

Parameters:
  • data_dict (Dict[str, Any]) –

    Dictionary of data objects to be stored in the HDF5 file, the keys provide the path inside the HDF5 file and the values the data to be stored in those nodes. The corresponding HDF5 groups are created automatically:

    {
        '/hdf5root/group/node_name': {},
        '/hdf5root/group/subgroup/node_name': […],
    }

  • compression (int, optional) – The compression level to use (0-9) to compress data using gzip. Defaults to 4.
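
Because the keys of data_dict are paths, the groups that have to exist follow directly from the keys. The sketch below derives them with the standard library only; groups_implied_by is a hypothetical helper that illustrates the key format and does not write anything to a file, and the list [1, 2, 3] is placeholder data.

```python
import posixpath


def groups_implied_by(data_dict: dict) -> list:
    """Collect every parent group named in the path-like keys.

    Hypothetical helper, not part of pyiron_base.
    """
    groups = set()
    for key in data_dict:
        parent = posixpath.dirname(key)
        # Walk upwards until the root group is reached
        while parent not in ("", "/"):
            groups.add(parent)
            parent = posixpath.dirname(parent)
    return sorted(groups)


data_dict = {
    "/hdf5root/group/node_name": {},
    "/hdf5root/group/subgroup/node_name": [1, 2, 3],
}
implied = groups_implied_by(data_dict)
print(implied)
```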

write_dict_to_hdf(data_dict: dict) None[source]#

Write a dictionary to the HDF5 file.

Parameters:

data_dict (dict) – dictionary with objects which should be written to HDF5