pyiron_base.storage.hdfio.FileHDFio
- class pyiron_base.storage.hdfio.FileHDFio(file_name: str, h5_path: str = '/', mode: str = 'a')
Bases: HasGroups, Pointer
Class that provides all the information needed to access an HDF5 (h5) file. It is based on h5io.py, which can read and write a large variety of jobs from and to HDF5.
Implements HasGroups. Groups are HDF groups in the file, nodes are HDF datasets.
- Parameters:
file_name (str) – absolute path of the HDF5 file
h5_path (str) – absolute path inside the HDF5 file, starting from the root group
mode (str) – file mode, one of {'a', 'w', 'r', 'r+'}; defaults to 'a'. See the HDFStore docstring or tables.open_file for details on the modes.
- file_name
absolute path to the HDF5 file
- h5_path
path inside the HDF5 file, also stored as an absolute path
- history
previously opened groups / folders
- file_exists
boolean, True if the HDF5 file has already been written
- base_name
name of the HDF5 file without the file extension
- file_path
directory where the HDF5 file is located
- is_root
boolean, True if the HDF5 object is located at the root level of the HDF5 file
- is_open
boolean, True if the HDF5 file is currently open, i.e. an active file handler exists
- is_empty
boolean, True if the HDF5 file is empty
Methods
__init__(file_name[, h5_path, mode])
clear() – Remove all items.
close() – Close the current HDF5 path and return to the path before the last open.
copy() – Copy the Python object which links to the HDF5 file, in contrast to copy_to(), which copies the content of the HDF5 file to a new location.
copy_to(destination[, file_name, maintain_name]) – Copy the content of the HDF5 file to a new location.
create_group(name[, track_order]) – Create an HDF5 group, similar to a folder in the filesystem; HDF5 groups allow users to structure their data.
create_project_from_hdf5() – Internal function to create a pyiron project pointing to the directory where the HDF5 file is located.
file_size() – Get the size of the HDF5 file.
get(key[, default]) – Get data from the HDF5 file.
get_from_table(path, name) – Get a specific value from a pandas.DataFrame.
get_pandas(name) – Load a dictionary from the HDF5 file and display the dictionary as a pandas DataFrame.
get_size(hdf) – Get the size of the groups inside the HDF5 file.
groups() – Filter HDF5 file by groups.
hd_copy(hdf_old, hdf_new[, exclude_groups, ...]) – Copy data from one HDF5 file to another.
items() – List all keys and values as items of all groups and nodes of the HDF5 file.
keys() – List all groups and nodes of the HDF5 file, where groups are equivalent to directories and nodes to files.
list_all() – Return a dictionary of list_groups() and list_nodes().
list_dirs() – Equivalent to os.listdir (groups are treated as directories).
list_groups() – Return a list of names of all nested groups.
list_h5_path([h5_path]) – List all groups and nodes of the HDF5 file.
list_nodes() – Return a list of names of all nested nodes.
listdirs() – Equivalent to os.listdir (groups are treated as directories).
nodes() – Filter HDF5 file by nodes.
open(h5_rel_path) – Create an HDF5 group and enter this specific group.
pop(k[, d]) – Remove the specified key and return the corresponding value. If key is not found, d is returned if given, otherwise KeyError is raised.
popitem() – Remove and return some (key, value) pair as a 2-tuple; raise KeyError if D is empty.
put(key, value) – Store data inside the HDF5 file.
read_dict_from_hdf([group_paths, recursive]) – Read data from the HDF5 file into a dictionary. By default only the nodes are converted to dictionaries; additional subgroups can be specified using the group_paths parameter.
remove_file() – Remove the HDF5 file with all the related content.
remove_group() – Remove an HDF5 group if it exists.
rewrite_hdf5([job_name, info, ...]) – Rewrite the entire HDF5 file.
setdefault(k[, d]) – Return D.get(k, d) and also set D[k] = d if k is not in D.
show_hdf() – Iterate over the HDF5 data structure and generate a human-readable graph.
to_dict([hierarchical]) – Get the content of the HDF5 file at the current h5_path returned as a dictionary.
update([E, ]**F) – Update D from mapping/iterable E and F. If E is present and has a .keys() method, does: for k in E.keys(): D[k] = E[k]. If E is present and lacks a .keys() method, does: for (k, v) in E: D[k] = v. In either case, this is followed by: for k, v in F.items(): D[k] = v.
values() – List all values for all groups and nodes of the HDF5 file.
write_dict(data_dict[, compression]) – Write a dictionary to the HDF5 file.
write_dict_to_hdf(data_dict) – Write a dictionary to HDF5.
Attributes
base_name – Get the name of the HDF5 file without the file extension.
file_exists – Check if the HDF5 file exists already.
file_name – Get the file name of the HDF5 file.
file_path – Get the directory where the HDF5 file is located.
h5_path – Get the path in the HDF5 file starting from the root group.
is_empty – Check if the HDF5 file is empty.
is_root – Check if the current h5_path is pointing to the HDF5 root group.
- property base_name: str
Get the name of the HDF5 file without the file extension.
- Returns:
Name of the HDF5 file without the file extension
- Return type:
str
- clear() -> None. Remove all items from D.
- copy() -> FileHDFio
Copy the Python object which links to the HDF5 file - in contrast to copy_to() which copies the content of the HDF5 file to a new location.
- Returns:
New FileHDFio object pointing to the same HDF5 file
- Return type:
FileHDFio
- copy_to(destination: Pointer, file_name: str = None, maintain_name: bool = True) -> Pointer
Copy the content of the HDF5 file to a new location.
- Parameters:
destination (Pointer) – The Pointer object pointing to the new location.
file_name (str, optional) – The name of the new HDF5 file. Defaults to None.
maintain_name (bool, optional) – Whether to maintain the names of the HDF5 groups. Defaults to True.
- Returns:
The Pointer object pointing to a file which now contains the same content as the current file.
- Return type:
Pointer
- create_group(name: str, track_order: bool = False) -> FileHDFio
Create an HDF5 group - similar to a folder in the filesystem - the HDF5 groups allow the users to structure their data.
- Parameters:
name (str) – Name of the HDF5 group
track_order (bool) – If False, this group tracks its elements in alphanumeric order; if True, in insertion order
- Returns:
FileHDFio object pointing to the new group
- Return type:
FileHDFio
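The two orderings selected by track_order can be pictured with plain Python, without touching HDF5. The sketch below only illustrates the ordering rule described above; `list_members` is a hypothetical helper and not part of pyiron_base.

```python
def list_members(names, track_order=False):
    """Return member names in the order a group would report them.

    Hypothetical helper: models only the ordering rule described above.
    """
    if track_order:
        # Insertion order: keep the names exactly as they were added.
        return list(names)
    # Default: alphanumeric order, independent of insertion order.
    return sorted(names)


added = ["step_2", "input", "step_1"]
print(list_members(added))                    # alphanumeric order
print(list_members(added, track_order=True))  # insertion order
```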
- create_project_from_hdf5() -> Project
Internal function to create a pyiron project pointing to the directory where the HDF5 file is located.
- Returns:
pyiron project object
- Return type:
Project
- property file_exists: bool
Check if the HDF5 file exists already.
- Returns:
True if the file exists, False otherwise.
- Return type:
bool
- property file_name: str
Get the file name of the HDF5 file.
- Returns:
The absolute path to the HDF5 file.
- Return type:
str
- property file_path: str
Get the directory where the HDF5 file is located.
- Returns:
Directory where the HDF5 file is located
- Return type:
str
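How base_name and file_path relate to file_name can be sketched with the standard library. This assumes the properties behave like ordinary path splitting, which the descriptions above suggest but do not guarantee; the path used here is hypothetical.

```python
import posixpath

file_name = "/home/user/project/job.h5"  # hypothetical absolute path

# Directory that contains the file (what file_path is described as).
file_path = posixpath.dirname(file_name)

# File name with the last extension removed (what base_name is described as).
base_name = posixpath.splitext(posixpath.basename(file_name))[0]

print(file_path)
print(base_name)
```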
- file_size() -> float
Get the size of the HDF5 file.
- Returns:
The file size in bytes.
- Return type:
float
- get(key: str, default: object | None = None) -> Dict | List | float | int
Get data from the HDF5 file.
- Parameters:
key (str) – Path to the data or key of the data object
default (object) – Default value to return if key doesn’t exist
- Returns:
Data or data object
- Return type:
Union[Dict, List, float, int]
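The combination of a path-style key and a default value can be modelled on a nested dictionary. This is a minimal sketch, not the library's lookup logic: `get_by_path` is a hypothetical helper, and nested dicts stand in for HDF5 groups.

```python
def get_by_path(tree, key, default=None):
    """Look up a '/'-separated key in a nested dict, or return default."""
    current = tree
    for part in key.strip("/").split("/"):
        # Stop as soon as a path component cannot be resolved.
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


tree = {"input": {"cutoff": 3.5}, "status": "finished"}
print(get_by_path(tree, "input/cutoff"))
print(get_by_path(tree, "output/energy", default="not available"))
```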
- get_from_table(path: str, name: str) -> Dict | List | float | int
Get a specific value from a pandas.DataFrame.
- Parameters:
path (str) – Relative path to the data object
name (str) – Parameter key
- Returns:
The value associated with the specific parameter key
- Return type:
Union[Dict, List, float, int]
- get_pandas(name: str) -> DataFrame
Load a dictionary from the HDF5 file and display the dictionary as a pandas DataFrame.
- Parameters:
name (str) – HDF5 node name
- Returns:
The dictionary as a pandas DataFrame object
- Return type:
pd.DataFrame
- get_size(hdf: FileHDFio) -> float
Get the size of the groups inside the HDF5 file.
- Parameters:
hdf (FileHDFio) – HDF5 file
- Returns:
File size in bytes
- Return type:
float
- groups() -> FileHDFio
Filter HDF5 file by groups.
- Returns:
An HDF5 file which is filtered by groups
- Return type:
FileHDFio
- property h5_path: str
Get the path in the HDF5 file starting from the root group.
- Returns:
The HDF5 path.
- Return type:
str
- hd_copy(hdf_old: FileHDFio, hdf_new: FileHDFio, exclude_groups: List[str] | None = None, exclude_nodes: List[str] | None = None) -> None
Copy data from one HDF5 file to another.
- property is_empty: bool
Check if the HDF5 file is empty.
- Returns:
True if the file is empty, False otherwise.
- Return type:
bool
- property is_root: bool
Check if the current h5_path is pointing to the HDF5 root group.
- Returns:
True if the current h5_path is the root group, False otherwise.
- Return type:
bool
- items() -> List[Tuple[str, Dict | List | float | int]]
List all keys and values as items of all groups and nodes of the HDF5 file.
- Returns:
List of (key, value) tuples
- Return type:
List[Tuple[str, Union[Dict, List, float, int]]]
- keys() -> List[str]
List all groups and nodes of the HDF5 file - where groups are equivalent to directories and nodes to files.
- Returns:
All groups and nodes
- Return type:
List[str]
- list_all()
Return a dictionary of list_groups() and list_nodes().
- Returns:
results of list_groups() under the key "groups"; results of list_nodes() under the key "nodes"
- Return type:
dict
- list_dirs() -> List[str]
Equivalent to os.listdir (groups are treated as directories).
- Returns:
List of groups in pytables for the path self.h5_path
- Return type:
List[str]
- list_groups()
Return a list of names of all nested groups.
- Returns:
group names
- Return type:
list of str
- list_h5_path(h5_path: str = '') -> Dict[str, List[str]]
List all groups and nodes of the HDF5 file.
- Parameters:
h5_path (str, optional) – The path to a group in the HDF5 file from where the data is read. Defaults to "".
- Returns:
A dictionary with keys "groups" and "nodes" containing lists of groups and nodes.
- Return type:
Dict[str, List[str]]
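The shape of the returned dictionary can be illustrated with a nested dict standing in for the HDF5 file. This is only a sketch under that assumption: `list_groups_and_nodes` is a hypothetical helper, dict values play the role of groups, and every other value plays the role of a node.

```python
def list_groups_and_nodes(tree):
    """Split the direct children of a nested dict into groups and nodes.

    Hypothetical model: dict values stand in for HDF5 groups,
    every other value stands in for an HDF5 dataset (node).
    """
    result = {"groups": [], "nodes": []}
    for name, value in tree.items():
        kind = "groups" if isinstance(value, dict) else "nodes"
        result[kind].append(name)
    return result


tree = {"input": {"cutoff": 3.5}, "output": {}, "status": "finished"}
listing = list_groups_and_nodes(tree)
print(listing)
```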
- list_nodes()
Return a list of names of all nested nodes.
- Returns:
node names
- Return type:
list of str
- listdirs() -> List[str]
Equivalent to os.listdir (groups are treated as directories).
- Returns:
List of groups in pytables for the path self.h5_path
- Return type:
List[str]
- nodes() -> FileHDFio
Filter HDF5 file by nodes.
- Returns:
An HDF5 file which is filtered by nodes
- Return type:
FileHDFio
- open(h5_rel_path: str) -> FileHDFio
Create an HDF5 group and enter this specific group. If the group exists in the HDF5 path, only the h5_path is set correspondingly, otherwise the group is created first.
- Parameters:
h5_rel_path (str) – Relative path from the current HDF5 path - h5_path - to the new group
- Returns:
FileHDFio object pointing to the new group
- Return type:
FileHDFio
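The interplay between open(), close(), h5_path and history can be modelled with a small stand-in class. This is a sketch of the bookkeeping only, under the assumption that each open() is recorded and undone by the next close(); `PathCursor` is hypothetical, touches no file, and returns itself for brevity, which may differ from what the library does.

```python
import posixpath


class PathCursor:
    """Hypothetical stand-in that tracks only h5_path and history."""

    def __init__(self, h5_path="/"):
        self.h5_path = h5_path
        self.history = []

    def open(self, h5_rel_path):
        # Enter a group relative to the current path and remember the step.
        self.h5_path = posixpath.join(self.h5_path, h5_rel_path.strip("/"))
        self.history.append(h5_rel_path.strip("/"))
        return self

    def close(self):
        # Undo the most recent open(); do nothing when nothing was opened.
        if not self.history:
            return self
        depth = len(self.history.pop().split("/"))
        parts = self.h5_path.strip("/").split("/")
        self.h5_path = "/" + "/".join(parts[:-depth])
        return self


cursor = PathCursor()
cursor.open("output").open("generic/energy")
print(cursor.h5_path)
cursor.close()
print(cursor.h5_path)
```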
- pop(k[, d]) -> v, remove specified key and return the corresponding value.
If key is not found, d is returned if given, otherwise KeyError is raised.
- popitem() -> (k, v), remove and return some (key, value) pair
as a 2-tuple; but raise KeyError if D is empty.
- put(key: str, value: DataFrame | Series | Dict | List | float | int) -> None
Store data inside the HDF5 file.
- Parameters:
key (str) – Key to store the data
value (Union[pandas.DataFrame, pandas.Series, Dict, List, float, int]) – Data to store
- read_dict_from_hdf(group_paths: List[str] = [], recursive: bool = False) -> dict
Read data from the HDF5 file into a dictionary. By default only the nodes are converted to dictionaries; additional subgroups can be specified using the group_paths parameter.
- Parameters:
group_paths (List[str]) – list of additional groups to be included in the dictionary, for example ["input", "output", "output/generic"]. These groups are defined relative to the h5_path.
recursive (bool) – Load all subgroups recursively
- Returns:
The loaded data. Can be of any type supported by write_hdf5.
- Return type:
Dict
- remove_group() -> None
Remove an HDF5 group if it exists. If the group does not exist, no error message is raised.
- rewrite_hdf5(job_name: str | None = None, info: bool = False, exclude_groups: List[str] | None = None, exclude_nodes: List[str] | None = None) -> None
Rewrite the entire HDF5 file.
- Parameters:
job_name (Optional[str]) – Deprecated argument, ignored.
info (bool) – Whether to report how much space has been saved.
exclude_groups (Optional[List[str]]) – List of groups to exclude from the copy.
exclude_nodes (Optional[List[str]]) – List of nodes to exclude from the copy.
- setdefault(k[, d]) -> D.get(k,d), also set D[k]=d if k not in D
- to_dict(hierarchical: bool = False) -> Dict[str, Any]
Get the content of the HDF5 file at the current h5_path returned as a dictionary.
- Parameters:
hierarchical (bool, optional) – Whether to convert the internal hierarchy of the HDF5 file to a hierarchical dictionary. Defaults to False.
- Returns:
A dictionary with the content of the HDF5 file.
- Return type:
Dict[str, Any]
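What a hierarchical dictionary looks like compared with a flat one can be sketched as follows. The exact key format of the flat result is not stated above, so this sketch assumes "/"-separated keys; `to_hierarchical` is a hypothetical helper and the data is made up for illustration.

```python
def to_hierarchical(flat):
    """Turn {'a/b/c': value} into {'a': {'b': {'c': value}}}.

    Assumption: flat keys use '/' as the separator between group names.
    """
    nested = {}
    for path, value in flat.items():
        *groups, node = path.strip("/").split("/")
        level = nested
        for group in groups:
            # Create intermediate levels on demand.
            level = level.setdefault(group, {})
        level[node] = value
    return nested


flat = {
    "input/cutoff": 3.5,
    "output/generic/energy": [-1.0, -1.2],
    "status": "finished",
}
nested = to_hierarchical(flat)
print(nested)
```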
- update([E, ]**F) -> None. Update D from mapping/iterable E and F.
If E is present and has a .keys() method, does: for k in E.keys(): D[k] = E[k]. If E is present and lacks a .keys() method, does: for (k, v) in E: D[k] = v. In either case, this is followed by: for k, v in F.items(): D[k] = v.
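The wording of clear, pop, popitem, setdefault and update matches the mixin methods of Python's collections.abc.MutableMapping, which suggests they are inherited from it. The sketch below shows how those methods behave on any MutableMapping subclass; `DictBackedStore` is a hypothetical in-memory class with a plain dict in place of the HDF5 file, not a pyiron_base class.

```python
from collections.abc import MutableMapping


class DictBackedStore(MutableMapping):
    """Hypothetical in-memory store; a plain dict replaces the HDF5 file."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


store = DictBackedStore()
store.update({"a": 1}, b=2)       # mapping E first, then keyword items F
store.setdefault("c", 3)          # "c" is missing, so it is set to 3
missing = store.pop("zzz", None)  # default returned instead of KeyError
popped = store.pop("a")           # removes "a" and returns its value
print(dict(store), missing, popped)
```

Only the five abstract methods are implemented by hand; update, setdefault, pop, popitem and clear come from the mixin.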
- values() -> List[Dict | List | float | int]
List all values for all groups and nodes of the HDF5 file.
- Returns:
List of all values
- Return type:
List[Union[Dict, List, float, int]]
- write_dict(data_dict: Dict[str, Any], compression: int = 4) -> None
Write a dictionary to the HDF5 file.
- Parameters:
data_dict (Dict[str, Any]) – Dictionary of data objects to be stored in the HDF5 file. The keys provide the path inside the HDF5 file and the values the data to be stored in those nodes. The corresponding HDF5 groups are created automatically:
{
    '/hdf5root/group/node_name': {},
    '/hdf5root/group/subgroup/node_name': […],
}
compression (int, optional) – The compression level to use (0-9) to compress data using gzip. Defaults to 4.
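Which groups a given data_dict implies can be worked out from its keys alone. The following sketch assumes each key reads as a group path followed by a node name, as described above; `groups_for` is a hypothetical helper that writes nothing, and the list value is a placeholder for the elided data.

```python
import posixpath


def groups_for(data_dict):
    """Return the sorted group paths implied by the keys of data_dict.

    Hypothetical helper: each key is read as '<group path>/<node name>'.
    """
    groups = set()
    for path in data_dict:
        group = posixpath.dirname(path)
        # Register the group and every parent up to (not including) the root.
        while group not in ("", "/"):
            groups.add(group)
            group = posixpath.dirname(group)
    return sorted(groups)


data_dict = {
    "/hdf5root/group/node_name": {},
    "/hdf5root/group/subgroup/node_name": [1, 2, 3],  # placeholder values
}
print(groups_for(data_dict))
```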