pyiron_base.storage.hdfio.ProjectHDFio#
- class pyiron_base.storage.hdfio.ProjectHDFio(project: pyiron_base.project.generic.Project, file_name: str, h5_path: str | None = None, mode: str | None = None)[source]#
The ProjectHDFio class connects the FileHDFio and the Project classes. It is derived from FileHDFio, but in addition a Project object instance is available at self.project, enabling direct access to the database and other project related functionality, some of which is mapped onto the ProjectHDFio class as well.
- Parameters:
project (Project) – pyiron Project the current HDF5 project is located in
file_name (str) – name of the HDF5 file - in contrast to the FileHDFio object where file_name represents the absolute path of the HDF5 file.
h5_path (str) – absolute path inside the h5 path - starting from the root group
mode (str) – file mode: {‘a’, ‘w’, ‘r’, ‘r+’}, default ‘a’. See the HDFStore docstring or tables.open_file for information about the modes.
- .. attribute:: project
Project instance the ProjectHDFio object is located in
- .. attribute:: root_path
the pyiron user directory, defined in the .pyiron configuration
- .. attribute:: project_path
the relative path of the current project / folder starting from the root path of the pyiron user directory
- .. attribute:: path
the absolute path of the current project / folder plus the absolute path in the HDF5 file as one path
- .. attribute:: file_name
absolute path to the HDF5 file
- .. attribute:: h5_path
path inside the HDF5 file - also stored as absolute path
- .. attribute:: history
previously opened groups / folders
- .. attribute:: file_exists
boolean if the HDF5 file was already written
- .. attribute:: base_name
name of the HDF5 file but without any file extension
- .. attribute:: file_path
directory where the HDF5 file is located
- .. attribute:: is_root
boolean if the HDF5 object is located at the root level of the HDF5 file
- .. attribute:: is_open
boolean if the HDF5 file is currently opened - if an active file handler exists
- .. attribute:: is_empty
boolean if the HDF5 file is empty
- .. attribute:: user
current unix/linux/windows user who is running pyiron
- .. attribute:: sql_query
an SQL query to limit the jobs within the project to a subset which matches the SQL query.
- .. attribute:: db
connection to the SQL database
- .. attribute:: working_directory
working directory the job is executed in - outside the HDF5 file
- __init__(project: pyiron_base.project.generic.Project, file_name: str, h5_path: str | None = None, mode: str | None = None) None [source]#
Methods

__init__(project, file_name[, h5_path, mode])
    Initialize the ProjectHDFio object.
clear()
    Remove all items from the HDF5 mapping.
close()
    Close the current HDF5 path and return to the path before the last open.
copy()
    Copy the ProjectHDFio object - copying just the Python object while maintaining the same pyiron path.
copy_to(destination[, file_name, maintain_name])
    Copy the content of the HDF5 file to a new location.
create_group(name[, track_order])
    Create an HDF5 group - similar to a folder in the filesystem - which allows users to structure their data.
create_hdf(path, job_name)
    Create a ProjectHDFio object to store project related information - for testing aggregated data.
create_project_from_hdf5()
    Internal function to create a pyiron project pointing to the directory where the HDF5 file is located.
create_working_directory()
    Create the working directory on the file system if it does not exist already.
file_size()
    Get the size of the HDF5 file.
get(key[, default])
    Get data from the HDF5 file.
get_from_table(path, name)
    Get a specific value from a pandas.DataFrame.
get_job_id(job_specifier)
    Get the job ID for the specified job in the local project path from the database.
get_pandas(name)
    Load a dictionary from the HDF5 file and display it as a pandas DataFrame.
get_size(hdf)
    Get the size of the groups inside the HDF5 file.
groups()
    Filter the HDF5 file by groups.
hd_copy(hdf_old, hdf_new[, exclude_groups, ...])
    Copy data from one HDF5 file to another.
inspect(job_specifier)
    Inspect an existing pyiron object - most commonly a job - from the database.
items()
    List all keys and values as items of all groups and nodes of the HDF5 file.
keys()
    List all groups and nodes of the HDF5 file - groups are equivalent to directories and nodes to files.
list_all()
    Return a dictionary of list_groups() and list_nodes().
list_dirs()
    Equivalent to os.listdir (consider groups as equivalent to dirs).
list_groups()
    Return a list of names of all nested groups.
list_h5_path([h5_path])
    List all groups and nodes of the HDF5 file.
list_nodes()
    Return a list of names of all nested nodes.
listdirs()
    Equivalent to os.listdir (consider groups as equivalent to dirs).
load(job_specifier[, convert_to_object])
    Load an existing pyiron object - most commonly a job - from the database.
load_from_jobpath([job_id, db_entry, ...])
    Internal function to load an existing job either based on the job ID or on the database entry dictionary.
nodes()
    Filter the HDF5 file by nodes.
open(h5_rel_path)
    Create an HDF5 group and enter this specific group.
pop(k[, d])
    Remove the specified key and return the corresponding value; if the key is not found, d is returned if given, otherwise KeyError is raised.
popitem()
    Remove and return some (key, value) pair as a 2-tuple; raise KeyError if empty.
put(key, value)
    Store data inside the HDF5 file.
read_dict_from_hdf([group_paths, recursive])
    Read data from the HDF5 file into a dictionary - by default only the nodes are converted; additional sub groups can be specified using the group_paths parameter.
remove_file()
    Remove the HDF5 file with all the related content.
remove_group()
    Remove an HDF5 group if it exists.
remove_job(job_specifier[, _unprotect])
    Remove a single job from the project based on its job_specifier.
rewrite_hdf5([job_name, info, ...])
    Rewrite the entire HDF5 file.
setdefault(k[, d])
    Return the value for k, setting it to d first if k is not present.
show_hdf()
    Iterate over the HDF5 data structure and generate a human-readable graph.
to_dict([hierarchical])
    Get the content of the HDF5 file at the current h5_path as a dictionary.
to_object([class_name])
    Load the full pyiron object from an HDF5 file.
update([E, ]**F)
    Update the mapping from mapping/iterable E and keyword arguments F.
values()
    List all values for all groups and nodes of the HDF5 file.
write_dict(data_dict[, compression])
    Write a dictionary to the HDF5 file.
write_dict_to_hdf(data_dict)
    Write a dictionary to HDF5.
Attributes

base_name
    The absolute path of the current pyiron project - absolute path on the file system, not including the HDF5 path.
db
    Get the connection to the SQL database.
file_exists
    Check if the HDF5 file exists already.
file_name
    Get the file name of the HDF5 file.
file_path
    Get the directory where the HDF5 file is located.
h5_path
    Get the path in the HDF5 file starting from the root group.
is_empty
    Check if the HDF5 file is empty.
is_root
    Check if the current h5_path is pointing to the HDF5 root group.
name
    Get the name of the HDF5 group.
path
    Absolute path of the HDF5 group starting from the system root - combination of the absolute system path plus the absolute path inside the HDF5 file starting from the root group.
project
    Get the project instance the ProjectHDFio object is located in.
project_path
    The relative path of the current project / folder starting from the root path of the pyiron user directory.
root_path
    The pyiron user directory, defined in the .pyiron configuration.
sql_query
    Get the SQL query for the project.
user
    Get the current unix/linux/windows user who is running pyiron.
working_directory
    Get the working directory of the current ProjectHDFio object. The working directory equals the path but is represented on the filesystem: /absolute/path/to/the/file.h5/path/inside/the/hdf5/file becomes /absolute/path/to/the/file_hdf5/path/inside/the/hdf5/file.
- property base_name: str#
The absolute path of the current pyiron project - absolute path on the file system, not including the HDF5 path.
- Returns:
current project path
- Return type:
str
- clear() None. Remove all items from D. #
- close() None #
Close the current HDF5 path and return to the path before the last open.
- copy() ProjectHDFio [source]#
Copy the ProjectHDFio object - copying just the Python object but maintaining the same pyiron path
- Returns:
copy of the ProjectHDFio object
- Return type:
ProjectHDFio
- copy_to(destination: Pointer, file_name: str = None, maintain_name: bool = True) Pointer #
Copy the content of the HDF5 file to a new location.
- Parameters:
destination (Pointer) – The Pointer object pointing to the new location.
file_name (str, optional) – The name of the new HDF5 file. Defaults to None.
maintain_name (bool, optional) – Whether to maintain the names of the HDF5 groups. Defaults to True.
- Returns:
The Pointer object pointing to a file which now contains the same content as the current file.
- Return type:
Pointer
- create_group(name: str, track_order: bool = False) FileHDFio #
Create an HDF5 group - similar to a folder in the filesystem - the HDF5 groups allow the users to structure their data.
- Parameters:
name (str) – Name of the HDF5 group
track_order (bool) – If False, the group tracks its elements in alphanumeric order; if True, in insertion order
- Returns:
FileHDFio object pointing to the new group
- Return type:
FileHDFio
- create_hdf(path: str, job_name: str) ProjectHDFio [source]#
Create a ProjectHDFio object to store project related information - for testing aggregated data
- Parameters:
path (str) – absolute path
job_name (str) – name of the HDF5 container
- Returns:
HDF5 object
- Return type:
ProjectHDFio
- create_project_from_hdf5() Project [source]#
Internal function to create a pyiron project pointing to the directory where the HDF5 file is located.
- Returns:
pyiron project object
- Return type:
Project
- create_working_directory() None [source]#
Create the working directory on the file system if it does not exist already.
- property db: DatabaseAccess#
Get connection to the SQL database
- Returns:
database connection
- Return type:
DatabaseAccess
- property file_exists: bool#
Check if the HDF5 file exists already.
- Returns:
True if the file exists, False otherwise.
- Return type:
bool
- property file_name: str#
Get the file name of the HDF5 file.
- Returns:
The absolute path to the HDF5 file.
- Return type:
str
- property file_path: str#
Get the directory where the HDF5 file is located.
- Returns:
Directory where the HDF5 file is located
- Return type:
str
- file_size() float #
Get the size of the HDF5 file.
- Returns:
The file size in bytes.
- Return type:
float
- get(key: str, default: object | None = None) Dict | List | float | int #
Get data from the HDF5 file.
- Parameters:
key (str) – Path to the data or key of the data object
default (object) – Default value to return if key doesn’t exist
- Returns:
Data or data object
- Return type:
Union[Dict, List, float, int]
- get_from_table(path: str, name: str) Dict | List | float | int #
Get a specific value from a pandas.DataFrame.
- Parameters:
path (str) – Relative path to the data object
name (str) – Parameter key
- Returns:
The value associated with the specific parameter key
- Return type:
Union[Dict, List, float, int]
- get_job_id(job_specifier: str | int) int [source]#
Get the job ID for the specified job in the local project path from the database.
- Parameters:
job_specifier (str, int) – name of the job or job ID
- Returns:
job ID of the job
- Return type:
int
- get_pandas(name: str) DataFrame #
Load a dictionary from the HDF5 file and display the dictionary as a pandas DataFrame.
- Parameters:
name (str) – HDF5 node name
- Returns:
The dictionary as a pandas DataFrame object
- Return type:
pd.DataFrame
- get_size(hdf: FileHDFio) float #
Get the size of the groups inside the HDF5 file.
- Parameters:
hdf (FileHDFio) – HDF5 file
- Returns:
File size in Bytes
- Return type:
float
- groups() FileHDFio #
Filter HDF5 file by groups.
- Returns:
An HDF5 file which is filtered by groups
- Return type:
FileHDFio
- property h5_path: str#
Get the path in the HDF5 file starting from the root group.
- Returns:
The HDF5 path.
- Return type:
str
- hd_copy(hdf_old: FileHDFio, hdf_new: FileHDFio, exclude_groups: List[str] | None = None, exclude_nodes: List[str] | None = None) None #
Copy data from one HDF5 file to another.
- inspect(job_specifier: str | int) JobCore [source]#
Inspect an existing pyiron object - most commonly a job - from the database
- Parameters:
job_specifier (str, int) – name of the job or job ID
- Returns:
Access to the HDF5 object - not a GenericJob object - use load() instead.
- Return type:
JobCore
- property is_empty: bool#
Check if the HDF5 file is empty.
- Returns:
True if the file is empty, False otherwise.
- Return type:
bool
- property is_root: bool#
Check if the current h5_path is pointing to the HDF5 root group.
- Returns:
True if the current h5_path is the root group, False otherwise.
- Return type:
bool
- items() List[Tuple[str, Dict | List | float | int]] #
List all keys and values as items of all groups and nodes of the HDF5 file.
- Returns:
List of tuples (key, value)
- Return type:
List[Tuple[str, Union[Dict, List, float, int]]]
- keys() List[str] #
List all groups and nodes of the HDF5 file - where groups are equivalent to directories and nodes to files.
- Returns:
All groups and nodes
- Return type:
List[str]
- list_all()#
Returns dictionary of :method:`.list_groups()` and :method:`.list_nodes()`.
- Returns:
- results of :method:`.list_groups()` under the key “groups”; results of :method:`.list_nodes()` under the key “nodes”
- Return type:
dict
- list_dirs() List[str] #
Equivalent to os.listdir (consider groups as equivalent to dirs).
- Returns:
List of groups in pytables for the path self.h5_path
- Return type:
List[str]
- list_groups()#
Return a list of names of all nested groups.
- Returns:
group names
- Return type:
list of str
- list_h5_path(h5_path: str = '') Dict[str, List[str]] #
List all groups and nodes of the HDF5 file.
- Parameters:
h5_path (str, optional) – The path to a group in the HDF5 file from where the data is read. Defaults to “”.
- Returns:
A dictionary with keys “groups” and “nodes” containing lists of groups and nodes.
- Return type:
Dict[str, List[str]]
- list_nodes()#
Return a list of names of all nested nodes.
- Returns:
node names
- Return type:
list of str
- listdirs() List[str] #
Equivalent to os.listdir (consider groups as equivalent to dirs).
- Returns:
List of groups in pytables for the path self.h5_path
- Return type:
List[str]
- load(job_specifier: str | int, convert_to_object: bool = True) GenericJob | JobCore [source]#
Load an existing pyiron object - most commonly a job - from the database
- Parameters:
job_specifier (str, int) – name of the job or job ID
convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file - default=True. Accessing only the HDF5 file is about an order of magnitude faster, but only provides limited functionality. Compare the GenericJob object to the JobCore object.
- Returns:
Either the full GenericJob object or just a reduced JobCore object
- Return type:
Union[GenericJob, JobCore]
- load_from_jobpath(job_id: int | None = None, db_entry: dict | None = None, convert_to_object: bool = True) GenericJob | JobCore [source]#
Internal function to load an existing job either based on the job ID or based on the database entry dictionary.
- Parameters:
job_id (int, optional) – Job ID - optional, but either the job_id or the db_entry is required.
db_entry (dict, optional) – database entry dictionary - optional, but either the job_id or the db_entry is required.
convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file - default=True. Accessing only the HDF5 file is about an order of magnitude faster, but only provides limited functionality. Compare the GenericJob object to the JobCore object.
- Returns:
Either the full GenericJob object or just a reduced JobCore object
- Return type:
Union[GenericJob, JobCore]
- property name: str#
Get the name of the HDF5 group.
- Returns:
The name of the HDF5 group.
- Return type:
str
- nodes() FileHDFio #
Filter HDF5 file by nodes.
- Returns:
An HDF5 file which is filtered by nodes
- Return type:
FileHDFio
- open(h5_rel_path: str) FileHDFio #
Create an HDF5 group and enter this specific group. If the group exists in the HDF5 path, only the h5_path is set correspondingly, otherwise the group is created first.
- Parameters:
h5_rel_path (str) – Relative path from the current HDF5 path - h5_path - to the new group
- Returns:
FileHDFio object pointing to the new group
- Return type:
FileHDFio
- property path: str#
Absolute path of the HDF5 group starting from the system root - combination of the absolute system path plus the absolute path inside the HDF5 file starting from the root group.
- Returns:
absolute path
- Return type:
str
- pop(k[, d]) v, remove specified key and return the corresponding value. #
If key is not found, d is returned if given, otherwise KeyError is raised.
- popitem() (k, v), remove and return some (key, value) pair #
as a 2-tuple; but raise KeyError if D is empty.
- property project: pyiron_base.project.generic.Project#
Get the project instance the ProjectHDFio object is located in
- Returns:
pyiron project
- Return type:
Project
- property project_path: str#
the relative path of the current project / folder starting from the root path of the pyiron user directory
- Returns:
relative path of the current project / folder
- Return type:
str
- put(key: str, value: DataFrame | Series | Dict | List | float | int) None #
Store data inside the HDF5 file.
- Parameters:
key (str) – Key to store the data
value (Union[pandas.DataFrame, pandas.Series, Dict, List, float, int]) – Data to store
- read_dict_from_hdf(group_paths: List[str] = [], recursive: bool = False) dict #
Read data from HDF5 file into a dictionary - by default only the nodes are converted to dictionaries, additional sub groups can be specified using the group_paths parameter.
- Parameters:
group_paths (List[str]) – list of additional groups to be included in the dictionary, for example: [“input”, “output”, “output/generic”] These groups are defined relative to the h5_path.
recursive (bool) – Load all subgroups recursively
- Returns:
The loaded data; can be of any type supported by write_hdf5.
- Return type:
Dict
- remove_file() None #
Remove the HDF5 file with all the related content.
- remove_group() None #
Remove an HDF5 group if it exists. If the group does not exist, no error message is raised.
- remove_job(job_specifier: str | int, _unprotect: bool = False) None [source]#
Remove a single job from the project based on its job_specifier.
- Parameters:
job_specifier (Union[str, int]) – Name of the job or job ID.
_unprotect (bool) – [True/False] Delete the job without validating the dependencies to other jobs. Default is False.
- rewrite_hdf5(job_name: str | None = None, info: bool = False, exclude_groups: List[str] | None = None, exclude_nodes: List[str] | None = None) None #
Rewrite the entire HDF5 file.
- Parameters:
job_name (Optional[str]) – Deprecated argument, ignored.
info (bool) – Whether to give the information on how much space has been saved.
exclude_groups (Optional[List[str]]) – List of groups to exclude from the copy.
exclude_nodes (Optional[List[str]]) – List of nodes to exclude from the copy.
- property root_path: str#
the pyiron user directory, defined in the .pyiron configuration
- Returns:
pyiron user directory of the current project
- Return type:
str
- setdefault(k[, d]) D.get(k,d), also set D[k]=d if k not in D #
- show_hdf() None #
Iterate over the HDF5 data structure and generate a human-readable graph.
- property sql_query: str#
Get the SQL query for the project
- Returns:
SQL query
- Return type:
str
- to_dict(hierarchical: bool = False) Dict[str, Any] #
Get the content of the HDF5 file at the current h5_path returned as a dictionary.
- Parameters:
hierarchical (bool, optional) – Whether to convert the internal hierarchy of the HDF5 file to a hierarchical dictionary. Defaults to False.
- Returns:
A dictionary with the content of the HDF5 file.
- Return type:
Dict[str, Any]
- to_object(class_name: str | None = None, **kwargs) object [source]#
Load the full pyiron object from an HDF5 file
- Parameters:
class_name (str, optional) – if the ‘TYPE’ node is not available in the HDF5 file a manual object type can be set, must be as reported by str(type(obj))
**kwargs – optional parameters to override init parameters
- Returns:
pyiron object of the given class_name
- update([E, ]**F) None. Update D from mapping/iterable E and F. #
If E is present and has a .keys() method: for k in E.keys(): D[k] = E[k]. If E is present and lacks a .keys() method: for (k, v) in E: D[k] = v. In either case, this is followed by: for k, v in F.items(): D[k] = v.
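These are the standard MutableMapping semantics; the same behaviour can be illustrated with a plain dict (the HDF5-backed version simply persists each assignment):

```python
# The documented update() semantics, demonstrated on a plain dict.
d = {"a": 1}
d.update({"a": 2, "b": 3})  # E with .keys(): D[k] = E[k]
d.update([("c", 4)])        # E without .keys(): iterate over (k, v) pairs
d.update(e=5)               # finally, keyword arguments F
print(d)  # {'a': 2, 'b': 3, 'c': 4, 'e': 5}
```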
- property user: str#
Get current unix/linux/windows user who is running pyiron
- Returns:
username
- Return type:
str
- values() List[Dict | List | float | int] #
List all values for all groups and nodes of the HDF5 file.
- Returns:
List of all values
- Return type:
List[Union[Dict, List, float, int]]
- property working_directory: str#
Get the working directory of the current ProjectHDFio object. The working directory equals the path but it is represented by the filesystem:
/absolute/path/to/the/file.h5/path/inside/the/hdf5/file
- becomes:
/absolute/path/to/the/file_hdf5/path/inside/the/hdf5/file
- Returns:
absolute path to the working directory
- Return type:
str
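The documented mapping can be sketched as a plain string transformation; this is an illustration of the rule stated above, not the library's implementation.

```python
# Sketch of the documented ".h5" -> "_hdf5" mapping; not the library code.
def working_directory_from(path: str) -> str:
    file_part, sep, inner = path.partition(".h5/")
    # Replace the ".h5" file suffix with a "_hdf5" directory on the filesystem
    return file_part + "_hdf5/" + inner if sep else path

print(working_directory_from(
    "/absolute/path/to/the/file.h5/path/inside/the/hdf5/file"
))  # /absolute/path/to/the/file_hdf5/path/inside/the/hdf5/file
```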
- write_dict(data_dict: Dict[str, Any], compression: int = 4) None #
Write a dictionary to the HDF5 file.
- Parameters:
data_dict (Dict[str, Any]) – Dictionary of data objects to be stored in the HDF5 file; the keys provide the path inside the HDF5 file and the values the data to be stored in those nodes. The corresponding HDF5 groups are created automatically:
{
    ‘/hdf5root/group/node_name’: {},
    ‘/hdf5root/group/subgroup/node_name’: […],
}
compression (int, optional) – The compression level to use (0-9) to compress data using gzip. Defaults to 4.
- write_dict_to_hdf(data_dict: dict) None #
Write a dictionary to HDF5
- Parameters:
data_dict (dict) – dictionary with objects which should be written to HDF5