pyiron_base.interfaces.has_hdf module
Interface for classes to serialize to HDF5.
- class pyiron_base.interfaces.has_hdf.HasHDF
Bases: ABC
Mixin class for objects that can write themselves to HDF.
Subclasses must implement _from_hdf(), _to_hdf() and _get_hdf_group_name(). They may implement from_hdf_args().
from_hdf() and to_hdf() shall respect the given group_name in the following way. If either the argument or the return value of _get_hdf_group_name() is not None, they shall create a new subgroup in the given HDF object, call _from_hdf() or _to_hdf() with this subgroup, and afterwards call ProjectHDFio.close() on it. If both are None, they shall pass the given HDF object unchanged.
Subclasses that need to read special arguments from HDF before an instance can be created can override from_hdf_args() and return the arguments in a dict that can be **kwargs-passed to the __init__ of the subclass. When loading an object with ProjectHDFio.to_object(), this method is called internally and its result is used to create an instance, on which from_hdf() is then called.
Subclasses may specify an __hdf_version__ to signal changes in the layout of the data in HDF. from_hdf() will read this value and pass it verbatim to the subclass’s _from_hdf(). No semantics are imposed on this value, but it is usually a three-digit version number.
Here’s a toy class that enables writing `list`s to HDF.
>>> class HDFList(list, HasHDF):
...     def _from_hdf(self, hdf, version=None):
...         values = []
...         for n in hdf.list_nodes():
...             if not n.startswith("__index_"): continue
...             values.append((int(n.split("__index_")[1]), hdf[n]))
...         values = sorted(values, key=lambda e: e[0])
...         self.clear()
...         self.extend(list(zip(*values))[1])
...     def _to_hdf(self, hdf):
...         for i, v in enumerate(self):
...             hdf[f"__index_{i}"] = v
...     def _get_hdf_group_name(self):
...         return "list"
We can use this simply like any other list, but also call the new HDF methods on it after we get an HDF object.
>>> l = HDFList([1,2,3,4])
>>> from pyiron_base import Project
>>> pr = Project('test_foo')
>>> hdf = pr.create_hdf(pr.path, 'list')
Since we return “list” in _get_hdf_group_name(), by default our list gets written into a group of the same name.
>>> l.to_hdf(hdf)
>>> hdf
{'groups': ['list'], 'nodes': []}
>>> hdf['list']
{'groups': [], 'nodes': ['HDF_VERSION', 'NAME', 'OBJECT', 'TYPE', '__index_0', '__index_1', '__index_2', '__index_3']}
(Since this is a docstring, actually calling ProjectHDFio.to_object() won’t work, so we’ll simulate it.)
>>> lcopy = HDFList()
>>> lcopy.from_hdf(hdf)
>>> lcopy
[1, 2, 3, 4]
We can also override the target group name by passing it.
>>> l.to_hdf(hdf, "my_group")
>>> hdf
{'groups': ['list', 'my_group'], 'nodes': []}
>>> hdf.remove_file()
>>> pr.remove(enable=True)
When using this class as a mixin in a class that also derives from classes with a legacy implementation, here’s a simple recipe:
>>> class MyOldClass:
...     def to_hdf(self, hdf, group_name):
...         ... # whatever you need to save
...     def from_hdf(self, hdf, group_name):
...         ... # whatever you need to restore
>>> class MyDerivedClass(MyOldClass, HasHDF):
...     def to_hdf(self, hdf, group_name):
...         MyOldClass.to_hdf(self, hdf=hdf, group_name=group_name)
...         HasHDF.to_hdf(self, hdf=hdf, group_name=group_name)
...     def from_hdf(self, hdf, group_name):
...         MyOldClass.from_hdf(self, hdf=hdf, group_name=group_name)
...         HasHDF.from_hdf(self, hdf=hdf, group_name=group_name)
i.e. explicitly call both methods with the same group_name. The call to HasHDF.to_hdf() has to be last so that the type information is consistently written to HDF.
If you’re deriving from GenericJob, it will already take care of descending into group_name, so you can pass “” as the group_name like so:
>>> from pyiron_base import GenericJob
>>> class MyHybridJob(GenericJob, HasHDF):
...     def to_hdf(self, hdf, group_name):
...         GenericJob.to_hdf(self, hdf=hdf, group_name=group_name)
...         HasHDF.to_hdf(self, hdf=self.project_hdf5, group_name="")
...     def from_hdf(self, hdf, group_name):
...         GenericJob.from_hdf(self, hdf=hdf, group_name=group_name)
...         HasHDF.from_hdf(self, hdf=self.project_hdf5, group_name="")
- from_hdf(hdf: ProjectHDFio, group_name: str = None)
Read object from HDF.
If group_name is given, descend into the subgroup in hdf first.
- Parameters:
hdf (ProjectHDFio) – HDF group to read from
group_name (str, optional) – name of subgroup
- classmethod from_hdf_args(hdf: ProjectHDFio) dict
Read arguments for instance creation from HDF5 file.
- Parameters:
hdf (ProjectHDFio) – HDF5 group object
- Returns:
arguments that can be **kwarg-passed to cls().
- Return type:
dict
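The loading order described for from_hdf_args() (read the constructor arguments, create the instance, then restore its state) can be sketched without pyiron. This is a minimal illustration only: a plain dict stands in for the HDF group, and the class Lattice and the helper load() are hypothetical names, not part of pyiron_base.

```python
class Lattice:
    """Hypothetical class whose __init__ needs an argument stored in HDF."""

    def __init__(self, dimension: int):
        self.dimension = dimension
        self.sites = []

    @classmethod
    def from_hdf_args(cls, hdf: dict) -> dict:
        # Return only what __init__ needs; keys must match its parameter names.
        return {"dimension": hdf["dimension"]}

    def _from_hdf(self, hdf: dict, version=None) -> None:
        # Restore the remaining state on the already created instance.
        self.sites = list(hdf["sites"])


def load(cls, hdf: dict):
    """Hypothetical helper: build the instance first, then restore it."""
    obj = cls(**cls.from_hdf_args(hdf))
    obj._from_hdf(hdf)
    return obj


# A plain dict plays the role of the stored HDF group (an assumption).
stored = {"dimension": 2, "sites": [(0, 0), (1, 0)]}
lattice = load(Lattice, stored)
```

Keeping from_hdf_args() a classmethod matters here: it has to run before any instance exists, so it can only rely on the class and the stored data.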
- rewrite_hdf(hdf: ProjectHDFio, group_name: str = None)
Update the HDF representation.
If an object is read from an older layout, this will remove the old data and rewrite it in the newest layout.
- Parameters:
hdf (ProjectHDFio) – HDF group to read/write
group_name (str, optional) – name of subgroup
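To illustrate how a version marker and a rewrite step can work together, here is a rough sketch under stated assumptions. A plain dict stands in for the HDF group, the class Series, both layouts, the version strings, and the function rewrite() are all hypothetical, and the real rewrite_hdf() may differ in its details. Only the node name HDF_VERSION is taken from the example output shown for HasHDF.

```python
class Series:
    __hdf_version__ = "0.2.0"  # hypothetical current layout version

    def __init__(self, values=()):
        self.values = list(values)

    def _to_hdf(self, hdf: dict) -> None:
        # Current (hypothetical) layout: a single "values" node.
        hdf["values"] = list(self.values)

    def _from_hdf(self, hdf: dict, version=None) -> None:
        if version == "0.1.0":
            # Older (hypothetical) layout: one node per element, "v0", "v1", ...
            count = sum(
                1 for key in hdf if key[:1] == "v" and key[1:].isdigit()
            )
            self.values = [hdf[f"v{i}"] for i in range(count)]
        else:
            self.values = list(hdf["values"])


def rewrite(obj: Series, hdf: dict) -> None:
    """Read whichever layout is stored, drop it, write the current one."""
    obj._from_hdf(hdf, version=hdf.get("HDF_VERSION"))
    hdf.clear()
    hdf["HDF_VERSION"] = obj.__hdf_version__
    obj._to_hdf(hdf)


# Data stored in the older hypothetical layout.
old = {"HDF_VERSION": "0.1.0", "v0": 10, "v1": 20}
series = Series()
rewrite(series, old)
```

Because the version is passed through verbatim, the branching on it lives entirely in the subclass's _from_hdf(), which is where knowledge of the old layouts belongs.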
- to_hdf(hdf: ProjectHDFio, group_name: str = None)
Write object to HDF.
If group_name is given, create a subgroup in hdf first.
- Parameters:
hdf (ProjectHDFio) – HDF group to write to
group_name (str, optional) – name of subgroup
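The group_name rule that to_hdf() follows (an explicit argument overrides _get_hdf_group_name(), and if both are None the object is written into the given group directly) can be sketched as follows. This is not pyiron's implementation: DictGroup, SketchHasHDF, Point and Bare are hypothetical stand-ins built on nested dicts, and the sketch omits the type information and the close() call that the real method handles.

```python
from typing import Optional


class DictGroup(dict):
    """Hypothetical stand-in for an HDF group: a nested dict."""

    def create_group(self, name: str) -> "DictGroup":
        # Return the existing subgroup or create an empty one.
        return self.setdefault(name, DictGroup())


class SketchHasHDF:
    """Illustrates only the group_name rule, not pyiron's real code."""

    def _get_hdf_group_name(self) -> Optional[str]:
        return None

    def _to_hdf(self, hdf: DictGroup) -> None:
        raise NotImplementedError

    def to_hdf(self, hdf: DictGroup, group_name: Optional[str] = None) -> None:
        # An explicit argument overrides the class default.
        if group_name is None:
            group_name = self._get_hdf_group_name()
        # With no name from either source, write into the given group itself.
        target = hdf if group_name is None else hdf.create_group(group_name)
        self._to_hdf(target)


class Point(SketchHasHDF):
    def __init__(self, x, y):
        self.x, self.y = x, y

    def _to_hdf(self, hdf):
        hdf["x"], hdf["y"] = self.x, self.y

    def _get_hdf_group_name(self):
        return "point"


class Bare(SketchHasHDF):
    def _to_hdf(self, hdf):
        hdf["flag"] = True


root = DictGroup()
Point(1, 2).to_hdf(root)           # default name from the class
Point(3, 4).to_hdf(root, "other")  # explicit name wins
Bare().to_hdf(root)                # no name at all: written into root
```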