pyiron_base.jobs.datamining.PyironTable

class pyiron_base.jobs.datamining.PyironTable(project: pyiron_base.project.generic.Project, name: str | None = None, system_function_lst: List[callable] = None, csv_file_name: str | None = None)

Bases: object

Class for easy, efficient, and pythonic analysis of data from pyiron projects

Parameters:
  • project (pyiron_base.project.generic.Project) – The project to analyze

  • name (str) – Name of the pyiron table

  • system_function_lst (list/ None) – List of built-in functions

__init__(project: pyiron_base.project.generic.Project, name: str | None = None, system_function_lst: List[callable] = None, csv_file_name: str | None = None)

Methods

__init__(project[, name, ...])

create_table(file, job_status_list[, ...])

Create or update the table.

get_dataframe()

refill_dict(diff_dict_lst)

Ensure that all dictionaries in the list have the same keys.

total_lst_of_keys(diff_dict_lst)

Get the set of all keys occurring in a list of dictionaries.

Attributes

db_filter_function

Function to filter the project database table before job-specific functions are applied.

filter

Object containing pre-defined filter functions

filter_function

Function to filter each job before more expensive functions are applied

name

Name of the table.

create_table(file: FileHDFio, job_status_list: List[str], executor: Executor | None = None, enforce_update: bool = False)

Create or update the table.

If this method has been called before and new functions have been added since, they are applied to the previously analyzed jobs. If this method has been called before and new jobs have been added to analysis_project, all functions are applied to them.

The result is available via get_dataframe().
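
The incremental behaviour described above can be illustrated with a small stand-alone sketch. It is not pyiron's implementation: `update_rows`, the `rows` dictionary, and the job and column names are hypothetical and only mirror the two cases, namely that new functions are applied to jobs already in the table and that all functions are applied to new jobs.

```python
def update_rows(rows, jobs, functions):
    """Fill in missing (job, column) results without recomputing existing ones.

    rows:      dict mapping job name -> dict of already computed columns
    jobs:      dict mapping job name -> job data (hypothetical stand-in)
    functions: dict mapping column name -> callable taking the job data
    """
    for job_name, job in jobs.items():
        row = rows.setdefault(job_name, {})  # new jobs start with an empty row
        for column, func in functions.items():
            if column not in row:  # only compute what is missing
                row[column] = func(job)
    return rows


jobs = {"job_a": {"energy": -1.0}}
functions = {"energy": lambda job: job["energy"]}
rows = update_rows({}, jobs, functions)

# A new function is added: it is applied to the previously analyzed job.
functions["abs_energy"] = lambda job: abs(job["energy"])
# A new job is added: all functions are applied to it.
jobs["job_b"] = {"energy": -2.5}
rows = update_rows(rows, jobs, functions)
```

After the second call, each job has one entry per column, and the existing `energy` value of `job_a` has not been recomputed.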

Warning

The executor, if given, must not naively pickle the mapped functions or arguments, as PyironTable relies on lambda functions internally. Use with executors that rely on dill or cloudpickle instead. Pyiron provides such executors in the executorlib sub packages.
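
To see why naive pickling is a problem, the following stand-alone sketch uses only the standard library to check whether an object survives `pickle.dumps`. The helper `can_pickle` and the lambda `square` are hypothetical names introduced for illustration; `square` stands in for the kind of lambda the warning refers to.

```python
import pickle

square = lambda x: x * x  # stand-in for a lambda used internally


def can_pickle(obj):
    """Return True if the standard-library pickle can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


lambda_ok = can_pickle(square)  # False: a lambda has no importable name
builtin_ok = can_pickle(len)    # True: importable functions pickle by reference
```

Executors that send work to other processes with the standard pickle module, such as concurrent.futures.ProcessPoolExecutor, hit this limitation, which is why the warning recommends executors based on dill or cloudpickle.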

Parameters:
  • file (FileHDFio) – HDF file where the previous state of the table is stored

  • job_status_list (list of str) – only consider jobs with these statuses

  • executor (concurrent.futures.Executor) – executor for parallel execution

  • enforce_update (bool) – if True, always regenerate the table completely

property db_filter_function: callable | None

Function to filter the project database table before job-specific functions are applied.

The function must take a pyiron project table in the pandas.DataFrame format (project.job_table()) and return a boolean pandas.Series with the same number of rows as the project table.

Example:

>>> def job_filter_function(df):
...     return (df["chemicalformula"] == "H2") & (df["hamilton"] == "Vasp")
>>> table.db_filter_function = job_filter_function

property filter: JobFilters

Object containing pre-defined filter functions

Returns:

The object containing the filters

Return type:

pyiron_base.jobs.datamining.JobFilters

property filter_function: callable | None

Function to filter each job before more expensive functions are applied

Example:

>>> def job_filter_function(job):
...     return job.status == "finished" and "murn" in job.job_name
>>> table.filter_function = job_filter_function
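
A filter of this kind can be tried outside a pyiron project. In the sketch below the jobs are hypothetical stand-ins built with `types.SimpleNamespace`; they expose only the two attributes the filter reads and are not real pyiron job objects.

```python
from types import SimpleNamespace


def job_filter_function(job):
    # Keep only finished jobs whose name contains "murn".
    return job.status == "finished" and "murn" in job.job_name


# Hypothetical stand-ins exposing only the two attributes used by the filter.
jobs = [
    SimpleNamespace(job_name="murn_al", status="finished"),
    SimpleNamespace(job_name="murn_cu", status="aborted"),
    SimpleNamespace(job_name="lammps_al", status="finished"),
]

selected = [job.job_name for job in jobs if job_filter_function(job)]
```

Only `murn_al` passes both conditions, so the more expensive analysis functions would run on that job alone.
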

property name: str

Name of the table. Defaults to the project name if not specified.

Returns:

Name of the table

Return type:

str

refill_dict(diff_dict_lst: List) → None

Ensure that all dictionaries in the list have the same keys.

Keys that are not in a dict are set to None.

static total_lst_of_keys(diff_dict_lst: List[dict]) → set

Get the set of all keys occurring in a list of dictionaries.
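
The two helpers can be sketched in plain Python as follows. This is an illustration of the described behaviour, not the library's source: `total_keys` and `refill` are hypothetical names, and the in-place update is an assumption based on refill_dict returning None.

```python
def total_keys(dict_lst):
    """Collect the union of all keys in a list of dictionaries."""
    keys = set()
    for d in dict_lst:
        keys.update(d)  # updating a set with a dict adds the dict's keys
    return keys


def refill(dict_lst):
    """Add missing keys with value None, modifying the dictionaries in place."""
    all_keys = total_keys(dict_lst)
    for d in dict_lst:
        for key in all_keys:
            d.setdefault(key, None)  # existing values are left untouched


records = [{"a": 1}, {"b": 2}, {"a": 3, "c": 4}]
refill(records)
```

After the call every dictionary in `records` has the keys `a`, `b`, and `c`, which makes the list straightforward to convert into a table with uniform columns.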