pyiron_base.database.generic.DatabaseAccess#
- class pyiron_base.database.generic.DatabaseAccess(connection_string: str, table_name: str, timeout: int = 60)[source]#
Bases:
IsDatabaseA core element of PyIron, which generally deals with accessing the database: getting, sending, changing some data to the db.
- Parameters:
connection_string (str) – SQLalchemy connection string which specifies the database to connect to typical form: dialect+driver://username:password@host:port/database example: ‘postgresql://scott:tiger@cmcent56.mpie.de/mdb’
table_name (str) – database table name, a simple string like: ‘simulation’
Murat Han Celik
- __init__(connection_string: str, table_name: str, timeout: int = 60)[source]#
Initialize the Database connection
- Parameters:
connection_string (str) – SQLalchemy connection string which specifies the database to connect to typical form: dialect+driver://username:password@host:port/database example: ‘postgresql://scott:tiger@cmcent56.mpie.de/mdb’
table_name (str) – database table name, a simple string like: ‘simulation’
timeout (int) – time in seconds before unused database connection are closed
Methods
__init__(connection_string, table_name[, ...])Initialize the Database connection
add_column(col_name, col_type)Add an additional column - required for modification on the database
add_item_dict(par_dict[, check_duplicates])Create a new database item
change_column_type(col_name, col_type)Modify data type of an existing column - required for modification on the database
delete_item(item_id)Delete Item from database
Get column names
get_item_by_id(item_id)Get item from database by searching for a specific item Id.
get_items_dict(item_dict[, return_all_columns])- param item_dict:
a dict type, which has a certain syntax for this function:
get_job_ids(sql_query, user, project_path[, ...])Return the job IDs matching a specific query
get_job_status(job_id)get_job_working_directory(job_id)get_jobs(sql_query, user, project_path[, ...])Internal function to return the jobs as dictionary rather than a pandas.Dataframe
get_table_headings([table_name])Get column names; if given table_name can select one of multiple tables defined in the database, but subclasses may ignore it
item_update(par_dict, item_id)job_table(sql_query, user, project_path[, ...])Access the job_table.
query_for_element(element)regexp(expr, item)Regex function for SQLite :param expr: str, regex expression :param item: str, item which needs to be checked
set_job_status(status, job_id)Set status of a job or multiple jobs if job_id is iterable.
Attributes
Get view_mode - if view_moded is enable pyiron has read only access to the database.
- add_column(col_name: str | List[str], col_type: str | List[str]) None[source]#
Add an additional column - required for modification on the database
- Parameters:
col_name (str, list) – name of the new column, normal string like: ‘myColumn’
(str (col_type) – SQL type of the new column, SQL type like: ‘varchar(50)’
list – SQL type of the new column, SQL type like: ‘varchar(50)’
Returns:
- add_item_dict(par_dict: dict, check_duplicates: bool = False) int[source]#
Create a new database item
- Parameters:
par_dict (dict) –
- Dictionary with the item values and column names as keys, like:
{‘chemicalformula’: ‘BO’,
’computer’: ‘localhost’, ‘hamilton’: ‘VAMPS’, ‘hamversion’: ‘1.1’, ‘job’: ‘testing’, ‘subjob’ : ‘SubJob’, ‘parentid’: 0L, ‘myCol’: ‘Blubbablub’, ‘project’: ‘database.testing’, ‘projectpath’: ‘/root/directory/tmp’, ‘status’: ‘KAAAA’, ‘timestart’: datetime(2016, 5, 2, 11, 31, 4, 253377), ‘timestop’: datetime(2016, 5, 2, 11, 31, 4, 371165), ‘totalcputime’: 0.117788, ‘username’: ‘Test’}
check_duplicates (bool) – Check for duplicate entries in the database
- Returns:
Database ID of the item created as an int, like: 3
- Return type:
int
- change_column_type(col_name: str | List[str], col_type: str | List[str]) None[source]#
Modify data type of an existing column - required for modification on the database
- Parameters:
col_name (str, list) – name of the new column, normal string like: ‘myColumn’
(str (col_type) – SQL type of the new column, SQL type like: ‘varchar(50)’
list – SQL type of the new column, SQL type like: ‘varchar(50)’
Returns:
- delete_item(item_id: int) None[source]#
Delete Item from database
- Parameters:
item_id (int) – Databse Item ID (Integer), like: 38
Returns:
- get_db_columns() List[str]#
Get column names
- Returns:
- list of column names like:
[‘id’, ‘parentid’, ‘masterid’, ‘projectpath’, ‘project’, ‘job’, ‘subjob’, ‘chemicalformula’, ‘status’, ‘hamilton’, ‘hamversion’, ‘username’, ‘computer’, ‘timestart’, ‘timestop’, ‘totalcputime’]
- Return type:
list
- get_item_by_id(item_id: int) dict[source]#
Get item from database by searching for a specific item Id.
- Parameters:
item_id (int) – Databse Item ID (Integer), like: 38
- Returns:
- Dictionary where the key is the column name, like:
- {‘chemicalformula’: u’BO’,
’computer’: u’localhost’, ‘hamilton’: u’VAMPS’, ‘hamversion’: u’1.1’, ‘id’: 1, ‘job’: u’testing’, ‘masterid’: None, ‘parentid’: 0, ‘project’: u’database.testing’, ‘projectpath’: u’/root/directory/tmp’, ‘status’: u’KAAAA’, ‘subjob’: u’SubJob’, ‘timestart’: datetime.datetime(2016, 5, 2, 11, 31, 4, 253377), ‘timestop’: datetime.datetime(2016, 5, 2, 11, 31, 4, 371165), ‘totalcputime’: 0.117788, ‘username’: u’Test’}
- Return type:
dict
- get_items_dict(item_dict: dict, return_all_columns: bool = True) List[dict][source]#
- Parameters:
item_dict (dict) –
a dict type, which has a certain syntax for this function: a normal dict like {‘hamilton’: ‘VAMPE’, ‘hamversion’: ‘1.1’} has similarities with a simple query like
select * from table_name where hamilton = ‘VAMPE AND hamversion = ‘1.1’
as seen it puts an AND for every key, value combination in the dict and searches for it.
another syntax is for an OR statement, simply: {‘hamilton’: [‘VAMPE’, ‘LAMMPS’]}, the query would be:
select * from table_name where hamilton = ‘VAMPE’ OR hamilton = ‘LAMMPS’
- and lastly for a LIKE statement, simply: {‘project’: ‘database.%’}, the query would be
select * from table_name where project LIKE ‘database.%’
that means you can simply add the syntax for a like statement like ‘%’ and it will automatically operate a like-search
- of course you can also use a more complex select method, with everything in use:
- {‘hamilton’: [‘VAMPE’, ‘LAMMPS’],
’project’: ‘databse%’, ‘hamversion’: ‘1.1’}
- select * from table_name where (hamilton = ‘VAMPE’ Or hamilton = ‘LAMMPS’) AND
(project LIKE ‘database%’) AND hamversion = ‘1.1’
return_all_columns (bool) – return all columns or only the ‘id’ - still the format stays the same.
- Returns:
- the function returns a list of dicts, but it does not format datetime:
- [{‘chemicalformula’: u’Ni108’,
’computer’: u’mapc157’, ‘hamilton’: u’LAMMPS’, ‘hamversion’: u’1.1’, ‘id’: 24, ‘job’: u’DOF_1_0’, ‘parentid’: 21L, ‘project’: u’lammps.phonons.Ni_fcc’, ‘projectpath’: u’D:/PyIron/PyIron_data/projects’, ‘status’: u’finished’, ‘timestart’: datetime.datetime(2016, 6, 24, 10, 17, 3, 140000), ‘timestop’: datetime.datetime(2016, 6, 24, 10, 17, 3, 173000), ‘totalcputime’: 0.033, ‘username’: u’test’},
- {‘chemicalformula’: u’Ni108’,
’computer’: u’mapc157’, ‘hamilton’: u’LAMMPS’, ‘hamversion’: u’1.1’, ‘id’: 21, ‘job’: u’ref’, ‘parentid’: 20L, ‘project’: u’lammps.phonons.Ni_fcc’, ‘projectpath’: u’D:/PyIron/PyIron_data/projects’, ‘status’: u’finished’, ‘timestart’: datetime.datetime(2016, 6, 24, 10, 17, 2, 429000), ‘timestop’: datetime.datetime(2016, 6, 24, 10, 17, 2, 463000), ‘totalcputime’: 0.034, ‘username’: u’test’},…….]
- Return type:
list
- get_job_ids(sql_query: str, user: str, project_path: str, recursive: bool = True) List[int]#
Return the job IDs matching a specific query
- Parameters:
database (DatabaseAccess) – Database object
sql_query (str) – SQL query to enter a more specific request
user (str) – username of the user whoes user space should be searched
project_path (str) – root_path - this is in contrast to the project_path in GenericPath
recursive (bool) – search subprojects [True/False]
- Returns:
a list of job IDs
- Return type:
list
- get_jobs(sql_query: str, user: str, project_path: str, recursive: bool = True, columns: List[str] | None = None) List[dict]#
Internal function to return the jobs as dictionary rather than a pandas.Dataframe
- Parameters:
sql_query (str) – SQL query to enter a more specific request
user (str) – username of the user whoes user space should be searched
project_path (str) – root_path - this is in contrast to the project_path in GenericPath
recursive (bool) – search subprojects [True/False]
columns (list) – by default only the columns [‘id’, ‘project’] are selected, but the user can select a subset of [‘id’, ‘status’, ‘chemicalformula’, ‘job’, ‘subjob’, ‘project’, ‘projectpath’, ‘timestart’, ‘timestop’, ‘totalcputime’, ‘computer’, ‘hamilton’, ‘hamversion’, ‘parentid’, ‘masterid’]
- Returns:
columns are used as keys and point to a list of the corresponding values
- Return type:
dict
- get_table_headings(table_name: str | None = None) List[str]#
Get column names; if given table_name can select one of multiple tables defined in the database, but subclasses may ignore it
- Parameters:
table_name (str) – simple string of a table_name like: ‘jobs_username’
- Returns:
- list of column names like:
[‘id’, ‘parentid’, ‘masterid’, ‘projectpath’, ‘project’, ‘job’, ‘subjob’, ‘chemicalformula’, ‘status’, ‘hamilton’, ‘hamversion’, ‘username’, ‘computer’, ‘timestart’, ‘timestop’, ‘totalcputime’]
- Return type:
list
- job_table(sql_query: str, user: str, project_path: str, recursive: bool = True, columns: List[str] | None = None, all_columns: bool = False, sort_by: str = 'id', max_colwidth: int = 200, full_table: bool = False, element_lst: List[str] | None = None, job_name_contains: str = '', mode: Literal['regex', 'glob'] = 'glob', **kwargs)#
Access the job_table.
- Parameters:
sql_query (str) – SQL query to enter a more specific request
user (str) – username of the user whoes user space should be searched
project_path (str) – root_path - this is in contrast to the project_path in GenericPath
recursive (bool) – search subprojects [True/False]
columns (list) – by default only the columns [‘job’, ‘project’, ‘chemicalformula’] are selected, but the user can select a subset of [‘id’, ‘status’, ‘chemicalformula’, ‘job’, ‘subjob’, ‘project’, ‘projectpath’, ‘timestart’, ‘timestop’, ‘totalcputime’, ‘computer’, ‘hamilton’, ‘hamversion’, ‘parentid’, ‘masterid’]
all_columns (bool) – Select all columns - this overwrites the columns option.
sort_by (str) – Sort by a specific column
max_colwidth (int) – set the column width
full_table (bool) – Whether to show the entire pandas table
element_lst (list) – list of elements required in the chemical formular - by default None
job_name_contains (str) – (deprecated) A string which should be contained in every job_name
mode (str) – search mode when kwargs are given.
**kwargs (dict) – Optional arguments for filtering with keys matching the project database column name (eg. status=”finished”). Asterisk can be used to denote a wildcard, for zero or more instances of any character
- Returns:
Return the result as a pandas.Dataframe object
- Return type:
pandas.Dataframe
- static regexp(expr: str, item: str) str | None[source]#
Regex function for SQLite :param expr: str, regex expression :param item: str, item which needs to be checked
Returns:
- set_job_status(status: str, job_id: int | List[int]) None#
Set status of a job or multiple jobs if job_id is iterable.
- Parameters:
status (str) – status
job_id (int, Iterable) – job id
- property view_mode: bool#
Get view_mode - if view_moded is enable pyiron has read only access to the database.
Some implementations do not allow to set this value.
- Returns:
True when view_mode is enabled
- Return type:
bool