MEDfl.NetManager package
Submodules
MEDfl.NetManager.dataset module
- class MEDfl.NetManager.dataset.DataSet(name: str, path: str, engine=None)
Bases: object
- __init__(name: str, path: str, engine=None)
Initialize a DataSet object.
- Parameters:
name (str) – The name of the dataset.
path (str) – The file path of the dataset CSV file.
- delete_dataset()
Delete the dataset from the database.
Notes: Assumes the dataset name is unique in the 'DataSets' table.
- static list_alldatasets(engine)
List all dataset names from the 'DataSets' table.
- Returns:
A DataFrame containing the names of all datasets in the 'DataSets' table.
- Return type:
pd.DataFrame
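The uniqueness assumption noted for delete_dataset() can be illustrated with a small stand-in. The sketch below is not MEDfl code: it uses an in-memory SQLite database in place of the real backend, and the column names DataSetId and DataSetName as well as both helper functions are hypothetical. It only shows why a delete keyed on the name removes exactly one row when names are unique.

```python
import sqlite3


def list_dataset_names(conn):
    """Return every dataset name in the DataSets table, sorted alphabetically."""
    rows = conn.execute(
        "SELECT DataSetName FROM DataSets ORDER BY DataSetName"
    ).fetchall()
    return [name for (name,) in rows]


def delete_dataset_by_name(conn, name):
    """Delete the row with the given name and return how many rows were removed."""
    cur = conn.execute("DELETE FROM DataSets WHERE DataSetName = ?", (name,))
    conn.commit()
    return cur.rowcount


# Hypothetical schema: the UNIQUE constraint encodes the documented assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DataSets (DataSetId INTEGER PRIMARY KEY, DataSetName TEXT UNIQUE)"
)
conn.executemany(
    "INSERT INTO DataSets (DataSetName) VALUES (?)",
    [("hospital_a",), ("hospital_b",)],
)

removed = delete_dataset_by_name(conn, "hospital_a")
remaining = list_dataset_names(conn)
```

Without the uniqueness guarantee, the same delete could remove several rows that happen to share a name.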
MEDfl.NetManager.flsetup module
- class MEDfl.NetManager.flsetup.FLsetup(name: str, description: str, network: Network)
Bases: object
- __init__(name: str, description: str, network: Network)
Initialize a Federated Learning (FL) setup.
- Parameters:
name (str) – The name of the FL setup.
description (str) – A description of the FL setup.
network (Network) – An instance of the Network class representing the network architecture.
- create_dataloader_from_node(node: Node, output, fill_strategy='mean', fit_encode=[], to_drop=[], train_batch_size: int = 32, test_batch_size: int = 1, split_frac: float = 0.2, dataset: Optional[Dataset] = None)
Create DataLoader from a Node.
- Parameters:
node (Node) – The node from which to create DataLoader.
train_batch_size (int) – The batch size for training data.
test_batch_size (int) – The batch size for test data.
split_frac (float) – The fraction of data to be used for training.
dataset (Dataset) – The dataset to use. If None, the method will read the dataset from the node.
- Returns:
The DataLoader instances for training and testing.
- Return type:
DataLoader
- create_federated_dataset(output, fill_strategy='mean', fit_encode=[], to_drop=[], val_frac=0.1, test_frac=0.2) → FederatedDataset
Create a federated dataset.
- Parameters:
output (str) – The output feature of the dataset.
val_frac (float) – The fraction of data to be used for validation.
test_frac (float) – The fraction of data to be used for testing.
- Returns:
The FederatedDataset instance containing train, validation, and test data.
- Return type:
FederatedDataset
- create_nodes_from_master_dataset(params_dict: dict)
Create nodes from the master dataset.
- Parameters:
params_dict (dict) – A dictionary containing parameters for node creation:
  - column_name (str): The name of the column in the MasterDataset used to create nodes.
  - train_nodes (list): A list of node names that will be used for training.
  - test_nodes (list): A list of node names that will be used for testing.
- Returns:
A list of Node instances created from the master dataset.
- Return type:
list
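To make the shape of params_dict concrete, here is a minimal sketch that does not use MEDfl itself. It assumes that each distinct value of the chosen column becomes one node and that the train flag follows the two name lists; the helper plan_nodes, the example column site, and the row data are all hypothetical.

```python
from collections import defaultdict


def plan_nodes(rows, params_dict):
    """Group rows by the chosen column and tag each group as train (1) or test (0)."""
    column = params_dict["column_name"]
    train_nodes = set(params_dict["train_nodes"])
    test_nodes = set(params_dict["test_nodes"])

    # Collect the rows that belong to each distinct value of the column.
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)

    plan = []
    for node_name in sorted(groups):
        if node_name in train_nodes:
            train = 1
        elif node_name in test_nodes:
            train = 0
        else:
            continue  # values listed in neither role are skipped in this sketch
        plan.append(
            {"name": node_name, "train": train, "n_rows": len(groups[node_name])}
        )
    return plan


# Hypothetical master dataset rows, keyed by a "site" column.
rows = [
    {"site": "hospital_a", "glucose": 5.1},
    {"site": "hospital_a", "glucose": 6.3},
    {"site": "hospital_b", "glucose": 4.8},
    {"site": "hospital_c", "glucose": 7.0},
]
params_dict = {
    "column_name": "site",
    "train_nodes": ["hospital_a", "hospital_b"],
    "test_nodes": ["hospital_c"],
}
plan = plan_nodes(rows, params_dict)
```

In this sketch the train value mirrors the integer flag documented for Node (1 for training, 0 for testing).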
- get_flDataSet()
Retrieve the federated dataset associated with the FL setup using the FL setup's name.
- Returns:
DataFrame containing the federated dataset information.
- Return type:
pandas.DataFrame
- static list_allsetups()
List all the FL setups.
- Returns:
A DataFrame containing information about all the FL setups.
- Return type:
DataFrame
MEDfl.NetManager.net_helper module
- MEDfl.NetManager.net_helper.get_feddataset_id_from_name(name)
Get the Federated dataset Id from the FedDatasets table based on the federated dataset name.
- Parameters:
name (str) – Federated dataset name.
- Returns:
FedId or None if not found.
- Return type:
int or None
- MEDfl.NetManager.net_helper.get_flpipeline_from_name(name)
Get the FLpipeline Id from the FLpipeline table based on the FL pipeline name.
- Parameters:
name (str) – FL pipeline name.
- Returns:
FLpipelineId or None if not found.
- Return type:
int or None
- MEDfl.NetManager.net_helper.get_flsetupid_from_name(name)
Get the FLsetupId from the FLsetup table based on the FL setup name.
- Parameters:
name (str) – FL setup name.
- Returns:
FLsetupId or None if not found.
- Return type:
int or None
- MEDfl.NetManager.net_helper.get_netid_from_name(name)
Get the Network Id from the Networks table based on the NetName.
- Parameters:
name (str) – Network name.
- Returns:
NetId or None if not found.
- Return type:
int or None
- MEDfl.NetManager.net_helper.get_nodeid_from_name(name)
Get the NodeId from the Nodes table based on the NodeName.
- Parameters:
name (str) – Node name.
- Returns:
NodeId or None if not found.
- Return type:
int or None
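The five lookup helpers above share one contract: take a name, return the matching integer ID, or None when no row matches. The sketch below illustrates that contract only. It substitutes an in-memory SQLite database for the real backend, and the function get_netid_from_name_sketch takes an explicit connection argument that the documented helpers do not have. The table and column names (Networks, NetId, NetName) come from the description of get_netid_from_name.

```python
import sqlite3


def get_netid_from_name_sketch(conn, name):
    """Return the NetId for the given NetName, or None if no row matches."""
    row = conn.execute(
        "SELECT NetId FROM Networks WHERE NetName = ?", (name,)
    ).fetchone()
    return row[0] if row is not None else None


# Stand-in database with a single network.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Networks (NetId INTEGER PRIMARY KEY, NetName TEXT)")
conn.execute("INSERT INTO Networks (NetId, NetName) VALUES (7, 'cardio_net')")

found = get_netid_from_name_sketch(conn, "cardio_net")
missing = get_netid_from_name_sketch(conn, "no_such_net")
```

Callers should therefore test the result against None before using it as a foreign key.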
- MEDfl.NetManager.net_helper.is_str(data_df, row, x)
Check if a column in a DataFrame is of type 'object' and convert the value accordingly.
- Parameters:
data_df (pandas.DataFrame) – DataFrame containing the data.
row (pandas.Series) – Data row.
x (str) – Column name.
- Returns:
Processed value based on the column type.
- Return type:
str or float
- MEDfl.NetManager.net_helper.master_table_exists()
Check if the MasterDataset table exists in the database.
- Returns:
True if the table exists, False otherwise.
- Return type:
bool
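A table-existence check of this kind can be sketched as follows. This is an assumption-laden stand-in rather than the MEDfl implementation: it queries SQLite's sqlite_master catalog, which is specific to SQLite, and the helper table_exists with its connection argument is hypothetical. Other database engines expose the same information through their own catalog tables.

```python
import sqlite3


def table_exists(conn, table_name):
    """Return True if a table with this name exists in the SQLite database."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
before = table_exists(conn, "MasterDataset")  # table not created yet
conn.execute("CREATE TABLE MasterDataset (id INTEGER PRIMARY KEY)")
after = table_exists(conn, "MasterDataset")  # table now present
```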
- MEDfl.NetManager.net_helper.process_data_after_reading(data, output, fill_strategy='mean', fit_encode=[], to_drop=[])
Process data after reading from the database, including encoding, dropping columns, and creating a PyTorch TensorDataset.
- Parameters:
data (pandas.DataFrame) – Input data.
output (str) – Output column name.
fill_strategy (str, optional) – Imputation strategy for missing values. Default is 'mean'.
fit_encode (list, optional) – List of columns to be label-encoded. Default is an empty list.
to_drop (list, optional) – List of columns to be dropped from the DataFrame. Default is an empty list.
- Returns:
Processed data as a PyTorch TensorDataset.
- Return type:
torch.utils.data.TensorDataset
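The roles of output, fill_strategy, fit_encode, and to_drop can be illustrated in plain Python. The sketch below is not the library function: it works on a list of dictionaries instead of a DataFrame, returns two nested lists instead of a TensorDataset, and the order of the steps (drop, encode, impute) is an assumption. The helper name process_rows and the example data are hypothetical.

```python
def process_rows(rows, output, fill_strategy="mean", fit_encode=(), to_drop=()):
    """Drop columns, label-encode, mean-impute, then split features from the target."""
    if fill_strategy != "mean":
        raise ValueError("this sketch only implements the 'mean' strategy")

    # 1. Drop unwanted columns (builds new dicts, so the caller's rows are untouched).
    rows = [{k: v for k, v in row.items() if k not in to_drop} for row in rows]

    # 2. Label-encode: map each distinct value to its index in sorted order.
    for col in fit_encode:
        classes = sorted({row[col] for row in rows})
        codes = {value: index for index, value in enumerate(classes)}
        for row in rows:
            row[col] = codes[row[col]]

    # 3. Replace missing values (None) with the mean of the observed values.
    columns = list(rows[0].keys())
    for col in columns:
        present = [row[col] for row in rows if row[col] is not None]
        if not present:
            continue  # nothing to average; leave the column as it is
        mean = sum(present) / len(present)
        for row in rows:
            if row[col] is None:
                row[col] = mean

    # 4. Separate the output column from the feature columns.
    feature_cols = [c for c in columns if c != output]
    features = [[float(row[c]) for c in feature_cols] for row in rows]
    targets = [float(row[output]) for row in rows]
    return features, targets


rows = [
    {"id": 1, "sex": "F", "age": 40, "outcome": 1},
    {"id": 2, "sex": "M", "age": None, "outcome": 0},
    {"id": 3, "sex": "F", "age": 60, "outcome": 1},
]
features, targets = process_rows(
    rows, output="outcome", fit_encode=["sex"], to_drop=["id"]
)
```

Here the missing age is filled with 50.0, the mean of 40 and 60, and sex is encoded as 0 for "F" and 1 for "M".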
MEDfl.NetManager.net_manager_queries module
MEDfl.NetManager.network module
- class MEDfl.NetManager.network.Network(name: str = '')
Bases: object
A class representing a network.
- name
The name of the network.
- Type:
str
- mtable_exists
An integer flag indicating whether the MasterDataset table exists (1) or not (0).
- Type:
int
- __init__(name: str = '')
Initialize a Network instance.
- Parameters:
name (str) – The name of the network.
- add_node(node: Node)
Add a node to the network.
- Parameters:
node (Node) – The node to add.
- create_master_dataset(path_to_csv: str = '/home/local/USHERBROOKE/saho6810/MEDfl/code/MEDfl/notebooks/data/masterDataSet/miniDiabete.csv')
Create the MasterDataset table and insert dataset values.
- Parameters:
path_to_csv (str) – Path to the CSV file containing the dataset.
- static list_allnetworks()
List all networks in the database.
- Returns:
A DataFrame containing information about all networks in the database.
- Return type:
DataFrame
- list_allnodes()
List all nodes in the network.
- Returns:
A DataFrame containing information about all nodes in the network.
- Return type:
DataFrame
- update_network(FLsetupId: int)
Update the network's FLsetupId in the database.
- Parameters:
FLsetupId (int) – The FLsetupId to update.
MEDfl.NetManager.node module
- class MEDfl.NetManager.node.Node(name: str, train: int, test_fraction: float = 0.2, engine=None)
Bases: object
A class representing a node in the network.
- name
The name of the node.
- Type:
str
- train
An integer flag representing whether the node is used for training (1) or testing (0).
- Type:
int
- test_fraction
The fraction of data used for testing when train=1. Default is 0.2.
- Type:
float, optional
- __init__(name: str, train: int, test_fraction: float = 0.2, engine=None)
Initialize a Node instance.
- Parameters:
name (str) – The name of the node.
train (int) – An integer flag representing whether the node is used for training (1) or testing (0).
test_fraction (float, optional) – The fraction of data used for testing when train=1. Default is 0.2.
- assign_dataset(dataset_name: str)
Assign an existing dataset to the node.
- Parameters:
dataset_name (str) – The name of the dataset to assign.
- Returns:
None
- check_dataset_compatibility(data_df)
Check if the dataset is compatible with the master dataset.
- Parameters:
data_df (DataFrame) – The dataset to check.
- Returns:
None
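The reference does not say what "compatible" means for check_dataset_compatibility. One plausible reading, shown here purely as an assumption, is that the candidate dataset must have the same set of columns as the master dataset, and that a mismatch raises an error while success returns None. The helper check_columns_match and the column names are hypothetical.

```python
def check_columns_match(master_columns, candidate_columns):
    """Raise ValueError unless both column lists contain the same names."""
    missing = [c for c in master_columns if c not in candidate_columns]
    extra = [c for c in candidate_columns if c not in master_columns]
    if missing or extra:
        raise ValueError(f"incompatible dataset: missing={missing}, extra={extra}")


master_columns = ["age", "sex", "outcome"]

# Same columns in a different order: passes silently and returns None.
check_columns_match(master_columns, ["sex", "age", "outcome"])

# One column missing and one unexpected column: raises ValueError.
try:
    check_columns_match(master_columns, ["age", "outcome", "bmi"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```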
- create_node(NetId: int)
Create a node in the database.
- Parameters:
NetId (int) – The ID of the network to which the node belongs.
- Returns:
None
- get_dataset(column_name: Optional[str] = None)
Get the dataset for the node based on the given column name.
- Parameters:
column_name (str, optional) – The column name to filter the dataset. Default is None.
- Returns:
The dataset associated with the node.
- Return type:
DataFrame
- list_alldatasets()
List all datasets associated with the node.
- Returns:
A DataFrame containing information about all datasets associated with the node.
- Return type:
DataFrame
- static list_allnodes()
List all nodes in the database.
- Returns:
A DataFrame containing information about all nodes in the database.
- Return type:
DataFrame
- unassign_dataset(dataset_name: str)
Unassign an existing dataset from the node.
- Parameters:
dataset_name (str) – The name of the dataset to unassign.
- Returns:
None
- upload_dataset(dataset_name: str, path_to_csv: str = '/home/local/USHERBROOKE/saho6810/MEDfl/code/MEDfl/notebooks/data/masterDataSet/Mimic_train.csv')
Upload the dataset to the database for the node.
- Parameters:
dataset_name (str) – The name of the dataset.
path_to_csv (str, optional) – Path to the CSV file containing the dataset. Default is the path in params.
- Returns:
None