Vertical Logistic Regression
Introduction
The implementation of the vertical logistic regression algorithm is based on [Yang2019].
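As a rough illustration of the vertical setting, the sketch below shows how the linear predictor of a logistic regression model splits into per-party partial sums when each party holds a different slice of the feature columns. It is a plaintext toy only and not the protocol from [Yang2019], which additionally protects the exchanged values with homomorphic encryption. All data, weights, the helper `partial_logits`, and the choice to keep the bias on the label_trainer side are assumptions made for this example.

```python
import math


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def partial_logits(features, weights):
    """Dot product of each sample with the weights of one party's own columns."""
    return [sum(x * w for x, w in zip(row, weights)) for row in features]


# Hypothetical toy data: two samples, already aligned by id across both parties.
x_label_trainer = [[1.0, 2.0], [0.5, -1.0]]  # columns held by label_trainer
x_trainer = [[3.0], [-2.0]]                  # columns held by trainer
w_label_trainer = [0.1, -0.2]
w_trainer = [0.3]
bias = 0.05  # assumption: the bias term lives with label_trainer

# Each party computes its partial sum locally.
z_label_trainer = partial_logits(x_label_trainer, w_label_trainer)
z_trainer = partial_logits(x_trainer, w_trainer)

# Adding the partial sums gives the same logit as a centralized model would.
predictions = [
    sigmoid(a + b + bias) for a, b in zip(z_label_trainer, z_trainer)
]
print(predictions)
```

Because the logit is additive over feature columns, only the partial sums need to be combined, never the raw features.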
Parameter List
identity: str Federated identity of the party, should be one of label_trainer or trainer.
- model_info:
    name: str Model name, should be vertical_logistic_regression.
- input:
    - trainset:
        type: str Train dataset type, supports csv.
        path: str If type is csv, folder path of the train dataset.
        name: str If type is csv, file name of the train dataset.
        has_id: bool If type is csv, whether the dataset has an id column.
        has_label: bool If type is csv, whether the dataset has a label column.
    - valset:
        type: str Validation dataset type, supports csv.
        path: str If type is csv, folder path of the validation dataset.
        name: str If type is csv, file name of the validation dataset.
        has_id: bool If type is csv, whether the dataset has an id column.
        has_label: bool If type is csv, whether the dataset has a label column.
    - pretrained_model: map
        path: str Pretrained model path.
        name: str Pretrained model name.
- output:
    path: str Folder path of output.
    - model:
        name: str File name of output model.
    - metric_train:
        name: str File name of trainset metrics.
    - metric_val:
        name: str File name of valset metrics.
    - prediction_train:
        name: str File name of trainset prediction.
    - prediction_val:
        name: str File name of valset prediction.
    - ks_plot_train:
        name: str File name of trainset KS values.
    - ks_plot_val:
        name: str File name of valset KS values.
    - decision_table_train:
        name: str File name of trainset decision table.
    - decision_table_val:
        name: str File name of valset decision table.
    - feature_importance:
        name: str File name of feature importance table.
- train_info:
    - interaction_params:
        save_frequency: int Frequency of saving the model, set to -1 to not save the model.
        write_training_prediction: bool Whether to save the predictions on the train dataset.
        echo_training_metrics: bool Whether to output metrics on the train dataset.
        write_validation_prediction: bool Whether to save the predictions on the validation dataset.
    - train_params:
        global_epoch: int Global training epoch.
        batch_size: int Batch size of samples in the global process.
        - encryption: map Can choose either "ckks" or "paillier".
            - ckks: map
                poly_modulus_degree: int Polynomial modulus degree.
                coeff_mod_bit_sizes: list Coefficient modulus sizes.
                global_scale_bit_size: int Global scale factor bit size.
            - paillier:
                key_bit_size: int Bit length of the Paillier key, recommended to be at least 2048.
                precision: int Precision.
                djn_on: bool Whether to use the DJN method to generate the key pair.
                parallelize_on: bool Whether to use multiple cores for computing.
        - optimizer:
            lr: float Learning rate.
            p: int Regularization parameter, "0"/"1"/"2" stands for no regularization/L1 regularization/L2 regularization respectively.
            alpha: float Penalty coefficient.
        - metric: map Metrics to output, all the keys are optional.
            - decision_table: map
                method: str Supports "equal_frequency" and "equal_width".
                bins: int Number of bins in the decision table.
            acc: map Accuracy, supports {}.
            precision: map Precision, supports {}.
            recall: map Recall, supports {}.
            f1_score: map The harmonic mean of precision and recall, supports {}.
            auc: map Area under the ROC curve, supports {}.
            ks: map Kolmogorov–Smirnov test, supports {}.
        - early_stopping:
            key: str Variable to be monitored.
            patience: int Number of epochs with no improvement after which training will be stopped.
            delta: int Minimum change of the key to qualify as an improvement.
        random_seed: int Random seed, accepts None.
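The parameters above can be assembled into a nested mapping. The sketch below builds one for the label_trainer side as a Python dict and checks a few of the constraints stated in the list. Every value (paths, file names, epoch counts, key sizes) is a hypothetical placeholder, the helper `check_config` is not part of any library, and the assumption that exactly one encryption scheme is configured at a time is an interpretation of "either ckks or paillier".

```python
import json

# All values are hypothetical placeholders; only the key names and the
# constraints checked below are taken from the parameter list.
config = {
    "identity": "label_trainer",
    "model_info": {"name": "vertical_logistic_regression"},
    "input": {
        "trainset": {
            "type": "csv",
            "path": "/path/to/dataset",
            "name": "train.csv",
            "has_id": True,
            "has_label": True,
        },
        "valset": {
            "type": "csv",
            "path": "/path/to/dataset",
            "name": "val.csv",
            "has_id": True,
            "has_label": True,
        },
    },
    "output": {
        "path": "/path/to/output",
        "model": {"name": "vertical_lr_model"},
        "metric_train": {"name": "metric_train.csv"},
        "metric_val": {"name": "metric_val.csv"},
    },
    "train_info": {
        "interaction_params": {
            "save_frequency": -1,
            "write_training_prediction": True,
            "echo_training_metrics": True,
            "write_validation_prediction": True,
        },
        "train_params": {
            "global_epoch": 10,
            "batch_size": 512,
            "encryption": {
                "paillier": {
                    "key_bit_size": 2048,
                    "precision": 7,
                    "djn_on": True,
                    "parallelize_on": True,
                }
            },
            "optimizer": {"lr": 0.01, "p": 2, "alpha": 1e-4},
            "metric": {
                "decision_table": {"method": "equal_frequency", "bins": 10},
                "acc": {},
                "auc": {},
                "ks": {},
            },
            "early_stopping": {"key": "acc", "patience": 5, "delta": 0},
            "random_seed": None,
        },
    },
}


def check_config(cfg):
    """Return a list of violations of constraints stated in the parameter list."""
    problems = []
    if cfg["identity"] not in ("label_trainer", "trainer"):
        problems.append("identity must be label_trainer or trainer")
    if cfg["model_info"]["name"] != "vertical_logistic_regression":
        problems.append("model_info.name must be vertical_logistic_regression")
    params = cfg["train_info"]["train_params"]
    if params["optimizer"]["p"] not in (0, 1, 2):
        problems.append("optimizer.p must be 0, 1 or 2")
    # Assumption: exactly one encryption scheme is configured at a time.
    schemes = set(params["encryption"])
    if len(schemes) != 1 or not schemes <= {"ckks", "paillier"}:
        problems.append("encryption must contain exactly one of ckks or paillier")
    return problems


print(check_config(config))
print(json.dumps(config["train_info"]["train_params"]["optimizer"]))
```

A trainer-side mapping would presumably use `"identity": "trainer"` and a dataset without a label column; the exact set of keys each party needs is not specified here.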
[Yang2019] Yang S, Ren B, Zhou X, et al. Parallel distributed logistic regression for vertical federated learning without third-party coordinator. arXiv preprint arXiv:1911.09824, 2019.