Vertical Binning Woe Iv
Introduction
Vertical Binning Woe Iv is an algorithm for calculating weight of evidence (WOE) and information value (IV) in a vertical federated setting.
Two binning methods are supported:
- Equal width binning: divides the value range into \(k\) intervals of equal size.
- Equal frequency binning: divides the data into \(k\) groups, each containing approximately the same number of values.
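The two binning methods can be sketched with NumPy as follows. This is a minimal single-party illustration, not the library's implementation; the function names `equal_width_edges`, `equal_frequency_edges` and `assign_bins` are hypothetical, and the sample values are made up to show how an outlier affects each method.

```python
import numpy as np

def equal_width_edges(values, k):
    """Return k+1 edges that split [min, max] into k intervals of equal size."""
    values = np.asarray(values, dtype=float)
    return np.linspace(values.min(), values.max(), k + 1)

def equal_frequency_edges(values, k):
    """Return k+1 edges at evenly spaced quantiles, so bins hold similar counts."""
    values = np.asarray(values, dtype=float)
    return np.quantile(values, np.linspace(0.0, 1.0, k + 1))

def assign_bins(values, edges):
    """Map each value to a bin index in 0..k-1 using only the interior edges."""
    values = np.asarray(values, dtype=float)
    # Using interior edges keeps the minimum in bin 0 and the maximum in bin k-1.
    return np.digitize(values, edges[1:-1], right=False)

# Made-up data with one large outlier.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
width_bins = assign_bins(values, equal_width_edges(values, 2))
freq_bins = assign_bins(values, equal_frequency_edges(values, 2))

print(np.bincount(width_bins, minlength=2))  # samples per bin, equal width
print(np.bincount(freq_bins, minlength=2))   # samples per bin, equal frequency
```

With this data, equal width binning places nine of the ten values in the first bin because the outlier stretches the range, while equal frequency binning splits them five and five.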
After binning, the WOE and IV values of the \(i\)-th bin are calculated as follows:
\(WOE_i = \ln \frac{y_i / y_T}{n_i/n_T}\)
\(IV_i = \left( \frac{y_i}{y_T} - \frac{n_i}{n_T} \right) \times WOE_i\)
where \(y_i\) and \(n_i\) denote the number of positive and negative samples in the \(i\)-th bin, respectively, and \(y_T\) and \(n_T\) denote the total number of positive and negative samples.
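The formulas above can be checked with a short plaintext sketch. It assumes labels are encoded as 1 (positive) and 0 (negative) and that every bin contains at least one sample of each class; the function name `woe_iv` is hypothetical. It does not model the federated protocol or any encryption, only the arithmetic.

```python
import numpy as np

def woe_iv(bin_index, labels, num_bins):
    """Compute per-bin WOE_i and IV_i from bin indices and 0/1 labels."""
    bin_index = np.asarray(bin_index)
    labels = np.asarray(labels)
    # y_i and n_i: positive and negative counts per bin.
    y = np.bincount(bin_index, weights=(labels == 1).astype(float), minlength=num_bins)
    n = np.bincount(bin_index, weights=(labels == 0).astype(float), minlength=num_bins)
    pos_rate = y / y.sum()  # y_i / y_T
    neg_rate = n / n.sum()  # n_i / n_T
    woe = np.log(pos_rate / neg_rate)
    iv = (pos_rate - neg_rate) * woe
    return woe, iv

# Made-up example: bin 0 holds 1 positive and 3 negatives, bin 1 holds 3 positives and 1 negative.
bin_index = [0, 0, 0, 0, 1, 1, 1, 1]
labels = [1, 0, 0, 0, 1, 1, 1, 0]
woe, iv = woe_iv(bin_index, labels, num_bins=2)

print(woe)       # per-bin WOE
print(iv)        # per-bin IV
print(iv.sum())  # summing the per-bin values gives the IV of the whole feature
```

In this example \(WOE_0 = \ln \frac{1/4}{3/4} = -\ln 3\) and \(WOE_1 = \ln 3\), so each bin contributes \(0.5 \ln 3\) to the IV. A bin with no positive or no negative samples would produce an infinite or undefined WOE; the sketch does not handle that case.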
Parameters List
- identity: str Federated identity of the party, should be one of label_trainer or trainer.
- model_info:
    - name: str Model name, should be vertical_binning_woe_iv.
- input:
    - trainset:
        - type: str Train dataset type; csv is supported.
        - path: str If type is csv, folder path of the train dataset.
        - name: str If type is csv, file name of the train dataset.
        - has_id: bool If type is csv, whether the dataset has an id column.
        - has_label: bool If type is csv, whether the dataset has a label column.
        - nan_list: list List of special values; all values in this list, and only those, are assigned to a single bin.
- output:
    - path: str Folder path of output.
    - result:
        - name: str File name of result.
    - split_points:
        - name: str File name of split points.
- train_info:
    - train_params:
        - encryption: supports two keys: “paillier” or “plain”.
            - paillier:
                - key_bit_size: int Bit length of the Paillier key; a value of at least 2048 is recommended.
                - precision: int Precision.
                - djn_on: bool Whether to use the DJN method to generate the key pair.
                - parallelize_on: bool Whether to use multiple cores for computing.
        - binning:
            - method: str Binning method, supports “equal_width” or “equal_frequency”.
            - bins: int Number of bins.
        - max_num_cores: int Number of cores for parallel computing.
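As an illustration, the parameters above could be assembled into a nested structure like the one below. Only the key names come from the parameter list; every value (paths, file names, precision, core count) is a hypothetical placeholder. The exact nesting, the placement of max_num_cores under train_params, and the JSON serialization are all assumptions, so check them against the framework's own configuration examples.

```python
import json

# Hypothetical values throughout; only the key names are taken from the parameter list.
config = {
    "identity": "label_trainer",
    "model_info": {"name": "vertical_binning_woe_iv"},
    "input": {
        "trainset": {
            "type": "csv",
            "path": "/path/to/dataset",  # placeholder folder
            "name": "train.csv",         # placeholder file name
            "has_id": True,
            "has_label": True,
            "nan_list": [],              # special values that share a single bin
        }
    },
    "output": {
        "path": "/path/to/output",                   # placeholder folder
        "result": {"name": "woe_iv_result.json"},    # placeholder file name
        "split_points": {"name": "split_points.json"},
    },
    "train_info": {
        "train_params": {
            "encryption": {
                "paillier": {
                    "key_bit_size": 2048,
                    "precision": 7,      # placeholder value
                    "djn_on": True,
                    "parallelize_on": True,
                }
            },
            "binning": {"method": "equal_width", "bins": 5},
            "max_num_cores": 2,
        }
    },
}

# Light sanity checks against the constraints stated in the parameter list.
params = config["train_info"]["train_params"]
assert config["identity"] in ("label_trainer", "trainer")
assert set(params["encryption"]) <= {"paillier", "plain"}
assert params["binning"]["method"] in ("equal_width", "equal_frequency")

print(json.dumps(config, indent=4))
```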