model_selection

This module contains functions and classes for model evaluation and selection.

Functions

cm_element(y_true, y_pred) It computes the elements of a confusion matrix.
eval_metrics(y_true, y_pred) It computes common evaluation metrics such as Accuracy, Recall, Precision, F1-measure, and elements of the confusion matrix.
get_results_filename(file_name, clf_name, …) It returns the filename of the results based on user’s input.
grid_search(func_eval, params_range[, log_file]) It performs grid search for a TSVM-based estimator.
performance_eval(tp, tn, fp, fn) It computes common evaluation metrics based on the elements of a confusion matrix.
save_result(validator_obj, problem_type, …) It saves the detailed classification results in a spreadsheet file (Excel).
search_space(kernel_type, search_type, …) It generates all combinations of search elements based on the given ranges of hyper-parameters.

Classes

ThreadGS(usr_input) It runs the Grid Search in a separate thread.
Validator(X_train, y_train, validator_type, …) It evaluates a TSVM-based estimator based on the specified evaluation method.
model_selection.cm_element(y_true, y_pred)[source]

It computes the elements of a confusion matrix.

Parameters:

y_true : array-like

Target values of samples.

y_pred : array-like

Predicted class labels.

Returns:

tp : int

True positive.

tn : int

True negative.

fp : int

False positive.

fn : int

False negative.
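The computation above can be pictured with a short sketch. Note that this is an illustrative re-implementation, not the package's actual code; the +1/-1 label convention commonly used by TSVM-based estimators and the helper name are assumptions.

```python
# Illustrative sketch of cm_element: count confusion-matrix elements,
# assuming +1 marks the positive class and -1 the negative class.
def cm_element_sketch(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == -1)
    return tp, tn, fp, fn

print(cm_element_sketch([1, 1, -1, -1, 1], [1, -1, -1, 1, 1]))  # (2, 1, 1, 1)
```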

model_selection.performance_eval(tp, tn, fp, fn)[source]

It computes common evaluation metrics based on the elements of a confusion matrix.

Parameters:

tp : int

True positive.

tn : int

True negative.

fp : int

False positive.

fn : int

False negative.

Returns:

accuracy : float

Overall accuracy of the model.

recall_p : float

Recall of positive class.

precision_p : float

Precision of positive class.

f1_p : float

F1-measure of positive class.

recall_n : float

Recall of negative class.

precision_n : float

Precision of negative class.

f1_n : float

F1-measure of negative class.
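These metrics follow the standard definitions. A minimal sketch (the function name is hypothetical, and the guards against zero denominators are an assumption about edge-case handling):

```python
# Standard metric definitions from confusion-matrix elements; the
# zero-denominator guards are an assumption, not the package's actual code.
def performance_eval_sketch(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall_p = tp / (tp + fn) if tp + fn else 0.0
    precision_p = tp / (tp + fp) if tp + fp else 0.0
    f1_p = (2 * recall_p * precision_p / (recall_p + precision_p)
            if recall_p + precision_p else 0.0)
    recall_n = tn / (tn + fp) if tn + fp else 0.0
    precision_n = tn / (tn + fn) if tn + fn else 0.0
    f1_n = (2 * recall_n * precision_n / (recall_n + precision_n)
            if recall_n + precision_n else 0.0)
    return accuracy, recall_p, precision_p, f1_p, recall_n, precision_n, f1_n

acc, rec_p, *_ = performance_eval_sketch(tp=40, tn=45, fp=5, fn=10)
print(acc, rec_p)  # 0.85 0.8
```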

model_selection.eval_metrics(y_true, y_pred)[source]

It computes common evaluation metrics such as Accuracy, Recall, Precision, F1-measure, and elements of the confusion matrix.

Parameters:

y_true : array-like

Target values of samples.

y_pred : array-like

Predicted class labels.

Returns:

tp : int

True positive.

tn : int

True negative.

fp : int

False positive.

fn : int

False negative.

accuracy : float

Overall accuracy of the model.

recall_p : float

Recall of positive class.

precision_p : float

Precision of positive class.

f1_p : float

F1-measure of positive class.

recall_n : float

Recall of negative class.

precision_n : float

Precision of negative class.

f1_n : float

F1-measure of negative class.

class model_selection.Validator(X_train, y_train, validator_type, estimator)[source]

Bases: object

It evaluates a TSVM-based estimator based on the specified evaluation method.

Parameters:

X_train : array-like, shape (n_samples, n_features)

Training feature vectors, where n_samples is the number of samples and n_features is the number of features.

y_train : array-like, shape (n_samples,)

Target values or class labels.

validator_type : tuple

A two-element tuple which contains the type of evaluation method and its parameter. Example: (‘CV’, 5) -> 5-fold cross-validation, (‘t_t_split’, 30) -> 30% of samples for the test set.

estimator : estimator object

A TSVM-based estimator which inherits from the BaseTSVM.

Methods

choose_validator() It selects an appropriate evaluation method based on the input parameters.
cv_validator(dict_param) It evaluates a TSVM-based estimator using the cross-validation method.
cv_validator_mc(dict_param) It evaluates a multi-class TSVM-based estimator using the cross-validation.
tt_validator(dict_param) It evaluates a TSVM-based estimator using the train/test split method.
tt_validator_mc(dict_param) It evaluates a multi-class TSVM-based estimator using the train/test split method.
cv_validator(dict_param)[source]

It evaluates a TSVM-based estimator using the cross-validation method.

Parameters:

dict_param : dict

Values of hyper-parameters for a TSVM-based estimator.

Returns:

float

Mean accuracy of the model.

float

Standard deviation of accuracy.

dict

Evaluation metrics such as Recall, Precision, and F1-measure for both classes, as well as elements of the confusion matrix.

tt_validator(dict_param)[source]

It evaluates a TSVM-based estimator using the train/test split method.

Parameters:

dict_param : dict

Values of hyper-parameters for a TSVM-based estimator.

Returns:

float

Accuracy of the model.

float

Zero standard deviation.

dict

Evaluation metrics such as Recall, Precision, and F1-measure for both classes, as well as elements of the confusion matrix.

cv_validator_mc(dict_param)[source]

It evaluates a multi-class TSVM-based estimator using the cross-validation.

Parameters:

dict_param : dict

Values of hyper-parameters for a multi-class TSVM-based estimator.

Returns:

float

Accuracy of the model.

float

Zero standard deviation.

dict

Evaluation metrics such as Recall, Precision, and F1-measure.

tt_validator_mc(dict_param)[source]

It evaluates a multi-class TSVM-based estimator using the train/test split method.

Parameters:

dict_param : dict

Values of hyper-parameters for a TSVM-based estimator.

Returns:

float

Accuracy of the model.

float

Zero standard deviation.

dict

Evaluation metrics such as Recall, Precision, and F1-measure.

choose_validator()[source]

It selects an appropriate evaluation method based on the input parameters.

Returns:

object

An evaluation method for assessing a TSVM-based estimator’s performance.
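The dispatch performed by choose_validator can be pictured as follows. This is a sketch of the documented validator_type convention, not the package's actual code, and the returned descriptions are purely illustrative:

```python
# Hypothetical dispatch mirroring the documented validator_type tuples,
# e.g. ('CV', 5) or ('t_t_split', 30).
def choose_validator_sketch(validator_type):
    kind, param = validator_type
    if kind == 'CV':
        return f'{param}-fold cross-validation'
    if kind == 't_t_split':
        return f'train/test split with {param}% of samples as test set'
    raise ValueError(f'Unknown evaluation method: {kind!r}')

print(choose_validator_sketch(('CV', 5)))  # 5-fold cross-validation
print(choose_validator_sketch(('t_t_split', 30)))
```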

model_selection.search_space(kernel_type, search_type, C1_range, C2_range, u_range, step=1)[source]

It generates all combinations of search elements based on the given ranges of hyper-parameters.

Parameters:

kernel_type : str, {‘linear’, ‘RBF’}

Type of the kernel function which is either ‘linear’ or ‘RBF’.

search_type : str, {‘full’, ‘partial’}

Type of the search space.

C1_range : tuple

Lower and upper bound for C1 penalty parameter.

C2_range : tuple

Lower and upper bound for C2 penalty parameter.

u_range : tuple

Lower and upper bound for gamma parameter.

step : int, optional (default=1)

Step size to increase power of 2.

Returns:

list

Search elements.
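Since step increases the power of 2, the ranges are plausibly exponent bounds for a logarithmic grid. A sketch under that assumption (the function name, dict keys, and the handling of the linear kernel are all assumptions):

```python
from itertools import product

# Assumes C1_range/C2_range/u_range are (low, high) exponent bounds for 2**i;
# for a linear kernel, u_range is plausibly ignored.
def search_space_sketch(kernel_type, C1_range, C2_range, u_range, step=1):
    C1 = [2.0 ** i for i in range(C1_range[0], C1_range[1] + 1, step)]
    C2 = [2.0 ** i for i in range(C2_range[0], C2_range[1] + 1, step)]
    if kernel_type == 'RBF':
        u = [2.0 ** i for i in range(u_range[0], u_range[1] + 1, step)]
        return [{'C1': a, 'C2': b, 'gamma': g}
                for a, b, g in product(C1, C2, u)]
    return [{'C1': a, 'C2': b} for a, b in product(C1, C2)]

print(len(search_space_sketch('linear', (-2, 2), (-2, 2), (-2, 2))))  # 25
```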

model_selection.get_results_filename(file_name, clf_name, kernel_name, test_method)[source]

It returns the filename of the results based on the user’s input.

Parameters:

file_name : str

Name of the dataset file.

clf_name : str

Name of the classifier.

kernel_name : str

Name of kernel function.

test_method : tuple

A two-element tuple which contains the type of evaluation method and its parameter.

Returns:

output : str

Filename of the results.
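One plausible naming scheme is sketched below; the exact filename format produced by the function is not documented here, so this layout is an assumption:

```python
# Hypothetical filename layout; the real function's format may differ.
def results_filename_sketch(file_name, clf_name, kernel_name, test_method):
    kind, param = test_method
    return f'{clf_name}_{kernel_name}_{kind}_{param}_{file_name}.xlsx'

print(results_filename_sketch('iris', 'TSVM', 'RBF', ('CV', 5)))
# TSVM_RBF_CV_5_iris.xlsx
```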

model_selection.save_result(validator_obj, problem_type, gs_result, output_file)[source]

It saves the detailed classification results in a spreadsheet file (Excel).

Parameters:

problem_type : str, {‘binary’, ‘multiclass’}

Type of the classification problem.

validator_obj : object

The evaluation method that was used for the assessment of the TwinSVM classifier.

gs_result : list

Classification results of the TwinSVM classifier using different set of hyperparameters.

output_file : str

The full path and filename of the classification results, e.g. C:\Users\Mir\file.xlsx

Returns:

str

Path to the saved spreadsheet (Excel) file.

model_selection.grid_search(func_eval, params_range, log_file=None)[source]

It performs grid search for a TSVM-based estimator. Note that this function is defined for API usage.

Parameters:

func_eval : object

An evaluation method for assessing a TSVM-based estimator’s performance.

params_range : dict

Range of each hyper-parameter.

log_file : object (default=None)

An opened file for logging best classification accuracy.

Returns:

max_acc : float

Best accuracy obtained after the grid search.

max_acc_std : float

Standard deviation of the best accuracy.

dict

Optimal hyper-parameters.

list

Classification results for every set of hyper-parameters.
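At its core, the search reduces to tracking the best accuracy over candidate settings. A sketch over the documented interface, where func_eval takes a dict of hyper-parameters and returns (accuracy, std, metrics) as the Validator methods above do; iterating over a plain list of candidate dicts is a simplification of params_range:

```python
# Minimal grid-search loop; the (accuracy, std, metrics) return shape of
# func_eval matches the Validator methods documented above.
def grid_search_sketch(func_eval, candidates):
    max_acc, max_acc_std, best_params, results = 0.0, 0.0, None, []
    for params in candidates:
        acc, acc_std, metrics = func_eval(params)
        results.append(metrics)
        if acc > max_acc:
            max_acc, max_acc_std, best_params = acc, acc_std, dict(params)
    return max_acc, max_acc_std, best_params, results

# Dummy evaluator: pretend accuracy grows with C1.
best = grid_search_sketch(lambda p: (p['C1'] / 4, 0.01, {'C1': p['C1']}),
                          [{'C1': 1}, {'C1': 2}])
print(best[0], best[2])  # 0.5 {'C1': 2}
```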

class model_selection.ThreadGS(usr_input)[source]

Bases: PyQt5.QtCore.QObject

It runs the Grid Search in a separate thread.

Parameters:

usr_input : object

An instance of UserInput class which holds the user input.

Methods

blockSignals(self, b)
childEvent(self, a0)
children(self)
connectNotify(self, signal)
customEvent(self, a0)
deleteLater(self)
destroyed destroyed(self, object: typing.Optional[QObject] = None) [signal]
disconnect(a0)
disconnectNotify(self, signal)
dumpObjectInfo(self)
dumpObjectTree(self)
dynamicPropertyNames(self)
event(self, a0)
eventFilter(self, a0, a1)
findChild(self, type, name, options, …) findChild(self, types: Tuple, name: str = ‘’, options: Union[Qt.FindChildOptions, Qt.FindChildOption] = Qt.FindChildrenRecursively) -> QObject
findChildren(self, type, name, options, …) findChildren(self, types: Tuple, name: str = ‘’, options: Union[Qt.FindChildOptions, Qt.FindChildOption] = Qt.FindChildrenRecursively) -> List[QObject] findChildren(self, type: type, regExp: QRegExp, options: Union[Qt.FindChildOptions, Qt.FindChildOption] = Qt.FindChildrenRecursively) -> List[QObject] findChildren(self, types: Tuple, regExp: QRegExp, options: Union[Qt.FindChildOptions, Qt.FindChildOption] = Qt.FindChildrenRecursively) -> List[QObject] findChildren(self, type: type, re: QRegularExpression, options: Union[Qt.FindChildOptions, Qt.FindChildOption] = Qt.FindChildrenRecursively) -> List[QObject] findChildren(self, types: Tuple, re: QRegularExpression, options: Union[Qt.FindChildOptions, Qt.FindChildOption] = Qt.FindChildrenRecursively) -> List[QObject]
inherits(self, classname)
initialize() It passes a user’s input to the functions and classes for solving a classification task.
installEventFilter(self, a0)
isSignalConnected(self, signal)
isWidgetType(self)
isWindowType(self)
killTimer(self, id)
metaObject(self)
moveToThread(self, thread)
objectName(self)
objectNameChanged objectNameChanged(self, objectName: str) [signal]
parent(self)
property(self, name)
pyqtConfigure(…) Each keyword argument is either the name of a Qt property or a Qt signal.
receivers(self, signal)
removeEventFilter(self, a0)
run_gs(func_eval, search_space) Runs grid search for the selected classifier on specified hyper-parameters.
sender(self)
senderSignalIndex(self)
setObjectName(self, name)
setParent(self, a0)
setProperty(self, name, value)
sig_finished
sig_gs_info_set
sig_pbar_set
signalsBlocked(self)
startTimer(self, interval, timerType)
stop() Stops the thread of the grid search.
thread(self)
timerEvent(self, a0)
tr(self, sourceText, disambiguation, n)
run_gs(func_eval, search_space)[source]

Runs grid search for the selected classifier on specified hyper-parameters.

Parameters:

func_eval : object

An evaluation method for assessing a TSVM-based estimator’s performance.

search_space : list

Search elements.

Returns:

list

Classification results for every set of hyper-parameters.

initialize()[source]

It passes a user’s input to the functions and classes for solving a classification task. The steps that this function performs can be summarized as follows:

  1. Specifies a TwinSVM classifier based on the user’s input.
  2. Chooses an evaluation method for assessment of the classifier.
  3. Computes all the combinations of search elements.
  4. Computes the evaluation metrics for all the search elements using grid search.
  5. Saves the detailed classification results in a spreadsheet file (Excel).

Returns:

object

The evaluation method.

dict

Grids of search elements.

stop()[source]

Stops the thread of the grid search.