preprocess
This module defines functions and classes for reading and processing datasets.
Functions
read_libsvm(filename) | Reads LIBSVM data files for classification with the TwinSVM model.
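For readers unfamiliar with the LIBSVM file format, the following is a minimal sketch of how such a file can be parsed. Each line holds a class label followed by sparse `index:value` pairs with 1-based indices. The helper name `parse_libsvm_lines` and its return values are assumptions made for illustration; this page does not state what `read_libsvm` returns or how it is implemented.

```python
def parse_libsvm_lines(lines):
    """Parse LIBSVM-format lines into dense feature rows and labels.

    Hypothetical helper for illustration only; it is not part of the
    preprocess module.
    """
    labels = []
    sparse_rows = []
    max_index = 0

    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines

        # The first token is the class label, e.g. "+1" or "-1".
        labels.append(float(tokens[0]))

        # The remaining tokens are "index:value" pairs (1-based indices).
        row = {}
        for token in tokens[1:]:
            index, value = token.split(":", 1)
            row[int(index)] = float(value)
        if row:
            max_index = max(max_index, max(row))
        sparse_rows.append(row)

    # Features absent from a line are implicitly zero.
    X = [[row.get(j, 0.0) for j in range(1, max_index + 1)]
         for row in sparse_rows]
    return X, labels


sample = ["+1 1:0.5 3:1.2", "-1 2:0.7"]
X, y = parse_libsvm_lines(sample)
```

In this sketch the second feature of the first sample and the first and third features of the second sample are filled with zeros, because the format omits zero-valued entries.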
Classes
DataReader(file_path, sep, header) | Handles data-related tasks such as reading a dataset.
class preprocess.DataReader(file_path, sep, header)
Bases: object
Handles data-related tasks such as reading a dataset.
Parameters:
file_path : str
Path to the dataset file.
sep : str
Separator character.
header : boolean
Whether the dataset has header names or not.
Attributes
X_train (array-like, shape (n_samples, n_features)) | Training samples in a NumPy array.
y_train (array-like, shape (n_samples,)) | Class labels of training samples.
hdr_names (list) | Header names of the dataset.
filename (str) | The dataset's filename.
Methods
get_data() | Returns the processed dataset.
get_data_info() | Returns data characteristics of the dataset.
load_data(shuffle, normalize) | Reads a CSV file into a pandas DataFrame.
load_data(shuffle, normalize)
Reads a CSV file into a pandas DataFrame.
Parameters:
shuffle : boolean
Whether to shuffle the dataset or not.
normalize : boolean
Whether to normalize the dataset or not.
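To make the effect of the `shuffle` and `normalize` flags concrete, here is a rough sketch of a loader with the same parameters. It is not the module's implementation: the function name `load_csv_sketch`, the `seed` argument, the choice of the first column as the class label, and the use of z-score standardization are all assumptions made for this example.

```python
import io

import pandas as pd


def load_csv_sketch(source, sep=",", header=True, shuffle=False,
                    normalize=False, seed=0):
    """Hypothetical stand-in for a CSV loader with shuffle/normalize flags."""
    # Read the file; treat the first row as column names only if requested.
    df = pd.read_csv(source, sep=sep, header=0 if header else None)

    if shuffle:
        # Sampling all rows without replacement yields a random permutation.
        df = df.sample(frac=1, random_state=seed).reset_index(drop=True)

    # Assumption: the first column holds class labels, the rest are features.
    y = df.iloc[:, 0].to_numpy()
    X = df.iloc[:, 1:].to_numpy(dtype=float)

    if normalize:
        # Assumption: z-score standardization per feature column.
        std = X.std(axis=0)
        std[std == 0] = 1.0  # avoid division by zero for constant columns
        X = (X - X.mean(axis=0)) / std

    hdr_names = list(df.columns) if header else []
    return X, y, hdr_names


csv_text = "label,f1,f2\n1,2.0,10.0\n-1,4.0,10.0\n1,6.0,10.0\n"
X_train, y_train, hdr_names = load_csv_sketch(io.StringIO(csv_text),
                                              normalize=True)
```

After normalization in this sketch, each non-constant feature column has zero mean and unit standard deviation, and constant columns become all zeros.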