2021-12-15/14:51:10 *** Model Training started !! ***
2021-12-15/14:51:10 Entered the delete_existing_model_dir method of fileOperations class ...
2021-12-15/14:51:10 Deleted existing model directory successfully !
2021-12-15/14:51:10 Data load successful !!
2021-12-15/14:51:10 Data preprocessing started ...
2021-12-15/14:51:10 Entered the dropUnnecessaryColumns method of preProcessor class ...
2021-12-15/14:51:10 Drop columns successful !
2021-12-15/14:51:10 Entered the replaceInvalidValuesWithNull method of preProcessor class ...
2021-12-15/14:51:10 Replaced invalid values for columns: ['age', 'sex', 'TSH', 'T3', 'TT4', 'T4U', 'FTI']
2021-12-15/14:51:10 Replacing invalid values successful !
2021-12-15/14:51:10 Entered the encodeCategoricalFeatures method of preProcessor class ...
2021-12-15/14:51:10 Encoded columns: ['sex', 'on_thyroxine', 'query_on_thyroxine', 'on_antithyroid_medication', 'sick', 'pregnant', 'thyroid_surgery', 'I131_treatment', 'query_hypothyroid', 'query_hyperthyroid', 'lithium', 'goitre', 'tumor', 'hypopituitary', 'psych', 'referral_source']
2021-12-15/14:51:10 Encode categorical features successful !
2021-12-15/14:51:10 Entered the encodeTarget method of preProcessor class ...
2021-12-15/14:51:10 Encoding target variable successful !
2021-12-15/14:51:10 Entered the isNullPresent method of preProcessor class ...
2021-12-15/14:51:11 Null values found and written to null_values.csv
2021-12-15/14:51:11 Entered the imputeMissing method of preProcessor class ...
2021-12-15/14:51:11 Data imputation successful !
2021-12-15/14:51:11 Entered the isOutlierPresent method of preProcessor class ...
2021-12-15/14:51:11 21 outliers found and written to outliers.csv
2021-12-15/14:51:11 Entered the removeOutliers method of preProcessor class ...
2021-12-15/14:51:11 Removed outliers successfully ! Shape of data: (3949, 26)
2021-12-15/14:51:11 Entered the splitTarget method of preProcessor class ...
2021-12-15/14:51:11 Splitting dependent and independent features successful !
2021-12-15/14:51:11 Entered the handleImbalanceData method of preProcessor class ...
2021-12-15/14:51:11 Oversampling successful! Records in new dataset: 10998
2021-12-15/14:51:11 Entered the scaleData method of preProcessor class ...
2021-12-15/14:51:11 Scaling using StandardScalar() successful!
2021-12-15/14:51:11 Data preprocessing completed successfully in 1.16 seconds !!
2021-12-15/14:51:11 Clustering started ...
2021-12-15/14:51:11 Entered the getOptimalClusters method of kMeansClustering class ...
2021-12-15/14:51:15 Optimal clusters are: 4
2021-12-15/14:51:15 Entered the createClusters method of kMeansClustering class ...
2021-12-15/14:51:16 Clusters created successfully !
2021-12-15/14:51:16 Entered the saveModel method of fileOperations class ...
2021-12-15/14:51:16 Model file saved successfully: KMeans
2021-12-15/14:51:16 Clustering completed successfully in 4.57 seconds !!
2021-12-15/14:51:16 *** Model creation started !! ***
2021-12-15/14:51:16 Building model for data in cluster 2
2021-12-15/14:51:16 Entered the get_best_model method of modelFinder class ...
2021-12-15/14:51:16 Entered the getBestParamsForKNN method of modelFinder class ...
2021-12-15/14:53:03 Best KNN model parameters: {'algorithm': 'ball_tree', 'leaf_size': 10, 'n_neighbors': 4, 'p': 1}
2021-12-15/14:53:03 KNN model with best parameters fitted successfully in 107.78 seconds!!
2021-12-15/14:53:04 AUC score for KNN: 0.9935124652153101
2021-12-15/14:53:04 Entered the getBestParamsForSVM method of the modelFinder class
2021-12-15/14:54:06 Best SVM model parameters: {'C': 10, 'gamma': 0.1}
2021-12-15/14:54:07 SVM model with best parameters fitted successfully in 63.02 seconds!!
2021-12-15/14:54:07 AUC score for SVM: 0.996752358916956
2021-12-15/14:54:07 Entered the getBestParamsForRF method of modelFinder class ...
2021-12-15/14:54:50 Best Random Forest model parameters: {'criterion': 'entropy', 'max_depth': 3, 'max_features': 'log2', 'n_estimators': 100}
2021-12-15/14:54:51 Random Forest model with best parameters fitted successfully in 44.14 seconds!!
2021-12-15/14:54:51 AUC score for Random Forest: 0.9927888130040845
2021-12-15/14:54:51 Entered the getBestParamsForAdaBoost method of the modelFinder class
2021-12-15/14:55:32 Best AdaBoost model parameters: {'learning_rate': 0.1, 'n_estimators': 10}
2021-12-15/14:55:32 AdaBoost model with best parameters fitted successfully in 41.51 seconds!!
2021-12-15/14:55:32 AUC score for AdaBoost: 0.984112911953482
2021-12-15/14:55:32 Entered the getBestParamsForXGBoost method of the modelFinder class
2021-12-15/15:02:06 Best XGBoost model parameters: {'learning_rate': 0.5, 'max_depth': 3, 'n_estimators': 100}
2021-12-15/15:02:07 XGBoost model with best parameters fitted successfully in 394.44 seconds!!
2021-12-15/15:02:07 AUC score for XGBoost: 0.9993231556830694
2021-12-15/15:02:07 Best model for cluster 2, XGBoost, found in 10.85 mins.
2021-12-15/15:02:07 Entered the saveModel method of fileOperations class ...
2021-12-15/15:02:07 Model file saved successfully: XGBoost2
2021-12-15/15:02:07 Building model for data in cluster 1
2021-12-15/15:02:07 Entered the get_best_model method of modelFinder class ...
2021-12-15/15:02:07 Entered the getBestParamsForKNN method of modelFinder class ...
2021-12-15/15:02:18 Best KNN model parameters: {'algorithm': 'ball_tree', 'leaf_size': 10, 'n_neighbors': 5, 'p': 1}
2021-12-15/15:02:18 KNN model with best parameters fitted successfully in 11.47 seconds!!
2021-12-15/15:02:18 AUC score for KNN: 0.999809581706726
2021-12-15/15:02:18 Entered the getBestParamsForSVM method of the modelFinder class
2021-12-15/15:02:20 Best SVM model parameters: {'C': 10, 'gamma': 0.1}
2021-12-15/15:02:20 SVM model with best parameters fitted successfully in 1.70 seconds!!
2021-12-15/15:02:20 AUC score for SVM: 1.0
2021-12-15/15:02:20 Entered the getBestParamsForRF method of modelFinder class ...
2021-12-15/15:02:50 Best Random Forest model parameters: {'criterion': 'entropy', 'max_depth': 3, 'max_features': 'auto', 'n_estimators': 50}
2021-12-15/15:02:50 Random Forest model with best parameters fitted successfully in 30.24 seconds!!
2021-12-15/15:02:50 AUC score for Random Forest: 0.9979558641760157
2021-12-15/15:02:50 Entered the getBestParamsForAdaBoost method of the modelFinder class
2021-12-15/15:03:13 Best AdaBoost model parameters: {'learning_rate': 0.5, 'n_estimators': 10}
2021-12-15/15:03:13 AdaBoost model with best parameters fitted successfully in 22.28 seconds!!
2021-12-15/15:03:13 AUC score for AdaBoost: 0.993288716789008
2021-12-15/15:03:13 Entered the getBestParamsForXGBoost method of the modelFinder class
2021-12-15/15:04:08 Best XGBoost model parameters: {'learning_rate': 0.5, 'max_depth': 3, 'n_estimators': 10}
2021-12-15/15:04:08 XGBoost model with best parameters fitted successfully in 55.07 seconds!!
2021-12-15/15:04:08 AUC score for XGBoost: 0.9998345087745205
2021-12-15/15:04:08 Best model for cluster 1, SVM, found in 2.01 mins.
2021-12-15/15:04:08 Entered the saveModel method of fileOperations class ...
2021-12-15/15:04:08 Model file saved successfully: SVM1
2021-12-15/15:04:08 Building model for data in cluster 0
2021-12-15/15:04:08 Entered the get_best_model method of modelFinder class ...
2021-12-15/15:04:08 Entered the getBestParamsForKNN method of modelFinder class ...
2021-12-15/15:04:57 Best KNN model parameters: {'algorithm': 'ball_tree', 'leaf_size': 10, 'n_neighbors': 4, 'p': 1}
2021-12-15/15:04:57 KNN model with best parameters fitted successfully in 49.59 seconds!!
2021-12-15/15:04:57 AUC score for KNN: 0.9929820133951114
2021-12-15/15:04:57 Entered the getBestParamsForSVM method of the modelFinder class
2021-12-15/15:05:21 Best SVM model parameters: {'C': 10, 'gamma': 0.1}
2021-12-15/15:05:21 SVM model with best parameters fitted successfully in 23.68 seconds!!
2021-12-15/15:05:21 AUC score for SVM: 0.9990554973444578
2021-12-15/15:05:21 Entered the getBestParamsForRF method of modelFinder class ...
2021-12-15/15:05:57 Best Random Forest model parameters: {'criterion': 'entropy', 'max_depth': 3, 'max_features': 'auto', 'n_estimators': 100}
2021-12-15/15:05:58 Random Forest model with best parameters fitted successfully in 36.56 seconds!!
2021-12-15/15:05:58 AUC score for Random Forest: 0.9980872496698886
2021-12-15/15:05:58 Entered the getBestParamsForAdaBoost method of the modelFinder class
2021-12-15/15:06:29 Best AdaBoost model parameters: {'learning_rate': 0.5, 'n_estimators': 200}
2021-12-15/15:06:30 AdaBoost model with best parameters fitted successfully in 32.37 seconds!!
2021-12-15/15:06:30 AUC score for AdaBoost: 0.9997729352944983
2021-12-15/15:06:30 Entered the getBestParamsForXGBoost method of the modelFinder class
2021-12-15/15:09:39 Best XGBoost model parameters: {'learning_rate': 0.5, 'max_depth': 5, 'n_estimators': 10}
2021-12-15/15:09:39 XGBoost model with best parameters fitted successfully in 188.98 seconds!!
2021-12-15/15:09:39 AUC score for XGBoost: 0.9999863251218218
2021-12-15/15:09:39 Best model for cluster 0, XGBoost, found in 5.52 mins.
2021-12-15/15:09:39 Entered the saveModel method of fileOperations class ...
2021-12-15/15:09:39 Model file saved successfully: XGBoost0
2021-12-15/15:09:39 Building model for data in cluster 3
2021-12-15/15:09:39 Entered the get_best_model method of modelFinder class ...
2021-12-15/15:09:39 Entered the getBestParamsForKNN method of modelFinder class ...
2021-12-15/15:10:04 Best KNN model parameters: {'algorithm': 'ball_tree', 'leaf_size': 10, 'n_neighbors': 4, 'p': 1}
2021-12-15/15:10:04 KNN model with best parameters fitted successfully in 24.85 seconds!!
2021-12-15/15:10:04 AUC score for KNN: 1.0
2021-12-15/15:10:04 Entered the getBestParamsForSVM method of the modelFinder class
2021-12-15/15:10:05 Best SVM model parameters: {'C': 10, 'gamma': 0.01}
2021-12-15/15:10:05 SVM model with best parameters fitted successfully in 1.26 seconds!!
2021-12-15/15:10:05 AUC score for SVM: 1.0
2021-12-15/15:10:05 Entered the getBestParamsForRF method of modelFinder class ...
2021-12-15/15:10:36 Best Random Forest model parameters: {'criterion': 'gini', 'max_depth': 2, 'max_features': 'auto', 'n_estimators': 10}
2021-12-15/15:10:36 Random Forest model with best parameters fitted successfully in 30.45 seconds!!
2021-12-15/15:10:36 AUC score for Random Forest: 1.0
2021-12-15/15:10:36 Entered the getBestParamsForAdaBoost method of the modelFinder class
2021-12-15/15:10:59 Best AdaBoost model parameters: {'learning_rate': 0.5, 'n_estimators': 10}
2021-12-15/15:10:59 AdaBoost model with best parameters fitted successfully in 22.99 seconds!!
2021-12-15/15:10:59 AUC score for AdaBoost: 1.0
2021-12-15/15:10:59 Entered the getBestParamsForXGBoost method of the modelFinder class
2021-12-15/15:12:26 Best XGBoost model parameters: {'learning_rate': 0.5, 'max_depth': 3, 'n_estimators': 50}
2021-12-15/15:12:27 XGBoost model with best parameters fitted successfully in 87.93 seconds!!
2021-12-15/15:12:27 AUC score for XGBoost: 1.0
2021-12-15/15:12:27 Best model for cluster 3, KNN, found in 2.79 mins.
2021-12-15/15:12:27 Entered the saveModel method of fileOperations class ...
2021-12-15/15:12:27 Model file saved successfully: KNN3
2021-12-15/15:13:17 Best models for each cluster and their scores are:
2021-12-15/15:13:17 [(2, 'XGBoost', 0.9993231556830694), (1, 'SVM', 1.0), (0, 'XGBoost', 0.9999863251218218), (3, 'KNN', 1.0)]
2021-12-15/15:13:17 Training completed in 22.12 mins.
2021-12-15/15:13:17 *** End of Training !! ***