2021-12-15/15:20:44 Beginning Prediction data download, validation and insertion ...
2021-12-15/15:20:44 *** Starting data download from S3 bucket ... ***
2021-12-15/15:20:44 Entered the download_raw_training_data method of rawPredictionValidation class ...
2021-12-15/15:20:44 Entered the download_dir_from_s3 method of rawPredictionValidation class ...
2021-12-15/15:20:44 Downloading prediction data files from S3 bucket started ...
2021-12-15/15:20:45 10 files downloaded successfully !!
2021-12-15/15:20:45 All prediction data files moved to: Raw_Data/Prediction_Batch_Files
2021-12-15/15:20:45 Prediction schema file moved to Raw_Data directory
2021-12-15/15:20:45 Prediction files downloaded from S3 bucket successfully !
2021-12-15/15:20:45 Data download completed in 1.50 seconds!!
2021-12-15/15:20:45 *** Data download completed successfully ***
2021-12-15/15:20:45 *** Starting raw data validation ... ***
2021-12-15/15:20:45 Entered the get_schema_values method of rawPredictionValidation class ...
2021-12-15/15:20:45 LengthOfDateStampInFile: 7, LengthOfTimeStampInFile: 13, NumberofColumns: 29
2021-12-15/15:20:45 Received values from schema successfully !
2021-12-15/15:20:45 Entered the generate_regex method of rawPredictionValidation class ...
2021-12-15/15:20:45 Regex generated successfully !
2021-12-15/15:20:45 Entered the validate_filename method of rawPredictionValidation class ...
2021-12-15/15:20:45 Entered the delete_existing_good_raw method of rawPredictionValidation class ...
2021-12-15/15:20:45 Existing Good raw data directory deleted successfully !
2021-12-15/15:20:45 Entered the delete_existing_bad_raw method of rawPredictionValidation class ...
2021-12-15/15:20:45 Existing Bad raw data directory deleted successfully !
2021-12-15/15:20:45 Entered the create_good_bad_raw_directory method of rawPredictionValidation class ...
2021-12-15/15:20:45 Good/Bad raw data directories created successfully !
2021-12-15/15:20:45 All files validated and moved to Good_Raw, Bad_Raw data directories successfully !
2021-12-15/15:20:45 Validated filenames successfully !.
2021-12-15/15:20:45 Entered the validate_column_length method of rawPredictionValidation class ...
2021-12-15/15:20:45 4/5 files had invalid column length and were moved to Bad_Raw directory
2021-12-15/15:20:45 Validated number of columns successfully !
2021-12-15/15:20:45 Entered the validate_all_missing_values method of rawPredictionValidation class ...
2021-12-15/15:20:45 0/1 files had missing values in all columns and were moved to Bad_Raw directory
2021-12-15/15:20:45 Validated all missing values successfully !
2021-12-15/15:20:45 Entered the get_good_bad_fileCount method of rawPredictionValidation class ...
2021-12-15/15:20:45 Total files: 9, Good files: 1, Bad files: 8
2021-12-15/15:20:45 Raw data validation completed in 0.06 seconds!!
2021-12-15/15:20:45 *** Raw data validation completed successfully ***
2021-12-15/15:20:45 *** Starting data transformation ... ***
2021-12-15/15:20:45 Entered the add_quotes_to_strings method of transformData class ...
2021-12-15/15:20:45 Added quotes successfully !
2021-12-15/15:20:45 Data transformation completed in 0.02 seconds !!
2021-12-15/15:20:45 *** Data transformation completed successfully ***
2021-12-15/15:20:45 *** Starting database operations ... ***
2021-12-15/15:20:45 Entered the create_table method of dbOperations class ...
2021-12-15/15:20:45 Entered the create_db_conn method of dbOperations class ...
2021-12-15/15:20:45 DB connection created successfully !
2021-12-15/15:20:45 Data table already exists !
2021-12-15/15:20:45 Database connection closed successfully !
2021-12-15/15:20:45 Created database and table with column names successfully !
2021-12-15/15:20:45 Entered the insert_data method of dbOperations class ...
2021-12-15/15:20:45 Entered the create_db_conn method of dbOperations class ...
2021-12-15/15:20:45 DB connection created successfully !
2021-12-15/15:20:45 Good_Raw_Data table exists with 259 records. Not inserting data
2021-12-15/15:20:45 Database connection closed successfully !
2021-12-15/15:20:45 Inserted data into table successfully !
2021-12-15/15:20:45 Entered the delete_existing_good_raw method of rawPredictionValidation class ...
2021-12-15/15:20:45 Existing Good raw data directory deleted successfully !
2021-12-15/15:20:45 Deleted Good data directory successfully !
2021-12-15/15:20:45 Entered the move_bad_files_to_archive method of rawPredictionValidation class ...
2021-12-15/15:20:45 Bad files moved to archive directory: Prediction/Prediction_BadData_Archive/BadData_2021-12-15_152045
2021-12-15/15:20:45 Moved bad data files to Archive successfully !
2021-12-15/15:20:45 Entered the export_to_csv method of dbOperations class ...
2021-12-15/15:20:45 Entered the create_db_conn method of dbOperations class ...
2021-12-15/15:20:45 DB connection created successfully !
2021-12-15/15:20:45 Data exported to Prediction/PredictionFile_FromDB/InputFile.csv csv
2021-12-15/15:20:45 Database connection closed successfully !
2021-12-15/15:20:45 Exported data to csv successfully !
2021-12-15/15:20:45 Database insertion and export completed in 0.02 seconds
2021-12-15/15:20:45 *** Database operations completed successfully ***
2021-12-15/15:20:45 Prediction data validation and insertion completed successfully !! Total time taken: 1.60 seconds.
2021-12-15/15:20:45 *** Model Prediction started !! ***
2021-12-15/15:20:45 Entered the delete_existing_prediction_file method of predictionDataValidation class ...
2021-12-15/15:20:45 Existing prediction_file deleted successfully !
2021-12-15/15:20:45 Data pre-processing started ...
2021-12-15/15:20:45 Data load successful !
2021-12-15/15:20:45 Entered the replace_invalid_values method of predictionDataValidation class ...
2021-12-15/15:20:45 Replaced invalid values for columns: ['sex', 'TSH', 'T3', 'TT4', 'T4U', 'FTI', 'TBG']
2021-12-15/15:20:45 Invalid values replaced successfully !
2021-12-15/15:20:45 Entered the drop_unnecessary_columns method of predictionDataValidation class ...
2021-12-15/15:20:45 Columns dropped are: ['TSH_measured', 'T3_measured', 'TT4_measured', 'T4U_measured', 'FTI_measured', 'TBG_measured', 'TBG']
2021-12-15/15:20:45 Dropped columns successfully !
2021-12-15/15:20:45 Entered the encode_categorical method of predictionDataValidation class ...
2021-12-15/15:20:45 Encoded columns: ['sex', 'on_thyroxine', 'query_on_thyroxine', 'on_antithyroid_medication', 'sick', 'pregnant', 'thyroid_surgery', 'I131_treatment', 'query_hypothyroid', 'query_hyperthyroid', 'lithium', 'goitre', 'tumor', 'hypopituitary', 'psych', 'referral_source']
2021-12-15/15:20:45 Encoded categorical features successfully !
2021-12-15/15:20:45 Entered the is_null_present method of predictionDataValidation class ...
2021-12-15/15:20:45 Null values found and written to null_values.csv
2021-12-15/15:20:45 Entered the impute_missing method of predictionDataValidation class ...
2021-12-15/15:20:45 Imputed missing values successfully !
2021-12-15/15:20:45 Entered the is_outlier_present method of predictionDataValidation class ...
2021-12-15/15:20:45 2 outliers found and written to outliers.csv
2021-12-15/15:20:45 Entered the remove_outliers method of predictionDataValidation class ...
2021-12-15/15:20:45 Removed outliers successfully ! Shape of data: (257, 25)
2021-12-15/15:20:45 Data pre-processing completed successfully in 0.10 seconds
2021-12-15/15:20:45 Predicting clusters started ...
2021-12-15/15:20:45 Entered the loadModel method of fileOperations class ...
2021-12-15/15:20:45 Model file KMeans loaded successfully !
2021-12-15/15:20:45 Clusters predicted successfully in 0.00 seconds
2021-12-15/15:20:45 *** Prediction started !! ***
2021-12-15/15:20:45 Entered the find_correct_model_file method of fileOperations class ...
2021-12-15/15:20:45 Entered the loadModel method of fileOperations class ...
2021-12-15/15:20:45 Model file KNN3 loaded successfully !
2021-12-15/15:20:45 Starting prediction for cluster 3
2021-12-15/15:20:45 Time elapsed in prediction for cluster 3: 0.00 seconds
2021-12-15/15:20:45 Entered the find_correct_model_file method of fileOperations class ...
2021-12-15/15:20:45 Entered the loadModel method of fileOperations class ...
2021-12-15/15:20:45 Model file XGBoost2 loaded successfully !
2021-12-15/15:20:45 Starting prediction for cluster 2
2021-12-15/15:20:45 Time elapsed in prediction for cluster 2: 0.00 seconds
2021-12-15/15:20:45 Entered the find_correct_model_file method of fileOperations class ...
2021-12-15/15:20:45 Entered the loadModel method of fileOperations class ...
2021-12-15/15:20:45 Model file XGBoost0 loaded successfully !
2021-12-15/15:20:45 Starting prediction for cluster 0
2021-12-15/15:20:45 Time elapsed in prediction for cluster 0: 0.00 seconds
2021-12-15/15:20:45 Entered the find_correct_model_file method of fileOperations class ...
2021-12-15/15:20:45 Entered the loadModel method of fileOperations class ...
2021-12-15/15:20:45 Model file SVM1 loaded successfully !
2021-12-15/15:20:45 Starting prediction for cluster 1
2021-12-15/15:20:45 Time elapsed in prediction for cluster 1: 0.00 seconds
2021-12-15/15:20:45 Entered the upload_prediction_results method of fileOperations class ...
2021-12-15/15:20:46 Prediction results uploaded successfully to S3 bucket !
2021-12-15/15:20:46 Total time for Prediction: 0.34 seconds
2021-12-15/15:20:46 *** Model Prediction successful !! ***
2021-12-15/20:44:22 *** Model Prediction started !! ***
2021-12-15/20:44:22 Entered the delete_existing_prediction_file method of predictionDataValidation class ...
2021-12-15/20:44:22 Existing prediction_file deleted successfully !
2021-12-15/20:44:22 Data pre-processing started ...
2021-12-15/20:44:22 Data load successful !
2021-12-15/20:44:22 Entered the replace_invalid_values method of predictionDataValidation class ...
2021-12-15/20:44:22 Replaced invalid values for columns: ['sex', 'TSH', 'T3', 'TT4', 'T4U', 'FTI', 'TBG']
2021-12-15/20:44:22 Invalid values replaced successfully !
2021-12-15/20:44:22 Entered the drop_unnecessary_columns method of predictionDataValidation class ...
2021-12-15/20:44:22 Columns dropped are: ['TSH_measured', 'T3_measured', 'TT4_measured', 'T4U_measured', 'FTI_measured', 'TBG_measured', 'TBG']
2021-12-15/20:44:22 Dropped columns successfully !
2021-12-15/20:44:22 Entered the encode_categorical method of predictionDataValidation class ...
2021-12-15/20:44:22 Encoded columns: ['sex', 'on_thyroxine', 'query_on_thyroxine', 'on_antithyroid_medication', 'sick', 'pregnant', 'thyroid_surgery', 'I131_treatment', 'query_hypothyroid', 'query_hyperthyroid', 'lithium', 'goitre', 'tumor', 'hypopituitary', 'psych', 'referral_source']
2021-12-15/20:44:22 Encoded categorical features successfully !
2021-12-15/20:44:22 Entered the is_null_present method of predictionDataValidation class ...
2021-12-15/20:44:22 Null values found and written to null_values.csv
2021-12-15/20:44:22 Entered the impute_missing method of predictionDataValidation class ...
2021-12-15/20:44:22 Imputed missing values successfully !
2021-12-15/20:44:22 Entered the is_outlier_present method of predictionDataValidation class ...
2021-12-15/20:44:22 2 outliers found and written to outliers.csv
2021-12-15/20:44:22 Entered the remove_outliers method of predictionDataValidation class ...
2021-12-15/20:44:22 Removed outliers successfully ! Shape of data: (257, 25)
2021-12-15/20:44:22 Data pre-processing completed successfully in 0.39 seconds
2021-12-15/20:44:22 Predicting clusters started ...
2021-12-15/20:44:22 Entered the loadModel method of fileOperations class ...
2021-12-15/20:44:22 Model file KMeans loaded successfully !
2021-12-15/20:44:22 Clusters predicted successfully in 0.00 seconds
2021-12-15/20:44:22 *** Prediction started !! ***
2021-12-15/20:44:22 Entered the find_correct_model_file method of fileOperations class ...
2021-12-15/20:44:22 Entered the loadModel method of fileOperations class ...
2021-12-15/20:44:22 Model file KNN3 loaded successfully !
2021-12-15/20:44:22 Starting prediction for cluster 3
2021-12-15/20:44:22 Time elapsed in prediction for cluster 3: 0.00 seconds
2021-12-15/20:44:22 Entered the find_correct_model_file method of fileOperations class ...
2021-12-15/20:44:22 Entered the loadModel method of fileOperations class ...
2021-12-15/20:44:22 Model file XGBoost2 loaded successfully !
2021-12-15/20:44:22 Starting prediction for cluster 2
2021-12-15/20:44:22 Time elapsed in prediction for cluster 2: 0.00 seconds
2021-12-15/20:44:22 Entered the find_correct_model_file method of fileOperations class ...
2021-12-15/20:44:22 Entered the loadModel method of fileOperations class ...
2021-12-15/20:44:22 Model file XGBoost0 loaded successfully !
2021-12-15/20:44:22 Starting prediction for cluster 0
2021-12-15/20:44:22 Time elapsed in prediction for cluster 0: 0.00 seconds
2021-12-15/20:44:22 Entered the find_correct_model_file method of fileOperations class ...
2021-12-15/20:44:22 Entered the loadModel method of fileOperations class ...
2021-12-15/20:44:22 Model file SVM1 loaded successfully !
2021-12-15/20:44:22 Starting prediction for cluster 1
2021-12-15/20:44:22 Time elapsed in prediction for cluster 1: 0.00 seconds
2021-12-15/20:44:22 Entered the upload_prediction_results method of fileOperations class ...
2021-12-15/20:44:23 Prediction results uploaded successfully to S3 bucket !
2021-12-15/20:44:23 Total time for Prediction: 0.78 seconds
2021-12-15/20:44:23 *** Model Prediction successful !! ***
2021-12-15/21:03:38 *** Model Prediction started !! ***
2021-12-15/21:03:38 Entered the delete_existing_prediction_file method of predictionDataValidation class ...
2021-12-15/21:03:38 Existing prediction_file deleted successfully !
2021-12-15/21:03:38 Data pre-processing started ...
2021-12-15/21:03:38 Data load successful !
2021-12-15/21:03:38 Entered the replace_invalid_values method of predictionDataValidation class ...
2021-12-15/21:03:38 Replaced invalid values for columns: ['sex', 'TSH', 'T3', 'TT4', 'T4U', 'FTI', 'TBG']
2021-12-15/21:03:38 Invalid values replaced successfully !
2021-12-15/21:03:38 Entered the drop_unnecessary_columns method of predictionDataValidation class ...
2021-12-15/21:03:38 Columns dropped are: ['TSH_measured', 'T3_measured', 'TT4_measured', 'T4U_measured', 'FTI_measured', 'TBG_measured', 'TBG']
2021-12-15/21:03:38 Dropped columns successfully !
2021-12-15/21:03:38 Entered the encode_categorical method of predictionDataValidation class ...
2021-12-15/21:03:38 Encoded columns: ['sex', 'on_thyroxine', 'query_on_thyroxine', 'on_antithyroid_medication', 'sick', 'pregnant', 'thyroid_surgery', 'I131_treatment', 'query_hypothyroid', 'query_hyperthyroid', 'lithium', 'goitre', 'tumor', 'hypopituitary', 'psych', 'referral_source']
2021-12-15/21:03:38 Encoded categorical features successfully !
2021-12-15/21:03:38 Entered the is_null_present method of predictionDataValidation class ...
2021-12-15/21:03:38 Null values found and written to null_values.csv
2021-12-15/21:03:38 Entered the impute_missing method of predictionDataValidation class ...
2021-12-15/21:03:38 Imputed missing values successfully !
2021-12-15/21:03:38 Entered the is_outlier_present method of predictionDataValidation class ...
2021-12-15/21:03:38 2 outliers found and written to outliers.csv
2021-12-15/21:03:38 Entered the remove_outliers method of predictionDataValidation class ...
2021-12-15/21:03:38 Removed outliers successfully ! Shape of data: (257, 25)
2021-12-15/21:03:38 Data pre-processing completed successfully in 0.22 seconds
2021-12-15/21:03:38 Predicting clusters started ...
2021-12-15/21:03:38 Entered the loadModel method of fileOperations class ...
2021-12-15/21:03:38 Model file KMeans loaded successfully !
2021-12-15/21:03:38 Clusters predicted successfully in 0.00 seconds
2021-12-15/21:03:38 *** Prediction started !! ***
2021-12-15/21:03:38 Entered the find_correct_model_file method of fileOperations class ...
2021-12-15/21:03:38 Entered the loadModel method of fileOperations class ...
2021-12-15/21:03:38 Model file KNN3 loaded successfully !
2021-12-15/21:03:38 Starting prediction for cluster 3
2021-12-15/21:03:38 Time elapsed in prediction for cluster 3: 0.00 seconds
2021-12-15/21:03:38 Entered the find_correct_model_file method of fileOperations class ...
2021-12-15/21:03:38 Entered the loadModel method of fileOperations class ...
2021-12-15/21:03:38 Model file XGBoost2 loaded successfully !
2021-12-15/21:03:38 Starting prediction for cluster 2
2021-12-15/21:03:38 Time elapsed in prediction for cluster 2: 0.00 seconds
2021-12-15/21:03:38 Entered the find_correct_model_file method of fileOperations class ...
2021-12-15/21:03:38 Entered the loadModel method of fileOperations class ...
2021-12-15/21:03:38 Model file XGBoost0 loaded successfully !
2021-12-15/21:03:38 Starting prediction for cluster 0
2021-12-15/21:03:38 Time elapsed in prediction for cluster 0: 0.00 seconds
2021-12-15/21:03:38 Entered the find_correct_model_file method of fileOperations class ...
2021-12-15/21:03:38 Entered the loadModel method of fileOperations class ...
2021-12-15/21:03:38 Model file SVM1 loaded successfully !
2021-12-15/21:03:38 Starting prediction for cluster 1
2021-12-15/21:03:38 Time elapsed in prediction for cluster 1: 0.00 seconds
2021-12-15/21:03:38 Entered the upload_prediction_results method of fileOperations class ...
2021-12-15/21:03:38 Prediction results uploaded successfully to S3 bucket !
2021-12-15/21:03:38 Total time for Prediction: 0.57 seconds
2021-12-15/21:03:38 *** Model Prediction successful !! ***