# Title: Machine Learning Model Ensemble Fitting and Prediction Workflow for Biodiversity Analysis
 Author: Vinicius Marcilio-Silva
 Contact: viniciuschms@gmail.com
 Date: October 2024
# ----------------------------------------------------------------------------
 Description: This script provides a generalized workflow for fitting models and
 predicting biodiversity metrics from environmental predictors. It was developed as
 supplemental material for the paper "Synergies and trade-offs between biodiversity
 conservation, human well-being, and agricultural production: lessons from the
 Atlantic Forest in Santa Catarina, Brazil" and is based on code by
 Thiago Sanna F. Silva (tsfsilva@rc.unesp.br), available at
 https://datadryad.org/stash/dataset/doi:10.5061/dryad.6m905qfzp

Explanation of Key Steps:
 1. Data Preprocessing: Prepares predictors by centering and scaling.
 2. Data Splitting: Divides data into training and testing subsets for model validation.
 3. Model Formula Creation: Dynamically creates a formula for each response variable.
 4. Visualization: Generates scatter plots to check the relationship between predictors and response.
 5. Cross-Validation: Sets up a repeated cross-validation scheme.
 6. Model List Setup: Specifies the ensemble of models to be trained.
 7. Model Training: Uses caretList (caretEnsemble package) to fit all listed models in a single call under the same resampling scheme.
 8. Model Summaries: Extracts and saves a summary of model results.
 9. Stacked Ensemble Model: Trains a stacked model to aggregate predictions.
 10. Test Set Prediction: Predicts outcomes for the held-out test set and calculates the root mean squared error (RMSE).
 11. Variable Importance: Determines the importance of each variable in the ensemble model.
 12. Future Projections: Applies the ensemble model to new data if future predictions are needed.
This structure provides a reproducible workflow for analyzing multiple biodiversity metrics using ensemble modeling.
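
As a rough illustration of steps 1-3 and 10, the following base-R sketch centers and scales predictors, splits the data, builds a formula from variable names, and computes a test-set RMSE. It is only a sketch under stated assumptions: the data are simulated, the names (`dat`, `temp`, `precip`, `richness`) are hypothetical, and a single `lm` fit stands in for the caret-based ensemble of steps 5-9, which is not reproduced here. The sketch splits before scaling so that the test set does not influence the scaling parameters; the script itself may order these steps differently.

```r
# Minimal base-R sketch of steps 1-3 and 10; all names and data are hypothetical.
set.seed(42)

# Simulated stand-in data: two environmental predictors and one biodiversity metric
n <- 200
dat <- data.frame(
  temp   = rnorm(n, mean = 20, sd = 3),
  precip = rnorm(n, mean = 1500, sd = 200)
)
dat$richness <- 10 + 0.8 * dat$temp + 0.01 * dat$precip + rnorm(n)

predictors <- c("temp", "precip")
response   <- "richness"

# Step 2: random 80/20 split into training and testing subsets
train_idx <- sample(seq_len(n), size = floor(0.8 * n))
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Step 1: center and scale predictors using training-set statistics only
mu    <- colMeans(train[, predictors])
sigma <- apply(train[, predictors], 2, sd)
train[, predictors] <- scale(train[, predictors], center = mu, scale = sigma)
test[, predictors]  <- scale(test[, predictors],  center = mu, scale = sigma)

# Step 3: build the model formula dynamically from the variable names
model_formula <- reformulate(predictors, response = response)

# Stand-in for steps 5-9: a single linear model instead of the ensemble
fit <- lm(model_formula, data = train)

# Step 10: predict on the held-out test set and compute RMSE
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test[[response]] - pred)^2))

print(model_formula)
print(rmse)
```

To handle several biodiversity metrics, the same block can be wrapped in a loop over response names, since `reformulate` only needs the response as a character string.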