Phase III Model Training (Data File Phase 3 HTML)
Click on the HTML file attached to read the scenario. After reading through the case, please click next to jump into the quiz and exercises.
We recommend leaving the HTML file open so that you can easily look up the information the quiz and exercises ask about.
What learning phenomenon is the team observing now?
Part 2. (a)
What are some techniques that can be applied in order to improve generalization performance? Check all that apply.
Weight decay (L2 regularization)
Increasing the number of model parameters
Stronger data augmentation
Part 2. (b)
Sometimes, overfitting is attributed to the task being “too hard.” Given what we know about model behavior during overfitting, how can this explanation be justified?
What is weight decay?
An added penalty in the loss function that encourages semantic clustering in feature space
An added penalty in the loss function that mitigates class imbalance
An added penalty in the loss function that ensures the model is well calibrated
An added penalty in the loss function that discourages models from becoming overly complex
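For reference, the sketch below shows how the L2 form of weight decay named in Part 2 (a) adds a term to the loss. The function name `l2_penalized_loss`, the example weights, and the value `weight_decay=0.01` are illustrative assumptions, not taken from the case.

```python
def l2_penalized_loss(data_loss, weights, weight_decay):
    """Return the data loss plus weight_decay times the sum of squared weights."""
    penalty = weight_decay * sum(w * w for w in weights)
    return data_loss + penalty


# Same data loss, different weight magnitudes (hypothetical values).
small = l2_penalized_loss(0.5, [0.1, -0.2, 0.1], weight_decay=0.01)
large = l2_penalized_loss(0.5, [3.0, -4.0, 2.0], weight_decay=0.01)

print(f"loss with small weights: {small:.4f}")
print(f"loss with large weights: {large:.4f}")
```

With the data loss held fixed, the model with larger weights pays a larger total loss, so the optimizer is pushed toward smaller weights.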
What does dropout do?
Dropout randomly removes layers in the network during training in order to improve the rate of convergence
Dropout randomly removes neurons in the network during training in order to prevent overreliance on any one neuron
Dropout randomly removes layers in the network during training in order to prevent overreliance on any one layer
Dropout randomly removes neurons in the network during training in order to improve the rate of convergence
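As a reminder of the mechanics, here is a minimal sketch of standard (inverted) dropout applied to one layer's activations during training. The helper name `dropout`, the drop probability of 0.5, and the all-ones activations are assumptions chosen for illustration.

```python
import random


def dropout(activations, drop_prob, rng):
    """Zero each unit with probability drop_prob and rescale the survivors."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    keep_prob = 1.0 - drop_prob
    # Dividing by keep_prob keeps the expected activation unchanged.
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


rng = random.Random(0)            # fixed seed so the run is repeatable
activations = [1.0] * 10          # hypothetical layer output
dropped = dropout(activations, 0.5, rng)
unchanged = dropout(activations, 0.0, rng)
print(dropped)
```

At inference time dropout is switched off, which corresponds to calling the sketch with `drop_prob=0.0`.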
Part 5. Which of the following are tunable hyperparameters? Check all that apply.
Weight decay strength
The team is noticing counterintuitive results regarding the performance of the model when measured with accuracy and AUROC. What is likely occurring? NOTE: There are 27,000 COVID-negative exams and 3,000 COVID-positive exams, a breakdown of 90% negative cases and 10% positive cases.
Accuracy is a poor metric for performance because of the small number of samples in the test set
Accuracy is a poor metric for performance because of the high class imbalance
AUROC is a poor metric for performance because the team has a predetermined threshold in mind
AUROC is a poor metric for performance because it can only be used in multi-class settings
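To make the numbers in the note concrete, the sketch below scores a hypothetical classifier that labels every exam as COVID-negative on the stated 27,000 / 3,000 split. The always-negative classifier is an assumption used for illustration; the class counts come from the question.

```python
# 0 = COVID-negative, 1 = COVID-positive (counts from the question's note).
labels = [0] * 27_000 + [1] * 3_000

# Hypothetical classifier that predicts "negative" for every exam.
predictions = [0] * len(labels)

correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
sensitivity = true_positives / sum(labels)

print(f"accuracy:    {accuracy:.2f}")
print(f"sensitivity: {sensitivity:.2f}")
```

The accuracy equals the share of the majority class even though no positive exam is detected.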
Further analysis shows that the model is predicting that every patient is COVID-negative. What can be done to mitigate this effect? Check all that apply.
Using dropout during training to improve performance on the test set
Undersampling COVID-positive exams during training
Upweighting the loss on COVID-positive exams during training
Lowering the learning rate to improve convergence
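The sketch below illustrates what per-class loss weighting can look like for the 27,000 / 3,000 split. The inverse-frequency heuristic in `class_weights` and the helper `weighted_bce` are assumptions for illustration; the case does not say which weighting scheme the team would use.

```python
import math


def class_weights(n_negative, n_positive):
    """Inverse-frequency weights: total / (number_of_classes * class_count)."""
    total = n_negative + n_positive
    return {0: total / (2 * n_negative), 1: total / (2 * n_positive)}


def weighted_bce(prob_positive, label, weights):
    """Binary cross-entropy for one exam, scaled by the weight of its true class."""
    eps = 1e-12
    p = min(max(prob_positive, eps), 1.0 - eps)  # avoid log(0)
    loss = -math.log(p) if label == 1 else -math.log(1.0 - p)
    return weights[label] * loss


weights = class_weights(27_000, 3_000)

# The same raw error (true class assigned probability 0.1) costs more on a positive exam.
missed_positive = weighted_bce(0.1, 1, weights)
missed_negative = weighted_bce(0.9, 0, weights)
print(weights, missed_positive, missed_negative)
```

Under this heuristic a misclassified positive exam contributes nine times as much loss as an equally misclassified negative exam, which matches the 9:1 class ratio.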
What is a pro of using k-fold cross-validation instead of a hold-out validation set for hyperparameter tuning?
Improves model convergence rates because many hyperparameters can be tested at the same time
Regularizes the model by randomly selecting training examples automatically
Requires less overall time to train a model, due to the reduced number of training samples
Produces a more reliable estimate of the generalization performance of the model
What is a con of using k-fold cross-validation instead of a hold-out validation set for hyperparameter tuning?
Increases the number of parameters in the overall model, which leads to overfitting
Decreases model generalization performance because the model is able to learn on the test set
Requires more overall time to train a model, due to the repeated training runs associated with each experiment
Increases the overall memory requirements of the model during training, due to the higher number of samples seen during training
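For readers who want to see the mechanics behind the two questions above, here is a minimal sketch of how k-fold cross-validation partitions a dataset. The generator name `k_fold_indices` and the choice of 10 samples with 5 folds are illustrative assumptions; in practice the data would usually be shuffled or stratified first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


folds = list(k_fold_indices(10, 5))
for fold_number, (train, validation) in enumerate(folds, start=1):
    # One full training run would happen here for each fold.
    print(f"fold {fold_number}: train on {len(train)}, validate on {validation}")
```

Each sample is used for validation exactly once, and one hyperparameter setting requires k training runs instead of one.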
What are common criteria used for early stopping? Check all that apply.
Which of the following hyperparameters are exclusive to deep learning models? Check all that apply.
Number of layers
Weight decay strength
Class weights (loss function)