- Describe your initial thoughts or intuitions on answering the three questions raised in the lecture.
- Describe in a few paragraphs how the three questions were answered in the lecture.
D3.2
Replicate the analysis conducted for Intuition 1 in the video lecture on “Intuition to statistics”. Provide the side-by-side comparison for all 4 biomarkers. Post your results as a few screenshots here. Keep the number of bins at 12 for all 4 biomarkers, i.e., 10 bins plus an underflow bin and an overflow bin. Note that the optimal bin placement differs for each biomarker.
⚠️ Attention!!: The numbers you put under the bin column are the locations of the interval boundaries on the number axis, NOT ordinal numbers such as 0, 1, 2, …, 10!
Example bin placements shown in the table:
AFP | CEA | CA125 | CA50 |
0 | 0 | 0 | 0 |
1.5 | 1.5 | 2 | 3 |
3 | 3 | 4 | 6 |
4.5 | 4.5 | 6 | 9 |
6 | 6 | 8 | 12 |
7.5 | 7.5 | 10 | 15 |
9 | 9 | 12 | 18 |
10.5 | 10.5 | 14 | 21 |
12 | 12 | 16 | 24 |
13.5 | 13.5 | 18 | 27 |
 |  |  | 30 |
Note: The bins for each biomarker start at 0 because there are no negative values. CA50 goes up to 30 because its values span a wider range. A column of 11 bin values defines 12 bins (10 main bins + 1 underflow bin + 1 overflow bin).
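To make the "11 bin values give 12 bins" rule concrete, here is a minimal Python sketch. It is not the lecture's tool; it assumes the common spreadsheet-histogram convention in which each bin value is an inclusive upper bound, and the function name `bin_counts` and the sample values are hypothetical. The CA50 column from the table above is used because it lists all 11 values.

```python
from bisect import bisect_left


def bin_counts(values, edges):
    """Count values per bin, treating each edge as an inclusive upper bound.

    ASSUMPTION: spreadsheet-style convention, which may differ from the lecture.
    With k edges the result has k + 1 counts:
      counts[0] -> values <= edges[0]               (underflow bin)
      counts[i] -> edges[i-1] < values <= edges[i]  (main bins)
      counts[k] -> values > edges[-1]               (overflow bin)
    """
    counts = [0] * (len(edges) + 1)
    for v in values:
        # bisect_left returns the index of the first edge that is >= v.
        counts[bisect_left(edges, v)] += 1
    return counts


# CA50 bin values from the table: 0, 3, 6, ..., 30 (11 numbers).
ca50_edges = [3 * i for i in range(11)]

# Hypothetical measurements, for illustration only (not course data).
sample = [0, 2.5, 3, 3.1, 29, 30, 45]

counts = bin_counts(sample, ca50_edges)
print(len(counts))  # 12
print(counts)       # [1, 2, 1, 0, 0, 0, 0, 0, 0, 0, 2, 1]
```

The first count is the underflow bin (values at or below 0) and the last is the overflow bin (values above 30); the 10 counts in between are the main bins.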
D3.3
Replicate the analysis conducted for Intuition 2 in the video lecture on “Intuition to statistics”.
Produce the risk line for each of the 4 biomarkers, plus the random baseline. Post your results as a few screenshots here.
D3.4
In a few paragraphs, describe the main idea from the video lecture for testing whether gender is a risk factor. Specifically, describe what a p-value is and how to use it to reach a definitive yes-or-no answer.
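As a rough illustration of the p-value idea, the sketch below uses a permutation test. This is a swapped-in technique and may not be the test used in the lecture. The function name `permutation_p_value`, the made-up outcome lists, and the 0.05 threshold are all assumptions. The p-value is estimated as the fraction of random relabelings of gender that produce a difference in case rates at least as large as the observed one.

```python
import random


def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Estimate a two-sided p-value for the difference in case rates.

    group_a, group_b: lists of 0/1 outcomes (1 = has the condition).
    ASSUMPTION: a permutation test stands in for the lecture's method.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_shuffles):
        # Shuffling breaks any real link between gender and outcome.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
    return extreme / n_shuffles


# Hypothetical data, for illustration only: 30 of 100 vs. 10 of 100 cases.
group_1 = [1] * 30 + [0] * 70
group_2 = [1] * 10 + [0] * 90

alpha = 0.05  # conventional threshold, chosen before looking at the data
p = permutation_p_value(group_1, group_2)
print("risk factor" if p < alpha else "no evidence of a risk factor")
```

The yes-or-no answer comes from comparing the p-value with the threshold fixed in advance: a p-value below `alpha` means a difference this large would be rare if gender did not matter, so gender is declared a risk factor; otherwise it is not.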