# In the sample provided in Final_3.sav, are the variables income and Bladder Cancer independent of each other?

Question:

I. Descriptive Statistics (20 pts)

Download the data set Final_1.sav. Complete the following:

1) List the level of measurement for the variables AGE, SEX, AGEGRP, and SBP1 in the data set, and describe the appropriate numerical and descriptive statistics based on these. (4 pts)

2) Calculate (by hand) the mean and standard deviation for the first 12 records for age in the data set. (4 pts)

| Record Number | AGE |
|---|---|
| 1 | 3 |
| 2 | 11 |
| 3 | 15 |
| 4 | 46 |
| 5 | 14 |
| 6 | 35 |
| 7 | 46 |
| 8 | 35 |
| 9 | 40 |
| 10 | 29 |
| 11 | 22 |
| 12 | 16 |

3) Generate numerical and graphical descriptive statistics for each of the variables, namely AGE, SEX, AGEGRP, and SBP1. (8 pts)

4) Interpret the output you generated in part 3 for each of the variables in the data set. (4 pts)

II. Paired and Independent t tests (20 pts)

Download the data set Final_2.sav and use SPSS to complete the following calculations:

1) Use the 5-step approach to hypothesis testing and the calculation of the 95% confidence intervals to answer the following research question: Was a significant difference in Systolic Blood Pressure (SBP) observed over the course of the study? (10 pts)

2) Use the 5-step approach to hypothesis testing and the calculation of the 95% confidence intervals to answer the following research question: Is there a difference in SBP1 based on HIV status? (10 pts) (Hint: Assign Y as group 1 and N as group 2.)

III. Cross-Tabulation (20 pts)

Download the data set Final_3.sav and use SPSS to complete the following calculations.

1) Use the 5-step approach to hypothesis testing to answer the following research question: In the sample provided in Final_3.sav, are the variables income and Bladder Cancer independent of each other? (Note: The question could also be asked as "Is there an association between the variables?", because a lack of independence implies an association.) (10 pts)

2) Answer the following based on the cross-tabulation of alcohol consumption and Bladder Cancer: (10 pts)

Alcohol consumption * Bladder Cancer Crosstabulation (Count)

| Alcohol consumption | Bladder Cancer: No | Bladder Cancer: Yes | Total |
|---|---|---|---|
| Less than 1 drink per week | 30 | 54 | 84 |
| 4 or more drinks per month | 22 | 115 | 137 |
| Total | 52 | 169 | 221 |

- Calculate the odds ratio. (4 pts)
- Describe how the odds ratio differs from the relative risk (risk ratio) and why you would choose it here. (2 pts)
- Interpret the odds ratio and how it might impact the practice of public health practitioners. (2 pts)
- If you wanted to know whether this relationship was statistically significant, what test(s) could you use? (2 pts)

IV. ANOVA (20 pts)

Download the data set Final_4.sav and use SPSS to complete the following calculations.

1) Produce box plots of income for each region of the US in the data set and interpret them. Based on the box plots, do you expect to find a difference between any of the groups? (4 pts)

2) Create descriptive statistics for each region, using the variable income. (4 pts)

- Include skewness and kurtosis in the output. (2 pts)
- Create a histogram for each group. (2 pts)

3) Run the ANOVA for income based on region. Include the ANOVA table and the test for Homogeneity of Variance. Interpret the results. (6 pts)

4) Conduct post hoc analysis using Bonferroni and LSD methods to control for multiple testing. (6 pts)

- Provide the output. (2 pts)
- Interpret your results. (3 pts)
- Why do you need to use methods like Bonferroni and LSD with the ANOVA? (1 pt)

V. Regression (20 pts)

Download the data set Final_5.sav and use SPSS to complete the following calculations.

1) Use an independent t test and simple linear regression to identify whether a relationship exists between gender and BMI. (10 pts)

- Run the appropriate t test in SPSS, report the significance of the difference in means and the confidence interval, and interpret the results. (4 pts)
- Run the simple linear regression in SPSS, report the significance of the variable gender and the overall fit of the model (using r²). Interpret the results. (4 pts)
- How are these two approaches different? (1 pt)
- Are your conclusions the same using both tests? (1 pt)

2) Answer the questions using the provided output: (10 pts)

Multiple Linear Regression (5 pts)

Researchers looked at the Emergency Department (ED) records of 60 adults aged 22 to 46 years who arrived in the ED complaining of chest pain during a 6-month period. They did not use a random sample, as they wanted 30 males and 30 females in the study. They collected information on BMI (a measure of overweight/obesity), Age, SBP (Systolic Blood Pressure), and the diagnosis of Diabetes. Their first hypothesis (alternative) was that the dependent variable SBP is associated with BMI, Age, Diabetes, and Gender. They conducted a multiple linear regression to test their