Assignment Detail: MATH 4044 Statistics for Data Sciences, University of South Australia
To achieve maximum marks for each question, you should aim to:
- Complete the requested statistical analysis in SAS using appropriate tasks or procedures.
- Include only the output most relevant to the question and interpret all key results. Do not include every piece of output produced by SAS!
- Discuss the results more broadly in the context of the given scenario.
Introduction
Use the data to study the factors that affect the beneficiary's insurance charges.
Data Description
Machine Learning with R by Brett Lantz is a book that provides an introduction to machine learning using R. The dataset is used as an example for regression in the book. The data is downloaded from [link not included]. Some post-processing was carried out for the purpose of the case study.
The data file for this case study is called insurancev2.sas7bdat. The dataset contains insurance charges to the beneficiary, together with their demographic information. Variables in that file are as follows:
Assignment Parts
Question 1
(a) Carry out a one-way analysis of variance relating log_charges to agegroup. Use contrasts to test at least one a-priori hypothesis of your choice. Examine and comment on residuals. Also carry out appropriate post-hoc comparisons and discuss your results.
(b) Use SAS to perform a one-way ANCOVA relating log_charges to agegroup and bmi with bmi as a covariate, including appropriate post-hoc comparisons:
- Confirm that there is a linear relationship between the response variable and the covariate (a scatterplot and correlation coefficient plus a comment will suffice);
- Check the two additional ANCOVA assumptions (report and comment only on the parts of the output most directly relevant to condition checking):
  * Independence of the covariate and the treatment effect (perform a one-way ANOVA test);
  * Equality of slopes (add and check significance of the interaction term);
- Report and briefly discuss your results.
Technical note: Make sure you obtain and examine Type III Sum of Squares (ss3). Also obtain estimates of 'least squares means' (lsmeans), which are means by treatment adjusted for the covariate.
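As a rough illustration of the technical note in Question 1(b), a PROC GLM call along the following lines would request Type III sums of squares and covariate-adjusted least squares means. This is only a sketch: the library reference mylib and its folder path are placeholders, and the Tukey adjustment is an assumed choice of post-hoc method, not one the brief prescribes.

```sas
/* Hypothetical library reference: point it at the folder holding insurancev2.sas7bdat */
libname mylib "/path/to/data";

/* One-way ANCOVA: agegroup is the treatment factor, bmi is the covariate */
proc glm data=mylib.insurancev2 plots=diagnostics;
  class agegroup;
  /* ss3 requests Type III sums of squares */
  model log_charges = agegroup bmi / ss3;
  /* Means by age group adjusted for bmi, with pairwise comparisons
     (Tukey adjustment is an assumed choice) */
  lsmeans agegroup / pdiff adjust=tukey cl;
run;
quit;
```

Adding agegroup*bmi to the MODEL statement in a separate run is one way to carry out the equality-of-slopes check described above.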
Question 2
(a) Carry out a one-way analysis of variance relating log_charges to weight_range. Examine and comment on residuals. Use contrasts to test at least one a-priori hypothesis of your choice. Also carry out appropriate post-hoc comparisons and discuss your results.
(b) Extend your analysis in part (a) to test whether there is evidence of interaction between weight_range and smoker. Examine and comment on residuals. Carry out appropriate post-hoc comparisons and discuss your results.
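The extension in Question 2(b) amounts to a two-way factorial model with an interaction term. A minimal sketch, assuming the same placeholder library reference mylib and an assumed Tukey adjustment for the post-hoc comparisons, might look like this:

```sas
/* Hypothetical library reference: point it at the folder holding insurancev2.sas7bdat */
libname mylib "/path/to/data";

/* Two-way factorial ANOVA with interaction between weight_range and smoker */
proc glm data=mylib.insurancev2 plots=diagnostics;
  class weight_range smoker;
  /* The weight_range*smoker row of the Type III table tests the interaction */
  model log_charges = weight_range smoker weight_range*smoker / ss3;
  /* Cell means with pairwise comparisons (Tukey adjustment is an assumed choice) */
  lsmeans weight_range*smoker / pdiff adjust=tukey;
run;
quit;
```

If the interaction is significant, comparing cell means as above is usually more informative than comparing the main-effect means of each factor separately.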
Question 3
Carry out an additional ANCOVA or factorial ANOVA of your choice to find other factors that may have a significant impact on insurance charges.
Question 4
Write a summary of your findings from Questions 1-3. Keep the technical details of the analyses that led you to these conclusions to the absolute minimum. Rather, focus on practical significance and present your findings in non-specialist terms. One to two paragraphs (up to a page) will be sufficient.
Attachment: Statistics for Data Sciences.rar