Assignment Detail: COMP 5070 Statistical Programming for Data Science, University of South Australia
- You must submit an R code file and a Word (or PDF) document with a nice-looking report. No archived files are allowed, that is, no zip, rar, tar, 7z, etc.
1. Code all requested components.
2. Aim for optimised code in terms of computational overhead (5%). It is not always possible to avoid loops; however, you should aim to avoid them where possible.
3. Use a clear coding style (5%). Code clarity is an important part of your submission. Choose meaningful variable names and use comments. You don't need to comment every single line, as this hurts readability, but you should comment at least each section of code.
4. Have the code run successfully (5%).
5. Output the information in a presentable manner as decided by yourself and present the requested statistical analyses/discussions (35%).
6. Document code limitations including, but not limited to, the requested functionalities (5%).
• Assignments submitted late, without an extension being granted, will attract a penalty of 10 marks for each day or part thereof beyond the due date and time.
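To illustrate the point about avoiding loops, here is a minimal sketch comparing a loop with its vectorised equivalent. The vector names and values are toy assumptions, not taken from the assignment data.

```r
# Toy vectors; the names and values are illustrative only.
total_cases  <- c(100, 2500, 40)
total_deaths <- c(2, 75, 1)

# Loop version: works, but is verbose and slower on large data.
ratio_loop <- numeric(length(total_cases))
for (i in seq_along(total_cases)) {
  ratio_loop[i] <- total_deaths[i] / total_cases[i]
}

# Vectorised version: one expression applied to the whole vectors.
ratio_vec <- total_deaths / total_cases

# Both approaches give the same result.
stopifnot(isTRUE(all.equal(ratio_loop, ratio_vec)))
```

Most element-wise arithmetic, comparisons and subsetting in R can be written this way; grouped summaries can usually be handled with functions such as `tapply()` or `aggregate()` instead of explicit loops.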
Data Analysis, Visualisation and Interpretation
Data and Background Information
The dataset contains information related to COVID-19 for more than 200 countries.
Most variable names are self-explanatory, and there are a lot of them. You don't need all variables for the task. A full description of all variables is available in the codebook or online. It is very important to review this description so that you understand correctly how each variable is measured.
If you have any doubts, feel free to ask on the forum or by email.
Research report
For this assignment, you will need to produce a report summarising a collection of requested statistical analyses and visualisations of the data. See below for details. You will need to submit a properly written, nicely presented report and an R script file.
As a guideline, excluding tables/figures, 2-3 pages of writing will be sufficient for the report. I won't strictly count words, so going a little over or under is fine, but this is a good ballpark to aim for.
The report should contain:
1. An introduction outlining the analysis to follow, background information and the available data.
2. What are the total numbers of COVID-19 cases in the following countries: Australia, China, India, New Zealand, Sweden, Ukraine, United Kingdom, United States? Provide the numbers and an appropriate graph. Analyse the progress of total cases for these countries and produce a graph similar to the one you might have seen in the media in the past. Use 70 days, as in the example, and your graph should be identical to the example. When you are confident that your graph is correct, change it to the full history. The grey dashed lines "doubles in ... days" are not required.
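One possible approach to the totals and the progress graph is sketched below. It assumes the data has columns named `location`, `date` and `total_cases` (check the codebook), and it assumes the media-style graph is a log-scale chart aligned on the day each country passed a threshold, which the "doubles in ... days" guide lines suggest but the brief does not state. The threshold of 100 cases and all toy values are hypothetical.

```r
# Toy data: assumed columns location, date, total_cases (values are made up).
covid <- data.frame(
  location    = rep(c("A", "B"), each = 5),
  date        = rep(as.Date("2020-03-01") + 0:4, times = 2),
  total_cases = c(50, 120, 260, 500, 990, 10, 80, 150, 310, 600)
)

# Total cases per country, assuming cumulative counts never decrease.
latest_totals <- tapply(covid$total_cases, covid$location, max, na.rm = TRUE)

threshold <- 100  # hypothetical cut-off defining "day zero"

# Keep rows at or above the threshold, then count days since each
# country's first such date (ave() applies min() within each country).
hit <- covid[covid$total_cases >= threshold, ]
hit$days_since <- as.numeric(hit$date) -
  ave(as.numeric(hit$date), hit$location, FUN = min)

# Empty log-scaled canvas, then one line per country.
plot(hit$days_since, hit$total_cases, type = "n", log = "y",
     xlab = "Days since 100th case", ylab = "Total cases (log scale)")
country_names <- unique(hit$location)
for (k in seq_along(country_names)) {
  d <- hit[hit$location == country_names[k], ]
  lines(d$days_since, d$total_cases, col = k)
}
legend("bottomright", legend = country_names,
       col = seq_along(country_names), lty = 1)
```

To limit the graph to a fixed window first, the rows can be filtered with something like `hit[hit$days_since <= 70, ]` before plotting, and the filter removed for the full history.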
3. Provide and discuss descriptive statistics for the variable "gdp_per_capita", then use it to split all countries into three groups: "rich" countries, "average" countries and "poor" countries. Compare the numbers of deaths and the numbers of cases per month for these three groups of countries.
Hint: the data set is daily; however, GDP does not change every day. Also, some countries are small while others are large. Prepare your data correctly. You must compare apples to apples.
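The hint can be read as two preparation steps: work with one GDP value per country, and scale counts by population before comparing groups. A minimal sketch follows, assuming columns named `location`, `date`, `new_cases` and `population` alongside "gdp_per_capita". Splitting at terciles is one possible choice of cut points, not a requirement of the brief, and all values are toy data.

```r
# Toy country-level table: one row per country (values are made up).
countries <- data.frame(
  location       = c("A", "B", "C", "D", "E", "F"),
  gdp_per_capita = c(900, 1500, 8000, 12000, 40000, 55000),
  population     = c(5e6, 2e7, 1e7, 5e7, 8e6, 3e7)
)

# Tercile cut points on the country-level values, then a label per country.
breaks <- quantile(countries$gdp_per_capita,
                   probs = c(0, 1/3, 2/3, 1), na.rm = TRUE)
countries$wealth <- cut(countries$gdp_per_capita, breaks = breaks,
                        labels = c("poor", "average", "rich"),
                        include.lowest = TRUE)

# Toy daily data: four days spanning two months for every country.
daily <- data.frame(
  location  = rep(countries$location, each = 4),
  date      = rep(as.Date(c("2020-03-30", "2020-03-31",
                            "2020-04-01", "2020-04-02")), times = 6),
  new_cases = rep(c(10, 20, 30, 40, 50, 60), each = 4)
)
daily <- merge(daily, countries, by = "location")
daily$month <- format(daily$date, "%Y-%m")

# Monthly case totals per wealth group.
monthly <- aggregate(new_cases ~ wealth + month, data = daily, FUN = sum)

# Group population: summed over countries, not over daily rows.
group_pop <- tapply(countries$population, countries$wealth, sum)

# Cases per million people, so large and small groups are comparable.
monthly$cases_per_million <- monthly$new_cases /
  as.numeric(group_pop[as.character(monthly$wealth)]) * 1e6
```

Deaths can be handled the same way with a deaths column in place of `new_cases`. Other splits (for example fixed GDP thresholds) are equally defensible as long as the report explains the choice.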
4. COVID-19 was declared a pandemic. However, there might be another pandemic no one talks about: cardiovascular diseases. Study the distribution of cardiovascular diseases (variable "cardiovasc_death_rate") overall and compare it to the COVID-19 death rate. Repeat this comparison for the three groups of "rich", "average" and "poor" countries.
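For the comparison to be meaningful, both rates need the same units. The sketch below assumes "cardiovasc_death_rate" is expressed as deaths per 100,000 people (confirm this in the codebook) and that columns named `total_deaths` and `population` are available. The `wealth` column stands in for the grouping from item 3, and all values are toy data.

```r
# Toy country snapshot: one row per country (values are made up).
snap <- data.frame(
  location              = c("A", "B", "C", "D"),
  wealth                = factor(c("poor", "average", "rich", "rich"),
                                 levels = c("poor", "average", "rich")),
  total_deaths          = c(500, 12000, 300, 45000),
  population            = c(5e6, 6e7, 1e7, 3e8),
  cardiovasc_death_rate = c(110, 250, 330, 150)  # assumed per 100,000
)

# COVID-19 deaths per 100,000 people, to match the assumed units above.
snap$covid_death_rate <- snap$total_deaths / snap$population * 1e5

rate_cols <- c("cardiovasc_death_rate", "covid_death_rate")

# Overall distributions side by side.
overall_summary <- summary(snap[, rate_cols])
boxplot(snap[, rate_cols], ylab = "Deaths per 100,000 people")

# The same comparison within each wealth group (median per group).
by_group <- aggregate(cbind(cardiovasc_death_rate, covid_death_rate) ~ wealth,
                      data = snap, FUN = median)
```

Note that the two rates may cover different time windows (for example, an annual rate versus a cumulative count), which is worth stating as a limitation in the report.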
5. The COVID-19 mortality rate is the ratio of the number of deaths to the number of cases. Different countries might have different COVID-19 mortality rates. Calculate and report overall statistics for mortality rates. Then select one variable from all the variables available in the data that might help explain the difference in mortality rates. Run an appropriate analysis to demonstrate the relationship.
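The ratio and one possible relationship analysis can be sketched as follows. The columns `total_cases` and `total_deaths` are assumed names, and `median_age` is only a hypothetical choice of explanatory variable. A rank correlation is one option among several (a linear regression would be another). The toy values are constructed to be perfectly monotone, so the result here says nothing about the real data.

```r
# Toy country snapshot (values are made up).
snap <- data.frame(
  location     = c("A", "B", "C", "D", "E"),
  total_cases  = c(10000, 50000, 2000, 80000, 30000),
  total_deaths = c(100, 1500, 10, 3200, 600),
  median_age   = c(25, 38, 19, 45, 31)  # hypothetical explanatory variable
)

# Drop countries with no recorded cases to avoid dividing by zero.
snap <- snap[!is.na(snap$total_cases) & snap$total_cases > 0, ]

# Mortality rate = deaths / cases, per country.
snap$mortality_rate <- snap$total_deaths / snap$total_cases

# Overall descriptive statistics.
rate_summary <- summary(snap$mortality_rate)

# Spearman rank correlation between mortality rate and the chosen variable.
rho_test <- cor.test(snap$median_age, snap$mortality_rate,
                     method = "spearman")

# Scatter plot to show the relationship visually.
plot(snap$median_age, snap$mortality_rate,
     xlab = "Median age", ylab = "Mortality rate (deaths / cases)")
```

`rho_test$estimate` holds the correlation coefficient and `rho_test$p.value` the associated p-value; both belong in the report together with the plot.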
6. Conclusions summarising all your research findings.
Attachment: Statistical Programming.rar