How to use R to wrangle business data

Assignment Detail: BUS5DWR Data Wrangling and R - La Trobe University

Overview

Over the past few weeks, you have learned how to use R to wrangle business data. This assignment will provide you with an opportunity to demonstrate your R skills for data wrangling. Using the tidyverse package is recommended but not compulsory.

Assignment Requirements

Part 1

The given data files Movie.csv, Rating.csv and Continent.csv record information about IMDB movie ratings. Write R code in an Rmd file to answer the following questions. Each question should be presented in one code chunk:

Load the dataset from the given files into three data frames called Movie, Rating, and Continent. Rename columns to remove spaces if they exist. (Hint: use str_replace_all to do this automatically for all columns.) Remove the column Writer from the Movie dataframe. Display the summary of each dataframe.

How many movies produced by 'Universal Pictures' have the actor 'Arnold Schwarzenegger'?

Display the five most-reviewed movies that belong to both Action and Drama. Display only the Title and the number of reviews.

Display movie rating information including Title, average rating and two new columns: (1) 'TotalVote', showing the total votes from both males and females, and (2) 'Popular', showing 'Male' for movies with MalesTotalVotes greater than FemalesTotalVotes and 'Female' otherwise. (Hint: see Workshop 9 exercise.) Show only the TEN movies with the highest average rating.

Display the number of Comedy movies and their average rating from each continent.

Analyse the distribution of the average rating of all the movies after the year 2000. (Hint: draw a boxplot and a histogram and write a short paragraph, less than 100 words, to describe your insight.)

Part 2

The given Spotify.xlsx file records the summary of Australia's top 200 daily-streamed songs (or tracks) in the first three months of 2017 and 2018. The Data worksheet records the total streams and the highest position of each song in each month. You will see that the data is far from being ready for analysis and needs to be 'wrangled'. The given Artist.csv file records the artists who perform the songs. You are required to write R code to perform the following steps:

Load the data from the Spotify worksheet into a dataframe named Spotify. Replace the spaces in the column names with an underscore ("_"). Show the structure of Spotify.

You can see that most column names contain the month information, which should be placed as row values. Let's:
• Use pivot_longer to transform the dataframe into four columns, namely Artist_ID, Track_Name, Month, and Value.
• Drop all rows having NA in Value.
• Split the Month column into Month and Year.
• Display the number of columns and rows.

You can see that the data in column Value contains both the total streams and the highest position of the song in the corresponding month. Note that the smaller the position value, the higher the song ranked.
• Split the Value column into two columns with appropriate names.
• For each month-year, show the total streams and the number of songs appearing in the daily top 200.

Find all tracks that appeared in all six months with each monthly stream count above 100,000. Display their name, total streams and highest position. Export the result into a CSV file.

Load the data from the Artist.csv file into a new dataframe. Rename the columns to remove spaces. How many artists do not have songs listed in the Spotify dataframe?

Draw a bar chart to compare the artists of the songs/tracks returned in Q2.4 based on their total streams. Order the bars from the highest to the lowest total streams. Write a short paragraph describing the insight you gained from this chart.

Attachment: Data Wrangling.rar
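
The reshaping steps described in Part 2 could be sketched roughly as follows. This is only an illustration on a tiny made-up table, not a solution to the assignment: the brief does not show the real layout of the worksheet, so the month column names ("Jan 2017", "Feb 2017"), the "streams/position" format of each cell, and the new column names Total_Streams and Highest_Position are all assumptions. The separators would need adjusting to match the actual file.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Toy stand-in for the worksheet. Column names and the "streams/position"
# cell format are ASSUMPTIONS for illustration; the real file may differ.
Spotify <- tibble(
  `Artist ID`  = c(1, 2),
  `Track Name` = c("Song A", "Song B"),
  `Jan 2017`   = c("150000/3", NA),
  `Feb 2017`   = c("120000/5", "90000/40")
)

# Replace every space in the column names with an underscore.
names(Spotify) <- str_replace_all(names(Spotify), " ", "_")

Spotify_long <- Spotify %>%
  # Move the month columns into rows: one row per track per month.
  pivot_longer(
    cols      = -c(Artist_ID, Track_Name),
    names_to  = "Month",
    values_to = "Value"
  ) %>%
  # Drop months in which the track has no value.
  drop_na(Value) %>%
  # "Jan_2017" becomes Month = "Jan" and Year = 2017 (assumed separator "_").
  separate(Month, into = c("Month", "Year"), sep = "_", convert = TRUE) %>%
  # "150000/3" becomes two integer columns (assumed separator "/").
  separate(Value, into = c("Total_Streams", "Highest_Position"),
           sep = "/", convert = TRUE)

# Total streams and number of distinct songs for each month-year.
monthly <- Spotify_long %>%
  group_by(Year, Month) %>%
  summarise(
    Total_Streams = sum(Total_Streams),
    Songs         = n_distinct(Track_Name),
    .groups       = "drop"
  )

dim(Spotify_long)  # number of rows and columns after reshaping
monthly
```

The same str_replace_all call on names() also covers the column renaming asked for in Part 1. separate() is used here because it splits on a simple delimiter; if the real cells use a different pattern, a regular expression in sep or tidyr's extract() may be a better fit.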



