Tasks

  1. Consider the command mean(head(sort(airquality$Ozone), 10)). Describe in one sentence what the effect of this command is.
  2. Use the same command as above, but reverse the sort order. How do you do that, and what effect does it have? HINT: Remember that sort() has an optional parameter decreasing to control the ordering direction.
  3. Calculate the standard deviation of the variable Solar.R in the built-in data set airquality. Use the function sd() for this and set it to ignore NA values in the input vector.
  4. rnorm() is a function to generate random numbers from a normal distribution. Find out which parameters it has using R’s built-in help function. Now generate 100 random numbers from a normal distribution with mean 30 and standard deviation 2. Calculate the mean of these numbers. By how much does the mean of your generated numbers differ from the specified mean of 30?
  5. Use the 100 random numbers generated in the previous task and count how many of them are greater than or equal to 30.
  6. Load the package MASS and its data set cats as in the previous session. Answer the following questions (Hint: Create a logical expression and use sum() to count the occurrences of TRUE values, e.g. sum(cats$Sex == 'F')):
    1. How many female and how many male cats are there in the data set? Store the results in two objects n_female and n_male.
    2. How many female cats have a body weight of at least 2.5 kg? What proportion of all female cats is this?
    3. How many male cats have a body weight of at least 2.5 kg? What proportion of all male cats is this?
    4. How many female cats have a body weight of at least 2.5 kg or a heart weight of at least 10 g? What proportion of all female cats is this?
    5. How many male cats have a body weight of at least 2.5 kg or a heart weight of at least 10 g? What proportion of all male cats is this?
  7. Complete lesson 8 of the SWIRL course “R Programming”. (See the notes in the session 2 tasks about installing SWIRL if you have not done that yet.)
  8. What’s wrong with each of the following pieces of code?

Example 1:

sum(airquality$Month = 5)
## Error

Example 2:

smoker <- c(TRUE, NA, FALSE, TRUE, FALSE)
sum(smoker, na.rm <- TRUE)
## [1] NA

Example 3:

age <- c(20, NA, 19, 51, 20)
mean(age, na.rm == TRUE)
## Error

Example 4:

country <- factor(c('USA', 'GB', 'GB', 'DE', 'USA'))
country_is_usa <- (country = 'USA')
country_is_usa
## [1] "USA"

Example 5:

age <- c(20, NA, 19, 51, 20)
median(age, rm.na = TRUE)
## [1] NA
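
As a reminder of the counting technique from the hint in task 6, here is a minimal sketch on made-up data. The vectors height and group are hypothetical and are not part of any task. The pattern is the same, though: a comparison produces a logical vector, sum() counts its TRUE values, and dividing by the group size gives a proportion.

```r
# Hypothetical toy data (an assumption for illustration, not from the tasks)
height <- c(150, 182, 171, 165, 190, 158)
group <- factor(c('a', 'b', 'a', 'b', 'b', 'a'))

# A comparison yields a logical vector; sum() counts the TRUE values
n_a <- sum(group == 'a')

# Combine conditions with & ("and") and | ("or");
# parentheses make the intended grouping explicit
n_a_extreme <- sum(group == 'a' & (height >= 170 | height < 155))

# Proportion of the extreme cases within group 'a'
ratio_a_extreme <- n_a_extreme / n_a

n_a
n_a_extreme
ratio_a_extreme
```

With this toy data, three values belong to group 'a', and two of them (150 and 171) satisfy the combined condition.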