There are two predefined objects in R that contain all letters from A-Z and a-z, respectively:
LETTERS
## [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q"
## [18] "R" "S" "T" "U" "V" "W" "X" "Y" "Z"
letters
## [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q"
## [18] "r" "s" "t" "u" "v" "w" "x" "y" "z"
Using numeric indexing (not subsetting with logical expressions), try to generate the following output using either letters or LETTERS:
"e"
"e"
"v" "w" "x" "y" "z"
"W" "Z" "B"
"a" "c" "e" "g" "i" "k" "m" "o" "q" "s" "u" "w" "y"
(Hint: Use the seq() function.)
"f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
Create myletters as a copy of letters (myletters <- letters). Assign the first five capital letters (from LETTERS) to the first five letters of myletters so that myletters will then contain:
"A" "B" "C" "D" "E" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
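The indexing techniques these tasks rely on can be sketched on an unrelated vector, so the tasks themselves stay unsolved. The vectors x and y and their values are hypothetical and not part of the exercise.

```r
# Hypothetical example vector (not part of the exercise)
x <- c("p", "q", "r", "s", "t", "u")

x[2]                          # a single element by position
x[c(6, 1)]                    # several elements, in the order given
x[4:length(x)]                # a range up to the last element
x[seq(2, length(x), by = 2)]  # every second element, starting at position 2
x[-(1:2)]                     # negative indices drop the given positions

# Replacement: assign new values to selected positions of a copy
y <- x
y[1:2] <- c("P", "Q")
y
```

Note that x itself is unchanged by the last step, because the assignment only modifies the copy y.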
retweets <- c(1, 3, 2, 2, 3, 4, 3, 2, 8, 2)
likes <- c(6, 10, 9, 6, 3, 6, 6, 7, 6, 15)
users <- factor(c('WZB_Berlin', 'JWI_Berlin', 'JWI_Berlin', 'gesis_org', 'WZB_Berlin', 'WZB_Berlin', 'WZB_Berlin', 'gesis_org', 'JWI_Berlin', 'WZB_Berlin'))
located_in_berlin <- c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE)
Assume that the elements in the vectors are aligned, i.e. the first element in retweets corresponds to the first element in likes and users etc. (as if they were combined in a data frame). Solve all tasks by using logical expressions / logical vectors.
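As a reminder of the technique, here is a minimal sketch of logical subsetting on two aligned vectors. The vectors score and group, and the helper names is_b and keep, are hypothetical and unrelated to the data above.

```r
# Hypothetical aligned vectors (not the exercise data)
score <- c(4, 9, 7, 2, 8)
group <- factor(c("a", "b", "a", "c", "b"))

# A comparison yields a logical vector of the same length
is_b <- group == "b"

# Indexing with a logical vector keeps the elements where it is TRUE
score[is_b]

# Combine conditions with & (and) or | (or); store the result for re-use
keep <- score >= 7 & group != "c"
score[keep]
group[keep]

# Compare against a summary statistic
score[score > median(score)]
```

Storing the condition in a variable such as keep avoids retyping it when several aligned vectors need the same subset.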
Subset retweets and likes to contain only data from the user WZB_Berlin.
Subset users to contain only elements where located_in_berlin is FALSE or users equals "WZB_Berlin" (this should return a vector only containing "gesis_org" and "WZB_Berlin").
Subset retweets, likes and users with the criteria to have at least three retweets and at least six likes. (Hint: If you want to spare yourself from typing too much, create a logical vector of the criteria first and re-use it to subset the vectors.)
Calculate the median of retweets. Now form a subset of retweets, users and located_in_berlin where retweets are higher than the median.
Create a script file in RStudio that does the following:
Load segindex_sample.csv (from the accompanying resources file 04rbasics3-resources.zip available on the course website) into a data frame. Set read.csv() to not convert strings to factors automatically.
[…] %in% operator for this – it was introduced in the previous session).
[…] segindex_subset.xlsx.
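A minimal sketch of the reading and filtering steps, under stated assumptions: the file contents, the column names city and value, and the temporary file are hypothetical stand-ins, since the structure of segindex_sample.csv is not shown here. Writing the Excel file is left out because base R has no function for it and the package to use is not named above.

```r
# Write a small hypothetical CSV to a temporary file so the sketch is self-contained
tmp <- tempfile(fileext = ".csv")
writeLines(c("city,value",
             "Berlin,0.31",
             "Hamburg,0.27",
             "Munich,0.22",
             "Cologne,0.25"), tmp)

# Read it into a data frame without converting strings to factors
df <- read.csv(tmp, stringsAsFactors = FALSE)
str(df)

# %in% tests, element by element, whether a value occurs in a set
wanted <- c("Berlin", "Munich")
subset_df <- df[df$city %in% wanted, ]
subset_df
```

With stringsAsFactors = FALSE the city column stays a character vector, which keeps comparisons such as %in% straightforward.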