{hyenaR}
{drat}: to access the current (stable) version of {hyenaR}.
summarise() function
summarise() can apply any function to our data
What is the longest a female has ever lived?
# Create a data frame of all females
create_id_starting.table(sex = "female") %>%
  # Determine the lifespan of each female
  mutate(lifespan = fetch_id_duration.lifespan(ID))
# A tibble: 1,211 × 2
ID lifespan
<chr> <dbl>
1 A-001 19.0
2 A-002 5.00
3 A-003 15.7
4 A-004 9.01
5 A-006 19.6
6 A-007 8.05
7 A-008 14.5
8 A-009 12.8
9 A-010 5.38
10 A-013 19.8
# … with 1,201 more rows
summarise() can apply any function to our data
What is the longest a female has ever lived?
# Create a data frame of all females
create_id_starting.table(sex = "female") %>%
  # Determine the lifespan of each female
  mutate(lifespan = fetch_id_duration.lifespan(ID)) %>%
  # Summarise our data to find the maximum value of lifespan
  summarise(max_age = max(lifespan, na.rm = TRUE))
# A tibble: 1 × 1
max_age
<dbl>
1 20.1
summarise() can apply any function to our data
Find the max lifespan in months…
# Create a data frame of all females
create_id_starting.table(sex = "female") %>%
  # Determine the lifespan of each female
  mutate(lifespan = fetch_id_duration.lifespan(ID)) %>%
  # Summarise our data to find max lifespan in months
  summarise(lifespan_months = max(lifespan, na.rm = TRUE) * 12)
# A tibble: 1 × 1
lifespan_months
<dbl>
1 241.
summarise() can apply multiple functions to our data at the same time
Summarise both the max and mean lifespan…
# Create a data frame of all females
create_id_starting.table(sex = "female") %>%
  # Determine the lifespan of each female
  mutate(lifespan = fetch_id_duration.lifespan(ID)) %>%
  # Summarise our data to find BOTH the maximum and mean value of lifespan
  summarise(max_age = max(lifespan, na.rm = TRUE),
            mean_age = mean(lifespan, na.rm = TRUE))
# A tibble: 1 × 2
max_age mean_age
<dbl> <dbl>
1 20.1 4.78
group_by()/summarise() applies functions to subsets of the data
Summarise for females born in each clan…
create_id_starting.table(sex = "female") %>%
  mutate(lifespan = fetch_id_duration.lifespan(ID),
         # Find birth clan...
         clan = fetch_id_clan.birth(ID)) %>%
  # Group by clan so we will apply functions to data from each clan separately...
  group_by(clan) %>%
  # Summarise our data to find three things:
  summarise(
    # - Number of females
    n = n(),
    # - Max age
    max_age = max(lifespan, na.rm = TRUE),
    # - Mean age
    mean_age = mean(lifespan, na.rm = TRUE))
# A tibble: 13 × 4
clan n max_age mean_age
<chr> <int> <dbl> <dbl>
1 A 241 19.8 4.80
2 B 1 5.00 5.00
3 C 3 12 8.39
4 E 101 16.7 5.95
5 F 118 18.0 4.76
6 L 208 20.1 4.83
7 M 199 19.1 4.71
8 N 107 16.9 4.40
9 R 4 5.89 3.85
10 S 139 14.6 3.81
11 T 66 18.5 5.45
12 U 7 5.56 3.44
13 X 17 10.0 5.21
summarise()
Start with this data…
…and find the mean lifespan in each clan.
Start with this data…
…and find the sum and mean reproductive success and number of individuals (using n()) in each clan.
Start with this data…
…and find number of offspring per capita in each clan.
length(). This returns a single number.
nrow(). This returns a single number.
count(). This returns a new data frame.
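The difference between these three approaches can be sketched on a small made-up table. The `females` tibble and its values are hypothetical stand-ins, not hyenaR data.

```r
library(dplyr)

# Hypothetical toy data, standing in for the output of create_id_starting.table()
females <- tibble(ID   = c("A-001", "A-002", "L-003"),
                  clan = c("A", "A", "L"))

# length() on a vector returns a single number
n_length <- length(females$ID)

# nrow() on a data frame returns a single number
n_nrow <- nrow(females)

# count() returns a new data frame with one column called n
n_count <- females %>%
  count()
```

All three agree on the number of rows, but only `n_count` is a data frame that can be piped into further dplyr functions.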
We can continue working with dplyr functions.
A data frame is needed for statistical models (e.g. lm()) and plotting (e.g. ggplot()).
A data frame can easily be written to .csv with e.g. write.csv().
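As a minimal sketch of these points, a summary that is itself a data frame can be piped onwards and then written to disk. The `females` tibble, its values, and the temporary output file are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical toy data (not hyenaR data)
females <- tibble(clan     = c("A", "A", "L", "L"),
                  lifespan = c(5.0, 19.0, 9.0, 17.0))

# The summary is itself a data frame...
clan_summary <- females %>%
  group_by(clan) %>%
  summarise(n = n(),
            mean_age = mean(lifespan, na.rm = TRUE))

# ...so we can continue working with dplyr functions
oldest_clan <- clan_summary %>%
  filter(mean_age == max(mean_age))

# ...and write it to a .csv file (a temporary file in this sketch)
out_file <- tempfile(fileext = ".csv")
write.csv(clan_summary, out_file, row.names = FALSE)
```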
summarise() and n(). This returns a new data frame.
# Create a DATA FRAME with an ID and lifespan column
create_id_starting.table(sex = "female") %>%
  mutate(lifespan = fetch_id_duration.lifespan(ID)) %>%
  # Count data AND find longest lifespan
  summarise(number_females = n(),
            oldest = max(lifespan, na.rm = TRUE))
# A tibble: 1 × 2
number_females oldest
<int> <dbl>
1 1211 20.1
left-censored: Individual was born before study period.
right-censored: Individual died after study period.
uncensored: Individual born and died during study period.
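The three definitions above can be illustrated with plain date comparisons on made-up birth and death dates. This is only a sketch of the concept: the `life` tibble, its column names, and the study period variables are hypothetical, and it does not show how the hyenaR censoring functions work internally.

```r
library(dplyr)

# Assumed study period for this sketch
study_start <- as.Date("1997-01-01")
study_end   <- as.Date("1997-12-31")

# Hypothetical individuals with known birth and death dates
life <- tibble(ID    = c("X-1", "X-2", "X-3"),
               birth = as.Date(c("1995-06-01", "1997-03-01", "1997-02-01")),
               death = as.Date(c("1997-05-01", "1999-01-01", "1997-10-01")))

life <- life %>%
  mutate(
    # Born before the study period started
    left_censored  = birth < study_start,
    # Died after the study period ended
    right_censored = death > study_end,
    # Born and died during the study period
    uncensored     = !left_censored & !right_censored)
```

Here X-1 is left-censored, X-2 is right-censored, and X-3 is uncensored. Individuals that are still alive (no death date) are not handled in this sketch.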
# Create a DATA FRAME with all female IDs
create_id_starting.table(sex = "female") %>%
  mutate(
    # If individual is born before 1997-01-01 they are LEFT CENSORED
    left_censored = fetch_id_is.censored.left(ID, at = "1997-01-01"),
    # If individual is still alive at 1997-12-31 they are RIGHT CENSORED
    right_censored = fetch_id_is.censored.right(ID, at = "1997-12-31"),
  )
# A tibble: 1,211 × 3
ID left_censored right_censored
<chr> <lgl> <lgl>
1 A-001 TRUE TRUE
2 A-002 TRUE FALSE
3 A-003 TRUE TRUE
4 A-004 TRUE FALSE
5 A-006 TRUE TRUE
6 A-007 FALSE FALSE
7 A-008 TRUE TRUE
8 A-009 TRUE TRUE
9 A-010 TRUE TRUE
10 A-013 TRUE TRUE
# … with 1,201 more rows
# Create a DATA FRAME with all female IDs
create_id_starting.table(sex = "female") %>%
  mutate(
    # If individual is already alive at 1997-01-01 they are LEFT CENSORED
    left_censored = fetch_id_is.censored.left(ID, at = "1997-01-01"),
    # If individual is still alive at 1997-12-31 they are RIGHT CENSORED
    right_censored = fetch_id_is.censored.right(ID, at = "1997-12-31"),
  ) %>%
  # Count how many are left and/or right censored
  group_by(left_censored, right_censored) %>%
  ## REMEMBER: TO RETURN A DATAFRAME WE NEED TO USE COUNT() OR SUMMARISE()
  count()
# A tibble: 4 × 3
# Groups: left_censored, right_censored [4]
left_censored right_censored n
<lgl> <lgl> <int>
1 FALSE FALSE 1096
2 FALSE TRUE 24
3 TRUE FALSE 15
4 TRUE TRUE 76
# Create a DATA FRAME with all female IDs
create_id_starting.table(sex = "female") %>%
  mutate(
    # If individual is already alive at 1997-01-01 they are LEFT CENSORED
    left_censored = fetch_id_is.censored.left(ID, at = "1997-01-01"),
    # If individual is still alive at 1997-12-31 they are RIGHT CENSORED
    right_censored = fetch_id_is.censored.right(ID, at = "1997-12-31"),
  ) %>%
  # Keep only individuals that are uncensored
  filter(!left_censored & !right_censored) %>%
  # How many individuals is this?
  count()
# A tibble: 1 × 1
n
<int>
1 1096