Count number of individuals with a condition (dummy) in panel data
Due to privacy issues, I can't share the original dataset or my original code. Therefore, I have created an example.
Suppose that I want to count how many individuals have obtained a degree in higher education. This means that I want to know for how many individuals HigherEducationDummy == 0. I am struggling with how to do this. In the example below, the correct answer would be 2. I have tried creating a table and using the count/unique functions, but I have no clue how to distinguish between individuals without summing all the '1's.
df <- data.frame(
  Individual = c("1", "1", "1", "1", "2", "2", "2", "3", "4", "4", "4", "4"),
  Time = c("2011", "2012", "2013", "2014", "2011", "2012", "2012", "2017", "2014", "2015", "2016", "2017"),
  HigherEducationDummy = c("1", "1", "1", "1", "0", "0", "0", "1", "0", "0", "0", "0")
)
Solution 1:[1]
Not sure why the condition would be 0, but based on the rest of the description it seems you could summarize over the years for each individual.
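library(dplyr)

df %>%
  group_by(Individual) %>%
  # TRUE when the individual never has the dummy equal to "1"
  summarize(neverHE = !any(HigherEducationDummy == "1")) %>%
  # extract the logical column as a vector before summing
  pull(neverHE) %>%
  sum()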
This would tell you how many people never achieved higher education in the years covered. You could also replace sum with table to get a count of all categories.
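As a rough sketch of the table variant mentioned above, the per-individual flag can be stored first and then tabulated. The names per_person, neverHE and counts are only illustrative and not part of the original answer; the data is the example from the question, with the Time column left out because it is not needed here.

```r
library(dplyr)

# Example data from the question (Time column omitted)
df <- data.frame(
  Individual = c("1", "1", "1", "1", "2", "2", "2", "3", "4", "4", "4", "4"),
  HigherEducationDummy = c("1", "1", "1", "1", "0", "0", "0", "1", "0", "0", "0", "0")
)

# One row per individual; neverHE (illustrative name) is TRUE when
# the dummy is never "1" for that individual
per_person <- df %>%
  group_by(Individual) %>%
  summarize(neverHE = !any(HigherEducationDummy == "1"))

# Tabulate both categories instead of summing only the TRUE values
counts <- table(per_person$neverHE)
counts
```

For the example data, this should report two individuals under FALSE (1 and 3) and two under TRUE (2 and 4).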
Solution 2:[2]
Use all in tapply and sum the result. This counts how many individuals have only zeroes in the dummy across all years.
sum(with(df, tapply(HigherEducationDummy == "0", Individual, all)))
# [1] 2
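If the quantity of interest is instead the number of individuals whose dummy equals 1 in at least one year, the same pattern should work with any in place of all. This is only a sketch under that assumption about the coding of the dummy; ever_he and n_ever_he are illustrative names, and the data is the example from the question without the Time column.

```r
# Example data from the question (Time column omitted)
df <- data.frame(
  Individual = c("1", "1", "1", "1", "2", "2", "2", "3", "4", "4", "4", "4"),
  HigherEducationDummy = c("1", "1", "1", "1", "0", "0", "0", "1", "0", "0", "0", "0")
)

# Per-individual flag: TRUE if the dummy equals "1" in at least one year
ever_he <- with(df, tapply(HigherEducationDummy == "1", Individual, any))
ever_he

# Number of individuals with at least one "1"
n_ever_he <- sum(ever_he)
n_ever_he
# [1] 2
```

In this example both counts happen to be 2, because every individual has either only ones or only zeroes. With mixed histories, all and any would give different results.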
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Marcus |
Solution 2 | jay.sf |