Data frame with condition
I have the following data frame called planets.df:
type               | planets | diameter | rotation | rings
-----------------------------------------------------------
Terrestrial planet | Mercury |    0.382 |    58.64 | FALSE
Terrestrial planet | Venus   |    0.949 |  -243.02 | FALSE
Terrestrial planet | Earth   |    1.000 |     1.00 | FALSE
Terrestrial planet | Mars    |    0.532 |     1.03 | FALSE
Gas giant          | Jupiter |   11.209 |     0.41 | TRUE
Gas giant          | Saturn  |    9.449 |     0.43 | TRUE
Gas giant          | Uranus  |    4.007 |    -0.72 | TRUE
Gas giant          | Neptune |    3.883 |     0.67 | TRUE
I want to get all the planets that have rings, i.e. rings == TRUE, with the following code:
rings.vector <- planets.df$rings
planets.with.rings.df <- planets.df[rings.vector, ]
Can someone tell me why this works? I didn't come up with the code myself, but I want to understand it. Does the part [rings.vector, ] mean rings == TRUE?
Thanks!
Answer by alittleboy for Data frame with condition
rings.vector is a logical vector of TRUE and FALSE values, one per row, taken from the rings column. If you want only the rows where rings is TRUE, indexing with [rings.vector, ] selects exactly those rows and keeps all columns.
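As a minimal sketch of the above, the snippet below rebuilds a cut-down copy of planets.df (only three of the columns, with values copied from the question's table) and applies the same indexing. The reduced column set is an assumption made to keep the example short.

```r
# Cut-down copy of planets.df: only the columns needed for the illustration
planets.df <- data.frame(
  planets  = c("Mercury", "Venus", "Earth", "Mars",
               "Jupiter", "Saturn", "Uranus", "Neptune"),
  diameter = c(0.382, 0.949, 1.000, 0.532, 11.209, 9.449, 4.007, 3.883),
  rings    = c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Logical vector with one TRUE/FALSE entry per row
rings.vector <- planets.df$rings

# Rows where rings.vector is TRUE; the empty slot after the comma keeps all columns
planets.with.rings.df <- planets.df[rings.vector, ]

print(rings.vector)
print(planets.with.rings.df)
```

The result has one row per TRUE entry in rings.vector and the same columns as planets.df.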
Answer by Codoremifa for Data frame with condition
It works because in df[condition, ], the condition part is basically a vector of T/F. The rows corresponding to TRUE are kept and the ones corresponding to FALSE are omitted.
rings.vector is already a vector of T/F. You could instead use rings.vector == TRUE as the condition, which would give the same result.
In your case it probably doesn't matter, but be careful if you have NAs in your condition vector or in the column you are filtering on.
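To illustrate the NA caveat, here is a small sketch on a made-up data frame (the names df, name and flag are hypothetical, not from the question). An NA in the logical index does not drop the row; it produces a row filled with NA. Wrapping the condition in which(), or testing is.na() explicitly, are two common ways to avoid that.

```r
# Hypothetical data: the second flag value is missing
df <- data.frame(name = c("a", "b", "c", "d"),
                 flag = c(TRUE, NA, FALSE, TRUE),
                 stringsAsFactors = FALSE)

# Plain logical indexing: the NA position yields an all-NA row
with.na <- df[df$flag, ]

# which() returns only the positions that are TRUE, so NA is dropped
no.na.which <- df[which(df$flag), ]

# Equivalent explicit test: keep rows that are not NA and are TRUE
no.na.test <- df[!is.na(df$flag) & df$flag, ]

print(with.na)
print(no.na.which)
print(no.na.test)
```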
Answer by user1362215 for Data frame with condition
When you have a data frame, you can reference specific rows and columns in two different ways.
- You can give the row and column numbers explicitly by using df[row_numbers, column_numbers], or
- You can use boolean values (TRUE/FALSE) to indicate which rows/columns you want. With rings.vector, df[rings.vector, ] finds the positions of all the TRUE values in rings.vector and pulls out the corresponding rows.
In the above example, nothing is being selected on the columns, but you still need the comma in the brackets to indicate that the object before the comma refers to rows. Most of the time you'll use boolean values for rows and specific numbers for columns, for simplicity.
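The two indexing styles and the role of the comma can be sketched as follows. The four-row data frame is a shortened, illustrative copy of the question's data, and the variable names are made up for this example.

```r
# Shortened illustrative copy of the planets data
df <- data.frame(planets  = c("Earth", "Mars", "Jupiter", "Saturn"),
                 diameter = c(1.000, 0.532, 11.209, 9.449),
                 rings    = c(FALSE, FALSE, TRUE, TRUE),
                 stringsAsFactors = FALSE)

# Explicit numbers: rows 3 and 4, columns 1 and 2
by.number <- df[c(3, 4), c(1, 2)]

# Logical vector for the rows, numbers for the columns
by.logical <- df[df$rings, c(1, 2)]

# Logical vector for the columns: keep the first and third column
by.col.mask <- df[, c(TRUE, FALSE, TRUE)]

# Without a comma, a single index is interpreted as a column selection
cols.only <- df[c(1, 2)]

print(by.number)
print(by.logical)
print(names(by.col.mask))
print(names(cols.only))
```

The last line shows why the comma matters: df[c(1, 2)] returns the first two columns, whereas df[c(1, 2), ] would return the first two rows.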
Answer by marbel for Data frame with condition
Here is a small reproducible example. I've added some examples using data.table. Please correct the code if it's not right.
data <- data.frame(id = 1:100, x = rnorm(100, 100, 50))
data$flag <- data$x > 100   # the comparison already returns TRUE/FALSE
head(data)

# Rows where flag is FALSE; FALSE also compares equal to 0
data[data$flag == FALSE, ]
data[data$flag == 0, ]

# flag is stored as a logical vector
str(data$flag)
class(data$flag)

# Using data.table
library("data.table")
DT <- data.table(data)
setkey(DT, flag)
DT[J(FALSE)]
DT[J(TRUE)]

# Aggregate (group by)
DT[, quantile(x), by = flag]
DT[, list(mean = mean(x), sum = sum(x), median = median(x)), by = flag]
Answer by exegetic for Data frame with condition
Another angle on this is to use subset(), which is rather intuitive: it extracts only those rows of the data frame for which the condition (second argument) is true.
planets.with.rings.df <- subset(planets.df, rings == TRUE)
or just simply
planets.with.rings.df <- subset(planets.df, rings)
The "== TRUE" in the first solution is redundant, since rings is already a Boolean vector!
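A quick check of that equivalence, as a sketch on a small made-up data frame (the row with a missing rings value is hypothetical and added only for illustration). One related detail: subset() treats a missing value in the condition as FALSE, whereas plain bracket indexing returns an NA row for it.

```r
# Illustrative data; the "Unknown" row with rings = NA is invented for this example
df <- data.frame(planets = c("Earth", "Unknown", "Jupiter", "Saturn"),
                 rings   = c(FALSE, NA, TRUE, TRUE),
                 stringsAsFactors = FALSE)

with.eq    <- subset(df, rings == TRUE)  # explicit comparison
without.eq <- subset(df, rings)          # the logical column used directly

# Both calls select the same rows
print(identical(with.eq, without.eq))

# Bracket indexing keeps an extra all-NA row for the missing value
bracket <- df[df$rings, ]
print(nrow(without.eq))
print(nrow(bracket))
```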