Blog coding and discussion of coding about JavaScript, PHP, CGI, general web building etc.

Monday, February 1, 2016

Data frame with condition


I have the following data frame called planets.df:

type               | planets | diameter | rotation | rings
-----------------------------------------------------------
Terrestrial planet | Mercury |    0.382 |    58.64 | FALSE
Terrestrial planet | Venus   |    0.949 |  -243.02 | FALSE
Terrestrial planet | Earth   |    1.000 |     1.00 | FALSE
Terrestrial planet | Mars    |    0.532 |     1.03 | FALSE
Gas giant          | Jupiter |   11.209 |     0.41 | TRUE
Gas giant          | Saturn  |    9.449 |     0.43 | TRUE
Gas giant          | Uranus  |    4.007 |    -0.72 | TRUE
Gas giant          | Neptune |    3.883 |     0.67 | TRUE

I want to get all the planets that have rings, i.e. rings = TRUE, with the following code:

rings.vector <- planets.df$rings
planets.with.rings.df <- planets.df[rings.vector, ]

Can someone tell me why this works? I didn't come up with the code myself but want to understand why it works. Does the part [rings.vector, ] mean rings = TRUE?

Thanks!

Answer by alittleboy for Data frame with condition


rings.vector is a logical vector (TRUE or FALSE values) taken from the rings column, with one entry per row. Indexing with [rings.vector, ] selects the rows where rings is TRUE and, because nothing follows the comma, keeps all columns.
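As a rough sketch of the above, the data frame can be rebuilt by hand and subset the same way. The values are typed in from the question's table (with the type written as "Gas giant"); everything else uses only base R.

```r
# Rebuild the data frame from the question by hand.
planets.df <- data.frame(
  type     = rep(c("Terrestrial planet", "Gas giant"), each = 4),
  planets  = c("Mercury", "Venus", "Earth", "Mars",
               "Jupiter", "Saturn", "Uranus", "Neptune"),
  diameter = c(0.382, 0.949, 1.000, 0.532, 11.209, 9.449, 4.007, 3.883),
  rotation = c(58.64, -243.02, 1.00, 1.03, 0.41, 0.43, -0.72, 0.67),
  rings    = c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Logical vector with one entry per row of planets.df.
rings.vector <- planets.df$rings

# Rows whose entry is TRUE are kept; the empty slot after the comma keeps all columns.
planets.with.rings.df <- planets.df[rings.vector, ]

print(rings.vector)
print(planets.with.rings.df)
```

Printing rings.vector next to the result makes it easy to see that the kept rows line up with the TRUE positions.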

Answer by Codoremifa for Data frame with condition


It works because in a df[condition, ] statement the condition is a vector of TRUE/FALSE values. Rows whose entry is TRUE are kept and rows whose entry is FALSE are dropped.

rings.vector is already such a vector. You could instead write rings.vector == TRUE as the condition, which would give the same result.

In your case it probably doesn't matter, but be careful if the condition vector or the column you are filtering on contains NAs: each NA in the index produces a row filled with NA in the result.
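A small sketch of both points, using a made-up three-row data frame (the names df, name and flag are only for illustration and are not from the question). It shows that comparing with TRUE changes nothing, and what an NA in the condition does.

```r
# Hypothetical toy data: one flag is missing.
df <- data.frame(name = c("a", "b", "c"),
                 flag = c(TRUE, NA, FALSE),
                 stringsAsFactors = FALSE)

# Comparing a logical vector with TRUE returns the same logical vector.
same.condition <- identical(df$flag, df$flag == TRUE)

# An NA in the row index yields a row filled with NA.
with.na <- df[df$flag, ]

# which() returns only the positions that are TRUE, so the NA is dropped.
without.na <- df[which(df$flag), ]

print(same.condition)
print(with.na)
print(without.na)
```

Wrapping the condition in which() is one common way to avoid the NA rows; whether dropping them is the right choice depends on the data.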

Answer by user1362215 for Data frame with condition


When you have a data frame, you can reference specific rows and columns in two different ways.

  1. You can give the row and column numbers explicitly with df[row_numbers, column_numbers], or
  2. You can use logical values (TRUE/FALSE) to indicate which rows/columns you want. With df[rings.vector, ], R keeps the rows whose positions match the positions of the TRUE values in rings.vector.

In the above example nothing is selected on the columns, but you still need the comma inside the brackets: whatever comes before the comma refers to rows, and leaving the part after it empty keeps every column. Most of the time you'll use logical values for rows and specific numbers for columns, for simplicity.
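The two ways of indexing can be compared side by side. This is a minimal sketch on a four-row excerpt of the question's data (the object name df and the result names are chosen here for illustration).

```r
# Four planets taken from the question's table.
df <- data.frame(planets  = c("Earth", "Mars", "Jupiter", "Saturn"),
                 diameter = c(1.000, 0.532, 11.209, 9.449),
                 rings    = c(FALSE, FALSE, TRUE, TRUE),
                 stringsAsFactors = FALSE)

# Way 1: explicit row numbers.
by.number <- df[c(3, 4), ]

# Way 2: a logical vector with one entry per row.
by.logical <- df[df$rings, ]

# which() shows the row numbers that the logical vector stands for.
ring.rows <- which(df$rings)

# Logical values for rows, column numbers for columns.
rows.and.cols <- df[df$rings, c(1, 2)]

# Without a comma, the index refers to columns instead of rows.
cols.only <- df[c(TRUE, FALSE, TRUE)]

print(ring.rows)
print(by.number$planets)
print(by.logical$planets)
print(names(rows.and.cols))
print(names(cols.only))
```

The last line is the reason the comma matters: the same kind of logical vector selects rows in one position and columns in the other.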

Answer by marbel for Data frame with condition


Here is a small reproducible example. I've added some examples using data.table. Please correct the code if it's not right.

# 100 rows with a logical flag column
data <- data.frame(id = 1:100, x = rnorm(100, 100, 50))
data$flag <- data$x > 100
head(data)

# FALSE rows can also be selected with 0, because FALSE == 0
data[data$flag == FALSE, ]
data[data$flag == 0, ]
str(data$flag)

# As it's of class "logical":
class(data$flag)

# Using data.table
library("data.table")
DT <- data.table(data)

setkey(DT, flag)
DT[J(FALSE)]
DT[J(TRUE)]

# Aggregate (group by)
DT[, quantile(x), by = flag]

DT[, list(mean = mean(x),
          sum = sum(x),
          median = median(x)),
   by = flag]

Answer by exegetic for Data frame with condition


Another angle on this is to use subset(), which is rather intuitive: it extracts only those rows from the data frame for which the condition (second argument) is true.

planets.with.rings.df <- subset(planets.df, rings == TRUE)  

or just simply

planets.with.rings.df <- subset(planets.df, rings)  

The "== TRUE" in the first solution is redundant because rings is already a logical vector!
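A quick check of the two subset() calls, sketched on a made-up three-row data frame. The row "Planet X" with a missing rings value is hypothetical and is there only to show how subset() differs from bracket indexing when the condition contains NA.

```r
# Hypothetical toy data: the last row has an unknown rings value.
df <- data.frame(planets = c("Mars", "Saturn", "Planet X"),
                 rings   = c(FALSE, TRUE, NA),
                 stringsAsFactors = FALSE)

# Both forms select the same rows.
with.comparison <- subset(df, rings == TRUE)
without.comparison <- subset(df, rings)

# subset() treats NA in the condition as FALSE; bracket indexing
# returns an extra row filled with NA instead.
by.bracket <- df[df$rings, ]

print(with.comparison)
print(without.comparison)
print(nrow(by.bracket))
```

So subset() is not only shorter to write, it also quietly drops rows where the condition is NA, which may or may not be what you want.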

