I’m learning programming/analysis with R. Here are a few things I’ve learned that were more painful than necessary.
- Suppose you have a formula such as
fol <- formula(response ~ a + b + c)
. Every one of those values must be defined in your data frame – including the left-hand side (‘response’ in this case). This means you must add a column to your data frame for the left-hand side. You’ll probably do that with df$response<-rep(0,row_count_goes_here). - If you call rpart with something like
model <- rpart(fol, method="class", data=training)
, you might getError in cbind(yval2, yprob, nodeprob) : number of rows of matrices must match (see arg 2)
. If all of your left-hand side values are the same, rpart throws this unhelpful message as its way of saying, “Hey! I can’t make a decision tree where all of the records (‘response’, in this case) have the same value! The decision is too easy.”- You’ll have to put the calculated value for the left-hand side into the table. Try something like
training$response <- training$a + training$b + training$c
- You’ll have to put the calculated value for the left-hand side into the table. Try something like
- to get the first 5 rows of a data frame with 13 columns use either:
- df[c(1:5),1:13]
- df[c(1:5,]