This is probably a dumb question, but when I use the H2O predict function (h2o.predict) in R, is there a way to tell it to keep one or more columns from the scoring data? Specifically, I want to keep my unique ID key.
As it stands now, I end up taking a really inefficient approach: I assign an index key to the original data set and another to the scores, then merge the scores onto the scoring data set. I'd rather just say "score this data set and keep columns x, y, z, ... as well." Any advice?
library(dplyr)  # for inner_join

# Use the H2O predict function to score new data
NL2L_SCore_SetScored.hex <- h2o.predict(object = best_gbm, newdata = NL2L_SCore_Set.hex)

# Convert the scores from an H2O frame to a data frame
NL2L_SCore_SetScored.df <- as.data.frame(NL2L_SCore_SetScored.hex)

# Add an index to the scores so we can merge the two data sets
NL2L_SCore_SetScored.df$ID <- seq.int(nrow(NL2L_SCore_SetScored.df))

# Convert the original scoring set from an H2O frame to a data frame
NL2L_SCore_Set.df <- as.data.frame(NL2L_SCore_Set.hex)

# Add an index to the original scoring data so we can merge the two data sets
NL2L_SCore_Set.df$ID <- seq.int(nrow(NL2L_SCore_Set.df))

# Then merge by the newly created ID key so I have the scores on my scoring
# data set. Ideally I wouldn't have to create this key at all and could keep
# the original columns from the data set, which include the customer ID key.
Full_Scored_Set <- inner_join(NL2L_SCore_Set.df, NL2L_SCore_SetScored.df, by = "ID")
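For reference, here is a minimal sketch of the kind of thing I'm hoping is possible: binding the key column to the predictions on the H2O side with h2o.cbind, so no index or merge is needed. It assumes h2o.predict returns one row per row of newdata in the same order. The toy data, the customer_id column, and the toy_gbm model are made-up stand-ins for my real NL2L_SCore_Set.hex and best_gbm.

```r
library(h2o)
h2o.init()

# Hypothetical stand-in for the scoring data: iris plus a made-up numeric key
df <- iris
df$customer_id <- 1000 + seq_len(nrow(df))
score_set.hex <- as.h2o(df)

# Hypothetical stand-in for best_gbm; the key is excluded from the predictors
predictors <- setdiff(names(iris), "Species")
toy_gbm <- h2o.gbm(x = predictors, y = "Species",
                   training_frame = score_set.hex, ntrees = 5)

# Score the data as before
scores.hex <- h2o.predict(object = toy_gbm, newdata = score_set.hex)

# Bind the key column to the predictions inside H2O
# (assumes the prediction rows line up with the rows of newdata)
scored_with_id.hex <- h2o.cbind(score_set.hex[, "customer_id"], scores.hex)

# Pull the combined frame into R only once
scored_with_id.df <- as.data.frame(scored_with_id.hex)
```

With my real objects this would presumably be h2o.cbind(NL2L_SCore_Set.hex[, "my_key_column"], NL2L_SCore_SetScored.hex), where my_key_column is a placeholder for the actual key name.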