
Is there a better way to find the percent of one column that meets a criterion for each value…

I have a data frame with two numeric columns, grade.equivalent and scaled.score. For each grade.equivalent, I'd like to find the percent of students at or above a given scaled.score among all students at or above that grade.equivalent.

For example, given the following data frame:

df.ex <- data.frame(grade.equivalent=c(2.4,2.7,3.1,2.5,1.4,2.2,2.3,1.7,1.3,2.2),

I'd like to know, for each grade.equivalent, what percent of students scored at or above 301 out of all students at or above that grade.equivalent.

To do this I did the following:

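find.percent.basic <- function(cut.ge, data, cut.scaled.score){
  # Students at or above the grade-equivalent cutoff with a non-missing score
  df.sub <- subset(data, grade.equivalent >= cut.ge & !is.na(scaled.score))
  denom <- nrow(df.sub)
  # Of those, students at or above the scaled-score cutoff
  df.sub <- subset(df.sub, scaled.score >= cut.scaled.score)
  numer <- nrow(df.sub)
  numer / denom
}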

grade.equivs <- sort(unique(df.ex$grade.equivalent))

just.percs <- sapply(grade.equivs, find.percent.basic, data=df.ex, cut.scaled.score=301)

new.df <- data.frame(grade.equivalent=grade.equivs, perc=just.percs)

I plan to wrap this in a function and use it with plyr.

My question is: is there a better way to do this? It seems like this might already exist as a base R function or in a common package that I just don't know about.

Thanks for any thoughts.

EDIT for clarification
The code above produces the following result, which is what I'm looking for:

  grade.equivalent      perc
1              1.3 0.2000000
2              1.4 0.2222222
3              1.7 0.2500000
4              2.2 0.2857143
5              2.3 0.2000000
6              2.4 0.2500000
7              2.5 0.3333333
8              2.7 0.5000000
9              3.1 1.0000000
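
For comparison, here is a minimal vectorized sketch of the same calculation in base R. It counts rows per distinct grade.equivalent with tapply and then takes reverse cumulative sums, so no per-cutoff subsetting is needed. The names df.demo, percent.at.or.above, and res.demo are hypothetical, and the scaled.score values are invented for illustration, chosen only so that the result lines up with the table above. One assumed difference from the sapply version: a grade.equivalent whose scores are all NA is dropped here rather than reported.

```r
# Hypothetical data: grade.equivalent matches the example above, but the
# scaled.score values are invented for illustration only.
df.demo <- data.frame(
  grade.equivalent = c(2.4, 2.7, 3.1, 2.5, 1.4, 2.2, 2.3, 1.7, 1.3, 2.2),
  scaled.score     = c(280, 290, 310, 285, 250, 305, 275, 260, 245, 270)
)

# For each distinct grade.equivalent, the share of rows with
# scaled.score >= cut.scaled.score among rows at or above that grade.equivalent.
percent.at.or.above <- function(data, cut.scaled.score) {
  # Drop rows with a missing score, as the subset() version does
  data <- data[!is.na(data$scaled.score), ]
  # Per distinct grade.equivalent (ascending): number of rows and number of hits
  n.all <- tapply(data$scaled.score, data$grade.equivalent, length)
  n.hit <- tapply(data$scaled.score >= cut.scaled.score,
                  data$grade.equivalent, sum)
  # Reverse cumulative sums turn per-value counts into "at or above" totals
  denom <- rev(cumsum(rev(n.all)))
  numer <- rev(cumsum(rev(n.hit)))
  data.frame(grade.equivalent = as.numeric(names(n.all)),
             perc = as.numeric(numer / denom))
}

res.demo <- percent.at.or.above(df.demo, 301)
print(res.demo)
```

Because the counting happens once and the cumulative sums run over the distinct values only, this avoids calling subset() once per cutoff, which may matter on larger data frames.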

Edited for clarification a second time, per observations from @DWin
