Best Fit for Classification



I have two sets of data representing the aesthetic scores for two groups of objects (determined using a computer program). The first has about 19,000 elements while the other has about 12,000. The first group scores higher on average than the second, and the difference is statistically significant (two-sample t-test assuming unequal variances). The scores for the first group range between 0 and 7, while those for the second range between 0 and 5.

I want to classify the scores into two categories (e.g. "beautiful" and "not beautiful") for the purpose of testing for correlation with human assessment. I suppose this means I have to "draw a line" between the group scores somehow. What is the best way to do this? Thanks.
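
To make the "draw a line" idea concrete, here is a minimal sketch of one possible reading: treat group membership as the label and look for the single cutoff that best separates the two groups. The criterion (balanced accuracy), the function name `best_cutoff`, and the synthetic scores are all my assumptions for illustration, not my real data, and this may well not be the best approach.

```python
from bisect import bisect_left
import random


def best_cutoff(high_group, low_group):
    """Return (cutoff, balanced_accuracy) for the rule: score >= cutoff -> high group.

    Hypothetical helper: scans every observed score as a candidate cutoff and
    keeps the first one with the highest balanced accuracy.
    """
    hi = sorted(high_group)
    lo = sorted(low_group)
    best_t, best_bal = None, -1.0
    for t in sorted(set(hi) | set(lo)):
        # Fraction of the higher-scoring group at or above the cutoff.
        sensitivity = (len(hi) - bisect_left(hi, t)) / len(hi)
        # Fraction of the lower-scoring group strictly below the cutoff.
        specificity = bisect_left(lo, t) / len(lo)
        balanced = (sensitivity + specificity) / 2
        if balanced > best_bal:
            best_t, best_bal = t, balanced
    return best_t, best_bal


# Synthetic stand-in scores (assumed shapes, clipped to the ranges 0-7 and 0-5).
random.seed(0)
group_a = [min(7.0, max(0.0, random.gauss(3.5, 1.2))) for _ in range(1900)]
group_b = [min(5.0, max(0.0, random.gauss(2.5, 1.0))) for _ in range(1200)]

cutoff, bal_acc = best_cutoff(group_a, group_b)
print(f"cutoff = {cutoff:.2f}, balanced accuracy = {bal_acc:.3f}")
```

Balanced accuracy is used here only because the two groups have different sizes; scores at or above the cutoff would be labelled "beautiful" and the rest "not beautiful".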