An Analysis of Homeschooled and Non-Homeschooled Students’ Performance on an ACT Mathematics Test

To see this article with all of the tables and figures, please view the PDF Version.

An Analysis of Homeschooled and Non-Homeschooled Students’ Performance
on an ACT Mathematics Achievement Test

 Basil Qaqish, Ph.D.

Department of Social Work, The University of North Carolina at Greensboro

PO Box 26170, Greensboro NC 27402-6170

Keywords: Homeschooled, homeschool, student, ACT performance, mathematics.

Homeschooling is an alternative education method that parents can choose for educating their children. There is no set age at which children start a homeschooling program. Parents choose to homeschool one or more of their children for a variety of reasons. They may opt to homeschool for religious reasons; in fact, religious affiliation is a major factor among home educators (Ray, 1997). Parents may also choose to homeschool because their child has a hard time in a public school learning environment. According to a U.S. Department of Education study, parents of homeschooled children stated that the most important reasons for homeschooling were giving their children a better education at home, religious reasons, and a poor learning environment at school. The U.S. Department of Education estimated the number of homeschooled children in 1999 to be between 709,000 and 992,000 (Bielick, Chandler, and Broughman, 2001).

Published studies have reported that homeschoolers performed better than non-homeschoolers on standardized tests in the past. The National Home Education Research Institute (NHERI) web site lists many references, from many published studies, to that effect. Its Fact Sheet I states that, in one study, home educated students scored, on average, at or above the 80th percentile in all areas on standardized achievement tests.

In NHERI fact sheet IIIc, a study of 16,311 home educated students found that home educated students performed above average on the Iowa Test of Basic Skills, which is a test of abilities in reading, language and math. Here is the table of national percentile performance as presented by NHERI on its web site:














[Table not reproduced: national percentile performance of home education students, as presented by NHERI.]





In 1992, Calvery et al. analyzed the achievement of Arkansas homeschooled and public schooled students in grades 4, 7, and 10. Homeschooled students were found to have scored higher than their public school counterparts except on grade 10 language scores, where homeschooled students scored significantly lower than public school students.

Research by Ray (1997) yielded results that indicate higher performance by homeschooled students. For instance, Ray reported that homeschooled students performed at the 87th percentile in reading and the 82nd percentile in math, compared to a national average at the 50th percentile. Research by Rudner (1999) indicated that 24.5% of homeschooled students “are enrolled one or more grades above their nominal grade.” Rudner used the Iowa Tests of Basic Skills (ITBS) and the Tests of Achievement and Proficiency (TAP) to compare achievement for homeschooled and non-homeschooled students. His results showed that homeschooled students outperformed non-homeschooled students across all grade levels. Most results of the analysis showed that homeschooled students performed in the 75th to 85th percentile range compared to non-homeschooled students.

Homeschooled students did well on the 1998 ACT college entrance exam. Their average composite score was 22.8, about 0.38 standard deviation above a national average of 21. This result placed them at the 65th percentile of all ACT test takers of 1998 (Rudner, 1999).
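As a rough check on these figures, the short Python sketch below converts the reported effect size into an implied standard deviation and a percentile. It assumes, purely for illustration, that ACT composite scores are approximately normally distributed; the variable names are not from the original study.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

homeschool_mean = 22.8  # reported 1998 ACT composite mean for homeschoolers
national_mean = 21.0    # reported national average
effect_size = 0.38      # reported difference in standard deviation units

# Standard deviation implied by the reported figures (an inference, not a reported value).
implied_sd = (homeschool_mean - national_mean) / effect_size

# Percentile of a score 0.38 SD above the mean, ASSUMING a normal distribution.
percentile = 100.0 * normal_cdf(effect_size)

print(f"implied SD = {implied_sd:.2f}, percentile = {percentile:.1f}")
```

Under the normality assumption, the computed percentile comes out close to the 65th percentile reported by Rudner (1999).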

Historically, homeschooled students seem to have outperformed non-homeschooled students on average. In recent years, however, more families have chosen homeschooling for one reason or another, and this may have changed the demographics of home educated students in a manner that affects the performance differences between the two groups on standardized tests. But how much has performance on standardized tests changed? To answer this question in part, two datasets of response vectors, one for homeschooled and one for non-homeschooled students, were obtained for the same form of an ACT mathematics test. Next, two specific subsets were extracted, controlling for grade level, gender, ethnicity, and SES (socioeconomic status). The final output was two subsets of examinees (N=1477), one for the homeschooled group and one for the non-homeschooled group. The two subsets were compared to see whether the historical trends on standardized tests still hold.

It is worth mentioning here that the national norms are largely controlled by the scores of public school students. The current study does not try to establish any comparison between public school students’ results on ACT and their homeschooled counterparts. Rather the goal was to take homeschooled students and compare their test results to the group of all other students. It is true that the second group results will be largely influenced by public school students’ results. But it is also true that the second group includes students who do not belong to public education (i.e., private schools). Simply put, the goal of this study was to compare homeschooled students to everybody else, with regard to their scores on an ACT mathematics achievement test. The research questions that motivated this study were:

1. Is the test equally “reliable” for homeschooled and non-homeschooled students?

2. Which group had a higher raw test score?

3. If we condition on equivalent raw test scores for the two groups, is the pattern of responses to a specific item on the test similar? That is, is there a statistically significant difference between the two groups of examinees who have equal abilities (equal raw test scores) with regard to their pattern of responses to the same item on the test?



At the beginning of 2003, ACT Inc. was contacted to obtain two datasets of response vectors for the same form of an ACT mathematics test, one for homeschooled and one for non-homeschooled students, with four demographics included in the data (grade level, gender, ethnicity, and SES). Students were classified as homeschooled if they indicated on their test application form that they were homeschooled. The length of time being homeschooled was not considered in this study because data were not available in this regard. ACT was very helpful and provided the requested datasets in March of 2003. ACT provided clean datasets with no missing cases; however, a breakdown of the test items by competency was not available. According to the data description file from ACT (2003), the test is a 60-item test that takes 60 minutes for students to finish. The test is

. . . designed to assess the mathematical reasoning skills that students have typically acquired in courses taken up to the beginning of grade twelve.  The test presents five-option multiple-choice items that require students to use their mathematical reasoning skills to solve practical problems in mathematics.  Knowledge of basic formulas and computational skills are assumed as background for the problems, but memorization of complex formulas and extensive computation are not required.  The material covered on the test emphasizes the major content areas that are prerequisite to successful performance in entry-level courses in college mathematics.  Six content areas are included: pre-algebra, elementary algebra, intermediate algebra, coordinate geometry, plane geometry, and trigonometry. (p.2)

According to the abovementioned data description file, the methodology followed in obtaining the two datasets was as follows:

Two steps were involved in drawing these samples.  First, the data file for the total examinee population for this administration of the ACT Assessment (N = 330,970) was filtered to separate the home scholars from the non-homeschoolers, and to remove the record of any examinee who neglected to indicate one or more of the self-reported variables.  This filtering resulted in the 1,807 student records represented in home schoolers.dat and records for 274,313 non-homeschoolers.  Next, a simple random sample of 5,400 records was drawn from the modified non-homeschooler population.  These records are represented in nonhome schoolers.dat. (p.2).

Conditioning on the abovementioned four demographic variables, the statistical package SAS was used to extract two comparable subset files. A randomization procedure was used wherever one of the two files held more records with the same demographics than the other. For instance, if there were four homeschoolers’ records for 9th grade white male examinees whose family income is more than $100,000 and 10 records with the same characteristics in the non-homeschoolers’ file, four of the non-homeschoolers’ records were selected at random for inclusion in the newly created non-homeschoolers’ subset file. If the reverse situation occurred and the homeschoolers’ file had more records than the non-homeschoolers’ file, the randomization procedure was applied to the homeschoolers’ records. In the end, two files that were comparable with regard to the four demographics were obtained (N=1477). In short, matched samples were created through random selection. Those files were, for the most part, the ones analyzed in this study.

Tables 1, 2 and 3 report the demographic breakdown of the obtained files. It was not possible to report the breakdown in one table because each of the four demographic variables had multiple levels: gender had two levels, grade had seven, ethnicity had nine, and SES had ten. A one-table breakdown would yield 2 × 7 × 9 × 10 = 1260 cells (most of which are zeros). Because of this, and to report a table that shows all four demographics, data were collapsed to the following number of levels for each variable:

Gender: remained the same with two levels

Grade Level: collapsed into five levels (9th grade, 10th grade, 11th grade, 12th grade and Other)

Ethnicity: collapsed into four levels (African American, Caucasian, Hispanic and other)

SES: collapsed into different income brackets ($30,000 or less, between $30,000 and $60,000, between $60,000 and $100,000, and more than $100,000).
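The matching procedure described above (applied to the original, uncollapsed demographic levels) can be sketched in Python as follows. This is an illustration only, not the SAS code used in the study; the record layout, the field names, and the decision to drop demographic cells that appear in only one file are all assumptions.

```python
import random
from collections import defaultdict

def matched_samples(home, non_home,
                    keys=("grade", "gender", "ethnicity", "ses"), seed=0):
    """Draw equal-sized random samples from every demographic cell found in both files."""
    rng = random.Random(seed)

    def group(records):
        cells = defaultdict(list)
        for rec in records:
            cells[tuple(rec[k] for k in keys)].append(rec)
        return cells

    home_cells, non_cells = group(home), group(non_home)
    home_out, non_out = [], []
    # Assumption: cells present in only one file contribute no records.
    for cell in sorted(home_cells.keys() & non_cells.keys()):
        n = min(len(home_cells[cell]), len(non_cells[cell]))
        home_out.extend(rng.sample(home_cells[cell], n))
        non_out.extend(rng.sample(non_cells[cell], n))
    return home_out, non_out

# The example from the text: 4 homeschoolers and 10 non-homeschoolers in one cell.
cell = {"grade": 9, "gender": "M", "ethnicity": "white", "ses": ">100k"}
home = [dict(cell, id=i) for i in range(4)]
non_home = [dict(cell, id=i) for i in range(10)]
h, n = matched_samples(home, non_home)
print(len(h), len(n))  # 4 4
```

Sampling within each cell keeps the two output files identical in their joint demographic distribution, which is the property the matched comparison relies on.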

Following are the three tables (1, 2, and 3) mentioned above.


The analysis in this study followed four steps:

1. Computation of descriptive statistics (means, standard deviations, and reliabilities). This part of the analysis addresses questions such as: What are the mean test scores for the two groups? What are the standard deviations? What is the test reliability for the two groups?

2. An item p-value (proportion correct) comparison. This part of the analysis investigates the performance of each group on each of the test items.

3. A SIBTEST analysis. SIBTEST was used in this study to answer the following research question: Conditioning on equivalent raw test scores, are the patterns of responses to the same test question, from a purely “statistical” perspective, similar or different? SIBTEST takes the groups of examinees who have equivalent raw scores on the test (homeschooled vs. non-homeschooled) and analyzes their response patterns to see whether they are statistically significantly different from each other. The rationale behind using this test is to see whether groups of students who have similar math abilities respond to the same test question in the same manner, regardless of whether they were homeschooled or not.

4. An expected frequency distribution comparison using IRT (item response theory). This part of the study involved IRT expected frequency simulation. This investigation was done to see whether the expected frequency distribution is similar to what was actually observed in the datasets.

Means and Standard Deviations and Instrument Reliability

Table 4 gives the  “number correct” means and standard deviations for the two groups for the original datasets and for the conditioned subsets obtained (N=1477).

Table 2. RACE * SES crosstabulation. *Numbers are in 1000’s dollars for income categories.


The mean scores for the two groups favor non-homeschoolers by about two items per examinee, while the standard deviations are close in value to each other. All datasets showed good reliability measures.
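For readers who want to reproduce this kind of summary, the following Python sketch computes the number-correct mean, the standard deviation, and Cronbach's alpha (the reliability measure named in the Table 4 caption) from a matrix of 0/1 item scores. The function name and the tiny dataset are illustrative assumptions, not the ACT data.

```python
from statistics import mean, pstdev, pvariance

def describe(responses):
    """responses: list of examinee rows, each a list of 0/1 item scores.

    Returns (mean number correct, SD of number correct, Cronbach's alpha).
    Population variances are used throughout; alpha is unaffected by that
    choice as long as the same variance formula is used for items and totals.
    """
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    item_vars = [pvariance([row[i] for row in responses]) for i in range(k)]
    alpha = (k / (k - 1)) * (1 - sum(item_vars) / pvariance(totals))
    return mean(totals), pstdev(totals), alpha

# Made-up responses of five examinees to four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
score_mean, score_sd, alpha = describe(responses)
print(score_mean, round(score_sd, 3), round(alpha, 3))
```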


Proportion Correct Comparison


Table 5 shows the p-value (proportion correct) per item for the two groups, along with the p-value differences (positive values favor homeschoolers).

In terms of p-values, there were 17 items on which homeschoolers performed better than non-homeschoolers and 43 items on which non-homeschoolers performed better than homeschoolers. Whereas some of those differences may seem very small (e.g., 0.008), others seem relatively large (e.g., 0.08). We need to keep in mind that those p-value differences are aggregated over all total score categories.

The proportion correct comparison graph below (Figure 1) shows that non-homeschoolers had overall higher proportion correct values than homeschoolers.

Figure 2 below represents the proportion correct differences across the items between the two groups. A value above zero means that the item proportion correct favored homeschoolers. A value below zero means the item proportion correct favored non-homeschoolers.
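The item-level comparison can be sketched in a few lines of Python. The sign convention follows the text (positive differences favor homeschoolers); the function names and the toy response matrices are assumptions made for illustration.

```python
def item_p_values(responses):
    """Proportion of examinees answering each item correctly (0/1 scores)."""
    n = len(responses)
    k = len(responses[0])
    return [sum(row[i] for row in responses) / n for i in range(k)]

def p_value_differences(home, non_home):
    """Homeschoolers' p-value minus non-homeschoolers' p-value, per item."""
    return [h - nh for h, nh in zip(item_p_values(home), item_p_values(non_home))]

# Toy data: four examinees per group, two items.
home = [[1, 0], [1, 1], [0, 0], [1, 0]]
non_home = [[1, 1], [0, 1], [1, 1], [0, 0]]

diffs = p_value_differences(home, non_home)
favor_home = sum(d > 0 for d in diffs)
favor_non_home = sum(d < 0 for d in diffs)
print(diffs, favor_home, favor_non_home)  # [0.25, -0.5] 1 1
```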


SIBTEST Analysis


Table 6 contains a summary output from a SIBTEST run to investigate DIF (differential item functioning) on the item level between homeschoolers and non-homeschoolers (N = 1477).





Table 3. Grade by Gender by Ethnicity by SES Cross Tabulation For matched datasets (N=1477).




Table 4. Test means, standard deviations, and reliabilities.  * Cronbach’s alpha.



There were 38 items exhibiting DIF: nineteen items favored homeschoolers and nineteen items favored non-homeschoolers. This large number of items exhibiting DIF indicates a marked difference in response patterns between the two groups at the item level. DIF is conceptualized as a difference in the probability of endorsing a keyed item response when individuals with the same level of ability possess different amounts of supplemental abilities that affect their responses to the item. The high number of items that exhibited DIF in this study may be indicative of differences in mathematical training (e.g., teaching methods, teacher-learner interaction) between the two groups. Such a possible conclusion demands further research and investigation that is beyond the scope of the current study. An item-by-competency breakdown was not available, a fact that prevented further investigation into the possible reasons for the occurrence of DIF in many test items.
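To make the idea of conditioning on equivalent scores concrete, here is a deliberately simplified Python sketch of a DIF index for one item. It is not the SIBTEST program: it omits SIBTEST's regression correction and its significance test, and the choice of matching on the rest score (total score minus the studied item) with pooled weights is an assumption made for this illustration.

```python
from collections import defaultdict

def dif_index(home, non_home, item):
    """Weighted mean difference in proportion correct on `item`,
    comparing only examinees with the same rest score.
    Positive values favor homeschoolers."""
    def by_rest_score(responses):
        cells = defaultdict(list)
        for row in responses:
            cells[sum(row) - row[item]].append(row[item])
        return cells

    h, n = by_rest_score(home), by_rest_score(non_home)
    shared = sorted(h.keys() & n.keys())  # score levels present in both groups
    total = sum(len(h[k]) + len(n[k]) for k in shared)
    beta = 0.0
    for k in shared:
        weight = (len(h[k]) + len(n[k])) / total  # pooled share at this level
        beta += weight * (sum(h[k]) / len(h[k]) - sum(n[k]) / len(n[k]))
    return beta

# Toy data: four examinees per group, three items; item 0 is studied.
home = [[1, 1, 0], [0, 1, 0], [1, 0, 0], [1, 1, 1]]
non_home = [[0, 1, 0], [0, 0, 1], [0, 0, 0], [1, 1, 1]]
beta_item0 = dif_index(home, non_home, item=0)
print(beta_item0)
```

An index near zero suggests that matched examinees in the two groups answer the item similarly; larger absolute values point toward DIF, which a full analysis would confirm with a formal test.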

Below are item characteristic curves (ICC) for the most significantly different items, with regard to DIF, where one of the items favored homeschoolers, and the other one favored non-homeschoolers.

Figure 4 is a frequency bar graph for both groups for item 39 (numbers on the bars reflect the number of examinees who got the item correct at each score category):

Note that the number of homeschoolers is greater than the number of non-homeschoolers for total test score groups less than 30. Non-homeschoolers’ numbers are greater than homeschoolers’ for total test score groups higher than 30.



Table 5. Proportion correct by item for homeschoolers and non-homeschoolers.



Let’s inspect one item that showed DIF favoring non-homeschoolers. Figure 5 below is the ICC for the two groups, followed by a frequency bar graph (Numbers on the bars reflect the number of examinees who got the item correct for each group at each score category).

Note that the pattern is still the same. There are more homeschoolers than non-homeschoolers in score categories less than 30, while there are more non-homeschoolers than homeschoolers in score categories higher than 30. This pattern is noticeable for all the items of this test, regardless of which group the item's overall percent correct favored, and regardless of whether the item displayed DIF or not. One can say, here, that among students with higher mathematical abilities there are more non-homeschoolers than homeschoolers. By the same token, one can say that among students with average and low mathematical abilities there are more homeschoolers than non-homeschoolers. Keep in mind that this result was obtained after conditioning on all four demographic characteristics, with the same number of examinees in the two groups.






Figure 1. Item proportion correct comparison.





Figure 2. Proportion correct differences.






Table 6. SIBTEST run results.








Figure 3. Item characteristic curve – item favors homeschoolers.







Figure 4. Frequency bars – item favors homeschoolers.





Figure 5. Item characteristic curve – item favors non-homeschoolers.




Figure 6. Frequency bars – item favors non-homeschoolers.



Expected Frequencies of Scores


Item response theory (IRT) was used to get the expected frequency distributions for the two groups using the following procedure:

1. The program BILOG was run twice to obtain estimates of the ability, difficulty, and discrimination parameters, using a three-parameter logistic model. According to this model, the probability of a correct response, $P_i(\theta)$, for item $i$ and subject $j$ with ability $\theta$ is given by:

$$P_i(\theta) = c_i + (1 - c_i)\,\frac{e^{a_i(\theta - b_i)}}{1 + e^{a_i(\theta - b_i)}} \tag{1}$$

where $a_i$, $b_i$, and $c_i$ are the discrimination, difficulty, and guessing parameters.

2. Homeschoolers’ ability estimates and item parameter estimates were transformed to the metric of the ability estimates for non-homeschoolers using a mean/sigma method (Hambleton, Swaminathan, and Rogers, 1991). To transform the ability estimates (thetas) for homeschoolers to the non-homeschoolers’ metric, the following formula was used:

$$\theta_y = a\,\theta_x + b \tag{2}$$

where $\theta_x$ and $\theta_y$ are the ability estimates (thetas) for homeschoolers before and after the transformation, and the values of $a$ and $b$ were determined using the following formulas:

$$a = \frac{\sigma_{b_j}}{\sigma_{b_i}} \tag{3}$$

where $\sigma$ is the standard deviation, and $b_j$ and $b_i$ are the difficulty parameters for non-homeschoolers and for homeschoolers, and

$$b = \mu_{b_j} - a\,\mu_{b_i} \tag{4}$$

where $\mu_{b_j}$ and $\mu_{b_i}$ are the mean “b” values (difficulty parameters) for non-homeschoolers and for homeschoolers.

This transformation was done using the program RESCALE (Ackerman, 2004).

3. Finally, using the rescaled item parameters and latent ability estimates, the true score, $\tau$, for each subject was calculated according to:

$$\tau = \sum_{i} P_i(\theta) \tag{5}$$

where the sum runs over the test items and the probability of a correct response, $P_i(\theta)$, for item $i$ and subject $j$ is given by the three-parameter logistic formula mentioned above. True scores were rounded off to the nearest whole number to obtain the expected number correct score.

Expected frequencies for all estimated abilities were generated using the program EXPFREQ2 (Ackerman, 2004). Below is a frequency histogram of the results. As one can see again, there are more homeschoolers than non-homeschoolers in score categories below 30. The situation is reversed for score categories above 30.
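The three-step procedure above can be sketched in Python as follows. This is a minimal illustration, not the BILOG, RESCALE, or EXPFREQ2 programs: the parameter values are made up, no scaling constant is applied in the logistic function, and the rule used to rescale item parameters (divide the discrimination by the slope, apply the linear transformation to the difficulty, and leave the guessing parameter unchanged) is the usual companion of a linear ability transformation and is assumed here rather than taken from the text.

```python
import math
from collections import Counter
from statistics import mean, pstdev

def p_3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def mean_sigma(b_target, b_source):
    """Slope and intercept that map the source metric onto the target metric."""
    slope = pstdev(b_target) / pstdev(b_source)
    intercept = mean(b_target) - slope * mean(b_source)
    return slope, intercept

def rescale_items(items, slope, intercept):
    """Apply the linear transformation to (a, b, c) item parameter triples."""
    return [(a / slope, slope * b + intercept, c) for a, b, c in items]

def expected_score(theta, items):
    """True score (sum of item probabilities) rounded to a whole number.
    Python's round() sends exact .5 ties to the nearest even integer."""
    return round(sum(p_3pl(theta, a, b, c) for a, b, c in items))

def expected_frequencies(thetas, items):
    """Count how many examinees fall at each expected number-correct score."""
    return Counter(expected_score(t, items) for t in thetas)

# Made-up parameters: three items calibrated on the homeschoolers' metric,
# and the difficulties of the same items on the non-homeschoolers' metric.
home_items = [(1.0, -1.0, 0.2), (1.2, 0.0, 0.2), (0.8, 1.0, 0.2)]
non_home_b = [-0.5, 1.0, 2.5]

slope, intercept = mean_sigma(non_home_b, [b for _, b, _ in home_items])
rescaled = rescale_items(home_items, slope, intercept)
thetas = [slope * t + intercept for t in (-1.0, 0.0, 1.0)]
freq = expected_frequencies(thetas, rescaled)
print(slope, intercept, dict(freq))
```

A useful property of this transformation is that response probabilities do not change: an examinee's probability of answering an item correctly is the same before and after both the ability and the item parameters are rescaled.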





There are three distinct results from this study:

1. On average, non-homeschoolers performed better than homeschoolers, by about two items out of sixty, on the ACT mathematics test that was analyzed.

2. Comparing the two groups of examinees indicated a high number of items displaying DIF. This result may be due to the different teaching/learning media used in teaching each of the two groups, to different teacher/student interaction, or to the number of years homeschooled before taking the ACT mathematics test. More investigative research is needed in this regard.

3. In general, there are more homeschoolers than non-homeschoolers who got a specific item correct for total score categories less than 30. There are more non-homeschoolers than homeschoolers who got a specific item correct for total score categories more than 30.






      Figure 7. Expected frequencies across score categories.



Ideas for Further Research


This study indicates a need to analyze more datasets on standardized achievement tests to compare homeschoolers to non-homeschoolers in general and to public schoolers in particular. There is certainly a wide array of standardized achievement tests that can be used for analysis. Comparisons can include ACT datasets for content areas other than mathematics (e.g., English, AP tests, etc.). There is also a need to investigate how the demographics of homeschooled students have changed in the last 15 years and to relate those changes to results of standardized tests.

Another area that calls for more research is to investigate why DIF occurs on the item level when homeschoolers are compared to non-homeschoolers in this mathematics achievement test. The current research is limited to one test and two data sets. Hence, DIF analysis results cannot be generalized to other tests. This study indicates that more research is needed in this regard.




Ackerman, T. (2004). EXPFREQ2 [Computer software]. Author.

Ackerman, T. (2004). RESCALE [Computer software]. Author.

ACT (personal communication, March 27, 2003).

Bielick, S., Chandler, K., & Broughman, S. (2001). Home schooling in the United States: 1999 (NCES 2001-033). National Center for Education Statistics, U.S. Department of Education.

Calvery, R., et al. (1992). The difference in achievement between home schooled and public schooled students for grades four, seven, and ten in Arkansas. Paper presented at the Annual Meeting of the Mid-South Educational Research Association, Knoxville, TN, November 11-13.

Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, CA: Sage Publications.

Ray, B. (1997). Strengths of their own. Salem, OR: National Home Education Research Institute.

Rudner, L. (1999). Scholastic achievement and demographic characteristics of home school students in 1998. Education Policy Analysis Archives (No. 8).

Shealy, R., & Stout, W. (1993). A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika, 58(2).

Zimowski, M., Muraki, E., Mislevy, R. J., & Bock, R. D. BILOG [Computer software]. St. Paul, MN: Assessment Systems Corporation.




I would like to thank ACT for providing the requested data sets. I would like to thank Professor Bahjat Qaqish of UNC-Chapel Hill for providing valuable assistance with the SAS code, and Professor Terry Ackerman of UNC-Greensboro for valuable assistance in providing software for this study and for reviewing earlier versions of this paper.

I would like to thank Professors Susan Dennison and Jay Poole of UNC-Greensboro for reviewing this paper and providing many valuable suggestions. Also, I would like to thank Dr. Brian Ray and two anonymous reviewers who provided valuable suggestions to improve this paper. Many of their notes and suggestions were adopted in the final version of this study.
