How do you know if your evaluation instrument is “good”? Or if the instrument you find on csedresearch.org is a decent one to use in your study?

Evaluation instruments (like surveys, questionnaires, and interview protocols) can go through their own evaluation to assess whether or not they are reliable or valid. In the Filters section on the Evaluation Instruments page, you can find a category called Assessed, which lets you include in your search instruments that have been evaluated for reliability and validity. So, what do these measures mean? And what is the difference between them?

Evaluation instruments are often designed to measure the impact of outreach activities, curriculum, and other interventions in computing education. But how do you know if these evaluation instruments actually measure what they say they are measuring? We gain confidence in these instruments by measuring their reliability and validity.

Reliable instruments yield consistent results each time they are administered. Let’s say that you created an evaluation instrument in computing education research, and you gave it to the same group of high school students four times at (nearly) the same time. If the instrument were reliable, you would expect the results of these four administrations to be the same, statistically speaking.
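One common way to put a number on "the same, statistically speaking" is to correlate the scores from two administrations of the instrument (often called test-retest reliability). The sketch below is only an illustration under stated assumptions: the scores are made up, the function name `pearson_r` is hypothetical, and the Pearson correlation is just one of several statistics a study might use.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Made-up scores for the same six students on two administrations
# of the same instrument (illustrative data, not from any real study).
first_administration = [12, 15, 9, 20, 17, 11]
second_administration = [13, 14, 10, 19, 18, 11]

r = pearson_r(first_administration, second_administration)
print(f"Test-retest correlation: {r:.2f}")
```

A correlation close to 1 suggests that students who scored high the first time also scored high the second time. What counts as "close enough" depends on the field and the purpose of the instrument.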

Validated instruments are those that have been checked in one or more ways to determine whether or not the instrument measures what it is supposed to measure. So, if your instrument is designed to measure whether or not parental support of high school students taking computer science courses is positively correlated with their grades in these courses, then statistical tests and other steps can be taken to ensure that the instrument does exactly that.

Those are still very broad definitions. Let’s break them down some more. But before we do, there is one very important caveat.

Reliability and validity are checked for a particular demographic in a particular setting. Using a validated, reliable instrument does not mean that the instrument is reliable and valid in your setting. It can, however, provide greater confidence than an instrument that has not been validated or shown to be reliable. And if you are able to find an instrument that has been validated with a population similar to your own (e.g., Hispanic students in an urban middle school), this can provide even greater confidence.

Now, let’s take a look at what each of these terms means and how it can be measured.
