Assessment plans typically contain a strategy for measuring the intervention or program. These plans define what factors need to be measured and among which populations, as well as how those measurements will be made. Within any given plan, time and resources need to be carefully considered, including the impact of the assessment on the participants.

Although such plans can be very detailed, we offer some high-level steps here, along with case studies showing how to identify the factors to be measured and then search for an existing instrument that evaluates those factors and may meet your needs. By using an existing instrument (as is or with modifications), your team can save the time and resources that would be needed to create a brand-new instrument.

Common Steps for Measurement

The major steps for measurement include:

  1. Refer back to your project’s original goals and pre-defined outcomes, then identify:
    a. The factors that you want to measure and
    b. Who will be measured.
  2. Determine the most appropriate design and data collection methods, including whether the study will be qualitative or quantitative and whether you will use a randomized controlled trial, a one-group pre-test/post-test design, or another study design.
  3. Search the literature to find studies similar to yours (both in terms of goals and population). Extract the constructs (or measurement scales) described and determine which (if any) are relevant for your population (e.g., “motivation to succeed” or “classroom engagement”). If your project has a broad range of goals, it may be necessary to extract different constructs from different resources. In some cases, constructs may not yet have been defined in articles about CS education, but may have been explored in other STEM fields like math or engineering and can be adapted for CS education.
  4. Search on these terms and constructs to find evaluation instruments (including interview guides or templates for ethnographic research for qualitative studies) that may already exist. If an instrument is found, determine whether it needs any additions or modifications to better meet your needs (that is, to match your goals). If no instrument is found, you may need to develop your own.
  5. Define your measurement protocol, including the data collection techniques and the type of analysis that will be conducted on the data.
  6. Collect and analyze your data according to your protocol.
  7. Gather evidence of the reliability and validity of the instrument as used in your study (a minimal analysis sketch follows this list).
  8. Compare the data against your goals and pre-defined outcomes to determine how well you met your goals and what areas could be further studied and/or improved.
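
To make steps 6 and 7 concrete, the sketch below shows one common way to gather reliability evidence (Cronbach's alpha for a set of Likert-scale items) and to compare pre-test and post-test scale scores with a paired t-test, as in a one-group pre-test/post-test design. It is a minimal example in Python, and the file name and column names (survey_responses.csv, pre_interest_q1, and so on) are hypothetical placeholders rather than items from any particular instrument.

```python
import pandas as pd
from scipy import stats

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Reliability estimate (Cronbach's alpha) for one scale.

    Rows are respondents; columns are the individual items of the scale.
    """
    items = items.dropna()
    k = items.shape[1]                                  # number of items
    item_variances = items.var(axis=0, ddof=1).sum()    # sum of item variances
    total_variance = items.sum(axis=1).var(ddof=1)      # variance of total score
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical file and column names -- replace with those from your protocol.
responses = pd.read_csv("survey_responses.csv")
pre_items = responses[["pre_interest_q1", "pre_interest_q2", "pre_interest_q3"]]
post_items = responses[["post_interest_q1", "post_interest_q2", "post_interest_q3"]]

# Step 7: reliability evidence for the scale as used in this study.
print(f"Cronbach's alpha (pre-test): {cronbach_alpha(pre_items):.2f}")

# Steps 6 and 8: compare mean pre- and post-test scale scores (paired t-test).
pre_scores = pre_items.mean(axis=1)
post_scores = post_items.mean(axis=1)
t_stat, p_value = stats.ttest_rel(post_scores, pre_scores)
print(f"Pre/post change: t = {t_stat:.2f}, p = {p_value:.3f}")
```

The appropriate analysis depends on your study design; for a randomized controlled trial or a multi-group comparison, a between-groups test would replace the paired t-test.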

Align your measurement goals with your project’s well-defined goals. Once these are aligned, it becomes easier to find the instruments that include the constructs you want to measure.

Case Study 1: A CS-themed AISL Proposal

In a proposal to NSF’s Advancing Informal STEM Learning (AISL) program (NSF 17-573), Maggie, a Principal Investigator (PI), proposes a project to pilot test a new approach to bolstering student interest in computing among high school students by infusing CS content into a large, urban high school’s STEM club. The purpose of the modification would be to expose high school students to real-world CS content by inviting local industry professionals who specialize in CS to share projects that the students could work on to solve specific problems.

The specific goals of the project were devised so that they were quantitatively measurable and could be tied back to specific noncognitive constructs. The two primary goals were:

  1. To increase high school student confidence in computer science – especially among students underrepresented in the field.
  2. To increase high school student interest in computer science and their desire to use it in future academic and career pursuits – especially among students underrepresented in the field.

In order to demonstrate the impact of participation in the revised STEM club activities, it would be necessary to document CS interest and CS self-efficacy among several groups (a brief comparison sketch follows the list below):

  • STEM club students participating in the CS-specific STEM club activities
  • STEM club students not participating in the CS-specific STEM club activities
  • A comparison group of non-STEM club students at the school (possibly including a sub-group of students who have taken some CS coursework)
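
Once data are collected from these groups, a simple way to examine whether CS interest differs across them is a between-groups comparison such as a one-way ANOVA. The sketch below, in Python, assumes a hypothetical data file with a group label column and a column holding each student's mean interest score; the file, column, and group names are placeholders, not part of any instrument.

```python
import pandas as pd
from scipy import stats

# Hypothetical columns: 'group' labels each student as 'stem_club_cs',
# 'stem_club_other', or 'comparison'; 'interest_score' is that student's
# mean score on the CS interest scale.
data = pd.read_csv("cs_interest_scores.csv")

group_scores = [scores["interest_score"].dropna()
                for _, scores in data.groupby("group")]

# One-way ANOVA: do mean CS interest scores differ across the groups?
f_stat, p_value = stats.f_oneway(*group_scores)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
```

The same pattern applies to the self-efficacy scale, and pairwise follow-up comparisons (or a regression with group indicators and background covariates) can help identify which groups differ.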

A background literature search yielded two distinct constructs, CS interest and CS self-efficacy, both of which are noncognitive. Knowing this, Maggie visited the evaluation instrument page on CSEdResearch.org and selected:

  • A Focus Area of Computing,
  • Demographic, then 9th-12th,
  • Student Engagement, then under this category selected Self-Efficacy and Interest (Computing), and
  • Quantitative/Qualitative, then under this category selected Quantitative.

This search revealed a number of potentially relevant instruments (self-efficacy, in particular, was a commonly examined construct). Maggie found one instrument, the Computer Science Interest Survey, which covered both relevant constructs and had been used with a population similar to the one being examined in this study (high school students).

In order to keep the instrument intact and measure both constructs in the same way they were measured in the original study, Maggie used all of the questions from the survey. However, the only background questions asked on the original survey were about prior course-taking and gender. She added questions about race/ethnicity and whether or not the student had a disability in order to fully address the research questions related to broadening participation in computing.

Case Study 2: A Researcher Practitioner Partnership (RPP)

In a proposal to NSF’s Computer Science for All: Researcher Practitioner Partnership program (NSF 18-537), Jorge, the Principal Investigator (PI), and his team propose a project that will train high school teachers to teach a nationally-known CS curriculum with a set of modifications to make it more culturally responsive and engaging for students from underrepresented backgrounds.

The specific goals of the project were devised so that they were measurable and could be tied to both cognitive and noncognitive constructs. These goals were:

  1. To increase high school student interest in computer science – especially among students underrepresented in the field.
  2. To increase the extent to which high school students saw CS coursework and skills as relevant to their future academic and career pursuits – especially among students underrepresented in the field.
  3. To improve high school students’ knowledge in the specific areas emphasized most by the course.

In order to demonstrate the uniqueness and impact of the revised course, it would be necessary to use instruments that measure the relevant constructs under at least three different conditions:

  1. Within the version of the class using the nationally-known curriculum that was modified with new resources and pedagogical techniques.
  2. Within a version of the same class as above that was not using the modified resources and pedagogy.
  3. Within another CS course in the same school that was at a similar level, but not using the same curriculum (depending on course content, the knowledge assessment may or may not apply).

Based on the goals, Jorge performed a background literature search that surfaced a number of different constructs. These constructs and related instruments are described below, organized by the goal to which they correspond:

Goal 1 – Interest in CS

The most relevant construct was CS interest, which is noncognitive. Jorge then visited the evaluation instrument page on CSEdResearch.org and selected:

  • Focus Area of Computing,
  • Demographic, then 9th-12th,
  • Student Engagement, then under this category selected Interest (Computing), and
  • Quantitative/Qualitative, then under this category selected Quantitative.

Through this process, Jorge discovered an instrument from The Barriers and Supports to Implementing Computer Science (BASICS) study, which includes a large number of batteries of student questions, one of which focuses specifically on CS interest (Outlier Research & Evaluation, 2017).

Goal 2 – Relevance

The most appropriate construct was “perceived relevance of computer science,” which is also noncognitive. This construct was also present in the BASICS Study and could be taken from this instrument. To be sure that this was the best instrument to use, however, he ran the search again with these terms:

  • Focus Area of Computing,
  • Demographic, then 9th-12th,
  • Student Engagement, then under this category selected Relevance, and
  • Quantitative/Qualitative, then under this category selected Quantitative.

After exploring the other instruments, he decided to stay with the scale he found in the BASICS instrument.

Goal 3 – Knowledge

In order to address content knowledge, Jorge decided that it was best both to examine self-efficacy around knowledge (noncognitive) and to administer a content knowledge assessment focused on the key content areas (algorithms and abstraction). First, he searched for self-efficacy on the evaluation instrument page with these criteria:

  • Focus Area of Computing,
  • Demographic, then 9th-12th,
  • Student Engagement, then under this category selected Self-Efficacy, and
  • Quantitative/Qualitative, then under this category selected Quantitative.

Jorge discovered the Computational Thinking Self-Efficacy Survey and decided that this instrument might work.

For the cognitive (knowledge) instrument, he revisited the evaluation instrument page on CSEdResearch.org and selected:

  • Focus Area of Computing,
  • Demographic, then 9th-12th,
  • Content Knowledge, then under this category selected Computational Thinking and Programming

This enabled him to quickly narrow down which instruments might work best (or, alternatively, to identify areas where no existing instrument was available). Since he did not find one that suited his requirements, he performed broader searches and ultimately decided to create his own instrument for measuring the specific CS knowledge areas he was targeting.

Demographic Data

Finally, given that neither of the instruments Jorge borrowed constructs from contained all of the background questions he was interested in, he decided to use NSF’s guidelines on broadening participation in computing to inform those questions (NSF, 2019). Using the defined areas of underrepresentation from this document, Jorge compiled a brief set of survey items asking individuals to classify themselves in the relevant racial/ethnic, gender, and socioeconomic categories.

References

National Science Foundation. (2019). Retrieved from https://s27944.pcdn.co/wp-content/uploads/2019/02/White-Paper-on-CISE-BPC-Plans-External-FINAL.pdf

Outlier Research & Evaluation. (2017, September). BASICS Study ECS Student Implementation and Contextual Factor Questionnaire Measures [Measurement scales]. Chicago, IL: Outlier Research & Evaluation at UChicago STEM Education, University of Chicago.

Weese, J. L., & Feldhausen, R. (2017). STEM outreach: Assessing computational thinking and problem solving. In ASEE Annual Conference and Exposition, Conference Proceedings.

Cite this page

To cite this page, please use:

Xavier, Jeffrey and McGill, Monica M. (2019). Choosing an Evaluation Instrument. CSEdResearch.org. Retrieved from https://csedresearch.org/resources/conducting-research/choosing-an-evaluation-instrument/