Archive: 2024

Demystifying Reliability and Validity in Educational Research

Post prepared and written by Joe Tise, PhD, Senior Education Researcher

In the past, validity and reliability may have been explained to you by way of an analogy: validity refers to how close to the “bullseye” you can get on a dart board, while reliability is how consistently you throw your darts in the same spot (see figure below).

Four images of arrows and targets to represent when reliability and validity can be achieved.

Such an analogy is largely useful, but somewhat reductive. In this four-part blog series, I will dig a bit deeper into validity and reliability to show the different types of each, the different conceptualizations of each, and the relations between them. The structure and content of this blog post come largely from the Standards for Educational and Psychological Testing (2014), so I highly recommend you get a copy of that book to learn more.

Validity
The Standards (American Educational Research Association et al., 2014) define validity as “the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests.” This definition leads me to make an important distinction at the outset: a test is never valid or invalid—it is the interpretations and uses of that test and decisions made because of a test that are valid or invalid.

To illustrate this point, consider the following scenario. I want to measure students’ reading fluency. I dig into a big pile of data I collected from thousands of K-12 students and see that taller students can read longer and more complex books than shorter students. I say to myself:
“Great! To assess new students’ reading fluency, all I need to do is measure how tall they are. Taller students are better readers, after all. Thus, a measure of students’ height must be a valid test of reading fluency.”

Of course, you likely see a problem with my logic. Height may well be correlated with reading fluency (because older children tend to be taller and better readers than younger children), but clearly it is not a test of reading fluency. Nobody would argue that my measuring tape is invalid—just that my use of it to measure reading fluency is invalid. This distinction, obvious as it may seem, is the crux of contemporary conceptions of validity (American Educational Research Association et al., 2014; Kane, 2013). Thus, researchers ought never say a test is valid or invalid but rather, their interpretations or uses of a test are valid or invalid. We may, however, say that an instrument has evidence of validity and reliability while bearing in mind the relevance of such evidence may apply differentially among populations, settings, points in time, or other factors.
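To make the height example concrete, here is a minimal sketch in Python. The data are entirely made up for illustration (twelve hypothetical students at three ages), and the helper name pearson_r is my own. It shows how height and reading scores can correlate strongly across all students while being unrelated among students of the same age.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical students: (age in years, height in cm, reading score).
# These values are invented for illustration only.
students = [
    (6, 112, 22), (6, 114, 18), (6, 116, 18), (6, 118, 22),
    (8, 124, 52), (8, 126, 48), (8, 128, 48), (8, 130, 52),
    (10, 136, 82), (10, 138, 78), (10, 140, 78), (10, 142, 82),
]

# Correlation across all students, ignoring age.
overall_r = pearson_r([s[1] for s in students], [s[2] for s in students])

# Correlation within each age group, which holds age constant.
within_age_r = {}
for age in sorted({s[0] for s in students}):
    group = [s for s in students if s[0] == age]
    within_age_r[age] = pearson_r([s[1] for s in group], [s[2] for s in group])

print("overall:", round(overall_r, 2))
print("within age:", {age: round(r, 2) for age, r in within_age_r.items()})
```

In this invented data set the overall correlation is about 0.97, yet within each age group it is zero. Height tracks reading only because both track age, which is why using height as a measure of reading fluency is invalid.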

Reliability
A similar distinction must be made about reliability—reliability refers to the data, rather than the test itself. A test that produces reliable data will produce the same result for the same participants after multiple administrations, assuming no change in the construct has occurred (e.g., assuming one did not learn more about math between two administrations of the same math test). Thus, The Standards define reliability as “the more general notion of consistency of the scores across instances of the testing procedure.”

But how can you quantify such consistency in the data across testing events? Statisticians have several ways to do this, each differing slightly depending on their theoretical approach to assessment. Each approach utilizes some form of a reliability coefficient, or “the correlation between scores on two equivalent forms of the test, presuming that taking one form has no effect on performance on the second form.” There are many theories of assessment, but three of the most common are Classical Test Theory (Gulliksen, 1950; Guttman, 1945; Kuder & Richardson, 1937), Generalizability Theory (Cronbach et al., 1972; Suen & Lei, 2007; Vispoel et al., 2018), and Item Response Theory (IRT) (Baker, 2001; Hambleton et al., 1991). This blog post is too broad in scope to detail each of these theories, but just know that each differs in the assumptions it makes about assessment, the terminology it uses, and its implications for how one quantifies data reliability.
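As a rough illustration of the quoted definition, the sketch below computes a reliability coefficient as the Pearson correlation between scores on two equivalent forms of a test. The scores are hypothetical, the variable and function names are my own, and this is only the simplest parallel-forms reading of the definition, not the specific procedure of any one of the theories named above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical scores for the same six students on two equivalent forms.
# Position i in each list belongs to the same student.
form_a_scores = [12, 15, 9, 20, 17, 11]
form_b_scores = [13, 14, 10, 19, 18, 10]

# Reliability coefficient: correlation between the two sets of scores.
reliability = pearson_r(form_a_scores, form_b_scores)
print("reliability coefficient:", round(reliability, 2))
```

For these invented scores the coefficient comes out at about 0.96, meaning students who scored high on one form also tended to score high on the other. A value near zero would suggest the two forms are not producing consistent data.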

What’s Next?
This post only introduces these two terms. The next three posts, to be published over the next few weeks, will discuss validity and reliability in more depth for both quantitative and qualitative approaches.

References
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for Educational and Psychological Testing. American Educational Research Association.
Baker, F. B. (2001). The Basics of Item Response Theory (2nd ed.). ERIC Clearinghouse on Assessment and Evaluation. https://eric.ed.gov/?id=ED458219
Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements: Theory of generalizability for scores and profiles. John Wiley and Sons.
Gulliksen, H. (1950). Theory of Mental Tests. Wiley.
Guttman, L. (1945). A basis for analyzing test-retest reliability. Psychometrika, 10(4), 255–282. https://doi.org/10.1007/BF02288892
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of Item Response Theory. Sage Publications, Inc.
Kane, M. (2013). The argument-based approach to validation. School Psychology Review, 42(4), 448–457. https://doi.org/10.1080/02796015.2013.12087465
Kuder, G. F., & Richardson, M. W. (1937). The theory of the estimation of test reliability. Psychometrika, 2(3), 151–160. https://doi.org/10.1007/BF02288391
Suen, H. K., & Lei, P.-W. (2007). Classical versus Generalizability theory of measurement. Educational Measurement, 4, 1–13.
Vispoel, W. P., Morris, C. A., & Kilinc, M. (2018). Applications of generalizability theory and their relations to classical test theory and structural equation modeling. Psychological Methods, 23(1), 1–26. https://doi.org/10.1037/met0000107

Podcasts! Considering K-5 Computing Education Practices

We’re super excited to announce our long-awaited series on K-5 computing education practices!

Our podcasts provide insights from discussions among teachers as they consider meaningful research and how they could adopt new practices into their classrooms.

For educators, these podcasts are meant to provide you with information on various research studies that may be suitable for your classrooms.

For researchers, the podcasts are meant to incite reflection and further inquiry into how teachers interpret research in context with their classrooms. As we continue our work closing the gap between researchers and practitioners, the discussions can give researchers additional perspectives that they may not already have.

Special thanks to the Association for Computing Machinery (ACM) SIGCSE for funding to support this work through a Special Projects grant! We also thank our additional sponsors, Amazon Future Engineer and Siegel Family Endowment, who support our outreach efforts at IACE.

And special thanks to Emily Thomforde for tirelessly leading the discussion groups every week for many years. Shout out to Jordan Williamson (IACE), Emily Nelson (IACE), and Monica McGill (IACE) for creating, modifying, and reviewing the podcasts and briefs!

Either way, we hope you enjoy the podcasts!

Join Us at the 2024 ACM SIGCSE Technical Symposium

We’re always excited to attend the ACM SIGCSE Technical Symposium, and this year is no exception!

You can catch IACE team members (Laycee Thigpen, Joe Tise, Julie Smith, and Monica McGill) at the following events. (Pre-symposium events are invitation only.)

For all the rest, please stop by and say Hi! We’d love to hear about research you’re engaged in that supports learning for all students!

Day/Time | Event | Authors/Presenters | Location
Tuesday, All day | Reimagining CS Pathways (Invitation only) | Bryan Twarek and Jake Karossel (CSTA), Julie Smith and Monica McGill (IACE) | Off-site
Wednesday, All day | Reimagining CS Pathways (Invitation only) | Bryan Twarek and Jake Karossel (CSTA), Julie Smith and Monica McGill (IACE) | Off-site
Wednesday, 1-5pm PST | Conducting High-quality Education Research in Computing Designed to Support CS for All (Invitation only) | Monica McGill (Institute for Advancing Computing Education), Jennifer Rosato (Northern Lights Collaborative), Leigh Ann DeLyser (CSforALL), Sarah Heckman (North Carolina State University), Bella Gransbury White (North Carolina State University) | Meeting Room E146
Thursday, 1:45-3pm PT | Unlocking Excellence in Educational Research: Guidelines for High-Quality Research that Promotes Learning for All | Monica McGill (IACE), Sarah Heckman (North Carolina State University), Michael Liut (University of Toronto Mississauga), Ismaila Temitayo Sanusi (University of Eastern Finland), Claudia Szabo (The University of Adelaide) | Portland Ballroom 252
Thursday, 3:45-5pm PT | The NSF Project Showcase: Building High-Quality K-12 CS Education Research Across an Outcome Framework of Equitable Capacity, Access, Participation, and Experience | Monica McGill (IACE) | Meeting Rooms E143-144
Friday, 10am PT | The Landscape of Disability-Related K-12 Computing Education Research (poster) | Julie Smith (IACE), Monica McGill (IACE) | Exhibit Hall E
Friday, 10:45am PT | Piloting a Diagnostic Tool to Measure AP CS Principles Teachers’ Knowledge Against CSTA Teacher Standard 1 | Monica McGill (IACE), Joseph Tise (IACE), Adrienne Decker (University at Buffalo) | Meeting Room D136
Saturday, 10am PT | Reimagining CS Courses for High School Students (poster) | Julie Smith (IACE), Bryan Twarek (CSTA), Monica McGill (IACE) | Exhibit Hall E

Key Levers for Advancing K-12 Computer Science Education in Chicago, in Illinois, and in the United States

Computer science has become an essential skill for K-12 students. As the demand for computing jobs grows, there is a pressing need to advance K-12 CS education across the nation. To achieve this, there are several key levers that can advance change, including policy changes, teacher training and development, increased access to technology and resources, and partnerships between educational institutions, non-profits, and industry leaders. By leveraging these, we can equip students with the skills they need to thrive in an increasingly digital world and drive innovation and progress.

With funding and direction from the CME Group Foundation, we took a look at K-12 computer science education in Chicago and Illinois, in context with efforts across the United States. We are pleased to announce the resulting publication, Key Levers for Advancing K-12 CS Education in Chicago, in Illinois and in the United States.

In particular, the Foundation funded this study to understand:

  • How the landscape of K-12 CS education in Chicago has changed across 2013-2022, with a focus on public schools, out-of-school-time (OST) programs, and research for evidence of progress.
  • The current strengths and opportunities of the K-12 CS education landscape in Chicago, in Illinois, and nationally.
  • How the support from the Foundation since it first started funding K-12 CS education in Chicago in 2015 has influenced the CS education landscape.

This qualitative study, conducted by Laycee Thigpen, Annamaria Lu, Monica McGill (all from the Institute for Advancing Computing Education), and Eva Giglio (CME Group Foundation), involved 49 interviews with 57 people in total. The interviewees represented a wide variety of organizations and voices.

Key findings for Chicago Public Schools (CPS) include the need to:

  • Support consistency and fidelity across schools
  • Continue to address the teacher shortage and to support the need for teacher CS professional development
  • Support research within CPS to inform decision-making to improve equitable outcomes for all students
  • Support workforce pathways for high school students
  • Support expanded K-8 CS, including integration into other subject areas
  • Support the design of scaffolded, standards-based curriculum

Specific to out-of-school-time programs, we found that there is a need to support the creation, implementation, and maintenance of ways to search for CS learning opportunities and for program providers to also engage in partnerships with schools.

The report also details more findings for Illinois–some of which are similar, others that differ to meet the unique needs of rural communities.

We look forward to hearing your thoughts on the report!