Comprehensive Technical Analysis of the West Virginia General Summative Assessment (WVGSA) 2024: Aggregate-Level Item Reporting and Instructional Applications
1. Introduction: The Strategic Landscape of Assessment in West Virginia
The administration of the West Virginia General Summative Assessment (WVGSA) represents the culmination of the annual academic cycle for public education within the state. As the primary instrument for measuring student mastery of the West Virginia College- and Career-Readiness Standards (WVCCRS), the WVGSA serves a dual function: it meets the federal accountability mandates established by the Every Student Succeeds Act (ESSA) of 2015, and, more critically, it provides a rich repository of diagnostic data intended to drive instructional improvement. For the 2023-2024 academic year, the West Virginia Department of Education (WVDE) continued its strategic partnership with Cambium Assessment, Inc. (CAI), leveraging a sophisticated online delivery and reporting platform that allows for unprecedented granularity in data analysis.
At the heart of this reporting ecosystem lies the Aggregate-Level Item Report. This specific report, accessible to educators through the secure West Virginia Centralized Reporting System (CRS), offers a capability that far exceeds traditional score reporting. By listing every item administered to a student cohort—aggregated by class, school, or district—and sorting these items by Reporting Category, the system allows educators to pinpoint precise instructional gaps. It transforms assessment from an "autopsy" of student failure into a diagnostic roadmap for curriculum adjustment. This report centers on a specific workflow: identifying "frequently missed items" sorted by Reporting Category. This seemingly simple action requires a solid understanding of psychometrics, interface design, and content standards to execute effectively.
This comprehensive research report serves as an exhaustive guide to the WVGSA 2024 Aggregate-Level Item Report. It explores the theoretical underpinnings of the assessment, the architecture of the reporting platform, the specific definitions of the reporting categories across grades and subjects, and the technical validity that justifies the use of this data for high-stakes instructional decisions. Furthermore, it elucidates the complex relationship between the summative data provided in these reports and the formative practices required to improve student outcomes in subsequent years.
1.1 The Policy Context: WV-MAP and ESSA
The WVGSA does not exist in a vacuum. It is the flagship component of the West Virginia Measures of Academic Progress (WV-MAP), governed by West Virginia Board of Education Policy 2340. This policy framework mandates that all students in grades 3-8 be assessed annually in English Language Arts (ELA) and mathematics, with science assessments occurring in grades 5 and 8.
The shift to the WVGSA from previous legacy assessments marked a transition to more rigorous standards and a more transparent reporting philosophy. Under ESSA, states are required to provide data that is not only valid and reliable but also actionable for stakeholders. The WVDE meets this requirement through a bifurcated reporting strategy:
ZoomWV: The public-facing transparency portal that provides high-level data on enrollment, graduation rates, and aggregate proficiency to the community.
West Virginia Centralized Reporting System (CRS): The secure, role-based platform where educators access the granular Aggregate-Level Item Report.
It is within the secure CRS that the "Item Report" resides. Policy 2340 and the accompanying technical documentation emphasize that while the scores are for accountability, the item data is for instruction. The aggregate report allows district leaders to see systemic trends—such as a county-wide weakness in "Geometry" or "informational text"—that would be invisible in a simple proficiency percentage.
1.2 The Transition to Online and Adaptive Testing
The WVGSA is administered as an online summative test. This digital format is crucial for the functionality of the Aggregate-Level Item Report. In a paper-based era, item analysis was often delayed by months. With the CAI platform, results are typically available within 24 hours of test completion. This immediacy allows data analysis to begin while the instructional year is still fresh in the minds of educators, or immediately upon the return of staff in August.
Furthermore, the online format supports a variety of item types beyond multiple-choice, including Technology-Enhanced Items (TEIs) and Equation Response items. The Aggregate-Level Item Report must therefore aggregate performance data across these disparate formats, presenting a unified view of student ability whether the question asked them to select A, B, C, or D, or to drag and drop a fraction onto a number line.
2. Psychometric Foundations of the Item Report
To interpret the Aggregate-Level Item Report correctly, one must understand how the items land on the report in the first place. The WVGSA is built upon the principles of Evidence-Centered Design (ECD) and utilizes Item Response Theory (IRT) for scoring and scaling.
2.1 Evidence-Centered Design (ECD)
ECD is a methodology that ensures every item on the test is there for a reason. It starts with a "Claim" (a broad statement about what students should know), breaks that down into "Targets" (specific domains of knowledge), and then generates "Items" to elicit evidence for those targets.
The Chain of Evidence: When a teacher sees an item on the Aggregate Report sorted by "Reporting Category," they are looking at the final link in this chain. The Reporting Category represents the Claim or a major cluster of Targets. The item is the specific instrument used to measure it.
Implication for Analysis: If a specific item is frequently missed, it breaks the chain of evidence. It suggests that the students do not possess the specific skill (Target) necessary to support the broader claim (Reporting Category).
2.2 The Independent College and Career Readiness (ICCR) Item Bank
The items displayed on the report are drawn from the Independent College and Career Readiness (ICCR) item bank. This is a shared repository of rigorously developed items used by multiple states (formerly associated with the Smarter Balanced consortium).
Item Validity: Each item in the bank has undergone multiple rounds of review, including bias and sensitivity reviews, to ensure it is fair for all subgroups.
Item Parameters: Each item carries statistical parameters (difficulty, discrimination) derived from field testing. When the Aggregate Report shows "Percent Correct," it is displaying the empirical performance of the current student group, which can be compared against these known parameters to determine if the performance is anomalous.
2.3 Computer-Adaptive Testing (CAT) vs. Fixed Form
The WVGSA ELA and Mathematics assessments utilize Computer-Adaptive Testing (CAT) engines. This means that the specific items a student sees are tailored to their ability level. A student answering correctly gets harder questions; a student answering incorrectly gets easier ones.
Impact on Aggregate Reporting: This complicates the "Aggregate-Level Item Report." In a fixed-form test, every student sees Item #1. In a CAT, only a subset of students might see Item #1.
Reporting Solution: The CRS Aggregate Item Report typically aggregates data for all items seen by any student in the group. It calculates the statistics based on the specific students who were administered that item. This makes the "Number of Students" column in the report critical—an item seen by only 5 high-performing students will have different statistical properties than an item seen by the whole class.
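To make the N-count caveat concrete, here is a minimal sketch (in Python, using invented column names rather than any actual CRS export schema) of how per-item statistics behave when an adaptive test gives each student a different item set: each item's percent correct is computed only over the students who were actually administered that item.

```python
# Minimal sketch: per-item aggregation when an adaptive test gives each
# student a different item set. Column names are hypothetical, not the
# actual CRS export schema.
import pandas as pd

# One row per student-item pair actually administered (adaptive: items vary).
responses = pd.DataFrame({
    "student_id": ["s1", "s1", "s2", "s2", "s3"],
    "item_id":    ["M-101", "M-102", "M-101", "M-103", "M-103"],
    "correct":    [1, 0, 0, 1, 1],
})

item_stats = (
    responses.groupby("item_id")["correct"]
    .agg(n_students="count", percent_correct="mean")
    .assign(percent_correct=lambda d: (100 * d["percent_correct"]).round(1))
)
print(item_stats)  # n_students differs by item, so interpret small-N items cautiously
```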
3. The Architecture of the Centralized Reporting System (CRS)
The mechanism for accessing the Aggregate-Level Item Report is the West Virginia Centralized Reporting System (CRS), a component of the broader CAI assessment suite. Understanding the architecture and navigation of this system is a prerequisite for data access.
3.1 System Access and User Roles
Access to the CRS is strictly controlled to comply with the Family Educational Rights and Privacy Act (FERPA) and WVDE Policy 4350 regarding data privacy. The view a user sees in the Aggregate-Level Item Report depends entirely on their assigned role:
District Test Coordinator (DC) / District Administrator (DA): These users have the widest view. They can aggregate item data across the entire county. This allows them to answer questions like, "Is the weakness in fraction division a problem at Elementary School A, or is it a district-wide curriculum issue?"
School Test Coordinator (SC) / Principal: These users see data aggregated at the school level. They use the report to identify grade-level trends and guide Professional Learning Communities (PLCs).
Teacher (TE): Teachers typically see data rostered to their specific students. While the "Aggregate" report usually refers to school/district levels, the teacher's view is an aggregation of their specific classroom sections. This allows them to see which items their specific students missed, enabling targeted re-teaching in the fall.
3.2 Navigation to the Item Report
The user journey to the Aggregate-Level Item Report is designed to move from the general to the specific.
Portal Entry: Users navigate to the WVGSA portal (e.g., wv.portal.cambiumast.com) and log in.
Dashboard: The landing page provides a high-level summary of proficiency rates and scale scores.
Drill-Down: The user selects a specific assessment (e.g., "Spring 2024 WVGSA Grade 5 Mathematics").
Report Tab: Within the subject detail view, the user selects the "Item Analysis" or "Item Report" tab. This is the specific interface that houses the Aggregate-Level Item Report.
3.3 The Interface: Visualizing the Data
The CRS interface utilizes visual data representations to make the dense item statistics digestible.
Heat Maps: Performance metrics are often color-coded. A "Percent Correct" significantly below the state average might appear in red or orange, while high performance appears in green. This visual cue directs attention immediately to the frequently missed items.
Sorting Capabilities: The interface allows dynamic sorting. Clicking the "Reporting Category" header groups all items by their content domain. This is the critical functionality that allows educators to see patterns (e.g., a cluster of red "Below Standard" indicators in the "Geometry" block).
4. Anatomy of the Aggregate-Level Item Report
The core of the analysis rests on understanding the data fields presented in the Aggregate-Level Item Report. Each row in the report represents a single test item, and the columns provide the metadata and performance statistics necessary for interpretation. While specific layouts can evolve, the 2024 report follows standard CAI reporting specifications for West Virginia.
4.1 Item Metadata Columns
These columns define what the item is.
Item ID: A unique alphanumeric code (e.g., MATH-1234). This ID is the key to cross-referencing. Educators can sometimes use the "Teacher Item Previewer" (if available) or released item repositories to view the structure of the item using this ID.
Reporting Category: The primary sorting variable for this analysis. It labels the broad content domain (e.g., "Number and Operations in Base Ten"). Sorting by this column clusters all related items, revealing whether a "missed item" is an outlier or part of a systemic trend.
Standard / Target: A more granular classification. For example, within the Reporting Category of "Geometry," the standard might be "5.G.1" (Graph points on the coordinate plane). This column is vital for connecting the assessment back to the specific lesson plan or textbook chapter.
Item Type: Indicates the format of the question. Common types in WVGSA include:
MC (Multiple Choice): Standard selection.
MS (Multi-Select): Choose all that apply.
EQ (Equation Response): Students type a number or equation.
TEI (Technology Enhanced): Drag-and-drop, hot text, grid interaction.
EXT (Extended Response): (ELA Writing) Essay responses.
4.2 Performance Metric Columns
These columns define how students performed.
Number of Students (N-Count): The number of students in the current aggregate group (school or district) who were administered this specific item. In an adaptive test, this number may vary from item to item.
Percent Correct (P-Value): The percentage of students in the group who answered the item correctly. This is the primary indicator of "frequently missed." A low percentage (e.g., 30%) indicates a difficult item or a missed concept.
State Average: The average percent correct for the same item across the entire state of West Virginia.
Analysis: This is the benchmark. If the School Percent Correct is 40% and the State Average is 42%, the item was simply hard for everyone. If the School is 40% and the State is 75%, the school has a specific instructional gap.
Difference / Residual: The mathematical difference between the group score and the state average. Large negative numbers highlight areas for immediate concern.
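A hedged sketch of the comparison described above: compute the difference between a school's percent correct and the state average, then flag large negative gaps. The data frame, column names, and the 10-point threshold are illustrative assumptions, not the CRS's internal logic.

```python
# Minimal sketch: compare a school's percent correct against the state average
# and flag large negative gaps. Values and column names are illustrative only.
import pandas as pd

report = pd.DataFrame({
    "item_id":            ["M-101", "M-102", "M-103"],
    "reporting_category": ["Geometry", "Geometry", "Fractions"],
    "school_pct_correct": [40.0, 38.0, 71.0],
    "state_pct_correct":  [75.0, 42.0, 70.0],
})

report["difference"] = report["school_pct_correct"] - report["state_pct_correct"]
flagged = report[report["difference"] <= -10]  # e.g., more than 10 points below state
print(flagged[["item_id", "reporting_category", "difference"]])
```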
4.3 Strength and Weakness Indicators
A unique feature of the Cambium reporting suite is the calculation of Relative Strength/Weakness.
Mechanism: The system compares the group's performance on the item to the group's performance on the test as a whole.
Interpretation:
Strength Icon: Students performed better on this item than their overall ability would predict.
Weakness Icon: Students performed worse on this item than their overall ability would predict.
Value: This allows educators to identify specific content deficits even in low-performing groups. If a class has a 30% average on the whole test, a 30% on an item isn't a "weakness" relative to their baseline. But a 10% on an item is. It helps prioritize remediation.
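The operational indicator in the Cambium reporting suite is model-based and proprietary; the following is only a simplified illustration of the underlying idea, comparing each item's percent correct to the group's own baseline across all items. The 10-point threshold and the data are invented.

```python
# Simplified illustration of "relative strength/weakness": compare each item's
# percent correct to the group's overall baseline rather than to the state.
item_pct = {"M-101": 30.0, "M-102": 28.0, "M-103": 10.0, "M-104": 35.0}

overall = sum(item_pct.values()) / len(item_pct)   # group's baseline across items
for item, pct in item_pct.items():
    if pct <= overall - 10:
        label = "relative weakness"
    elif pct >= overall + 10:
        label = "relative strength"
    else:
        label = "near baseline"
    print(f"{item}: {pct:.0f}% vs baseline {overall:.0f}% -> {label}")
```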
5. Deep Dive: Reporting Categories by Subject and Grade
The utility of the report hinges on the "Reporting Categories." These are not arbitrary labels; they are the structural pillars of the WVCCRS. To use the report effectively, educators must understand what each category encompasses.
5.1 Mathematics Reporting Categories
The mathematics assessment is structured around conceptual categories that evolve as students progress from elementary to middle school.
Grades 3-5 (Elementary)
Operations and Algebraic Thinking: This category covers the foundations of arithmetic reasoning. In Grade 3, it involves multiplication and division. By Grade 5, it moves to writing and interpreting numerical expressions. High miss rates here often indicate a failure to transition from concrete counting to abstract operation.
Number and Operations in Base Ten: The mechanics of multi-digit arithmetic and place value. This is often the area with the highest item volume.
Number and Operations—Fractions: Historically a high-difficulty area. Items assess equivalence, ordering, and operations with fractions. The Item Report often reveals specific misconceptions here (e.g., adding denominators).
Measurement and Data: Covers unit conversion, area, perimeter, volume, and data representation (graphs).
Geometry: In elementary grades, this focuses on classifying shapes and understanding attributes (e.g., parallel lines, angles).
Grades 6-8 (Middle School)
The Number System: Extends elementary arithmetic to the full system of rational numbers, including negative numbers and absolute value.
Expressions and Equations: The transition to pre-algebra. Items require solving linear equations and understanding variables.
Functions (Grade 8 only): A critical category for algebra readiness, involving the definition and graphing of functions.
Geometry: Moves to solving real-world problems involving area, surface area, and volume of complex figures.
Statistics and Probability: Analyzing data distributions and understanding probability models.
5.2 English Language Arts (ELA) Reporting Categories
The ELA assessment integrates reading and writing, but the Item Report separates them to allow for distinct analysis of literacy skills.
Reading Literary Text: Items assess comprehension of fiction, drama, and poetry.
Skills: Identifying themes, analyzing character interactions, understanding figurative language.
Report Insight: Teachers can distinguish if students struggle with the text complexity (Lexile) or the literary analysis skills.
Reading Informational Text: Items based on non-fiction, historical documents, and technical texts.
Skills: Identifying main ideas, analyzing text structure, evaluating arguments and evidence.
Report Insight: This category often shows lower performance than Literary Text. Item analysis can reveal if the struggle is with vocabulary or the structure of arguments.
Writing and Language: This category is unique because it includes the essay component.
Writing Component: Students respond to a prompt (Opinion/Argumentative or Informative/Explanatory) based on read sources.
Language Component: Items test "editing" skills—grammar, punctuation, capitalization, and spelling.
Report Insight: The Item Report for the writing prompt typically provides rubric scores (e.g., Organization, Evidence, Conventions) rather than a simple right/wrong.
5.3 Science Reporting Categories (Grades 5 and 8)
Science is assessed only in terminal grades for elementary and middle school. The categories align with the West Virginia Next Generation Content Standards and Objectives.
Life Science: Covers ecosystems, heredity, molecular biology, and evolution.
Physical Science: Covers matter, force and motion, energy, and waves.
Earth and Space Science: Covers the solar system, earth systems (geology, weather), and human impact.
Disciplinary Core Ideas (DCI): The Item Report for science is often viewed through the lens of DCIs. Educators can see if students understand the cross-cutting concepts (e.g., Cause and Effect) applied within these categories.
6. Distractor Analysis: The Diagnostic Engine
One of the most powerful aspects of the Aggregate-Level Item Report, particularly when viewed at the roster or teacher level, is the potential for Distractor Analysis. While a simple "Percent Correct" tells educators that students missed an item, distractor analysis tells them why they missed it.
6.1 Understanding Distractors
In a multiple-choice item, the wrong answers are not random; they are "distractors" designed to represent common misconceptions.
Example Item (illustrative values): What is 1/3 + 1/4?
Correct Answer: 7/12
Distractor A: 2/7 (adding numerators and denominators—a common error).
Distractor B: 2/12 (adding the numerators but multiplying the denominators).
Distractor C: 1/12 (multiplying straight across instead of adding).
6.2 Interpreting Distractor Data on the Report
If the Aggregate Item Report shows that only 30% of students selected the correct answer, the next step is to look at the distribution of incorrect responses (if available in the drill-down view).
Scenario: 60% of students chose Distractor A (2/7, the "add across" answer).
Conclusion: The students are not guessing randomly. They are consistently applying a flawed algorithm (adding across). This is a specific instructional error that can be corrected with a targeted lesson on common denominators.
Scenario: The responses are evenly split between A, B, and C (20% each).
Conclusion: The students are likely guessing. They lack the fundamental concept entirely, or the question was too difficult.
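A small sketch of this decision rule, with invented response percentages: if a single distractor absorbs most of the wrong answers, treat it as a shared misconception; if wrong answers are spread roughly evenly, treat the pattern as guessing. The 60% cutoff is an arbitrary illustration, not a CRS setting.

```python
# Minimal sketch: classify an item's wrong-answer pattern as a dominant
# misconception vs. near-uniform guessing. Counts are illustrative percentages.
from collections import Counter

responses = Counter({"A": 60, "B": 5, "C": 5, "correct": 30})

wrong = {opt: pct for opt, pct in responses.items() if opt != "correct"}
total_wrong = sum(wrong.values())
top_option, top_pct = max(wrong.items(), key=lambda kv: kv[1])

if total_wrong and top_pct / total_wrong >= 0.6:
    print(f"Dominant distractor {top_option}: likely a shared misconception to re-teach")
else:
    print("Wrong answers are spread out: students may be guessing; re-teach the concept itself")
```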
6.3 Technical Quality of Distractors
The West Virginia technical reports (Volume 1) confirm that psychometricians review items to ensure that distractors are not "trick questions." A negative correlation is expected for distractors (students with high overall scores should not choose them). If a distractor has a positive correlation (high-scoring students are choosing it), the item is flagged as potentially ambiguous and removed from scoring. This quality assurance ensures that when teachers see students choosing a distractor on the report, it is a valid indicator of a misconception, not a bad test question.
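For readers who want to see the screening logic, here is a rough illustration (with invented data) of the distractor-total correlation check described above; the operational review is performed by the vendor's psychometricians on field-test data, not by schools.

```python
# Rough illustration: correlate "chose this distractor" (0/1) with students'
# total scores. A distractor that correlates positively with total score
# would be flagged for review. Data are invented.
import numpy as np

total_scores = np.array([12, 18, 25, 30, 34, 40, 45, 50])
chose_distractor_A = np.array([1, 1, 1, 0, 1, 0, 0, 0])  # mostly low scorers: expected pattern

r = np.corrcoef(chose_distractor_A, total_scores)[0, 1]  # point-biserial = Pearson here
print(f"distractor-total correlation: {r:.2f}",
      "(negative, as expected)" if r < 0 else "(positive: flag item for review)")
```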
7. Instructional Application: From Report to Classroom
The ultimate goal of the Aggregate-Level Item Report is to influence instruction. The report allows teachers to see which specific items were frequently missed; this section details the workflow for transforming that observation into action.
7.1 The "Frequently Missed Item" Protocol
Educators should adopt a systematic protocol when reviewing the report:
Sort by Reporting Category: Group the data to see the "forest" (Category) and the "trees" (Items).
Identify Red Flags: Highlight items with a Percent Correct significantly below the State Average (e.g., a gap of more than 10 percentage points).
Cluster Analysis:
Is it isolated? If only one item in "Geometry" is missed, check the item type. Was it a confusing interface?
Is it systemic? If 5 out of 7 "Geometry" items are missed, the issue is curricular. (A scripted sketch of the flagging and clustering steps follows this protocol.)
Standard Look-Up: Cross-reference the "Standard" column with the WVCCRS.
Example: All missed items map to Standard 4.NBT.B.5 (Multiply two two-digit numbers).
Curriculum Audit: Look at the pacing guide. When was this taught? Was it rushed before spring break? Was the textbook resource sufficient?
Action Plan: Develop a re-teaching plan or adjust the curriculum map for the next year to allocate more time to this standard.
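As referenced in the Cluster Analysis step, a minimal scripted version of steps 1-3 might look like the following. The column names, data, and 10-point flagging threshold are assumptions for illustration; the same logic can be applied to whatever item-level export a district pulls from the CRS.

```python
# Minimal sketch of protocol steps 1-3: sort by Reporting Category, flag items
# well below the state average, and check whether flags cluster in one category.
import pandas as pd

items = pd.DataFrame({
    "reporting_category": ["Geometry"] * 7 + ["Fractions"] * 3,
    "item_id": [f"M-{i}" for i in range(10)],
    "school_pct": [35, 40, 38, 33, 41, 72, 36, 70, 68, 74],
    "state_pct":  [70, 72, 69, 71, 73, 74, 70, 72, 71, 75],
})

items["gap"] = items["school_pct"] - items["state_pct"]
items["flagged"] = items["gap"] <= -10
summary = (items.sort_values("reporting_category")
                .groupby("reporting_category")["flagged"]
                .agg(items_flagged="sum", items_total="count"))
print(summary)  # e.g., 6 of 7 Geometry items flagged -> systemic, not isolated
```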
7.2 Professional Development (PD) Planning
District leaders use the Aggregate Item Report to design PD.
Data-Driven PD: If the report shows that students across the district are failing "Writing and Language" items related to elaborating on evidence, the district can bring in writing coaches to train teachers specifically on that skill. This moves PD from generic "best practices" to targeted intervention.
7.3 Leveraging Interim Assessments
The WVGSA is a summative, end-of-year event. However, West Virginia also offers Interim Assessments (Benchmarks and Modules) that mirror the summative test.
The Feedback Loop: If the Spring 2024 Item Report shows a weakness in "Fractions," teachers should not wait until Spring 2025 to see if it's fixed. They should administer the "Fraction Interim Module" in November 2024.
Reporting Consistency: The Interim Assessments use the same reporting system (CRS) and the same item bank structure. This allows for valid comparison: "We were weak in fractions in May; how are we doing on the fraction interim in November?"
7.4 Student-Involved Data Analysis
While the Aggregate Report is for educators, the insights can be shared with students. Teachers can say, "Class, I looked at our data, and we are rock stars at multiplication, but we let the 'Area and Perimeter' questions trick us. Let's focus on that this week." This builds student ownership and transparency.
8. Technical Validity, Reliability, and Fairness
The data in the Aggregate-Level Item Report is only useful if it is accurate. The West Virginia assessment system undergoes rigorous technical auditing to ensure the validity and reliability of the data.
8.1 Validity
Validity refers to whether the test measures what it claims to measure. The WVGSA uses Construct Validity evidence to show that the items align with the WVCCRS. The "Reporting Category" structure is validated through factor analysis to ensure that the items grouped under "Geometry" actually measure geometric reasoning and not just reading ability.
8.2 Reliability and Standard Error
Reliability refers to the consistency of the scores. The Aggregate Item Report provides not just scores but often the Standard Error of Measurement (SEM).
Interpretation: If a student scores 500, the SEM might be +/- 10 points. This means their "true score" is between 490 and 510. Educators must remember this statistical noise when looking at individual items. A "missed item" might be within the margin of error for a specific student, but when aggregated across 100 students, the noise cancels out, revealing the true instructional signal.
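As a rough illustration of why aggregation helps (assuming approximately independent measurement errors), the uncertainty of a group average shrinks with group size: with an individual SEM of about 10 scale-score points, the standard error of the mean for a group of 100 students is roughly 10 / √100 = 1 point, which is why aggregate item patterns are far more stable than any single student's responses.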
8.3 Fairness and Differential Item Functioning (DIF)
To ensure the report is fair to all subgroups, items undergo Differential Item Functioning (DIF) analysis.
Mechanism: Psychometricians check if an item is unexpectedly harder for a specific group (e.g., female students) compared to another group (male students) with the same overall ability.
Result: Items with high DIF are removed. This gives educators confidence that if their "Special Education" subgroup missed an item, it was due to a lack of content mastery, not because the question was biased against them.
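The operational DIF analysis relies on established procedures such as Mantel-Haenszel applied to large field-test samples; the following is only a crude, hand-rolled illustration of the core idea—match students on overall score, then compare item performance between the matched groups. All data and group labels are invented.

```python
# Crude illustration of the DIF idea (not the operational Mantel-Haenszel
# procedure): bin students by total score, then compare how often each
# matched group answers the item correctly. Data are invented.
import pandas as pd

df = pd.DataFrame({
    "total_score": [10, 12, 11, 30, 31, 29, 50, 52, 49, 51, 12, 30, 50, 11, 29, 48],
    "group":       ["ref", "focal"] * 8,
    "item_correct": [0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1],
})

df["score_band"] = pd.cut(df["total_score"], bins=[0, 20, 40, 60], labels=["low", "mid", "high"])
by_band = df.groupby(["score_band", "group"], observed=True)["item_correct"].mean().unstack()
by_band["gap"] = by_band["focal"] - by_band["ref"]
print(by_band)  # persistent gaps within matched bands would prompt an item review
```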
9. Special Populations and Accommodations
The Aggregate-Level Item Report includes data from all students who took the general assessment, including those with disabilities and English Learners (ELs) who utilized accommodations. Understanding this context is vital for interpreting the data.
9.1 Impact of Accommodations on Item Data
Students with Individualized Education Programs (IEPs) or Section 504 Plans may use accommodations such as:
Text-to-Speech (TTS): The computer reads the item to the student.
Braille/Large Print: For visually impaired students.
Translation Glossaries: For EL students.
Implication for Reporting: The CRS allows educators to filter the Aggregate Item Report by these subgroups. A teacher can view the item performance specifically for "Students using Text-to-Speech."
Analysis: If students using TTS performed significantly better on "Word Problems" than students with similar reading levels who did not use TTS, it validates the accommodation—it removed the reading barrier to allow the math skill to be measured.
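A minimal sketch of that subgroup comparison, with hypothetical column names and values; the CRS provides this view through its filter interface, and the code below simply illustrates the logic.

```python
# Minimal sketch: filter item-level results by an accommodation flag and
# compare group performance. Column names and values are hypothetical.
import pandas as pd

rows = pd.DataFrame({
    "student_id": ["s1", "s2", "s3", "s4", "s5", "s6"],
    "used_tts":   [True, True, True, False, False, False],
    "word_problem_pct": [68, 72, 65, 48, 52, 50],
})

print(rows.groupby("used_tts")["word_problem_pct"].mean())
# A markedly higher mean for the TTS group (among students with comparable
# reading profiles) would support the accommodation's intended effect.
```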
9.2 The West Virginia Alternate Summative Assessment (WVASA)
It is important to note that the WVGSA Aggregate Report does not include the 1% of students with significant cognitive disabilities who take the WVASA (Dynamic Learning Maps). Their results are reported separately. Educators looking for a "whole school" view must look at both reports to get the full picture of academic performance.
10. Conclusion
The West Virginia 2024 Aggregate-Level Item Report stands as a sophisticated instrument in the state's educational toolkit. By providing a transparent window into student performance at the most granular level—the individual test item—it bridges the gap between high-stakes accountability and daily classroom instruction.
The power of the report lies in its organization. By sorting items by Reporting Category, the system transforms a chaotic list of questions into a structured diagnostic of the West Virginia College- and Career-Readiness Standards. It allows educators to answer the critical questions: "What do our students know? What are they missing? And why are they missing it?"
However, the report is only as effective as the "Assessment Literacy" of the educators who use it. To unlock its potential, districts must invest in training that moves beyond simple score checking. Educators must understand the psychometric significance of "Percent Correct," the diagnostic value of "Distractor Analysis," and the curricular boundaries of the "Reporting Categories."
As West Virginia continues to refine its educational system under the guidance of Policy 2340 and ESSA, the Aggregate-Level Item Report will remain the fulcrum of data-driven decision-making. It is the mechanism by which the abstract goals of state standards are translated into the concrete reality of lesson plans, remediation strategies, and ultimately, student success. The 2024 administration, with its rapid reporting turnaround and rich interactive data, offers a clear opportunity for schools to turn hindsight into foresight, ensuring that the assessment results of today fuel the academic achievements of tomorrow.
Appendices: Data Structures and Reference Tables

Table 1: WVGSA Mathematics Reporting Categories by Grade Band
This table outlines the structural evolution of the mathematics reporting categories from elementary to middle school, providing the framework for the Item Report.
Grades 3-5: Operations and Algebraic Thinking; Number and Operations in Base Ten; Number and Operations—Fractions; Measurement and Data; Geometry.
Grades 6-8: The Number System; Expressions and Equations; Functions (Grade 8 only); Geometry; Statistics and Probability.

Table 2: WVGSA English Language Arts Reporting Categories
Reading Literary Text: Comprehension and analysis of fiction, drama, and poetry.
Reading Informational Text: Comprehension and analysis of non-fiction, historical documents, and technical texts.
Writing and Language: Source-based writing (Opinion/Argumentative or Informative/Explanatory) plus editing skills in grammar, punctuation, capitalization, and spelling.

Table 3: Aggregate-Level Item Report Column Definitions
Item ID: Unique alphanumeric code used to cross-reference the item.
Reporting Category: Broad content domain; the primary sorting variable for this analysis.
Standard / Target: Granular WVCCRS classification (e.g., 5.G.1).
Item Type: Format of the question (MC, MS, EQ, TEI, EXT).
Number of Students (N-Count): Students in the group who were administered the item.
Percent Correct (P-Value): Percentage of the group answering the item correctly.
State Average: Statewide percent correct for the same item.
Difference / Residual: Group percent correct minus the state average.
Strength / Weakness Indicator: Item performance relative to the group's overall test performance.

Table 4: User Roles and Access Rights in CRS
District Test Coordinator (DC) / District Administrator (DA): District-wide aggregation across all schools.
School Test Coordinator (SC) / Principal: School-level aggregation; supports grade-level trends and PLCs.
Teacher (TE): Data rostered to the teacher's own classroom sections.