Research "item analysis" on standardized student tests


 Research "item analysis" on standardized student tests in the state of "West Virginia: for 2025 

Research Websites
(1) Identify the primary standardized assessments administered to students in West Virginia in 2025, including the West Virginia General Summative Assessment and the SAT School Day.
(2) Search the West Virginia Department of Education website for technical manuals, assessment guides, and interpretive documentation released for the 2025 testing cycle.
(3) Investigate the specific statistical methods used for item analysis in these assessments, such as Item Response Theory, p-values, point-biserial correlations, and Differential Item Functioning.
(4) Locate guides or training materials for West Virginia educators that explain how to access and interpret item-level analysis reports from the 2025 assessments.
(5) Research public data releases or executive summaries from late 2025 that discuss student performance trends at the item, claim, or standard level.
(6) Determine if there were any significant changes to the test blueprints, vendors, or analysis methodologies for the 2025 administration compared to previous years.
 
 
Defining the Assessment Landscape
I am beginning my investigation into West Virginia's standardized testing environment for the 2025 academic cycle. I've identified the West Virginia General Summative Assessment (WVGSA) for grades 3 through 8 and the SAT School Day for grade 11 as the primary subjects for technical evaluation. My focus is on synthesizing the statistical benchmarks used to validate these tests, specifically looking for item difficulty metrics, discrimination indices, and methods for identifying differential item functioning to ensure assessment equity.
Addressing Documentation Timelines
A critical part of my thinking involves determining the availability of full technical reports, as there is often a lag between test administration and the release of comprehensive psychometric data. I am currently evaluating whether preliminary 2025 interpretive guides or assessment handbooks are available to bridge any gaps if the final technical manuals are still in the publication pipeline. This will allow me to establish whether the methodology remained consistent with previous years or if new statistical models were introduced for the 2025 cycle.
Strategic Data Retrieval
I am now moving to retrieve official documentation from the state's education department, specifically targeting technical manuals and educator reporting guides for 2025. I will be looking for details on how individual test items were scrutinized for quality and how these results are communicated to stakeholders. By examining these repositories, I aim to uncover the specific psychometric frameworks applied to the most recent testing window.
 
 
Mapping the 2025 Assessment Framework
I have successfully identified the primary assessments used for the 2025 cycle, confirming that student performance is evaluated through the General Summative Assessment for younger grades and the SAT School Day for high schoolers. My synthesis of recent reports shows that proficiency data for 2025 is already being integrated into state scorecards, with significant improvements noted in English language arts and mathematics across the majority of districts. I have also identified specialized assessments for career technical education and alternative pathways for students with significant cognitive disabilities, ensuring a comprehensive view of the state's testing landscape.
Technical Validations and Statistical Models
Through my investigation into the underlying methodology, I have discovered that the state employs sophisticated psychometric models to ensure test quality. For high school assessments, I've found that a multistage adaptive testing design is used, which leverages item response theory to adjust question difficulty based on student performance in real-time. Additionally, I have found evidence that the state utilizes differential item functioning analysis to detect potential bias, though I have observed a common administrative lag between test delivery and the release of full, item-level technical manuals for the current year.
Investigating Educator Reporting Tools
I am currently thinking through how item-level insights are delivered to those in the classroom. I've identified a secure data dashboard known as ZoomWV-e as the primary portal for educators to access performance metrics. While aggregate data is public, I am exploring the specific depth of information available within this secure environment—specifically whether it provides granular p-values and distractor analysis for individual questions or if reporting is limited to broader content standards and claims.
Next Steps in Technical Research
Moving forward, I will be focusing on extracting more specific details from the 2025 results documentation and searching for any updated methodology guides that bridge the gap from previous years. I plan to look deeper into the reporting features provided by the state's assessment vendors to determine exactly how much item-level transparency is granted to local schools. This will help resolve whether the 'item analysis' is primarily a backend statistical validation process or an actionable tool for instructional adjustment.

 -------------------

 

West Virginia Statewide Standardized Assessment System: Comprehensive Technical Analysis of Item Methodologies and Reporting Protocols for the 2025 Administration

1. Executive Context and System Architecture

The assessment landscape within the state of West Virginia for the 2025 academic year represents a sophisticated convergence of federal accountability mandates, state-level legislative policy, and advanced psychometric methodology. Governed principally by the West Virginia Board of Education (WVBE) Policy 2340 and the federal Every Student Succeeds Act (ESSA), the state’s assessment system—collectively known as the West Virginia Measures of Academic Progress (WVMAP)—is designed to provide a comprehensive evaluation of student mastery relative to the West Virginia College- and Career-Readiness Standards (WVCCRS). The 2025 administration cycle marks a critical maturation point in the state's transition toward fully digital, adaptive testing ecosystems, necessitating a nuanced understanding of "item analysis" that extends beyond traditional classical test theory into the complex realms of Item Response Theory (IRT) and Computer Adaptive Testing (CAT).  

1.1 The Statutory and Policy Framework

The 2025 assessment protocols are rooted in a rigid legal framework intended to ensure transparency, equity, and instructional utility. West Virginia Code §18-2E-10 and §18-2E-2, alongside Policy 2340, mandate the administration of annual summative assessments to measure student performance against state content standards. These statutes require that the assessment system not only generate an aggregate score for accountability purposes but also provide granular, actionable data that can be utilized by educators to inform instruction. The "Testing Transparency" documents released for the 2024-2025 school year underscore the state's commitment to ensuring that parents, educators, and the public understand the mechanics of these assessments, including their purpose, subject coverage, and the timing of results dissemination.  

The scope of the 2025 summative assessment administration encompasses three distinct but integrated testing programs:

  1. The West Virginia General Summative Assessment (WVGSA): Administered to the vast majority of students in grades 3 through 8, covering English Language Arts (ELA) and Mathematics annually, with Science administered in grades 5 and 8.

  2. The SAT School Day: Serving as the high school accountability measure for grade 11, covering Reading and Writing, Mathematics, and Science.

  3. The West Virginia Alternate Summative Assessment (WVASA): Administered to students with the most significant cognitive disabilities (approximately 1% of the population) via the Dynamic Learning Maps (DLM) system.

1.2 Defining "Item Analysis" in the 2025 Context

In the context of West Virginia's 2025 assessments, "item analysis" is a bifurcated concept.

First, it refers to the technical validation processes employed by the West Virginia Department of Education (WVDE) and its vendors—Cambium Assessment, Inc. (CAI) and the College Board—to ensure that every question (item) functioning on the test is statistically valid, reliable, and free from bias. This involves complex psychometric procedures such as Differential Item Functioning (DIF) analysis, calibration under IRT models, and alignment studies conducted by third-party auditors like the Wisconsin Center for Educational Products and Services (WCEPS).  

Second, it refers to the instructional data analysis performed by districts, schools, and teachers using the reporting platforms. The shift to adaptive testing has fundamentally altered this aspect. In a fixed-form era, item analysis meant reviewing the specific questions a class missed. In the 2025 adaptive environment, where students see different items based on their ability, item analysis transforms into the analysis of aggregate item performance and reporting categories. Teachers analyze how students performed on "items related to making inferences" rather than "Question #4." The introduction of systems like the Cambium Reporting System (CRS) and the ZoomWV-e data dashboard facilitates this meta-analysis, allowing educators to peer into the "black box" of the adaptive algorithm to discern patterns of student learning.  

1.3 The 2025 Assessment Landscape

The 2025 data reveals a system in recovery and growth. Preliminary results indicate that 46.3% of students in grades 3-8 reached proficiency in ELA, surpassing pre-pandemic levels, while Math proficiency climbed to 40.8%. These gains are not accidental; they are the result of rigorous curriculum alignment and the strategic use of assessment data. However, the integrity of these scores relies entirely on the technical quality of the items used. If the items do not accurately discriminate between high and low ability, or if they are misaligned with the depth of the standards, the resulting proficiency data is meaningless. Therefore, an exhaustive analysis of the item methodologies is a prerequisite for interpreting the state's educational health.  

2. Psychometric Foundations and Methodological Frameworks

To comprehend how items are analyzed in West Virginia, one must navigate the underlying psychometric theories that dictate test construction and scoring. The 2025 assessments have largely abandoned Classical Test Theory (CTT)—which treats all items as equal contributors to a total score—in favor of Item Response Theory (IRT), which evaluates the mathematical properties of individual items.

2.1 Item Response Theory (IRT) Implementation

The technical reports for the WVGSA confirm that reliability estimates and item calibrations are grounded in the IRT framework. IRT posits that the probability of a student answering a specific item correctly is a logistic function of the distance between the student's ability (θ) and the item's difficulty (b), moderated by the item's discrimination (a) and guessing (c) parameters.
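
For reference, the three-parameter logistic (3PL) form that this description corresponds to can be stated generically as

    P_i(\theta) = c_i + \frac{1 - c_i}{1 + \exp[-a_i(\theta - b_i)]}

where \theta is the student's ability and b_i, a_i, and c_i are the difficulty, discrimination, and pseudo-guessing parameters of item i. This is the textbook formulation, not a reproduction of the WVGSA calibration specifications.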

For the ELA and Mathematics assessments in grades 3-8, the state utilizes a calibrated item bank where every question has been pre-equated. This is essential for Computer Adaptive Testing. Because the difficulty of the test tailors itself to the student, a raw score (number correct) is meaningless. A student answering 20 difficult questions correctly demonstrates higher ability than a student answering 20 easy questions correctly. IRT allows the scoring engine to calculate a scale score that accounts for the specific characteristics of the items the student encountered.  

The technical documentation explicitly references the use of the Q3 statistic to evaluate local item independence. Local independence is a core assumption of unidimensional IRT models, stating that once a student's ability is accounted for, their response to one item should not influence their response to another. The use of Q3 analysis indicates a high level of technical rigor, ensuring that "item chaining" or context effects do not distort the measurement of student ability.  
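
To make the logic of the Q3 check concrete, the following sketch computes Yen's Q3 residual correlations under an assumed 2PL response function and simulated responses; it illustrates the principle only and is not the vendor's implementation.

    import numpy as np

    def p_2pl(theta, a, b):
        """Probability of a correct response under a 2PL model."""
        return 1.0 / (1.0 + np.exp(-a * (theta - b)))

    def yen_q3(responses, theta, a, b):
        """
        Yen's Q3: correlate model residuals between every pair of items.
        responses: (n_students, n_items) matrix of 0/1 item scores
        theta:     (n_students,) ability estimates
        a, b:      (n_items,) discrimination and difficulty estimates
        """
        expected = p_2pl(theta[:, None], a[None, :], b[None, :])
        residuals = responses - expected               # observed minus expected, per student and item
        return np.corrcoef(residuals, rowvar=False)    # (n_items, n_items) residual correlation matrix

    # Simulated demonstration: flag item pairs whose residual correlation exceeds a
    # common screening value (0.2 is a conventional rule of thumb, not a WVDE criterion).
    rng = np.random.default_rng(0)
    theta = rng.normal(size=500)
    a = rng.uniform(0.8, 2.0, size=10)
    b = rng.normal(size=10)
    responses = (rng.random((500, 10)) < p_2pl(theta[:, None], a[None, :], b[None, :])).astype(float)
    q3 = yen_q3(responses, theta, a, b)
    print(np.argwhere(np.triu(np.abs(q3) > 0.2, k=1)))   # item pairs suggesting local dependence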

2.2 Multidimensionality in Science Assessment

The science assessment introduces a higher level of complexity. Unlike Math or Reading, which are treated as largely unidimensional traits, the West Virginia Next Generation Content Standards for Science are three-dimensional, requiring students to integrate Disciplinary Core Ideas (DCIs), Science and Engineering Practices (SEPs), and Crosscutting Concepts (CCCs).  

To measure this, the assessment employs a multidimensional IRT model. This model accounts for the "nuisance dimensions" that arise from the cluster-based design of the science test. In this design, students engage with a single scenario or phenomenon (e.g., a river ecosystem) and answer multiple questions related to it. These questions are naturally correlated because they share a common stimulus. A standard IRT model might misinterpret this correlation as higher student ability. The multidimensional model mathematically separates the student's true science proficiency from the local dependencies created by the item cluster, ensuring a purer measure of the construct.  

2.3 Computer Adaptive Testing (CAT) Algorithms

For grades 3-8 ELA and Math, the "item analysis" is essentially an analysis of the CAT algorithm's performance. The algorithm is the gatekeeper, deciding which item from the Independent College and Career Readiness (ICCR) bank is presented next.

The technical reports detail that the algorithm functions by optimizing measurement precision while satisfying blueprint constraints.  

  • Step 1: The engine estimates the student's current ability based on previous responses.

  • Step 2: It scans the item bank for available items that provide the most "information" (the IRT term for measurement precision) at that ability level.

  • Step 3: It filters these items through the "Blueprint," which mandates that a certain number of items must come from specific reporting categories (e.g., Geometry vs. Algebra).

  • Step 4: It applies "Exposure Control" parameters to ensure that the best items are not overused, preserving the security of the bank.
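
A simplified, hypothetical rendering of this selection loop appears below. It assumes a 2PL information function, an invented item-bank structure, and a crude randomized exposure filter standing in for formal exposure control; the operational CAI engine is considerably more sophisticated.

    import numpy as np

    def fisher_information(theta, a, b):
        """Fisher information contributed by a 2PL item at ability level theta."""
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
        return (a ** 2) * p * (1.0 - p)

    def select_next_item(theta_hat, bank, administered, category_needed):
        """
        Choose the most informative eligible item: not yet administered, belonging to
        the reporting category the blueprint still requires, and surviving a crude
        randomized exposure filter (illustration only, not an operational method).
        """
        best_id, best_info = None, -1.0
        for item in bank:
            if item["id"] in administered or item["category"] != category_needed:
                continue
            if np.random.random() > item["exposure_rate"]:
                continue
            info = fisher_information(theta_hat, item["a"], item["b"])
            if info > best_info:
                best_id, best_info = item["id"], info
        return best_id

    # Hypothetical three-item bank; field names are invented for this sketch.
    bank = [
        {"id": "A1", "a": 1.2, "b": -0.5, "category": "Geometry", "exposure_rate": 0.8},
        {"id": "A2", "a": 0.9, "b": 0.3, "category": "Algebra", "exposure_rate": 0.8},
        {"id": "A3", "a": 1.5, "b": 0.1, "category": "Algebra", "exposure_rate": 0.8},
    ]
    print(select_next_item(theta_hat=0.2, bank=bank, administered={"A1"}, category_needed="Algebra"))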


The alignment studies conducted by WCEPS analyzed this algorithmic behavior, confirming that the CAT engine successfully generates unique test events for each student that nevertheless cover the breadth and depth of the standards. This confirms that the "item analysis" done by the algorithm in real-time is functioning correctly.  

2.4 Multistage Adaptive Testing (MST) for SAT

The SAT School Day utilizes a divergent methodology known as Multistage Adaptive Testing (MST). Unlike the item-by-item adaptation of the WVGSA, the SAT adapts at the module level.  

  1. Module 1: All students take a routing module with a mix of difficulty levels.

  2. Analysis: The system analyzes performance on Module 1 to determine the student's broad performance band.

  3. Module 2: The student is routed to a specific second module (Hard or Easy).

This design has profound implications for item analysis. In a CAT, the pathway is continuous. In MST, the pathway is bifurcated. A student routed to the lower-difficulty Module 2 will never see the most challenging items. This structural limitation is efficient for scoring but creates a "black box" effect for educators, who cannot see which specific barrier prevented a student from reaching the harder module.
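
Reduced to its essentials, the routing step is a single decision rule. The sketch below uses an invented raw-score cutoff purely for illustration; the College Board's actual routing rules are proprietary.

    def route_to_module(module1_raw_score, routing_cutoff=14):
        """Route to the harder second module only if performance on the routing module
        clears a (hypothetical) cutoff; otherwise route to the easier module."""
        return "module2_hard" if module1_raw_score >= routing_cutoff else "module2_easy"

    print(route_to_module(18))   # module2_hard
    print(route_to_module(9))    # module2_easy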

3. Item Development and Quality Assurance Lifecycles

The validity of the entire WVMAP system rests on the quality of the individual items in the bank. West Virginia does not develop these items in isolation; it participates in the Independent College and Career Readiness (ICCR) consortium and a multistate science collaborative, sharing development costs and intellectual capital with states such as Utah, Oregon, Connecticut, and Idaho.  

3.1 The Item Writing and Review Protocol

The lifecycle of a 2025 assessment item is a multi-year journey involving rigorous scrutiny. The process described in Volume 2 of the technical reports is exhaustive.  

Phase 1: Specification and Writing
Items are written to strict "Item Specifications" that define the allowable content, cognitive complexity (DOK), and format for each standard. For science, this involves the creation of "Item Clusters"—scenarios anchored in scientific phenomena that require students to apply knowledge, not just recall facts. Writers are trained on "Universal Design" principles to ensure items are accessible to the widest possible range of students from the outset.

Phase 2: The Internal Review Cascade
Before a West Virginia educator ever sees an item, it undergoes four levels of internal review by the vendor (CAI):

  1. Preliminary Review: Senior staff verify alignment to the standard and the quality of the stimulus.

  2. Content Review One: Subject matter experts check for factual accuracy and instructional relevance.

  3. Edit Review: Copy editors ensure the item meets style guides and linguistic clarity.

  4. Senior Review: A final sign-off by lead content developers.  


Phase 3: Content Advisory Committees (CAC)
This is the critical "state-level" item analysis. Committees composed of West Virginia educators convene to review new items. Their mandate is to evaluate alignment: Does this item truly measure the WVCCRS it claims to measure? Is the DOK level accurate? Is the vocabulary appropriate for West Virginia students?

Phase 4: Bias, Sensitivity, and Fairness Review
Concurrently, a "Fairness Committee" reviews items to identify and remove "construct-irrelevant variance." This involves flagging items that might introduce bias against students based on gender, race, ethnicity, religion, or socioeconomic status. For example, a math problem that presupposes knowledge of a specific sport (like golf) might be flagged for socioeconomic bias. The technical reports emphasize that this qualitative review is the first line of defense against inequity.

3.2 Field Testing and Statistical Flagging

Once items pass the committee reviews, they are field-tested. In the 2025 administration, this is typically done by embedding new items into the operational test slots where they do not count toward the student's score. The resulting data is subjected to intense statistical analysis.

Items are flagged for rejection if they fail specific statistical criteria:

  • Difficulty Parameter (b): Is the item too hard or too easy for the target population?

  • Discrimination Parameter (a): Does the item fail to distinguish between high-performing and low-performing students? (Low point-biserial correlation).

  • Differential Item Functioning (DIF): This is the statistical companion to the Fairness Committee. DIF analysis compares the performance of subgroups (e.g., Male vs. Female) who have the same overall ability. If students of equal ability perform differently on the item solely based on their group membership, the item is flagged for DIF and likely removed. The 2025 participation guidelines highlight the expansion of DIF analysis to include accommodations, checking if items function differently for students using tools like text-to-speech.  
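
For illustration, the sketch below computes the flagging statistics named above: the classical p-value (proportion correct), the point-biserial correlation, and a simplified Mantel-Haenszel DIF index expressed on the ETS delta scale. The stratification scheme and any thresholds applied to these values are assumptions, not the vendor's operational criteria.

    import numpy as np

    def p_value(item_scores):
        """Classical item difficulty: proportion of students answering correctly."""
        return float(np.mean(item_scores))

    def point_biserial(item_scores, total_scores):
        """Correlation between the 0/1 item score and the total score (discrimination)."""
        return float(np.corrcoef(item_scores, total_scores)[0, 1])

    def mantel_haenszel_ddif(item_scores, total_scores, is_focal, n_strata=5):
        """
        Simplified Mantel-Haenszel DIF: stratify students by total score, build 2x2
        tables of group membership vs. item correctness, and convert the common odds
        ratio to the ETS delta scale (MH D-DIF = -2.35 * ln(alpha_MH)).
        """
        cuts = np.quantile(total_scores, np.linspace(0, 1, n_strata + 1)[1:-1])
        strata = np.digitize(total_scores, cuts)
        num = den = 0.0
        for s in np.unique(strata):
            mask = strata == s
            ref, foc = mask & ~is_focal, mask & is_focal
            a = np.sum(item_scores[ref] == 1)   # reference group, correct
            b = np.sum(item_scores[ref] == 0)   # reference group, incorrect
            c = np.sum(item_scores[foc] == 1)   # focal group, correct
            d = np.sum(item_scores[foc] == 0)   # focal group, incorrect
            n = a + b + c + d
            if n > 0:
                num += a * d / n
                den += b * c / n
        return float("nan") if den == 0 else -2.35 * np.log(num / den)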


3.3 Alignment Studies and DOK Discrepancies

A critical component of item analysis is the Alignment Study. The technical report details an independent study by WCEPS that analyzed the alignment of the item pool to the state standards.

Key Insight: The study found that for ELA Grades 7-8, there was a discrepancy in Depth of Knowledge (DOK) Consistency. The standards at these grade levels are rigorous, often requiring DOK 3 (Strategic Thinking). However, the item blueprints and the available pool relied too heavily on items with lower cognitive demand. The study noted that while 70% of the standards were DOK 3, the blueprint only required ~25% of items to be DOK 3.  

This finding is significant for 2025. It implies that the state has likely undertaken a targeted item writing campaign to bolster the DOK 3 inventory for middle school ELA. Educators analyzing 2025 scores should be aware that the rigor of the reading passages and associated questions in grades 7-8 may have increased to address this alignment gap.

4. West Virginia General Summative Assessment (WVGSA): Item Analysis and Reporting

The WVGSA remains the flagship assessment for grades 3-8. The 2025 administration leverages the Cambium Reporting System (CRS) to deliver item-level insights to educators.

4.1 Reporting Categories and Claims

Item analysis in WVGSA is structured around "Reporting Categories," which function as sub-tests.

  • ELA: Scores are reported for "Reading Literary Text," "Reading Informational Text," "Writing," and "Listening."

  • Math: Scores are reported for "Concepts and Procedures," "Problem Solving," "Modeling and Data Analysis," and "Communicating Reasoning."

For each category, students are classified as Above Mastery, At/Near Mastery, or Below Mastery relative to the "Meets Standard" cut score. This classification provides a coarse-grained item analysis, allowing teachers to identify if a student has a specific deficit in informational text despite being proficient in literary text.  
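
A minimal sketch of how such a three-way classification could be computed is shown below; the error-band width is an assumption chosen for illustration, not the published WVGSA business rule.

    def classify_reporting_category(category_score, cut_score, standard_error):
        """Classify a reporting-category score against the 'Meets Standard' cut using a
        symmetric error band (assumed +/- 1.5 SEM for this sketch)."""
        band = 1.5 * standard_error
        if category_score - band > cut_score:
            return "Above Mastery"
        if category_score + band < cut_score:
            return "Below Mastery"
        return "At/Near Mastery"

    print(classify_reporting_category(category_score=298, cut_score=300, standard_error=5))   # At/Near Mastery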

4.2 Aggregate-Level Item Reports

The most powerful tool for granular item analysis in 2025 is the Aggregate-Level Item Report available in CRS. Because the test is adaptive, individual student item reports are not feasible (every student sees a different test). However, at the school or district level, the system can aggregate performance across the entire pool of items administered.  

Report Features:

  • Item Difficulty: Teachers can see which specific items were most challenging for their student population.

  • Standard Alignment: Data is sorted by standard, allowing a department to see, for example, that 60% of students missed items aligned to standard M.4.NF.3 (fractions).

  • Non-Secure Items: The user guide references a feature for "viewing non-secure items". This suggests that while operational items remain secure, the system provides access to released or practice items that mirror the operational content. This allows teachers to visualize the type of question that caused difficulty, bridging the gap between abstract data and concrete instruction.  
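
The aggregation logic behind such a report can be illustrated with a short sketch; the input shape and field names below are hypothetical and are not the CRS export format.

    from collections import defaultdict

    def aggregate_item_report(item_results):
        """
        Summarize school- or district-level item performance by standard.
        item_results: iterable of (standard_code, was_correct) pairs, one per item
        response administered across all students (hypothetical input shape).
        Returns standards ordered from weakest to strongest percent correct.
        """
        correct = defaultdict(int)
        attempted = defaultdict(int)
        for standard, was_correct in item_results:
            attempted[standard] += 1
            correct[standard] += int(was_correct)
        summary = {s: correct[s] / attempted[s] for s in attempted}
        return sorted(summary.items(), key=lambda kv: kv[1])

    # Hypothetical usage mirroring the fractions example above.
    example = [("M.4.NF.3", 0), ("M.4.NF.3", 0), ("M.4.NF.3", 1), ("M.4.G.1", 1), ("M.4.G.1", 1)]
    print(aggregate_item_report(example))   # M.4.NF.3 surfaces first with ~33% correct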


4.3 Science Assessment: The Matrix Design Challenge

The science assessment in grades 5 and 8 presents a unique challenge for item analysis due to its Matrix Design.  

  • The Mechanism: The total item pool covers the entirety of the broad NGSS-based standards. However, no single student could answer all questions without fatigue. Therefore, the test uses a matrix sampling approach where each student takes a subset of items.

  • Implication for Analysis: Individual student reports in science are less reliable for diagnosing specific standard deficits because the student may not have encountered enough items in a specific domain to generate a valid sub-score.

  • Solution: The Aggregate-Level Disciplinary Core Ideas (DCI) Report. This report aggregates data from all students in a school/district, effectively reconstructing the full test. Validity relies on the "Law of Large Numbers." While one student's test is incomplete, the district's data covers the full spectrum of the curriculum. Educators must rely on school-level science analysis rather than student-level analysis to drive curriculum changes.  


4.4 Lexile and Quantile Measures

A crucial aspect of the 2025 item analysis reporting is the integration of Lexile and Quantile measures. These are not just scores; they are item-dependent metrics that link assessment to instruction.  

  • Mechanism: The difficulty of the items a student successfully answers is mapped to the Lexile/Quantile scale.

  • Utility: This allows teachers to perform "forward item analysis"—selecting instructional materials (books, math problems) that match the student's identified complexity level. The reports provide a tangible connection between the abstract scale score and the concrete difficulty of a text.

5. SAT School Day: Methodological Shifts and Reporting Constraints

For grade 11, the SAT School Day serves as the accountability metric. The complete transition to the Digital SAT in 2025 has introduced significant changes to how educators can access and analyze item-level data.

5.1 The Loss of Question-Level Transparency

Historically, the College Board offered a "Question Analysis Service" (QAS) that allowed educators to see the exact questions students missed. For 2025, this feature has been discontinued for the digital test.  

  • Reasoning: The adaptive nature of the MST design means that items are drawn from a highly secure, high-value item bank. Releasing items would compromise the bank and require the continuous, expensive development of new items.

  • Impact: Teachers can no longer "go over the test" in class. The specific misconceptions that led a student to choose Distractor B over Correct Answer C are now hidden.

5.2 Content Domain Reporting: The New Standard

To replace specific item analysis, the College Board provides data on Content Domains via the K-12 Reporting Portal.  

  • Structure: Performance is reported across eight domains (4 Reading/Writing, 4 Math).

  • Visuals: Educators see "Progress Bars" indicating performance bands (e.g., bands 1-7).

  • Skills Insight™: This tool acts as a proxy for item analysis. Instead of showing the student's actual question, it shows sample questions that exemplify the difficulty band the student achieved. Teachers must infer that "if the student is in Band 3, they likely struggle with items similar to this sample question about linear inequalities."

5.3 Benchmarks and Vertical Scaling

The SAT scores (400-1600) are vertically scaled, allowing for longitudinal analysis from grade 8 (PSAT 8/9) to grade 11. The 2025 college readiness benchmarks are set at 480 for Evidence-Based Reading and Writing (ERW) and 530 for Mathematics.  

  • Item Analysis Implication: These benchmarks represent a 75% probability of earning a 'C' or better in credit-bearing college courses. Analysis of the 2025 data shows a significant divergence: while 49% of WV students met the ERW benchmark, only 17% met the Math benchmark. This suggests that items in the Algebra and Advanced Math domains (the digital SAT's successors to "Heart of Algebra" and "Passport to Advanced Math") are serving as the primary gatekeepers for college readiness in the state.


5.4 Data Layout Changes

The 2024-2025 Data Layout for electronic score reports confirms the removal of granular item fields. The layout documentation explicitly lists "Question Item Data" as a deleted field. This technical confirmation reinforces that districts must adapt their data analysis protocols to focus on domain-level trends rather than item-level auditing.  

6. West Virginia Alternate Summative Assessment (WVASA): Assessing the 1%

For students with significant cognitive disabilities, the WVASA (Dynamic Learning Maps) utilizes a fundamentally different item analysis logic based on Learning Maps.

6.1 Linkage Levels and Essential Elements

The WVASA does not use standard "items" with a single difficulty. Instead, it assesses Essential Elements (EEs)—specific statements of knowledge linked to grade-level standards but reduced in complexity. Each EE is broken down into five Linkage Levels:  

  1. Initial Precursor: The most basic level (e.g., recognizing a distinction between two objects).

  2. Distal Precursor

  3. Proximal Precursor

  4. Target: The level most closely aligned with the standard.

  5. Successor: A level extending beyond the standard.

Item Analysis: The "item analysis" here is the determination of the student's "linkage level." If a student fails a "Target" level testlet, the system routes them to a "Proximal Precursor" testlet. The report generated for teachers is a Linkage Level Profile, showing exactly where on the complexity ladder the student stands for each skill. This is arguably the most actionable item analysis in the entire system, as it directly dictates the next instructional step.  

6.2 Assertion Mapping and Scoring

Scoring in the WVASA is not based on "number correct" but on Assertion Mapping.  

  • Process: Standard-setting panelists review the content of the testlets and the student responses to create "assertions"—statements like "Student can match a picture to a word."

  • Aggregation: These assertions are summed to create the proficiency level.

  • Reporting: The "Student Mastery Report" translates these assertions into narrative feedback. Teachers analyzing 2025 WVASA results are analyzing a qualitative profile of student capabilities, not a quantitative score on a linear scale.

7. Reporting Ecosystems: The User Experience of Data

The utility of item analysis is dependent on the platforms that deliver it. In 2025, three primary systems constitute the "data layer" of West Virginia education.

7.1 Cambium Reporting System (CRS)

The CRS is the operational hub for grades 3-8.

  • Immediacy: Reports are available on a rolling basis, often within days of testing.

  • Drill-Down: Users can navigate from the District view -> School view -> Teacher view -> Student view -> Item/Standard view.

  • Tables and Graphs: The system prioritizes visual representations of data (box plots for distribution, bar charts for mastery) to help educators quickly identify outliers.


7.2 ZoomWV-e: The Longitudinal Context

ZoomWV-e is the state’s secure data warehouse.  

  • Contextual Analysis: While CRS shows how a student performed, ZoomWV-e helps explain why. It allows administrators to cross-reference assessment data with attendance, discipline, and socioeconomic data.

  • Accommodations Data: A critical feature for 2025 is the digitization of the WVS.326 Accommodations Form. This allows the state to link specific accommodations (e.g., Scribe, Read Aloud) to assessment outcomes. Administrators can perform an item analysis of the efficacy of accommodations—identifying if students with "Read Aloud" supports are performing comparably to their peers on reading comprehension items.  


7.3 College Board K-12 Portal

This portal is the exclusive source for SAT data.

  • Access: It requires distinct credentials from the WVDE systems.

  • Roster Reports: The 2025 update includes an "Enhanced Roster Report" that improves the visibility of student performance across the new digital domains.

  • Connections: The portal also links assessment results to the "BigFuture" career planning tools, integrating item analysis with career readiness data.


8. Analysis of 2025 Assessment Data

The application of these item analysis methodologies to the 2025 data yields significant insights into the state of education in West Virginia.

8.1 Proficiency Trends and Recovery

The 2025 WVGSA results indicate a robust recovery.

  • ELA: 46.3% proficiency (Grades 3-8). This exceeds the 2024 rate (44.2%) and the 2019 baseline (45.3%).  


  • Math: 40.8% proficiency. While improved from 2024 (38.3%), it remains just shy of the 2019 level (40.9%).

Insight: The divergence between ELA and Math recovery is consistent with item analysis literature. Math items are often hierarchical—missing a concept in grade 4 (e.g., fractions) makes grade 5 items (decimals) theoretically unsolvable. ELA items are more recursive—reading skills spiral and reinforce each other. The data suggests that the "COVID gap" in cumulative math knowledge is proving harder to close than in reading.

8.2 District Variance and Equity

The "Balanced Scorecard" data reveals that 86% of districts improved in ELA. However, significant variance remains. The "Top-Performing School Districts" map indicates clusters of high performance, likely correlated with socioeconomic factors. The ongoing DIF analysis conducted by the state is crucial here to ensuring that this variance is due to educational disparities and not measurement bias in the items themselves.  

8.3 Participation and "Testing Transparency"

West Virginia successfully met the 95% federal participation requirement. This high participation rate validates the item analysis data; the results are not skewed by the exclusion of low-performing or marginalized students. The "Testing Transparency" initiative ensures that parents are informed of their right to see this data, although the report notes that there is no legal "opt-out" provision in West Virginia, reinforcing the universality of the data set.  

9. Conclusion

The 2025 administration of standardized assessments in West Virginia represents the culmination of a decade-long shift toward technology-enhanced measurement. The definition of "item analysis" has evolved from a clerical task of checking wrong answers to a sophisticated discipline of data science.

For the WVGSA, item analysis is now an exercise in aggregate standard analysis, supported by the detailed reporting of the Cambium system and the theoretical rigor of IRT and CAT. The ability to view "non-secure items" provides a necessary bridge between the abstract data and classroom reality.

For the SAT School Day, the shift to Digital MST has traded question-level transparency for testing efficiency and security. Educators must now rely on domain-level analysis and benchmark indicators, using the "Skills Insight" tool to infer student needs rather than observing them directly.

For the WVASA, item analysis is deeply diagnostic, utilizing the granular Linkage Levels to map the specific cognitive topography of students with significant disabilities.

Collectively, these systems provide a valid, reliable, and increasingly rich picture of student learning. The improvements in 2025 proficiency rates suggest that West Virginia educators are successfully navigating this data-rich environment, using the insights derived from these complex item analyses to drive instruction and improve outcomes for all students.

Table 1: Comparative Technical Architecture of 2025 WV Assessments

Feature | WVGSA (Grades 3-8) | SAT School Day (Grade 11) | WVASA (DLM)
Measurement Model | Unidimensional & Multidimensional IRT | IRT / Multistage Adaptive (MST) | Diagnostic Classification / Linkage
Adaptivity | Item-level CAT (ELA/Math) | Module-level MST (Section-based) | Testlet-level Adaptive Routing
Item Analysis Data | Aggregate Item Reports (School/District) | Content Domain Bands (1-7) | Linkage Level Profiles
Transparency | View "Non-Secure" Items allowed | "Question Analysis" Discontinued | Practice Testlets available
Standard Setting | Bookmark Method | Standard Setting / Vertical Scaling | Assertion Mapping
Reporting Platform | Cambium Reporting System (CRS) | College Board K-12 Portal | Kite Educator Portal
Primary Metric | Scale Score / Proficiency Level | Section Score (200-800) / Benchmark | Mastery of Essential Elements

Table 2: 2025 Proficiency and Trend Analysis

Subject | Grade Band | 2025 Proficiency | vs. 2024 | vs. 2019 (Pre-Pandemic) | Analysis
ELA | 3-8 | 46.3% | ▲ +2.1% | ▲ +1.0% | Full Recovery: Surpassed 2019 levels.
Math | 3-8 | 40.8% | ▲ +2.5% | ▼ -0.1% | Near Recovery: Rapid growth, nearly at 2019 levels.
Science | 5, 8 | ~29% | ▲ Slight | N/A | Stable: Slow growth; Matrix design limits student-level data.
SAT ERW | 11 | 49% (Benchmark) | N/A | N/A | Strong: Nearly half of juniors meet college readiness reading metric.
SAT Math | 11 | 17% (Benchmark) | N/A | N/A | Critical Gap: Math readiness significantly lags reading; major focus area.
    --------------------------------------


      For the 2025 assessment cycle in West Virginia, Question-Level Transparency refers to the degree to which educators and parents can view the specific test questions a student answered. Due to the universal shift to Computer Adaptive Testing (CAT) and Multistage Adaptive Testing (MST), this transparency has been significantly reduced or altered compared to historical fixed-form tests.

      In 2025, transparency is handled differently across the state's three main assessments:

      1. SAT School Day (Grade 11): Transparency is Removed

      For the 2025 Digital SAT, the College Board has discontinued the "Question Analysis" features that were previously available for paper-based tests.

      • Discontinued Reports: The "Question Analysis Report," which previously allowed teachers to review specific questions students missed, is no longer available. The 2024-2025 Data Layout for electronic score reports confirms that "Question Item Data" fields have been deleted.1

      • The Reason: The test is now Multistage Adaptive (MST). Students receive a unique combination of items drawn from a highly secure item bank. Releasing these questions would compromise the security of the bank for future administrations.3

      • The Replacement (Skills Insight™): Instead of seeing the actual question, educators are provided with "Skills Insight." This tool highlights sample questions that represent the difficulty band the student achieved. Teachers must infer student misconceptions from these proxies rather than the actual test performance.3

      2. WVGSA (Grades 3-8): Transparency is Aggregate

      For the General Summative Assessment, the Cambium Reporting System (CRS) provides a form of transparency that focuses on groups rather than individuals.

      • Aggregate-Level Item Report: Educators cannot see the exact test a specific student took because the CAT algorithm generates it uniquely for each child. However, they can access a report that lists all items administered to the school or district, sorted by Reporting Category.4

      • Viewing "Non-Secure" Items: The CRS includes a feature for "viewing non-secure items".5 This allows teachers to click on specific items in the aggregate report to view the content, provided those items have been designated as non-secure or released for instructional purposes. This bridges the gap between abstract data and classroom instruction, allowing teachers to see the types of questions that challenged their students.

      • Family Access: Parents can view student results via the "View My Results" portal, typically within 24 hours of test completion, using a specific access code. They receive detailed Lexile and Quantile measures, but they do not receive a question-by-question breakdown of the test.6

      3. WVASA (Alternate Assessment): Transparency via Linkage

      For students with significant cognitive disabilities, transparency is achieved through Linkage Levels.

      • Skill-Based Reporting: The reporting system details exactly which "Linkage Level" (e.g., Initial Precursor vs. Target) a student mastered for each Essential Element. While the exact testlet questions are not released, the report is highly transparent regarding the specific complexity level of the task the student was able to perform.7

      Summary of Transparency Changes

      Assessment | Can you see the student's question? | Can you see the student's answer? | Proxy for Analysis
      SAT School Day | No | No | Skills Insight™ (Sample questions based on score band)
      WVGSA (3-8) | No (Secure items hidden) | No | Aggregate Item Report (View "Non-Secure" items missed by the group)
      WVASA | No | N/A (Assertion based) | Linkage Level Profile (Shows exact complexity level mastered)

       


