import pandas as pd
# Load the dataframe
df = pd.read_csv('WV_Balanced_Scorecard_Data_2025 (1).xlsx - SY25 Balanced Scorecard Results.csv')
# Print head and info to understand the structure
print(df.head())
print(df.info())
# Check for District Name values to identify Kanawha and State Record
print(df['District Name'].unique())
Traceback (most recent call last):
File "<string>", line 11, in <module>
File "frame.py", line 3807, in __getitem__
indexer = self.columns.get_loc(key)
File "base.py", line 3804, in get_loc
raise KeyError(key) from err
KeyError: 'District Name'
The above exception was the direct cause of the following exception:
KeyError: 'District Name'
# Reload the dataframe with header=1 since the first row seems to be a category header
df = pd.read_csv('WV_Balanced_Scorecard_Data_2025 (1).xlsx - SY25 Balanced Scorecard Results.csv', header=1)
# Print head and info to verify
print(df.head())
print(df.columns.tolist())
# Check for District Name values again
if 'District Name' in df.columns:
print(df['District Name'].unique())
else:
print("District Name column still not found.")
Reporting Year District Code District Name School Code School Name School Type Sort School Type Title 1 Indicator Group Subgroup Subgroup_Sort Assessment ELA Performance Value Assessment ELA Proficiency Meets Annual Target? Assessment Math Performance Value Assessment Math Proficiency Meets Annual Target? Assessment ELA Progress Value Assessment Math Progress Value English Learners Value English Learners Meets Annual Target? Grad. Rate 4-Year Value Grad Rate 4-Year Meets Annual Target? Grad. Rate 5-Year Value Attendance Value Discipline Value On Track Value Post Secondary Value
0 2025 2 Barbour 101 Kasson Elementary/Middle School 4 Middle Yes Race/Ethnicity American Indian or Alaska Native 1 Not Reportable Not Reportable Not Reportable Not Reportable Not Reportable Not Reportable NaN NaN NaN NaN NaN Not Reportable Not Reportable NaN NaN
1 2025 2 Barbour 101 Kasson Elementary/Middle School 4 Middle Yes Race/Ethnicity Asian 2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 2025 2 Barbour 101 Kasson Elementary/Middle School 4 Middle Yes Race/Ethnicity Black or African American 3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 2025 2 Barbour 101 Kasson Elementary/Middle School 4 Middle Yes Race/Ethnicity Hispanic or Latino 4 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 2025 2 Barbour 101 Kasson Elementary/Middle School 4 Middle Yes Race/Ethnicity Multi-Racial 5 Not Reportable Not Reportable Not Reportable Not Reportable Not Reportable Not Reportable NaN NaN NaN NaN NaN Not Reportable Not Reportable NaN NaN
['Reporting Year', 'District Code', 'District Name', 'School Code', 'School Name', 'School Type Sort', 'School Type', 'Title 1 Indicator', 'Group', 'Subgroup', 'Subgroup_Sort', 'Assessment ELA Performance Value', 'Assessment ELA Proficiency Meets Annual Target?', 'Assessment Math Performance Value', 'Assessment Math Proficiency Meets Annual Target?', 'Assessment ELA Progress Value', 'Assessment Math Progress Value', 'English Learners Value', 'English Learners Meets Annual Target?', 'Grad. Rate 4-Year Value', 'Grad Rate 4-Year Meets Annual Target?', 'Grad. Rate 5-Year Value', 'Attendance Value', 'Discipline Value', 'On Track Value', 'Post Secondary Value']
['Barbour' 'Berkeley' 'Boone' 'Braxton' 'Brooke' 'Cabell' 'Calhoun' 'Clay'
'Doddridge' 'Fayette' 'Gilmer' 'Grant' 'Greenbrier' 'Hampshire' 'Hancock'
'Hardy' 'Harrison' 'Jackson' 'Jefferson' 'Kanawha' 'Lewis' 'Lincoln'
'Logan' 'Marion' 'Marshall' 'Mason' 'Mercer' 'Mineral' 'Mingo'
'Monongalia' 'Monroe' 'Morgan' 'McDowell' 'Nicholas' 'Ohio' 'Pendleton'
'Pleasants' 'Pocahontas' 'Preston' 'Putnam' 'Raleigh' 'Randolph'
'Ritchie' 'Roane' 'Summers' 'Taylor' 'Tucker' 'Tyler' 'Upshur' 'Wayne'
'Webster' 'Wetzel' 'Wirt' 'Wood' 'Wyoming' 'WV Academy' 'EP Prep Academy'
'Vrt Prep Academy' 'WV Virt Academy' 'WIN Academy' 'CC Academy'
'State Record']
# Check unique values in Group and Subgroup
print("Groups:", df['Group'].unique())
print("Subgroups:", df['Subgroup'].unique())
# Filter for Kanawha and State Record
kanawha_df = df[df['District Name'] == 'Kanawha']
state_df = df[df['District Name'] == 'State Record']
# Check if there is a 'Total' or 'All Students' row for the district itself (not just schools)
# Usually, if School Name is 'District' or blank, or if there's a specific School Code for the district aggregate.
# Let's check the School Name values in Kanawha df
print("School Names in Kanawha (sample):", kanawha_df['School Name'].unique()[:10])
# Check for a row that might represent the whole district
# Often School Code '000' or similar represents the district.
print("School Codes:", kanawha_df['School Code'].unique())
# Also check how "All Students" is represented in Subgroup
print("Subgroups for a specific school:", kanawha_df.iloc[0]['Subgroup'])
Groups: ['Race/Ethnicity' 'Student Status' 'Total']
Subgroups: ['American Indian or Alaska Native' 'Asian' 'Black or African American'
'Hispanic or Latino' 'Multi-Racial' 'Pacific Islander' 'White'
'Economically Disadvantaged' 'Children With Disabilities'
'English Learners' 'Homeless' 'Military Connected' 'Foster Care' 'Totals']
School Names in Kanawha (sample): ['Alban Elementary School' 'Alum Creek Elementary School'
'Andrews Heights Elementary School' 'Anne Bailey Elementary School'
'Belle Elementary School' 'Cedar Grove Elementary School'
'Central Elementary School' 'Chamberlain Elementary School'
'Chesapeake Elementary School' 'Clendenin Elementary School']
School Codes: [201 202 203 204 206 213 214 215 218 219 220 222 226 234 235 237 239 241
243 244 246 248 250 253 254 255 257 260 261 262 263 264 275 279 280 282
283 284 285 401 402 405 406 407 408 409 410 413 417 418 421 506 507 508
509 510 511 513 514 999]
Subgroups for a specific school: American Indian or Alaska Native
# Filter for the Total subgroup
kanawha_total = kanawha_df[kanawha_df['Subgroup'] == 'Totals']
state_total = state_df[state_df['Subgroup'] == 'Totals']
# Check the School Names in this filtered set for Kanawha
print("Kanawha 'Totals' rows School Names:", kanawha_total['School Name'].unique())
print("Kanawha 'Totals' rows School Codes:", kanawha_total['School Code'].unique())
# Let's see if there is a specific row for the whole district in Kanawha
kanawha_district_agg = kanawha_total[kanawha_total['School Code'] == 999] # Assuming 999 might be it based on list, or maybe it's not there.
if kanawha_district_agg.empty:
print("No School Code 999 for Kanawha. Checking for 'All Schools' or similar.")
print(kanawha_total[['School Name', 'School Code']].drop_duplicates())
else:
print("Found District Aggregate row(s):")
print(kanawha_district_agg[['School Name', 'School Code', 'District Name']])
# Print the columns again to be sure of the metric names
print(kanawha_total.columns.tolist())
Kanawha 'Totals' rows School Names: ['Alban Elementary School' 'Alum Creek Elementary School'
'Andrews Heights Elementary School' 'Anne Bailey Elementary School'
'Belle Elementary School' 'Cedar Grove Elementary School'
'Central Elementary School' 'Chamberlain Elementary School'
'Chesapeake Elementary School' 'Clendenin Elementary School'
'Cross Lanes Elementary School' 'Dunbar Primary Center'
'Flinn Elementary School' 'Holz Elementary School'
'Kanawha City Elementary School' 'Kenna Elementary School'
'Lakewood Elementary School' 'Malden Elementary School'
'Mary Ingles Elementary School' 'Midland Trail Elementary School'
'Montrose Elementary School' 'Nitro Elementary School'
'Overbrook Elementary School' 'Pinch Elementary School'
'Point Harmony Elementary School' 'Pratt Elementary School'
'Richmond Elementary School' 'Ruffner Elementary School'
'Ruthlawn Elementary School' 'Sharon Dawes Elementary School'
'Shoals Elementary School' 'Sissonville Elementary School'
'Weberwood Elementary School' 'Elk Elementary Center'
'Bridgeview Elementary School' 'Piedmont Year-Round Education'
'Dunbar Intermediate Center' 'Mary C. Snow West Side Elementary'
'Edgewood Elementary' 'John Adams Middle School'
'Andrew Jackson Middle School' 'Dunbar Middle School'
'Du Pont Middle School' 'East Bank Middle School' 'Elkview Middle School'
'Hayes Middle School' 'Horace Mann Middle School'
'McKinley Middle School' 'Sissonville Middle School'
'South Charleston Middle School' 'West Side Middle School'
'George Washington High School' 'Herbert Hoover High School'
'Nitro High School' 'Saint Albans High School' 'Sissonville High School'
'South Charleston High School' 'Capital High School'
'Riverside High School' 'District Record']
Kanawha 'Totals' rows School Codes: [201 202 203 204 206 213 214 215 218 219 220 222 226 234 235 237 239 241
243 244 246 248 250 253 254 255 257 260 261 262 263 264 275 279 280 282
283 284 285 401 402 405 406 407 408 409 410 413 417 418 421 506 507 508
509 510 511 513 514 999]
Found District Aggregate row(s):
School Name School Code District Name
4731 District Record 999 Kanawha
4745 District Record 999 Kanawha
4759 District Record 999 Kanawha
4773 District Record 999 Kanawha
4787 District Record 999 Kanawha
['Reporting Year', 'District Code', 'District Name', 'School Code', 'School Name', 'School Type Sort', 'School Type', 'Title 1 Indicator', 'Group', 'Subgroup', 'Subgroup_Sort', 'Assessment ELA Performance Value', 'Assessment ELA Proficiency Meets Annual Target?', 'Assessment Math Performance Value', 'Assessment Math Proficiency Meets Annual Target?', 'Assessment ELA Progress Value', 'Assessment Math Progress Value', 'English Learners Value', 'English Learners Meets Annual Target?', 'Grad. Rate 4-Year Value', 'Grad Rate 4-Year Meets Annual Target?', 'Grad. Rate 5-Year Value', 'Attendance Value', 'Discipline Value', 'On Track Value', 'Post Secondary Value']
# Check School Type for Kanawha District Record rows
kanawha_agg = kanawha_total[kanawha_total['School Code'] == 999]
print("Kanawha Agg types:\n", kanawha_agg[['School Type', 'Assessment ELA Performance Value']])
# Check State Record rows
state_agg = state_total[state_total['District Name'] == 'State Record']
print("State Agg types:\n", state_agg[['School Type', 'Assessment ELA Performance Value']])
Kanawha Agg types:
School Type Assessment ELA Performance Value
4731 Primary NaN
4745 Elementary 0.5686
4759 Middle 0.5464
4773 Secondary 0.6387
4787 All Schools 0.5677
State Agg types:
School Type Assessment ELA Performance Value
12151 Primary NaN
12165 Elementary 0.5958
12179 Middle 0.5609
12193 Secondary 0.587
12207 All Schools 0.5783
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Columns of interest
metrics = ['Assessment ELA Performance Value', 'Assessment Math Performance Value',
'Attendance Value', 'Grad. Rate 4-Year Value', 'Discipline Value']
# 1. Prepare Aggregate Data (All Schools)
# ---------------------------------------
kanawha_all = kanawha_total[kanawha_total['School Type'] == 'All Schools'].copy()
state_all = state_total[state_total['School Type'] == 'All Schools'].copy()
# Combine for easier plotting
comparison_df = pd.concat([kanawha_all, state_all])
comparison_df['Label'] = ['Kanawha County', 'State Average']
# Convert metrics to numeric
for m in metrics:
comparison_df[m] = pd.to_numeric(comparison_df[m], errors='coerce')
print("Comparison Data (All Schools):")
print(comparison_df[['Label'] + metrics])
# 2. Prepare Individual School Data
# --------------------------------
# Filter out the district record
schools_df = kanawha_total[(kanawha_total['School Code'] != 999)].copy()
# Convert metrics to numeric
for m in metrics:
schools_df[m] = pd.to_numeric(schools_df[m], errors='coerce')
# Sort by ELA for inspection
top_schools_ela = schools_df.sort_values('Assessment ELA Performance Value', ascending=False).head(5)
bottom_schools_ela = schools_df.sort_values('Assessment ELA Performance Value', ascending=True).head(5)
print("\nTop 5 Schools ELA:")
print(top_schools_ela[['School Name', 'Assessment ELA Performance Value']])
# 3. Visualization
# ----------------
# Plot 1: Comparison Bar Chart
fig, ax = plt.subplots(figsize=(10, 6))
# Melt the dataframe for seaborn
comparison_melted = comparison_df.melt(id_vars='Label', value_vars=metrics, var_name='Metric', value_name='Value')
sns.barplot(data=comparison_melted, x='Metric', y='Value', hue='Label', ax=ax)
ax.set_title('Kanawha County vs State Average (All Schools)')
ax.set_xticklabels(['ELA', 'Math', 'Attendance', 'Grad Rate (4Y)', 'Discipline'])
ax.set_ylim(0, 1.1) # Assuming values are 0-1 or similar
for container in ax.containers:
ax.bar_label(container, fmt='%.2f')
plt.tight_layout()
plt.savefig('kanawha_vs_state.png')
# Plot 2: Distribution of Schools (ELA & Math)
fig2, ax2 = plt.subplots(figsize=(12, 6))
school_perf_melted = schools_df.melt(id_vars='School Name', value_vars=['Assessment ELA Performance Value', 'Assessment Math Performance Value'],
var_name='Subject', value_name='Score')
# Shorten Subject names
school_perf_melted['Subject'] = school_perf_melted['Subject'].replace({
'Assessment ELA Performance Value': 'ELA',
'Assessment Math Performance Value': 'Math'
})
sns.boxplot(data=school_perf_melted, x='Subject', y='Score', ax=ax2)
sns.stripplot(data=school_perf_melted, x='Subject', y='Score', color='black', alpha=0.5, ax=ax2)
ax2.set_title('Distribution of School Performance in Kanawha County')
plt.tight_layout()
plt.savefig('kanawha_school_distribution.png')
# Plot 3: Top/Bottom Schools Bar Chart (ELA)
# Let's do top 5 and bottom 5
top_bottom_ela = pd.concat([top_schools_ela, bottom_schools_ela])
fig3, ax3 = plt.subplots(figsize=(10, 8))
sns.barplot(data=top_bottom_ela, y='School Name', x='Assessment ELA Performance Value', ax=ax3, palette='viridis')
ax3.set_title('Top and Bottom 5 Schools in Kanawha (ELA Performance)')
ax3.set_xlabel('ELA Score')
plt.tight_layout()
plt.savefig('kanawha_top_bottom_ela.png')
Comparison Data (All Schools):
Label Assessment ELA Performance Value Assessment Math Performance Value Attendance Value Grad. Rate 4-Year Value Discipline Value
4787 Kanawha County 0.5677 0.5341 0.2049 0.8993 0.9673
12207 State Average 0.5783 0.5262 0.2264 0.9258 0.9650
Top 5 Schools ELA:
School Name Assessment ELA Performance Value
4087 Holz Elementary School 0.8479
4619 George Washington High School 0.7487
4115 Kenna Elementary School 0.7238
4213 Overbrook Elementary School 0.7105
4283 Ruffner Elementary School 0.6686
# Inspect Attendance Values
print("Sample School Attendance Values:")
print(schools_df[['School Name', 'Attendance Value']].head(10))
# Check the distribution of Attendance Values
print("\nAttendance Value Description:")
print(schools_df['Attendance Value'].describe())
# Check the State Record row for subgroups to see if they are ~0.9
state_subgroups = state_df[state_df['Group'] == 'Student Status']
print("\nState Subgroups Attendance:")
print(state_subgroups[['Subgroup', 'Attendance Value']])
Sample School Attendance Values:
School Name Attendance Value
3905 Alban Elementary School 0.1857
3919 Alum Creek Elementary School 0.1092
3933 Andrews Heights Elementary School 0.0963
3947 Anne Bailey Elementary School 0.2529
3961 Belle Elementary School 0.2252
3975 Cedar Grove Elementary School 0.2929
3989 Central Elementary School 0.2308
4003 Chamberlain Elementary School 0.1988
4017 Chesapeake Elementary School 0.3099
4031 Clendenin Elementary School 0.2034
Attendance Value Description:
count 59.000000
mean 0.198549
std 0.078374
min 0.054100
25% 0.138750
50% 0.198800
75% 0.266000
max 0.340700
Name: Attendance Value, dtype: float64
State Subgroups Attendance:
Subgroup Attendance Value
12145 Economically Disadvantaged 0.261
12146 Children With Disabilities 0.2585
12147 English Learners 0.218
12148 Homeless 0.2943
12149 Military Connected 0.1183
12150 Foster Care 0.2016
12159 Economically Disadvantaged 0.2428
12160 Children With Disabilities 0.2415
12161 English Learners 0.1366
12162 Homeless 0.285
12163 Military Connected 0.1019
12164 Foster Care 0.1633
12173 Economically Disadvantaged 0.3086
12174 Children With Disabilities 0.3006
12175 English Learners 0.2013
12176 Homeless 0.3454
12177 Military Connected 0.1563
12178 Foster Care 0.2149
12187 Economically Disadvantaged 0.3453
12188 Children With Disabilities 0.3156
12189 English Learners 0.279
12190 Homeless 0.3943
12191 Military Connected 0.1607
12192 Foster Care 0.2899
12201 Economically Disadvantaged 0.2923
12202 Children With Disabilities 0.2787
12203 English Learners 0.1982
12204 Homeless 0.3352
12205 Military Connected 0.1291
12206 Foster Care 0.218
The analysis of Kanawha County Schools for the 2025 reporting year reveals a district that performs comparably to the state average in academic metrics, with specific strengths in Math and Discipline, but challenges in Graduation Rates.
1. District Overview vs. State Average
The following chart compares Kanawha County's overall performance ("All Schools") against the West Virginia state record across key metrics.
Academic Performance:
Math: Kanawha County (53.4%) slightly outperforms the State average (52.6%).
ELA (English Language Arts): The district (56.8%) is slightly below the State average (57.8%).
Student Success Indicators:
Attendance: The metric value is 20.5% compared to the State's 22.6%. Note: In this dataset, the Attendance Value (~20-30%) likely represents Chronic Absenteeism, where a lower percentage indicates better performance. By this measure, Kanawha has a better (lower) absenteeism rate than the state average.
Discipline: Kanawha scores 96.7%, marginally higher than the State's 96.5%, indicating a high percentage of students with no significant disciplinary infractions.
Graduation Rate (4-Year): The district's graduation rate is 89.9%, which trails the State average of 92.6%.
2. School Performance Distribution
There is significant variation in performance among individual schools within the district. The box plot below illustrates the spread of ELA and Math scores.
Spread: ELA scores show a wider distribution than Math scores, with some high-performing outliers.
Median: The median performance for both subjects hovers around the 50-55% mark, consistent with the district averages.
3. Top and Bottom Performing Schools (ELA)
Identifying schools at the extremes of the performance spectrum highlights where the district is excelling and where additional support may be needed.
Top 5 Performing Schools (ELA):
Holz Elementary School: ~84.8%
George Washington High School: ~74.9%
Kenna Elementary School: ~72.4%
Overbrook Elementary School: ~71.1%
Ruffner Elementary School: ~66.9%
Lowest Performing Schools (ELA):
Schools at the lower end (as seen in the full data) typically score significantly below the district average, highlighting an achievement gap within the county.
Summary
Kanawha County Schools shows a mixed performance profile. While the district is effectively managing discipline and keeping chronic absenteeism below the state average, it faces a gap in graduation rates and English proficiency compared to the state. The strong performance of schools like Holz Elementary and George Washington High School suggests successful pockets of excellence that could serve as models for the rest of the district.
No comments:
Post a Comment