Reproduction of Hurricane Harvey Flooding GEOG120 Lab Problem¶

Authors¶

  • Colman Bashore*, cbashore@middlebury.edu, @colman-bashore, Middlebury College

* Corresponding author and creator

Abstract¶

This study is a reproduction of a Middlebury Geography introductory GIS lab problem titled "Exposure to Environmental Hazards: Hurricane Harvey." The original study compared levels of flooding across block groups with different majority demographics. It used the desktop GIS QGIS to determine the majority racial group in every block group in Harris County, Texas, and then compared these data to the extent of flooding from Hurricane Harvey. This reproduction study aims to reproduce the results of the original lab problem using a Python computational notebook (ipynb) instead of a desktop GIS workflow. The notebook may also serve as a demonstration of using Python to complete simple GIS problems in the context of an introductory Human Geography with GIS class.

Link to original study prompt

Study metadata¶

  • Key words: Comma-separated list of keywords (tags) for searchability. Geographers often use one or two keywords each for: theory, geographic context, and methods.
  • Subject: select from the BePress Taxonomy
  • Date created: November 23, 2023
  • Date modified: December 17, 2023
  • Spatial Coverage: Harris County, Texas OpenStreetMap Link
  • Spatial Resolution: Census Block Group Level
  • Spatial Reference System: EPSG:6587
  • Temporal Coverage: September 2017
  • Temporal Resolution: ACS 5 year estimates
  • Funding Name: Middlebury College
  • Funding Title: N/A
  • Award info URI: N/A
  • Award number: N/A

Original study spatio-temporal metadata¶

  • Spatial Coverage: Harris County, Texas OpenStreetMap Link
  • Spatial Resolution: Census Block Group Level
  • Spatial Reference System: EPSG:6587
  • Temporal Coverage: September 2017
  • Temporal Resolution: ACS 5 year estimates

Study design¶

The study is set up as an educational example. The primary research question is which regions, defined by majority racial/ethnic group, had the most flooding during Hurricane Harvey. The study uses a zonal statistic to calculate the amount of flooding in each region of Harris County. The original study is a QGIS workflow, and this reproduction uses an ipynb Python notebook.
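As a rough illustration of what a zonal statistic does, the sketch below counts flooded cells per zone on a tiny made-up grid. The arrays `zones` and `flood` are hypothetical and only stand in for the majority-group regions and the flood raster; NaN is used here as a stand-in for nodata.

```python
import numpy as np

# Hypothetical 4x4 grid: each cell belongs to zone 0 or zone 1.
zones = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
])

# Hypothetical flood raster: 1 = flooded, NaN = nodata (not flooded).
flood = np.array([
    [1, np.nan, np.nan, np.nan],
    [1, 1, np.nan, np.nan],
    [np.nan, 1, 1, np.nan],
    [np.nan, np.nan, np.nan, np.nan],
])

# Count the valid (flooded) cells in each zone, then all cells in each zone.
valid = ~np.isnan(flood)
flooded_cells = np.bincount(zones[valid], minlength=2)
total_cells = np.bincount(zones.ravel(), minlength=2)

# Percent of each zone that is flooded.
pct_flooded = flooded_cells / total_cells * 100
print(pct_flooded)
```

The notebook itself performs this step on polygons with `rasterstats.zonal_stats`; the grid version above is only meant to show the counting logic.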

Materials and procedure¶

Computational environment¶

Maintaining a reproducible computational environment requires some conscious choices in package management.

Please refer to requirements.txt for details.

In [ ]:
# Import modules, define directories

from pyhere import here
import pandas as pd
import geopandas as gpd
import folium
import matplotlib
import numpy as np
import rasterio
from rasterio.plot import show
import fiona
from rasterstats import zonal_stats
from matplotlib import pyplot as plt



# You can define your own shortcuts for file paths:
path = {
    "dscr": here("data", "scratch"),
    "drpub": here("data", "raw", "public"),
    "drpriv": here("data", "raw", "private"),
    "ddpub": here("data", "derived", "public"),
    "ddpriv": here("data", "derived", "private"),
    "rfig": here("results", "figures"),
    "roth": here("results", "other"),
    "rtab": here("results", "tables"),
    "dmet": here("data", "metadata")
}

Data and variables¶

Describe the data sources and variables to be used. Data sources may include plans for observing and recording primary data or descriptions of secondary data. For secondary data sources with numerous variables, the analysis plan authors may focus on documenting only the variables intended for use in the study.

Primary data sources for the study are to include ... . Secondary data sources for the study are to include ... .

Each of the next subsections describes one data source.

blockgroups.shp¶

  • Title: blockgroups.shp
  • Abstract: Shapefile containing the geometry and GEOID of every Census block group in Harris County, Texas.
  • Spatial Coverage: Harris County, Texas. OpenStreetMap Link
  • Spatial Resolution: Census Block Group
  • Spatial Reference System: EPSG: 6587
  • Temporal Coverage: N/A
  • Temporal Resolution: N/A
  • Lineage: United States Census Bureau delineation, gathered through US Census API https://www.census.gov/developers/
  • Distribution: Distributed publicly indefinitely by the US Census.
  • Constraints: None
  • Data Quality: Opening in a graphical GIS like QGIS and verifying existence of all block groups.
  • Variables: For each variable, enter the following information. If you have two or more variables per data source, you may want to present this information in table form (shown below)
    • Label: GEOID
    • Alias: GEOID
    • Definition: Unique identifier for each block group
    • Type: Integer
    • Accuracy: One per block group.
    • Domain: 482011000001 to 482019801001
    • Missing Data Value(s): N/A
    • Missing Data Frequency: None for GEOID

Other variables are not significant for this analysis.

Prior observation:

  • [x] metadata and descriptive statistics have been observed

Import blockgroups.shp¶
In [ ]:
blockgroups = gpd.read_file(here(path["drpub"], "blockgroups.shp"))

blockgroups = gpd.GeoDataFrame(blockgroups)

blockgroups.head()
Out[ ]:
STATEFP COUNTYFP TRACTCE BLKGRPCE AFFGEOID GEOID LSAD ALAND AWATER GEONAME geometry
0 48 201 311000 1 1500000US482013110001 482013110001 BG 616969.0 47009.0 Block Group 1, Census Tract 3110, Harris Count... POLYGON ((958404.137 4217699.295, 958413.144 4...
1 48 201 311000 4 1500000US482013110004 482013110004 BG 408595.0 25333.0 Block Group 4, Census Tract 3110, Harris Count... POLYGON ((957048.814 4217692.784, 957496.427 4...
2 48 201 311100 1 1500000US482013111001 482013111001 BG 1018525.0 213804.0 Block Group 1, Census Tract 3111, Harris Count... POLYGON ((958975.179 4217311.881, 958892.738 4...
3 48 201 311100 3 1500000US482013111003 482013111003 BG 484061.0 36045.0 Block Group 3, Census Tract 3111, Harris Count... POLYGON ((958773.280 4216120.376, 959779.865 4...
4 48 201 311100 4 1500000US482013111004 482013111004 BG 547376.0 0.0 Block Group 4, Census Tract 3111, Harris Count... POLYGON ((958756.220 4216618.402, 959659.603 4...
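The GEOID in the output above concatenates the state, county, tract, and block group codes shown in the neighboring columns. A minimal sketch of that structure follows; `split_geoid` is a hypothetical helper written for illustration and is not part of the notebook.

```python
def split_geoid(geoid: str) -> dict:
    """Split a 12-digit block group GEOID into its component codes."""
    if len(geoid) != 12 or not geoid.isdigit():
        raise ValueError(f"expected a 12-digit block group GEOID, got {geoid!r}")
    return {
        "STATEFP": geoid[:2],     # 2-digit state code
        "COUNTYFP": geoid[2:5],   # 3-digit county code
        "TRACTCE": geoid[5:11],   # 6-digit tract code
        "BLKGRPCE": geoid[11],    # 1-digit block group code
    }


# First row of the blockgroups table above.
parts = split_geoid("482013110001")
print(parts)
```

Treating GEOID as a string rather than an integer keeps this positional structure intact and avoids losing leading zeros in states whose codes start with 0.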

blockgroup_demographic_data.csv¶

  • Title: blockgroup_demographic_data.csv
  • Abstract: Data table of American Community Survey demographic data by Census Block groups for Harris County, Texas.
  • Spatial Coverage: Harris County, Texas. OpenStreetMap Link
  • Spatial Resolution: Census Block Groups
  • Spatial Reference System: None
  • Temporal Coverage: 2012-2017
  • Temporal Resolution: ACS 5-year estimates
  • Lineage: https://www.census.gov/programs-surveys/acs/guidance/estimates.html
  • Distribution: Distributed publicly indefinitely by the US Census.
  • Constraints: None
  • Data Quality: None
  • Variables: For each variable, enter the following information. If you have two or more variables per data source, you may want to present this information in table form (shown below)
    • Label: variable name as used in the data or code
    • Alias: intuitive natural language name
    • Definition: Short description or definition of the variable. Include measurement units in description.
    • Type: data type, e.g. character string, integer, real
    • Accuracy: e.g. uncertainty of measurements
    • Domain: Expected range of Maximum and Minimum of numerical data, or codes or categories of nominal data, or reference to a standard codebook
    • Missing Data Value(s): Values used to represent missing data and frequency of missing data observations
    • Missing Data Frequency: Frequency of missing data observations: not yet known for data to be collected
Label Alias Definition Type Accuracy Domain Missing Data Value(s) Missing Data Frequency
GEOID GEOID Unique identifier for each block group Integer N/A 482011000001 to 482019801001 N/A None
B03002_001 Total Population Number of people in the block group Integer See ACS 9-21758 N/A None
B03002_003 White Population Number of people in the White racial/ethnic group in the block group Integer See ACS 0-9199 N/A None
B03002_004 Black Population Number of people in the Black racial/ethnic group in the block group Integer See ACS 0-5258 N/A None
B03002_006 Asian Population Number of people in the Asian racial/ethnic group in the block group Integer See ACS 0-3418 N/A None
B03002_012 Latinx Population Number of people in the Latinx racial/ethnic group in the block group Integer See ACS 0-11408 N/A None
In [ ]:
blockgroup_demographic_data = pd.read_csv(here(path["drpub"],'blockgroup_demographic_data.csv'), dtype=str,encoding='latin-1')


blockgroup_demographic_data.head()
Out[ ]:
GEOID B03002_001 B03002_002 B03002_003 B03002_004 B03002_005 B03002_006 B03002_007 B03002_008 B03002_009 ... B03002_012 B03002_013 B03002_014 B03002_015 B03002_016 B03002_017 B03002_018 B03002_019 B03002_020 B03002_021
0 482013110001 583 0 0 0 0 0 0 0 0 ... 583 573 0 0 0 0 10 0 0 0
1 482013110004 1869 22 22 0 0 0 0 0 0 ... 1847 1818 0 0 0 0 29 0 0 0
2 482013111001 1046 11 11 0 0 0 0 0 0 ... 1035 895 4 0 0 0 136 0 0 0
3 482013111003 1639 112 112 0 0 0 0 0 0 ... 1527 1192 0 0 0 0 315 20 0 20
4 482013111004 1759 48 0 16 0 32 0 0 0 ... 1711 1476 11 7 0 0 173 44 44 0

5 rows × 22 columns

actual_flood_10.tif¶

  • Title: actual_flood_10.tif
  • Abstract: Raster image of Harris County where 1's represent flooded area from Hurricane Harvey and nodata (the raster equivalent of NULL) fills all other locations
  • Spatial Coverage: Harris County, Texas. OpenStreetMap Link
  • Spatial Resolution: 10 meter resolution
  • Spatial Reference System: EPSG: 6587
  • Temporal Coverage: September 2017
  • Temporal Resolution: Worth investigating more, a specific day after the flooding.
  • Lineage: I received this data from the GEOG 120 professors. The original sources are: The Flood Observatory https://floodobservatory.colorado.edu/ and the Harris County Flood Control District: https://www.hcfcd.org/Hurricane-Harvey. The steps taken from the original source to the version I received are unclear.
  • Distribution: This exact file may not be publicly available.
  • Constraints: Middlebury Course Material
  • Data Quality: Inspect in QGIS.
  • Variables: Only has one Band. 1 = Flooded. Nodata = Not flooded.
In [ ]:
actual_flood_10 = rasterio.open(here(path["drpub"],'actual_flood_10.tif'))

ax = show((actual_flood_10, 1))

Bias and threats to validity¶

One of the main geographic threats to validity in this study is the modifiable areal unit problem. The data sources in the original study are aggregated at the level of census block groups, and the unit of aggregation used for a study can dramatically change the output of an analysis. For example, a map of majority racial groups in Harris County aggregated at the block group level would be much more complex than a map at the tract level, but it would be less detailed, and might show different trends, than a map produced with data at the block level. The unit size in the original study is based on the scale of the available data, which often determines the unit used. However, we cannot be sure that the results of our analysis would be the same if we used a finer unit of analysis, so the unit of aggregation is a threat to validity in this study.

Another threat is the confusion of spatial and a-spatial causation. Flooding is often considered to depend exclusively on the physical geography of a place. This may lead to a lack of focus on variables such as emergency preparedness that may affect the extent of flooding.

A further threat to validity is the common assumption that all locations within a delineated region are the same. In the Harvey flooding study, we may see that a region with a predominantly white population has a lot of flooding, but we are not taking into account the distribution of population within that region. We cannot tell whether the flooding overlaps with the white population or whether other factors are at play, such as flooding in undeveloped wetlands within the region, which would not have a significant human geography impact.

Finally, there may be a boundary effect from the border of Harris County. The northwest border of Harris County is a river, so the lowland area surrounding it would be more likely to flood. By not extending the study area beyond the county, we may see higher flooded percentages in regions that touch this border.
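The modifiable areal unit problem can be illustrated with a small made-up example. The sketch below applies the 60% threshold used later in this notebook to two hypothetical units and then to their combination; the population counts and the `classify` helper are invented for illustration only.

```python
def classify(counts: dict, threshold: float = 60.0) -> str:
    """Return the group whose share meets the threshold, else 'Mixed'."""
    total = sum(counts.values())
    for group in ("Asian", "Black", "Latinx", "White"):
        if counts[group] / total * 100 >= threshold:
            return group
    return "Mixed"


# Two hypothetical fine-scale units (not real census data).
unit_a = {"Asian": 0, "Black": 0, "Latinx": 100, "White": 0}
unit_b = {"Asian": 0, "Black": 0, "Latinx": 10, "White": 90}

# The same population aggregated into one coarser unit.
combined = {group: unit_a[group] + unit_b[group] for group in unit_a}

fine_scale = [classify(unit_a), classify(unit_b)]
coarse_scale = classify(combined)

print(fine_scale)    # one Latinx unit and one White unit
print(coarse_scale)  # 110 of 200 is 55% Latinx, below the threshold
```

At the finer scale the two units fall in different majority groups, while the aggregated unit is classified as Mixed, so any flooding inside it would be attributed to a different region.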

Data transformations¶

Describe all data transformations planned to prepare data sources for analysis. This section should explain with the fullest detail possible how to transform data from the raw state at the time of acquisition or observation, to the pre-processed derived state ready for the main analysis. Including steps to check and mitigate sources of bias and threats to validity. The method may anticipate contingencies, e.g. tests for normality and alternative decisions to make based on the results of the test. More specifically, all the geographic and variable transformations required to prepare input data as described in the data and variables section above to match the study's spatio-temporal characteristics as described in the study metadata and study design sections. Visual workflow diagrams may help communicate the methodology in this section.

Examples of geographic transformations include coordinate system transformations, aggregation, disaggregation, spatial interpolation, distance calculations, zonal statistics, etc.

Examples of variable transformations include standardization, normalization, constructed variables, imputation, classification, etc.

Be sure to include any steps planned to exclude observations with missing or outlier data, to group observations by attribute or geographic criteria, or to impute missing data or apply spatial or temporal interpolation.

Goal 1: Load census data into block groups¶

In [ ]:
bg_merged = blockgroups.merge(blockgroup_demographic_data, on = "GEOID", how = "left")

bgData = gpd.GeoDataFrame(bg_merged)

bgData.rename(
  columns={
    'B03002_001': 'Total',
    'B03002_003' : 'White',
    'B03002_004' : 'Black',
    'B03002_006' : 'Asian',
    'B03002_012' : 'Latinx',
  },
  inplace=True
)

bgData  = bgData.drop(columns=['STATEFP', 'COUNTYFP', 'TRACTCE', 'BLKGRPCE', 'AFFGEOID','LSAD', 'ALAND', 'AWATER', 'GEONAME','B03002_002','B03002_005','B03002_007', 'B03002_008','B03002_009', 'B03002_010', 'B03002_011','B03002_013','B03002_014', 'B03002_015', 'B03002_016', 'B03002_017', 'B03002_018','B03002_019', 'B03002_020', 'B03002_021'])

# bgData = bgData.astype(
#      {'GEOID': 'int','Total': 'float', 'White': 'float', 'Black': 'float', 'Asian': 'float','Latinx': 'float'}).dtypes

bgData[['Total', 'White', 'Black', 'Asian','Latinx']] = bgData[['Total', 'White', 'Black', 'Asian','Latinx']].apply(pd.to_numeric)
print(bgData.dtypes)
#bgData.columns
#bgData.plot()
#print(bgData)
GEOID         object
geometry    geometry
Total          int64
White          int64
Black          int64
Asian          int64
Latinx         int64
dtype: object
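One way to check that a join like the one above matched every block group is to use the `validate` and `indicator` options of pandas `merge`. The sketch below uses small hypothetical tables, with one GEOID deliberately missing from the demographic data, rather than the notebook's actual files.

```python
import pandas as pd

# Hypothetical stand-ins for the block group and ACS tables.
# GEOID is kept as a string in both, matching dtype=str in the notebook.
geoms = pd.DataFrame(
    {"GEOID": ["482013110001", "482013110004", "482013111001"]}
)
acs = pd.DataFrame(
    {"GEOID": ["482013110001", "482013110004"], "B03002_001": ["583", "1869"]}
)

# validate raises an error if GEOID is duplicated on either side;
# indicator adds a _merge column recording where each row was found.
merged = geoms.merge(
    acs, on="GEOID", how="left", validate="one_to_one", indicator=True
)

# Block groups with geometry but no demographic record.
unmatched = merged.loc[merged["_merge"] == "left_only", "GEOID"].tolist()
print(unmatched)
```

An empty `unmatched` list would suggest that every block group received demographic data; any GEOIDs listed would carry missing values into the percentage calculations.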

Goal 2: Create regions by majority groups¶

In [ ]:
## Calculate percentages of each majority group


bgData["pctAsian"] = bgData.Asian / bgData.Total * 100
bgData["pctBlack"] = bgData.Black / bgData.Total * 100
bgData["pctLatinx"] = bgData.Latinx / bgData.Total * 100
bgData["pctWhite"] = bgData.White / bgData.Total * 100

bgData.plot(column='pctLatinx', legend=True)
Out[ ]:
<Axes: >
In [ ]:
## Create majority group field

def assign_major_group(row):
    if row['pctAsian'] >= 60:
        return 'Asian'
    elif row['pctBlack'] >= 60:
        return 'Black'
    elif row['pctLatinx'] >= 60:
        return 'Latinx'
    elif row['pctWhite'] >= 60:
        return 'White'
    else:
        return 'Mixed'

bgData['majorGrp'] = bgData.apply(assign_major_group, axis=1)

bgData.head(10)
Out[ ]:
GEOID geometry Total White Black Asian Latinx pctAsian pctBlack pctLatinx pctWhite majorGrp
0 482013110001 POLYGON ((958404.137 4217699.295, 958413.144 4... 583 0 0 0 583 0.000000 0.000000 100.000000 0.000000 Latinx
1 482013110004 POLYGON ((957048.814 4217692.784, 957496.427 4... 1869 22 0 0 1847 0.000000 0.000000 98.822900 1.177100 Latinx
2 482013111001 POLYGON ((958975.179 4217311.881, 958892.738 4... 1046 11 0 0 1035 0.000000 0.000000 98.948375 1.051625 Latinx
3 482013111003 POLYGON ((958773.280 4216120.376, 959779.865 4... 1639 112 0 0 1527 0.000000 0.000000 93.166565 6.833435 Latinx
4 482013111004 POLYGON ((958756.220 4216618.402, 959659.603 4... 1759 0 16 32 1711 1.819215 0.909608 97.271177 0.000000 Latinx
5 482013131001 POLYGON ((948490.088 4214006.390, 948838.150 4... 2744 1567 347 434 375 15.816327 12.645773 13.666181 57.106414 Mixed
6 482013111002 POLYGON ((958538.363 4215935.056, 958611.621 4... 1092 7 0 0 1085 0.000000 0.000000 99.358974 0.641026 Latinx
7 482013131002 POLYGON ((947852.561 4213019.383, 947966.900 4... 652 304 85 119 88 18.251534 13.036810 13.496933 46.625767 Mixed
8 482015431001 POLYGON ((896848.263 4250165.408, 897077.183 4... 3498 1442 322 19 1490 0.543168 9.205260 42.595769 41.223556 Mixed
9 482015502001 POLYGON ((943861.867 4241534.858, 944139.303 4... 2256 143 1368 0 745 0.000000 60.638298 33.023050 6.338652 Black
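The row-by-row classification above could also be written in vectorized form. The following sketch uses `numpy.select` on three rows copied from the output above (rows 0, 5, and 9); it is an alternative formulation and not the notebook's method.

```python
import numpy as np
import pandas as pd

# Percentages for rows 0, 5, and 9 of the table above.
pct = pd.DataFrame({
    "pctAsian": [0.0, 15.816327, 0.0],
    "pctBlack": [0.0, 12.645773, 60.638298],
    "pctLatinx": [100.0, 13.666181, 33.023050],
    "pctWhite": [0.0, 57.106414, 6.338652],
})

groups = ["Asian", "Black", "Latinx", "White"]

# One boolean condition per group, in the same order as the if/elif chain.
conditions = [(pct[f"pct{group}"] >= 60).to_numpy() for group in groups]

# The first condition that is true wins; rows matching none become 'Mixed'.
pct["majorGrp"] = np.select(conditions, groups, default="Mixed")
print(pct["majorGrp"].tolist())
```

Because at most one group can reach 60% of a block group's population, the order of the conditions does not affect the result here.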
In [ ]:
## Group by majority groups and dissolve geometry
# Counter column: summing it gives the number of block groups per region
bgData['blockGroups'] = 1
#bgData['majorGrp2'] = bgData.majorGrp
# Step 1: Sum the counts within each majority group
group_sums = bgData.groupby('majorGrp')[['blockGroups', 'Total', 'White', 'Black', 'Asian', 'Latinx']].sum().reset_index()

# Step 2: Merge the sum information with the original GeoDataFrame
# Shared column names get suffixes: _x = block group value, _y = group sum
bgDataWithSums = pd.merge(bgData, group_sums, on='majorGrp', how = 'inner')

# Step 3: Dissolve based on 'majorGrp' and calculate the sum
# Note: sums of the pct* columns and of the _y columns are not meaningful
dissolved = bgDataWithSums.dissolve('majorGrp', aggfunc='sum')

# Step 4: Create a new GeoDataFrame with the dissolved result
major_grps = gpd.GeoDataFrame(dissolved)

major_grps = major_grps.drop(columns=['blockGroups_y','Total_y','White_y','Black_y','Asian_y','Latinx_y'])

major_grps.rename(
  columns={
    'blockGroups_x': 'blockGrps',
    'Total_x' : 'Total',
    'White_x' : 'White',
    'Black_x' : 'Black',
    'Asian_x' : 'Asian',
    'Latinx_x' : 'Latinx'
  },
  inplace=True
)

major_grps.head()

#bgData.plot(column='majorGrp', legend=True)
/Users/colmanbashore/anaconda3/envs/flooding/lib/python3.9/site-packages/geopandas/geodataframe.py:1676: FutureWarning: The default value of numeric_only in DataFrameGroupBy.sum is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
  aggregated_data = data.groupby(**groupby_kwargs).agg(aggfunc)
Out[ ]:
geometry Total White Black Asian Latinx pctAsian pctBlack pctLatinx pctWhite blockGrps
majorGrp
Asian MULTIPOLYGON (((926751.990 4209490.058, 926783... 3814 557 157 2431 637 192.226138 16.971301 47.205665 41.411103 3
Black MULTIPOLYGON (((949698.842 4206031.461, 949639... 264578 10961 199801 3261 47251 180.146080 13451.646456 3023.093058 650.237696 175
Latinx MULTIPOLYGON (((965411.937 4198913.656, 965055... 1219893 112180 109390 27012 962544 1195.685395 4933.749778 51520.813364 6210.126096 643
Mixed MULTIPOLYGON (((970393.326 4194192.233, 969688... 2213832 661709 497957 224801 777050 8318.759023 19525.925998 30700.430404 25805.788462 864
White MULTIPOLYGON (((978544.008 4193972.941, 978565... 823402 601169 30980 49604 123053 2669.432784 1483.902465 6311.791195 34404.114891 459
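The pct columns in the output above are sums of block group percentages, so they do not describe the dissolved regions. If region-level percentages were wanted, they could be recomputed from the summed counts, as in this sketch that uses the Total and group counts from the table above.

```python
# Summed counts for each single-group region, copied from the table above.
region_counts = {
    "Asian": {"Total": 3814, "Asian": 2431},
    "Black": {"Total": 264578, "Black": 199801},
    "Latinx": {"Total": 1219893, "Latinx": 962544},
    "White": {"Total": 823402, "White": 601169},
}

# Share of each region's population that belongs to its majority group.
region_share = {
    name: round(counts[name] / counts["Total"] * 100, 2)
    for name, counts in region_counts.items()
}

for name, share in region_share.items():
    print(f"{name}: {share}% of the region's population")
```

Each share is at least 60%, which is expected because every block group in a region met the 60% threshold individually.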

Analysis¶

Describe the methods of analysis that will directly test the hypotheses or provide results to answer the research questions. This section should explicitly define any spatial / statistical models and their parameters, including grouping criteria, weighting criteria, and significance thresholds. Also explain any follow-up analyses or validations.

Goal 3: Find flooded area in each group and calculate pct¶

In [ ]:
# Zonal Statistics

# Filter out invalid geometries
major_grps = major_grps[major_grps.geometry.is_valid]

major_grps.to_file(here(path["ddpub"],'major_grps.shp'), driver='ESRI Shapefile')


# Specify the path to the raster file
raster_file = here(path["drpub"],'actual_flood_10.tif')

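# Read raster data using rasterio
with rasterio.open(raster_file) as src:
    # Get raster values as a NumPy array
    # (not used by zonal_stats below, which reads the raster file itself)
    raster_data = src.read(1)


# Count the valid (non-nodata) cells in each region. Flooded cells are 1 and
# all other cells are nodata, so the count is the number of flooded cells.
# all_touched=True counts every cell touched by a region's geometry.
with fiona.open(here(path["ddpub"],'major_grps.shp')) as src:
    zs = zonal_stats(src, raster_file, stats="count", all_touched=True)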

    
print(zs)

flood_pixels = [value for dictionary in zs for value in dictionary.values()]

bgMajorFld = gpd.read_file(here(path["ddpub"],'major_grps.shp'))

bgMajorFld['fl_count'] = flood_pixels

# Display the GeoDataFrame with the new "flood" column
print(bgMajorFld['fl_count'])
/Users/colmanbashore/anaconda3/envs/flooding/lib/python3.9/site-packages/rasterstats/main.py:151: ShapelyDeprecationWarning: The 'type' attribute is deprecated, and will be removed in the future. You can use the 'geom_type' attribute instead.
  if 'Point' in geom.type:
[{'count': 2165}, {'count': 645888}, {'count': 3181885}, {'count': 11415356}, {'count': 4784908}]
0        2165
1      645888
2     3181885
3    11415356
4     4784908
Name: fl_count, dtype: int64
In [ ]:
# Field Calculator Flooded Area
bgMajorFld['flArea'] = bgMajorFld.fl_count * 10 * 10
bgMajorFld['totArea'] = (bgMajorFld.geometry.area).round(2)
bgMajorFld['pctFlood'] = (bgMajorFld.flArea / bgMajorFld.totArea * 100).round(2)

bgMajorFld.head()

columns_to_export = ['majorGrp', 'blockGrps','flArea', 'totArea', 'pctFlood']

bgMajorFld[columns_to_export].to_csv(here(path["rtab"],'MajGrpFld.csv'), index=False)

bgMajorFld.to_csv(here(path["ddpub"],'bgMajorFld.csv'), index=False)
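The field calculations above can be checked by hand. The sketch below repeats the arithmetic in plain Python using the cell counts printed earlier and the total areas reported in the final table, assuming 10 m by 10 m cells as stated in the raster metadata.

```python
CELL_SIZE_M = 10  # raster resolution in meters, per the metadata above

# Flooded cell counts from the zonal statistics output.
fl_count = {
    "Asian": 2165,
    "Black": 645888,
    "Latinx": 3181885,
    "Mixed": 11415356,
    "White": 4784908,
}

# Region areas in square meters, as reported in the final table.
tot_area = {
    "Asian": 952855.79,
    "Black": 206749828.71,
    "Latinx": 789346589.94,
    "Mixed": 2477275416.41,
    "White": 1114263551.36,
}

# Flooded area = number of flooded cells x area of one cell.
fl_area = {group: count * CELL_SIZE_M ** 2 for group, count in fl_count.items()}

# Percent flooded = flooded area / total area x 100.
pct_flood = {
    group: round(fl_area[group] / tot_area[group] * 100, 2) for group in fl_count
}
print(pct_flood)
```

Because `all_touched=True` counts every cell that touches a region, cells along shared boundaries may be counted in more than one region, so the flooded areas may be slight overestimates.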

Results¶

The final results for this study are the Final Table which shows number of block groups, flooded area, total area, and percent flooded in each majority group region. The other results are a map of percent flooding by each region, and maps of the majority group regions and the flooding.

In [ ]:
# Final Table
Final_Table = pd.read_csv(here(path["rtab"],'MajGrpFld.csv'), dtype=str,encoding='latin-1')
Final_Table.head()
Out[ ]:
majorGrp blockGrps flArea totArea pctFlood
0 Asian 3 216500 952855.79 22.72
1 Black 175 64588800 206749828.71 31.24
2 Latinx 643 318188500 789346589.94 40.31
3 Mixed 864 1141535600 2477275416.41 46.08
4 White 459 478490800 1114263551.36 42.94

The final table shows that most of the flooding occurred in the Mixed region, which has no dominant racial/ethnic group and also had the highest percentage of area flooded. However, this region also had the largest number of block groups and the largest total area. Among the regions with a dominant group, White and Latinx had the highest percentages of area flooded. This final table should be close to the final table from the original study. The only differences were in the exact flooded area numbers, which were calculated using different zonal statistics tools. The final flood percentages were very close to the originals.
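To put the comparison in the paragraph above into numbers, the sketch below computes each region's share of the total flooded area from the flArea column of the final table. This share is an extra summary that does not appear in the original lab deliverables.

```python
# Flooded area in square meters, from the flArea column of the final table.
fl_area = {
    "Asian": 216500,
    "Black": 64588800,
    "Latinx": 318188500,
    "Mixed": 1141535600,
    "White": 478490800,
}

total_flooded = sum(fl_area.values())

# Each region's share of all flooded area in the table, in percent.
share = {
    group: round(area / total_flooded * 100, 1) for group, area in fl_area.items()
}

# Regions ordered from largest to smallest share.
ranked = sorted(share, key=share.get, reverse=True)
for group in ranked:
    print(f"{group}: {share[group]}% of flooded area")
```

The Mixed region accounts for more than half of the flooded area in the table, which reflects its large total area as well as its high percentage flooded.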

In [ ]:
# Final Map
Final_Map = bgMajorFld.plot(column = 'pctFlood', legend = True, cmap = 'Blues', scheme = 'FisherJenks')
print("Percent Flooded by Block Groups")

plt.savefig(here(path["rfig"],'bgMajorFld.png'))
Percent Flooded by Block Groups
/Users/colmanbashore/anaconda3/envs/flooding/lib/python3.9/site-packages/mapclassify/classifiers.py:1860: UserWarning: Numba not installed. Using slow pure python version.
  warnings.warn(

Percent Flooded by Block Groups¶

This map is designed to match the final map from the original study. It shows the percentage flooded in each majority group. Besides small stylistic elements it matches the original.

In [ ]:
# Map of Majority Groups
bgMajorFld.plot(column='majorGrp', legend = True)
plt.savefig(here(path["rfig"],'majorGrps.png'))

Majority Groups in Harris County, Texas¶

In [ ]:
# Map of Flooding Extent

with rasterio.open(here(path["drpub"],'actual_flood_10.tif')) as src:
    # Read the raster data
    raster_data = src.read(1)

    # Plot the raster image
    plt.imshow(raster_data, cmap='Blues')
    #plt.colorbar(label='Pixel Values')
    plt.title('Flooding Extent')

    # Save the plot as a PNG file
    plt.savefig(here(path["rfig"],'actualFlood.png'))

    # Show the plot (optional)
    plt.show()

Flooding extent from Hurricane Harvey in Harris County, Texas¶

The previous two maps are designed to fill the place of another map deliverable from the original study. The original study produced a map of majority groups with a flooding layer on top. This reproduction was not able to replicate this specific figure but the two previous figures show the same layers.

Discussion¶

The success of both the original study and this reproduction is challenging to evaluate given the design of the study as an educational problem. The scientific research question concerns the environmental justice and spatial patterns of flooding from Hurricane Harvey. The reproduction found the same patterns and similar percentages flooded as the original study. Both studies found the highest percentage of flooding in Mixed regions, followed by White and Latinx regions. In this regard, the reproduction was a success. The main goal was to translate the existing QGIS workflow into Python code so that this notebook could be used as a learning example in future courses teaching Python as a GIS tool. This reproduction successfully translated each data transformation and analysis step into reproducible Python code. The visualization of results is not entirely complete, and further reproductions of this study could improve it through visualization and cartography in Python. However, this computational notebook successfully implements a simple GIS problem in a reproducible way and can serve as a building block for future Python for GIS educational projects.

Integrity Statement¶

The authors of this preregistration state that they completed this preregistration to the best of their knowledge and that no other preregistration exists pertaining to the same hypotheses and research.

Acknowledgements¶

This project was made possible as part of the Middlebury Geography course Open Source GIScience taught by Professor Joseph Holler.

This report is based upon the template for Reproducible and Replicable Research in Human-Environment and Geographical Sciences, DOI:10.17605/OSF.IO/W29MQ

References¶

Middlebury Geography Department