Tornado Analysis Dataset Schema

Dataset Schema

This document describes the data structure used in the Tornado Death Risk Analysis.

Source Data Structure

The analysis uses a CSV-formatted dataset with the following columns:

| Column Name | Data Type | Description |
|-------------|-----------|-------------|
| Year | Integer | The calendar year of the record (1900-2024). |
| Deaths | Integer | Total confirmed tornado-related fatalities in the United States for the given year. |
| PopFactor | Float | Estimated population (in millions) across 25 tornado-prone states*. |
| Era | String | Historical classification of the period: 'Pre-Radar', 'Warning Era', or 'Modern Era'. |
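
As an illustration of the schema above, the following minimal sketch reads rows with these four columns and checks each value against its documented type and range. The function names (`parse_row`, `load_records`), the header row, and the sample rows are assumptions made for this example; the sample values are placeholders, not real figures, and the Era assigned to each placeholder year is arbitrary.

```python
import csv
import io

# The three Era labels listed in the schema.
VALID_ERAS = {"Pre-Radar", "Warning Era", "Modern Era"}

# Placeholder rows for illustration only; these are NOT real data.
SAMPLE_CSV = """Year,Deaths,PopFactor,Era
1925,100,50.0,Pre-Radar
1975,40,100.0,Warning Era
2020,20,200.0,Modern Era
"""


def parse_row(row):
    """Convert one CSV row (a dict of strings) to typed values and validate it."""
    record = {
        "Year": int(row["Year"]),
        "Deaths": int(row["Deaths"]),
        "PopFactor": float(row["PopFactor"]),
        "Era": row["Era"].strip(),
    }
    if not 1900 <= record["Year"] <= 2024:
        raise ValueError(f"Year out of range: {record['Year']}")
    if record["Deaths"] < 0:
        raise ValueError(f"Negative Deaths in {record['Year']}")
    if record["PopFactor"] <= 0:
        raise ValueError(f"Non-positive PopFactor in {record['Year']}")
    if record["Era"] not in VALID_ERAS:
        raise ValueError(f"Unknown Era: {record['Era']!r}")
    return record


def load_records(handle):
    """Read every row from an open CSV file object with a header line."""
    return [parse_row(row) for row in csv.DictReader(handle)]


records = load_records(io.StringIO(SAMPLE_CSV))
print(len(records))  # 3
```

To read an actual file, pass an open file handle to `load_records` in place of the `io.StringIO` wrapper.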

Derived Fields

The analysis script calculates the following additional metric:

| Column Name | Data Type | Formula | Description |
|-------------|-----------|---------|-------------|
| DeathRate | Float | Deaths / PopFactor | Population-adjusted mortality rate (deaths per million people in tornado-prone regions). |
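
The DeathRate formula can be sketched as follows. This is not the analysis script itself: the helper names `death_rate` and `add_death_rate` are hypothetical, and the example record uses placeholder values rather than real data. Because PopFactor is expressed in millions, the quotient is already in deaths per million people.

```python
def death_rate(deaths, pop_factor):
    """Return Deaths / PopFactor, i.e. deaths per million people."""
    if pop_factor <= 0:
        # Guard against division by zero or a nonsensical population value.
        raise ValueError("PopFactor must be positive")
    return deaths / pop_factor


def add_death_rate(records):
    """Return copies of the records with a DeathRate field added."""
    return [
        {**r, "DeathRate": death_rate(r["Deaths"], r["PopFactor"])}
        for r in records
    ]


# Placeholder record for illustration only; NOT real data.
example = [{"Year": 2000, "Deaths": 50, "PopFactor": 125.0, "Era": "Modern Era"}]
enriched = add_death_rate(example)
print(enriched[0]["DeathRate"])  # 0.4
```

Returning new dictionaries instead of mutating the input keeps the source rows unchanged, which makes it easy to compare raw and derived values.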

*Tornado-Prone States

The PopFactor includes population estimates for: AL, AR, FL, GA, IA, IL, IN, KS, KY, LA, MI, MN, MO, MS, NC, ND, NE, OH, OK, SC, SD, TN, TX, VA, WI.

Data Sources