Gaza Mortality Survey — Replication Materials

This repository contains the data and computer code for the paper:

Violent and non-violent death tolls for the Gaza conflict: new primary evidence from a population-representative field survey
Michael Spagat, Jon Pedersen, Khalil Shikaki, Michael Robbins, Eran Bendavid, Håvard Hegre, and Debarati Guha-Sapir
The Lancet Global Health, published February 18, 2026
DOI: 10.1016/S2214-109X(25)00522-4

The main data were collected in the Gaza Strip between December 30, 2024 and January 5, 2025 in what we have called the Gaza Mortality Survey (GMS).

Note: Earlier versions of this repository contained additional files that are not part of the final published analysis. This repository has been updated to contain only the files needed to replicate the published results.


Files in this Repository

Data Files

GMS Household roster.sav — Individual-level household roster data (9,729 individuals). This is the main analysis file. Key variables:

  • prikey — household identifier
  • HR00 — governorate of residence (1=North Gaza, 2=Gaza City, 3=Deir al-Balah, 4=Khan Younis, 5=Rafah)
  • HR01 — person number within household
  • HR03 — age in years
  • HR04 — sex (1=Male, 2=Female)
  • HR05 — status (1=Resident, 2=Left Gaza, 3=Elsewhere in Gaza, 4=Dead, 5=Missing, 6=Imprisoned)
  • HR06 — cause of death (1=Disease with medical aid, 2=Disease without medical aid, 3=Accident, 4=Violent, 5=Unknown)
  • HR07 — age at start of war
  • PSUID, PSUType — sampling unit identifiers
  • Interviewer — interview team identifier (gaza1–gaza10)
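As a minimal sketch of how the numeric codes above might be turned into labelled factors in R, the helper below maps HR04, HR05 and HR06 to the labels in the variable list. The function name `label_roster` and the toy data frame are illustrative assumptions and are not part of the repository code; the real file would be read with `haven::read_sav()`.

```r
# Code-to-label mappings copied from the variable list above.
sex_labels    <- c("Male", "Female")
status_labels <- c("Resident", "Left Gaza", "Elsewhere in Gaza",
                   "Dead", "Missing", "Imprisoned")
cause_labels  <- c("Disease with medical aid", "Disease without medical aid",
                   "Accident", "Violent", "Unknown")

# Hypothetical helper: adds labelled factor columns next to the numeric codes.
label_roster <- function(roster) {
  roster$sex    <- factor(as.numeric(roster$HR04), levels = 1:2, labels = sex_labels)
  roster$status <- factor(as.numeric(roster$HR05), levels = 1:6, labels = status_labels)
  roster$cause  <- factor(as.numeric(roster$HR06), levels = 1:5, labels = cause_labels)
  roster
}

# Toy rows standing in for the real roster (all values are invented).
toy <- data.frame(prikey = c(1, 1, 2), HR04 = c(1, 2, 1),
                  HR05 = c(1, 4, 4), HR06 = c(NA, 4, 1))
toy_labelled <- label_roster(toy)

# Count deaths by cause among individuals whose status is "Dead".
deaths_by_cause <- table(toy_labelled$cause[toy_labelled$status == "Dead"])
print(deaths_by_cause)

# With the real data (assumes the file is in the working directory):
# roster <- label_roster(haven::read_sav("GMS Household roster.sav"))
```

These are unweighted counts; the published estimates come from the weighted analysis in the main analysis file.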

GMS Births.sav — Births occurring during the survey period (357 births). Same variable structure as the household roster, with HR03=0 for all births.

GMS_Respondents_anonymised.sav — Interview-level respondents file (2,000 interviews). This file was used for field monitoring and quality control but is not required to replicate the published analysis. Key variables include interview timing (start_h, start_m, end_h, end_m), PSU identifiers (d01, d05_1), supervisor attendance (d002), GPS coordinates (x1), and governorate of origin (HR00). Two columns present in the original field file — interviewer first names (d001) and supervisor names (d0001) — have been removed to protect field staff working in an active conflict zone.
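For readers who want to use the timing variables, the sketch below computes interview length in minutes from start_h, start_m, end_h and end_m. It assumes a 24-hour clock and that each interview starts and ends on the same day; neither assumption is confirmed by the file description. The helper name `interview_minutes` and the toy values are hypothetical.

```r
# Hypothetical helper: interview length in minutes from hour/minute fields.
# Assumes a 24-hour clock and no interview running past midnight.
interview_minutes <- function(start_h, start_m, end_h, end_m) {
  (end_h * 60 + end_m) - (start_h * 60 + start_m)
}

# Toy rows standing in for the respondents file (values are invented).
toy_interviews <- data.frame(start_h = c(9, 13), start_m = c(15, 50),
                             end_h   = c(9, 14), end_m   = c(47, 20))
toy_interviews$duration <- with(
  toy_interviews,
  interview_minutes(start_h, start_m, end_h, end_m)
)
print(toy_interviews$duration)  # 32 30
```

Negative or very large durations would point to entry errors or a violated assumption rather than real interview lengths.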

Population Gaza Single year age groups IDB_2023.xlsx — Gaza population by single year of age and sex for 2023, downloaded from the US Census Bureau International Data Base (https://www.census.gov/data-tools/demo/idb/). Two edits were made from the original download: "100+" was changed to "100" in the age group column, and the population column names were simplified to Total, Male, and Female.
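The first edit can be reproduced on a fresh download with a one-line transformation. The sketch below is illustrative only: the helper name `clean_age` is hypothetical, and the name of the age column in the raw download is not stated here, so the example works on a plain character vector.

```r
# Hypothetical helper: turn IDB-style age labels into integers,
# treating the open-ended top group "100+" as 100.
clean_age <- function(age_label) {
  as.integer(sub("\\+$", "", trimws(as.character(age_label))))
}

raw_labels <- c("0", "1", "99", "100+")
ages <- clean_age(raw_labels)
print(ages)  # 0 1 99 100

# The posted file already contains these edits and can be read directly
# (assumes the file is in the working directory):
# pop <- readxl::read_excel("Population Gaza Single year age groups IDB_2023.xlsx")
```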

Code Files

Gaza_Strip_Mortality_RMD_fixed.rmd — The main analysis file producing all primary estimates in the paper. Run this file in RStudio to replicate the main results. Produces all tables and the sensitivity analysis plot.

Combined various calculations.R — Code for descriptive statistics and supplementary calculations referenced in the paper, including sample demographic tables, infant birth analysis, and raking quality checks. Note: this file currently reads from earlier versions of the data files and will require updating to work with the current data files.

power_calculations_vary_deaths_by_household.R — Sample size calculations conducted before the survey.


Replicating the Main Results

Requirements

  • R
  • R packages: tidyverse, haven, survey, gt, webshot2, readxl
  • RStudio (recommended)

Instructions

  1. Place all data files and Gaza_Strip_Mortality_RMD_fixed.rmd in the same working directory.
  2. Open Gaza_Strip_Mortality_RMD_fixed.rmd in RStudio and set that directory as your working directory.
  3. Click "Knit" or run the chunks sequentially.
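The same steps can be run from the R console. The sketch below checks which of the required packages are missing and shows, in comments, how the file could be knitted with `rmarkdown::render()`. The helper name `missing_packages` is an assumption for illustration, and the render call assumes the rmarkdown package is installed.

```r
required <- c("tidyverse", "haven", "survey", "gt", "webshot2", "readxl")

# Hypothetical helper: which of the given packages are not yet installed?
missing_packages <- function(pkgs) {
  pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Not installed: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
}

# Knit from the console instead of clicking "Knit"
# (assumes the working directory holds the data files and the .rmd file):
# rmarkdown::render("Gaza_Strip_Mortality_RMD_fixed.rmd")
```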

Expected Output

The main model produces results matching the published figures exactly, including:

  • 75,200 violent deaths (95% CI 63,600–86,800)
  • 16,300 non-violent deaths (95% CI 12,300–20,200)
  • 8,540 excess non-violent deaths (95% CI 4,540–12,500)
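The figures above all appear to be rounded to three significant figures, so a replication check needs to round before comparing. The sketch below shows one way to do that; the rounding convention is inferred from the printed numbers, and the helper name `matches_published` and the unrounded values are invented for illustration.

```r
# Published point estimates and confidence limits, as listed above.
published <- c(violent = 75200, violent_lo = 63600, violent_hi = 86800,
               nonviolent = 16300, nonviolent_lo = 12300, nonviolent_hi = 20200,
               excess = 8540, excess_lo = 4540, excess_hi = 12500)

# Hypothetical helper: TRUE where a replicated value rounds to the published
# one. Three significant figures is an assumed convention.
matches_published <- function(replicated, published, digits = 3) {
  abs(signif(replicated, digits) - published) < 1e-6
}

# Invented unrounded values standing in for knitted output.
replicated <- c(violent = 75213.4, excess = 8538.9)
print(matches_published(replicated, published[names(replicated)]))
```

A `FALSE` for any element would suggest a difference in data files, package versions or random seeds worth investigating.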

Data Notes

Privacy: First names (variable HR02 in the original data collection instrument) have been removed from all posted files. An internal interviewer logistics variable (GOVINT) has also been removed as it is not needed for replication. The respondents file (GMS_Respondents_anonymised.sav) has additionally had interviewer and supervisor name fields removed; see above.

GPS coordinates: The x1 variable in the respondents file contains GPS coordinates recorded during fieldwork for team safety monitoring and approximate verification that teams were operating in the correct area. These coordinates are indicative only and should not be treated as precise location measurements. Detailed spatial analysis based on these coordinates is not appropriate. As an illustration of their imprecision, eight coordinates from Day 1 of fieldwork place interviews outside Gaza entirely, near the Jordanian border — an artefact of GPS signal disruption under difficult field conditions rather than any data quality issue.

Births module anomalies: The births file contains eight records with impossible characteristics: births attributed to male household members or to women above reproductive age. All eight are from team gaza3 and appear to reflect data entry errors in the births module, which was a secondary component of the questionnaire. These errors have no effect on the mortality estimates, which are based entirely on the household roster module. The affected household identifiers (prikey) are: 10, 20, 367, 717, 740, 1007, 1015, 1319, 1332. Researchers reanalysing the births data may wish to exclude or correct these records.
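A minimal sketch of the exclusion option follows. It drops every births record from a listed household, which is coarser than removing only the anomalous records if a household reported more than one birth. Note that the list as given contains nine identifiers. The helper name `drop_flagged_births` and the toy data are hypothetical.

```r
# Household identifiers copied from the list above.
flagged_prikey <- c(10, 20, 367, 717, 740, 1007, 1015, 1319, 1332)

# Hypothetical helper: drop every births record from a flagged household.
drop_flagged_births <- function(births, flagged = flagged_prikey) {
  births[!(births$prikey %in% flagged), , drop = FALSE]
}

# Toy rows standing in for the births file (values are invented).
toy_births <- data.frame(prikey = c(5, 10, 367, 400), HR03 = 0)
clean_births <- drop_flagged_births(toy_births)
print(clean_births$prikey)  # 5 400

# With the real data (assumes the file is in the working directory):
# births <- drop_flagged_births(haven::read_sav("GMS Births.sav"))
```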

Missing ages: Five individuals in the roster have missing values for HR03 (age). These rows are dropped before raking and have a negligible effect on the estimates.
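The drop can be sketched as below. This is an illustration of the step described, not the code used in the main analysis file; the helper name `drop_missing_age` and the toy data are assumptions.

```r
# Hypothetical helper: keep only roster rows with a recorded age (HR03).
drop_missing_age <- function(roster) {
  roster[!is.na(roster$HR03), , drop = FALSE]
}

# Toy rows standing in for the household roster (values are invented).
toy_roster <- data.frame(prikey = 1:4, HR03 = c(34, NA, 7, 61))
complete <- drop_missing_age(toy_roster)
cat("Dropped", nrow(toy_roster) - nrow(complete), "row(s) with missing age\n")
```

Applied to the real roster, the count of dropped rows should equal five if the description above holds.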


Citation

Spagat M, Pedersen J, Shikaki K, Robbins M, Bendavid E, Hegre H, Guha-Sapir D. Violent and non-violent death tolls for the Gaza conflict: new primary evidence from a population-representative field survey. The Lancet Global Health. 2026. DOI: 10.1016/S2214-109X(25)00522-4