No API? No Problem - Mapping Coordinates to ZIP Codes for Free

Assigning ZIP Codes to coordinates without paying for API access
Python
Geospatial
Published

July 20, 2025

ZIP Codes are widely used in analytics because they serve as a convenient proxy for geographic location, even though they are not true areal units. In the United States, ZIP Codes are created and managed by the United States Postal Service for the purpose of delivering mail. They represent collections of delivery routes (not enclosed polygons), and because these routes can change over time and don’t necessarily align with geographic or administrative boundaries, ZIP Codes do not have precise, fixed geographic boundaries.

The US Census Bureau releases ZIP Code Tabulation Area (ZCTA) shapefiles, which are generalized areal representations of the geographic extent and distribution of point-based ZIP Codes built using census tabulation blocks. Note however, ZCTAs are not the same as ZIP Codes, and not all valid ZIP Codes are represented by a ZCTA.

Despite these shortcomings, ZIP Codes can still reflect meaningful differences in building stock, emergency services and environmental hazards. They are also typically small and granular enough to allow for regional differentiation, and large enough to avoid some of the volatility seen at finer geographic levels like census block. Additionally, ZIP Codes are commonly used by regulators and in industry datasets, which makes them a practical choice for pricing, underwriting and reporting.

In this article, I’ll walk through two approaches that can be used to assign 5-digit ZIP Code to a sample collection of coordinate pairs without having to resort to a paid API.


OpenAddresses

OpenAddresses is an open data project that provides free, worldwide address data collected from government sources, open datasets, and community contributions. It includes individual address records containing street names, cities, ZIP Codes, longitude and latitude and sometimes administrative areas. The data is organized by country and region and is regularly updated. It is especially useful for geocoding, reverse geocoding, and location enrichment tasks where access to millions of structured address points is required.

For US locations, the data are divided into regions. For this demonstration, our goal is to assign a single zip code (not ZCTA) to 10 longitude latitude pairs falling somewhere in Iowa, so we’ll download us_midwest.zip, which is ~1GB.

Within the zip archive, we are interested in the statewide.csv files. For Iowa, the exact path would be something like …/us/ia/statewide.csv. We read this file into a Pandas DataFrame and inspect the first frew records:

%load_ext watermark

%watermark --python --conda --hostname --machine --iversions
Python implementation: CPython
Python version       : 3.12.11
IPython version      : 9.4.0

conda environment: osm

Compiler    : GCC 13.3.0
OS          : Linux
Release     : 6.8.0-60-generic
Machine     : x86_64
Processor   : x86_64
CPU cores   : 16
Architecture: 64bit

Hostname: fermi

import warnings

import numpy as np
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

np.set_printoptions(suppress=True, precision=8, linewidth=1000)
pd.options.mode.chained_assignment = None
pd.set_option("display.max_columns", None)
pd.set_option("display.width", None)
warnings.filterwarnings("ignore")

# Example Iowa statewide file. Data available at https://results.openaddresses.io/
dfoa = pd.read_csv(
    "openaddr-collected-us_midwest/us/ia/statewide.csv"
)

dfoa.head(10)
LON LAT NUMBER STREET UNIT CITY DISTRICT REGION POSTCODE ID HASH
0 -93.586810 41.714348 729 Southeast Kensington ROAD NaN Ankeny Polk County IA 50021 NaN 85640de13c151e73
1 -91.124941 40.821785 1826 EDWARDS STREET NaN BURLINGTON DES MOINES IA 52601 NaN c46a81890ea9dade
2 -95.835598 41.202768 222 PEREGRINE PL NaN COUNCIL BLUFFS POTTAWATTAMIE IA 51501 NaN ff20a503ae7884c0
3 -92.984547 41.157328 1037 565TH TRAIL NaN LOVILIA MONROE COUNTY IA 50150 NaN 50faddd361096939
4 -95.202703 42.656680 1410 MICHIGAN ST NaN STORM LAKE BUENA VISTA IA NaN 7780387c272717bc
5 -93.637272 41.529334 6010 Southwest 15th STREET NaN Des Moines Polk County IA 50315 NaN c83fc3e7c9ff08ec
6 -92.924891 40.626100 308 SOUTH LIBERTY STREET NaN CINCINNATI APPANOOSE COUNTY IA NaN NaN 05a0d2c060942c72
7 -91.154685 40.755431 5151 US HWY 61 NaN BURLINGTON DES MOINES IA 52601 NaN 142186b7c0ead2b3
8 -93.734875 41.671672 6165 Northwest 86th STREET Suite 206 Johnston Polk County IA 50131 NaN 07012a2220b2dd84
9 -92.916058 42.032234 205 W INGLEDUE ST NaN MARSHALLTOWN MARSHALL CO IA 50158 NaN c1a3720939481dac


Ultimately a nearest-neighbor lookup will be used to assign a 5-digit ZIP Code to the 10 sample points using LON and LAT associated with known addresses, therefore we drop records with missing values for LON, LAT or POSTCODE and perform a bit of additional clean-up. In the last step, the original OpenAddresses dataset is converted to a GeoDataFrame:


# Clean and filter.
dfoa = dfoa.dropna(subset=["LON", "LAT", "POSTCODE"])
dfoa["POSTCODE"] = dfoa["POSTCODE"].astype(str).str.strip()
dfoa = dfoa[dfoa.POSTCODE!=""].reset_index(drop=True)

# Retain ZIP5 only (drop the plus four code).
dfoa["POSTCODE"] = dfoa["POSTCODE"].astype(str).str.zfill(5)
dfoa["ZIP5"] = dfoa["POSTCODE"].str.split("-", n=1, expand=True)[0]
dfoa = dfoa[["LON", "LAT", "ZIP5"]]

# Create GeoDataFrame.
oa = gpd.GeoDataFrame(
    dfoa,
    geometry=gpd.points_from_xy(dfoa.LON, dfoa.LAT),
    crs="EPSG:4326"
)

print(f"Iowa addresses in OpenAddresses dataset: {len(oa):,}")

oa.head()
Iowa addresses in OpenAddresses dataset: 1,314,249
LON LAT ZIP5 geometry
0 -93.586810 41.714348 50021 POINT (-93.58681 41.71435)
1 -91.124941 40.821785 52601 POINT (-91.12494 40.82178)
2 -95.835598 41.202768 51501 POINT (-95.8356 41.20277)
3 -92.984547 41.157328 50150 POINT (-92.98455 41.15733)
4 -93.637272 41.529334 50315 POINT (-93.63727 41.52933)


We create a sample of 10 coordinate pairs that fall within the Iowa state boundary:

# Sample points of interest. All fall within Iowa state boundary.
pois = pd.DataFrame({
    "id": np.arange(10),
    "lon": [-91.4239, -92.5043, -90.6057, -93.6110, -93.1085,
            -90.6103, -95.8953, -92.4293, -93.5519, -95.0369],
    "lat": [41.9252, 40.6608, 41.5667, 42.0368, 41.3333, 
            41.5861, 41.2213, 42.4960, 41.6058, 40.7294]
    }
)

# Create GeoDataFrame.
pois = gpd.GeoDataFrame(
    pois,
    geometry=gpd.points_from_xy(pois.lon, pois.lat),
    crs="EPSG:4326"
)

pois.head(10)
id lon lat geometry
0 0 -91.4239 41.9252 POINT (-91.4239 41.9252)
1 1 -92.5043 40.6608 POINT (-92.5043 40.6608)
2 2 -90.6057 41.5667 POINT (-90.6057 41.5667)
3 3 -93.6110 42.0368 POINT (-93.611 42.0368)
4 4 -93.1085 41.3333 POINT (-93.1085 41.3333)
5 5 -90.6103 41.5861 POINT (-90.6103 41.5861)
6 6 -95.8953 41.2213 POINT (-95.8953 41.2213)
7 7 -92.4293 42.4960 POINT (-92.4293 42.496)
8 8 -93.5519 41.6058 POINT (-93.5519 41.6058)
9 9 -95.0369 40.7294 POINT (-95.0369 40.7294)


Let’s visualize the 10 points to get an idea of their distribution across the state:


import folium

f = folium.Figure(width=700, height=500)

lats, lons = pois.lat.tolist(), pois.lon.tolist()
coords = list(zip(lats, lons))

# Center map on central Iowa. 
mid_lon, mid_lat = -93.0977, 41.8780
 
m = folium.Map(
    location=[mid_lat, mid_lon], 
    zoom_start=7
).add_to(f)

for lat, lon in coords:

    folium.CircleMarker(
        location=[lat, lon], 
        radius=5, 
        color="red", 
        fill_color="red", 
        fill=True,
        fill_opacity=1
        ).add_to(m)

m


Within GeoPandas, sjoin_nearest performs a spatial join that matches each geometry in one GeoDataFrame to the nearest geometry in another GeoDataFrame based on geometric distance. Instead of checking for intersection or containment like a standard spatial join, sjoin_nearest finds the closest match in terms of Euclidean distance (in projected coordinates) or great-circle distance if using geographic coordinates.

In our example, each of the 10 points will get assigned the ZIP Code associated with the nearest known address point based on great-circle distance:


# Perform nearest neighbor lookup to known lon-lat zip code.
pois = gpd.sjoin_nearest(
    pois,
    oa[["ZIP5", "geometry"]],
    how="left",
    distance_col="dist_meters"
)

pois.head()
id lon lat geometry index_right ZIP5 dist_meters
0 0 -91.4239 41.9252 POINT (-91.4239 41.9252) 896565 52314 0.000180
1 1 -92.5043 40.6608 POINT (-92.5043 40.6608) 684538 52537 0.000204
2 2 -90.6057 41.5667 POINT (-90.6057 41.5667) 659368 52806 0.000064
3 3 -93.6110 42.0368 POINT (-93.611 42.0368) 1261954 50010 0.000030
4 4 -93.1085 41.3333 POINT (-93.1085 41.3333) 207898 50138 0.000654


In the resulting join, index_right specifies which row index from the OpenAddresses dataset served as the nearest match to the sample point.

Since our sample data is represented using a geographic coordinate reference system, dist_meters represents the distance to the nearest known address in terms of decimal degrees, not meters (even if we named the column "dist_meters"). If accurate distance measurements to the nearest known address are required, it is necessary to re-project the data to a projected CRS, then re-execute the spatial join:


# Using a projected CRS.
pois = pd.DataFrame({
    "id": np.arange(10),
    "lon": [-91.4239, -92.5043, -90.6057, -93.6110, -93.1085,
            -90.6103, -95.8953, -92.4293, -93.5519, -95.0369],
    "lat": [41.9252, 40.6608, 41.5667, 42.0368, 41.3333, 
            41.5861, 41.2213, 42.4960, 41.6058, 40.7294]
    }
)

# Create GeoDataFrame.
pois = gpd.GeoDataFrame(
    pois,
    geometry=gpd.points_from_xy(pois.lon, pois.lat),
    crs="EPSG:4326"
)

# Re-project to a projected CRS for accurate distance calculations.
pois = pois.to_crs("EPSG:3857")

# Ensure same CRS for OpenAddresses data.
oa = oa.to_crs("EPSG:3857")

pois = gpd.sjoin_nearest(
    pois, 
    oa[["ZIP5", "geometry"]], 
    distance_col="dist_meters"
)

pois.head(10)
id lon lat geometry index_right ZIP5 dist_meters
0 0 -91.4239 41.9252 POINT (-10177261.994 5149781.344) 896565 52314 20.088399
1 1 -92.5043 40.6608 POINT (-10297531.572 4962437.751) 684538 52537 29.855588
2 2 -90.6057 41.5667 POINT (-10086180.387 5096292.577) 659368 52806 9.535298
3 3 -93.6110 42.0368 POINT (-10420728.853 5166493.501) 1261954 50010 4.511569
4 4 -93.1085 41.3333 POINT (-10364790.809 5061628.338) 207898 50138 78.010282
5 5 -90.6103 41.5861 POINT (-10086692.457 5099179.465) 519778 52806 10.161212
6 6 -95.8953 41.2213 POINT (-10675015.965 5045038.366) 395965 51501 90.983468
7 7 -92.4293 42.4960 POINT (-10289182.61 5235569.854) 550407 50613 49.319472
8 8 -93.5519 41.6058 POINT (-10414149.871 5102111.883) 187076 50317 11.959234
9 9 -95.0369 40.7294 POINT (-10579459.315 4972509.787) 358873 51632 292.360969


For the first sample (id = 0), dist_meters indicates the nearest known address is ~20 meters away.


Finding k-Nearest ZIP Codes with BallTree

sjoin_nearest performs a spatial join by finding the nearest geometry in one GeoDataFrame for each geometry in another using a spatial index. When sjoin_nearest is called, it uses the spatial index to find candidate nearest geometries efficiently, then refines the match using actual geometric distance.

One limitation of using sjoin_nearest is that only a single ZIP Code will be assigned to each sample. There may be cases where knowing the ZIP Codes of the five or ten nearest addresses might be of interest, possibly as an indicator of the homogeneity of ZIP Codes in a given region: If the 10 nearest addresses all share the same ZIP Code, it is likely that the sample point shares that ZIP Code (assuming a relatively small distance to each address).

We can identify and retrieve ZIP Codes associated with the k-nearest addresses and corresponding distances using a BallTree. Within scikit-learn, BallTree implements a fast nearest-neighbor search using a space-partitioning tree. It is not geospatially aware and does not work directly with geometry objects, and instead works with raw coordinate arrays. In the next cell, we create a BallTree instance, and identify the 10 nearest ZIP Codes for each sample in pois. Note that BallTree with haversine distance expects input in radians, not degrees:


from sklearn.neighbors import BallTree

# Convert lon/lat to radians for BallTree when using haversine 
# metric.
# OpenAddresses data.
oa_coords = np.deg2rad(oa[["LON", "LAT"]].values)

# 10 sample data points. 
poi_coords = np.deg2rad(pois[["lon", "lat"]].values)

# Create BallTree instance.
bt = BallTree(oa_coords, metric="haversine")

# Query top 10 neighbors.
dists, indices = bt.query(poi_coords, k=10)

print(f"dists:\n{dists}\n")

print(f"indices:\n{indices}\n")
dists:
[[0.00000078 0.00000128 0.00000155 0.0000019  0.00000203 0.00000257 0.00000312 0.00000329 0.0000034  0.00000356]
 [0.00000043 0.00000734 0.00000983 0.0000122  0.00001249 0.00001339 0.000014   0.00001426 0.00001839 0.00002441]
 [0.00000014 0.00000024 0.00000029 0.00000048 0.00000056 0.0000006  0.00000061 0.00000093 0.00000093 0.00000094]
 [0.00000003 0.0000005  0.0000005  0.0000005  0.00000062 0.00000064 0.00000076 0.00000085 0.00000114 0.00000131]
 [0.00000287 0.00000313 0.00000334 0.00000416 0.00000419 0.00000437 0.00000449 0.00000452 0.00000454 0.00000457]
 [0.00000067 0.00000076 0.00000078 0.00000083 0.0000009  0.00000114 0.00000118 0.00000121 0.00000123 0.00000125]
 [0.00000158 0.00000411 0.00000447 0.00000459 0.00000587 0.00000624 0.0000063  0.00000662 0.00000686 0.00000708]
 [0.00000245 0.00000253 0.00000458 0.00000494 0.00000522 0.0000057  0.00000577 0.00000604 0.00000629 0.00000641]
 [0.00000059 0.00000117 0.00000119 0.00000134 0.00000143 0.00000153 0.00000159 0.00000168 0.00000172 0.00000194]
 [0.0000043  0.00000476 0.00000553 0.00000741 0.00001087 0.00001098 0.0000111  0.00001117 0.00001213 0.00001238]]

indices:
[[ 547709  457711  131556  510091   13847  856680  896565  192896 1064575  158127]
 [ 684538 1093125 1130773  540050  784185 1083753  600773  614572   58343  334934]
 [ 659368  286207  432808  627125  391029 1006928  338206  390721  575752  267197]
 [1261954 1280697  443539 1027393  567643 1038369  116769  799895  331017   70198]
 [ 312170  797161  506549   92729  392748  322733  532624  969750 1307702   79626]
 [ 241231  519778  282477  333948 1158412  575133  486008 1200591  238422 1134008]
 [ 395965  605176  682735  450244 1282644  889027 1276471  355442 1138247  620641]
 [ 766037  732231 1011916  550407 1161263 1088782  351109 1042692   25833 1034179]
 [ 187076 1143818 1077490  343559  836727 1266959 1041256   48610  677851  868357]
 [1245842  165862  471164  764378  276749 1284297  358873  362510 1020793  984672]]


indices is a 10 x 10 array representing the indices into the original OpenAddresses GeoDataFrame corresponding to the 10 nearest addresses. We can lookup the associated ZIP Codes as follows:


# Lookup ZIP Codes for first sample (row 0)

idx = indices[0]

oa.loc[idx, "ZIP5"].tolist()
['52314',
 '52314',
 '52314',
 '52314',
 '52314',
 '52314',
 '52314',
 '52314',
 '52314',
 '52314']


For the first point of interest, the 10 nearest ZIP Codes are the same, indicating a degree of homegeneity for that region.

dists is also a 10 x 10 array representing the distances to each of the 10 nearest addresses. When using the haversine metric, the distances will be returned in radians. To get distance in meters, multiply by 6,371,000 (assumes an Earth radius of 6,371km):


# Convert radians to meters.
dists*= 6_371_000  

dists.round(2)
array([[  4.99,   8.14,   9.9 ,  12.13,  12.96,  16.39,  19.89,  20.93,  21.68,  22.66],
       [  2.74,  46.77,  62.65,  77.71,  79.54,  85.31,  89.2 ,  90.82, 117.18, 155.54],
       [  0.9 ,   1.5 ,   1.82,   3.03,   3.58,   3.81,   3.91,   5.94,   5.95,   5.98],
       [  0.21,   3.17,   3.18,   3.21,   3.95,   4.09,   4.86,   5.44,   7.28,   8.36],
       [ 18.31,  19.97,  21.31,  26.53,  26.72,  27.86,  28.58,  28.78,  28.91,  29.13],
       [  4.24,   4.87,   4.99,   5.28,   5.74,   7.29,   7.53,   7.68,   7.82,   7.98],
       [ 10.08,  26.16,  28.49,  29.22,  37.38,  39.73,  40.13,  42.16,  43.71,  45.09],
       [ 15.6 ,  16.12,  29.2 ,  31.48,  33.29,  36.34,  36.75,  38.48,  40.05,  40.83],
       [  3.77,   7.46,   7.61,   8.55,   9.09,   9.74,  10.13,  10.73,  10.98,  12.35],
       [ 27.4 ,  30.32,  35.26,  47.18,  69.28,  69.94,  70.69,  71.14,  77.29,  78.86]])


Notice the distances increase from left to right. The nearest neighboring address will be at index 0 for each row, the furthest at the last index.

Let’s attach the ZIP Codes and distances to the original pois GeoDataFrame:


# Add ZIP Codes and distances to pois GeoDataFrame.
pois["zip5_bt"] = [oa.loc[ii, "ZIP5"].tolist() for ii in indices]
pois["dist_bt"] = [dists[ii].tolist() for ii in range(len(dists))]

pois.head(10)
id lon lat geometry index_right ZIP5 dist_meters zip5_bt dist_bt
0 0 -91.4239 41.9252 POINT (-10177261.994 5149781.344) 896565 52314 20.088399 [52314, 52314, 52314, 52314, 52314, 52314, 523... [4.985134882078796, 8.141201258446523, 9.90310...
1 1 -92.5043 40.6608 POINT (-10297531.572 4962437.751) 684538 52537 29.855588 [52537, 52537, 52537, 52537, 52537, 52537, 525... [2.7405501278443767, 46.76946329403264, 62.646...
2 2 -90.6057 41.5667 POINT (-10086180.387 5096292.577) 659368 52806 9.535298 [52806, 52806, 52806, 52806, 52806, 52806, 528... [0.9037957011912015, 1.500812774709375, 1.8239...
3 3 -93.6110 42.0368 POINT (-10420728.853 5166493.501) 1261954 50010 4.511569 [50010, 50010, 50010, 50010, 50010, 50010, 500... [0.21342223699654192, 3.17473669691422, 3.1849...
4 4 -93.1085 41.3333 POINT (-10364790.809 5061628.338) 207898 50138 78.010282 [50138, 50138, 50138, 50138, 50138, 50138, 501... [18.31278035974411, 19.967693631899426, 21.309...
5 5 -90.6103 41.5861 POINT (-10086692.457 5099179.465) 519778 52806 10.161212 [52806, 52806, 52806, 52806, 52806, 52806, 528... [4.242762617075453, 4.870854521824348, 4.99311...
6 6 -95.8953 41.2213 POINT (-10675015.965 5045038.366) 395965 51501 90.983468 [51501, 51501, 51501, 51501, 51501, 51501, 515... [10.084917943102482, 26.164037229203412, 28.48...
7 7 -92.4293 42.4960 POINT (-10289182.61 5235569.854) 550407 50613 49.319472 [50613, 50613, 50613, 50613, 50613, 50613, 506... [15.604875075255679, 16.124558588950354, 29.19...
8 8 -93.5519 41.6058 POINT (-10414149.871 5102111.883) 187076 50317 11.959234 [50317, 50317, 50317, 50317, 50317, 50317, 503... [3.772941024353452, 7.462518447657111, 7.60926...
9 9 -95.0369 40.7294 POINT (-10579459.315 4972509.787) 358873 51632 292.360969 [51632, 51632, 51632, 51632, 51632, 51632, 516... [27.402712270050465, 30.31954566212826, 35.258...


We confirm that the ZIP Code identified using sjoin_nearest (ZIP5) matches the nearest address ZIP Code for each of the 10 points of interest.

As final diagnostic, we can visualize all the points from the OpenAddresses dataset that map to one of our POIs nearest ZIP Codes, along with the convex hull enclosing those points. This will provide additional context missing from a purely distance based diagnostic. We focus on the point with id = 1, and perform the following steps:

  • Filter oa to addresses having ZIP Code = 52537.
  • Combine resulting geometries using .unary_union.
  • Call .convex_hull to compute the convex hull polygon enclosing the result from the previous step.
  • Visualize address points, convex hull and original point of interest.

# Filter oa to addresses having ZIP Code = 52537.
# Repoject to WGS84 for visualization.
oa52537 = oa[oa.ZIP5=="52537"].to_crs("EPSG:4326").reset_index(drop=True)

print(f"Addresses in ZIP Code 52537: {len(oa52537):,}")

oa52537.head(10)
Addresses in ZIP Code 52537: 1,830
LON LAT ZIP5 geometry
0 -92.452221 40.750748 52537 POINT (-92.45222 40.75075)
1 -92.330243 40.803125 52537 POINT (-92.33024 40.80313)
2 -92.561039 40.628394 52537 POINT (-92.56104 40.62839)
3 -92.468560 40.710687 52537 POINT (-92.46856 40.71069)
4 -92.315096 40.747551 52537 POINT (-92.3151 40.74755)
5 -92.219034 40.748971 52537 POINT (-92.21903 40.74897)
6 -92.341592 40.712831 52537 POINT (-92.34159 40.71283)
7 -92.420783 40.702076 52537 POINT (-92.42078 40.70208)
8 -92.221492 40.778673 52537 POINT (-92.22149 40.77867)
9 -92.428911 40.876458 52537 POINT (-92.42891 40.87646)

# Get convex hull for ZIP Code = 52537.
cvh = oa52537.unary_union.convex_hull

# Get POI lat and lon.
poi_y = pois[pois.id==1]["lat"].item()
poi_x = pois[pois.id==1]["lon"].item()

# Initialize map.
f = folium.Figure(width=800, height=525)
m = folium.Map(location=[poi_y, poi_x], zoom_start=10).add_to(f)


folium.Marker(
    location=[poi_y, poi_x],
    icon=folium.Icon(color="red"),
    popup=f"id=1: {poi_x} {poi_y}"
    ).add_to(m)

# Add address points to map.
for point in oa52537.geometry:
    folium.CircleMarker(
        location=[point.y, point.x],
        radius=3.5,
        color="#4f9de4",
        fill=True,
        weight=1,
        fill_opacity=0.35
    ).add_to(m)


# Add the convex hull polygon to the map.
folium.Polygon(
    locations=[[lat, lon] for lon, lat in cvh.exterior.coords],
    color="#184771",
    fill=False
).add_to(m)


m


The POI (red marker) is surrounded by many addresses with the same ZIP Code. This provides additional confidence that the assignment is reasonable and relatively accurate.


Conclusion

  • ZCTAs <> ZIP Codes! They are not one-to-one. Some ZIP Codes don’t have ZCTA equivalents. ZCTAs are statistical areas and do not map exactly to USPS ZIP Codes. ZIp Codes represent collections of delivery routes (not enclosed polygons).
  • ZCTAs are updated only with each decennial census, while USPS updates ZIP Codes constantly.
  • ZIP Codes that span multiple counties or regions may be misrepresented in ZCTA form.
  • Even though the OpenAddresses dataset contains over 500 million addresses, coverage is variable, and is generally better in urban/suburban areas than in rural areas. Some large metropolitan areas may be missing address data altogether.
  • If you’re building something regulatory or production-facing, you should disclose that the USPS does not publish ZIP Code boundaries, and this method uses nearest inferred ZIP Codes from publicly available data.