# Air quality in the United States at the county level

Most of us go about our daily lives without giving much thought to the air we breathe. But there are times when we must be mindful of the air where we live, because air pollution may be high enough that our health could be affected.

The US Environmental Protection Agency, or EPA, measures the “cleanliness” of the air across the country. Thanks to these measurements, we can make informed plans regarding our outdoor activities.

The EPA uses the Air Quality Index ($AQI$) to determine how benign the local air is at a given time. The $AQI$ is a number between 0 and 500. The lower the number, the cleaner the air. According to the EPA booklet “Air Quality Index – A Guide to Air Quality and Your Health”, the $AQI$ is calculated for five major air pollutants regulated by the Clean Air Act: ground-level ozone, airborne particles, carbon monoxide, sulfur dioxide, and nitrogen dioxide. The first two of these pose the greatest hazard to human health. Ozone can cause coughing, throat irritation, or a burning sensation in the airways. Particle pollutants, which are microscopic solids or liquid droplets small enough to reach the lungs, may irritate the nose, eyes, and throat. Those at risk from ground-level ozone and particulate matter include people with asthma, children, and the elderly.

The $AQI$ associated with a specific pollutant is obtained from measurements of the pollutant concentration $C$. The EPA uses a table of concentration ranges, each of which is pre-assigned a range of $AQI$ values (see the booklet cited above). Within each range, the $AQI$ is computed as a piecewise linear function of $C$:

$AQI = \frac{AQI_{\mathrm{high}} - AQI_\mathrm{low}}{C_{\mathrm{high}} - C_{\mathrm{low}}}\left(C - C_{\mathrm{low}}\right) + AQI_\mathrm{low}$,

where $C_{\mathrm{low}}$ and $C_{\mathrm{high}}$ are the lower and upper ends, respectively, of the tabulated pollutant concentration range such that $C_\mathrm{low} \leq C \leq C_\mathrm{high}$, and $AQI_\mathrm{low}$ and $AQI_\mathrm{high}$ are the endpoints of the $AQI$ range assigned to the interval $\left[C_\mathrm{low}, C_\mathrm{high}\right]$. Again, the lower the $AQI$, the better the air quality.
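The interpolation above can be sketched in a few lines of Python. This is a minimal illustration, not the EPA's reference implementation: the function name `aqi_from_concentration` and the table `EXAMPLE_BREAKPOINTS` are hypothetical, and the breakpoint values are placeholders that should be replaced with the current EPA table for the pollutant of interest. The sketch also assumes that $C$ has already been rounded to the precision of the table, and that the result is rounded to the nearest integer.

```python
from typing import Sequence, Tuple

# One row per tabulated range: (C_low, C_high, AQI_low, AQI_high).
# Illustrative placeholder values only, NOT the official EPA table.
Breakpoint = Tuple[float, float, int, int]

EXAMPLE_BREAKPOINTS: Sequence[Breakpoint] = (
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
)


def aqi_from_concentration(
    c: float, breakpoints: Sequence[Breakpoint] = EXAMPLE_BREAKPOINTS
) -> int:
    """Piecewise linear AQI for a concentration c, per the equation above."""
    for c_low, c_high, aqi_low, aqi_high in breakpoints:
        if c_low <= c <= c_high:
            slope = (aqi_high - aqi_low) / (c_high - c_low)
            # Rounding to the nearest integer is an assumption of this sketch.
            return round(slope * (c - c_low) + aqi_low)
    raise ValueError(f"concentration {c} is outside the tabulated ranges")


print(aqi_from_concentration(6.0))   # midpoint of the first example range
print(aqi_from_concentration(35.4))  # upper end of the second example range
```

Raising an error for a concentration outside every range, rather than extrapolating, keeps the sketch from silently producing a value the table does not define.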

The $AQI$ is formally computed by the above equation for each pollutant. A given measuring site might measure the concentration of all pollutants, of some of them, or of only one. When the concentration of more than one pollutant is measured, the final reported $AQI$ is the largest of the individual pollutant $AQI$ values.
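As a small illustration of this rule, the sketch below picks the largest per-pollutant value at a site. The function name `reported_aqi` and the sample readings are hypothetical.

```python
from typing import Dict, Tuple


def reported_aqi(per_pollutant_aqi: Dict[str, int]) -> Tuple[str, int]:
    """Return the pollutant with the largest AQI and that AQI value."""
    if not per_pollutant_aqi:
        raise ValueError("at least one pollutant AQI is required")
    pollutant = max(per_pollutant_aqi, key=per_pollutant_aqi.get)
    return pollutant, per_pollutant_aqi[pollutant]


# Hypothetical site that measures only two pollutants.
site = {"ozone": 42, "pm25": 67}
print(reported_aqi(site))  # the larger of the two values is reported
```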

Which areas in the United States have the cleanest air, on average? More specifically, what is the time-averaged $AQI$ in each county? To answer this question, one can obtain county-level $AQI$ data from this EPA webpage (scroll to the middle of the page to the table called “Daily AQI”; the csv files containing those data, as well as the R code to analyze the data, can be found in this GitHub repository). The reported daily $AQI$ values cover only 1,316 US counties (fewer than half of all US counties), because reports for areas with a population of less than 350,000 are not required by law. Note also that the $AQI$ for a sampled county is not always reported daily (for example, many measurements are reported every three days).
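The time average in question can be sketched as follows. The article's own analysis was done in R; this Python version is only an illustration under stated assumptions. The record layout `(county, date, AQI)`, the sample values, and the name `mean_aqi_by_county` are hypothetical, and the real EPA files contain more columns. Because reporting is irregular, the mean here is taken over reported days only, not over calendar days.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical records: (county, date, AQI). Values are made up.
records = [
    ("County A", "2000-01-01", 40),
    ("County A", "2000-01-04", 50),  # reported every three days
    ("County B", "2000-01-01", 90),
    ("County B", "2000-01-02", 110),
    ("County B", "2000-01-03", 70),
]


def mean_aqi_by_county(rows):
    """Average the AQI over all reported days, separately for each county."""
    by_county = defaultdict(list)
    for county, _date, aqi in rows:
        by_county[county].append(aqi)
    return {county: fmean(values) for county, values in by_county.items()}


county_means = mean_aqi_by_county(records)
print(county_means)
```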

I used data from January 2000 through April 2020. The map in Figure 1 shows the $AQI$ averaged over this period:

As described by the figure legend, the counties in green have the best air quality (remember that the lower the $AQI$, the cleaner the air). In EPA terminology, the green counties have “Good” air quality. The counties in yellow have “Moderate” air quality. The air in the two regions in orange is “Unhealthy for Sensitive Groups”. A distribution of these mean $AQI$s is shown in Figure 2:

Figure 2 shows that most (84.1%) of the US counties for which $AQI$ data exist have had clean air (i.e., “Good” air quality) since 2000. The air quality was moderate, on average, for 15.7% of counties. Only two counties (0.15% of the total measured) had an average air quality that was unhealthy for certain people. These two counties are Kings, in California, and Ada, in Idaho.
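The percentages above come from binning each county's mean $AQI$ into the EPA categories. A minimal sketch of that binning is shown below, using the standard category limits (0 to 50 “Good”, 51 to 100 “Moderate”, and so on). The function names and the sample means are hypothetical. How fractional means between two bins (for example 50.4) were treated in the original analysis is not stated; this sketch assigns them to the higher category.

```python
from collections import Counter

# Upper limit of each AQI category; anything above 300 is "Hazardous".
AQI_CATEGORIES = (
    (50, "Good"),
    (100, "Moderate"),
    (150, "Unhealthy for Sensitive Groups"),
    (200, "Unhealthy"),
    (300, "Very Unhealthy"),
)


def aqi_category(aqi: float) -> str:
    """Map an AQI value to its category name."""
    for upper, name in AQI_CATEGORIES:
        if aqi <= upper:
            return name
    return "Hazardous"


def category_shares(mean_aqi_by_county: dict) -> dict:
    """Percentage of counties falling in each category."""
    counts = Counter(aqi_category(v) for v in mean_aqi_by_county.values())
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


# Hypothetical county means, for illustration only.
means = {"County A": 30.0, "County B": 45.0, "County C": 75.0, "County D": 120.0}
print(category_shares(means))
```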

During the 2000–2020 period examined here, what were the best and worst recorded air qualities in each of the studied counties? In the case of the best air quality, all 1,316 examined counties reported minimum $AQI$s below 51. In other words, each of those counties enjoyed “Good” air quality at some point in that period. For that reason, the minimum-$AQI$ map shows all of those counties in green, and it is therefore not included here.

For the case of the worst air quality reached in each county in the same period, Fig. 3 shows a US map with the maximum measured $AQI$:

Remarkably, there are a few counties in which the air quality remained “Good” throughout. A list of those counties is shown in Fig. 4:

The distribution of maximum $AQI$s is shown in Fig. 5. Note that there were several $AQI$ values (25 of them, corresponding to 2% of the total number of reported counties) well above 500, extending the tail of the distribution to the right into the thousands. Those extreme values were not included in the distribution of Fig. 5 for visual convenience. Nevertheless, it would be worthwhile to inquire further with EPA experts as to the meaning of such outliers.

According to Fig. 5, 42% of the studied counties reported “Unhealthy” air conditions at some point during the 2000–2020 period. “Very unhealthy” conditions occurred in 29% of the counties, and “Hazardous” conditions in 4% of the counties.
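The per-county extremes discussed in this section, along with the values above 500 and the counties that never left the “Good” range, can be extracted along the following lines. This is a sketch with made-up data; the names `aqi_extremes` and `beyond_scale` are hypothetical.

```python
def aqi_extremes(rows):
    """Return {county: (min AQI, max AQI)} from (county, AQI) pairs."""
    extremes = {}
    for county, aqi in rows:
        low, high = extremes.get(county, (aqi, aqi))
        extremes[county] = (min(low, aqi), max(high, aqi))
    return extremes


def beyond_scale(extremes, scale_max=500):
    """Counties whose maximum AQI exceeds the top of the nominal scale."""
    return sorted(c for c, (_low, high) in extremes.items() if high > scale_max)


# Hypothetical (county, AQI) observations.
rows = [
    ("County A", 35), ("County A", 160),
    ("County B", 20), ("County B", 48),
    ("County C", 45), ("County C", 1200),
]

extremes = aqi_extremes(rows)
outliers = beyond_scale(extremes)
# Counties whose worst day was still "Good" (maximum AQI of 50 or less).
always_good = sorted(c for c, (_low, high) in extremes.items() if high <= 50)
print(extremes, outliers, always_good)
```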

Finally, the standard deviation of each county’s $AQI$ (denoted $AQI_{\mathrm{RMS}}$, where RMS stands for “root-mean-square”, here the root-mean-square deviation from the county’s mean $AQI$) gives an idea of the variation in air quality over the same period. This variation is shown in Fig. 6. In this figure, the values shown are for those counties whose $AQI_{\mathrm{RMS}}$ is less than 100. Counties in blue have a low standard deviation, while counties in magenta have a high standard deviation. There were two counties in which $AQI_{\mathrm{RMS}}$ exceeded 100, both in California (neither of which is colored): Kings ($AQI_{\mathrm{RMS}} = 533$) and Napa ($AQI_{\mathrm{RMS}}=380$).
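To make the terminology concrete, the sketch below computes the spread of a county's $AQI$ values as the root-mean-square deviation from their mean, which is the population standard deviation. Whether the original analysis divided by $n$ or by $n - 1$ (as R's `sd()` does) is not stated, so the choice of $n$ here is an assumption; the function name `aqi_spread` and the sample values are hypothetical.

```python
import math
from statistics import fmean, pstdev


def aqi_spread(values):
    """Root-mean-square deviation of the values from their mean."""
    mean = fmean(values)
    return math.sqrt(fmean([(v - mean) ** 2 for v in values]))


# Hypothetical AQI readings for one county.
readings = [40, 50, 60]

spread = aqi_spread(readings)
# For comparison: the plain RMS of the values themselves, which is dominated
# by the mean level and is NOT a measure of variation.
plain_rms = math.sqrt(fmean([v ** 2 for v in readings]))

print(spread, pstdev(readings), plain_rms)
```

The first two printed numbers agree, since `statistics.pstdev` also divides by $n$; the third is much larger because it reflects the overall level of the readings and not their scatter.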

In conclusion, the EPA data on daily county $AQI$ values show that

• on average, most of the counties for which $AQI$ values are available enjoyed clean air between January 2000 and April 2020;
• most of those counties experienced at least one instance of unhealthy air conditions in the same period; and
• all but two counties had an $AQI$ standard deviation ($AQI_{\mathrm{RMS}}$) of less than 100, with a median $AQI_{\mathrm{RMS}}$ of 20.