This analysis uses the U.S. Census Bureau’s American
Community Survey (ACS) to examine demographic patterns across
California — from county-level summaries to census tract-level detail.
Using the tidycensus R package, we pull data directly from
the Census API and calculate racial/ethnic composition and age
distributions.
Focus areas:

- Median age by sex across California counties
- Racial and ethnic composition at the census tract level (Alameda County)
- County-level demographic comparisons
- Joining multi-year Census data to track changes

library(tidyverse)
library(dplyr)
library(ggplot2)
library(knitr)
library(psych)
library(tidycensus)

Note: This analysis requires a Census API key. You can request one at api.census.gov and install it with
census_api_key("YOUR_KEY", install = TRUE).
Pulling median age data for all California counties from the 2023 ACS 5-year estimates:
ca <- get_acs(geography = "county",
              variables = "B01002_001",
              state = "CA",
              year = 2023)

ca %>%
  arrange(desc(estimate)) %>%
  head(15) %>%
  kable(caption = "California counties with the highest median age (2023 ACS)")

| GEOID | NAME | variable | estimate | moe |
|---|---|---|---|---|
| 06091 | Sierra County, California | B01002_001 | 56.0 | 5.0 |
| 06105 | Trinity County, California | B01002_001 | 54.8 | 0.9 |
| 06063 | Plumas County, California | B01002_001 | 52.1 | 0.8 |
| 06009 | Calaveras County, California | B01002_001 | 52.0 | 0.7 |
| 06043 | Mariposa County, California | B01002_001 | 51.6 | 1.0 |
| 06057 | Nevada County, California | B01002_001 | 50.3 | 0.2 |
| 06005 | Amador County, California | B01002_001 | 49.9 | 0.6 |
| 06049 | Modoc County, California | B01002_001 | 49.0 | 2.1 |
| 06109 | Tuolumne County, California | B01002_001 | 48.8 | 0.4 |
| 06093 | Siskiyou County, California | B01002_001 | 47.4 | 0.6 |
| 06041 | Marin County, California | B01002_001 | 47.3 | 0.2 |
| 06017 | El Dorado County, California | B01002_001 | 46.1 | 0.2 |
| 06027 | Inyo County, California | B01002_001 | 45.6 | 0.5 |
| 06033 | Lake County, California | B01002_001 | 44.2 | 0.3 |
| 06045 | Mendocino County, California | B01002_001 | 43.9 | 0.3 |
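
The `moe` column matters when ranking small counties: Sierra County's margin of error (5.0) is far wider than Nevada County's (0.2). As a rough sketch of how one might gauge reliability, the snippet below converts the margin of error to a standard error and a coefficient of variation. It assumes the standard 90 percent confidence level used for published ACS margins (hence the 1.645 divisor), and it uses three rows copied from the table rather than a live API call.

```r
library(dplyr)

# Three rows copied from the table above (no API key needed for this sketch)
age <- data.frame(
  NAME     = c("Sierra County, California",
               "Trinity County, California",
               "Nevada County, California"),
  estimate = c(56.0, 54.8, 50.3),
  moe      = c(5.0, 0.9, 0.2)
)

age_cv <- age %>%
  mutate(se = moe / 1.645,                    # assumes 90% confidence-level MOEs
         cv = round(100 * se / estimate, 1))  # coefficient of variation, in percent

age_cv
```

A higher `cv` means a less precise estimate. Any cutoff for "too unreliable" is a judgment call and is not part of the original analysis.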
Pulling median age disaggregated by sex at the census tract level:
agevars <- c("B01002_001", "B01002_002", "B01002_003")

altrtage <- get_acs(geography = "tract",
                    variables = agevars,
                    county = "Alameda",
                    state = "CA",
                    year = 2023)

# Pivot from stacked to wide format
wide_altrtage <- altrtage %>%
  pivot_wider(names_from = variable, values_from = c(estimate, moe))

# Rename columns for clarity
pretty_altrtage <- wide_altrtage %>%
  dplyr::select(GEOID, NAME,
                medage = estimate_B01002_001,
                male_age = estimate_B01002_002,
                fem_age = estimate_B01002_003)

kable(head(pretty_altrtage, 15),
      caption = "Median age by sex for Alameda County census tracts")

| GEOID | NAME | medage | male_age | fem_age |
|---|---|---|---|---|
| 06001400100 | Census Tract 4001; Alameda County; California | 51.4 | 51.5 | 51.1 |
| 06001400200 | Census Tract 4002; Alameda County; California | 45.0 | 45.0 | 43.8 |
| 06001400300 | Census Tract 4003; Alameda County; California | 39.6 | 36.0 | 44.8 |
| 06001400400 | Census Tract 4004; Alameda County; California | 37.6 | 37.7 | 37.5 |
| 06001400500 | Census Tract 4005; Alameda County; California | 35.6 | 34.3 | 38.1 |
| 06001400600 | Census Tract 4006; Alameda County; California | 39.8 | 39.4 | 40.3 |
| 06001400700 | Census Tract 4007; Alameda County; California | 37.3 | 38.0 | 36.3 |
| 06001400800 | Census Tract 4008; Alameda County; California | 36.4 | 37.3 | 34.8 |
| 06001400900 | Census Tract 4009; Alameda County; California | 35.3 | 33.9 | 37.0 |
| 06001401000 | Census Tract 4010; Alameda County; California | 34.9 | 35.4 | 34.1 |
| 06001401100 | Census Tract 4011; Alameda County; California | 33.2 | 34.0 | 32.4 |
| 06001401200 | Census Tract 4012; Alameda County; California | 37.7 | 39.3 | 35.0 |
| 06001401300 | Census Tract 4013; Alameda County; California | 35.3 | 36.1 | 34.6 |
| 06001401400 | Census Tract 4014; Alameda County; California | 30.9 | 30.7 | 32.8 |
| 06001401500 | Census Tract 4015; Alameda County; California | 33.9 | 33.4 | 35.3 |
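
To make the pivot step concrete, here is a small sketch of how `pivot_wider()` names its output columns when two value columns are spread at once. The estimates are copied from tracts 4001 and 4003 in the table above. The `moe` values are placeholders (`NA`), not real margins, and the `fem_minus_male` column is a hypothetical addition for illustration only.

```r
library(dplyr)
library(tidyr)

# Long ("stacked") layout, as returned by get_acs(): one row per tract x variable
long <- data.frame(
  GEOID    = rep(c("06001400100", "06001400300"), each = 3),
  variable = rep(c("B01002_001", "B01002_002", "B01002_003"), times = 2),
  estimate = c(51.4, 51.5, 51.1, 39.6, 36.0, 44.8),
  moe      = NA_real_  # placeholder only; real margins come from the API
)

# One row per tract; columns are named <value column>_<variable code>
wide <- long %>%
  pivot_wider(names_from = variable, values_from = c(estimate, moe))

names(wide)

# Hypothetical follow-up: female minus male median age per tract
gap <- wide %>%
  transmute(GEOID,
            fem_minus_male = estimate_B01002_003 - estimate_B01002_002)
gap
```

This is why the rename step refers to columns such as `estimate_B01002_001`: the prefix comes from `values_from` and the suffix from the `variable` column.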
Pulling racial and ethnic composition (ACS table B03002, Hispanic or Latino origin by race) for Alameda County census tracts:
racevars <- c("B03002_001", "B03002_002", "B03002_003", "B03002_004",
              "B03002_005", "B03002_006", "B03002_007", "B03002_008",
              "B03002_009", "B03002_012")

alameda <- get_acs(geography = "tract",
                   variables = racevars,
                   state = "CA",
                   county = "Alameda",
                   year = 2023)

# Pivot and rename
wide_alameda <- alameda %>%
  pivot_wider(names_from = variable, values_from = c(estimate, moe))

pretty_alameda <- wide_alameda %>%
  dplyr::select(GEOID, NAME,
                pop = estimate_B03002_001,
                not_hisp = estimate_B03002_002,
                white = estimate_B03002_003,
                black = estimate_B03002_004,
                am_ind = estimate_B03002_005,
                asian = estimate_B03002_006,
                pac_isl = estimate_B03002_007,
                other = estimate_B03002_008,
                mult = estimate_B03002_009,
                hispanic = estimate_B03002_012)

# Convert counts to percentages of each tract's total population
alameda_fin <- pretty_alameda %>%
  mutate(pct_white = round(100 * (white / pop), 1)) %>%
  mutate(pct_black = round(100 * (black / pop), 1)) %>%
  mutate(pct_am_ind = round(100 * (am_ind / pop), 1)) %>%
  mutate(pct_asian = round(100 * (asian / pop), 1)) %>%
  mutate(pct_pac_isl = round(100 * (pac_isl / pop), 1)) %>%
  mutate(pct_other = round(100 * (other / pop), 1)) %>%
  mutate(pct_mult = round(100 * (mult / pop), 1)) %>%
  mutate(pct_hisp = round(100 * (hispanic / pop), 1))

kable(head(alameda_fin %>%
             select(NAME, pop, pct_white, pct_black, pct_asian, pct_hisp, pct_mult), 15),
      caption = "Racial/ethnic composition by census tract (Alameda County)")

| NAME | pop | pct_white | pct_black | pct_asian | pct_hisp | pct_mult |
|---|---|---|---|---|---|---|
| Census Tract 4001; Alameda County; California | 3094 | 68.1 | 4.4 | 14.9 | 6.5 | 5.3 |
| Census Tract 4002; Alameda County; California | 2093 | 67.3 | 2.1 | 12.2 | 9.4 | 8.1 |
| Census Tract 4003; Alameda County; California | 5727 | 58.8 | 9.1 | 10.6 | 8.7 | 11.6 |
| Census Tract 4004; Alameda County; California | 4395 | 60.2 | 9.9 | 9.6 | 13.7 | 6.1 |
| Census Tract 4005; Alameda County; California | 3822 | 44.4 | 23.8 | 8.0 | 14.6 | 8.9 |
| Census Tract 4006; Alameda County; California | 1957 | 40.9 | 27.8 | 5.2 | 10.1 | 12.2 |
| Census Tract 4007; Alameda County; California | 4404 | 50.6 | 23.9 | 6.3 | 10.3 | 7.3 |
| Census Tract 4008; Alameda County; California | 4583 | 53.3 | 8.1 | 15.1 | 13.0 | 8.6 |
| Census Tract 4009; Alameda County; California | 2752 | 34.8 | 27.6 | 7.7 | 18.5 | 9.9 |
| Census Tract 4010; Alameda County; California | 6529 | 30.4 | 37.5 | 6.1 | 13.7 | 10.9 |
| Census Tract 4011; Alameda County; California | 5627 | 42.5 | 17.8 | 14.7 | 13.8 | 9.4 |
| Census Tract 4012; Alameda County; California | 3068 | 57.5 | 6.7 | 15.4 | 11.6 | 8.8 |
| Census Tract 4013; Alameda County; California | 4368 | 33.4 | 28.5 | 16.8 | 8.0 | 11.7 |
| Census Tract 4014; Alameda County; California | 4876 | 26.2 | 35.8 | 10.4 | 22.4 | 5.3 |
| Census Tract 4015; Alameda County; California | 2827 | 21.1 | 52.1 | 4.5 | 18.0 | 4.2 |
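
The percentage columns all apply one formula, `round(100 * x / pop, 1)`, to each group in turn. One possible way to write that once is `dplyr::across()`. This is an alternative sketch, not the approach used in this analysis. The data frame `toy` and its counts are made up for illustration. The generated names also differ slightly from the original: `pct_hispanic` rather than `pct_hisp`.

```r
library(dplyr)

# Hypothetical counts, for illustration only (not ACS data)
toy <- data.frame(
  NAME     = c("Tract A", "Tract B"),
  pop      = c(1000, 2000),
  white    = c(500, 400),
  black    = c(125, 600),
  hispanic = c(375, 1000)
)

# Apply the same percentage formula to every listed column
toy_pct <- toy %>%
  mutate(across(c(white, black, hispanic),
                ~ round(100 * .x / pop, 1),
                .names = "pct_{.col}"))

toy_pct
```

The trade-off is brevity against explicitness: the chained `mutate()` calls spell out every column name, while `across()` derives them from a naming pattern.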
Repeating the same steps at the county level for all of California:
cacounties <- get_acs(geography = "county",
                      variables = racevars,
                      state = "CA",
                      year = 2023)

wide_cacounties <- cacounties %>%
  pivot_wider(names_from = variable, values_from = c(estimate, moe))

pretty_cacounties <- wide_cacounties %>%
  dplyr::select(GEOID, NAME,
                pop = estimate_B03002_001,
                not_hisp = estimate_B03002_002,
                white = estimate_B03002_003,
                black = estimate_B03002_004,
                am_ind = estimate_B03002_005,
                asian = estimate_B03002_006,
                pac_isl = estimate_B03002_007,
                other = estimate_B03002_008,
                mult = estimate_B03002_009,
                hispanic = estimate_B03002_012)

# Convert counts to percentages of each county's total population
cacounties_fin <- pretty_cacounties %>%
  mutate(pct_white = round(100 * (white / pop), 1)) %>%
  mutate(pct_black = round(100 * (black / pop), 1)) %>%
  mutate(pct_am_ind = round(100 * (am_ind / pop), 1)) %>%
  mutate(pct_asian = round(100 * (asian / pop), 1)) %>%
  mutate(pct_pac_isl = round(100 * (pac_isl / pop), 1)) %>%
  mutate(pct_other = round(100 * (other / pop), 1)) %>%
  mutate(pct_mult = round(100 * (mult / pop), 1)) %>%
  mutate(pct_hisp = round(100 * (hispanic / pop), 1))

kable(head(cacounties_fin %>%
             select(NAME, pop, pct_white, pct_black, pct_asian, pct_hisp) %>%
             arrange(desc(pop)), 20),
      caption = "Racial/ethnic composition by county (California, 2023)")

| NAME | pop | pct_white | pct_black | pct_asian | pct_hisp |
|---|---|---|---|---|---|
| Los Angeles County, California | 9848406 | 25.2 | 7.5 | 14.8 | 48.3 |
| San Diego County, California | 3282782 | 43.2 | 4.4 | 11.9 | 34.3 |
| Orange County, California | 3164063 | 37.7 | 1.5 | 21.7 | 34.1 |
| Riverside County, California | 2449909 | 32.0 | 6.1 | 6.8 | 50.6 |
| San Bernardino County, California | 2187816 | 25.6 | 7.6 | 7.9 | 54.6 |
| Santa Clara County, California | 1903297 | 28.2 | 2.3 | 39.3 | 25.1 |
| Alameda County, California | 1651949 | 28.2 | 9.6 | 32.0 | 23.3 |
| Sacramento County, California | 1584047 | 41.5 | 9.1 | 17.2 | 24.0 |
| Contra Costa County, California | 1161458 | 39.3 | 8.2 | 18.3 | 27.3 |
| Fresno County, California | 1012152 | 27.0 | 4.2 | 10.8 | 54.1 |
| Kern County, California | 910433 | 30.7 | 4.8 | 4.9 | 55.7 |
| Ventura County, California | 838259 | 42.9 | 1.7 | 7.1 | 43.8 |
| San Francisco County, California | 836321 | 37.5 | 4.8 | 34.7 | 15.9 |
| San Joaquin County, California | 787416 | 27.9 | 6.7 | 17.5 | 42.2 |
| San Mateo County, California | 745100 | 35.8 | 2.1 | 30.5 | 24.9 |
| Stanislaus County, California | 552250 | 37.5 | 2.7 | 5.7 | 49.2 |
| Sonoma County, California | 485642 | 59.1 | 1.5 | 4.3 | 29.4 |
| Tulare County, California | 475774 | 26.3 | 1.3 | 3.4 | 66.1 |
| Solano County, California | 450824 | 34.3 | 12.6 | 15.6 | 29.0 |
| Santa Barbara County, California | 443975 | 41.5 | 1.5 | 5.1 | 47.6 |
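
In table B03002 the seven non-Hispanic race groups (`_003` through `_009`) plus the Hispanic or Latino total (`_012`) add up to the total population (`_001`). A quick check that the selected columns partition the population can catch a mistyped variable code. The sketch below uses made-up counts, and `County B` is deliberately ten people short so that the check flags it. The column names `group_sum` and `sums_to_pop` are hypothetical.

```r
library(dplyr)

# Hypothetical counts, for illustration only (not ACS data)
toy <- data.frame(
  NAME     = c("County A", "County B"),
  pop      = c(1000, 500),
  white    = c(400, 200),
  black    = c(100, 50),
  am_ind   = c(10, 5),
  asian    = c(150, 60),
  pac_isl  = c(5, 0),
  other    = c(5, 5),
  mult     = c(30, 30),
  hispanic = c(300, 140)
)

# Flag rows where the groups do not add up to the total population
checked <- toy %>%
  mutate(group_sum = white + black + am_ind + asian +
                     pac_isl + other + mult + hispanic,
         sums_to_pop = group_sum == pop)

checked %>% select(NAME, pop, group_sum, sums_to_pop)
```

The displayed tables show only a subset of the percentage columns, so their rows will not sum to 100 even when this check passes.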
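Comparing median household income (B19013_001) between the 2022 and 2023 ACS 5-year estimates. The two releases cover overlapping periods (2018–2022 and 2019–2023), so the year-over-year changes below are indicative rather than exact: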
ca22 <- get_acs(geography = "county",
                variables = "B19013_001",
                state = "CA",
                year = 2022)

ca23 <- get_acs(geography = "county",
                variables = "B19013_001",
                state = "CA",
                year = 2023)

# Join by GEOID and compute year-over-year change
ca_joined <- ca23 %>%
  inner_join(ca22, by = "GEOID") %>%
  select(GEOID, county = NAME.x,
         hhi_2023 = estimate.x,
         hhi_2022 = estimate.y) %>%
  mutate(change = hhi_2023 - hhi_2022,
         pct_change = round(100 * change / hhi_2022, 1)) %>%
  arrange(desc(pct_change))

kable(head(ca_joined, 15),
      caption = "Counties with the largest household income increases (2022–2023)")

| GEOID | county | hhi_2023 | hhi_2022 | change | pct_change |
|---|---|---|---|---|---|
| 06027 | Inyo County, California | 72432 | 63417 | 9015 | 14.2 |
| 06105 | Trinity County, California | 53498 | 47317 | 6181 | 13.1 |
| 06021 | Glenn County, California | 70487 | 64033 | 6454 | 10.1 |
| 06115 | Yuba County, California | 73313 | 66693 | 6620 | 9.9 |
| 06003 | Alpine County, California | 110781 | 101125 | 9656 | 9.5 |
| 06015 | Del Norte County, California | 66780 | 61149 | 5631 | 9.2 |
| 06005 | Amador County, California | 81526 | 74853 | 6673 | 8.9 |
| 06043 | Mariposa County, California | 65378 | 60021 | 5357 | 8.9 |
| 06035 | Lassen County, California | 64395 | 59515 | 4880 | 8.2 |
| 06011 | Colusa County, California | 75149 | 69619 | 5530 | 7.9 |
| 06107 | Tulare County, California | 69489 | 64474 | 5015 | 7.8 |
| 06017 | El Dorado County, California | 106190 | 99246 | 6944 | 7.0 |
| 06057 | Nevada County, California | 84905 | 79395 | 5510 | 6.9 |
| 06077 | San Joaquin County, California | 88531 | 82837 | 5694 | 6.9 |
| 06099 | Stanislaus County, California | 79661 | 74872 | 4789 | 6.4 |
kable(tail(ca_joined %>% arrange(desc(pct_change)), 10),
      caption = "Counties with the smallest or negative household income changes (2022–2023)")

| GEOID | county | hhi_2023 | hhi_2022 | change | pct_change |
|---|---|---|---|---|---|
| 06055 | Napa County, California | 108970 | 105809 | 3161 | 3.0 |
| 06093 | Siskiyou County, California | 55499 | 53898 | 1601 | 3.0 |
| 06095 | Solano County, California | 99994 | 97037 | 2957 | 3.0 |
| 06039 | Madera County, California | 75496 | 73543 | 1953 | 2.7 |
| 06109 | Tuolumne County, California | 72259 | 70432 | 1827 | 2.6 |
| 06041 | Marin County, California | 142785 | 142019 | 766 | 0.5 |
| 06047 | Merced County, California | 65044 | 64772 | 272 | 0.4 |
| 06031 | Kings County, California | 68750 | 68540 | 210 | 0.3 |
| 06091 | Sierra County, California | 60000 | 61108 | -1108 | -1.8 |
| 06063 | Plumas County, California | 64946 | 67885 | -2939 | -4.3 |
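
The join relies on dplyr's default `.x` / `.y` suffixes, which then have to be renamed. A possible variation, sketched below, passes explicit year suffixes to `inner_join()` so the columns are self-describing from the start. The two small data frames reuse the Inyo and Plumas values from the tables above. The names `inc22`, `inc23`, and `joined` are hypothetical.

```r
library(dplyr)

# Values copied from the tables above (Inyo and Plumas counties)
inc23 <- data.frame(
  GEOID    = c("06027", "06063"),
  NAME     = c("Inyo County, California", "Plumas County, California"),
  estimate = c(72432, 64946)
)
inc22 <- data.frame(
  GEOID    = c("06027", "06063"),
  NAME     = c("Inyo County, California", "Plumas County, California"),
  estimate = c(63417, 67885)
)

joined <- inc23 %>%
  # Drop the duplicate NAME column before joining; label estimates by year
  inner_join(select(inc22, GEOID, estimate),
             by = "GEOID", suffix = c("_2023", "_2022")) %>%
  mutate(change = estimate_2023 - estimate_2022,
         pct_change = round(100 * change / estimate_2022, 1))

joined
```

With these inputs the computed changes match the corresponding rows in the tables: 14.2 percent for Inyo County and -4.3 percent for Plumas County.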
The tidycensus pipeline used throughout (Pull → Pivot → Select/Rename → Mutate) provides a repeatable workflow for Census data analysis.

Data source: U.S. Census Bureau, American Community Survey 5-Year Estimates (2022–2023)