Overview

This analysis uses the U.S. Census Bureau’s American Community Survey (ACS) to examine demographic patterns across California — from county-level summaries to census tract-level detail. Using the tidycensus R package, we pull data directly from the Census API and calculate racial/ethnic composition and age distributions.

Focus areas:

- Median age by sex across California counties
- Racial and ethnic composition at the census tract level (Alameda County)
- County-level demographic comparisons
- Joining multi-year Census data to track changes

Setup

library(tidyverse)   # also attaches dplyr and ggplot2
library(dplyr)
library(ggplot2)
library(knitr)       # kable()
library(psych)
library(tidycensus)  # get_acs(), census_api_key()

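Note: This analysis requires a Census API key. You can request one at https://api.census.gov/data/key_signup.html and install it with census_api_key("YOUR_KEY", install = TRUE). After installing, restart R or run readRenviron("~/.Renviron") so the key is picked up.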
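Before running any of the API calls, it can help to confirm that a key is actually visible to the session. The following is a minimal sketch, assuming the key was stored the way tidycensus stores it, in the CENSUS_API_KEY environment variable:

```r
# tidycensus reads the key from the CENSUS_API_KEY environment variable.
key <- Sys.getenv("CENSUS_API_KEY")

if (nzchar(key)) {
  message("Census API key found.")
} else {
  message("No key set. Run census_api_key(\"YOUR_KEY\", install = TRUE), then restart R.")
}
```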

Part I: Age Demographics

County-Level Median Age

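Pulling median age (variable B01002_001) for all California counties from the 2023 ACS 5-year estimates, which cover 2019–2023:

ca <- get_acs(geography = "county",
              variables = "B01002_001",
              state = "CA",
              year = 2023,
              survey = "acs5")  # 5-year estimates (the get_acs() default)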

ca %>%
  arrange(desc(estimate)) %>%
  head(15) %>%
  kable(caption = "California counties with the highest median age (2023 ACS)")

California counties with the highest median age (2023 ACS)
GEOID | NAME | variable | estimate | moe
06091 | Sierra County, California | B01002_001 | 56.0 | 5.0
06105 | Trinity County, California | B01002_001 | 54.8 | 0.9
06063 | Plumas County, California | B01002_001 | 52.1 | 0.8
06009 | Calaveras County, California | B01002_001 | 52.0 | 0.7
06043 | Mariposa County, California | B01002_001 | 51.6 | 1.0
06057 | Nevada County, California | B01002_001 | 50.3 | 0.2
06005 | Amador County, California | B01002_001 | 49.9 | 0.6
06049 | Modoc County, California | B01002_001 | 49.0 | 2.1
06109 | Tuolumne County, California | B01002_001 | 48.8 | 0.4
06093 | Siskiyou County, California | B01002_001 | 47.4 | 0.6
06041 | Marin County, California | B01002_001 | 47.3 | 0.2
06017 | El Dorado County, California | B01002_001 | 46.1 | 0.2
06027 | Inyo County, California | B01002_001 | 45.6 | 0.5
06033 | Lake County, California | B01002_001 | 44.2 | 0.3
06045 | Mendocino County, California | B01002_001 | 43.9 | 0.3
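The moe column matters when ranking small counties. ACS margins of error are published at the 90 percent confidence level, so a standard error can be approximated as moe / 1.645. The sketch below, using three rows copied from the table above, derives an interval and a coefficient of variation for each estimate; the column names lower, upper, se, and cv_pct are illustrative choices, not part of the tidycensus output.

```r
library(dplyr)

# Three rows copied from the county median-age table above.
ages <- tibble(
  NAME     = c("Sierra County", "Trinity County", "Nevada County"),
  estimate = c(56.0, 54.8, 50.3),
  moe      = c(5.0, 0.9, 0.2)
)

reliability <- ages %>%
  mutate(
    lower  = estimate - moe,               # lower bound of the 90% interval
    upper  = estimate + moe,               # upper bound of the 90% interval
    se     = moe / 1.645,                  # approximate standard error
    cv_pct = round(100 * se / estimate, 1) # coefficient of variation, in percent
  )

print(reliability)
```

Sierra County's interval (51.0 to 61.0) contains Trinity County's entire interval (53.9 to 55.7), so the order of the top two counties should not be read as firm.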

Tract-Level Age by Sex (Alameda County)

Pulling median age disaggregated by sex at the census tract level:

# B01002: median age by sex (_001 total, _002 male, _003 female)
agevars <- c("B01002_001", "B01002_002", "B01002_003")

altrtage <- get_acs(geography = "tract",
                    variables = agevars,
                    county = "Alameda",
                    state = "CA",
                    year = 2023)

# Pivot from stacked to wide format
wide_altrtage <- altrtage %>%
  pivot_wider(names_from = variable, values_from = c(estimate, moe))

# Rename columns for clarity
pretty_altrtage <- wide_altrtage %>%
  dplyr::select(GEOID, NAME,
         medage = estimate_B01002_001,
         male_age = estimate_B01002_002,
         fem_age = estimate_B01002_003)

kable(head(pretty_altrtage, 15),
      caption = "Median age by sex for Alameda County census tracts")

Median age by sex for Alameda County census tracts
GEOID | NAME | medage | male_age | fem_age
06001400100 | Census Tract 4001; Alameda County; California | 51.4 | 51.5 | 51.1
06001400200 | Census Tract 4002; Alameda County; California | 45.0 | 45.0 | 43.8
06001400300 | Census Tract 4003; Alameda County; California | 39.6 | 36.0 | 44.8
06001400400 | Census Tract 4004; Alameda County; California | 37.6 | 37.7 | 37.5
06001400500 | Census Tract 4005; Alameda County; California | 35.6 | 34.3 | 38.1
06001400600 | Census Tract 4006; Alameda County; California | 39.8 | 39.4 | 40.3
06001400700 | Census Tract 4007; Alameda County; California | 37.3 | 38.0 | 36.3
06001400800 | Census Tract 4008; Alameda County; California | 36.4 | 37.3 | 34.8
06001400900 | Census Tract 4009; Alameda County; California | 35.3 | 33.9 | 37.0
06001401000 | Census Tract 4010; Alameda County; California | 34.9 | 35.4 | 34.1
06001401100 | Census Tract 4011; Alameda County; California | 33.2 | 34.0 | 32.4
06001401200 | Census Tract 4012; Alameda County; California | 37.7 | 39.3 | 35.0
06001401300 | Census Tract 4013; Alameda County; California | 35.3 | 36.1 | 34.6
06001401400 | Census Tract 4014; Alameda County; California | 30.9 | 30.7 | 32.8
06001401500 | Census Tract 4015; Alameda County; California | 33.9 | 33.4 | 35.3
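To see what the pivot step does to column names, here is a small offline sketch for a single tract. The estimates are copied from the Tract 4003 row above; the moe values are NA placeholders because the rendered table does not show them, and sex_gap is a hypothetical column added for illustration.

```r
library(dplyr)
library(tidyr)

# Stacked rows for one tract, shaped like get_acs() output.
# moe is an NA placeholder here, not real data.
stacked <- tibble(
  GEOID    = "06001400300",
  variable = c("B01002_001", "B01002_002", "B01002_003"),
  estimate = c(39.6, 36.0, 44.8),
  moe      = NA_real_
)

# With two value columns, pivot_wider() prefixes each new column
# with the value column's name: estimate_<variable>, moe_<variable>.
wide <- stacked %>%
  pivot_wider(names_from = variable, values_from = c(estimate, moe))

print(names(wide))

# Female minus male median age for the tract.
gap <- wide %>%
  transmute(GEOID, sex_gap = estimate_B01002_003 - estimate_B01002_002)

print(gap)
```

For this tract the female median age is about 8.8 years above the male median, the largest gap among the 15 rows shown.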

Part II: Racial & Ethnic Composition

Alameda County — Census Tract Level

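# B03002: Hispanic or Latino origin by race.
# _001 total population; _002 not Hispanic or Latino;
# _003 to _009 race groups among non-Hispanic residents;
# _012 Hispanic or Latino (any race)
racevars <- c("B03002_001", "B03002_002", "B03002_003", "B03002_004",
              "B03002_005", "B03002_006", "B03002_007", "B03002_008",
              "B03002_009", "B03002_012")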

alameda <- get_acs(geography = "tract",
                   variables = racevars,
                   state = "CA",
                   county = "Alameda",
                   year = 2023)

# Pivot and rename
wide_alameda <- alameda %>%
  pivot_wider(names_from = variable, values_from = c(estimate, moe))

pretty_alameda <- wide_alameda %>%
  dplyr::select(GEOID, NAME,
         pop = estimate_B03002_001,
         not_hisp = estimate_B03002_002,
         white = estimate_B03002_003,
         black = estimate_B03002_004,
         am_ind = estimate_B03002_005,
         asian = estimate_B03002_006,
         pac_isl = estimate_B03002_007,
         other = estimate_B03002_008,
         mult = estimate_B03002_009,
         hispanic = estimate_B03002_012)

Computing Demographic Percentages

# Share of each group in the tract's total population.
# Tracts with pop == 0 yield NaN.
alameda_fin <- pretty_alameda %>%
  mutate(pct_white   = round(100 * white / pop, 1),
         pct_black   = round(100 * black / pop, 1),
         pct_am_ind  = round(100 * am_ind / pop, 1),
         pct_asian   = round(100 * asian / pop, 1),
         pct_pac_isl = round(100 * pac_isl / pop, 1),
         pct_other   = round(100 * other / pop, 1),
         pct_mult    = round(100 * mult / pop, 1),
         pct_hisp    = round(100 * hispanic / pop, 1))

kable(head(alameda_fin %>%
             select(NAME, pop, pct_white, pct_black, pct_asian, pct_hisp, pct_mult), 15),
      caption = "Racial/ethnic composition by census tract (Alameda County)")

Racial/ethnic composition by census tract (Alameda County)
NAME | pop | pct_white | pct_black | pct_asian | pct_hisp | pct_mult
Census Tract 4001; Alameda County; California | 3094 | 68.1 | 4.4 | 14.9 | 6.5 | 5.3
Census Tract 4002; Alameda County; California | 2093 | 67.3 | 2.1 | 12.2 | 9.4 | 8.1
Census Tract 4003; Alameda County; California | 5727 | 58.8 | 9.1 | 10.6 | 8.7 | 11.6
Census Tract 4004; Alameda County; California | 4395 | 60.2 | 9.9 | 9.6 | 13.7 | 6.1
Census Tract 4005; Alameda County; California | 3822 | 44.4 | 23.8 | 8.0 | 14.6 | 8.9
Census Tract 4006; Alameda County; California | 1957 | 40.9 | 27.8 | 5.2 | 10.1 | 12.2
Census Tract 4007; Alameda County; California | 4404 | 50.6 | 23.9 | 6.3 | 10.3 | 7.3
Census Tract 4008; Alameda County; California | 4583 | 53.3 | 8.1 | 15.1 | 13.0 | 8.6
Census Tract 4009; Alameda County; California | 2752 | 34.8 | 27.6 | 7.7 | 18.5 | 9.9
Census Tract 4010; Alameda County; California | 6529 | 30.4 | 37.5 | 6.1 | 13.7 | 10.9
Census Tract 4011; Alameda County; California | 5627 | 42.5 | 17.8 | 14.7 | 13.8 | 9.4
Census Tract 4012; Alameda County; California | 3068 | 57.5 | 6.7 | 15.4 | 11.6 | 8.8
Census Tract 4013; Alameda County; California | 4368 | 33.4 | 28.5 | 16.8 | 8.0 | 11.7
Census Tract 4014; Alameda County; California | 4876 | 26.2 | 35.8 | 10.4 | 22.4 | 5.3
Census Tract 4015; Alameda County; California | 2827 | 21.1 | 52.1 | 4.5 | 18.0 | 4.2
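The B03002 groups used here are mutually exclusive: the race columns count non-Hispanic residents only, and Hispanic residents of any race are counted once. The shares for a tract should therefore add to roughly 100, up to rounding. As a quick sanity check on Census Tract 4001, the sketch below sums the five percentages shown in the table above; the remainder is the combined share of the three groups that the table does not display (pct_am_ind, pct_pac_isl, pct_other).

```r
# Percentages for Census Tract 4001, copied from the table above.
shown <- c(pct_white = 68.1, pct_black = 4.4, pct_asian = 14.9,
           pct_hisp = 6.5, pct_mult = 5.3)

shown_total <- sum(shown)
remainder   <- 100 - shown_total  # share left for the groups not displayed

print(round(c(shown_total = shown_total, remainder = remainder), 1))
```

A remainder that is negative or implausibly large would suggest a mislabeled variable or a denominator other than the B03002_001 total.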

California Counties — Statewide Comparison

cacounties <- get_acs(geography = "county",
                      variables = racevars,
                      state = "CA",
                      year = 2023)

wide_cacounties <- cacounties %>%
  pivot_wider(names_from = variable, values_from = c(estimate, moe))

pretty_cacounties <- wide_cacounties %>%
  dplyr::select(GEOID, NAME,
         pop = estimate_B03002_001,
         not_hisp = estimate_B03002_002,
         white = estimate_B03002_003,
         black = estimate_B03002_004,
         am_ind = estimate_B03002_005,
         asian = estimate_B03002_006,
         pac_isl = estimate_B03002_007,
         other = estimate_B03002_008,
         mult = estimate_B03002_009,
         hispanic = estimate_B03002_012)

cacounties_fin <- pretty_cacounties %>%
  mutate(pct_white   = round(100 * white / pop, 1),
         pct_black   = round(100 * black / pop, 1),
         pct_am_ind  = round(100 * am_ind / pop, 1),
         pct_asian   = round(100 * asian / pop, 1),
         pct_pac_isl = round(100 * pac_isl / pop, 1),
         pct_other   = round(100 * other / pop, 1),
         pct_mult    = round(100 * mult / pop, 1),
         pct_hisp    = round(100 * hispanic / pop, 1))

kable(head(cacounties_fin %>%
             select(NAME, pop, pct_white, pct_black, pct_asian, pct_hisp) %>%
             arrange(desc(pop)), 20),
      caption = "Racial/ethnic composition by county (California, 2023)")

Racial/ethnic composition by county (California, 2023)
NAME | pop | pct_white | pct_black | pct_asian | pct_hisp
Los Angeles County, California | 9848406 | 25.2 | 7.5 | 14.8 | 48.3
San Diego County, California | 3282782 | 43.2 | 4.4 | 11.9 | 34.3
Orange County, California | 3164063 | 37.7 | 1.5 | 21.7 | 34.1
Riverside County, California | 2449909 | 32.0 | 6.1 | 6.8 | 50.6
San Bernardino County, California | 2187816 | 25.6 | 7.6 | 7.9 | 54.6
Santa Clara County, California | 1903297 | 28.2 | 2.3 | 39.3 | 25.1
Alameda County, California | 1651949 | 28.2 | 9.6 | 32.0 | 23.3
Sacramento County, California | 1584047 | 41.5 | 9.1 | 17.2 | 24.0
Contra Costa County, California | 1161458 | 39.3 | 8.2 | 18.3 | 27.3
Fresno County, California | 1012152 | 27.0 | 4.2 | 10.8 | 54.1
Kern County, California | 910433 | 30.7 | 4.8 | 4.9 | 55.7
Ventura County, California | 838259 | 42.9 | 1.7 | 7.1 | 43.8
San Francisco County, California | 836321 | 37.5 | 4.8 | 34.7 | 15.9
San Joaquin County, California | 787416 | 27.9 | 6.7 | 17.5 | 42.2
San Mateo County, California | 745100 | 35.8 | 2.1 | 30.5 | 24.9
Stanislaus County, California | 552250 | 37.5 | 2.7 | 5.7 | 49.2
Sonoma County, California | 485642 | 59.1 | 1.5 | 4.3 | 29.4
Tulare County, California | 475774 | 26.3 | 1.3 | 3.4 | 66.1
Solano County, California | 450824 | 34.3 | 12.6 | 15.6 | 29.0
Santa Barbara County, California | 443975 | 41.5 | 1.5 | 5.1 | 47.6
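One way to summarize a table like this is to find the largest group in each county. The sketch below does that for three rows copied from the table above, considering only the four groups the table displays; the object names counties and largest_group are illustrative.

```r
library(dplyr)
library(tidyr)

# Three rows copied from the county table above.
counties <- tibble(
  NAME      = c("Los Angeles County", "San Diego County", "Santa Clara County"),
  pct_white = c(25.2, 43.2, 28.2),
  pct_black = c(7.5, 4.4, 2.3),
  pct_asian = c(14.8, 11.9, 39.3),
  pct_hisp  = c(48.3, 34.3, 25.1)
)

largest_group <- counties %>%
  # One row per county and group, dropping the "pct_" prefix from group names.
  pivot_longer(starts_with("pct_"),
               names_to = "group", names_prefix = "pct_",
               values_to = "pct") %>%
  group_by(NAME) %>%
  slice_max(pct, n = 1) %>%  # keep the group with the highest share
  ungroup()

print(largest_group)
```

Among these three counties the largest displayed group differs each time (Hispanic in Los Angeles, non-Hispanic White in San Diego, Asian in Santa Clara), which is the kind of contrast the full table is meant to surface.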

Part III: Joining Multi-Year Census Data

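Comparing median household income (variable B19013_001) between the 2022 and 2023 ACS 5-year releases. These releases cover 2018–2022 and 2019–2023, so they share four of their five survey years, and each is reported in its own end-year dollars. Treat the differences below as rough indicators rather than true year-over-year changes: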

ca22 <- get_acs(geography = "county",
                variables = "B19013_001",
                state = "CA",
                year = 2022)

ca23 <- get_acs(geography = "county",
                variables = "B19013_001",
                state = "CA",
                year = 2023)

# Join by GEOID and compute the change between the two 5-year releases
# (NAME.x / estimate.x come from ca23, estimate.y from ca22)
ca_joined <- ca23 %>%
  inner_join(ca22, by = "GEOID") %>%
  select(GEOID, county = NAME.x,
         hhi_2023 = estimate.x,
         hhi_2022 = estimate.y) %>%
  mutate(change = hhi_2023 - hhi_2022,
         pct_change = round(100 * change / hhi_2022, 1)) %>%
  arrange(desc(pct_change))

kable(head(ca_joined, 15),
      caption = "Counties with the largest household income increases (2022–2023)")

Counties with the largest household income increases (2022–2023)
GEOID | county | hhi_2023 | hhi_2022 | change | pct_change
06027 | Inyo County, California | 72432 | 63417 | 9015 | 14.2
06105 | Trinity County, California | 53498 | 47317 | 6181 | 13.1
06021 | Glenn County, California | 70487 | 64033 | 6454 | 10.1
06115 | Yuba County, California | 73313 | 66693 | 6620 | 9.9
06003 | Alpine County, California | 110781 | 101125 | 9656 | 9.5
06015 | Del Norte County, California | 66780 | 61149 | 5631 | 9.2
06005 | Amador County, California | 81526 | 74853 | 6673 | 8.9
06043 | Mariposa County, California | 65378 | 60021 | 5357 | 8.9
06035 | Lassen County, California | 64395 | 59515 | 4880 | 8.2
06011 | Colusa County, California | 75149 | 69619 | 5530 | 7.9
06107 | Tulare County, California | 69489 | 64474 | 5015 | 7.8
06017 | El Dorado County, California | 106190 | 99246 | 6944 | 7.0
06057 | Nevada County, California | 84905 | 79395 | 5510 | 6.9
06077 | San Joaquin County, California | 88531 | 82837 | 5694 | 6.9
06099 | Stanislaus County, California | 79661 | 74872 | 4789 | 6.4

# ca_joined is already sorted by pct_change, so tail() gives the smallest changes
kable(tail(ca_joined, 10),
      caption = "Counties with the smallest or negative household income changes (2022–2023)")

Counties with the smallest or negative household income changes (2022–2023)
GEOID | county | hhi_2023 | hhi_2022 | change | pct_change
06055 | Napa County, California | 108970 | 105809 | 3161 | 3.0
06093 | Siskiyou County, California | 55499 | 53898 | 1601 | 3.0
06095 | Solano County, California | 99994 | 97037 | 2957 | 3.0
06039 | Madera County, California | 75496 | 73543 | 1953 | 2.7
06109 | Tuolumne County, California | 72259 | 70432 | 1827 | 2.6
06041 | Marin County, California | 142785 | 142019 | 766 | 0.5
06047 | Merced County, California | 65044 | 64772 | 272 | 0.4
06031 | Kings County, California | 68750 | 68540 | 210 | 0.3
06091 | Sierra County, California | 60000 | 61108 | -1108 | -1.8
06063 | Plumas County, California | 64946 | 67885 | -2939 | -4.3
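The join logic can be checked offline on two counties whose values appear in the tables above. This sketch is a variant of the join, not the code used for the report: it passes a suffix argument to inner_join() so the joined columns come out as estimate_2023 and estimate_2022 instead of estimate.x and estimate.y, and it uses anti_join() to confirm that no county is lost. The object names inc22, inc23, unmatched, and joined are illustrative.

```r
library(dplyr)

# Median household income for two counties, copied from the tables above.
inc23 <- tibble(
  GEOID    = c("06027", "06063"),
  NAME     = c("Inyo County, California", "Plumas County, California"),
  estimate = c(72432, 64946)
)
inc22 <- tibble(
  GEOID    = c("06027", "06063"),
  NAME     = c("Inyo County, California", "Plumas County, California"),
  estimate = c(63417, 67885)
)

# Counties present in one release but not the other (expected: none).
unmatched <- anti_join(inc23, inc22, by = "GEOID")

joined <- inc23 %>%
  inner_join(select(inc22, GEOID, estimate),
             by = "GEOID", suffix = c("_2023", "_2022")) %>%
  mutate(change     = estimate_2023 - estimate_2022,
         pct_change = round(100 * change / estimate_2022, 1))

print(joined)
```

The computed changes (14.2 percent for Inyo County, -4.3 percent for Plumas County) match the rows in the tables above.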

Key Takeaways

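  1. California is highly diverse: Racial and ethnic composition varies widely between counties (among the 20 largest, the Hispanic share runs from 15.9% in San Francisco County to 66.1% in Tulare County) and within counties at the tract level.
  2. Age patterns vary geographically: Median age differs substantially across tracts (30.9 to 51.4 among the 15 Alameda County tracts shown), reflecting different community characteristics.
  3. Census data enables comparison over time: Joining ACS releases by GEOID tracks county-level income, though consecutive 5-year estimates share four of five survey years, so the differences understate and smooth true year-over-year change.
  4. The tidycensus pipeline: Pull → Pivot → Select/Rename → Mutate provides a repeatable workflow for Census data analysis.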
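The Pivot → Select/Rename → Mutate steps could be wrapped in a single function so that tracts and counties share one code path. The following is a sketch only: acs_shares() is a hypothetical helper, not part of tidycensus, and the toy input uses made-up counts rather than real ACS values.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: reshape stacked get_acs()-style output and add
# percentage columns. `vars` is a named vector, c(new_name = "ACS variable").
acs_shares <- function(stacked, vars, total = "pop") {
  wide <- stacked %>%
    pivot_wider(id_cols = c(GEOID, NAME),
                names_from = variable, values_from = estimate) %>%
    rename(all_of(vars))                  # e.g. B03002_001 becomes pop

  denom  <- wide[[total]]                 # denominator for every share
  groups <- setdiff(names(vars), total)   # every renamed column but the total

  wide %>%
    mutate(across(all_of(groups),
                  ~ round(100 * .x / denom, 1),
                  .names = "pct_{.col}"))
}

# Toy input with made-up counts, for illustration only.
toy <- tibble(
  GEOID    = "X1",
  NAME     = "Toy tract",
  variable = c("B03002_001", "B03002_003", "B03002_012"),
  estimate = c(1000, 500, 250),
  moe      = NA_real_
)

vars <- c(pop = "B03002_001", white = "B03002_003", hispanic = "B03002_012")
out  <- acs_shares(toy, vars)
print(out)
```

Note that this variant names the Hispanic share pct_hispanic rather than pct_hisp, because across() builds the new names from the renamed columns.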

Data source: U.S. Census Bureau, American Community Survey 5-Year Estimates (2018–2022 and 2019–2023)