Extreme commuting

Summary: The share of workers age 16 or older who work outside of home that commute 90 minutes or more to work, one-way. Data for 2010 and 2020 represent five-year averages (e.g., 2016-2020).

Data source(s): U.S. Census Bureau, 2010 and 2020 American Community Survey 5-year Summary Files; GeoLytics, Inc., 2000 Long Form in 2010 Boundaries; Integrated Public Use Microdata Series, IPUMS USA, University of Minnesota, www.ipums.org, 2000 5% sample, 2010 and 2020 American Community Survey 5-year samples.

Universe: All workers age 16 or older who work outside of home.

Methods: The number and percentage of extreme commuters among workers age 16 or older who work outside of home were calculated by race/ethnicity, gender, nativity, ancestry, commute mode and poverty level for each year and geography. Private vehicles include cars, trucks, and motorcycles; public transportation includes buses, streetcars, subways/light rails, railroads, taxicabs, and ferryboats; walk or bike includes bicycles, walking, and other modes. See the methodology page for other relevant notes.

Notes:

  • Latinos include people of Hispanic origin of any race and all other groups exclude people of Hispanic origin.
  • Data from 2010 and 2020 represent 2006-2010 and 2016-2020 averages, respectively.
  • Workers are defined as people who reported working during the week prior to the survey (and they had to have worked outside of home to be included in the universe).
  • No data are available for other cities or towns or Census Designated Places for the by race/ethnicity, by gender, by nativity, by ancestry, and by poverty breakdowns, as they are based on the IPUMS microdata.
  • No data are reported if based on fewer than 100 individual (i.e., unweighted) survey respondents age 16 or older who work outside of home for the by race/ethnicity, by gender, by nativity, by ancestry, and by poverty breakdowns.
  • No data are reported if based on fewer than 100 people age 16 or older who work outside of home for the trend, ranking and map breakdowns.
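As a rough illustration of the calculation and of the suppression rule for the microdata-based breakdowns, the sketch below computes a weighted share of extreme commuters and returns no value when fewer than 100 unweighted respondents are in the universe. The record layout and the function name are hypothetical and are not taken from the code used to build the indicator.

```python
def extreme_commute_share(records, min_unweighted=100, cutoff_minutes=90):
    """Weighted share of workers commuting `cutoff_minutes` or more, one-way.

    records: list of (one_way_minutes, person_weight) tuples for workers
    age 16 or older who work outside of home (hypothetical layout).
    """
    # Suppress the estimate when too few unweighted respondents support it.
    if len(records) < min_unweighted:
        return None
    total_weight = sum(weight for _, weight in records)
    extreme_weight = sum(
        weight for minutes, weight in records if minutes >= cutoff_minutes
    )
    return extreme_weight / total_weight
```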

Housing burden

Summary: The share of owner- and renter-occupied households that are cost-burdened (spending more than 30 percent of income on housing costs) and "severely" cost-burdened (more than 50 percent). Data for 2010 and 2020 represent five-year averages (e.g., 2016-2020).

Data source(s): U.S. Census Bureau, 2010 and 2020 American Community Survey 5-year Summary Files; GeoLytics, Inc., 2000 Long Form in 2010 Boundaries; Integrated Public Use Microdata Series, IPUMS USA, University of Minnesota, www.ipums.org, 2000 5% sample, 2010 and 2020 American Community Survey 5-year samples.

Universe: Occupied households with housing costs, excluding non-traditional owner-occupied households (e.g., multi-unit structures and trailers).

Methods: The number and percentage of burdened and severely burdened households were calculated by tenure (owner vs. renter), race/ethnicity, gender, nativity, ancestry and poverty level for each year and geography. Housing costs for renters include contract rent as well as utilities, while housing costs for owners include most costs of owning a home such as mortgage, insurance, utilities, real estate taxes and other costs. See the methodology page for other relevant notes.
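A minimal sketch of the burden classification follows, assuming monthly housing costs are annualized and compared with annual household income. The function name and the handling of zero or negative income are our assumptions for illustration only.

```python
def housing_burden_flags(monthly_housing_cost, annual_household_income):
    """Return (burdened, severely_burdened) for one household.

    Burdened: more than 30 percent of income spent on housing costs.
    Severely burdened: more than 50 percent (a subset of burdened).
    """
    if annual_household_income <= 0:
        # Assumption: households without positive income are left unclassified.
        return None
    ratio = monthly_housing_cost * 12 / annual_household_income
    return ratio > 0.30, ratio > 0.50
```

For example, a household paying $2,000 per month on $40,000 of annual income spends 60 percent of income on housing and is flagged as both burdened and severely burdened.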

Notes:

  • Latinos include people of Hispanic origin of any race and all other groups exclude people of Hispanic origin.
  • Demographic characteristics are based on those of the householder.
  • Household income is based on the year prior to the survey while housing costs are based on the survey year.
  • Housing costs are based on the month the survey was conducted.
  • For the 'By poverty' breakdown, the poverty level is based on family size. For instance, in 2020, the 200% cutoff for a family of four (with two kids) was about $52K and the 350% cutoff was around $92K.
  • Data for 2000 is based on a survey that year but reflects income from the year prior, while data for 2010 and 2020 represent 2006-2010 and 2016-2020 averages, respectively.
  • No data are available for other cities or towns or Census Designated Places for the by race/ethnicity, by gender, by nativity, by ancestry, and by poverty breakdowns, as they are based on the IPUMS microdata.
  • No data are reported if based on fewer than 100 individual (i.e., unweighted) survey respondents who are owner- or renter-occupied householders for the by race/ethnicity, by gender, by nativity, by ancestry, and by poverty breakdowns.
  • No data are reported if based on fewer than 100 owner- or renter-occupied householders for the trend, ranking and map breakdowns.
     

Homeownership

Summary: The percentage of households that are owner-occupied. Data for 2010 and 2020 represent five-year averages (e.g. 2016-2020). 

Data Source(s): Integrated Public Use Microdata Series, IPUMS USA, University of Minnesota, www.ipums.org, 2000 5% Sample, 2010 and 2020 American Community Survey 5-year samples; U.S. Census Bureau, 2010 and 2020 American Community Survey 5-year Summary Files; GeoLytics, Inc., 2000 Long Form in 2010 Boundaries; IPUMS NHGIS, University of Minnesota, www.nhgis.org, NHGIS crosswalk files, 2020 blocks to 2010 blocks.

Universe: All households.

Methods: The rate of homeownership was calculated by race/ethnicity, gender, nativity, and ancestry for each year and geography. For the map breakdown, census tract level data from the 2020 5-year American Community Survey was re-estimated into 2010 census tract boundaries using a 2020 to 2010 census tract geographic crosswalk developed using 2020 block level population data from the 2020 Census Redistricting Data along with a block level geographic crosswalk (2020 to 2010 blocks) from NHGIS. See the methodology page for other relevant notes.
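The re-estimation into 2010 tract boundaries can be sketched as a population-weighted allocation. The example assumes the block-level crosswalk has already been reduced to pieces of 2020 block population tagged with their 2020 and 2010 tracts, and that counts (not rates) are allocated before rates are recomputed. Both function names and the input layout are hypothetical.

```python
from collections import defaultdict


def tract_weights(block_pieces):
    """Share of each 2020 tract's population falling in each 2010 tract.

    block_pieces: iterable of (tract_2020, tract_2010, population) tuples
    derived from block-level population and a block-to-block crosswalk.
    """
    pair_pop = defaultdict(float)
    tract_pop = defaultdict(float)
    for t20, t10, pop in block_pieces:
        pair_pop[(t20, t10)] += pop
        tract_pop[t20] += pop
    return {
        pair: pop / tract_pop[pair[0]]
        for pair, pop in pair_pop.items()
        if tract_pop[pair[0]] > 0
    }


def reestimate(counts_2020, weights):
    """Allocate 2020-tract counts to 2010 tracts using the weights."""
    counts_2010 = defaultdict(float)
    for (t20, t10), weight in weights.items():
        counts_2010[t10] += counts_2020.get(t20, 0.0) * weight
    return dict(counts_2010)
```

Under this sketch, the homeownership rate for a 2010 tract would be the re-estimated count of owner-occupied households divided by the re-estimated count of all households.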

Notes:

  • Latinos include people of Hispanic origin of any race and all other groups exclude people of Hispanic origin.
  • Data for 2010 and 2020 represent 2006-2010 and 2016-2020 averages, respectively. 
  • No data are reported if based on fewer than 100 individual (i.e., unweighted) survey respondents in the universe for the race/ethnicity, trend, by gender, by nativity, by ancestry, and ranking breakdowns.
  • No data are reported if based on fewer than 500 people in the universe for the map breakdown; a lower minimum reporting threshold of at least 100 people was applied for census tract level estimates in the map breakdown.

Market rent

Summary: Estimated monthly median market rent based on the Zillow Rent Index (ZRI) for all rental units. Values are for April 30th of each year and are not adjusted for inflation.

Data Source(s): Zillow Group, Inc., Zillow Rent Index (ZRI) for all rental units; U.S. Census Bureau, 2011, 2012, 2013, 2014, 2015, 2016, and 2017 American Community Survey 5-year Summary Files.

Universe: All rental housing units (multifamily, single-family residence and condo/co-op).

Methods: An extract of the monthly ZRI for all rental units (multifamily, single-family residence and condo/co-op) at the 2010 census tract level of geography was obtained from Zillow Group, Inc. The ZRI is "a smoothed measure of the median estimated market rate rent across a given region and housing type." Median "market" rent reflects the median rent of units up for rent at a given point in time, and not necessarily the median rent of all renter-occupied units. In places where rents are on the rise, median market rent will be higher than the median rent paid by all renters and vice-versa. The underlying data used to derive the ZRI are the particular set of housing units for rent in a geographic area at a given point in time. Because this sample of rental units may not reflect the overall rental housing stock in terms of types of units for rent, which could bias the estimate of median market rent, the methodology behind the ZRI is designed to adjust for differences between the sample of units for rent and the overall rental stock to yield an index of median market rent that is unaffected by the mix of homes in each sample used. For example, imagine a neighborhood that is mostly comprised of multifamily apartments, but for some reason most of the units up for rent at a given point in time are single-family units. Since single-family units tend to be more expensive, this could bias the estimate of median market rent upward if no adjustments are made. A "smoothed measure" simply means that the ZRI for any given month is a three-month average to smooth the volatility in the estimates that can occur from one month to the next. ZRI estimates for April 30 of each year from 2011 through 2017 were selected for all census tracts in the 9-county Bay Area, and merged with data on the number of renter-occupied housing units from the 2011 through 2017 American Community Survey (ACS) 5-year summary files (i.e. seven different 5-year summary files).
ZRI values were estimated for geographies other than census tracts in each year by taking a weighted average of the tract-level values, using the number of renter-occupied housing units from the ACS 5-year summary file for the corresponding year as weight. For example, to estimate the ZRI for the large cities in 2015, a weighted average of the ZRI values for all tracts contained in each large city was calculated, using the number of renter-occupied housing units from the 2015 5-year summary file (which reflects a 2011-2015 average) as weight. To aggregate from census tracts to sub-counties (CPUMAs) and census-defined places (large cities, other cities or towns, and Census Designated Places), we used geographic crosswalks that were created by assigning each 2010 tract to the CPUMA and census-defined place containing the plurality of its 2010 population (from SF1 of the 2010 Census) by census block. All estimates were derived based on the same underlying 2010 census tract level data file for consistency. Because of this, however, estimates for geographies such as cities and counties will not align perfectly with the official ZRI estimates from Zillow, Inc., which can be accessed here. For more information on the methodology used by Zillow to develop the ZRI, see here. See the methodology page for other relevant notes.
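The weighted average described above can be sketched as follows. The input layout and the decision to skip tracts with a missing ZRI or no renter-occupied units are assumptions made for illustration.

```python
def weighted_zri(tracts):
    """Renter-weighted average ZRI for a larger geography.

    tracts: list of (zri, renter_occupied_units) tuples for the tracts
    assigned to the geography; zri may be None when no value is available.
    """
    usable = [(zri, units) for zri, units in tracts if zri is not None and units > 0]
    total_units = sum(units for _, units in usable)
    if total_units == 0:
        return None  # nothing to average
    return sum(zri * units for zri, units in usable) / total_units
```

With hypothetical inputs, two tracts with a ZRI of $2,000 (100 renter-occupied units) and $3,000 (300 units) yield a weighted value of $2,750.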

Unfortunately, we have been unable to obtain more recent data extracts from Zillow Group, Inc. since initially developing this indicator, so 2017 is the latest year with data available. We are considering developing a methodology to update the indicator using publicly available data from Zillow Group, Inc.

Notes:

  • Values are for April 30th of each year and are not adjusted for inflation.
  • Data are only reported for census-defined places (large cities, other cities or towns and Census Designated Places) for which at least three census tracts were assigned in the geographic crosswalks noted above, or for which one or two tracts were assigned but the vast majority (at least 80 percent) of the tract population(s) fell within the census-defined place based on 2010 block level population counts.
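One way to express the reporting rule for census-defined places is sketched below. It assumes the 80 percent test is applied to the combined population of the one or two assigned tracts. The note does not say whether the test is combined or per tract, so treat this as one possible reading. The names are hypothetical.

```python
def place_is_reportable(assigned_tracts, min_tracts=3, min_share=0.80):
    """Decide whether a census-defined place has enough tract coverage.

    assigned_tracts: list of (tract_population, population_inside_place)
    tuples based on 2010 block-level counts.
    """
    if not assigned_tracts:
        return False
    if len(assigned_tracts) >= min_tracts:
        return True
    # One or two tracts: require that most of their population is in the place.
    total = sum(pop for pop, _ in assigned_tracts)
    inside = sum(pop_in for _, pop_in in assigned_tracts)
    return total > 0 and inside / total >= min_share
```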

Gentrification risk

Summary: The share of low-income households (household income < $60,000) living in neighborhoods classified according to the UC Berkeley Urban Displacement Project’s gentrification/displacement typology. The 2018 typology includes new categories as well as mixed income classifications to better reflect mixed-income neighborhoods. Race/ethnicity is based on the race of the householder; with the exception of whites, all racial groups include people of Hispanic origin who self-identify with that racial identity.

Data Source(s): UC Berkeley Urban Displacement Project, 2018 Gentrification and Displacement Census Tract Typology for the nine-county Bay Area; U.S. Census Bureau, 2020 American Community Survey 5-year Summary File; www.nhgis.org, NHGIS crosswalk files, 2020 blocks to 2010 blocks.

Universe: All households with income below $60,000 per year.

Methods: Gentrification risk categories at the 2010 census tract level were constructed using data from the Urban Displacement Project (UDP) Gentrification and Displacement Census Tract Typology for the nine-county Bay Area. The typology was created to better understand and predict where gentrification and displacement are happening and will likely occur in the future, and is based on a wide variety of data sources covering the years 1990 through 2018. Maps of the 2018 typology are available on the Urban Displacement Project's website, here, and the methodology can be found here.

While the typology includes 11 different types of census tracts, they were consolidated into four categories for the gentrification risk indicator on the Atlas: “at risk” includes typology categories of Low-Income/Susceptible to Displacement, Ongoing Displacement, and At Risk of gentrification; “gentrifying” includes typology categories of Early/Ongoing Gentrification and Advanced Gentrification; “stable” includes typology categories of Stable Moderate/Mixed Income and At Risk of Becoming Exclusive; and “exclusive” includes typology categories of Becoming Exclusive and Stable/Advanced Exclusive. The typology categories of Unavailable or Unreliable Data and High Student Population were excluded from our analysis.
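The consolidation above amounts to a simple lookup, sketched here. The labels are written as they appear in this description; the exact strings in the UDP data file may differ, and the function name is hypothetical.

```python
# Typology label -> consolidated gentrification risk category.
RISK_CATEGORY = {
    "Low-Income/Susceptible to Displacement": "at risk",
    "Ongoing Displacement": "at risk",
    "At Risk of Gentrification": "at risk",
    "Early/Ongoing Gentrification": "gentrifying",
    "Advanced Gentrification": "gentrifying",
    "Stable Moderate/Mixed Income": "stable",
    "At Risk of Becoming Exclusive": "stable",
    "Becoming Exclusive": "exclusive",
    "Stable/Advanced Exclusive": "exclusive",
}

# Typology labels dropped from the analysis.
EXCLUDED = {"Unavailable or Unreliable Data", "High Student Population"}


def risk_category(typology_label):
    """Return the consolidated category, or None for excluded tract types."""
    if typology_label in EXCLUDED:
        return None
    return RISK_CATEGORY[typology_label]
```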

The tract level gentrification risk data were merged with tract-level data on the number of households by race/ethnicity and annual income level from the 2020 American Community Survey 5-year Summary File, which reflects a 2016-2020 average. Census tract level data from the 2020 5-year American Community Survey was re-estimated into 2010 census tract boundaries for a geographically consistent merge with the gentrification risk data, using a 2020 to 2010 census tract geographic crosswalk developed using 2020 block level population data from the 2020 Census Redistricting Data along with a block level geographic crosswalk (2020 to 2010 blocks) from NHGIS.

Low-income households were defined as those with annual income under $60,000, and the number of such households by race/ethnicity were summed up to higher-level Atlas geographies for each of the four gentrification risk categories. To aggregate from census tracts to sub-counties (CPUMAs) and census-defined places (large cities, other cities or towns, and Census Designated Places), we used geographic crosswalks that were created by assigning each 2010 tract to the CPUMA and census-defined place containing the plurality of its 2010 population (from SF1 of the 2010 Census) by census block. See the methodology page for other relevant notes.
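The plurality-based crosswalk can be sketched as follows, assuming block-level records that carry the tract, the containing place (or CPUMA), and the 2010 block population. The input layout is hypothetical, and ties are resolved here simply by whichever place is encountered first, which is an arbitrary choice for the sketch.

```python
from collections import defaultdict


def plurality_crosswalk(blocks):
    """Assign each tract to the place holding the plurality of its population.

    blocks: iterable of (tract_id, place_id, block_population) tuples.
    """
    pop_by_place = defaultdict(lambda: defaultdict(int))
    for tract, place, population in blocks:
        pop_by_place[tract][place] += population
    return {
        tract: max(places, key=places.get)
        for tract, places in pop_by_place.items()
    }
```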

Notes:

  • Race/ethnicity is based on the householder; with the exception of whites, all racial groups include people of Hispanic origin who self-identify with that racial identity.
  • The analysis is restricted to tracts with an assigned gentrification/displacement type by the UC Berkeley typology.
  • Low-income households are defined as those with an annual household income of less than $60,000.
  • No data are reported if based on fewer than 100 low-income households.
  • Data are only reported for census-defined places (large cities, other cities or towns and Census Designated Places) for which at least three census tracts were assigned in the geographic crosswalks noted above, or for which one or two tracts were assigned but the vast majority (at least 80 percent) of the tract population(s) fell within the census-defined place based on 2010 block level population counts.
  • No data are available for California as a whole.
     

Experiencing homelessness

Summary: Point-in-Time (PIT) count of people experiencing homelessness. 

Visualizations: Five categories of homelessness are displayed: 1.) overall homeless population (includes both sheltered and unsheltered people);  2.) unsheltered population;  3.) sheltered population;  4.) population sheltered in emergency housing;  5.) population sheltered in transitory housing. 

Data Source(s): US Department of Housing and Urban Development, Point-in-Time (PIT) count; US Census Bureau, Population Estimates Program (PEP), Annual County Resident Population Estimates by Age, Sex, Race, and Hispanic Origin: April 1, 2010 to July 1, 2020 (CC-EST2020-ALLDATA) and April 1, 2020 to July 1, 2022 (CC-EST2022-ALLDATA).

Universe: Sheltered and unsheltered people experiencing homelessness.

Methods: The Point-in-Time (PIT) count methodology counts the number of sheltered and unsheltered people experiencing homelessness on a single night in January. This dataset contains gender categories: male, female, transgender, gender non-conforming, gender that is not singularly female or male, and gender questioning. For the indicator, we combined non-conforming (gender that is not singularly female or male) and gender questioning into one category labeled non-binary or other. Together, the indicator displays four gender identification categories: female, male, transgender, and non-binary or other.

The dataset also includes racial/ethnic categories: Asian American, Black, Latino, Native American, Native Hawaiian and Pacific Islander, and multiracial. In the raw data, the counts by race are non-exclusive of those who identify as Hispanic or Latino, while the counts for those who identify as having Hispanic or Latino ethnicity include individuals of any race.

To transform the racial categories into mutually exclusive groups, census population estimates were matched to each county. Then, we determined the proportion of each racial category that also identified as Latino in that county in the census estimates. Next, we multiplied each of the homeless PIT counts for each non-Latino racial group by the computed proportions. This provided an initial estimate of the Latino homeless population alongside other mutually exclusive racial categories. The implicit assumption of this calculation is that the proportion of each racial group that self-identifies as Latino in the census population estimates will be the same among the homeless population. We subtracted the new estimated Latino homeless population from the Latino homeless population provided in the raw data to determine the remainder, which still needed to be redistributed. To redistribute the remainder (either under- or over-estimations from the initial calculations), we computed the share of the estimated Latino homeless population composed of each racial group. That is, we divided the estimated number of homeless individuals within each racial category that identified as Latino (calculated in the previous step) by the estimated homeless Latino population. This statistic was then multiplied by the remainder and used to recalculate new totals for each mutually exclusive racial category including Latino people. Finally, we used the new mutually exclusive county estimates to aggregate up to the Bay Area five-county and nine-county regions as well as the full state of California.
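The steps above can be sketched for a single county as follows. This reflects our reading of the procedure in simplified form: the argument names and example figures are hypothetical, and the sketch assumes the redistributed remainder is added to the estimated Latino count within each racial group before that count is subtracted from the group's raw total.

```python
def exclusive_counts(race_counts, latino_count, latino_share):
    """Split non-exclusive PIT race counts into mutually exclusive groups.

    race_counts:  raw PIT count per racial group (each may include Latinos)
    latino_count: raw PIT count of people of Hispanic or Latino ethnicity
    latino_share: per-group share identifying as Latino in the county's
                  census population estimates
    """
    # Step 1: initial estimate of Latinos within each racial group.
    latino_in = {g: n * latino_share[g] for g, n in race_counts.items()}
    estimated_latino = sum(latino_in.values())
    if estimated_latino == 0:
        raise ValueError("no estimated Latino population to redistribute over")

    # Step 2: remainder between the raw Latino count and the estimate.
    remainder = latino_count - estimated_latino

    # Step 3: spread the remainder in proportion to each group's share of
    # the estimated Latino population, then remove Latinos from each group.
    result = {}
    for g, n in race_counts.items():
        adjustment = remainder * latino_in[g] / estimated_latino
        result[g] = n - (latino_in[g] + adjustment)
    result["Latino"] = latino_count
    return result


counts = exclusive_counts(
    {"Black": 400, "White": 500, "Asian American": 100},
    latino_count=215,
    latino_share={"Black": 0.05, "White": 0.30, "Asian American": 0.02},
)
```

In this made-up example the initial estimate is 172 Latino people, leaving a remainder of 43 to redistribute. The resulting groups sum to the same total as the raw race counts, and the Latino group matches the raw Latino count.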

For each category, we also computed the rate of homelessness per 10,000 people using census population estimates as the denominator.

Notes:

  • Point-in-time (PIT) counts are reported for Continuums of Care (CoCs). For the nine Bay Area counties, CoCs align directly with each county. However, in other rural parts of the state, CoCs may include multiple counties.

  • General population data is based on the Population Estimates Program because its methodology and frequency match well with the PIT count frequency and methodology. As a result, the population estimates are different from other indicators that display population data from the American Community Survey or decennial census. 

  • Each category was computed using the total count of homeless people (including both individuals and people in families). The total counts include adults and children (except when excluded based on age criteria).

  • In recent years, CoCs in the Bay Area have followed the Department of Housing and Urban Development’s recommendation and allowed participants to select multiple gender identities. Hence, we recommend that data on gender identity be interpreted as not mutually exclusive.

  • For the gender-related variables, the female, male, and transgender categories are available for all years in the dataset (2015-2022). The Department of Housing and Urban Development (HUD) added the category of gender non-conforming in 2018 and discontinued its use in 2022 when it was replaced with the categories of gender that are not singularly female or male and gender questioning.

  • We recognize that LGBTQ+ people have diverse identities that do not match neatly with data categories. Due to data, display, and reporting considerations, we combined gender non-conforming (gender that is not singularly female or male) and gender questioning into one category labeled “non-binary or other.” We welcome any suggestion or critique of our approach.

  • The Latino category includes people of Hispanic origin of any race; all other groups exclude people of Hispanic origin.

  • Unsheltered data are only mandated in odd-numbered years (e.g., 2019, 2021, and 2023). Due to the surge of Covid-19 cases in 2021, HUD allowed Continuums of Care (CoCs) to opt out of collecting unsheltered data to protect the safety of survey workers and unsheltered homeless people. Hence, the overall and unsheltered PIT counts are set to be missing in 2021 to reflect the lack of data. Analyzing the data for the nine Bay Area counties and the aggregate counts for the State of California, we did not find any missing or inconsistent unsheltered counts for odd years except in 2021.

  • Data are only available at the state, regional, and county levels.

Acknowledgments: We would like to thank Dr. Bianca Wilson for her guidance on interpreting gender identity data.

Affordable housing production

Summary: Housing unit permits approved to meet housing needs by income level based on the Regional Housing Needs Assessment conducted by the Association of Bay Area Governments.

Data Source(s): Association of Bay Area Governments (ABAG), Regional Housing Needs Assessments.

Universe: Total number of housing unit permits needed, by income level of tenants, to meet affordable housing needs in each projection period (1999-2006, 2007-2014, and 2015-2023).

Methods: Regional Housing Need Allocation (RHNA) is the state-mandated process to identify the total number of housing units, by affordability level, that each jurisdiction must accommodate in its Housing Element. As part of this process, the California Department of Housing and Community Development (HCD) identifies the total housing need for the San Francisco Bay Area for an eight-year period (in the latest completed cycle, 2015-2023). Data by income are based on affordability for households falling in different ranges of Area Median Income (AMI) designated by ABAG for each Regional Housing Needs Assessment period (1999-2006, 2007-2014, 2015-2023). "Very-low income" is defined as income between zero and 50 percent of AMI, "low income" is between 50 and 80 percent of AMI, "moderate income" is between 80 and 120 percent of AMI and "above-moderate income" is 120 percent of AMI or higher.
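The income ranges can be written as a small classifier, sketched below. The text only fixes the top boundary ("120 percent of AMI or higher"); treating each range as including its lower bound and excluding its upper bound is our assumption, and the function name is hypothetical.

```python
def income_category(household_income, area_median_income):
    """Classify a household by its income relative to AMI.

    Comparisons are written as income * 100 versus percent * AMI so that
    whole-dollar inputs are compared exactly.
    """
    scaled = household_income * 100
    if scaled < 50 * area_median_income:
        return "very-low income"
    if scaled < 80 * area_median_income:
        return "low income"
    if scaled < 120 * area_median_income:
        return "moderate income"
    return "above-moderate income"
```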

Data on the projected number of housing units needed by affordability level and the number for which permits were issued were collected for all three historic RHNA periods of 1999-2006, 2007-2014, and 2015-2023. The next RHNA period is for 2023-2031, with a plan released by ABAG in December 2021. No data on progress for the 2023-2031 period were available at the time of the last update of this indicator. The number of housing unit permits issued and their percentage of total housing units needed were collected for each of the three RHNA periods for jurisdictions reporting in the Bay Area (i.e., counties and cities). The data were aggregated across counties to derive regional totals for the Five- and Nine-county Bay Area regions. See the methodology page for other relevant notes.
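As a small illustration of the regional aggregation, the sketch below sums units needed and permits issued across jurisdictions before computing the percentage, rather than averaging jurisdiction-level percentages. The function name and input layout are hypothetical.

```python
def permit_progress(jurisdictions):
    """Aggregate RHNA progress for one income level and period.

    jurisdictions: list of (units_needed, permits_issued) tuples.
    Returns (total_needed, total_permits, percent_of_need_permitted).
    """
    needed = sum(n for n, _ in jurisdictions)
    permits = sum(p for _, p in jurisdictions)
    percent = 100 * permits / needed if needed else None
    return needed, permits, percent
```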

Notes: 

  • No data are available for California as a whole, sub-counties, and Census Designated Places.

Neighborhood opportunity

Summary: The share of the population living in neighborhoods with different neighborhood resource levels based on the California Fair Housing Task Force opportunity maps created by the Haas Institute at UC Berkeley. High segregation and poverty neighborhoods are those with poverty rates of at least 30 percent, with high levels of racial segregation and high shares of people-of-color households. Resource levels are based on a comprehensive index of opportunity.

Data Source(s): California Fair Housing Task Force 2018 and 2023 Opportunity Maps; U.S. Census Bureau, 2015 and 2020 American Community Survey 5-year Summary Files.

Universe: All people.

Methods: Data on neighborhood opportunity from the latest 2023 and earlier 2018 version of the California Fair Housing Task Force Opportunity Mapping project, drawing on data from a range of years between 2010 and 2022, was downloaded from the California State Treasurer website. The 2018 version of the data is expressed at the 2010 census tract level, and the 2023 version is expressed at a hybrid level of geography comprised of the 2010 census tracts and rural block groups. The block group data is generally available in more sparsely populated areas of California where the census tracts are geographically large.

To briefly summarize the neighborhood opportunity methodology, tract level opportunity scores were derived relative to the 9-county Bay Area region using an index based on three domains of indicators: the health and environment domain (e.g. air pollution concentrations, drinking water contaminants, toxic releases from polluting facilities, pesticides, traffic density); the education domain (e.g. math and reading proficiency, student poverty, and high school graduation rates); and the economics domain (e.g. employment rates, job proximity, median home values, poverty rates, and adult educational attainment). A filtering approach was used to assign tracts and block groups identified as having high levels of poverty and racial segregation into a "high segregation & poverty" category. Index scores were assigned to unfiltered tracts and block groups as follows: the top 20 percent of tracts and block groups in terms of the opportunity score were assigned to the "highest resource" category and the next 20 percent were assigned to the "high resource" category. Finally, the remaining tracts and block groups were divided evenly into the “moderate resource” and “low resource” categories based on their index scores.
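The category assignment can be sketched as below. This is a simplified illustration rather than the Task Force's implementation: it assumes the 20 percent cutoffs are computed over unfiltered areas only and uses integer division for the cut points, and the function and argument names are hypothetical.

```python
def assign_resource_levels(scores, filtered=()):
    """Assign each tract or block group to a neighborhood resource category.

    scores:   dict of area_id -> opportunity index score
    filtered: area_ids already identified as high segregation and poverty
    """
    levels = {area: "high segregation & poverty" for area in filtered}
    # Rank the unfiltered areas from highest to lowest score.
    ranked = sorted(
        (area for area in scores if area not in levels),
        key=scores.get,
        reverse=True,
    )
    n = len(ranked)
    top_cut = n // 5                              # top 20 percent
    high_cut = 2 * n // 5                         # next 20 percent
    moderate_cut = high_cut + (n - high_cut) // 2  # remainder split evenly
    for rank, area in enumerate(ranked):
        if rank < top_cut:
            levels[area] = "highest resource"
        elif rank < high_cut:
            levels[area] = "high resource"
        elif rank < moderate_cut:
            levels[area] = "moderate resource"
        else:
            levels[area] = "low resource"
    return levels
```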

Data reported on the Atlas for 2015 is based on combining the 2018 neighborhood opportunity data with the 2015 American Community Survey 5-year Summary File, while data reported for 2020 is based on combining the 2023 neighborhood opportunity data with the 2020 American Community Survey 5-year Summary File. The neighborhood opportunity estimates (at the census tract and/or block group levels of geography) were merged with population data from the relevant American Community Survey 5-year Summary File. Counts of people by race/ethnicity and nativity were then summed up to higher-level Atlas geographies for each of the five neighborhood opportunity categories described above. To aggregate from census tracts and block groups to sub-counties (CPUMAs) and census-defined places (large cities, other cities or towns, and Census Designated Places), we used geographic crosswalks that were created by assigning each 2010 tract/block group to the CPUMA and census-defined place containing the plurality of its 2010 population (from SF1 of the 2010 Census) by census block. See the methodology page for other relevant notes.

Finally, because the American Community Survey 5-year Summary File does not have race by nativity data at the block group level, estimates for block group-level population data were created before aggregating, based on the distribution of immigrants by race at the census tract level.

Notes:

  • For the by race/ethnicity and ranking breakdowns, Latinos include people of Hispanic origin of any race and all other groups exclude people of Hispanic origin.
  • For the by nativity breakdown, with the exception of Whites, all racial groups include people of Hispanic origin who self-identify with that racial identity.
  • Resource levels are based on two different versions of a comprehensive index of opportunity, one released in 2018 (based on data from the 2010 Decennial Census and additional data from 2010-2016) and an updated version released in 2023 (based on data from the 2010 Decennial Census and additional data from 2015-2022).
  • Data reported on the Atlas for 2015 combines the 2018 neighborhood opportunity data with the 2015 American Community Survey 5-year Summary File.
  • Data reported on the Atlas for 2020 combines the 2023 neighborhood opportunity data with the 2020 American Community Survey 5-year Summary File.
  • Counts of people by race/ethnicity and nativity for 2015 reflect 2011-2015 averages, and counts for 2020 reflect 2016-2020 averages.
  • High segregation and poverty neighborhoods are those where at least 30% of the population is below the federal poverty line, with high levels of racial segregation and high shares of people-of-color households.
  • No data are reported if based on fewer than 100 people.
  • Data are only reported for census-defined places (large cities, other cities or towns and Census Designated Places) for which at least three census tracts or block groups were assigned in the geographic crosswalks noted above, or for which one or two tracts or block groups were assigned but the vast majority (at least 80 percent) of the tract or block group population(s) fell within the census-defined place based on 2010 block level population counts.
  • No data are available for California as a whole.

Business ownership

Summary: The number of firms with paid employees per 100 persons in the labor force ages 16 or older. Firms are classified by race/ethnicity and gender based on the self-identification of the majority owner. With the exception of white people, all racial groups include people of Hispanic origin who self-identify with that racial identity.

Data source(s): U.S. Census Bureau, 2007 and 2012 Survey of Business Owners, 2017 Annual Business Survey, 2009, 2014, and 2017 American Community Survey 5-year Summary Files.

Universe: Firms include all nonfarm businesses filing Internal Revenue Service tax forms as firms with paid employees and with receipts of $1,000 or more.

Methods: Data on the number of firms with paid employees, industry, and race/ethnicity and gender of the proprietor was collected from the 2007 and 2012 Survey of Business Owners (SBO) and 2017 Annual Business Survey (ABS) for all Atlas geographies. To be consistent across breakdowns and cuts by race/ethnicity and gender, firm counts for all breakdowns were restricted to firms classifiable by race, gender, and veteran status. A single firm may be tabulated in more than one racial/ethnic group category. This can result because the sole owner was reported to be of more than one race, the majority owner was reported to be of more than one race, or a majority combination of owners was reported to be of more than one race. The denominator used to calculate the number of firms per 100 persons in the labor force age 16 or older by race/ethnicity and gender was merged in from the 2009 American Community Survey (ACS) 5-year summary file for the 2007 SBO data and the 2014 ACS 5-year summary file for the 2012 SBO data. These years of the ACS summary file were chosen because the central year of each five-year pool aligns with the year of the SBO data (e.g., the central year of the 2014 5-year ACS, which covers years 2010-2014 is 2012).

Beginning in 2017, the SBO was discontinued and replaced with the ABS. One advantage of the shift to the ABS is that the data are released annually and are thus more current. One major disadvantage, however, is that the ABS data are based on a smaller sample of firms, particularly in years that do not align with the Economic Census (those ending with a two or a seven), and do not report data for many smaller geographies and more detailed groups defined by race/ethnicity and gender. For example, while the SBO reports data for over 20 racial/ethnic groups for the nation, states, CBSAs, counties, and places and reports data disaggregated by race/ethnicity and gender at the national and state levels, the 2017 ABS only reports such detailed data at the national and state levels. The 2018 ABS data are only reported for 7 racial/ethnic groups and only by race or gender instead of race and gender, and only at the state level or higher. Moreover, the sample size of the 2018 ABS (the most recent data available at the time of the last update of the business ownership indicator), and presumably all subsequent years of the ABS until the next Economic Census in 2022, is too small to be of use for the Bay Area Equity Atlas given that it only reports data down to the metropolitan area level.

And while the timelier release schedule for the ABS is a good thing, it did lead us to draw data for the denominator (the number of people in the labor force age 16 or older) from a relatively older vintage of the ACS summary file for 2017 (and later years) compared with earlier years of the indicator: we shifted to combining the ABS data with ACS 5-year summary file data from the corresponding year (e.g., the 2017 ABS with the 2017 ACS 5-year summary file). This shift ensures that the ACS data needed for the denominator will be available at the time the new ABS data are released, with the downside being that the central year of the ACS sample is two years older than the ABS data.
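
The pairing of firm data years with ACS 5-year summary files described above can be sketched as follows. This is only an illustration of the stated rule, not the code used to build the indicator; the function names are hypothetical.

```python
def acs_vintage_for_firm_data(firm_year: int) -> int:
    """Return the end year of the ACS 5-year file paired with a firm data year.

    Hypothetical helper that encodes the rule described in the text.
    """
    if firm_year in (2007, 2012):
        # SBO years: use the 5-year pool whose central year equals the SBO year
        # (e.g., 2012 SBO -> 2014 ACS, which covers 2010-2014).
        return firm_year + 2
    if firm_year >= 2017:
        # ABS years: use the ACS 5-year file from the same year as the ABS.
        return firm_year
    raise ValueError(f"No pairing rule described for {firm_year}")


def acs_central_year(acs_end_year: int) -> int:
    """Central year of a 5-year pool ending in acs_end_year."""
    return acs_end_year - 2


for firm_year in (2007, 2012, 2017):
    end_year = acs_vintage_for_firm_data(firm_year)
    print(firm_year, end_year, acs_central_year(end_year))
```

For the SBO years the central year of the ACS pool matches the firm data year, while for the 2017 ABS it falls two years earlier (2015).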

See the methodology page for other relevant notes.

Notes:

  • With the exception of white people, all racial groups include people of Hispanic origin who self-identify with that racial identity.
  • Estimates for small geographies and/or demographic groups are often not reported because the data do not meet ABS/SBO publication standards.
  • No data are available for the mixed/other racial group since it is not identified in the ABS data.
  • No data on the number of firms per 100 workers (i.e., persons in the labor force age 16 or older) are reported if the calculated rate came out to more than 100 or if there are fewer than 1,000 workers in the denominator.
  • Total firm counts for all breakdowns are restricted to firms classifiable by race, gender, and veteran status.
  • No data are available for the Nine-county Bay Area region and sub-counties.
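
The rate calculation and the reporting rule in the notes above (no rate is reported if it exceeds 100 or if the denominator has fewer than 1,000 workers) can be sketched as follows. This is a minimal illustration with made-up input values; the function name is hypothetical and does not come from the Atlas.

```python
from typing import Optional


def firms_per_100_workers(firms: float, labor_force: float) -> Optional[float]:
    """Firms per 100 persons in the labor force age 16 or older.

    Returns None when the value would not be reported under the stated rules.
    """
    if labor_force < 1000:
        # Fewer than 1,000 workers in the denominator: not reported.
        return None
    rate = 100.0 * firms / labor_force
    if rate > 100:
        # A calculated rate above 100 is not reported.
        return None
    return rate


# Illustrative (made-up) inputs
print(firms_per_100_workers(250, 10_000))   # reported
print(firms_per_100_workers(50, 900))       # suppressed: small denominator
print(firms_per_100_workers(2_000, 1_500))  # suppressed: rate above 100
```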

Business revenue

Summary: The average annual receipts per firm (in 2017 dollars). Firms are classified by race/ethnicity and gender based on the self-identification of the majority owner. With the exception of white people, all racial groups include people of Hispanic origin who self-identify with that racial identity.

Data source(s): U.S. Census Bureau, 2007 and 2012 Survey of Business Owners, 2017 Annual Business Survey.

Universe: Firms include all nonfarm businesses filing Internal Revenue Service tax forms as firms with paid employees and with receipts of $1,000 or more.

Methods: Data on aggregate revenues and the number of firms with paid employees, industry, and race/ethnicity and gender of the proprietor were collected from the 2007 and 2012 Survey of Business Owners (SBO) and 2017 Annual Business Survey (ABS) for all Atlas geographies. To be consistent across breakdowns and cuts by race/ethnicity and gender, firm counts for all breakdowns were restricted to firms classifiable by race, gender, and veteran status. A single firm may be tabulated in more than one racial/ethnic group category. This can result because the sole owner was reported to be of more than one race, the majority owner was reported to be of more than one race, or a majority combination of owners was reported to be of more than one race.
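
As a rough illustration, average annual receipts per firm can be read as aggregate revenues divided by the number of firms for a given group and geography. The sketch below assumes that reading; the function name and input values are hypothetical, and the adjustment to 2017 dollars is not shown.

```python
from typing import Optional


def average_receipts_per_firm(aggregate_receipts: float, firms: float) -> Optional[float]:
    """Average annual receipts per firm for one group and geography.

    Returns None when there are no firms to average over.
    """
    if firms <= 0:
        return None
    return aggregate_receipts / firms


# Illustrative (made-up) inputs: $5,000,000 in aggregate receipts across 4 firms
print(average_receipts_per_firm(5_000_000, 4))
```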

Beginning in 2017, the SBO was discontinued and replaced with the ABS. One advantage of the shift to the ABS is that the data are released annually and are thus more current. One major disadvantage, however, is that the ABS data are based on a smaller sample of firms, particularly in years that do not align with the Economic Census (those ending with a two or a seven), and do not report data for many smaller geographies and more detailed groups defined by race/ethnicity and gender. For example, while the SBO reports data for over 20 racial/ethnic groups for the nation, states, CBSAs, counties, and places and also reports data disaggregated by race/ethnicity and gender at the national and state levels, the 2017 ABS only reports such detailed data at the national and state levels. The 2018 ABS data are only reported for 7 racial/ethnic groups and only by race or gender instead of race and gender, and only at the state level or higher. Moreover, the sample size of the 2018 ABS (the most recent data available at the time of the last update of the business ownership indicator), and presumably all subsequent years of the ABS until the next Economic Census in 2022, is too small to be of use for the Bay Area Equity Atlas given that it only reports data down to the metropolitan area level. See the methodology page for other relevant notes.

Notes: 

  • With the exception of white people, all racial groups include people of Hispanic origin who self-identify with that racial identity.
  • Estimates for small geographies and/or demographic groups are often not reported because the data do not meet ABS/SBO publication standards.
  • No data are available for the mixed/other racial group since it is not identified in the ABS data.
  • Revenues for all breakdowns are restricted to firms classifiable by race, gender, and veteran status.
  • No data are available for the Nine-county Bay Area region and sub-counties.