Census Data Tools

Initialize and configure the logger. Log output is written to the specified file.

from morpc.logs import config_logs

config_logs('./temp_data/morpc-census-demo.log', level='debug') ## Designate the logger level. 
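For readers unfamiliar with what a call like this typically sets up, here is a minimal sketch using only the standard library logging module. It is not the implementation of config_logs; the helper name configure_file_logger, the logger name, and the log format are assumptions made for illustration.

```python
import logging
import os
import tempfile


def configure_file_logger(path, level="debug"):
    """Hypothetical helper: attach a file handler to a named logger."""
    logger = logging.getLogger("census-demo")
    logger.setLevel(getattr(logging, level.upper()))  # "debug" -> logging.DEBUG
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger, handler


# Write to a throwaway directory so the sketch does not touch real project files.
log_path = os.path.join(tempfile.mkdtemp(), "demo.log")
logger, handler = configure_file_logger(log_path, level="debug")
logger.debug("fetching acs/acs5 for 2023")
handler.flush()

with open(log_path) as f:
    log_text = f.read()
print(log_text)
```

With the level set to debug, every message at DEBUG or above ends up in the file; a higher level such as info would drop the debug line.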

morpc.census.api

This module has been developed to fetch and save Census data from various API endpoints.

Census API Class

The CensusAPI class does most of the heavy lifting: formulating the API call, getting the data, wrangling it into a long format, creating frictionless schema and resource files, and saving the data.

The class takes the following parameters:

  1. The “survey/table” to get the data from.

  2. The year of the data.

  3. The group of variables that you want to call.

  4. The geographic “scope” to get the data for.

Optionally:

  1. The geographic scale to get the data for, within the scope.

  2. A subset of the variables in the chosen group.
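To make the role of each parameter concrete, the sketch below shows how they could map onto a Census Data API request URL. It assumes the public URL layout https://api.census.gov/data/{year}/{survey} and the API's group() and ucgid query syntax. The helper build_request_url is hypothetical and is not part of morpc; how CensusAPI builds its requests internally may differ.

```python
from urllib.parse import urlencode

BASE = "https://api.census.gov/data"


def build_request_url(survey, year, group, ucgid, variables=None):
    """Hypothetical helper: compose a Census Data API URL from the class parameters."""
    # Request either the whole group or only a chosen subset of its variables.
    get = ",".join(variables) if variables else f"group({group})"
    # Keep parentheses, commas, and "$" readable instead of percent-encoding them.
    query = urlencode({"get": get, "ucgid": ucgid}, safe="(),$")
    return f"{BASE}/{year}/{survey}?{query}"


# Whole group for a single county-level identifier (state 39, county 049).
url = build_request_url("acs/acs5", 2023, "B05006", "0500000US39049")
print(url)

# Only two variables from the same group.
subset_url = build_request_url(
    "acs/acs5", 2023, "B05006", "0500000US39049",
    variables=["B05006_001E", "B05006_002E"],
)
print(subset_url)
```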

# TODO: Timeseries calls.  

# TODO: Getting select variables from various groups.

Example: Compare divorced population for all tracts in the 15-county region.

1. See currently implemented endpoints for “survey/table” parameter.

Use the IMPLEMENTED_ENDPOINTS constant to see currently available options.

from morpc.census.api import IMPLEMENTED_ENDPOINTS

IMPLEMENTED_ENDPOINTS

2. See available vintages

To see all endpoints provided by the Census Bureau and their available vintages, use ALL_AVAIL_ENDPOINTS.

from morpc.census.api import ALL_AVAIL_ENDPOINTS 

ALL_AVAIL_ENDPOINTS['acs/acs5']

3. See available variable groups for a survey/table.

from morpc.census.api import get_table_groups

{k: v['description'] for k, v in get_table_groups('acs/acs5', 2023).items()}['B05006']

# TODO: Hierarchical lookup for variables of interest.

4. Select the geographic scope of the data.

To see the currently implemented scopes, see morpc.census.geos.SCOPES.

from morpc.census.geos import SCOPES

SCOPES

# TODO: Passing list of geoids to ucgid parameter in api call

scope=SCOPES['region15']

(Optional) 5. Select a smaller geographic scale to get data for.

For example, you may want all the tracts in the 15-county MORPC region.

To see the available combinations of scope and scale, see morpc.census.geos.PSEUDOS.

Each key is a parent sumlevel, and its values are the child sumlevels that data can be requested for.

from morpc.census.geos import PSEUDOS

PSEUDOS

## All available scale levels if the sumlevel of the "scope" parameter is counties (050)

PSEUDOS['050']
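The parent/child pairing resembles the Census Data API's pseudo() geography predicate, which selects all child geographies of one sumlevel inside a parent geography. The sketch below builds such a predicate by hand. It assumes that this is the mechanism the scope and scale combination relies on; the helper pseudo_ucgid is hypothetical and not part of morpc.

```python
def pseudo_ucgid(parent_sumlevel, parent_geoid, child_sumlevel):
    """Hypothetical helper: all child geographies within one parent geography."""
    # A ucgid is the 3-digit sumlevel, four zeros, "US", then the GEOID.
    parent = f"{parent_sumlevel}0000US{parent_geoid}"  # e.g. 0500000US39049
    return f"pseudo({parent}${child_sumlevel}0000)"


# All census tracts (140) within the county (050) whose GEOID is 39049.
predicate = pseudo_ucgid("050", "39049", "140")
print(predicate)
```

A string like this would be passed as the ucgid query parameter of an API request.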

(Optional) 6. A list of variables to get from the group.

To see all available variables for a given variable group, use morpc.census.api.get_group_variables().

from morpc.census.api import get_group_variables

{k: v['label'] for k, v in get_group_variables('acs/acs5', 2023, 'B05006').items()}

Get the data by calling CensusAPI

To get the data, pass the parameters chosen above to the CensusAPI class.

from morpc.census.api import CensusAPI

b05006_2023 = CensusAPI('acs/acs5', 2023, 'B05006', 'region15', scale='tract')

The raw data is stored on the class instance as .DATA.

b05006_2023.DATA

The class also automatically creates a long-format table, which is the format the data is saved in.

b05006_2023.LONG
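As a rough illustration of what wrangling into a long format means, the sketch below reshapes a response shaped like the Census Data API's JSON output (a header row followed by data rows) into one record per geography and variable. The values are made up, the helper to_long is hypothetical, and the column names in the real .LONG table may differ.

```python
# Shape of a Census Data API JSON response: a header row followed by data rows.
# The names and values below are made up for illustration.
response = [
    ["GEO_ID", "NAME", "B05006_001E", "B05006_002E"],
    ["1400000US39049000110", "Tract A", "100", "40"],
    ["1400000US39049000120", "Tract B", "200", "50"],
]


def to_long(rows, id_columns=("GEO_ID", "NAME")):
    """Hypothetical helper: one output record per (geography, variable) pair."""
    header, *data = rows
    records = []
    for row in data:
        wide = dict(zip(header, row))
        ids = {col: wide[col] for col in id_columns}
        for col, value in wide.items():
            if col not in id_columns:
                # The API returns numbers as strings, so convert them here.
                records.append({**ids, "variable": col, "value": int(value)})
    return records


long_records = to_long(response)
for record in long_records:
    print(record)
```

Two tracts with two variables each yield four long records, which is why the long table grows in rows rather than columns as more variables are requested.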

morpc.census.api.DimensionTable

The other implemented class is DimensionTable. Dimension tables are typically used when you want the data to be human readable.

This is designed to replicate how the Census Bureau often presents its data, with the levels of the variable group as rows and the geographies and years as columns.

from morpc.census.api import DimensionTable

dim_table = DimensionTable(b05006_2023.LONG)
dim_table.wide()

Also implemented is the percent table, which shows the data as a percentage of the total (i.e., the universe).

dim_table.percent()
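The idea behind a percent-of-universe table can be sketched in a few lines. This example assumes that the first variable of the group (suffix _001E) holds the total for each geography, which is the usual layout of ACS detailed tables but is treated here as an assumption. The values are made up, and percent_of_universe is a hypothetical helper, not the DimensionTable implementation.

```python
# Made-up long-format records for one tract; the _001E variable is treated as the universe.
records = [
    {"geoid": "tract-a", "variable": "B05006_001E", "value": 200},
    {"geoid": "tract-a", "variable": "B05006_002E", "value": 50},
    {"geoid": "tract-a", "variable": "B05006_047E", "value": 150},
]


def percent_of_universe(records, universe_suffix="_001E"):
    """Hypothetical helper: add each value's share of its geography's total."""
    totals = {
        r["geoid"]: r["value"]
        for r in records
        if r["variable"].endswith(universe_suffix)
    }
    out = []
    for r in records:
        total = totals.get(r["geoid"])
        # Guard against a missing or zero universe instead of dividing by zero.
        pct = round(100 * r["value"] / total, 1) if total else None
        out.append({**r, "percent": pct})
    return out


percents = [r["percent"] for r in percent_of_universe(records)]
print(percents)
```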

Timeseries of calls

DimensionTable takes any long-format table that is formatted like the output of CensusAPI.

This allows for easy implementation of timeseries data by concatenating two long-format tables from different years.

b05006_2018 = CensusAPI('acs/acs5', 2018, 'B05006', 'region15', scale='tract')
b05006_2018.LONG
import pandas as pd

long = pd.concat([b05006_2023.LONG, b05006_2018.LONG])
dim_table = DimensionTable(long)
dim_table.wide()
percent = dim_table.percent()
percent
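The reason a shared long format makes timeseries easy can be shown without any library: stacking two long tables needs no column alignment, and the year simply becomes part of the column key when pivoting to a wide layout. The records below are made up, and the field names (geoid, year, variable, value) are assumptions rather than the actual columns of .LONG.

```python
from collections import defaultdict

# Made-up long-format records from two vintages of the same variable group.
long_2018 = [
    {"geoid": "tract-a", "year": 2018, "variable": "B05006_001E", "value": 90},
    {"geoid": "tract-a", "year": 2018, "variable": "B05006_002E", "value": 30},
]
long_2023 = [
    {"geoid": "tract-a", "year": 2023, "variable": "B05006_001E", "value": 100},
    {"geoid": "tract-a", "year": 2023, "variable": "B05006_002E", "value": 40},
]

# Concatenating long tables is just stacking rows.
combined = long_2018 + long_2023

# Pivot: variables become rows, (geoid, year) pairs become columns.
wide = defaultdict(dict)
for r in combined:
    wide[r["variable"]][(r["geoid"], r["year"])] = r["value"]

for variable, columns in sorted(wide.items()):
    print(variable, columns)
```

This only works cleanly when both tables come from the same variable group, since the variable codes are what line the rows up across years.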

Save the long-format table as CSV with schema and resource files.

Typically we store data in long format. The CensusAPI.save() method saves the long-format table along with its schema and resource files.

b05006_2023.save('./temp_data')
import os

print(b05006_2023.FILENAME)

os.path.exists(f'./temp_data/{b05006_2023.FILENAME}')
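To show what a CSV with schema and resource files can look like on disk, here is a minimal standard-library sketch in the style of the Frictionless Table Schema and Data Resource descriptors. It is not what save() writes: the file names, the JSON serialization, and the field list are assumptions, and the data values are made up.

```python
import csv
import json
import os
import tempfile

# Made-up long-format records.
records = [
    {"geoid": "tract-a", "variable": "B05006_001E", "value": 100},
    {"geoid": "tract-a", "variable": "B05006_002E", "value": 40},
]

out_dir = tempfile.mkdtemp()
name = "demo-long"  # hypothetical file stem

# 1. The data itself, as CSV.
csv_path = os.path.join(out_dir, f"{name}.csv")
with open(csv_path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["geoid", "variable", "value"])
    writer.writeheader()
    writer.writerows(records)

# 2. A Table Schema describing each column's name and type.
schema = {
    "fields": [
        {"name": "geoid", "type": "string"},
        {"name": "variable", "type": "string"},
        {"name": "value", "type": "integer"},
    ]
}
schema_path = os.path.join(out_dir, f"{name}.schema.json")
with open(schema_path, "w") as f:
    json.dump(schema, f, indent=2)

# 3. A Data Resource descriptor tying the data file to its schema.
resource = {
    "name": name,
    "path": f"{name}.csv",
    "format": "csv",
    "schema": f"{name}.schema.json",
}
resource_path = os.path.join(out_dir, f"{name}.resource.json")
with open(resource_path, "w") as f:
    json.dump(resource, f, indent=2)

print(sorted(os.listdir(out_dir)))
```

Keeping the descriptor next to the data means a later reader can validate column names and types without opening the original notebook.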

(DEPRECATED) Custom API calls

Getting all available datasets in the api.

MORPC works regularly with census data, including but not limited to ACS 5-year and 1-year estimates, the Decennial Census, PEP, and geographies. This module is useful for gathering and organizing census data for processes in various workflows. Those workflows are linked when appropriate.

API functions and variables

api_get() is a low-level wrapper for Census API requests that returns the results as a pandas dataframe. If necessary, it splits the request into several smaller requests to bypass the 50-variable limit imposed by the API.

The resulting dataframe is indexed by GEOID (regardless of whether it was requested) and omits fields that were not requested but are returned automatically with each API request (e.g. “state”, “county”).

url = 'https://api.census.gov/data/2022/acs/acs1'
params = {
    "get": "GEO_ID,NAME,B01001_001E",
    "for": "county:049,041",
    "in": "state:39"
}
from morpc.census.api import get
api = get(url, params)
api
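The request splitting described above can be sketched as follows. This is an illustration of the general technique, not the library's code: the helpers chunk_variables and merge_chunks are hypothetical, the variable names are placeholders, and the responses are simulated rather than fetched. Each chunk is capped at 49 names on the assumption that GEO_ID is added to every request so the pieces can be joined afterwards.

```python
def chunk_variables(variables, limit=49):
    """Split a variable list into consecutive chunks of at most `limit` names."""
    return [variables[i:i + limit] for i in range(0, len(variables), limit)]


def merge_chunks(responses, key="GEO_ID"):
    """Merge per-chunk rows (dicts) into a single row per key value."""
    merged = {}
    for rows in responses:
        for row in rows:
            merged.setdefault(row[key], {}).update(row)
    return merged


# 120 placeholder variable names, more than a single request allows.
variables = [f"VAR_{i:03d}" for i in range(1, 121)]
chunks = chunk_variables(variables)

# Simulated responses: one row for one geography per chunk, all values zero.
responses = [
    [{"GEO_ID": "0500000US39049", **{name: 0 for name in chunk}}]
    for chunk in chunks
]
merged = merge_chunks(responses)

print([len(c) for c in chunks])
print(len(merged["0500000US39049"]))
```

Joining on GEO_ID is what lets the chunks be requested independently and recombined in any order.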

Geography tools

from morpc.census.geos import fetch_geos_from_scale_scope

geos = fetch_geos_from_scale_scope(scope='region15', scale='tract')

# TODO: Split geoid list into sumlevels and then chunks to avoid requests that are too large. Figure out how to handle this in the rest_api module maybe.
geos
geos.plot(alpha=0.5)