About

What is How Do They Lobby?

How Do They Lobby is a portal for exploring CHORUS (Dataset on Policy Choice and Organizational Representation in the United States). CHORUS represents the first comprehensive dataset of state lobbying and testimony positions, allowing researchers to systematically analyze interest groups’ influence on state legislation for the first time.

The site aims to provide a publicly accessible portal that makes it convenient to explore CHORUS. The site contains two main features: 1) a graph page for exploring interactive visualizations of interest group coalitions, and 2) a search page for identifying specific interest groups, bills, and policy positions.

The data for CHORUS was obtained by scraping and cleaning policy positions from public data portals published by state legislatures. The dataset was constructed by Galen Hall, Joshua A. Basseches, Rebecca Bromley-Trujillo and Trevor Culhane, and is a Brown Climate and Development Lab project. The How Do They Lobby site was built by Galen Winsor and Lizzy Zhang.

The CHORUS paper has been published in State Politics & Policy Quarterly (SPPQ). Use of the dataset should be cited as follows:

Hall, Galen, Joshua A. Basseches, Rebecca Bromley-Trujillo, and Trevor Culhane. 2024. “CHORUS: A New Dataset of State Interest Group Policy Positions in the United States.” State Politics & Policy Quarterly: 1–26. doi: 10.1017/spq.2024.6

About the Dataset

CHORUS compiles 13 million policy positions taken by over 300,000 interest groups and individuals from 17 state legislatures on over 500,000 bills spanning various issue areas from 1997 to 2022. CHORUS also presents a new technique for generating coalitions of similar interest groups and bills using layered stochastic block modeling. Network analysis of these coalitions presents new opportunities to analyze interest groups’ policy preferences.

Additionally, CHORUS contains interest group and bill metadata linked from external databases. Industry classifications for interest groups are collected from Follow The Money, and bill metadata are linked to Legiscan and NCSL.

Policy positions have a record type of either “lobbying” or “testimony”, depending on whether the position was reported in a lobbying disclosure or collected from public testimony (e.g. committee meeting minutes). The table below lists the record types and years available in the positions data for each state:

State	Record Type	Support	Neutral	Oppose	% Neutral	Average positions per bill	Years covered
AZ	testimony	3,007,440	124,521	2,604,305	2.2	398.7	2006-2022
CO	lobbying	214,645	919,610	464,469	57.5	132.7	2003-2023
CO	testimony	13,134	4,339	42,826	7.2	15.3	2006-2015
FL*	testimony	13,526	4,669	35,094	8.8	6.2	2004-2022
IA	lobbying	90,545	531,455	167,782	67.3	35.8	2009-2022
IL	testimony	1,232,635	19,091	1,924,315	0.6	166	2013-2022
KS	testimony	11,203	2,973	25,862	7.4	18.6	2014-2022
MA	lobbying	111,160	210,783	153,709	44.3	15.9	2010-2021
MD	testimony	19,001	4,193	73,058	4.4	14.2	2020-2022
MO	testimony	16,110	2,927	45,812	4.5	6.9	2003-2022
MT	lobbying	23,060	33,440	45,415	32.8	10	2006-2022
MT	testimony	15,349	4,875	32,198	9.3	22.3	2017-2021
NE	lobbying	77,103	78,092	141,489	26.3	23.5	2000-2021
NJ	lobbying	13,687	16,359	40,315	23.3	4.6	2014-2022
OH	testimony	10,039	4,973	27,409	11.7	12.1	2015-2022
RI	lobbying	15,198	11,210	41,313	16.6	16.3	2018-2022
SD	testimony	19,465	1,309	44,731	2	6	1997-2022
TX	testimony	140,771	90,985	415,461	14.1	17.9	1997-2021
WI	lobbying	22,212	26,006	50,494	26.3	6.6	2002-2022

*Positions from Senate Bills are not available in Florida. All other states include positions from both chambers, except for Nebraska, which is unicameral. Source: Hall et al. 2023.

Data Disclaimer

The information provided on this site is intended for informational purposes only and may differ from the published official CHORUS dataset. The accuracy of the data on the website is not guaranteed. Data should be verified against the official CHORUS dataset and state legislature lobbying tracking databases, especially when used for research purposes.

Notes on the Website Data

Position Counts

The website database calculates position counts for each bill and interest group. These counts are used for:
- The pie charts on the bill and interest group pages.
- The “oppose positions”, “neutral positions”, “support positions”, and “total positions” columns in the bill and interest group tables.
When computing counts, positions are deduplicated so that only unique combinations of interest groups, bills, and stances (i.e., oppose, neutral, or support) are considered.
- A bill’s position counts therefore represent the number of interest groups who took a specific stance on the bill, while an interest group’s position counts represent the number of bills on which the interest group took a specific stance.
The number of total positions is calculated for both interest groups and bills by summing the three stance-specific counts, without further deduplication.
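The counting rules above can be illustrated with a minimal sketch. The record layout (a tuple of interest group, bill, and stance), the sample data, and the helper name total_positions are assumptions made for illustration; they are not taken from the website's code.

```python
from collections import Counter

# Hypothetical position records: (interest_group, bill, stance).
positions = [
    ("Group A", "HB 1", "support"),
    ("Group A", "HB 1", "support"),  # repeated record, counted once
    ("Group A", "HB 1", "oppose"),   # same group and bill, different stance
    ("Group B", "HB 1", "oppose"),
    ("Group A", "SB 2", "neutral"),
]

# Deduplicate on the (interest group, bill, stance) combination.
unique_positions = set(positions)

# Per-bill counts: number of interest groups taking each stance on the bill.
bill_counts = Counter((bill, stance) for _, bill, stance in unique_positions)

# Per-group counts: number of bills on which the group took each stance.
group_counts = Counter((group, stance) for group, _, stance in unique_positions)


def total_positions(counts, key):
    """Sum the three stance-specific counts without further deduplication."""
    return sum(counts[(key, stance)] for stance in ("support", "neutral", "oppose"))


print(bill_counts[("HB 1", "oppose")])          # groups opposing HB 1
print(total_positions(bill_counts, "HB 1"))     # total positions on HB 1
print(total_positions(group_counts, "Group A")) # total positions by Group A
```

Because totals are summed across stances, a group that both supported and opposed the same bill (Group A on HB 1 here) contributes twice to that bill's total.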

Bills

Proponents/Opponents. Bill proponents and opponents are calculated by finding the interest groups with the greatest number of support or oppose positions, respectively, for a given bill and record type. Positions used to calculate proponents/opponents are not deduplicated.
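As a rough sketch of this ranking, the snippet below counts raw (non-deduplicated) records per interest group for one bill, record type, and stance. The dictionary keys, the sample records, and the function name top_groups are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical raw position records; repeats are intentionally kept.
records = [
    {"group": "Group A", "bill": "HB 1", "record_type": "testimony", "stance": "support"},
    {"group": "Group A", "bill": "HB 1", "record_type": "testimony", "stance": "support"},
    {"group": "Group B", "bill": "HB 1", "record_type": "testimony", "stance": "support"},
    {"group": "Group C", "bill": "HB 1", "record_type": "testimony", "stance": "oppose"},
    {"group": "Group A", "bill": "HB 1", "record_type": "lobbying", "stance": "support"},
]


def top_groups(records, bill, record_type, stance, n=3):
    """Rank interest groups by their number of matching records (no deduplication)."""
    counts = Counter(
        r["group"]
        for r in records
        if r["bill"] == bill and r["record_type"] == record_type and r["stance"] == stance
    )
    return counts.most_common(n)


proponents = top_groups(records, "HB 1", "testimony", "support")
opponents = top_groups(records, "HB 1", "testimony", "oppose")
print(proponents)
print(opponents)
```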

Clients (Interest Groups)

Proponents/Opponents. Interest group proponents and opponents represent the interest groups that agree or disagree most frequently with a given interest group. Proponents/opponents are not calculated directly.
- Instead, they are computed from the edges between interest groups used to construct the intra-coalition networks. Positive and negative edges between two interest groups indicate the number of bills on which they agreed or disagreed, respectively.
- This approach deviates from direct computation by excluding interest groups with fewer than five non-neutral positions and bills with fewer than two non-neutral positions during edge generation. Positions are deduplicated during this procedure. Additionally, an attempt is made to filter out individuals (see “Filtering interest groups” under the “Coalition Networks” section below). Thus, interest group proponents and opponents may differ from calculations made across the entire dataset.
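The edge-based procedure can be sketched as follows. This is a simplified illustration, not the site's implementation: the function name build_edges, the tuple layout, and the sample data are hypothetical, and the sketch assumes that positions are reduced to unique (group, bill, stance) triples and that both thresholds are applied in a single pass over the non-neutral positions.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_edges(positions, min_group=5, min_bill=2):
    """Count agreements and disagreements between pairs of interest groups."""
    # Keep unique non-neutral (group, bill, stance) triples (assumption).
    stances = {p for p in positions if p[2] in ("support", "oppose")}
    group_n = Counter(group for group, _, _ in stances)
    bill_n = Counter(bill for _, bill, _ in stances)

    # Drop groups and bills with too few non-neutral positions.
    by_bill = defaultdict(list)
    for group, bill, stance in stances:
        if group_n[group] >= min_group and bill_n[bill] >= min_bill:
            by_bill[bill].append((group, stance))

    agree, disagree = Counter(), Counter()
    for members in by_bill.values():
        # Sorting makes each pair key (g1, g2) alphabetically ordered.
        for (g1, s1), (g2, s2) in combinations(sorted(members), 2):
            if g1 == g2:
                continue  # a group holding both stances on one bill
            if s1 == s2:
                agree[(g1, g2)] += 1
            else:
                disagree[(g1, g2)] += 1
    return agree, disagree


sample = [
    ("A", "HB 1", "support"), ("B", "HB 1", "support"), ("C", "HB 1", "oppose"),
    ("A", "HB 2", "support"), ("B", "HB 2", "oppose"),
    ("A", "HB 3", "support"),
]
# Lower thresholds than the site's (5 and 2) so the toy data survives filtering.
agree, disagree = build_edges(sample, min_group=2, min_bill=2)
print(agree, disagree)
```

In this toy run, group C is dropped for having a single non-neutral position and HB 3 is dropped for having a single position, which is why results from the filtered edges can differ from a count over the full dataset.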
Client Names. An interest group’s name may be recorded differently across its positions, leading to multiple associated names for a single group. To maintain consistency in position/interest group search results, one of these names is arbitrarily chosen as the canonical client name and shown in the “interest group name” column. These canonical name designations are consistent across the website for a particular version of the dataset.

Search Results

Issue areas and NCSL Bill Topics. The values in the “Bill Issue Areas” and “Bill Topics” columns are sourced from NCSL.
- Issue areas are determined by the “related topic” of the NCSL databases which track a given bill. For instance, the Prescription Drug State Bill Tracking Database’s related topic is “health”.
- NCSL Bill Topics refer to the subtopics that can be found in a particular NCSL database. For instance, some topics under the Prescription Drug State Bill Tracking Database are “Insurance/Coverage - Rx Drugs” and “Specialty Pharmaceuticals”.
Client names in position results. The “recorded client name” field in a position result is the client name listed for that particular position record, and may differ from the canonical client name (see “Client Names” in the “Clients” section above).

Coalition Networks

Filtering interest groups. An attempt is made to exclude individuals from coalition networks. An interest group is filtered out if it has 1) a null client/interest group ID, or 2) one of the following names (case-insensitive): ‘self’, ‘citizen’, ‘taxpayer’, or ‘individual’. Since this list is non-exhaustive, individuals may still appear in networks.
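A minimal sketch of this filter is shown below. The function name is_filtered_out is hypothetical, and two details are assumptions rather than documented behavior: names are compared as whole strings (not substrings), and surrounding whitespace is ignored.

```python
# Names treated as individuals, per the list above.
INDIVIDUAL_NAMES = {"self", "citizen", "taxpayer", "individual"}


def is_filtered_out(client_id, name):
    """Return True if the record should be excluded from coalition networks."""
    if client_id is None:
        return True  # null client/interest group ID
    # Case-insensitive, whole-name comparison (assumption).
    return (name or "").strip().lower() in INDIVIDUAL_NAMES


print(is_filtered_out(None, "Acme Energy"))   # True: null ID
print(is_filtered_out(42, "SELF"))            # True: listed name
print(is_filtered_out(42, "Jane Doe"))        # False: an individual not on the list
```

The last call shows why individuals may still appear in networks: a personal name that is not on the list passes the filter.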
Issue areas and bill databases. The choices for issue areas and bill databases are sourced from NCSL (National Conference of State Legislatures). Bill databases align with NCSL’s bill tracking databases, and issue areas are determined as described in the Search Results section above.
- Many bills lack NCSL metadata because they fall outside the years covered by NCSL’s databases. Consequently, issue- and database-specific coalition networks will reflect a smaller, more recent subset of the dataset.
Coalition Names. A coalition’s name is determined by identifying the most prevalent industry classification within the coalition (industry classifications for some interest groups are predictions). This determination is weighted by the number of positions, meaning that the industry with the highest number of positions within the coalition becomes its label. Positions from interest groups with predicted industries are weighted 10x less than those from groups with actual industry classifications. Because a name reflects only the dominant industry, multiple coalitions within the same network may have the same name.
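The weighting described above can be sketched as follows. The tuple layout (industry, position count, predicted flag), the function name coalition_name, and the sample figures are hypothetical; the 0.1 weight is one reading of "10x less".

```python
from collections import defaultdict

PREDICTED_WEIGHT = 0.1  # predicted industries count 10x less (assumed to mean a 0.1 multiplier)


def coalition_name(members):
    """Pick the industry with the highest weighted position count.

    members: iterable of (industry, n_positions, is_predicted) tuples,
    one per interest group in the coalition.
    """
    totals = defaultdict(float)
    for industry, n_positions, is_predicted in members:
        weight = PREDICTED_WEIGHT if is_predicted else 1.0
        totals[industry] += n_positions * weight
    return max(totals, key=totals.get)


members = [
    ("Energy", 100, False),
    ("Health", 500, True),   # predicted industry: contributes 500 * 0.1 = 50
    ("Energy", 20, False),
]
print(coalition_name(members))
```

Here "Energy" totals 120 weighted positions against 50 for "Health", so the coalition is labeled "Energy" even though the Health group holds more raw positions.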
Intra-coalition networks. Intra-coalition networks show a force-directed graph of the top 50 interest groups in a given coalition, sorted by total position count (calculated using the deduplication procedure described above).
- Node size. The size of interest group nodes is proportional to the percentage of (deduplicated) positions held by that group in its coalition. A lower size limit is set to ensure all nodes are visible.
- Edge length. An edge’s length is determined by subtracting the edge’s weight from the maximum positive edge weight in the coalition, then adding a small positive constant. This design ensures that the two interest groups agreeing the most have the shortest edge, while those disagreeing the most have the longest.
- Edge sparsification. The bottom 25% of edges, based on absolute weight, are excluded from the graph. The top 250 edges are always retained, regardless of their percentile. Additionally, only the top 10% of the remaining edges (7.5% of all edges) are given opacity in the graph. The rest of the edges are invisible but still influence the position of interest group nodes. The top 250 edges are always rendered with opacity.
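The edge length and sparsification rules above can be sketched together. This is an illustrative reading, not the site's code: the function names, the signed-weight dictionary, the default value of the constant epsilon, and the rounding of percentile cutoffs are all assumptions.

```python
def edge_lengths(weights, epsilon=1.0):
    """Length = (maximum positive weight) - (edge weight) + small constant."""
    max_positive = max((w for w in weights.values() if w > 0), default=0)
    return {pair: max_positive - w + epsilon for pair, w in weights.items()}


def sparsify(weights, drop_fraction=0.25, visible_fraction=0.10, always_keep=250):
    """Return (kept edges, visible edges), ranked by absolute weight."""
    ranked = sorted(weights, key=lambda pair: abs(weights[pair]), reverse=True)
    floor = min(always_keep, len(ranked))
    # Drop the bottom 25% by absolute weight, but never the top `always_keep`.
    n_keep = max(len(ranked) - int(len(ranked) * drop_fraction), floor)
    kept = ranked[:n_keep]
    # Only the top 10% of kept edges are drawn; the top `always_keep` always are.
    n_visible = max(int(len(kept) * visible_fraction), floor)
    return kept, set(kept[:n_visible])


# Hypothetical signed weights: positive for agreement, negative for disagreement.
weights = {("A", "B"): 5, ("A", "C"): -3, ("B", "C"): 2}
lengths = edge_lengths(weights)
print(lengths)

# Toy network of 8 edges, with a small always_keep so every rule is exercised.
toy = {("n", str(i)): w for i, w in enumerate([8, -7, 6, 5, -4, 3, 2, 1])}
kept, visible = sparsify(toy, always_keep=2)
print(len(kept), sorted(visible))
```

With these toy weights, the most strongly agreeing pair (A, B) gets the shortest edge and the disagreeing pair (A, C) the longest. Edges that are kept but not visible would still be passed to the force-directed layout, so they continue to influence node positions.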