About

What is How Do They Lobby?

How Do They Lobby is a portal for exploring CHORUS (Dataset on Policy Choice and Organizational Representation in the United States). CHORUS is the first comprehensive dataset of state lobbying and testimony positions, allowing researchers to systematically analyze interest groups’ influence on state legislation.

The site provides a publicly accessible, convenient way to explore CHORUS. It has two main features: 1) a graph page for exploring interactive visualizations of interest group coalitions, and 2) a search page for identifying specific interest groups, bills, and policy positions.

The data for CHORUS was obtained by scraping and cleaning policy positions from public data portals published by state legislatures. The dataset was constructed by Galen Hall, Joshua A. Basseches, Rebecca Bromley-Trujillo and Trevor Culhane, and is a Brown Climate and Development Lab project. The How Do They Lobby site was built by Galen Winsor and Lizzy Zhang.

The CHORUS paper has been published in State Politics & Policy Quarterly (SPPQ). Use of the dataset should be cited as follows:

Hall, Galen, Joshua A. Basseches, Rebecca Bromley-Trujillo, and Trevor Culhane. 2024. “CHORUS: A New Dataset of State Interest Group Policy Positions in the United States.” State Politics & Policy Quarterly: 1–26. doi: 10.1017/spq.2024.6

About the Dataset

CHORUS compiles 13 million policy positions taken by over 300,000 interest groups and individuals from 17 state legislatures on over 500,000 bills spanning various issue areas from 1997 to 2022. CHORUS also presents a new technique for generating coalitions of similar interest groups and bills using layered stochastic block modeling. Network analysis of these coalitions presents new opportunities to analyze interest groups’ policy preferences.

Additionally, CHORUS contains interest group and bill metadata linked from external databases. Industry classifications for interest groups are collected from Follow The Money, and bill metadata are linked to Legiscan and NCSL.

Policy positions have a record type of either “lobbying” or “testimony”, depending on whether the position was reported in a lobbying disclosure or collected from public testimony (e.g. committee meeting minutes). The table below lists the record types and years available in the positions data for each state:

| State | Record Type | Support | Neutral | Oppose | % Neutral | Average positions per bill | Years covered |
|-------|-------------|---------|---------|--------|-----------|----------------------------|---------------|
| AZ | testimony | 3,007,440 | 124,521 | 2,604,305 | 2.2 | 398.7 | 2006-2022 |
| CO | lobbying | 214,645 | 919,610 | 464,469 | 57.5 | 132.7 | 2003-2023 |
| CO | testimony | 13,134 | 4,339 | 42,826 | 7.2 | 15.3 | 2006-2015 |
| FL* | testimony | 13,526 | 4,669 | 35,094 | 8.8 | 6.2 | 2004-2022 |
| IA | lobbying | 90,545 | 531,455 | 167,782 | 67.3 | 35.8 | 2009-2022 |
| IL | testimony | 1,232,635 | 19,091 | 1,924,315 | 0.6 | 166 | 2013-2022 |
| KS | testimony | 11,203 | 2,973 | 25,862 | 7.4 | 18.6 | 2014-2022 |
| MA | lobbying | 111,160 | 210,783 | 153,709 | 44.3 | 15.9 | 2010-2021 |
| MD | testimony | 19,001 | 4,193 | 73,058 | 4.4 | 14.2 | 2020-2022 |
| MO | testimony | 16,110 | 2,927 | 45,812 | 4.5 | 6.9 | 2003-2022 |
| MT | lobbying | 23,060 | 33,440 | 45,415 | 32.8 | 10 | 2006-2022 |
| MT | testimony | 15,349 | 4,875 | 32,198 | 9.3 | 22.3 | 2017-2021 |
| NE | lobbying | 77,103 | 78,092 | 141,489 | 26.3 | 23.5 | 2000-2021 |
| NJ | lobbying | 13,687 | 16,359 | 40,315 | 23.3 | 4.6 | 2014-2022 |
| OH | testimony | 10,039 | 4,973 | 27,409 | 11.7 | 12.1 | 2015-2022 |
| RI | lobbying | 15,198 | 11,210 | 41,313 | 16.6 | 16.3 | 2018-2022 |
| SD | testimony | 19,465 | 1,309 | 44,731 | 2 | 6 | 1997-2022 |
| TX | testimony | 140,771 | 90,985 | 415,461 | 14.1 | 17.9 | 1997-2021 |
| WI | lobbying | 22,212 | 26,006 | 50,494 | 26.3 | 6.6 | 2002-2022 |

*Positions from Senate Bills are not available in Florida. All other states include positions from both chambers, except for Nebraska, which is unicameral. Source: Hall et al. 2023.
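
The “% Neutral” column in the table appears to be the neutral count as a share of all three stance counts, rounded to one decimal place. That formula is inferred from the published figures rather than stated by the dataset authors; the sketch below checks it against three rows copied from the table. The function name `percent_neutral` is hypothetical.

```python
def percent_neutral(support, neutral, oppose):
    """Neutral positions as a percentage of all positions (assumed formula)."""
    total = support + neutral + oppose
    return round(100 * neutral / total, 1)

# (support, neutral, oppose) counts copied from the table above.
rows = {
    "AZ testimony": (3_007_440, 124_521, 2_604_305),
    "CO lobbying": (214_645, 919_610, 464_469),
    "IL testimony": (1_232_635, 19_091, 1_924_315),
}

for label, counts in rows.items():
    print(label, percent_neutral(*counts))
```

The three results match the table's 2.2, 57.5, and 0.6 respectively.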

Data Disclaimer

The information on this site is provided for informational purposes only and may differ from the officially published CHORUS dataset. The accuracy of the data on the website is not guaranteed; verify it against the official CHORUS dataset and state legislature lobbying tracking databases, especially when using it for research.

Notes on the Website Data

Position Counts

  • The website database calculates position counts for each bill and interest group. These counts are used for:
    • The pie charts on the bill and interest group pages.
    • The “oppose positions”, “neutral positions”, “support positions”, and “total positions” columns in the bill and interest group tables.
  • When computing counts, positions are deduplicated so that only unique combinations of interest groups, bills, and stances (i.e., oppose, neutral, or support) are considered.
    • A bill’s position counts therefore represent the number of interest groups who took a specific stance on the bill, while an interest group’s position counts represent the number of bills on which the interest group took a specific stance.
  • The number of total positions is calculated for both interest groups and bills by summing the three stance-specific counts, without further deduplication.
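
The counting rules above can be sketched as follows. This is a minimal illustration, not the website's actual implementation: the record layout `(interest_group, bill, stance)`, the sample data, and the function name `position_counts` are all assumptions.

```python
from collections import Counter

# Hypothetical raw position records: (interest_group, bill, stance).
positions = [
    ("Group A", "HB 1", "support"),
    ("Group A", "HB 1", "support"),  # duplicate record, counted once
    ("Group A", "HB 1", "oppose"),   # same group, different stance: counted separately
    ("Group B", "HB 1", "support"),
    ("Group B", "HB 2", "neutral"),
]

STANCES = ("oppose", "neutral", "support")

def position_counts(records, key_index):
    """Deduplicated stance counts per group (key_index=0) or per bill (key_index=1)."""
    unique = set(records)  # unique (group, bill, stance) combinations
    counts = {}
    for record in unique:
        per_key = counts.setdefault(record[key_index], Counter())
        per_key[record[2]] += 1
    # The total is the plain sum of the three stance counts, with no further deduplication.
    return {
        key: {**{s: c[s] for s in STANCES}, "total": sum(c[s] for s in STANCES)}
        for key, c in counts.items()
    }

bill_counts = position_counts(positions, key_index=1)
group_counts = position_counts(positions, key_index=0)
```

Here “HB 1” ends up with two support positions and one oppose position, for a total of three. Group A contributes to both stance counts, which is why a bill's total can exceed the number of distinct groups that took a position on it.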

Bills

  • Proponents/Opponents. Bill proponents and opponents are calculated by finding the interest groups with the greatest number of support or oppose positions, respectively, for a given bill and record type. Positions used to calculate proponents/opponents are not deduplicated.
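
As a rough sketch of the ranking described above, the snippet below tallies raw (not deduplicated) records for a single bill and record type. The record layout, the function name `top_groups`, the cutoff `n`, and the tie-breaking rule are assumptions made for illustration; the site does not document them.

```python
from collections import Counter

# Hypothetical raw records for one bill and one record type: (interest_group, stance).
records = [
    ("Group A", "support"), ("Group A", "support"), ("Group A", "support"),
    ("Group B", "support"),
    ("Group C", "oppose"), ("Group C", "oppose"),
    ("Group B", "oppose"),
]

def top_groups(records, stance, n=3):
    """Rank groups by their raw count of positions with the given stance."""
    tally = Counter(group for group, s in records if s == stance)
    # Sort by count (descending), then by name so ties resolve deterministically.
    return sorted(tally.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

proponents = top_groups(records, "support")
opponents = top_groups(records, "oppose")
```

Because duplicates are kept, a group that files the same stance repeatedly on a bill ranks higher here than it would under the deduplicated position counts.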

Clients (Interest Groups)

  • Proponents/Opponents. Interest group proponents and opponents represent the interest groups that agree or disagree most frequently with a given interest group. Proponents/opponents are not calculated directly.
    • Instead they are computed from the edges between interest groups used to construct the intra-coalition networks. Positive and negative edges between two interest groups indicate the number of bills on which they agreed or disagreed, respectively.
    • This approach deviates from direct computation by excluding interest groups with fewer than five non-neutral positions and bills with fewer than two non-neutral positions during edge generation. Positions are deduplicated during this procedure. Additionally, an attempt is made to filter out individuals (see “Filtering interest groups” under the “Coalition Networks” section below). Thus, interest group proponents and opponents may differ from calculations across the entire dataset.
  • Client Names. An interest group’s name may be recorded differently across its positions, leading to multiple associated names for a single group. To keep position and interest group search results consistent, one of these names is arbitrarily designated as the canonical client name and shown in the “interest group name” column. Canonical name designations are consistent across the website for a particular version of the dataset.
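
The edge-based calculation of interest group proponents and opponents described above might look roughly like this. It is a sketch under stated assumptions: the record layout `(interest_group, bill, stance)`, the function name `agreement_edges`, and the order in which the two thresholds are applied are not taken from the site.

```python
from collections import defaultdict
from itertools import combinations

def agreement_edges(positions, min_group_positions=5, min_bill_positions=2):
    """Return {(group_a, group_b): (n_agree, n_disagree)} from position records."""
    # Keep unique non-neutral positions only.
    unique = {p for p in positions if p[2] in ("support", "oppose")}

    # Drop groups with too few non-neutral positions. Applying the group
    # filter before the bill filter is an assumption.
    per_group = defaultdict(int)
    for group, _, _ in unique:
        per_group[group] += 1
    unique = {p for p in unique if per_group[p[0]] >= min_group_positions}

    # Collect the remaining positions by bill.
    by_bill = defaultdict(list)
    for group, bill, stance in unique:
        by_bill[bill].append((group, stance))

    edges = defaultdict(lambda: [0, 0])
    for bill_positions in by_bill.values():
        if len(bill_positions) < min_bill_positions:
            continue  # too few non-neutral positions on this bill
        for (g1, s1), (g2, s2) in combinations(sorted(bill_positions), 2):
            if g1 == g2:
                continue  # one group that took both stances on the same bill
            edges[(g1, g2)][0 if s1 == s2 else 1] += 1
    return {pair: tuple(counts) for pair, counts in edges.items()}

# Hypothetical data: A and B support five bills, C opposes them, D appears once.
sample = [
    (group, f"HB {i}", stance)
    for group, stance in (("A", "support"), ("B", "support"), ("C", "oppose"))
    for i in range(1, 6)
] + [("D", "HB 1", "support")]

edges = agreement_edges(sample)
```

In this example A and B agree on five bills, each disagrees with C on five bills, and D is dropped for having fewer than five non-neutral positions. A group's proponents would then be the neighbors with the most agreements, and its opponents the neighbors with the most disagreements.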

Search Results

  • Issue areas and NCSL Bill Topics. The values in the “Bill Issue Areas” and “Bill Topics” columns are sourced from NCSL.
    • Issue areas are determined by the “related topic” of the NCSL databases which track a given bill. For instance, the Prescription Drug State Bill Tracking Database’s related topic is “health”.
    • NCSL Bill Topics refer to the subtopics that can be found in a particular NCSL database. For instance, some topics under the Prescription Drug State Bill Tracking Database are “Insurance/Coverage - Rx Drugs” and “Specialty Pharmaceuticals”.
  • Client names in position results. The “recorded client name” field in a position result is the client name listed for that particular position record, and may differ from the canonical client name (see “Client Names” in the “Clients” section above).

Coalition Networks

  • Filtering interest groups. An attempt is made to exclude individuals from coalition networks. An interest group is filtered out if it has 1) a null client/interest group ID, or 2) one of the following names (case-insensitive): ‘self’, ‘citizen’, ‘taxpayer’, or ‘individual’. Since this list is non-exhaustive, individuals may still appear in networks.
  • Issue areas and bill databases. The choices for issue areas and bill databases are sourced from NCSL (National Conference of State Legislatures). Bill databases align with NCSL’s bill tracking databases, and issue areas are determined as described in the Search Results section above.
    • Many bills lack NCSL metadata because they fall outside the years covered by the NCSL databases. Consequently, issue- and database-specific coalition networks will reflect a smaller, more recent subset of the dataset.
  • Coalition Names. A coalition’s name is the most prevalent industry classification within the coalition, weighted by number of positions: the industry with the highest weighted position count becomes the coalition’s label. Industry classifications for some interest groups are predictions, and positions from interest groups with predicted industries are weighted 10x less than positions from groups with actual industries. Multiple coalitions within the same network may therefore share a name.
  • Intra-coalition networks. Intra-coalition networks show a force-directed graph of the top 50 interest groups in a given coalition, sorted by total position count (calculated using the deduplication procedure described above).
    • Node size. The size of interest group nodes is proportional to the percentage of (deduplicated) positions held by that group in its coalition. A lower size limit is set to ensure all nodes are visible.
    • Edge length. Edge length is determined by subtracting its weight from the maximum positive edge weight in the coalition, then adding a small positive constant. This design ensures that the two interest groups agreeing the most have the shortest edge, while those disagreeing the most have the longest.
    • Edge sparsification. The bottom 25% of edges, based on absolute weight, are excluded from the graph. The top 250 edges are always retained, regardless of their percentile. Additionally, only the top 10% of the remaining edges (7.5% of all edges) are given opacity in the graph. The rest of the edges are invisible but still influence the position of interest group nodes. The top 250 edges are always rendered with opacity.
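
Several of the rules in this section can be sketched in a few lines of code: the individual filter, the weighted coalition label, the edge length, and the edge sparsification. This is an illustration under assumptions, not the site's implementation. All function names, the data layouts, and the value of `epsilon` (the text says only “a small positive constant”) are hypothetical, as is the way the “top 250” rule is combined with the percentile cutoffs.

```python
from collections import defaultdict

INDIVIDUAL_NAMES = {"self", "citizen", "taxpayer", "individual"}

def is_filtered(client_id, name):
    """True for a null client ID or a listed individual name (case-insensitive)."""
    return client_id is None or (name or "").lower() in INDIVIDUAL_NAMES

def coalition_label(members):
    """members: iterable of (industry, n_positions, is_predicted) per interest group."""
    weights = defaultdict(float)
    for industry, n_positions, is_predicted in members:
        # Predicted industries count one tenth as much as actual ones.
        weights[industry] += n_positions / 10 if is_predicted else n_positions
    return max(weights, key=weights.get)

def edge_length(weight, max_positive_weight, epsilon=1.0):
    """Strongest agreement gets the shortest edge; epsilon keeps every length positive."""
    return max_positive_weight - weight + epsilon

def sparsify(edges, drop_percent=25, visible_percent=10, always_keep=250):
    """edges: list of (a, b, weight). Returns (kept, visible), strongest first."""
    ranked = sorted(edges, key=lambda e: abs(e[2]), reverse=True)
    n_drop = len(ranked) * drop_percent // 100
    # Drop the weakest edges, but never cut into the top `always_keep`.
    kept = ranked[:max(len(ranked) - n_drop, always_keep)]
    # Only the strongest kept edges are drawn; the rest still shape the layout.
    n_visible = max(len(kept) * visible_percent // 100, always_keep)
    return kept, kept[:n_visible]

# Hypothetical network with 4,000 edges of distinct weights.
edges = [("a", "b", w) for w in range(1, 4001)]
kept, visible = sparsify(edges)
```

With 4,000 edges, 3,000 are kept and 300 are visible, which is 7.5% of the original total. In a network with 250 or fewer edges, every edge is both kept and visible.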