Campaign Finance Database Methodology
Data in the Donor Name, Amount, Date, Address, and Notes fields were drawn from the following sources:
Data in the Sector of Employment field as well as some donor and Notes data were drawn from additional online research.
Our analysis is based on records provided online by the Philadelphia Department of Records and Philadelphia City Commissioners as of November 2018.
Type of Data Selected and How it Was Altered from Original Source:
Rationale for Headings/Data Types:
Donor Name: Primarily drawn from aforementioned sources. Altered, with a notation in Notes, when a company donated through a property it owned (EXAMPLE) or when a donation was made jointly by two entities (mainly spouses) who did not give independently. We also standardized variant spellings of names so that all records likely representing the same individual or group are consistent, checking addresses and employers, where relevant, for added validation.
Amount: Drawn from aforementioned sources. Unaltered.
Date: Drawn from aforementioned sources. Unaltered.
Sector of employment: Created to categorize donors based on the industry in which they’re employed. In some cases it denotes industries where donors have other involvement, including having advocated for, donated to, or served that sector in some additional capacity (e.g., charter schools). See Sector Rationales below under “How did we alter or augment these data sources?”
Sector of employment (grouped): Sectors of employment were further grouped into a final set of 7 categories that represent the most prevalent sector groups (further described below).
Address: Drawn from aforementioned sources. Unaltered, other than cleaning obvious spelling errors.
Employer: Drawn from aforementioned sources. Unaltered.
Occupation: Drawn from aforementioned sources. Unaltered.
Notes: Created for database. Data are drawn from employment data in aforementioned sources as well as online research. In some cases this field records alterations made to the “Donor Name” field, as explained above.
Contribution To: Drawn from the aforementioned sources. Unaltered, although some cleaning/condensing was needed in cases where candidates filed with different Political Committee names over time (e.g. People for Parker, Friends of Cherelle L. Parker).
Changes to Original Datasets
The original datasets originate from the following sources:
PDF filings of campaign finance reports
TXT filings of campaign finance reports
CSV files from Campaign Finance (CF) Document Search Engine searches for Councilmembers
How did we alter or augment these data sources?
We first organized and cleaned up the Philadelphia Department of Records Campaign Finance Document Search Engine data for All Contributions to Current Councilmembers and Mayor Kenney from 2014 to the Present.
We cleaned up the data in the following ways:
Fixed misspellings and combined donors listed under alternate names into a single name, taking addresses and employers into account to avoid mistakenly combining different individuals. These changes are noted in the “Notes” or “Donor Name-Cleaned” fields.
Eliminated duplicate contribution records by matching donor names, amounts, dates, and addresses, on the cautious assumption that many of these were accidental double entries.
Eliminated near-duplicates, or records that matched in the “Donor Name Original” and “Amount” fields but were dated several days apart. The campaign finance text file records provided by the Department of Records include rows that represent amendments to previous records, yet there are no contribution “IDs”, so it is impossible to tell for sure which records are actually duplicates. We took a conservative approach by assuming that any contributions with the same recipient, donor name, and amount, and nearly the same date, were probably duplicates, unless we could verify that both records were included in the PDF records.
These data were then cross-checked against the TXT files and PDF files where they were available. In many cases the file sources did not match one another and needed to be combined, with any missing files added to the record.
Eliminated any contributions marked “other receipts” in TXT files.
Where TXT files and search engine files did not exist, we transcribed PDF files, drawing from the sections “Schedule I Contributions and Receipts” and “Schedule II - In-Kind Contributions and Valuable Things Received”. We drew data from individual contribution records.
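The near-duplicate rule described above can be sketched in Python. This is a minimal illustration, not the procedure actually used for the database: the record field names (recipient, donor, amount, date, verified_in_pdf) and the 7-day window are assumptions, since the methodology only says "several days". Unlike the exact-duplicate step, this sketch does not compare addresses.

```python
from datetime import date

def drop_near_duplicates(records, window_days=7):
    """Drop records that repeat an earlier one within window_days.

    Two records are treated as probable duplicates when recipient, donor
    name, and amount match and the dates fall within window_days of the
    most recently kept match. window_days=7 is an assumed value.
    Records flagged verified_in_pdf (a hypothetical field) are always kept.
    """
    kept = []
    last_kept = {}  # (recipient, donor, amount) -> date of last kept record
    for rec in sorted(records, key=lambda r: r["date"]):
        key = (rec["recipient"], rec["donor"].strip().lower(), rec["amount"])
        prev = last_kept.get(key)
        is_near = prev is not None and (rec["date"] - prev).days <= window_days
        if is_near and not rec.get("verified_in_pdf", False):
            continue  # probable double entry or amendment row
        kept.append(rec)
        last_kept[key] = rec["date"]
    return kept
```

Because exact duplicates are zero days apart, the same pass also removes them. Keeping the earliest record of each cluster is a design choice made for this sketch only.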
We then categorized or “coded” each donor by sector of the economy according to the following definitions:
Unions: Any entity qualifying as a union
Law: Any entity qualifying as a lawyer or law firm
Politics / Political Committees: Any entity qualifying as a political committee under Philadelphia election rules, with several additions for prominent political operators who do not currently hold office (e.g. Ed Rendell)
Building Industry & Real Estate: Any entity qualifying for membership in the Building Industry Association of Greater Philadelphia, including property managers, developers, realtors, contractors, architects, interior designers, etc.
Food, Beverage & Tobacco: Any entity qualifying as a member of the food, beverage, or tobacco industries, including distributors, grocers, retailers, and restaurants.
Other: For any sectors that did not clearly fit into the above categories
Unknown: For donors where we are not sure of their sector of employment
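As an illustration of how detailed sector labels could be rolled up into the seven grouped categories above, here is a minimal Python sketch. The grouped category names come from the list above; the detailed labels in FINE_TO_GROUPED are hypothetical examples, and the actual coding relied on manual research rather than a lookup table.

```python
GROUPED_SECTORS = (
    "Unions",
    "Law",
    "Politics / Political Committees",
    "Building Industry & Real Estate",
    "Food, Beverage & Tobacco",
    "Other",
    "Unknown",
)

# Hypothetical detailed labels mapped to their grouped category.
FINE_TO_GROUPED = {
    "labor union": "Unions",
    "law firm": "Law",
    "political committee": "Politics / Political Committees",
    "developer": "Building Industry & Real Estate",
    "architect": "Building Industry & Real Estate",
    "restaurant": "Food, Beverage & Tobacco",
    "grocer": "Food, Beverage & Tobacco",
}

def group_sector(sector):
    """Return the grouped category for a detailed sector label."""
    if sector is None or not sector.strip():
        return "Unknown"  # sector of employment could not be determined
    # Labels that fit none of the named groups fall through to "Other".
    return FINE_TO_GROUPED.get(sector.strip().lower(), "Other")
```

For example, group_sector("Architect") returns "Building Industry & Real Estate", while a blank label returns "Unknown".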
Description of Analysis
Note: Analyses include donors above $50 during the period from January 2014 through the present. Listed donors are limited to those who have given $4,000 or more in the same period.
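The two thresholds in this note can be expressed as a short Python sketch. It assumes that both thresholds apply to each donor's total giving over the period, which is one reading of the note and not something the methodology states outright; the record field names (donor, amount, date) are also assumptions.

```python
from collections import defaultdict
from datetime import date

PERIOD_START = date(2014, 1, 1)  # January 2014, per the note
ANALYSIS_MIN = 50                # donors must be above this amount
LISTING_MIN = 4000               # listed donors gave this much or more

def donor_totals(records):
    """Sum each donor's contributions dated on or after PERIOD_START."""
    totals = defaultdict(float)
    for rec in records:
        if rec["date"] >= PERIOD_START:
            totals[rec["donor"]] += rec["amount"]
    return dict(totals)

def split_by_threshold(totals):
    """Return (analyzed, listed) donor totals under the assumed reading."""
    analyzed = {d: t for d, t in totals.items() if t > ANALYSIS_MIN}
    listed = {d: t for d, t in analyzed.items() if t >= LISTING_MIN}
    return analyzed, listed
```

Under this reading, a donor who gave exactly $50 is excluded from the analyses, and a donor whose gifts add up to exactly $4,000 is listed.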
Using Tableau Public, we analyzed the data in the following ways:
Aggregated giving by size of donation (overall/by candidate)
Aggregated giving by donors’ sectors of employment
Ranked fundraising numbers by councilmembers (and the Mayor) and giving numbers by donor
Mapped fundraising by origin of donation:
By Council District
By US Counties
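The aggregations listed above were built in Tableau Public. For readers who want to reproduce similar totals outside Tableau, the following Python sketch shows the same kind of grouping and ranking. The field names (recipient, sector_grouped, amount) and the donation-size bucket edges are assumptions made for illustration and are not taken from the dashboards.

```python
from collections import defaultdict

def total_by(records, key):
    """Sum contribution amounts grouped by key(record)."""
    totals = defaultdict(float)
    for rec in records:
        totals[key(rec)] += rec["amount"]
    return dict(totals)

def ranked(totals):
    """Return (group, total) pairs sorted from largest total to smallest."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

def size_bucket(rec):
    """Assign a contribution to a size bucket (hypothetical edges)."""
    amount = rec["amount"]
    if amount < 100:
        return "under $100"
    if amount < 1000:
        return "$100 to $999"
    return "$1,000 and up"
```

For instance, ranked(total_by(records, lambda r: r["recipient"])) orders recipients by total fundraising, and total_by(records, size_bucket) gives totals by size of donation.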
Each Tableau dashboard allows users to further analyze data. See the pages for each individual dashboard for more details: