The Business Diversity project is a significant part of the Social and Impact Data Commons project. The Data Commons is an open knowledge repository that compiles data from trusted open-access sources and provides tools for tracking issues over time and across geographic locations. Our project aims to help policymakers and outreach programs track economic diversity, with a focus on minority-owned businesses, within Fairfax County, Virginia.
How is a minority-owned business defined? A minority-owned business is a US-based enterprise predominantly owned (51% or more) by one or more members of a socially and economically disadvantaged minority group based on race.
Figure 1: Fairfax County Census Tracts
Microdata enables the study of minority-owned business activity at small geographic levels, but current business microdata sources do not adequately identify minority-owned businesses. A case study conducted in Fairfax County revealed that the Annual Business Survey (ABS) reported approximately 38% of businesses as minority-owned in 2017. For the same period, Mergent Intellect, our primary data source, reported only 7%. Although we do not have Mergent Intellect’s methodology for identifying minority-owned businesses, our preliminary findings suggest that Mergent Intellect includes only registered minority-owned businesses and therefore underrepresents those that are not registered. The inconsistency across these sources leads us to ask the following question.
Motivating Question: How are minority-owned businesses distributed geographically across Fairfax County?
To help answer this research question, our goal was to create a binary classification model that reduces the error in predicting and tracking minority business ownership in Fairfax County, thereby accounting for the underrepresentation in Mergent Intellect’s data. Our classification model consists of three inputs:
We also kept ethical considerations in mind. For this reason, we chose a binary classification model and do not disclose any business owner's racial identifiers, which safeguards the model's intended application.
We applied our final classification model to the non-flagged businesses, that is, the businesses Mergent Intellect does not mark as minority-owned. Doing so increased the percentage of minority-owned businesses reported for Mergent Intellect to 41.75%. We also reduced Mergent Intellect's error in misclassified businesses by 12%.
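The adjustment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual pipeline: the `Business` record, its field names, the 0.5 threshold, and the sample data are all hypothetical. The idea is that a business counts as minority-owned if the source already flags it, or if it is non-flagged and the binary classifier predicts minority ownership.

```python
from dataclasses import dataclass


@dataclass
class Business:
    name: str
    flagged_minority_owned: bool  # flag as reported by the source database
    predicted_probability: float  # hypothetical classifier score in [0, 1]


def minority_owned_share(businesses, threshold=0.5):
    """Return the fraction of businesses counted as minority-owned.

    A business counts if the source already flags it, or if it is not
    flagged and its classifier score meets the (assumed) threshold.
    """
    if not businesses:
        raise ValueError("businesses must not be empty")
    counted = sum(
        1
        for b in businesses
        if b.flagged_minority_owned or b.predicted_probability >= threshold
    )
    return counted / len(businesses)


# Made-up records for illustration only; not project data.
sample = [
    Business("A", True, 0.0),
    Business("B", False, 0.8),
    Business("C", False, 0.2),
    Business("D", False, 0.6),
    Business("E", False, 0.1),
]

reported_share = sum(b.flagged_minority_owned for b in sample) / len(sample)
adjusted_share = minority_owned_share(sample)
print(f"reported: {reported_share:.0%}, adjusted: {adjusted_share:.0%}")
```

Only the binary prediction enters the count, so no racial identifier for any individual owner needs to be stored or shown.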
This project is sponsored by the Mastercard Center for Inclusive Growth, and the stakeholders are the Fairfax County Government Economic Development Authority. It is one portion of the greater Data Commons project, which aims to build an open knowledge repository.
Decoding State-County Census Tracts versus Tribal Census Tracts: https://www.census.gov/newsroom/blogs/random-samplings/2012/07/decoding-state-county-census-tracts-versus-tribal-census-tracts.html
Yelp Business Review Company:
Mergent Intellect (MI) Database:
https://www.mergentintellect.com/index.php/search/index
The Virginia Small Business Supply Directory (SBSD):
Chamber of Commerce:
Hispanic: https://www.novahispanicchamber.com/
Black: https://www.northernvirginiabcc.org/
Asian: https://www.aabac.org/
Fairfax County ACS (American Community Survey) census:
https://www.census.gov/quickfacts/fact/table/fairfaxcountyvirginia/POP010220#POP010220
North Carolina Voter Registration Data (Statewide and County Level data available): https://www.ncsbe.gov/results-data/voter-registration-data
Natural Language Processing Documentation:
RaceBert: https://pypi.org/project/racebert/
Rethnicity: https://www.sciencedirect.com/science/article/pii/S2352711021001874
Ethnicolr: https://ethnicolr.readthedocs.io/ethnicolr.html#install
spaCy: https://spacy.io/
LangDetect: https://pypi.org/project/langdetect/
1964 Information: https://storymaps.arcgis.com/stories/f74a8fbad837435b8e901cc9c04aa345
1964 Information: https://www.richmondfed.org/-/media/richmondfedorg/publications/research/econ_focus/2004/winter/pdf/economic_history.pdf