Three Essays on Diversity Dynamics in Science and Culture
- Koch, Bernard Joseph
- Advisor(s): Foster, Jacob G; Panofsky, Aaron L
Abstract
This dissertation contains three distinct papers about the historical dynamics of ideas in cultural and scientific fields.
The first chapter presents a theory of cultural evolution that reconciles dueling perspectives in cultural sociology, which focus on either individuals or public culture. To operationalize the theory, the paper describes new statistical methods to identify the action of evolutionary mechanisms in populations of ideas and cultural objects. These methods are deployed on an original dataset of over 30,000 Metal bands to argue that the history of Metal music has been shaped largely by competition between ideas and the innovation of new subgenres.
The second chapter explores both the stubborn persistence and the impact of hereditarian psychology papers correlating race and intelligence. Using network community detection, we show that this literature is sustained by a diverse collaboration community unified by co-authorships with controversial social psychologist Richard Lynn. In terms of citations, our results show that this work is increasingly marginal within the intelligence psychology community. However, it is cited more widely than mainstream intelligence psychology papers in politicized debates on Reddit, where it proliferates through copied-and-pasted bibliographies.
The third chapter characterizes changes in the diversity of datasets used in Artificial Intelligence/Machine Learning Research (AI/MLR) between 2015 and 2020. Using an original dataset of over 60,000 papers citing 4,384 of the most commonly used datasets within the field, my collaborators and I find that the diversity of AI/MLR datasets is decreasing over time. Moreover, while new datasets are often created specifically for certain tasks, researchers prefer to borrow high-profile datasets from other research communities. Finally, we find that more than 50% of dataset usages in papers can be ascribed to datasets created at just 12 elite universities and corporations. Ethical and epistemic implications are discussed.