Both inside corporations and in self-organized online communities, globally distributed groups of thousands of people now collaborate on design projects over the Internet. This changes the nature of product design, creating potential for new levels of innovation and product development speed, such as developing vehicle designs in less than four months or implementing new business models for urban revitalization in less than a year. However, the volume of information these communities create comes at a price: individuals cannot process all of it in a reasonable time frame. Without a means of harnessing their collective efforts, collaborative design communities can never reach their full potential as engines of design innovation and development. To address this problem, this dissertation applies techniques from data science and machine learning to answer the following central question:
How can online design communities effectively use the design data they generate to help manage their operations and improve their designs?
Specifically, it presents examples drawn from particular design communities (OpenIDEO and HCD Connect) and some of the challenges they face: How can a design community remain sustainable and creative without centralized command? How do designers locate the most relevant or creative inspirations out of thousands of ideas? How do novice designers use the community to learn which design methods are appropriate for a given problem? How can a community scaffold novice designers so that they can contribute meaningfully without requiring full expert knowledge? By framing these real-world problems through the lens of Network Analysis, the Maximum Coverage Problem, and Recommender Systems, this dissertation demonstrates how modern machine learning techniques can ameliorate the issues community members face in practice.
From a computational perspective, it finds that the complexity of solving many distributed design tasks necessitates looking not only at a design itself, but also at how it is situated in a human community; human relationships play as big a part in predicting a design's success as the content of the design itself. From a human perspective, it finds that data-assisted techniques can adapt to human behavior in ways that improve the collaborative structure of large teams, the relevance of methods used for design problems, and the number and variety of ideas that need to be explored by a designer.
The dissertation's findings imply that customizing data science techniques to take advantage of the socially embedded nature of design benefits designers and scientists alike, not only by making design teams more effective but also by providing deeper insight into how humans design the way they do. They point to a future where data-driven design tools are not just a means to an end, but a critical part of how we understand our own needs and creations; where science can be applied, not just to the creation, but to the process of creation.