NLP in Design Evaluation
If a Picture is Worth 1000 Words, is a Word Worth 1000 Features for Design Metric Estimation?
1MIT 2The Pennsylvania State University
When evaluating designs, we aim to capture a range of information, including the usefulness, uniqueness, and novelty of a design. The subjective nature of these concepts makes them difficult to evaluate. Despite this, many metrics have been developed to do so, because design evaluation is integral to innovation and the creation of novel solutions. The most common metrics are the consensual assessment technique (CAT) and the Shah, Vargas-Hernandez, and Smith (SVS) method. While CAT is accurate and often regarded as the "gold standard," it relies heavily on expert ratings as a basis for judgement, which makes it expensive and time-consuming. By comparison, SVS is less resource-demanding, but it is often criticized as lacking sensitivity and accuracy. We aim to take advantage of the distinct strengths of both methods through machine learning. More specifically, this study investigates whether machine learning can facilitate automated creativity assessment. The SVS method yields a text-rich dataset about a design. In this paper, we use these textual design representations, and the deep semantic relationships that words and sentences encode, to predict more desirable design metrics, including CAT metrics. We demonstrate that machine learning models can predict design metrics from the design itself and SVS survey information. We also demonstrate that incorporating natural language processing (NLP) improves prediction results across all of our design metrics, and that clear distinctions exist in how predictable certain metrics are.
The figure above shows the overall architecture of our model. We experiment with different methods to predict the five different design metrics: Usefulness, Elegance, Drawing, Uniqueness, and Creativity. Our different methods stem from using three different representations of a design, as well as three different regression models. The different design representations are derived from data available in the design itself and the unprocessed SVS features.
For each design, we start with a dataset of SVS features in numerical form and a written text description provided by the designer. We hypothesized that applying natural language processing (NLP) to the survey features and text description would create a design representation that predicts design metrics more effectively. For this design representation, we convert the numerical SVS features into text and combine that text with the original description to obtain an all-text representation of the design. We encode this text representation with TensorFlow's Universal Sentence Encoder to obtain a numerical text embedding for each design. We input this text embedding into a regression model to predict five expert-rated design metrics: Usefulness, Elegance, Drawing, Uniqueness, and Creativity.
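The conversion from numerical features to an all-text representation can be illustrated with a minimal sketch. This is not the authors' code: the sentence template, the feature names, and the example description are all hypothetical, and `toy_embed` is only a dependency-free stand-in for a sentence encoder (hashed word counts), not the Universal Sentence Encoder.

```python
import hashlib
from typing import Dict, List, Sequence


def svs_features_to_text(features: Dict[str, float]) -> str:
    """Render numerical survey features as short sentences.

    The "<name> is <value>." template is an assumption made for illustration.
    """
    return " ".join(f"{name} is {value:g}." for name, value in features.items())


def build_text_representation(features: Dict[str, float], description: str) -> str:
    """Combine the designer's description with the verbalized features."""
    return f"{description.strip()} {svs_features_to_text(features)}"


def toy_embed(texts: Sequence[str], dim: int = 16) -> List[List[float]]:
    """Stand-in encoder: hashed bag-of-words counts, one vector per text.

    A real pipeline would replace this with a pretrained sentence encoder.
    """
    vectors = []
    for text in texts:
        vec = [0.0] * dim
        for token in text.lower().split():
            # Map each token to a fixed bucket with a stable hash.
            bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
            vec[bucket] += 1.0
        vectors.append(vec)
    return vectors


# Hypothetical design: feature names and description are invented examples.
features = {"number of components": 3, "uses electricity": 1}
text = build_text_representation(features, "A hand-cranked milk frother. ")
embedding = toy_embed([text])[0]
```

The resulting vector plays the role of the text embedding described above; it would then be passed, together with the expert ratings, to whichever regression model is being evaluated.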
Edwards, Kristen M., Aoran Peng, Scarlett R. Miller, and Faez Ahmed. "If A Picture Is Worth 1000 Words, Is A Word Worth 1000 Features For Design Metric Estimation?" In Proceedings of the International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, IDETC-21, virtual, online, 2021.
@inproceedings{edwards2021formfunction,
  title={If A Picture Is Worth 1000 Words, Is A Word Worth 1000 Features For Design Metric Estimation?},
  author={Kristen M. Edwards and Aoran Peng and Scarlett R. Miller and Faez Ahmed},
  booktitle={International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, {IDETC-21}},
  organization={ASME},
  day={17-20},
  month={Aug},
  address={Virtual, Online},
  year={2021}
}
We would like to thank the Ida M. Green Fellowship for supporting Kristen Edwards’s research.