NLP in Design Evaluation
If a Picture is Worth 1000 Words, is a Word Worth 1000 Features for Design Metric Estimation?
1MIT 2The Pennsylvania State University
When evaluating designs, we aim to capture a range of information, including the usefulness, uniqueness, and novelty of a design. The subjective nature of these concepts makes their evaluation difficult. Despite this, many metrics have been developed to do so, because design evaluation is integral to innovation and the creation of novel solutions. The most common metrics are the consensual assessment technique (CAT) and the Shah, Vargas-Hernandez, and Smith (SVS) method. While CAT is accurate and often regarded as the "gold standard," it relies heavily on expert ratings as a basis for judgment, making it expensive and time-consuming. By comparison, SVS is less resource-demanding, but it is often criticized as lacking sensitivity and accuracy. We aim to take advantage of the distinct strengths of both methods through machine learning. More specifically, this study investigates the possibility of using machine learning to facilitate automated creativity assessment. The SVS method results in a text-rich dataset about a design. In this paper, we utilize these textual design representations and the deep semantic relationships that words and sentences encode to predict more desirable design metrics, including CAT metrics. We demonstrate the ability of machine learning models to predict design metrics from the design itself and SVS survey information. We demonstrate that incorporating natural language processing (NLP) improves prediction results across all of our design metrics and that clear distinctions in the predictability of certain metrics exist.
The figure above shows the overall architecture of our model. We experiment with different methods to predict the five design metrics: Usefulness, Elegance, Drawing, Uniqueness, and Creativity. Our different methods stem from using three different representations of a design as well as three different regression models. The different design representations are derived from data available in the design and the unprocessed SVS features.
From a design, we initially have a dataset of SVS features in numerical form as well as a written text description provided by the designer. We hypothesized that applying natural language processing (NLP) to the survey features and text descriptions would create a design representation that can more effectively predict design metrics. For this design representation, we convert the numerical SVS features into text and combine that text with the original description to obtain an all-text representation of the design. We encode this text representation with TensorFlow's Universal Sentence Encoder to obtain a numerical text embedding for each design. We input this text embedding into a regression model to predict five expert-acquired design metrics: Usefulness, Elegance, Drawing, Uniqueness, and Creativity.
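The text-representation step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the feature names, the sentence template, and the helper names (`svs_features_to_text`, `build_text_representation`, `toy_embed`) are all assumptions made for the example. In particular, `toy_embed` is a hashed bag-of-words stand-in used only so the sketch runs without external dependencies; it is not the Universal Sentence Encoder, and the regression step is omitted.

```python
import hashlib
import math
from typing import Dict, List


def svs_features_to_text(features: Dict[str, float]) -> str:
    """Render numerical survey features as short plain-text sentences.

    The "<name> is <value>." template is a hypothetical choice.
    """
    return " ".join(
        f"{name.replace('_', ' ')} is {value:g}." for name, value in features.items()
    )


def build_text_representation(features: Dict[str, float], description: str) -> str:
    """Combine the designer's description with the textualized features."""
    return f"{description.strip()} {svs_features_to_text(features)}"


def toy_embed(text: str, dim: int = 16) -> List[float]:
    """Stand-in for a sentence encoder: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Map each token to a bucket with a stable hash.
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


# Hypothetical design: the feature names and values are made up for illustration.
design_text = build_text_representation(
    {"number_of_features": 3, "has_moving_parts": 1},
    "A hand-cranked milk frother.",
)
embedding = toy_embed(design_text)
```

In a pipeline like the one described, `toy_embed` would be replaced by a call to the sentence encoder, and the resulting vectors would be passed to a regression model, one target per design metric.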
Edwards, Kristen M., Aoran Peng, Scarlett R. Miller, and Faez Ahmed. "If A Picture Is Worth 1000 Words, Is A Word Worth 1000 Features For Design Metric Estimation?" In Proceedings of the International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, IDETC-21, virtual, online, 2021.
@inproceedings{edwards2021formfunction,
title={If A Picture Is Worth 1000 Words,
Is A Word Worth 1000 Features For Design Metric Estimation?},
author={Kristen M. Edwards and Aoran Peng and Scarlett R. Miller and Faez Ahmed},
year={2021},
booktitle={International Design Engineering Technical Conferences and Computers and Information in Engineering Conference,
{IDETC-21}},
organization={ASME},
day = {17-20},
month = {Aug},
address = {Virtual,
Online},
}
We would like to thank the Ida M. Green Fellowship for supporting Kristen Edwards’s research.