Engineering Datasets

A Curated List of Engineering Design Datasets by the DeCoDE lab.

Language Modelling Datasets

Datasets for Large Language Models in Engineering Design

VLM dataset - A dataset of 1000+ tasks to evaluate vision language models.

LINKS dataset - A dataset of 100 million planar linkage mechanisms and 1.1 billion coupler curves obtained from kinematic simulations. The dataset also contains curated curves, 100 million negative samples, and publicly available simulation software.

BIKED dataset - A dataset of 4,512 bicycle models with parametric data, images, and segmented component images.

BIKED++ dataset - A multimodal dataset of 1.4 million bicycle images and parametric CAD designs.

FRAMED dataset - A dataset of 4,500 bicycle frames and ten performance metrics obtained from structural simulations.

Aircraft Lift and Drag dataset - A dataset of lift and drag performance values of 4,045 3D aircraft models from ShapeNet.

Turbo-compressors dataset - A dataset of 22 million turbo-compressors and their performance under different operating conditions.

DrivAerNet++ - A large-scale multimodal dataset of 8,000 detailed 3D car meshes and aerodynamic performance data comprising full 3D pressure fields, velocity fields, and wall-shear stresses, along with point clouds and part annotations.

DrivAerNet dataset - A dataset of 4,000 detailed 3D car meshes and aerodynamic performance data comprising full 3D pressure fields, velocity fields, and wall-shear stresses.

Autosurf aircraft dataset - A dataset of 1,050 airplane models with segmentation labels, created using NASA's Open Vehicle Sketch Pad (OpenVSP).

Car Drag Coefficient - A dataset of 4,948 3D car meshes, their renderings, and their drag coefficients.

Structural and Topology Optimization

Explore Topology Optimization papers

Topodiff topology optimization dataset - A dataset of 33,000 images corresponding to optimal topologies for diverse boundary conditions. The dataset also contains their physical fields, compliance values, and an additional 42,000 non-optimal topologies.

Design Geometry

Explore design geometry datasets

SHIP-D dataset - A dataset of 30,000 ship hulls, each with design and functional performance information, including parameterization, mesh, point-cloud, and image representations, as well as 32 hydrodynamic drag measures under different operating conditions.

Airfoil dataset - A synthetic dataset of 48,503 airfoils and their aerodynamic performance computed using OpenFOAM.

Sketches

Explore sketch datasets

Milk Frother dataset - A multimodal dataset of 1,126 milk frother sketches and their text descriptions. The dataset is derived from a milk frother dataset collected at the Brite lab.

Other datasets

A repository that leads to other engineering datasets

Other engineering datasets - A collection of datasets from the engineering design community, curated for our JMD review paper. Note that this list was made in 2022 and is not regularly updated.

Harvard Dataverse Link

Link to our repository of datasets on Harvard Dataverse