A dataset of a hundred million planar linkage mechanisms for data-driven kinematic design

Amin Heyrani Nobari1, Akash Srivastava2, Dan Gutfreund2, Faez Ahmed1

1MIT  2MIT-IBM Watson AI Lab, IBM Research 

The Basics

Below you can find the most up-to-date information on this project:

Importance of Data In Data-Driven Engineering Design

Large public datasets such as ImageNet, MNIST, and CIFAR-10 are widely credited as one of the leading factors behind the success of machine learning in computer vision. In contrast, a major roadblock for data-driven methods in engineering design as a whole, not only in inverse kinematic design, is the lack of large public datasets. The data used in existing data-driven kinematic design approaches is limited in both size (most current databases include tens of thousands of mechanisms (Deshpande & Purwar, 2019a;b; 2020)) and mechanism complexity (limited to 4-bars, 6-bars, etc.). There is therefore a need for large public datasets for kinematic design, which can enable high-performing data-driven models, provide a library of designs for practitioners to study, and establish benchmark problems for future work. Producing very large datasets for kinematic design is challenging for two reasons: it requires a representation scheme that does not waste resources on infeasible designs, and it requires a large amount of computation to simulate the movement of all linkages and to ensure diversity in the dataset. To address these gaps, we introduce LINKS, a dataset of 100 million one-degree-of-freedom (1-DOF) planar linkage mechanisms with complexity of up to 20 linkage joints. LINKS is created with a primary focus on the “Path Generation” problem, which is the problem of designing a linkage mechanism that generates a particular path described by a finite series of point coordinates. The dataset is made up of a large number of mechanisms and the simulated coupler paths traced by each joint of those mechanisms. It can, however, be extended easily to other problems such as “Function Generation” and “Motion Generation”.

Challenges of Generating LINKS

To produce LINKS, we overcome several significant roadblocks. First, we create an efficient generation scheme for randomly sampling valid mechanisms. To do this, we introduce a new operator that is guaranteed to create valid, non-degenerate, and non-locking mechanisms without redundancies. We show that the proposed operator is more efficient than a widely used operator from the literature. Second, we develop an efficient forward simulation algorithm, which is both vectorized and parallelized, enabling us to simulate mechanisms on a multi-core system in half a second, compared to the 454 seconds needed by a single-threaded, non-vectorized solver. As we discuss later, the algorithm randomly samples many parameters, and most of the resulting simulations end in locking mechanisms. For example, randomly sampled mechanisms with more than ten joints are infeasible (locking) 99% of the time, so the simulator must run about a hundred simulations for each item added to the dataset. Creating an efficient simulator allowed us to generate the LINKS dataset in hours rather than months. A third challenge in generating a dataset of linkages and associated coupler paths is the extreme skew in the types of shapes the coupler paths form. We observe that two types of shapes, circles and arcs, make up 62% of the paths traced. These two shapes are less interesting from the perspective of inverse kinematic synthesis, as theoretical solutions for them are easily obtained. We detect and filter these shapes to address this issue, which leads to two datasets: a raw dataset containing everything, and a curated subset in which 99.5% of these two shapes and their associated mechanisms are randomly removed.
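To make the ideas of vectorized simulation and locking concrete, here is a minimal sketch for the simplest case, a four-bar linkage. It is not the LINKS solver: the function name `simulate_fourbars`, its parameters, and the circle-intersection approach are assumptions chosen for illustration. A whole batch of mechanisms is simulated for every crank angle at once with NumPy broadcasting, and a mechanism is flagged as locking if the coupler and rocker cannot close the loop at some crank angle.

```python
import numpy as np

def simulate_fourbars(pivot_a, pivot_b, crank, coupler, rocker, n_steps=200):
    """Trace the coupler-rocker joint of N four-bars over one crank revolution.

    pivot_a, pivot_b: (N, 2) ground pivots (pivot_a carries the driven crank).
    crank, coupler, rocker: (N,) link lengths.
    Returns (paths, valid): paths is (N, n_steps, 2); valid is (N,) bool and is
    False for mechanisms that lock somewhere in the revolution.
    """
    theta = np.linspace(0.0, 2.0 * np.pi, n_steps, endpoint=False)
    unit = np.stack([np.cos(theta), np.sin(theta)], axis=-1)        # (T, 2)
    tip = pivot_a[:, None, :] + crank[:, None, None] * unit[None]   # (N, T, 2)

    # The traced joint lies on a circle of radius `coupler` around the crank
    # tip and on a circle of radius `rocker` around pivot_b.
    to_b = pivot_b[:, None, :] - tip
    dist = np.linalg.norm(to_b, axis=-1)                            # (N, T)
    safe = np.maximum(dist, 1e-12)                                  # avoid 0-division
    r1, r2 = coupler[:, None], rocker[:, None]
    along = (safe**2 + r1**2 - r2**2) / (2.0 * safe)
    h_sq = r1**2 - along**2

    # A negative h_sq means the two circles do not intersect: the linkage locks.
    valid = np.all((h_sq >= 0.0) & (dist > 1e-12), axis=1)
    height = np.sqrt(np.clip(h_sq, 0.0, None))

    u = to_b / safe[..., None]
    perp = np.stack([-u[..., 1], u[..., 0]], axis=-1)
    paths = tip + along[..., None] * u + height[..., None] * perp
    return paths, valid

# Two hypothetical mechanisms sharing the same ground pivots.
pivot_a = np.zeros((2, 2))
pivot_b = np.array([[4.0, 0.0], [4.0, 0.0]])
paths, valid = simulate_fourbars(
    pivot_a, pivot_b,
    crank=np.array([1.0, 3.0]),
    coupler=np.array([3.5, 1.0]),
    rocker=np.array([3.0, 1.0]),
)
```

In this sketch the rows of `paths` for which `valid` is False should be discarded, which mirrors the rejection of locking mechanisms described above. Larger mechanisms would need the same intersection step applied joint by joint in a solvable order, which this example does not attempt.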

Generative Scheme

To generate a very large dataset of planar linkage mechanisms, we take a random sampling approach. This is analogous to stochastically searching within the design space. The J operator enables us to create mechanisms consisting only of simple kinematic loops. Besides the topology, initial positions are also needed to simulate a mechanism, as different initial positions can lead to completely different coupler path shapes. We therefore separate the generation process into two distinct steps. The first step is to generate valid 1-DOF topologies. The second step is to find different initial positions for each topology such that the resulting mechanisms are valid, that is, not locking. An overview of our approach is illustrated in the figure below.
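The first of the two steps can be illustrated with a small sketch. The J operator itself is not specified on this page, so the code below substitutes a plain dyad-addition rule as an assumption: starting from a driven crank, each new moving joint is connected by two links to two existing joints, at least one of which is moving. The names `random_dyad_topology` and `mobility` are hypothetical, and the mobility check is only a joint-and-link count (two coordinates per moving joint, one constraint per link), not a guarantee against degenerate or locking configurations.

```python
import random

def random_dyad_topology(n_joints, n_ground=2, seed=None):
    """Grow a random linkage graph by repeatedly adding dyads.

    Joint 0 is the grounded motor pivot, joint 1 is the crank tip, and joints
    2..n_ground are the remaining ground pivots. Every later joint is moving
    and is linked to two distinct existing joints.
    Returns (edges, is_ground).
    """
    if n_ground < 1 or n_joints < n_ground + 1:
        raise ValueError("need at least one ground joint and one moving joint")
    rng = random.Random(seed)
    is_ground = [False] * n_joints
    is_ground[0] = True
    for g in range(2, n_ground + 1):
        is_ground[g] = True

    edges = [(0, 1)]  # the driven crank
    for new in range(n_ground + 1, n_joints):
        moving = [k for k in range(new) if not is_ground[k]]
        first = rng.choice(moving)  # at least one parent must move
        second = rng.choice([k for k in range(new) if k != first])
        edges.append((first, new))
        edges.append((second, new))
    return edges, is_ground

def mobility(edges, is_ground):
    """Count-based mobility: 2 coordinates per moving joint minus 1 per link."""
    n_moving = sum(1 for g in is_ground if not g)
    return 2 * n_moving - len(edges)

edges, is_ground = random_dyad_topology(n_joints=20, n_ground=2, seed=0)
print(mobility(edges, is_ground))  # the count stays at 1 for any seed
```

Because each dyad adds one moving joint (two coordinates) and two links (two constraints), the count stays at one no matter how many joints are added. The second step, sampling initial positions and rejecting locking mechanisms, would then require a simulator and is not covered here.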

A Case Study of Shape Retrieval

One of the straightforward uses of LINKS is as a numerical atlas, or library of candidates, for mechanism retrieval. However, as our dataset contains more than a billion shapes, comparing a target path to every shape would be arduous and impractical. In this section, we present a simplified baseline approach to mechanism retrieval. Our goal is to demonstrate the depth of our dataset’s coverage by selecting random shape queries and identifying mechanisms in the dataset that can create a shape similar to each query. We first come up with challenging problems by generating new random mechanisms (as we did for the dataset) and picking a sample of their paths as queries. Note that these paths do not exist in the dataset. We use the chamfer distance to compare normalized shapes to our targets, and we set a threshold below which a path is considered close enough to a given target query (chamfer distance below 0.03). Our strategy for shape retrieval is to shuffle the curated path dataset and go through the data until three candidates are found for each target shape.
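The retrieval baseline described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the code used for the case study: the normalization (centering and scaling to unit RMS radius), the symmetric mean form of the chamfer distance, and the names `normalize`, `chamfer_distance`, and `retrieve` are all choices made here, so the 0.03 threshold may not correspond exactly to the threshold used with the original normalization.

```python
import numpy as np

def normalize(path):
    """Center a (T, 2) path and scale it to unit RMS radius (assumed scheme)."""
    centered = path - path.mean(axis=0)
    scale = np.sqrt((centered**2).sum(axis=1).mean())
    return centered / scale if scale > 0 else centered

def chamfer_distance(p, q):
    """Symmetric chamfer distance between point sets p (T1, 2) and q (T2, 2)."""
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)  # (T1, T2)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def retrieve(target, paths, threshold=0.03, n_candidates=3, seed=0):
    """Scan the paths in shuffled order until n_candidates matches are found."""
    rng = np.random.default_rng(seed)
    target_n = normalize(target)
    hits = []
    for idx in rng.permutation(len(paths)):
        if chamfer_distance(normalize(paths[idx]), target_n) < threshold:
            hits.append(int(idx))
            if len(hits) == n_candidates:
                break
    return hits

# Toy library: three rescaled or shifted ellipses, one line, one figure-eight.
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
ellipse = np.stack([2.0 * np.cos(t), np.sin(t)], axis=-1)
library = [
    ellipse * 3.0 + 5.0,
    np.stack([t, t], axis=-1),
    ellipse * 0.5 - 1.0,
    np.stack([np.cos(t), np.sin(2.0 * t)], axis=-1),
    ellipse,
]
hits = retrieve(ellipse, library)
```

Stopping at the first three matches in a shuffled scan avoids comparing the query against the full dataset. Note that this normalization removes translation and scale but not rotation, so rotated copies of a shape would not match in this sketch.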

Online Demo

We also made a fun online tool for simulating mechanisms with simple kinematic loops and provide some challenging problems you can attempt right now!



Heyrani Nobari, A, Srivastava, A, Gutfreund, D, & Ahmed, F. LINKS: A Dataset of a Hundred Million Planar Linkage Mechanisms for Data-Driven Kinematic Design. Proceedings of the ASME 2022 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. Volume 3A: 48th Design Automation Conference (DAC). St. Louis, Missouri, USA. August 14–17, 2022. V03AT03A013. ASME.


     author = {Amin Heyrani Nobari and Akash Srivastava and Dan Gutfreund and Faez Ahmed},
     title = {LINKS: A Dataset of a Hundred Million Planar Linkage Mechanisms for Data-Driven Kinematic Design},
     volume = {Volume 3A: 48th Design Automation Conference (DAC)},
     series = {International Design Engineering Technical Conferences and Computers and Information in Engineering Conference},
     year = {2022},
     month = {08},
     doi = {10.1115/DETC2022-89798},
     url = {},
     note = {V03AT03A013},


The authors acknowledge the MIT SuperCloud and Lincoln Laboratory Supercomputing Center for providing HPC resources that have contributed to the research results reported within this paper.