- Sawaya, Nicolas PD;
- Marti-Dafcik, Daniel;
- Ho, Yang;
- Tabor, Daniel P;
- Neira, David E Bernal;
- Magann, Alicia B;
- Premaratne, Shavindra;
- Dubey, Pradeep;
- Matsuura, Anne;
- Bishop, Nathan;
- De Jong, Wibe A;
- Benjamin, Simon;
- Parekh, Ojas D;
- Tubman, Norm M;
- Klymko, Katherine;
- Camps, Daan
Large datasets of problem instances have long proven valuable for analyzing computer hardware, software, and algorithms. A notable example is ImageNet [1], a vast repository of images that has been instrumental in testing numerous deep learning packages. Similarly, in computational chemistry and materials science, extensive datasets such as the Protein Data Bank [2], the Materials Project [3], and QM9 [4] have greatly facilitated the evaluation of new algorithms and software approaches, while also promoting standardization within the field. Such well-defined datasets and problem instances, in turn, serve as the foundation for benchmarking suites like MLPerf [5] and LINPACK [6], [7], which enable fair and rigorous comparisons of different methodologies and solutions, fostering continued advances across computer science and beyond.