Authors:
Andrea Hrckova, Jennifer Renoux, Rafael Tolosana Calasanz, Daniela Chuda, Martin Tamajka, Jakub Simko
Paper:
https://arxiv.org/abs/2408.06847
Introduction
AI research is currently grappling with a reproducibility crisis, which poses significant risks not only to the field itself but also to other scientific domains that increasingly rely on AI systems. This paper investigates the challenges faced by AI doctoral students in Europe, focusing on the reproducibility and responsibility of AI research. The study surveyed 28 PhD candidates from 13 European countries, uncovering critical issues in the quality and findability of AI resources, difficulties in replicating experiments, and the lack of trustworthiness and interdisciplinarity in AI research.
Methodology
The study employed a qualitative approach, conducting semi-structured focus group interviews with 28 doctoral students. The interviews covered various aspects of the doctoral research process, including working with data, conducting experiments, information retrieval, organizing AI resources, and publishing research results. The data were analyzed through manual content analysis, which identified 20 categories of challenges, visualized as mind maps.
Findings
Quality and Findability of Datasets
The foremost challenge identified was the quality and utility of datasets required for training AI models. Issues such as poor annotation, privacy concerns, and difficulties in accessing experts were frequently mentioned. Additionally, the findability of datasets was a significant problem, with many datasets lacking proper documentation and being scattered across multiple platforms.
The Quality of Code
Quality issues also extended to the code used in AI research. Common problems included bugs, missing parts, and discrepancies between the code and the corresponding research papers. Another concern was the lack of motivation to publish well-documented code, since preparing it requires significant time and effort.
Benchmarking and Quality of Models
AI models faced similar challenges, with issues such as missing hyperparameters, inadequate documentation, and the need for specific hardware to run large-scale models. The difficulty in using benchmarking tools to compare different models was also highlighted.
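One common way to reduce the "missing hyperparameters" problem is to save a small run record next to every experiment. The following is a minimal sketch of that idea, not a method proposed in the paper: the function names `build_run_record` and `save_run_record`, the field names, and the example hyperparameter values are all hypothetical, and only the Python standard library is assumed.

```python
import json
import platform
import random
import sys


def build_run_record(hyperparameters, seed):
    """Collect the details a reader would need to re-run an experiment."""
    # Seed the standard-library RNG only; NumPy, PyTorch, etc. need their own seeding.
    random.seed(seed)
    return {
        "hyperparameters": dict(hyperparameters),
        "seed": seed,
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
    }


def save_run_record(record, path):
    """Write the record as sorted, indented JSON so that diffs stay readable."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2, sort_keys=True)


# Hypothetical values, used purely for illustration.
record = build_run_record(
    {"learning_rate": 0.001, "batch_size": 32, "epochs": 10},
    seed=42,
)
```

Calling `save_run_record(record, "run_record.json")` and publishing that file with the code would give readers the settings that the paper text may omit. Hardware requirements could be added as further fields in the same record.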
Other Challenges Regarding AI Research Process
Other challenges identified included the lack of interdisciplinary collaboration, issues with information retrieval, and the minimal involvement of human participants in AI research. Addressing these challenges often requires collaboration with experts from various fields, which is currently lacking.
Perspectives and Recommendations
Discoverability and Quality of AI Resources
To improve the discoverability and quality of AI resources, the paper recommends fostering interdisciplinary collaboration for the governance of AI resources. This involves providing guidelines and encouraging standardized documentation practices.
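As an illustration of what standardized documentation could look like in practice, the sketch below checks a dataset description for a fixed set of fields before release. The field list in `REQUIRED_FIELDS`, the function `missing_fields`, and the example dataset are assumptions made for this example; the paper does not prescribe a specific schema.

```python
# Hypothetical minimum set of documentation fields for a dataset.
REQUIRED_FIELDS = ("name", "description", "license", "source", "annotation_process")


def missing_fields(card):
    """Return the required fields that are absent or blank, in schema order."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = card.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Illustrative dataset description with two gaps.
example_card = {
    "name": "toy-sentiment",
    "description": "Short product reviews with polarity labels.",
    "license": "CC-BY-4.0",
    "source": "",
}

print(missing_fields(example_card))  # ['source', 'annotation_process']
```

A check of this kind could run when a dataset is uploaded to a repository, so that incomplete documentation is flagged before others try to reuse the data.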
Reproducibility of Experiments
The paper calls for the AI research community to embrace reproducibility practices widely and enforce stricter reproducibility policies in journals and conferences. Research institutions should set up guidelines and provide adequate resources to ensure the quality of AI resources and reproducibility of experiments.
Trustworthiness and Interdisciplinarity
To address the lack of interdisciplinarity, research institutions should support interdisciplinary collaboration by ensuring a diverse research staff and providing the means for researchers to collaborate efficiently. This includes involving domain experts, ethicists, and other specialists in AI research.
Conclusions and Discussion
The paper concludes that good research is a collective and multidisciplinary effort that requires discoverable, reproducible, and responsible prior work. Despite growing awareness and initiatives to improve research practices, more effort is needed on both the technological and societal fronts. The study’s findings highlight the need for improved discoverability, reproducibility, and interdisciplinarity in AI research.
Research Ethical Considerations
The research adhered to ethical principles and standards, ensuring the anonymization of personal data and the secure processing of interview data. AI-based tools were used to refine the formal aspects of the language.
Acknowledgements
This work was supported by AI4Europe, a Horizon project funded by the European Commission under GA No. 101070000.