- Pereira, Tania
- Morgado, Joana
- Silva, Francisco
- Pelter, Michele M
- Dias, Vasco Rosa
- Barros, Rita
- Freitas, Cláudia
- Negrão, Eduardo
- de Lima, Beatriz Flor
- da Silva, Miguel Correia
- Madureira, António J
- Ramos, Isabel
- Hespanhol, Venceslau
- Costa, José Luis
- Cunha, António
- Oliveira, Hélder P
Artificial intelligence (AI)-based solutions have revolutionized our world, using extensive datasets and computational resources to create automatic tools for complex tasks that, until now, have been performed by humans. Massive datasets are fundamental to the most powerful AI-based algorithms. For AI-based healthcare solutions, however, several socioeconomic, technical/infrastructural, and, most importantly, legal restrictions limit the large-scale collection of and access to biomedical data, especially medical imaging. To overcome this limitation, several alternative solutions have been suggested, including transfer learning approaches, generation of artificial data, adoption of blockchain technology, and creation of an infrastructure composed of anonymous and abstract data. However, none of these strategies is currently able to solve the challenge completely. The need to build large datasets that can be used to develop healthcare solutions deserves special attention from the scientific community, clinicians, all healthcare stakeholders, engineers, ethicists, legislators, and society in general. This paper offers an overview of the data limitation in medical predictive models, its impact on the development of healthcare solutions, and the benefits and barriers of sharing data; finally, it suggests future directions to overcome data limitations in the medical field and enable AI to enhance healthcare. This perspective focuses on the technical requirements of learning models: it explains the limitations that arise from small, poor-quality datasets in the medical domain and the technical options that attempt to, or could, solve the problem of the lack of massive healthcare data.