Vol.:(0123456789)

Journal of Medical Systems           (2025) 49:19  
https://doi.org/10.1007/s10916-025-02140-z

RESEARCH

Prediction of the Risk of Adverse Clinical Outcomes with Machine 
Learning Techniques in Patients with Noncommunicable Diseases

Alejandro Hernández‑Arango1,2,6 · María Isabel Arias2,3 · Viviana Pérez2 · Luis Daniel Chavarría2,4 · Fabian Jaimes5

Received: 1 August 2024 / Accepted: 2 January 2025 
© The Author(s) 2025

Abstract
Decision-making in chronic diseases guided by clinical decision support systems that use models including multiple variables 
based on artificial intelligence requires scientific validation in different populations to optimize the use of limited human, 
financial, and clinical resources in healthcare systems worldwide. This cohort study evaluated three machine learning algo-
rithms—XGBoost, Elastic Net logistic regression, and an Artificial Neural Network—to develop a prediction model for three 
outcomes: mortality, hospitalization, and emergency department visits. The objective was to build a clinical decision support 
system for patients with noncommunicable diseases treated at the Alma Mater Hospital complex in Medellín, Colombia. 
We collected 4845 electronic medical record entries from 5000 patients included in the study. The median age was 71.83 
years, with 63.8% women and 29.7% receiving home care. The most prevalent medical conditions were diabetes (52.9%), 
hypertension (67.2%), dyslipidemia (57.3%), and COPD (19.4%). For mortality prediction, the Elastic Net logistic regression 
model achieved an AUCROC of 0.883 (95% CI: 0.848–0.917), the XGBoost model reached an AUCROC of 0.896 (95% CI: 
0.865–0.927), and the Neural Network achieved 0.886 (95% CI: 0.853–0.916). For hospitalization, the Elastic Net model had 
an AUCROC of 0.952 (95% CI: 0.937–0.965), the XGBoost model achieved 0.963 (95% CI: 0.952–0.974), and the Neural 
Network scored 0.932 (95% CI: 0.915–0.948). For emergency department visits, the AUCROC values were 0.980 (95% CI: 
0.971–0.987) for Elastic Net, 0.977 (95% CI: 0.967–0.986) for XGBoost, and 0.976 (95% CI: 0.968–0.982) for the neural 
network. A dashboard was developed to interact with an ensemble risk categorization segmenting patient risk in the cohort 
to aid in clinical decision-making. A clinical decision support system based on artificial intelligence using electronic medical 
records possibly can help segmenting the risk in populations with Noncommunicable Diseases for effective decision-making. 

Keywords  Clinical decision support system · Predictive models · Mortality · Emergency consultation · Hospitalization · 
Artificial intelligence

Introduction

 The increase in demand for health services from people with 
Noncommunicable Diseases represents a challenge for the 
health systems of many countries and ours is no exception. 
In Colombia, the high costs of chronic diseases are reflected 
in their diagnosis and treatment, which are characterized by 
being prolonged, complex and affecting the economically 
active population. Given that in many cases their diagnosis 
and intervention are late, in addition to the costs for the sys-
tem, a burden is generated for the patient’s health and the 
stability of his family [1].

An appropriate strategy for this chronic disease problem 
may be to use a model of care based on risk stratification. 
Stratification is defined as “the identification and/ or group-
ing of patients according to risk or severity classification”, 

 *	 Alejandro Hernández‑Arango 
	 alejandro.hernandeza@udea.edu.co

1	 Department of Internal Medicine, University of Antioquia, 
Medellín, Colombia

2	 Hospital Alma Mater de Antioquia, University of Antioquia, 
Medellín, Colombia

3	 Health Information Systems Professional Living Lab. , 
Medellín, Colombia

4	 Data Scientist, National University, Medellín, Colombia
5	 Department of Internal Medicine, University of Antioquia, 

Medellín, Colombia
6	 Faculty of Medicine, Department of Internal Medicine, 

Hospital Alma Mater de Antioquia, University of Antioquia, 
University of Antioquia, Carrera 51 A # 62 – 42, Medellín, 
Colombia

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


	 Journal of Medical Systems           (2025) 49:19    19   Page 2 of 13

which serves to define in advance interventions that are 
tailored to their future health care needs (2). To carry out 
the stratification there are several systems of classification 
of patients, among which are: Adjusted Clinical Groups 
(ACG), a system that assigns each person in an exclusive 
category based on clinical criteria, with the aim of predict-
ing health costs, pharmaceutical expenses and hospitaliza-
tion risks [2]; Diagnostic Cost Groups (DCG), which use 
information from all diagnoses and prescriptions to form 
Clinical Risk Groups (CRG); which, together with demo-
graphic characteristics, manage to predict health costs in a 
year by classifying individuals according to the severity of 
their health status and according to their chronicity (4); and 
the Diagnosis-Related Groups (DRGs), which allow relating 
the different types of patients treated in a hospital, with the 
cost of their management. The use of DRGs is recommended 
by the World Health Organization (WHO) and several Latin 
American governments are exploring their implementation. 
However, this requires having information systems that have 
a high quality of the data, to guarantee the accuracy of the 
classifications and therefore the decisions that are made. In 
the case of Colombia, there has been evidence of low quality 
in the Individual Health Service Delivery Registries (RIPS) 
for the DRM system, as well as the absence of policies that 
promote and promote comparable health risk [3] (5).

The need to accurately direct the finite resources, both 
human and economic, of health systems towards a subpopu-
lation at higher risk of adverse outcomes can be realized 
under the stratification of individual risk taken to population 
terms, which allows to reach a coordination of the level of 
care according to the presence of “clusters” or risk profiles 
in chronic diseases [4] Clinical decision support systems 
(CDS) can help clinicians make informed decisions if they 
are properly integrated into the treatment process, if they 
are easy to use and understand, and if they use standards 
that enable interoperability with other systems. If these CDS 
systems are designed and implemented with user needs in 
mind, they have the potential to improve medical decisions, 
streamline physicians’ work, and improve patient outcomes 
[5].

The hospital in which this research was carried out has 
developed a “SerMás” care model based on integrated and 
continuous care, promoting synergies in the health services 
network and co-management of health risk between the hos-
pital and the insurer [6] Therefore, the aim of this study is to 
retrospectively derive a real-time risk prediction methodol-
ogy for adverse clinical outcomes with machine learning 
techniques and big data analytics in patients with chronic 
Noncommunicable Diseases. The final goal is the creation of 
a prescriptive analysis dashboard as a clinical decision sup-
port system, which allows real-time interaction with predic-
tions based on clinical and epidemiological characteristics 
of patients in the cohort.

Methods

Source of Data

A retrospective cohort study was conducted on electronic 
medical record records for the derivation of 2 prediction 
models of 3 outcomes: mortality, hospitalization and emer-
gency room visit. Patient collection was conducted from 
April 1, 2017 to December 31, 2020 and outcome assess-
ment from January 1 to December 31, 2020. This study 
was approved by the ethics committee of the Alma Mater 
Hospital in Antioquia (INS 2022-08). The data was always 
managed within the Hospital with security control and pass-
words in the work ecosystems to protect the identity of the 
patients. We followed the TRIPOD-AI consensus. [7]

Participants

The study was carried out at a highly complex medical 
institution, Hospital Alma Mater de Antioquia, located in 
Medellín, Antioquia. This institution comprises an outpa-
tient care facility, a home care division, and a hospital unit. 
Patients were eligible for inclusion if they were at least 18 
years old and had at least one chronic disease as defined by 
the ICD-10 coding system [8] (Table 1 of supplementary 
appendix.). Patients were excluded if they lacked clinical 
data in their electronic medical records, often due to missed 
appointments or loss to follow-up.

Patient care followed the protocol of the “SerMás” care 
model, a comprehensive health management approach coor-
dinating efforts between different health services. Impor-
tantly, the study utilized a convenience sampling method 
based on a contract with the healthcare payer. The cohort 
consisted of 5,000 patients selected in advance by the insurer 
according to the inclusion criteria.

Outcomes

1.	 Hospital and out-of-hospital mortality: data were 
obtained from the GHIPS system (2024 “ALMA 
MATER HOSPITAL” Version: 31.2.20221216 to 37) 
and out-of-hospital mortality was confirmed by the 
health insurer and the RUAF (©information system 
that consolidates the affiliations reported by the entities 
and administrators of the Social Protection System in 
Colombia).

2.	 Hospitalization: data were obtained on the number of 
times the patient consulted the assigned referral hospital 
or other hospitals the data was reported to the hospital 
when the patient was hospitalized elsewhere.

3.	 Use of emergency only in the reference hospital.

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


Journal of Medical Systems           (2025) 49:19 	 Page 3 of 13     19 

Table 1   Clinical characteristics of patients included in the study cohort

Variable Category Data loss n(%) Total n = 4845 (100%)

Age, Mean (SD) 71.8 (13.0)
Gender, n (%) Female 3104 (64.1)

Male 1741 (35.9)
Emergency room, n (%) 1029 (21.2)
Inpatients, n (%) 918 (18.9)
Surgery, n (%) 370 (7.6)
General practitioner, n (%) 2832 (58.5)
Specialized medicine, n (%) 3464 (71.5)
General wards LoS Days, Mean (SD) 7.8 (9.7)
Special Care LoS Days, Mean (SD) 4.5 (3.5)
Intensive Care ICU LoS Days, Mean (SD) 7.2 (7.7)
Depression and Mood Disturbances, n (%) 990 (20.4)
COPD, n (%) 940 (19.4)
Thyroid diseases, n (%) 808 (16.7)
Somatoform, n (%) 808 (16.7)
Osteoarthritis, n (%) 798 (16.5)
Ischemic heart disease, n (%) 789 (16.3)
Chronic Kidney Disease, n (%) 660 (13.6)
Obesity, n (%) 566 (11.7)
Heart Failure, n (%) 545 (11.2)
Cerebrovascular, n (%) 472 (9.7)
Dementia, n (%) 443 (9.1)
Osteoporosis, n (%) 424 (8.8)
Atrial Fibrillation, n (%) 350 (7.2)
Sleep Disorders, n (%) 345 (7.1)
Hypertension, n (%) 3267 (67.4)
Vertigo and Hearing Impairment, n (%) 286 (5.9)
Other Genitourinary, n (%) 282 (5.8)
Venous and Lymphatic Diseases, n (%) 282 (5.8)
Peripheral Neuropathies, n (%) 271 (5.6)
Upper Gastrointestinal Diseases, n (%) 268 (5.5)
Migraine and Painful Facial Syndromes, n (%) 251 (5.2)
Colitis and Lower Gastrointestinal, n (%) 248 (5.1)
Prostate Diseases, n (%) 222 (4.6)
Epilepsy, n (%) 213 (4.4)
Diabetes, n (%) 2123 (43.8)
Dyslipidemia, n (%) 2057 (42.5)
Anemia, n (%) 136 (2.8)
Weight, Mean (SD) 617(12.71%) 68.1 (15.0)
Height, Mean (SD) 617(12.71%) 156.2 (9.3)
Thigh Circumference, Mean (SD) 617(12.71%) 48.3 (94.2)
Waist Circumference, Mean (SD) 617(12.71%) 95.6 (17.5)
triceps fold measurement, Mean (SD) 617(12.71%) 17.7 (15.4)
Abdomen Fold measure, Mean (SD) 617(12.71%) 26.5 (86.6)
Thigh Fold, Mean (SD) 617(12.71%) 21.7 (15.2)
Systolic Blood Pressure, Mean (SD) 617(12.71%) 129.8 (20.4)
Diastolic Blood Pressure, Mean (SD) 617(12.71%) 73.4 (11.2)
Resting Heart Rate, Mean (SD) 617(12.71%) 76.0 (11.9)
Self-rated Exercise level, n (%) 1.0 617(12.71%) 4144 (98.0)

2.0 24 (0.6)

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


	 Journal of Medical Systems           (2025) 49:19    19   Page 4 of 13

Predictor Variables

We included 164 variables which were obtained from the 
GHIPS system, a web application that works as an electronic 
medical record system of the hospital in the outpatient and 
hospital settings. Clinical, laboratory and billing variables 

were extracted (Table 2 of supplementary appendix). In 
addition, the outcomes of mortality, hospitalization and 
emergency use during the evaluation year were obtained.

n (%): Number and percentage of participants, SD: Standard Deviation, LoS: Length of Stay, ICU: Intensive Care Unit, METS: metabolic equiv-
alents,  V̇O:  Maximal Oxygen Consumption, HbA1c: Glycated Hemoglobin, COPD: Chronic Obstructive Pulmonary Disease, TSH: Thyroid-
Stimulating Hormone, CKD: Chronic Kidney Disease, GFR: Glomerular Filtration Rate, HDL: High-Density Lipoprotein, LDL: Low-Density 
Lipoprotein *Additional results on Hartigan immersion test, normality test and Tukey’s test are described in Supplementary appendix

Table 1   (continued)

Variable Category Data loss n(%) Total n = 4845 (100%)

3.0 16 (0.4)
4.0 15 (0.4)
5.0 29 (0.7)

METS metabolic rate, Mean (SD) 617(12.71%) 4.8 (2.5)
VO2 at maximum oxygen, Mean (SD) 617(12.71%) 17.0 (8.6)
 Gröningen Fragility Index, n (%) Fragile 1968 (40.6)

 Data Not Available   617 (12.7)
Normal 2260 (46.6)

Monopodial time, Mean (SD) 617(12.71%) 7.8 (10.3)
Ankle-Brachial Mndex, n (%) 0.41 to 0.90 70 (1.4)

0.91 to 1.30 1574 (32.5)
< 0.4 33 (0.7)

Not qualified 3168 (65.4)
Blood Glucose, Mean (SD) 944(19,45%) 102.3 (118.6)
Glycated Haemoglobin HbA1c, Mean (SD) 944(19,45%) 5.0 (3.3)
LDL, Mean (SD) 944(19,45%) 47.2 (48.0)
HDL, Mean (SD) 944(19,45%) 41.6 (159.8)
Total Cholesterol, Mean (SD) 944(19,45%) 134.0 (68.3)
Triglycerides, Mean (SD) 944(19,45%) 125.3 (92.2)
Framingham Cardiovascular Risk adjusted to Colombia, n (%) High risk 1710 (35.3)

Low risk 2177 (44.9)
Not rated 958 (19.8)

Glomerular Filtration Rate (GFR), Mean (SD) 944(19,45%) 59.9 (39.7)
Stage of Chronic Kidney Disease (CKD), n (%) Stage 0 1511 (31.2)

Stage 1 466 (9.6)
Stage 2 1029 (21.2)
Stage 3a 796 (16.4)
Stage 3b 720 (14.9)
Stage 4 253 (5.2)
Stage 5 70 (1.4)

Urinary Albumin to Creatinine Ratio, Mean (SD) 944(19,45%) 44.3 (271.4)
TSH, Mean (S) 944(19,45%) 2.6 (6.6)
Functional Classification by “SerMás”, n (%) Functional class 1 39 (0.8)

Functional class 2A 1482 (30.6)
Functional class 2B 814 (16.8)
Functional class 3 145 (3.0)
Functional class 4 1421 (29.3)

Not rated 944 (19.5)

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


Journal of Medical Systems           (2025) 49:19 	 Page 5 of 13     19 

Ta
bl

e 
2  

S
ta

tis
tic

al
 a

na
ly

si
s o

f A
rti

fic
ia

l n
eu

ra
l n

et
w

or
k,

 X
G

B
oo

st 
an

d 
El

as
tic

 n
et

 lo
gi

sti
c 

re
gr

es
si

on
 m

od
el

s f
or

 m
or

ta
lit

y,
 h

os
pi

ta
liz

at
io

n,
 a

nd
 e

m
er

ge
nc

y 
ro

om
 c

on
su

lta
tio

n

*S
en

si
tiv

ity
 (S

en
s)

 is
 th

e 
ab

ili
ty

 o
f t

he
 m

od
el

 to
 c

or
re

ct
ly

 r
ul

e 
ou

t p
at

ie
nt

s 
w

ith
 th

e 
ou

tc
om

e 
of

 in
te

re
st.

 ^
Sp

ec
ifi

ci
ty

 (S
pe

c)
 is

 th
e 

ab
ili

ty
 o

f t
he

 m
od

el
 to

 c
or

re
ct

ly
 d

et
ec

t p
at

ie
nt

s 
w

ith
ou

t t
he

 
ou

tc
om

e 
of

 in
te

re
st.

 ̈N
eg

at
iv

e 
pr

ed
ic

tiv
e 

va
lu

e 
(N

PV
) i

s 
th

e 
pr

ob
ab

ili
ty

 th
at

 a
 p

at
ie

nt
 w

ill
 n

ot
 h

av
e 

th
e 

ou
tc

om
e 

of
 in

te
re

st 
if 

th
e 

m
od

el
 c

la
ss

ifi
es

 it
 a

s 
su

ch
. +

Po
si

tiv
e 

pr
ed

ic
tiv

e 
va

lu
e 

(P
PV

) i
s 

th
e 

pr
ob

ab
ili

ty
 th

at
 a

 p
at

ie
nt

 w
ill

 h
av

e 
th

e 
ou

tc
om

e 
of

 in
te

re
st 

if 
th

e 
m

od
el

 c
la

ss
ifi

es
 it

 a
s s

uc
h 

A
U

C
RO

C
 (A

re
a 

un
de

r t
he

 R
O

C
 c

ur
ve

) i
s a

 n
um

er
ic

al
 m

ea
su

re
 th

at
 e

va
lu

at
es

 th
e 

m
od

el
’s

 a
bi

lit
y 

to
 

di
ffe

re
nt

ia
te

 b
et

w
ee

n 
pa

tie
nt

s w
ith

 a
nd

 w
ith

ou
t t

he
 o

ut
co

m
e 

of
 in

te
re

st.
 9

5%
 C

I w
as

 c
al

cu
la

te
d

O
ut

co
m

e
M

od
el

 n
am

e
Se

ns
iti

vi
ty

 (R
ec

al
l)

Sp
ec

ifi
ci

ty
 (R

el
ec

-
tiv

ity
)

Po
si

tiv
e 

Pr
ed

ic
tiv

e 
Va

lu
e 

(P
re

ci
si

on
)

N
eg

at
iv

e 
Pr

ed
ic

tiv
e 

Va
lu

e
A

U
C

RO
C

In
te

rc
ep

t
Sl

op
e

H
os

pi
ta

liz
at

io
n

El
as

tic
 N

et
0.

68
3

(0
.6

30
–0

.7
33

)
0.

97
4

(0
.9

65
–0

.9
83

)
0.

88
1

(0
.8

39
–0

.9
22

)
0.

91
6

(0
.9

00
–0

.9
32

)
0.

95
2

(0
.9

37
–0

.9
65

)
−

3.
11

0
(−

3.
34

3–
2.

91
4)

6.
40

6
(5

.8
75

–7
.0

30
)

XG
Bo

os
t

0.
79

2
(0

.7
46

–0
.8

38
)

0.
95

5
(0

.9
43

–0
.9

68
)

0.
83

3
(0

.7
89

–0
.8

78
)

0.
94

2
(0

.9
28

–0
.9

56
)

0.
96

3
(0

.9
52

–0
.9

74
)

−
3.

52
6

(−
3.

79
9–

3.
30

8)
6.

76
9

(6
.2

99
–7

.3
32

)
Ne

ur
al

 N
et

wo
rk

0.
58

1
(0

.5
26

–0
.6

35
)

0.
98

0
(0

.9
72

–0
.9

88
)

0.
89

3
(0

.8
50

–0
.9

33
)

0.
89

2
(0

.8
74

–0
.9

09
)

0.
93

2
(0

.9
15

–0
.9

48
)

−
2.

67
9

(−
2.

86
7–

2.
50

7)
6.

04
8

(5
.4

84
–6

.7
47

)
Em

er
ge

nc
y 

Ro
om

 
C

on
su

lta
tio

n
El

as
tic

 N
et

0.
92

7
(0

.8
97

–0
.9

52
)

0.
96

4
(0

.9
51

–0
.9

74
)

0.
88

9
(0

.8
52

–0
.9

19
)

0.
97

7
(0

.9
67

–0
.9

85
)

0.
98

0
(0

.9
71

–0
.9

87
)

−
3.

77
8

(−
4.

07
4–

3.
51

6)
6.

57
5

(6
.1

25
–7

.0
68

)
XG

Bo
os

t
0.

90
5

(0
.8

74
–0

.9
36

)
0.

96
1

(0
.9

50
–0

.9
72

)
0.

87
8

(0
.8

45
–0

.9
13

)
0.

97
0

(0
.9

60
–0

.9
80

)
0.

97
7

(0
.9

67
–0

.9
86

)
−

3.
79

2
(−

4.
04

6–
3.

56
4)

7.
21

2
(6

.7
65

–7
.7

44
)

Ne
ur

al
 N

et
wo

rk
0.

81
0

(0
.7

62
–0

.8
50

)
0.

96
3

(0
.9

52
–0

.9
74

)
0.

87
2

(0
.8

34
–0

.9
09

)
0.

94
2

(0
.9

26
–0

.9
56

)
0.

97
6

(0
.9

68
–0

.9
82

)
−

3.
01

9
(−

3.
26

5–
2.

79
4)

5.
47

6
(5

.0
70

–5
.9

39
)

M
or

ta
lit

y
El

as
tic

 N
et

0.
33

6
(0

.2
59

–0
.4

14
)

0.
99

1
(0

.9
85

–0
.9

96
)

0.
80

7
(0

.7
02

–0
.9

06
)

0.
93

1
(0

.9
18

–0
.9

43
)

0.
88

3
(0

.8
48

–0
.9

17
)

−
3.

27
5

(−
3.

48
9–

3.
07

9)
6.

52
3

(5
.7

78
–7

.2
40

)
XG

Bo
os

t
0.

45
3

(0
.3

64
–0

.5
31

)
0.

98
6

(0
.9

80
–0

.9
92

)
0.

78
5

(0
.6

90
–0

.8
64

)
0.

94
2

(0
.9

29
–0

.9
54

)
0.

89
6

(0
.8

65
–0

.9
27

)
−

3.
35

3
(−

3.
59

3–
3.

14
6)

6.
31

1
(5

.6
92

–6
.9

52
)

Ne
ur

al
 N

et
wo

rk
0.

35
8

(0
.2

75
–0

.4
37

)
0.

98
5

(0
.9

79
–0

.9
92

)
0.

73
1

(0
.6

27
–0

.8
39

)
0.

93
3

(0
.9

19
–0

.9
45

)
0.

88
6

(0
.8

53
–0

.9
16

)
−

3.
02

3
(−

3.
23

5–
2.

82
0)

5.
60

1
(4

.8
94

–6
.4

14
)

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


	 Journal of Medical Systems           (2025) 49:19    19   Page 6 of 13

Sample Size Calculation

A formal sample size calculation was not made because 
there was a fixed cohort.

Imputation of Missing Data

A data imputation process was performed to variables with 
less than 20% data losses using the K Nearby Neighbors 
(KNN) algorithm, an imputation technique that uses infor-
mation from existing data to estimate missing values. The 
algorithm selects a number k of observations closest to the 
observation with the missing value (neighbors) in the com-
plete dataset. Then use the mean of those k neighbors to 
estimate the missing value. [9]

Statistical Analysis

For normally distributed variables, the mean and standard 
deviation are usually shown, while variables that are not 
normally distributed are reported with the median and inter-
quartile range. The Hartigan immersion test was applied to 
describe possible multimodal distributions and the Tukey 
test to describe variables with distant outliers [10] DeLong’s 
Test, was used to define differences between model’s areas 
under the curve. All statistical measures were calculated 
with accompanying 95% confidence intervals (CIs) and w 
used a p-value threshold of 0.05.

Models

Three supervised learning models were used: Elastic Net 
logistic regression model, an Artificial Neural Network and 
the XGBoost algorithm. For the 3 outcomes, the database 
was divided into 2 parts: 85% for model training and 15% for 
a test dataset (Internal Validation). Before entering the data 
into the machine learning models, the numerical variables 
were centered at a mean of zero and then scaled to ensure 
that they all have a variance of 1. For the nominal variables, 
Dummies (indicator) variables (12) [11] with the ICD refer-
ence category 10 described in Table 1 of the supplementary 
appendix.

The Elastic Net logistic regression model is an extension 
of the traditional logistic regression model that uses regulari-
zation techniques to reduce the risk of overfitting, and uses a 
combination of the vector norm L1 (the sum of the absolute 
value of the elements of the vector) and L2 (Euclidean norm 
is the square root of the sum of the squares of the elements 
of the vector) for regularization, what is known as Lasso 
(L1), Ridge (L2) and Elastic Net (L1 and L2) regulariza-
tion respectively, to automatically select the most important 
characteristics in the data and avoid overfitting [12].

The XGBoost algorithm is an implementation of gradient 
boosting with decision trees, Gradient boosting is a machine 
learning technique that consists of training a set of decision 
trees sequentially, where each tree is trained to correct the 
errors of the previous tree. XGBoost uses a gradient optimi-
zation technique called “stochastic gradient regularization” 
to adjust the parameters of individual decision trees [13].

In the neural network architecture, a feedforward model 
was implemented using the Keras framework. The network 
consisted of an input layer corresponding to the number of 
features in the dataset, followed by three hidden layers with 
128, 64, and 32 neurons, respectively, each using the ReLU 
activation function. Batch normalization was applied after 
each hidden layer to standardize inputs and improve train-
ing stability, while a dropout rate of 0.5 was used for regu-
larization to prevent overfitting. The output layer included a 
single neuron with a sigmoid activation function for binary 
classification. The model was optimized using the Adam 
optimizer with a learning rate of 0.001 and the binary cross-
entropy loss function, while accuracy was tracked as a per-
formance metric. Training was performed over 100 epochs 
with a batch size of 32, and early stopping was employed 
to terminate training when validation performance pla-
teaued. Additional callbacks, including model checkpoint-
ing, TensorBoard logging, and a custom callback to monitor 
epoch-wise training times, were utilized to enhance training 
efficiency and transparency. Model architecture shown in 
suplementary sFigure 6.

For the three models all the hyperparameters were initial-
ized randomly, a set of fitting data of these “hyperparam-
eters” was not used and instead the 10-fold cross-validation 
technique was performed for each of the three models with 
the three outcomes, to find the best parameters between 
the training dataset and the validation dataset. [14] The 
metrics area under receiver operating characteristic curve 
(AUCROC), sensitivity, specificity, negative predictive 
value, positive predictive value and calibration curves were 
determined with the calculation of the slope and intercept 
for each outcome. For each metric, the 95% confidence inter-
val was calculated and a maximum alpha error of 0.05 was 
accepted.

Models were selected for each outcome with better dis-
crimination in AUCROC and no statistically significant 
differences in slope and intercept in calibration curve. The 
results were compared using the DeLong test for differences 
in the AUCROC of each of the outcomes [15].

The R programming language (version 4.2.2 Copyright 
(C) 2022 The R Foundation for Statistical Computing) and 
Python (Python Software Foundation (2021) were used. 
Python Language Reference, version 3.10.) to process the 
data and derive the model.

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


Journal of Medical Systems           (2025) 49:19 	 Page 7 of 13     19 

Risk Groups

Risk groups were created through a demo dashboard in Tab-
leau software (Tableau. (2021). Tableau 2021.2 [Software]), 
in which the AI model was connected to make predictions 
and visualize the results in histogram form in the cohort. 
This allows patients to be displayed and filtered in the con-
text of their prediction, to support decisions about the use of 
clinical resources and prioritization according to their risk. 
The dashboard creates histograms to predict the 3 outcomes 
with the probability extracted from the model on the X-axis 
and the number of patients on the Y-axis. This dashboard 
is the input for the end user to interact with the predictions 
(Fig. 1).

Results

Data were collected from January 2020 to December 2020 
for a total of 5000 eligible patients and 4845 finally ana-
lyzed (Fig. 2). The cohort had a mean age of 71.8 years 
(standard deviation of 13.0) with 64.1% (n = 3104) women. 
21.2% (n = 1029) of patients presented to the emergency 
department and 18.9% (n = 918) were hospitalized. 58.5% 
(n = 2832) consulted a general practitioner and 71.5% 
(n = 3464) consulted a specialist physician. The most com-
mon comorbidities were hypertension (67.4%), diabetes 
(43.8%) and dyslipidemia (42.5%). 19.4% of patients had 
chronic obstructive pulmonary disease (COPD), 16.7% had 
thyroid disease and 11.2% heart failure. The total mean 
value of billing in Colombian pesos was COP 5,468,904 
per patient in the year (standard deviation of 7,376,458) 
(Table 1.) The distribution of chronic disease categories is 
presented in Table 3 of the Supplementary Appendix.

Fig. 1   Patient flowchart in the 
study

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


	 Journal of Medical Systems           (2025) 49:19    19   Page 8 of 13

Models

For the mortality outcome, the Elastic Net logistic regres-
sion model achieved an AUCROC of 0.883 (95% CI: 
0.848–0.917), while the XGBoost model had an AUCROC 
of 0.896 (95% CI: 0.865–0.927). The neural network model 
performed similarly with an AUCROC of 0.886 (95% CI: 
0.853–0.916). For the hospitalization outcome, Elastic Net 
showed an AUCROC of 0.952 (95% CI: 0.937–0.965), 
XGBoost reached 0.963 (95% CI: 0.952–0.974), and the neu-
ral network model achieved 0.932 (95% CI: 0.915–0.948). 
Regarding emergency room consultations, the AUCROC 
values were 0.980 (95% CI: 0.971–0.987) for Elastic Net, 
0.977 (95% CI: 0.967–0.986) for XGBoost, and 0.976 (95% 
CI: 0.968–0.982) for the neural network model. (Fig. 3).

In Table 2 we show the summary of all metrics perfor-
mance of the three prediction models for the three selected 
outcomes. For mortality prediction, Elastic Net logistic 
regression achieved a sensitivity of 33.6% and a specific-
ity of 99.1%. XGBoost outperformed Elastic Net with a 
sensitivity of 45.3% and specificity of 98.6%, while the 
neural network exhibited similar performance with a sen-
sitivity of 35.8% and specificity of 98.5%. For hospitaliza-
tion prediction, Elastic Net achieved a sensitivity of 68.3% 
and specificity of 97.4%, XGBoost reached a sensitivity 
of 79.2% and specificity of 95.5%, and the neural network 
demonstrated a sensitivity of 58.1% and specificity of 
98.0%. Lastly, for emergency room consultations, Elastic 

Net exhibited high specificity (96.4%) but a lower sensi-
tivity of 92.7%. XGBoost showed a balance of sensitivity 
(90.5%) and specificity (96.1%), and the neural network 
model achieved sensitivity and specificity of 81.0% and 
96.3%, respectively. Figure 4 presents a spider plot com-
paring the performance of the nine models across multi-
ple metrics. The calibration of these models against the 
outcomes is illustrated in Fig. 5, The weighting of the 
primary variables for each model and their contributions 
to the three outcomes are detailed in the Supplementary 
Appendix.

Discussion

There are several limitations in the study that must be con-
sidered. First, the retrospective observational design, since 
the study represents the first step in the derivation of a risk 
model for the creation of a clinical decision support system 
within the framework of the DECIDE AI consensus [16] 
Therefore, more prospective research with intervention 
studies is needed to validate the model in different popula-
tions and healthcare settings before it can be used in clini-
cal practice. Second, the study was conducted in a single 
highly complex reference hospital with an elderly popula-
tion with multiple chronic diseases, without adequate rep-
resentation of young patients. Although the hospital had 
different settings of home, outpatient and hospital care, 

Fig. 2   Area under receiver 
operating characteristic curve 
(AUCROC) of Artificial 
Neural Network, Elastic Net 
and XGBoost models for 
Hospitalization, Mortality and 
Emergency Room Consultation 
outcomes

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


Journal of Medical Systems           (2025) 49:19 	 Page 9 of 13     19 

there were differences in the balance of demographic char-
acteristics such as the marked difference between women 
and men. No race information was collected, which could 
limit generalizability of results to other hospitals or health-
care settings. Third, the sample was very small relative to 
the amount recommended in studies employing machine 
learning approaches [17] and included a large number of 
predictive variables, making the assembly process with 
other types of medical history software technically dif-
ficult. Fourth, the sample size was predetermined by the 

insurer based on contractual convenience rather than a 
formal calculation of statistical power. While this cohort 
provided a substantial data for initial modeling, the lack of 
randomization or deliberate design in the sample selection 
could introduce biases and limit the ability to generalize 
findings. Future studies should aim to evaluate the model 
on larger and more diverse populations with sample sizes 
informed by power analyses.

Fig. 3   Spider plot of Artifi-
cial Neural Network, Elastic 
Net and XGBoost models for 
Hospitalization, Mortality and 
Emergency Room Consultation 
metrics

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


	 Journal of Medical Systems           (2025) 49:19    19   Page 10 of 13

In this study, prediction models for mortality, hospitali-
zation, and emergency room visits were developed using 
Elastic Net Logistic Regression, XGBoost, and an Artificial 

Neural Network (ANN). The DeLong test revealed statisti-
cally significant differences in AUCROC favoring XGBoost 
over Elastic Net for hospitalization (p < 0.001), while Elastic 

Fig. 4   Calibration plots of 
Artificial Neural Network, 
Elastic net Logistic Regres-
sion and XGBoost models for 
Hospitalization, Mortality and 
Emergency Room Consultation 
outcomes

Fig. 5   User friendly dashboard to interact with the prediction of models for Hospitalization, Mortality and Emergency Room Consultation for 
each patient metrics

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


Journal of Medical Systems           (2025) 49:19 	 Page 11 of 13     19 

Net outperformed the ANN (p = 0.008). For mortality, no 
significant differences were observed between Elastic Net, 
XGBoost, and the ANN (p > 0.05). Similarly, for emer-
gency room consultations, no significant differences were 
found across all models (p = 1). These results emphasize 
the comparative performance advantages of XGBoost for 
certain outcomes while highlighting similar performance 
across models in others (Supplementary Table 4). For the 
calibration of the emergency room consultation outcome, 
both models showed overestimation; however, the Elastic 
Net regression model exhibited a significantly higher slope 
(12.23; 95% CI: 10.64–13.83) compared to XGBoost (1.2; 
95% CI: 1.07–1.34). For hospitalization, Elastic Net regres-
sion showed no significant underestimation of risk. Across 
the remaining calibration comparisons, no significant dif-
ferences were observed. Previously, in the same cohort, a 
functional scale for predicting mortality (C-statistical of 
0.721 95% CI: 0.660–0.780), emergency room (C-statistical 
of 0.570 95% CI: 0.500–0.640) and hospitalization (C-sta-
tistical of 0.609 CI95%: 0.570–0.650) was developed and 
validated, so this study presents a predictive approach of 
greater discrimination of adverse outcomes of the “SerMás” 
cohort (8). [16]

While the models performed well overall in our study, 
XGBoost performed better. This same finding has been 
observed in another research. Forrest et al. derived and 
validated a model of random decision trees to predict coro-
nary heart disease with an AUROC of 0.95 (95% CI 0.94 
to 0.95), a sensitivity of 0.94 (95% CI 0.94 to 0.95) and 
a specificity of 0.82 (95% CI 0.81 to 0.83) (19). Li et al. 
evaluated the ability of XGBoost and logistic regression 
and other algorithms to predict mortality in heart failure 
patients admitted to the ICU. The results showed that 

XGBoost and logistic regression lasso L1 with AUCROC 
of 0.8416 (95% CI 0.7864 to 0.8967) had a superior per-
formance compared to the risk score model “The Ameri-
can Heart Association Get With The Guidelines a Heart 
Failure GWTG - HF”, which exhibited an AUCROC of 
0.7856 (95% CI 0.7183 to 0.8470). However, the XGBoost 
showed a wide net profit threshold range (> 0.1) above the 
other two models [18] Another study to predict ICU admis-
sion from the emergency room found that XGBoost per-
formed well compared to deep neural networks (DNNs). 
The XGBoost model obtained an AUCROC of 0.861 (95% 
CI 0.848 to 0.874) with a higher discriminative yield than 
the DNN model with an AUCROC of 0.833 (95% CI 0.819 
to 0.848) [19] Khera et al., compared the performance of 
some artificial intelligence algorithms, including XGBoost 
against logistic regression, in predicting mortality in 
patients with acute myocardial infarction. It found that the 
XGBoost model reclassified 32,393 of 121,839 patients 
(27%) at moderate to high risk of death, considered to be 
low risk in the logistic regression model [20].

This study highlights the predictive power of billing 
administrative variables for identifying clinical outcomes, 
such as mortality, emergency visits, and hospitalizations. 
These outcomes serve as proxies for underlying patient 
risk categories, enabling clinicians to stratify risk and 
allocate resources more effectively. For example, Mac-
Kay et al. developed a machine learning model combining 
administrative and clinical data to predict 30-day mortal-
ity with an AUROC of 0.88 using XGBoost, compared 
to 0.84 for logistic regression. Their model provided an 
interactive interface for clinicians to manage risk [21], 
like the approach implemented in this study (Fig. 1).

Fig. 6   Ensemble Model-Based Risk Stratification (XGBoost): (A) Distribution of Prediction Probabilities and (B) Patient Risk Categorization of 
adverse clinical outcome

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


	 Journal of Medical Systems           (2025) 49:19    19   Page 12 of 13

In our analysis, XGBoost demonstrated superior sen-
sitivity, meaning it is better at identifying patients truly 
at risk for adverse outcomes. This makes its negative 
predictions more reliable, crucial for minimizing missed 
risks. Additionally, XGBoost’s sensitivity and calibration 
position it as a strong candidate for ensemble learning. 
Figure 6 illustrates risk stratification based on ensemble 
predictions, where patients are categorized into action-
able risk levels.

To translate this predictive model into clinical impact, 
future work will focus on conducting a randomized con-
trolled trials to evaluate interventions driven by this risk 
stratification approach.

Conclusions

In conclusion, the XGBoost model presented a better perfor-
mance than artificial neural networks, logistic regression and 
Elastic Net. Overall, the results indicate that the XGBoost 
model has the potential to be a tool for building clinical 
decision support systems that function as useful prognostic 
models for decision-making in patients with Noncommuni-
cable Diseases. These types of tools should be evaluated and 
validated in future experimental studies for safe implementa-
tion in clinical flowcharts.

Supplementary Information  The online version contains supplemen-
tary material available at https://​doi.​org/​10.​1007/​s10916-​025-​02140-z.

Author Contributions  A.H.A. (Alejandro Hernández-Arango) was 
responsible for the design and conceptualization of the study. M.I.A. 
(María Isabel Arias) and V.P. (Viviana Pérez) conducted the data col-
lection. Data analysis and interpretation were performed by A.H.A. 
and F.J. (Fabian Jaimes). A.H.A. and F.J. drafted and revised the manu-
script. F.J. approved the final version of the manuscript. No external 
funding was received for this study.All authors reviewed and approved 
the final manuscript.

Funding  Open Access funding provided by Colombia Consortium. 
This research was conducted without external funding. All resources 
were provided by the authors.

Data Availability  The data that support the findings of this study are 
available from the authors, but restrictions apply to the availability 
of these data, which were used under ethical approval from Hospital 
Alma Máter de Antioquia for the current study, and so are not publicly 
available. Data are, however, available from the authors upon reason-
able request and with permission from the Hospital Alma Máter de 
Antioquia.

Declarations 

All authors certify that they have no affiliations with or involvement in 
any organization or entity with any financial interest or non-financial 
interest in the subject matter or materials discussed in this manuscript.
The authors declare that there was no funding.
The study was approved by the Ethics Committee on Human Research 
at the University Research Unit (CBE-SIU), Universidad de Antioquia, 
under number 20-114-922. The Committee deemed the retrospective 

observational study ethically valid and posing no risk to participants. 
The Committee had ongoing access to study data while maintaining 
confidentiality and approved the measures for participant protection 
and the informed consent process. The study was conducted in accord-
ance with Resolution 008430 of October 4, 1993, of the Colombian 
Ministry of Health, which established the scientific, technical, and ad-
ministrative standards for health research; the principles of the World 
Medical Assembly as stated in the Declaration of Helsinki of 1964, last 
updated in 2013; the Code of Federal Regulations, Title 45, Part 46, 
for the protection of human subjects by the U.S. Department of Health 
and Human Services and the National Institutes of Health (June 18, 
1991); and Resolution 2378 of 2008 of the Ministry of Social Protec-
tion of Colombia, which adopts Good Clinical Practices for institutions 
conducting research with medications in humans.

Competing Interests  The authors declare no competing interests. 

Open Access  This article is licensed under a Creative Commons Attri-
bution 4.0 International License, which permits use, sharing, adapta-
tion, distribution and reproduction in any medium or format, as long 
as you give appropriate credit to the original author(s) and the source, 
provide a link to the Creative Commons licence, and indicate if changes 
were made. The images or other third party material in this article are 
included in the article’s Creative Commons licence, unless indicated 
otherwise in a credit line to the material. If material is not included in 
the article’s Creative Commons licence and your intended use is not 
permitted by statutory regulation or exceeds the permitted use, you will 
need to obtain permission directly from the copyright holder. To view a 
copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

	 1.	 K. Gallardo-Solarte K., F. P. Benavides-Acosta F.P., and R. 
Rosales-Jiménez R., “Costos de la enfermedad crónica no trans-
misible: la realidad colombiana,” Rev. Cienc. Salud, vol. 14, no. 1, 
pp. 103–114, Feb. 2016, doi: https://​doi.​org/​10.​12804/​revsa​lud14.​
01.​2016.​09.

	 2.	 J. F. Orueta Mendia, A. García-Álvarez, E. Alonso-Morán, and R. 
Nuño-Solinis, “Desarrollo de un modelo de predicción de riesgo 
de hospitalizaciones no programadas en el País Vasco,” Rev. Esp. 
Salud Publica, vol. 88, no. 2, pp. 251–260, Apr. 2014, doi: https://​
doi.​org/​10.​4321/​s1135-​57272​01400​02000​07.

	 3.	 I. Gorbanev, A. E. Cortés Martínez, S. Agudelo Londoño, and F. 
J. Yepes Lujan, “Grupos relacionados por el diagnóstico: experi-
encia en tres hospitales de alta complejidad en Colombia,” Univ. 
Médica, vol. 57, no. 2, pp. 171–181, Jul. 2016, doi: https://​doi.​
org/​10.​11144/​javer​iana.​umed57-​2.​grde.

	 4.	 E. Nolte, World Health Organization: Regional Office for Europe, 
and C. Knai, Assessing chronic disease management in European 
health systems. Europe, UK: WHO Regional Office for Europe, 
2015.

	 5.	 B. C. Stagg et al., “Special Commentary: Using Clinical Decision 
Support Systems to Bring Predictive Models to the Glaucoma 
Clinic,” Ophthalmol Glaucoma, vol. 4, no. 1, pp. 5–9, Jan-Feb 
2021, doi: https://​doi.​org/​10.​1016/j.​ogla.​2020.​08.​006.

	 6.	 V. García-Arango, J. Osorio-Ciro, D. Aguirre-Acevedo, C. Vane-
gas-Vargas, C. Clavijo-Usuga, and J. Gallo-Villegas, “Validación 
predictiva de un método de clasificación funcional en adultos 
mayores,” Rev. Panam. Salud Publica, vol. 45, p. e15, Apr. 2021, 
doi: https://​doi.​org/​10.​26633/​rpsp.​2021.​15.

	 7.	 G. S. Collins et al., “TRIPOD + AI statement: updated guid-
ance for reporting clinical prediction models that use regression 

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


Journal of Medical Systems           (2025) 49:19 	 Page 13 of 13     19 

or machine learning methods,” BMJ, vol. 385, p. e078378, Apr. 
2024, doi: https://​doi.​org/​10.​1136/​bmj-​2023-​078378.

	 8.	 A. Calderón-Larrañaga et al., “Assessing and measuring chronic 
multimorbidity in the older population: A proposal for its opera-
tionalization,” J. Gerontol. A Biol. Sci. Med. Sci., p. glw233, Dec. 
2016, doi: https://​doi.​org/​10.​1093/​gerona/​glw233.

	 9.	 S. Faisal and G. Tutz, “Multiple imputation using nearest neighbor 
methods,” Inf. Sci. (Ny), vol. 570, pp. 500–516, Sep. 2021, doi: 
https://​doi.​org/​10.​1016/j.​ins.​2021.​04.​009.

	10.	 T. J. Pollard, A. E. W. Johnson, J. D. Raffa, and R. G. Mark, 
“tableone: An open source Python package for producing sum-
mary statistics for research papers,” JAMIA Open, vol. 1, no. 1, 
pp. 26–31, Jul. 2018, doi: https://​doi.​org/​10.​1093/​jamia​open/​
ooy012.

	11.	 J. Gareth, W. Daniela, H. Trevor, and T. Robert, An introduc-
tion to statistical learning: with applications in R. Spinger, 2013. 
[Online]. Available: https://​dspace.​agu.​edu.​vn/​handle/​agu_​libra​
ry/​13322

	12.	 J. Friedman, T. Hastie, and R. Tibshirani, “Regularization Paths 
for Generalized Linear Models via Coordinate Descent,” J. Stat. 
Softw., vol. 33, no. 1, pp. 1–22, 2010, doi: https://​doi.​org/​10.​1109/​
TPAMI.​2005.​127.

	13.	 T. Chen and C. Guestrin, “XGBoost: A Scalable Tree Boosting 
System,” in Proceedings of the 22nd ACM SIGKDD International 
Conference on Knowledge Discovery and Data Mining, in KDD 
’16. New York, NY, USA: Association for Computing Machinery, 
Aug. 2016, pp. 785–794. doi: https://​doi.​org/​10.​1145/​29396​72.​
29397​85.

	14.	 Y. A. Ali, E. M. Awwad, M. Al-Razgan, and A. Maarouf, “Hyper-
parameter search for machine learning algorithms for optimizing 
the computational complexity,” Processes (Basel), vol. 11, no. 2, 
p. 349, Jan. 2023, doi: https://​doi.​org/​10.​3390/​pr110​20349.

	15.	 E. R. DeLong, D. M. DeLong, and D. L. Clarke-Pearson, “Com-
paring the areas under two or more correlated receiver operating 
characteristic curves: a nonparametric approach,” Biometrics, vol. 
44, no. 3, pp. 837–845, Sep. 1988, doi: https://​doi.​org/​10.​2307/​
25315​95.

	16.	 B. Vasey et al., “Reporting guideline for the early-stage clinical 
evaluation of decision support systems driven by artificial intel-
ligence: DECIDE-AI,” Nat. Med., vol. 28, no. 5, pp. 924–933, 
May 2022, doi: https://​doi.​org/​10.​1038/​s41591-​022-​01772-9.

	17.	 M. A. Gianfrancesco, S. Tamang, J. Yazdany, and G. Schmajuk, 
“Potential biases in machine learning algorithms using electronic 
health record data,” JAMA Intern. Med., vol. 178, no. 11, pp. 
1544–1547, Nov. 2018, doi: https://​doi.​org/​10.​1001/​jamai​ntern​
med.​2018.​3763.

	18.	 F. Li, H. Xin, J. Zhang, M. Fu, J. Zhou, and Z. Lian, “Prediction 
model of in-hospital mortality in intensive care unit patients with 
heart failure: machine learning-based, retrospective analysis of 
the MIMIC-III database,” BMJ Open, vol. 11, no. 7, p. e044779, 
Jul. 2021, doi: https://​doi.​org/​10.​1136/​bmjop​en-​2020-​044779.

	19.	 S. W. Choi, T. Ko, K. J. Hong, and K. H. Kim, “Machine Learn-
ing-Based Prediction of Korean Triage and Acuity Scale Level in 
Emergency Department Patients,” Healthc. Inform. Res., vol. 25, 
no. 4, pp. 305–312, Oct. 2019, doi: https://​doi.​org/​10.​4258/​hir.​
2019.​25.4.​305.

	20.	 R. Khera et al., “Use of Machine Learning Models to Predict 
Death After Acute Myocardial Infarction,” JAMA Cardiol, vol. 6, 
no. 6, pp. 633–641, Jun. 2021, doi: https://​doi.​org/​10.​1001/​jamac​
ardio.​2021.​0122.

	21.	 E. J. MacKay et al., “Application of machine learning approaches 
to administrative claims data to predict clinical outcomes in medi-
cal and surgical patient populations,” PLoS One, vol. 16, no. 6, p. 
e0252585, Jun. 2021, doi: https://​doi.​org/​10.​1371/​journ​al.​pone.​
02525​85.

Publisher’s Note  Springer Nature remains neutral with regard to 
jurisdictional claims in published maps and institutional affiliations.

Content courtesy of Springer Nature, terms of use apply. Rights reserved.


1.

2.

3.

4.

5.

6.

Terms and Conditions
 
Springer Nature journal content, brought to you courtesy of Springer Nature Customer Service Center GmbH (“Springer Nature”). 
Springer Nature supports a reasonable amount of sharing of  research papers by authors, subscribers and authorised users (“Users”), for small-
scale personal, non-commercial use provided that all copyright, trade and service marks and other proprietary notices are maintained. By
accessing, sharing, receiving or otherwise using the Springer Nature journal content you agree to these terms of use (“Terms”). For these
purposes, Springer Nature considers academic use (by researchers and students) to be non-commercial. 
These Terms are supplementary and will apply in addition to any applicable website terms and conditions, a relevant site licence or a personal
subscription. These Terms will prevail over any conflict or ambiguity with regards to the relevant terms, a site licence or a personal subscription
(to the extent of the conflict or ambiguity only). For Creative Commons-licensed articles, the terms of the Creative Commons license used will
apply. 
We collect and use personal data to provide access to the Springer Nature journal content. We may also use these personal data internally within
ResearchGate and Springer Nature and as agreed share it, in an anonymised way, for purposes of tracking, analysis and reporting. We will not
otherwise disclose your personal data outside the ResearchGate or the Springer Nature group of companies unless we have your permission as
detailed in the Privacy Policy. 
While Users may use the Springer Nature journal content for small scale, personal non-commercial use, it is important to note that Users may
not: 
 

use such content for the purpose of providing other users with access on a regular or large scale basis or as a means to circumvent access

control;

use such content where to do so would be considered a criminal or statutory offence in any jurisdiction, or gives rise to civil liability, or is

otherwise unlawful;

falsely or misleadingly imply or suggest endorsement, approval , sponsorship, or association unless explicitly agreed to by Springer Nature in

writing;

use bots or other automated methods to access the content or redirect messages

override any security feature or exclusionary protocol; or

share the content in order to create substitute for Springer Nature products or services or a systematic database of Springer Nature journal

content.
 
In line with the restriction against commercial use, Springer Nature does not permit the creation of a product or service that creates revenue,
royalties, rent or income from our content or its inclusion as part of a paid for service or for other commercial gain. Springer Nature journal
content cannot be used for inter-library loans and librarians may not upload Springer Nature journal content on a large scale into their, or any
other, institutional repository. 
These terms of use are reviewed regularly and may be amended at any time. Springer Nature is not obligated to publish any information or
content on this website and may remove it or features or functionality at our sole discretion, at any time with or without notice. Springer Nature
may revoke this licence to you at any time and remove access to any copies of the Springer Nature journal content which have been saved. 
To the fullest extent permitted by law, Springer Nature makes no warranties, representations or guarantees to Users, either express or implied
with respect to the Springer nature journal content and all parties disclaim and waive any implied warranties or warranties imposed by law,
including merchantability or fitness for any particular purpose. 
Please note that these rights do not automatically extend to content, data or other material published by Springer Nature that may be licensed
from third parties. 
If you would like to use or distribute our Springer Nature journal content to a wider audience or on a regular basis or in any other manner not
expressly permitted by these Terms, please contact Springer Nature at 
 

onlineservice@springernature.com
 

mailto:onlineservice@springernature.com