# packages needed by the code in this post:
library(tibble)  # tibble()
library(dplyr)   # sample_n()
library(knitr)   # kable()

# set parameters:
set.seed(87)
nr_of_studies <- 1e4

# utility functions:
# draw n random names, each made of `length` distinct lowercase letters
get_random_names <- function(n, length = 3) {
  replicate(n = n, paste0(sample(letters, size = length), collapse = ''))
}
# group size required to reach the requested power (rounded up)
get_n_planned <- Vectorize(function(delta, sd, power) {
  power.t.test(delta = delta, sd = sd, power = power)$n |>
    ceiling()
})
# power achieved with group size n for a given true effect
get_actual_power <- Vectorize(function(delta, sd, n) {
  power.t.test(delta = delta, sd = sd, n = n)$power
})

# simulate studies:
studies <- tibble(disease = get_random_names(n = nr_of_studies, length = 2),
                  endpoint = get_random_names(n = nr_of_studies, length = 1),
                  assumed_delta = rlnorm(nr_of_studies),
                  actual_delta = ifelse(runif(nr_of_studies) > 0.5,
                                        0,
                                        assumed_delta * runif(nr_of_studies, 0.2, 1.3)),
                  sd = assumed_delta * (1 + rlnorm(nr_of_studies)),
                  n_planned =
                    get_n_planned(delta = assumed_delta,
                                  sd = sd,
                                  power = 0.8),
                  n_per_year_assumed = 10 + rlnorm(nr_of_studies),
                  n_per_year_actual =
                    n_per_year_assumed * runif(nr_of_studies, 0.5, 1.2),
                  expected_study_duration = n_planned / n_per_year_assumed,
                  actual_study_duration = n_planned / n_per_year_actual,
                  actual_power = get_actual_power(delta = actual_delta,
                                                  sd = sd,
                                                  n = n_planned),
                  null_rejected = runif(nr_of_studies) < actual_power,
                  utility_null_rejected = runif(nr_of_studies),
                  utility_null_not_rejected = utility_null_rejected / 2,
                  cost_per_patient = rlnorm(nr_of_studies))
Generation of in silico studies
Description of simulation model
disease
: Random names for the quantity of interest (in reality something like 'HbA1c', 'Survival after therapy with Imatinib' etc.); 2 letters, duplicates allowed
endpoint
: Random one-letter name, generated in the same way; duplicates allowed
assumed_delta
: Assumed difference in means that was used for designing the study; a lognormal distribution is assumed
actual_delta
: Ground-truth difference in means: 0 in 50% of the cases, otherwise assumed_delta multiplied by a random number uniformly distributed between 0.2 and 1.3
sd
: Standard deviation of disease, some hopefully reasonable random number (assumed_delta multiplied by 1 plus a lognormal random number)
n_planned
: Designed group size according to assumed_delta and sd for a target power of 0.8, based on stats::power.t.test()
n_per_year_assumed
: Assumed mean number of patients that can be included in the study per year
n_per_year_actual
: Actual number of patients that can be included per year
expected_study_duration and actual_study_duration
: Designed group size divided by the expected / actual number of patients per year
actual_power
: Power calculated based on actual_delta, sd and n_planned
null_rejected
: Boolean variable based on a Bernoulli trial with success probability taken from actual_power
utility_null_rejected
: Utility of the study if the null hypothesis is rejected; uniformly distributed random number between 0 and 1
utility_null_not_rejected
: utility_null_rejected / 2
cost_per_patient
: Monetary cost to include a patient in the study; random number on an arbitrary scale
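To make the chain from design assumptions to study outcome more concrete, the following sketch walks through a single study by hand. The numeric values and the seed are assumptions chosen for illustration only; they are not taken from the simulation.

```r
# Worked example for one hypothetical study (illustrative values only).
set.seed(1)  # arbitrary seed, not the one used for the simulation

assumed_delta <- 1.0  # difference in means assumed at the design stage
sd            <- 2.0  # standard deviation of the quantity of interest
actual_delta  <- 0.6  # hypothetical ground-truth difference in means

# Group size needed for 80% power under the design assumptions
n_planned <- ceiling(power.t.test(delta = assumed_delta, sd = sd, power = 0.8)$n)

# Power actually achieved with that group size for the true effect
actual_power <- power.t.test(delta = actual_delta, sd = sd, n = n_planned)$power

# One Bernoulli draw: TRUE with probability actual_power
null_rejected <- runif(1) < actual_power

print(c(n_planned = n_planned, actual_power = round(actual_power, 2)))
print(null_rejected)
```

Because the true effect here is smaller than the assumed one, the achieved power falls below the planned 0.8, which is the mechanism that drives actual_power in the simulated studies.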
Simulation results
Analysis
20 randomly chosen studies:
sample_n(studies, 20) |>
kable(align = 'c', digits = 2)
disease | endpoint | assumed_delta | actual_delta | sd | n_planned | n_per_year_assumed | n_per_year_actual | expected_study_duration | actual_study_duration | actual_power | null_rejected | utility_null_rejected | utility_null_not_rejected | cost_per_patient |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
et | z | 1.63 | 0.00 | 5.15 | 159 | 11.59 | 9.02 | 13.72 | 17.63 | 0.03 | FALSE | 0.51 | 0.25 | 2.07 |
rc | i | 0.42 | 0.53 | 0.60 | 33 | 11.74 | 8.54 | 2.81 | 3.86 | 0.94 | TRUE | 0.36 | 0.18 | 3.63 |
zf | p | 1.14 | 1.22 | 1.80 | 40 | 10.67 | 7.60 | 3.75 | 5.27 | 0.85 | FALSE | 0.95 | 0.47 | 0.34 |
ok | y | 0.26 | 0.16 | 0.32 | 26 | 12.30 | 10.05 | 2.11 | 2.59 | 0.42 | FALSE | 0.89 | 0.45 | 3.49 |
ny | r | 1.50 | 0.00 | 4.34 | 133 | 11.60 | 13.08 | 11.47 | 10.17 | 0.03 | FALSE | 0.39 | 0.19 | 0.33 |
cp | m | 1.62 | 1.72 | 5.90 | 211 | 11.06 | 8.24 | 19.08 | 25.59 | 0.85 | TRUE | 0.49 | 0.24 | 1.25 |
rw | b | 13.74 | 9.18 | 31.73 | 85 | 22.36 | 25.38 | 3.80 | 3.35 | 0.47 | FALSE | 0.04 | 0.02 | 0.21 |
wt | q | 1.96 | 1.64 | 2.68 | 31 | 10.79 | 10.16 | 2.87 | 3.05 | 0.66 | FALSE | 0.66 | 0.33 | 1.02 |
is | t | 1.22 | 1.30 | 2.54 | 70 | 11.28 | 12.09 | 6.21 | 5.79 | 0.85 | TRUE | 0.52 | 0.26 | 0.23 |
dt | z | 0.55 | 0.00 | 0.92 | 46 | 10.62 | 11.70 | 4.33 | 3.93 | 0.03 | FALSE | 0.94 | 0.47 | 0.83 |
gj | h | 1.14 | 0.00 | 1.50 | 29 | 10.19 | 10.29 | 2.85 | 2.82 | 0.02 | FALSE | 0.09 | 0.05 | 1.04 |
ux | h | 0.56 | 0.46 | 0.91 | 43 | 11.44 | 7.19 | 3.76 | 5.98 | 0.65 | FALSE | 0.87 | 0.44 | 1.62 |
gz | k | 0.76 | 0.00 | 1.13 | 37 | 13.06 | 10.68 | 2.83 | 3.46 | 0.02 | FALSE | 0.41 | 0.20 | 0.12 |
bl | h | 0.31 | 0.30 | 1.04 | 180 | 11.01 | 5.54 | 16.35 | 32.50 | 0.79 | TRUE | 0.96 | 0.48 | 1.18 |
nl | g | 1.19 | 1.48 | 1.33 | 21 | 10.67 | 9.62 | 1.97 | 2.18 | 0.94 | TRUE | 0.07 | 0.04 | 1.80 |
zx | w | 3.37 | 3.08 | 5.84 | 49 | 10.12 | 11.16 | 4.84 | 4.39 | 0.73 | TRUE | 0.13 | 0.07 | 0.60 |
kv | g | 3.54 | 0.00 | 5.95 | 46 | 11.21 | 11.80 | 4.10 | 3.90 | 0.03 | FALSE | 0.41 | 0.20 | 0.19 |
tn | c | 1.23 | 1.14 | 1.96 | 41 | 12.77 | 14.84 | 3.21 | 2.76 | 0.74 | TRUE | 0.73 | 0.37 | 0.97 |
yk | m | 0.41 | 0.00 | 0.63 | 37 | 11.18 | 6.38 | 3.31 | 5.80 | 0.02 | FALSE | 0.10 | 0.05 | 1.05 |
ud | k | 0.13 | 0.08 | 0.21 | 39 | 11.13 | 11.38 | 3.50 | 3.43 | 0.38 | FALSE | 0.19 | 0.09 | 1.43 |
Descriptive statistics
skimr::skim(studies)
Name | studies |
Number of rows | 10000 |
Number of columns | 15 |
_______________________ | |
Column type frequency: | |
character | 2 |
logical | 1 |
numeric | 12 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
disease | 0 | 1 | 2 | 2 | 0 | 650 | 0 |
endpoint | 0 | 1 | 1 | 1 | 0 | 26 | 0 |
Variable type: logical
skim_variable | n_missing | complete_rate | mean | count |
---|---|---|---|---|
null_rejected | 0 | 1 | 0.29 | FAL: 7141, TRU: 2859 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
assumed_delta | 0 | 1 | 1.64 | 2.05 | 0.02 | 0.51 | 1.01 | 1.98 | 31.39 | ▇▁▁▁▁ |
actual_delta | 0 | 1 | 0.63 | 1.46 | 0.00 | 0.00 | 0.01 | 0.68 | 33.86 | ▇▁▁▁▁ |
sd | 0 | 1 | 4.33 | 7.89 | 0.04 | 1.04 | 2.20 | 4.66 | 266.44 | ▇▁▁▁▁ |
n_planned | 0 | 1 | 179.94 | 673.96 | 18.00 | 37.00 | 64.00 | 138.00 | 26294.00 | ▇▁▁▁▁ |
n_per_year_assumed | 0 | 1 | 11.68 | 2.49 | 10.02 | 10.51 | 11.00 | 11.96 | 91.31 | ▇▁▁▁▁ |
n_per_year_actual | 0 | 1 | 9.95 | 3.27 | 5.09 | 7.68 | 9.67 | 11.72 | 100.92 | ▇▁▁▁▁ |
expected_study_duration | 0 | 1 | 15.84 | 60.65 | 0.54 | 3.25 | 5.58 | 11.95 | 2255.34 | ▇▁▁▁▁ |
actual_study_duration | 0 | 1 | 19.59 | 72.64 | 0.52 | 3.94 | 6.78 | 14.82 | 2827.49 | ▇▁▁▁▁ |
actual_power | 0 | 1 | 0.28 | 0.33 | 0.02 | 0.03 | 0.08 | 0.56 | 0.96 | ▇▁▁▁▂ |
utility_null_rejected | 0 | 1 | 0.50 | 0.29 | 0.00 | 0.25 | 0.50 | 0.75 | 1.00 | ▇▇▇▇▇ |
utility_null_not_rejected | 0 | 1 | 0.25 | 0.14 | 0.00 | 0.12 | 0.25 | 0.37 | 0.50 | ▇▇▇▇▇ |
cost_per_patient | 0 | 1 | 1.62 | 2.04 | 0.02 | 0.50 | 0.99 | 1.93 | 35.66 | ▇▁▁▁▁ |
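In the output above, the mean of null_rejected (0.29) is close to the mean of actual_power (0.28). This is expected, since each null_rejected value is a Bernoulli draw with success probability actual_power. A minimal sketch of that relationship, using a made-up power distribution rather than the simulated studies:

```r
# Sanity check: the share of rejected nulls should track the mean power.
set.seed(2)    # arbitrary seed
n_sim <- 2000  # hypothetical number of studies for this check

# Made-up power values: about half the studies have no true effect
# (power close to the one-sided alpha of 0.025), the rest are moderately powered.
power_example <- ifelse(runif(n_sim) > 0.5, 0.025, runif(n_sim, 0.3, 0.9))

# One Bernoulli draw per study
rejected_example <- runif(n_sim) < power_example

print(c(mean_power     = mean(power_example),
        share_rejected = mean(rejected_example)))
```

The two numbers differ only by sampling noise, which shrinks as n_sim grows.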
Additional aspects:
- expected_study_duration and actual_study_duration do not have very realistic values yet → maybe not all of the considered studies would have been feasible due to insufficient patient numbers
- expected_study_duration is very long for some studies
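One possible way to deal with unrealistic durations would be to flag studies that exceed a maximum acceptable duration. The sketch below uses four rows from the table of randomly chosen studies; the threshold max_duration_years and the column feasible are hypothetical and not part of the simulation model.

```r
# Flag studies whose expected duration exceeds a hypothetical limit.
studies_example <- data.frame(
  disease            = c("et", "rc", "cp", "bl"),
  n_planned          = c(159, 33, 211, 180),
  n_per_year_assumed = c(11.59, 11.74, 11.06, 11.01)
)

# Same definition as in the simulation: group size / patients per year
studies_example$expected_study_duration <-
  studies_example$n_planned / studies_example$n_per_year_assumed

max_duration_years <- 10  # hypothetical feasibility threshold

studies_example$feasible <-
  studies_example$expected_study_duration <= max_duration_years

print(studies_example)
```

The same comparison could be applied to the full studies table, for example to exclude infeasible studies before further analysis.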