## Survival data

Time-to-event data, including both survival and censoring times, are created using the functions `defSurv` and `genSurv`. Each survival data definition requires a variable name and a `scale` value, which determines the mean survival time at the baseline level of the covariates (i.e., all covariates set to 0). Survival times are generated from the Weibull distribution. Previously defined covariates that influence survival time can be included in the `formula` field. In the example that follows, the positive coefficient on `x1` is associated with shorter survival times (and higher hazard rates), as the group means reported later in this section show. Finally, the *shape* of the distribution can be specified; a `shape` value of 1 yields the *exponential* distribution.
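To make the roles of `scale`, `shape`, and the `formula` coefficients concrete, here is a minimal base R sketch of Weibull survival times drawn by inverse-transform sampling. The helper `sim_weibull_time` is hypothetical and not part of `simstudy`. The parameterization it uses is an assumption chosen to agree with the group means reported later in this section, not a statement of how `genSurv` is implemented.

```r
# Hypothetical helper (NOT a simstudy function). ASSUMED parameterization:
#   T = (-log(U) * scale * exp(-lp))^shape,  U ~ Uniform(0, 1)
# where lp is the linear predictor built from the covariates.
sim_weibull_time <- function(n, scale, shape = 1, lp = 0) {
  u <- runif(n)
  (-log(u) * scale * exp(-lp))^shape
}

set.seed(123)

# shape = 1 and no covariates: exponential draws, mean close to the scale (50)
t_exp <- sim_weibull_time(1e5, scale = 50, shape = 1)
mean(t_exp)

# Same scale with a linear predictor of 1.5 (e.g. 1.5 * x1 with x1 = 1)
t_cov <- sim_weibull_time(1e5, scale = 50, shape = 1, lp = 1.5)
mean(t_cov)
```

With `shape = 1` the draws are exponential with mean equal to `scale`. Under this assumed form, a positive linear predictor shortens the survival times.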

```
# Baseline data definitions
def <- defData(varname = "x1", formula = 0.5, dist = "binary")
def <- defData(def, varname = "x2", formula = 0.5, dist = "binary")
def <- defData(def, varname = "grp", formula = 0.5, dist = "binary")
# Survival data definitions
sdef <- defSurv(varname = "survTime", formula = "1.5*x1", scale = "grp*50 + (1-grp)*25",
                shape = "grp*1 + (1-grp)*1.5")
sdef <- defSurv(sdef, varname = "censorTime", scale = 80, shape = 1)
sdef
```

```
##       varname formula               scale               shape
## 1:   survTime  1.5*x1 grp*50 + (1-grp)*25 grp*1 + (1-grp)*1.5
## 2: censorTime       0                  80                   1
```

The data are generated with calls to `genData` and `genSurv`:

```
# Generate the baseline data, then add survival and censoring times
dtSurv <- genData(300, def)
dtSurv <- genSurv(dtSurv, sdef)
head(dtSurv)
```

```
##    id x1 x2 grp survTime censorTime
## 1:  1  1  1   1   12.855      2.565
## 2:  2  1  0   0    1.159    326.380
## 3:  3  1  1   1   22.138     38.162
## 4:  4  0  0   1  123.927    169.751
## 5:  5  0  1   0   54.711     21.429
## 6:  6  1  1   1    9.969     26.729
```

```
# A comparison of survival by group and x1
dtSurv[, round(mean(survTime), 1), keyby = .(grp, x1)]
```

```
##    grp x1    V1
## 1:   0  0 194.3
## 2:   0  1  14.6
## 3:   1  0  43.9
## 4:   1  1  11.2
```
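As a rough check on these sample means, the expected survival time in each cell can be computed in closed form. The sketch below assumes that survival times follow `T = (E * scale * exp(-lp))^shape`, with `E` a standard exponential variable, which gives `E[T] = (scale * exp(-lp))^shape * gamma(1 + shape)`. This parameterization is an assumption inferred from the output, not something the text states.

```r
# ASSUMED parameterization (inferred, not confirmed by the text):
#   T = (E * scale * exp(-lp))^shape,  E ~ Exp(1)
#   => E[T] = (scale * exp(-lp))^shape * gamma(1 + shape)

# One row per (grp, x1) cell, in the same order as the table above
cells <- expand.grid(x1 = 0:1, grp = 0:1)

# Scale, shape, and linear predictor copied from the survTime definition
cells$scale <- with(cells, grp * 50 + (1 - grp) * 25)
cells$shape <- with(cells, grp * 1 + (1 - grp) * 1.5)
cells$lp    <- 1.5 * cells$x1

cells$expected <- with(cells, round((scale * exp(-lp))^shape * gamma(1 + shape), 1))
cells[, c("grp", "x1", "expected")]
```

Under that assumption the expected means are roughly 166, 18, 50, and 11, in the same order as the table, so the sample values are in a plausible range given the modest number of subjects in each cell.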

Observed survival times and censoring indicators can be generated by defining new fields:

```
cdef <- defDataAdd(varname = "obsTime", formula = "pmin(survTime, censorTime)",
                   dist = "nonrandom")
cdef <- defDataAdd(cdef, varname = "status", formula = "I(survTime <= censorTime)",
                   dist = "nonrandom")
dtSurv <- addColumns(cdef, dtSurv)
head(dtSurv)
```

```
##    id x1 x2 grp survTime censorTime obsTime status
## 1:  1  1  1   1   12.855      2.565   2.565      0
## 2:  2  1  0   0    1.159    326.380   1.159      1
## 3:  3  1  1   1   22.138     38.162  22.138      1
## 4:  4  0  0   1  123.927    169.751 123.927      1
## 5:  5  0  1   0   54.711     21.429  21.429      0
## 6:  6  1  1   1    9.969     26.729   9.969      1
```
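The same logic can be checked by hand in base R on the first three rows shown above. This is only an illustration of the `pmin` and comparison steps, using values copied from the output.

```r
# Survival and censoring times copied from rows 1-3 of the output above
survTime   <- c(12.855, 1.159, 22.138)
censorTime <- c(2.565, 326.380, 38.162)

# Observed time is whichever comes first: the event or the censoring
obsTime <- pmin(survTime, censorTime)

# status = 1 when the event is observed, 0 when the subject is censored
status <- as.integer(survTime <= censorTime)

data.frame(survTime, censorTime, obsTime, status)
```

The first subject is censored at 2.565 (`status` 0), while the other two have their events observed.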

```
# estimate proportion of censoring by x1 and group
dtSurv[, round(1 - mean(status), 2), keyby = .(grp, x1)]
```

```
##    grp x1   V1
## 1:   0  0 0.53
## 2:   0  1 0.12
## 3:   1  0 0.26
## 4:   1  1 0.11
```

Here is a Kaplan-Meier plot of the data by the four groups:
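A plot like this could be drawn with `survfit` from the `survival` package, as in the sketch below. The use of `survival` is an assumption, since the text does not say how the figure was produced. The data frame `d` is arbitrary stand-in data built in base R so that the block runs on its own; with the objects from this section, `dtSurv` would take its place.

```r
library(survival)

set.seed(1)
n <- 300

# Arbitrary stand-in data (NOT the dtSurv values shown above)
d <- data.frame(grp = rbinom(n, 1, 0.5), x1 = rbinom(n, 1, 0.5))
d$survTime   <- rexp(n, rate = exp(1.5 * d$x1) / (25 + 25 * d$grp))
d$censorTime <- rexp(n, rate = 1 / 80)
d$obsTime    <- pmin(d$survTime, d$censorTime)
d$status     <- as.integer(d$survTime <= d$censorTime)

# Kaplan-Meier estimates for each of the four grp-by-x1 combinations
fit <- survfit(Surv(obsTime, status) ~ grp + x1, data = d)

plot(fit, col = 1:4, xlab = "Time", ylab = "Probability of survival")
legend("topright", legend = names(fit$strata), col = 1:4, lty = 1)
```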