Package 'RNAmf' reference manual

Title:	Recursive non-additive emulator for multi-fidelity data
Description:	Performs RNA emulation and active learning proposed by Heo and Sung (2023+) <arXiv:2309.11772> for multi-fidelity computer experiments. The RNA emulator is particularly useful when the simulations with different fidelity level are nonlinearly correlated. The hyperparameters in the model are estimated by maximum likelihood estimation.
Authors:	Junoh Heo <[email protected]>, Chih-Li Sung <[email protected]>
Maintainer:	Junoh Heo <[email protected]>
License:	MIT + file LICENSE
Version:	0.1.0
Built:	2025-03-16 05:35:38 UTC
Source:	https://github.com/heojunoh/rnamf

find the next point by ALC criterion

Description

The function acquires the new point by the Active learning Cohn (ALC) criterion. It calculates the ALC criterion $\frac{\Delta \sigma_L^{2}(l,\bm{x})}{\sum^l_{j=1}C_j} = \frac{\int_{\Omega} \sigma_L^{*2}(\bm{\xi})-\tilde{\sigma}_L^{*2}(\bm{\xi};l,\bm{x}){\rm{d}}\bm{\xi}}{\sum^l_{j=1}C_j}$ , where $f_L$ is the highest-fidelity simulation code, $\sigma_L^{*2}(\bm{\xi})$ is the posterior variance of $f_L(\bm{\xi})$ , $C_j$ is the simulation cost at fidelity level $j$ , and $\tilde{\sigma}_L^{*2}(\bm{\xi};l,\bm{x})$ is the posterior variance based on the augmented design combining the current design and a new input location $\bm{x}$ at each fidelity level lower than or equal to $l$ . The integration is approximated by MC integration using uniform reference samples.

Usage

ALC_RNAmf(Xref = NULL, Xcand = NULL, fit, mc.sample = 100,
cost = NULL, optim = TRUE, parallel = FALSE, ncore = 1)
ALC_RNAmf(Xref = NULL, Xcand = NULL, fit, mc.sample = 100,
cost = NULL, optim = TRUE, parallel = FALSE, ncore = 1)

Arguments

`Xref`	vector or matrix of reference location to approximate the integral of ALC. If `Xref=NULL`, $100 \times d$ points are generated by Latin hypercube design. Default is `NULL`.
`Xcand`	vector or matrix of candidate set which could be added into the current design only when `optim=FALSE`. `Xcand` is the set of the points where ALC criterion is evaluated. If `Xcand=NULL`, `Xref` is used. Default is `NULL`. See details.
`fit`	object of class `RNAmf`.
`mc.sample`	a number of mc samples generated for the imputation through MC approximation. Default is `100`.
`cost`	vector of the costs for each level of fidelity. If `cost=NULL`, total costs at all levels would be 1. `cost` is encouraged to have a ascending order of positive value. Default is `NULL`.
`optim`	logical indicating whether to optimize AL criterion by `optim`'s gradient-based `L-BFGS-B` method. If `optim=TRUE`, $5 \times d$ starting points are generated by Latin hypercube design for optimization. If `optim=FALSE`, AL criterion is optimized on the `Xcand`. Default is `TRUE`.
`parallel`	logical indicating whether to compute the AL criterion in parallel or not. If `parallel=TRUE`, parallel computation is utilized. Default is `FALSE`.
`ncore`	a number of core for parallel. It is only used if `parallel=TRUE`. Default is 1.

Details

Xref plays a role of $\bm{\xi}$ to approximate the integration. To impute the posterior variance based on the augmented design $\tilde{\sigma}_L^{*2}(\bm{\xi};l,\bm{x})$ , MC approximation is used. Due to the nested assumption, imputing $y^{[s]}_{n_s+1}$ for each $1\leq s\leq l$ by drawing samples from the posterior distribution of $f_s(\bm{x}^{[s]}_{n_s+1})$ based on the current design allows to compute $\tilde{\sigma}_L^{*2}(\bm{\xi};l,\bm{x})$ . Inverse of covariance matrix is computed by the Sherman-Morrison formula. For details, see Heo and Sung (2023+, <arXiv:2309.11772>).

To search for the next acquisition $\bm{x^*}$ by maximizing AL criterion, the gradient-based optimization can be used by optim=TRUE. Firstly, $\tilde{\sigma}_L^{*2}(\bm{\xi};l,\bm{x})$ is computed on the $5 \times d$ number of points. After that, the point minimizing $\tilde{\sigma}_L^{*2}(\bm{\xi};l,\bm{x})$ serves as a starting point of optimization by L-BFGS-B method. Otherwise, when optim=FALSE, AL criterion is optimized only on Xcand.

The point is selected by maximizing the ALC criterion: $\text{argmax}_{l\in\{1,\ldots,L\}; \bm{x} \in \Omega} \frac{\Delta \sigma_L^{2}(l,\bm{x})}{\sum^l_{j=1}C_j}$ .

Value

ALC: list of ALC criterion integrated on Xref when each data point on Xcand is added at each level $l$ if optim=FALSE. If optim=TRUE, ALC returns NULL.
cost: a copy of cost.
Xcand: a copy of Xcand.
chosen: list of chosen level and point.
time: a scalar of the time for the computation.

Examples

## Not run: 
library(lhs)
library(doParallel)
library(foreach)

### simulation costs ###
cost <- c(1, 3)

### 1-d Perdikaris function in Perdikaris, et al. (2017) ###
# low-fidelity function
f1 <- function(x) {
  sin(8 * pi * x)
}

# high-fidelity function
f2 <- function(x) {
  (x - sqrt(2)) * (sin(8 * pi * x))^2
}

### training data ###
n1 <- 13
n2 <- 8

### fix seed to reproduce the result ###
set.seed(1)

### generate initial nested design ###
X <- NestedX(c(n1, n2), 1)
X1 <- X[[1]]
X2 <- X[[2]]

### n1 and n2 might be changed from NestedX ###
### assign n1 and n2 again ###
n1 <- nrow(X1)
n2 <- nrow(X2)

y1 <- f1(X1)
y2 <- f2(X2)

### n=100 uniform test data ###
x <- seq(0, 1, length.out = 100)

### fit an RNAmf ###
fit.RNAmf <- RNAmf_two_level(X1, y1, X2, y2, kernel = "sqex")

### predict ###
predy <- predict(fit.RNAmf, x)$mu
predsig2 <- predict(fit.RNAmf, x)$sig2

### active learning with optim=TRUE ###
alc.RNAmf.optim <- ALC_RNAmf(
  Xref = x, Xcand = x, fit.RNAmf, cost = cost,
  optim = TRUE, parallel = TRUE, ncore = 10
)
alc.RNAmf.optim$time # computation time of optim=TRUE

### active learning with optim=FALSE ###
alc.RNAmf <- ALC_RNAmf(
  Xref = x, Xcand = x, fit.RNAmf, cost = cost,
  optim = FALSE, parallel = TRUE, ncore = 10
)
alc.RNAmf$time # computation time of optim=FALSE

### visualize ALC ###
par(mfrow = c(1, 2))
plot(x, alc.RNAmf$ALC$ALC1,
  type = "l", lty = 2,
  xlab = "x", ylab = "ALC criterion augmented at the low-fidelity level",
  ylim = c(min(c(alc.RNAmf$ALC$ALC1, alc.RNAmf$ALC$ALC2)),
           max(c(alc.RNAmf$ALC$ALC1, alc.RNAmf$ALC$ALC2)))
)
plot(x, alc.RNAmf$ALC$ALC2,
  type = "l", lty = 2,
  xlab = "x", ylab = "ALC criterion augmented at the high-fidelity level",
  ylim = c(min(c(alc.RNAmf$ALC$1, alc.RNAmf$ALC$2)),
           max(c(alc.RNAmf$ALC$1, alc.RNAmf$ALC$2)))
)
points(alc.RNAmf$chosen$Xnext,
  alc.RNAmf$ALC$2[which(x == drop(alc.RNAmf$chosen$Xnext))],
  pch = 16, cex = 1, col = "red"
)
## End(Not run)

## Not run: 
library(lhs)
library(doParallel)
library(foreach)

### simulation costs ###
cost <- c(1, 3)

### 1-d Perdikaris function in Perdikaris, et al. (2017) ###
# low-fidelity function
f1 <- function(x) {
  sin(8 * pi * x)
}

# high-fidelity function
f2 <- function(x) {
  (x - sqrt(2)) * (sin(8 * pi * x))^2
}

### training data ###
n1 <- 13
n2 <- 8

### fix seed to reproduce the result ###
set.seed(1)

### generate initial nested design ###
X <- NestedX(c(n1, n2), 1)
X1 <- X[[1]]
X2 <- X[[2]]

### n1 and n2 might be changed from NestedX ###
### assign n1 and n2 again ###
n1 <- nrow(X1)
n2 <- nrow(X2)

y1 <- f1(X1)
y2 <- f2(X2)

### n=100 uniform test data ###
x <- seq(0, 1, length.out = 100)

### fit an RNAmf ###
fit.RNAmf <- RNAmf_two_level(X1, y1, X2, y2, kernel = "sqex")

### predict ###
predy <- predict(fit.RNAmf, x)$mu
predsig2 <- predict(fit.RNAmf, x)$sig2

### active learning with optim=TRUE ###
alc.RNAmf.optim <- ALC_RNAmf(
  Xref = x, Xcand = x, fit.RNAmf, cost = cost,
  optim = TRUE, parallel = TRUE, ncore = 10
)
alc.RNAmf.optim$time # computation time of optim=TRUE

### active learning with optim=FALSE ###
alc.RNAmf <- ALC_RNAmf(
  Xref = x, Xcand = x, fit.RNAmf, cost = cost,
  optim = FALSE, parallel = TRUE, ncore = 10
)
alc.RNAmf$time # computation time of optim=FALSE

### visualize ALC ###
par(mfrow = c(1, 2))
plot(x, alc.RNAmf$ALC$ALC1,
  type = "l", lty = 2,
  xlab = "x", ylab = "ALC criterion augmented at the low-fidelity level",
  ylim = c(min(c(alc.RNAmf$ALC$ALC1, alc.RNAmf$ALC$ALC2)),
           max(c(alc.RNAmf$ALC$ALC1, alc.RNAmf$ALC$ALC2)))
)
plot(x, alc.RNAmf$ALC$ALC2,
  type = "l", lty = 2,
  xlab = "x", ylab = "ALC criterion augmented at the high-fidelity level",
  ylim = c(min(c(alc.RNAmf$ALC$1, alc.RNAmf$ALC$2)),
           max(c(alc.RNAmf$ALC$1, alc.RNAmf$ALC$2)))
)
points(alc.RNAmf$chosen$Xnext,
  alc.RNAmf$ALC$2[which(x == drop(alc.RNAmf$chosen$Xnext))],
  pch = 16, cex = 1, col = "red"
)
## End(Not run)

find the next point by ALM criterion

Description

The function acquires the new point by the Active learning MacKay (ALM) criterion. It calculates the ALM criterion $\frac{\sigma^{*2}_l(\bm{x})}{\sum^l_{j=1}C_j}$ , where $\sigma^{*2}_l(\bm{x})$ is the posterior predictive variance at each fidelity level $l$ and $C_j$ is the simulation cost at level $j$ . For details, see Heo and Sung (2023+, <arXiv:2309.11772>).

Usage

ALM_RNAmf(Xcand = NULL, fit, cost = NULL, optim = TRUE, parallel = FALSE, ncore = 1)
ALM_RNAmf(Xcand = NULL, fit, cost = NULL, optim = TRUE, parallel = FALSE, ncore = 1)

Arguments

`Xcand`	vector or matrix of candidate set which could be added into the current design only used when `optim=FALSE`. `Xcand` is the set of the points where ALM criterion is evaluated. If `Xcand=NULL`, $100 \times d$ number of points are generated by Latin hypercube design. Default is `NULL`.
`fit`	object of class `RNAmf`.
`cost`	vector of the costs for each level of fidelity. If `cost=NULL`, total costs at all levels would be 1. `cost` is encouraged to have a ascending order of positive value. Default is `NULL`.
`optim`	logical indicating whether to optimize AL criterion by `optim`'s gradient-based `L-BFGS-B` method. If `optim=TRUE`, $5 \times d$ starting points are generated by Latin hypercube design for optimization. If `optim=FALSE`, AL criterion is optimized on the `Xcand`. Default is `TRUE`.
`parallel`	logical indicating whether to compute the AL criterion in parallel or not. If `parallel=TRUE`, parallel computation is utilized. Default is `FALSE`.
`ncore`	a number of core for parallel. It is only used if `parallel=TRUE`. Default is 1.

Value

ALM: list of ALM criterion computed at each point of Xcand at each level if optim=FALSE. If optim=TRUE, ALM returns NULL.
cost: a copy of cost.
Xcand: a copy of Xcand.
chosen: list of chosen level and point.
time: a scalar of the time for the computation.

Examples

## Not run: 
library(lhs)
library(doParallel)
library(foreach)

### simulation costs ###
cost <- c(1, 3)

### 1-d Perdikaris function in Perdikaris, et al. (2017) ###
# low-fidelity function
f1 <- function(x) {
  sin(8 * pi * x)
}

# high-fidelity function
f2 <- function(x) {
  (x - sqrt(2)) * (sin(8 * pi * x))^2
}

### training data ###
n1 <- 13
n2 <- 8

### fix seed to reproduce the result ###
set.seed(1)

### generate initial nested design ###
X <- NestedX(c(n1, n2), 1)
X1 <- X[[1]]
X2 <- X[[2]]

### n1 and n2 might be changed from NestedX ###
### assign n1 and n2 again ###
n1 <- nrow(X1)
n2 <- nrow(X2)

y1 <- f1(X1)
y2 <- f2(X2)

### n=100 uniform test data ###
x <- seq(0, 1, length.out = 100)

### fit an RNAmf ###
fit.RNAmf <- RNAmf_two_level(X1, y1, X2, y2, kernel = "sqex")

### predict ###
predy <- predict(fit.RNAmf, x)$mu
predsig2 <- predict(fit.RNAmf, x)$sig2

### active learning with optim=TRUE ###
alm.RNAmf.optim <- ALM_RNAmf(
  Xcand = x, fit.RNAmf, cost = cost,
  optim = TRUE, parallel = TRUE, ncore = 10
)
alm.RNAmf.optim$time # computation time of optim=TRUE

### active learning with optim=FALSE ###
alm.RNAmf <- ALM_RNAmf(
  Xcand = x, fit.RNAmf, cost = cost,
  optim = FALSE, parallel = TRUE, ncore = 10
)
alm.RNAmf$time # computation time of optim=FALSE

### visualize ALM ###
par(mfrow = c(1, 2))
plot(x, alm.RNAmf$ALM$ALM1,
  type = "l", lty = 2,
  xlab = "x", ylab = "ALM criterion at the low-fidelity level",
  ylim = c(min(c(alm.RNAmf$ALM$ALM1, alm.RNAmf$ALM$ALM2)),
           max(c(alm.RNAmf$ALM$ALM1, alm.RNAmf$ALM$ALM2)))
)
points(alm.RNAmf$chosen$Xnext,
  alm.RNAmf$ALM$ALM1[which(x == drop(alm.RNAmf$chosen$Xnext))],
  pch = 16, cex = 1, col = "red"
)
plot(x, alm.RNAmf$ALM$ALM2,
  type = "l", lty = 2,
  xlab = "x", ylab = "ALM criterion at the high-fidelity level",
  ylim = c(min(c(alm.RNAmf$ALM$ALM1, alm.RNAmf$ALM$ALM2)),
           max(c(alm.RNAmf$ALM$ALM1, alm.RNAmf$ALM$ALM2)))
)
## End(Not run)

## Not run: 
library(lhs)
library(doParallel)
library(foreach)

### simulation costs ###
cost <- c(1, 3)

### 1-d Perdikaris function in Perdikaris, et al. (2017) ###
# low-fidelity function
f1 <- function(x) {
  sin(8 * pi * x)
}

# high-fidelity function
f2 <- function(x) {
  (x - sqrt(2)) * (sin(8 * pi * x))^2
}

### training data ###
n1 <- 13
n2 <- 8

### fix seed to reproduce the result ###
set.seed(1)

### generate initial nested design ###
X <- NestedX(c(n1, n2), 1)
X1 <- X[[1]]
X2 <- X[[2]]

### n1 and n2 might be changed from NestedX ###
### assign n1 and n2 again ###
n1 <- nrow(X1)
n2 <- nrow(X2)

y1 <- f1(X1)
y2 <- f2(X2)

### n=100 uniform test data ###
x <- seq(0, 1, length.out = 100)

### fit an RNAmf ###
fit.RNAmf <- RNAmf_two_level(X1, y1, X2, y2, kernel = "sqex")

### predict ###
predy <- predict(fit.RNAmf, x)$mu
predsig2 <- predict(fit.RNAmf, x)$sig2

### active learning with optim=TRUE ###
alm.RNAmf.optim <- ALM_RNAmf(
  Xcand = x, fit.RNAmf, cost = cost,
  optim = TRUE, parallel = TRUE, ncore = 10
)
alm.RNAmf.optim$time # computation time of optim=TRUE

### active learning with optim=FALSE ###
alm.RNAmf <- ALM_RNAmf(
  Xcand = x, fit.RNAmf, cost = cost,
  optim = FALSE, parallel = TRUE, ncore = 10
)
alm.RNAmf$time # computation time of optim=FALSE

### visualize ALM ###
par(mfrow = c(1, 2))
plot(x, alm.RNAmf$ALM$ALM1,
  type = "l", lty = 2,
  xlab = "x", ylab = "ALM criterion at the low-fidelity level",
  ylim = c(min(c(alm.RNAmf$ALM$ALM1, alm.RNAmf$ALM$ALM2)),
           max(c(alm.RNAmf$ALM$ALM1, alm.RNAmf$ALM$ALM2)))
)
points(alm.RNAmf$chosen$Xnext,
  alm.RNAmf$ALM$ALM1[which(x == drop(alm.RNAmf$chosen$Xnext))],
  pch = 16, cex = 1, col = "red"
)
plot(x, alm.RNAmf$ALM$ALM2,
  type = "l", lty = 2,
  xlab = "x", ylab = "ALM criterion at the high-fidelity level",
  ylim = c(min(c(alm.RNAmf$ALM$ALM1, alm.RNAmf$ALM$ALM2)),
           max(c(alm.RNAmf$ALM$ALM1, alm.RNAmf$ALM$ALM2)))
)
## End(Not run)

find the next point by ALMC criterion

Description

The function acquires the new point by the hybrid approach, referred to as Active learning MacKay-Cohn (ALMC) criterion. It finds the optimal input location $\bm{x^*}$ by maximizing $\sigma^{*2}_L(\bm{x})$ , the posterior predictive variance at the highest-fidelity level $L$ . After selecting $\bm{x^*}$ , it finds the optimal fidelity level by maximizing ALC criterion at $\bm{x^*}$ , $\text{argmax}_{l\in\{1,\ldots,L\}} \frac{\Delta \sigma_L^{2}(l,\bm{x^*})}{\sum^l_{j=1}C_j}$ , where $C_j$ is the simulation cost at level $j$ . See ALC_RNAmf. For details, see Heo and Sung (2023+, <arXiv:2309.11772>).

Usage

ALMC_RNAmf(Xref = NULL, Xcand = NULL, fit, mc.sample = 100,
cost = NULL, optim = TRUE, parallel = FALSE, ncore = 1)
ALMC_RNAmf(Xref = NULL, Xcand = NULL, fit, mc.sample = 100,
cost = NULL, optim = TRUE, parallel = FALSE, ncore = 1)

Arguments

`Xref`	vector or matrix of reference location to approximate the integral of ALC. If `Xref=NULL`, $100 \times d$ points are generated by Latin hypercube design. Default is `NULL`.
`Xcand`	vector or matrix of candidate set which could be added into the current design only when `optim=FALSE`. `Xcand` is the set of the points where ALM criterion is evaluated. If `Xcand=NULL`, `Xref` is used. Default is `NULL`.
`fit`	object of class `RNAmf`.
`mc.sample`	a number of mc samples generated for the imputation through MC approximation. Default is `100`.
`cost`	vector of the costs for each level of fidelity. If `cost=NULL`, total costs at all levels would be 1. `cost` is encouraged to have a ascending order of positive value. Default is `NULL`.
`optim`	logical indicating whether to optimize AL criterion by `optim`'s gradient-based `L-BFGS-B` method. If `optim=TRUE`, $5 \times d$ starting points are generated by Latin hypercube design for optimization. If `optim=FALSE`, AL criterion is optimized on the `Xcand`. Default is `TRUE`.
`parallel`	logical indicating whether to compute the AL criterion in parallel or not. If `parallel=TRUE`, parallel computation is utilized. Default is `FALSE`.
`ncore`	a number of core for parallel. It is only used if `parallel=TRUE`. Default is 1.

Value

ALMC: vector of ALMC criterion $\frac{\Delta \sigma_L^{2}(l,\bm{x^*})}{\sum^l_{j=1}C_j}$ for $1\leq l\leq L$ .
ALM: vector of ALM criterion computed at each point of Xcand at the highest fidelity level if optim=FALSE. If optim=TRUE, ALM returns NULL.
ALC: list of ALC criterion integrated on Xref when each data point on Xcand is added at each level $l$ if optim=FALSE. If optim=TRUE, ALC returns NULL.
cost: a copy of cost.
Xcand: a copy of Xcand.
chosen: list of chosen level and point.
time: a scalar of the time for the computation.

Examples

## Not run: 
library(lhs)
library(doParallel)
library(foreach)

### simulation costs ###
cost <- c(1, 3)

### 1-d Perdikaris function in Perdikaris, et al. (2017) ###
# low-fidelity function
f1 <- function(x) {
  sin(8 * pi * x)
}

# high-fidelity function
f2 <- function(x) {
  (x - sqrt(2)) * (sin(8 * pi * x))^2
}

### training data ###
n1 <- 13
n2 <- 8

### fix seed to reproduce the result ###
set.seed(1)

### generate initial nested design ###
X <- NestedX(c(n1, n2), 1)
X1 <- X[[1]]
X2 <- X[[2]]

### n1 and n2 might be changed from NestedX ###
### assign n1 and n2 again ###
n1 <- nrow(X1)
n2 <- nrow(X2)

y1 <- f1(X1)
y2 <- f2(X2)

### n=100 uniform test data ###
x <- seq(0, 1, length.out = 100)

### fit an RNAmf ###
fit.RNAmf <- RNAmf_two_level(X1, y1, X2, y2, kernel = "sqex")

### predict ###
predy <- predict(fit.RNAmf, x)$mu
predsig2 <- predict(fit.RNAmf, x)$sig2

### active learning with optim=TRUE ###
almc.RNAmf.optim <- ALMC_RNAmf(
  Xref = x, Xcand = x, fit.RNAmf, cost = cost,
  optim = TRUE, parallel = TRUE, ncore = 10
)
almc.RNAmf.optim$time # computation time of optim=TRUE

### active learning with optim=FALSE ###
almc.RNAmf <- ALMC_RNAmf(
  Xref = x, Xcand = x, fit.RNAmf, cost = cost,
  optim = FALSE, parallel = TRUE, ncore = 10
)
almc.RNAmf$time # computation time of optim=FALSE

### visualize ALMC ###
par(mfrow = c(1, 2))
plot(x, almc.RNAmf$ALM,
  type = "l", lty = 2,
  xlab = "x", ylab = "ALM criterion at the high-fidelity level"
)
points(almc.RNAmf$chosen$Xnext,
  almc.RNAmf$ALM[which(x == drop(almc.RNAmf$chosen$Xnext))],
  pch = 16, cex = 1, col = "red"
)
plot(x, almc.RNAmf$ALC$ALC1,
  type = "l", lty = 2,
  ylim = c(min(c(alc.RNAmf$ALC$ALC1, alc.RNAmf$ALC$ALC2)),
           max(c(alc.RNAmf$ALC$ALC1, alc.RNAmf$ALC$ALC2))),
  xlab = "x", ylab = "ALC criterion augmented at each level on the optimal input location"
)
lines(x, almc.RNAmf$ALC$ALC2, type = "l", lty = 2)
points(almc.RNAmf$chosen$Xnext,
  almc.RNAmf$ALC$ALC1[which(x == drop(almc.RNAmf$chosen$Xnext))],
  pch = 16, cex = 1, col = "red"
)
points(almc.RNAmf$chosen$Xnext,
  almc.RNAmf$ALC$ALC2[which(x == drop(almc.RNAmf$chosen$Xnext))],
  pch = 16, cex = 1, col = "red"
)
## End(Not run)

## Not run: 
library(lhs)
library(doParallel)
library(foreach)

### simulation costs ###
cost <- c(1, 3)

### 1-d Perdikaris function in Perdikaris, et al. (2017) ###
# low-fidelity function
f1 <- function(x) {
  sin(8 * pi * x)
}

# high-fidelity function
f2 <- function(x) {
  (x - sqrt(2)) * (sin(8 * pi * x))^2
}

### training data ###
n1 <- 13
n2 <- 8

### fix seed to reproduce the result ###
set.seed(1)

### generate initial nested design ###
X <- NestedX(c(n1, n2), 1)
X1 <- X[[1]]
X2 <- X[[2]]

### n1 and n2 might be changed from NestedX ###
### assign n1 and n2 again ###
n1 <- nrow(X1)
n2 <- nrow(X2)

y1 <- f1(X1)
y2 <- f2(X2)

### n=100 uniform test data ###
x <- seq(0, 1, length.out = 100)

### fit an RNAmf ###
fit.RNAmf <- RNAmf_two_level(X1, y1, X2, y2, kernel = "sqex")

### predict ###
predy <- predict(fit.RNAmf, x)$mu
predsig2 <- predict(fit.RNAmf, x)$sig2

### active learning with optim=TRUE ###
almc.RNAmf.optim <- ALMC_RNAmf(
  Xref = x, Xcand = x, fit.RNAmf, cost = cost,
  optim = TRUE, parallel = TRUE, ncore = 10
)
almc.RNAmf.optim$time # computation time of optim=TRUE

### active learning with optim=FALSE ###
almc.RNAmf <- ALMC_RNAmf(
  Xref = x, Xcand = x, fit.RNAmf, cost = cost,
  optim = FALSE, parallel = TRUE, ncore = 10
)
almc.RNAmf$time # computation time of optim=FALSE

### visualize ALMC ###
par(mfrow = c(1, 2))
plot(x, almc.RNAmf$ALM,
  type = "l", lty = 2,
  xlab = "x", ylab = "ALM criterion at the high-fidelity level"
)
points(almc.RNAmf$chosen$Xnext,
  almc.RNAmf$ALM[which(x == drop(almc.RNAmf$chosen$Xnext))],
  pch = 16, cex = 1, col = "red"
)
plot(x, almc.RNAmf$ALC$ALC1,
  type = "l", lty = 2,
  ylim = c(min(c(alc.RNAmf$ALC$ALC1, alc.RNAmf$ALC$ALC2)),
           max(c(alc.RNAmf$ALC$ALC1, alc.RNAmf$ALC$ALC2))),
  xlab = "x", ylab = "ALC criterion augmented at each level on the optimal input location"
)
lines(x, almc.RNAmf$ALC$ALC2, type = "l", lty = 2)
points(almc.RNAmf$chosen$Xnext,
  almc.RNAmf$ALC$ALC1[which(x == drop(almc.RNAmf$chosen$Xnext))],
  pch = 16, cex = 1, col = "red"
)
points(almc.RNAmf$chosen$Xnext,
  almc.RNAmf$ALC$ALC2[which(x == drop(almc.RNAmf$chosen$Xnext))],
  pch = 16, cex = 1, col = "red"
)
## End(Not run)

Constructing the nested design sets for RNA model.

Description

The function constructs the nested design sets with two fidelity levels $\mathcal{X}_2 \subseteq \mathcal{X}_{1}$ for RNAmf_two_level or three fidelity levels $\mathcal{X}_3 \subseteq \mathcal{X}_2 \subseteq \mathcal{X}_{1}$ for RNAmf_three_level.

Usage

NestedX(n, d)
NestedX(n, d)

Arguments

`n`	vector of the number of design points at each fidelity level $l$ . Thus, the vector must have a positive value $n_1, n_2$ or $n_1, n_2, n_3$ where $n_1 > n_2 > n_3$ .
`d`	constant of the dimension of the design.

Details

The procedure replace the points of lower level design $\mathcal{X}_{l-1}$ to the closest points of higher level design $\mathcal{X}_{l}$ . The length of the $\mathcal{X}_{l-1}$ could be larger than the user specified. For details, see "NestedDesign".

Value

A list containing the design at each level, i.e., $\mathcal{X}_{1}, \mathcal{X}_{2}$ or $\mathcal{X}_{1}, \mathcal{X}_{2}, \mathcal{X}_{3}$ .

References

L. Le Gratiet and J. Garnier (2014). Recursive co-kriging model for design of computer experiments with multiple levels of fidelity. International Journal for Uncertainty Quantification, 4(5), 365-386; doi:10.1615/Int.J.UncertaintyQuantification.2014006914

Examples

### number of design points ###
n1 <- 30
n2 <- 15

### dimension of the design ###
d <- 2

### fix seed to reproduce the result ###
set.seed(1)

### generate the nested design ###
NX <- NestedX(c(n1, n2), d)

### visualize nested design ###
plot(NX[[1]], col="red", pch=1, xlab="x1", ylab="x2")
points(NX[[2]], col="blue", pch=4)

### number of design points ###
n1 <- 30
n2 <- 15

### dimension of the design ###
d <- 2

### fix seed to reproduce the result ###
set.seed(1)

### generate the nested design ###
NX <- NestedX(c(n1, n2), d)

### visualize nested design ###
plot(NX[[1]], col="red", pch=1, xlab="x1", ylab="x2")
points(NX[[2]], col="blue", pch=4)

predict

Description

The function computes the posterior mean and variance of RNA models with two or three fidelity levels by fitted model using RNAmf_two_level or RNAmf_three_level.

Usage

## S3 method for class 'RNAmf'
predict(object, x, ...)
## S3 method for class 'RNAmf'
predict(object, x, ...)

Arguments

`object`	a class `RNAmf` object fitted by `RNAmf_two_level` or `RNAmf_three_level`.
`x`	vector or matrix of new input locations to predict.
`...`	for compatibility with generic method `predict`.

Details

prediction of the RNAmf emulator with 2 or 3 fidelity levels.

From the model fitted by RNAmf_two_level or RNAmf_three_level, the posterior mean and variance are calculated based on the closed form expression derived by a recursive fashion. The formulas depend on its kernel choices. For details, see Heo and Sung (2023+, <arXiv:2309.11772>).

Value

mu: vector of predictive posterior mean.
sig2: vector of predictive posterior variance.
time: a scalar of the time for the computation.

Examples

### two levels example ###
library(lhs)

### Perdikaris function ###
f1 <- function(x) {
  sin(8 * pi * x)
}

f2 <- function(x) {
  (x - sqrt(2)) * (sin(8 * pi * x))^2
}

### training data ###
n1 <- 13
n2 <- 8

### fix seed to reproduce the result ###
set.seed(1)

### generate initial nested design ###
X <- NestedX(c(n1, n2), 1)
X1 <- X[[1]]
X2 <- X[[2]]

### n1 and n2 might be changed from NestedX ###
### assign n1 and n2 again ###
n1 <- nrow(X1)
n2 <- nrow(X2)

y1 <- f1(X1)
y2 <- f2(X2)

### n=100 uniform test data ###
x <- seq(0, 1, length.out = 100)

### fit an RNAmf ###
fit.RNAmf <- RNAmf_two_level(X1, y1, X2, y2, kernel = "sqex")

### predict ###
predy <- predict(fit.RNAmf, x)$mu
predsig2 <- predict(fit.RNAmf, x)$sig2

### RMSE ###
print(sqrt(mean((predy - f2(x))^2)))

### visualize the emulation performance ###
plot(x, predy,
  type = "l", lwd = 2, col = 3, # emulator and confidence interval
  ylim = c(-2, 1)
)
lines(x, predy + 1.96 * sqrt(predsig2 * length(y2) / (length(y2) - 2)), col = 3, lty = 2)
lines(x, predy - 1.96 * sqrt(predsig2 * length(y2) / (length(y2) - 2)), col = 3, lty = 2)

curve(f2(x), add = TRUE, col = 1, lwd = 2, lty = 2) # high fidelity function

points(X1, y1, pch = 1, col = "red") # low-fidelity design
points(X2, y2, pch = 4, col = "blue") # high-fidelity design

### three levels example ###
### Branin function ###
branin <- function(xx, l){
  x1 <- xx[1]
  x2 <- xx[2]
  if(l == 1){
    10*sqrt((-1.275*(1.2*x1+0.4)^2/pi^2+5*(1.2*x1+0.4)/pi+(1.2*x2+0.4)-6)^2 +
    (10-5/(4*pi))*cos((1.2*x1+0.4))+ 10) + 2*(1.2*x1+1.9) - 3*(3*(1.2*x2+2.4)-1) - 1 - 3*x2 + 1
  }else if(l == 2){
    10*sqrt((-1.275*(x1+2)^2/pi^2+5*(x1+2)/pi+(x2+2)-6)^2 +
    (10-5/(4*pi))*cos((x1+2))+ 10) + 2*(x1-0.5) - 3*(3*x2-1) - 1
  }else if(l == 3){
    (-1.275*x1^2/pi^2+5*x1/pi+x2-6)^2 + (10-5/(4*pi))*cos(x1)+ 10
  }
}

output.branin <- function(x, l){
  factor_range <- list("x1" = c(-5, 10), "x2" = c(0, 15))

  for(i in 1:length(factor_range)) x[i] <- factor_range[[i]][1] + x[i] * diff(factor_range[[i]])
  branin(x[1:2], l)
}

### training data ###
n1 <- 20; n2 <- 15; n3 <- 10

### fix seed to reproduce the result ###
set.seed(1)

### generate initial nested design ###
X <- NestedX(c(n1, n2, n3), 2)
X1 <- X[[1]]
X2 <- X[[2]]
X3 <- X[[3]]

### n1, n2 and n3 might be changed from NestedX ###
### assign n1, n2 and n3 again ###
n1 <- nrow(X1)
n2 <- nrow(X2)
n3 <- nrow(X3)

y1 <- apply(X1,1,output.branin, l=1)
y2 <- apply(X2,1,output.branin, l=2)
y3 <- apply(X3,1,output.branin, l=3)

### n=10000 grid test data ###
x <- as.matrix(expand.grid(seq(0, 1, length.out = 100),seq(0, 1, length.out = 100)))

### fit an RNAmf ###
fit.RNAmf <- RNAmf_three_level(X1, y1, X2, y2, X3, y3, kernel = "sqex")

### predict ###
pred.RNAmf <- predict(fit.RNAmf, x)
predy <- pred.RNAmf$mu
predsig2 <- pred.RNAmf$sig2

### RMSE ###
print(sqrt(mean((predy - apply(x,1,output.branin, l=3))^2)))

### visualize the emulation performance ###
x1 <- x2 <- seq(0, 1, length.out = 100)
par(mfrow=c(1,2))
image(x1, x2, matrix(apply(x,1,output.branin, l=3), ncol=100),
zlim=c(0,310), main="Branin function")
image(x1, x2, matrix(predy, ncol=100),
zlim=c(0,310), main="RNAmf prediction")

### predictive variance ###
print(predsig2)

### two levels example ###
library(lhs)

### Perdikaris function ###
f1 <- function(x) {
  sin(8 * pi * x)
}

f2 <- function(x) {
  (x - sqrt(2)) * (sin(8 * pi * x))^2
}

### training data ###
n1 <- 13
n2 <- 8

### fix seed to reproduce the result ###
set.seed(1)

### generate initial nested design ###
X <- NestedX(c(n1, n2), 1)
X1 <- X[[1]]
X2 <- X[[2]]

### n1 and n2 might be changed from NestedX ###
### assign n1 and n2 again ###
n1 <- nrow(X1)
n2 <- nrow(X2)

y1 <- f1(X1)
y2 <- f2(X2)

### n=100 uniform test data ###
x <- seq(0, 1, length.out = 100)

### fit an RNAmf ###
fit.RNAmf <- RNAmf_two_level(X1, y1, X2, y2, kernel = "sqex")

### predict ###
predy <- predict(fit.RNAmf, x)$mu
predsig2 <- predict(fit.RNAmf, x)$sig2

### RMSE ###
print(sqrt(mean((predy - f2(x))^2)))

### visualize the emulation performance ###
plot(x, predy,
  type = "l", lwd = 2, col = 3, # emulator and confidence interval
  ylim = c(-2, 1)
)
lines(x, predy + 1.96 * sqrt(predsig2 * length(y2) / (length(y2) - 2)), col = 3, lty = 2)
lines(x, predy - 1.96 * sqrt(predsig2 * length(y2) / (length(y2) - 2)), col = 3, lty = 2)

curve(f2(x), add = TRUE, col = 1, lwd = 2, lty = 2) # high fidelity function

points(X1, y1, pch = 1, col = "red") # low-fidelity design
points(X2, y2, pch = 4, col = "blue") # high-fidelity design

### three levels example ###
### Branin function ###
branin <- function(xx, l){
  x1 <- xx[1]
  x2 <- xx[2]
  if(l == 1){
    10*sqrt((-1.275*(1.2*x1+0.4)^2/pi^2+5*(1.2*x1+0.4)/pi+(1.2*x2+0.4)-6)^2 +
    (10-5/(4*pi))*cos((1.2*x1+0.4))+ 10) + 2*(1.2*x1+1.9) - 3*(3*(1.2*x2+2.4)-1) - 1 - 3*x2 + 1
  }else if(l == 2){
    10*sqrt((-1.275*(x1+2)^2/pi^2+5*(x1+2)/pi+(x2+2)-6)^2 +
    (10-5/(4*pi))*cos((x1+2))+ 10) + 2*(x1-0.5) - 3*(3*x2-1) - 1
  }else if(l == 3){
    (-1.275*x1^2/pi^2+5*x1/pi+x2-6)^2 + (10-5/(4*pi))*cos(x1)+ 10
  }
}

output.branin <- function(x, l){
  factor_range <- list("x1" = c(-5, 10), "x2" = c(0, 15))

  for(i in 1:length(factor_range)) x[i] <- factor_range[[i]][1] + x[i] * diff(factor_range[[i]])
  branin(x[1:2], l)
}

### training data ###
n1 <- 20; n2 <- 15; n3 <- 10

### fix seed to reproduce the result ###
set.seed(1)

### generate initial nested design ###
X <- NestedX(c(n1, n2, n3), 2)
X1 <- X[[1]]
X2 <- X[[2]]
X3 <- X[[3]]

### n1, n2 and n3 might be changed from NestedX ###
### assign n1, n2 and n3 again ###
n1 <- nrow(X1)
n2 <- nrow(X2)
n3 <- nrow(X3)

y1 <- apply(X1,1,output.branin, l=1)
y2 <- apply(X2,1,output.branin, l=2)
y3 <- apply(X3,1,output.branin, l=3)

### n=10000 grid test data ###
x <- as.matrix(expand.grid(seq(0, 1, length.out = 100),seq(0, 1, length.out = 100)))

### fit an RNAmf ###
fit.RNAmf <- RNAmf_three_level(X1, y1, X2, y2, X3, y3, kernel = "sqex")

### predict ###
pred.RNAmf <- predict(fit.RNAmf, x)
predy <- pred.RNAmf$mu
predsig2 <- pred.RNAmf$sig2

### RMSE ###
print(sqrt(mean((predy - apply(x,1,output.branin, l=3))^2)))

### visualize the emulation performance ###
x1 <- x2 <- seq(0, 1, length.out = 100)
par(mfrow=c(1,2))
image(x1, x2, matrix(apply(x,1,output.branin, l=3), ncol=100),
zlim=c(0,310), main="Branin function")
image(x1, x2, matrix(predy, ncol=100),
zlim=c(0,310), main="RNAmf prediction")

### predictive variance ###
print(predsig2)

Fitting the model with three fidelity levels

Description

The function fits RNA models with designs of three fidelity levels. The estimation method is based on MLE. Possible kernel choices are squared exponential, Matern kernel with smoothness parameter 1.5 and 2.5. The function returns fitted model by RNAmf_two_level, fitted model at level 3, whether constant mean or not, and kernel choice.

Usage

RNAmf_three_level(X1, y1, X2, y2, X3, y3, kernel = "sqex", constant = TRUE, ...)
RNAmf_three_level(X1, y1, X2, y2, X3, y3, kernel = "sqex", constant = TRUE, ...)

Arguments

`X1`	vector or matrix of input locations for the low fidelity level.
`y1`	vector of response values for the low fidelity level.
`X2`	vector or matrix of input locations for the medium fidelity level.
`y2`	vector of response values for the medium fidelity level.
`X3`	vector or matrix of input locations for the high fidelity level.
`y3`	vector of response values for the high fidelity level.
`kernel`	character specifying kernel type to be used, to be chosen between `"sqex"`(squared exponential), `"matern1.5"`, or `"matern2.5"`. Default is `"sqex"`.
`constant`	logical indicating for constant mean of GP (`constant=TRUE`) or zero mean (`constant=FALSE`). Default is `TRUE`.
`...`	for compatibility with `optim`.

Details

Consider the model $\begin{cases} & f_1(\bm{x}) = W_1(\bm{x}),\\ & f_l(\bm{x}) = W_l(\bm{x}, f_{l-1}(\bm{x})) \quad\text{for}\quad l=2,3, \end{cases}$ where $f_l$ is the simulation code at fidelity level $l$ , and $W_l(\bm{x}) \sim GP(\alpha_l, \tau_l^2 K_l(\bm{x}, \bm{x}'))$ is GP model. Hyperparameters $(\alpha_l, \tau_l^2, \bm{\theta_l})$ are estimated by maximizing the log-likelihood via an optimization algorithm "L-BFGS-B". For constant=FALSE, $\alpha_l=0$ .

Covariance kernel is defined as: $K_l(\bm{x}, \bm{x}')=\prod^d_{j=1}\phi(x_j,x'_j;\theta_{lj})$ with $\phi(x, x';\theta) = \exp \left( -\frac{ \left( x - x' \right)^2}{\theta} \right)$ for squared exponential kernel; kernel="sqex", $\phi(x,x';\theta) =\left( 1+\frac{\sqrt{3}|x- x'|}{\theta} \right) \exp \left( -\frac{\sqrt{3}|x- x'|}{\theta} \right)$ for Matern kernel with the smoothness parameter of 1.5; kernel="matern1.5" and $\phi(x, x';\theta) = \left( 1+\frac{\sqrt{5}|x-x'|}{\theta} +\frac{5(x-x')^2}{3\theta^2} \right) \exp \left( -\frac{\sqrt{5}|x-x'|}{\theta} \right)$ for Matern kernel with the smoothness parameter of 2.5; kernel="matern2.5".

For details, see Heo and Sung (2023+, <arXiv:2309.11772>).

Value

fit.RNAmf_two_level: a class RNAmf object fitted by RNAmf_two_level. It contains a list of $\begin{cases} & \text{\code{fit1} for } (X_1, y_1),\\ & \text{\code{fit2} for } ((X_2, f_1(X_2)), y_2), \end{cases}$ . See RNAmf_two_level.
fit3: list of fitted model for $((X_2, f_2(X_3, f_1(X_3))), y_3)$ .
constant: copy of constant.
kernel: copy of kernel.
level: a level of the fidelity. It returns 3.
time: a scalar of the time for the computation.

Fitting the Recursive non-additive model with two fidelity levels.

Description

The function fits RNA models with designs of two fidelity levels. The estimation method is based on MLE. Possible kernel choices are squared exponential, Matern kernel with smoothness parameter 1.5 and 2.5. The function returns fitted model at level 1 and 2, whether constant mean or not, and kernel choice.

Usage

RNAmf_two_level(X1, y1, X2, y2, kernel = "sqex", constant = TRUE, ...)
RNAmf_two_level(X1, y1, X2, y2, kernel = "sqex", constant = TRUE, ...)

Arguments

`X1`	vector or matrix of input locations for the low fidelity level.
`y1`	vector of response values for the low fidelity level.
`X2`	vector or matrix of input locations for the high fidelity level.
`y2`	vector of response values for the high fidelity level.
`kernel`	character specifying kernel type to be used, to be chosen between `"sqex"`(squared exponential), `"matern1.5"`, or `"matern2.5"`. Default is `"sqex"`.
`constant`	logical indicating for constant mean of GP (`constant=TRUE`) or zero mean (`constant=FALSE`). Default is `TRUE`.
`...`	for compatibility with `optim`.

Details

Consider the model $\begin{cases} & f_1(\bm{x}) = W_1(\bm{x}),\\ & f_2(\bm{x}) = W_2(\bm{x}, f_1(\bm{x})), \end{cases}$ where $f_l$ is the simulation code at fidelity level $l$ , and $W_l(\bm{x}) \sim GP(\alpha_l, \tau_l^2 K_l(\bm{x}, \bm{x}'))$ is GP model. Hyperparameters $(\alpha_l, \tau_l^2, \bm{\theta_l})$ are estimated by maximizing the log-likelihood via an optimization algorithm "L-BFGS-B". For constant=FALSE, $\alpha_l=0$ .

For details, see Heo and Sung (2023+, <arXiv:2309.11772>).

Value

fit1: list of fitted model for $(X_1, y_1)$ .
fit2: list of fitted model for $((X_2, f_1(X_2)), y_2)$ .
constant: copy of constant.
kernel: copy of kernel.
level: a level of the fidelity. It returns 2.
time: a scalar of the time for the computation.

Package 'RNAmf'

Help Index

find the next point by ALC criterion

Description

Usage

Arguments

Details

Value

Examples

find the next point by ALM criterion

Description

Usage

Arguments

Value

Examples

find the next point by ALMC criterion

Description

Usage

Arguments

Value

Examples

Constructing the nested design sets for RNA model.

Description

Usage

Arguments

Details

Value

References

Examples

predict

Description

Usage

Arguments

Details

Value

See Also

Examples

Fitting the model with three fidelity levels

Description

Usage

Arguments

Details

Value

See Also

Fitting the Recursive non-additive model with two fidelity levels.

Description

Usage

Arguments

Details

Value

See Also