Title: | Confidence Intervals Utilizing Uncertain Prior Information |
---|---|
Description: | Computes a confidence interval for a specified linear combination of the regression parameters in a linear regression model with iid normal errors with known variance when there is uncertain prior information that a distinct specified linear combination of the regression parameters takes a given value. This confidence interval, found by numerical nonlinear constrained optimization, has the required minimum coverage and utilizes this uncertain prior information through desirable expected length properties. This confidence interval has the following three practical applications. Firstly, if the error variance has been accurately estimated from previous data then it may be treated as being effectively known. Secondly, for sufficiently large (dimension of the response vector) minus (dimension of regression parameter vector), greater than or equal to 30 (say), if we replace the assumed known value of the error variance by its usual estimator in the formula for the confidence interval then the resulting interval has, to a very good approximation, the same coverage probability and expected length properties as when the error variance is known. Thirdly, some more complicated models can be approximated by the linear regression model with error variance known when certain unknown parameters are replaced by estimates. This confidence interval is described in Mainzer, R. and Kabaila, P. (2019) <doi:10.32614/RJ-2019-026>, and is a member of the family of confidence intervals proposed by Kabaila, P. and Giri, K. (2009) <doi:10.1016/j.jspi.2009.03.018>. |
Authors: | Paul Kabaila [aut, cre], Rheanna Mainzer [aut], Ayesha Perera [ctb] |
Maintainer: | Paul Kabaila <[email protected]> |
License: | GPL-2 |
Version: | 1.2.3 |
Built: | 2024-11-16 04:50:01 UTC |
Source: | https://github.com/cran/ciuupi |
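Compute the correlation $\rho$ between $\hat{\theta}$ and $\hat{\tau}$ from the $p$-vectors $a$ and $c$ and the $n \times p$ design matrix $X$
Computes the known correlation $\rho$ between $\hat{\theta}$ and $\hat{\tau}$. This correlation is computed from the $p$-vectors $a$ and $c$ and the $n \times p$ design matrix $X$, with linearly independent columns, using the formula $\rho = a^\top (X^\top X)^{-1} c / (v_\theta v_\tau)^{1/2}$, where $v_\theta = a^\top (X^\top X)^{-1} a$ and $v_\tau = c^\top (X^\top X)^{-1} c$.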
acX_to_rho(a, c, X)
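a | The $p$-vector $a$ that specifies the parameter of interest $\theta = a^\top \beta$ |
c | The $p$-vector $c$ that, together with the number $t$, specifies the parameter $\tau = c^\top \beta - t$ about which there is uncertain prior information |
X | The $n \times p$ design matrix $X$, with linearly independent columns |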
The known correlation $\rho$ between $\hat{\theta}$ and $\hat{\tau}$.
a <- c(0, 2, 0, -2)
c <- c(0, 0, 0, 1)
x1 <- c(-1, 1, -1, 1)
x2 <- c(-1, -1, 1, 1)
X <- cbind(rep(1, 4), x1, x2, x1*x2)
rho <- acX_to_rho(a, c, X)
print(rho)
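The correlation formula $\rho = a^\top (X^\top X)^{-1} c / (v_\theta v_\tau)^{1/2}$ can be checked by hand in base R. The following is a minimal sketch under that formula, not the package source; the helper name `rho_sketch` is hypothetical and is used here only for illustration.

```r
# Hypothetical helper (not part of ciuupi): direct evaluation of
# rho = a'(X'X)^{-1} c / sqrt(v_theta * v_tau)
rho_sketch <- function(a, c, X) {
  XtX.inv <- solve(crossprod(X))                 # (X'X)^{-1}
  v.theta <- drop(crossprod(a, XtX.inv %*% a))   # a'(X'X)^{-1} a
  v.tau   <- drop(crossprod(c, XtX.inv %*% c))   # c'(X'X)^{-1} c
  drop(crossprod(a, XtX.inv %*% c)) / sqrt(v.theta * v.tau)
}

# Same inputs as the acX_to_rho example
a <- c(0, 2, 0, -2)
c <- c(0, 0, 0, 1)
x1 <- c(-1, 1, -1, 1)
x2 <- c(-1, -1, 1, 1)
X <- cbind(rep(1, 4), x1, x2, x1 * x2)
rho.check <- rho_sketch(a, c, X)
```

For this orthogonal design $X^\top X = 4I$, so $v_\theta = 2$, $v_\tau = 1/4$ and the sketch gives $\rho = -1/\sqrt{2}$, the value of `rho` used in the `bs_ciuupi` example.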
Compute the vector $(b(h), \ldots, b((q-1)h), s(0), s(h), \ldots, s((q-1)h))$ that determines the functions $b$ and $s$ that specify the CIUUPI for all possible values of $\sigma$ and the observed response vector
Chooses the positive number $d$ and the positive integer $q$, sets $h = d/q$, and then computes the $(2q-1)$-vector $(b(h), \ldots, b((q-1)h), s(0), s(h), \ldots, s((q-1)h))$ that determines, via cubic spline interpolation, the functions $b$ and $s$ which specify the confidence interval for $\theta$ that utilizes the uncertain prior information (CIUUPI), for all possible values of $\sigma$ and the observed response vector. To an excellent approximation, this confidence interval has minimum coverage probability $1 - \alpha$.
bs_ciuupi(alpha, rho, natural = 1)
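alpha | The desired minimum coverage probability is $1 - \alpha$ |
rho | The known correlation $\rho$ between $\hat{\theta}$ and $\hat{\tau}$ |
natural | Equal to 1 (default) if the functions $b$ and $s$ are found by 'natural' cubic spline interpolation, or 0 if they are found by 'clamped' cubic spline interpolation |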
Suppose that $y = X \beta + \varepsilon$, where $y$ is a random $n$-vector of responses, $X$ is a known $n$ by $p$ matrix with linearly independent columns, $\beta$ is an unknown parameter $p$-vector and $\varepsilon$ is the random error with components that are iid normally distributed with zero mean and known variance $\sigma^2$. The parameter of interest is $\theta = a^\top \beta$. Also let $\tau = c^\top \beta - t$, where $a$ and $c$ are specified linearly independent vectors and $t$ is a specified number. The uncertain prior information is that $\tau = 0$.
Let `rho` denote the known correlation between $\hat{\theta}$ and $\hat{\tau}$. We can compute `rho` from given values of $a$, $c$ and $X$ using the function `acX_to_rho`.
The confidence interval for $\theta$, with minimum coverage probability $1 - \alpha$, that utilizes the uncertain prior information that $\tau = 0$ belongs to a class of confidence intervals indexed by the functions $b$ and $s$. The function $b$ is an odd continuous function and the function $s$ is an even continuous function. In addition, $b(x) = 0$ and $s(x)$ is equal to the $1 - \alpha/2$ quantile of the standard normal distribution for all $|x| \ge d$, where $d$ is a given positive number.
Extensive numerical explorations have been used to find a formula (in terms of `alpha` and `rho`) for a 'goldilocks' value of $d$ that is neither too large nor too small. Then let $q$ = ceiling($d$/0.75) and $h = d/q$.
The values of the functions $b$ and $s$ in the interval $[-d, d]$ are specified by the $(2q-1)$-vector $(b(h), \ldots, b((q-1)h), s(0), s(h), \ldots, s((q-1)h))$. The values of $b(kh)$ and $s(kh)$ for $k = -q, \ldots, q$ are deduced from this vector using the assumptions made about the functions $b$ and $s$. The values of $b(x)$ and $s(x)$ for any $x$ in the interval $[-d, d]$ are then found by cubic spline interpolation through the values of $b(kh)$ and $s(kh)$ for $k = -q, \ldots, q$. For `natural`=1 (default) this is 'natural' cubic spline interpolation and for `natural`=0 this is 'clamped' cubic spline interpolation.
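The construction of $b$ and $s$ from the $(2q-1)$-vector can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's internal code: the value `d <- 6` and the entries of `bsvec` are made up (a real `bsvec` comes from `bs_ciuupi`), only the 'natural' case is shown, and `stats::splinefun` stands in for whatever interpolation routine the package uses.

```r
alpha <- 0.05
d <- 6                        # assumed value of d, for illustration only
q <- ceiling(d / 0.75)        # number of equal-length intervals in [0, d]
h <- d / q                    # knot spacing
z <- qnorm(1 - alpha / 2)     # value of s(x) for |x| >= d

# Made-up bsvec = (b(h), ..., b((q-1)h), s(0), s(h), ..., s((q-1)h))
b.pos <- 0.1 * sin(pi * (1:(q - 1)) / q)
s.nonneg <- z - 0.2 * cos(pi * (0:(q - 1)) / (2 * q))
bsvec <- c(b.pos, s.nonneg)

# Values at the knots kh, k = -q, ..., q, deduced from the assumptions:
# b is odd with b(0) = 0 and b(-d) = b(d) = 0; s is even with s(-d) = s(d) = z
knots <- (-q:q) * h
b.knots <- c(0, -rev(b.pos), 0, b.pos, 0)
s.knots <- c(z, rev(s.nonneg[-1]), s.nonneg, z)

# Natural cubic spline interpolation on [-d, d]
b.spline <- splinefun(knots, b.knots, method = "natural")
s.spline <- splinefun(knots, s.knots, method = "natural")

# Outside (-d, d) the functions are fixed: b(x) = 0 and s(x) = z
b <- function(x) ifelse(abs(x) >= d, 0, b.spline(x))
s <- function(x) ifelse(abs(x) >= d, z, s.spline(x))
```

The vector has length $(q-1) + q = 2q - 1$ because $b(0) = 0$ is implied by oddness, while $s(0)$ is a free value.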
The vector $(b(h), \ldots, b((q-1)h), s(0), s(h), \ldots, s((q-1)h))$ is found by numerical nonlinear constrained optimization so that the confidence interval has minimum coverage probability $1 - \alpha$ and utilizes the uncertain prior information through its desirable expected length properties. This optimization is performed using the `slsqp` function in the `nloptr` package.
A list with the following components.
alpha, rho, natural: the inputs
d: a 'goldilocks' value of $d$ that is not too large and not too small
n.ints: number of equal-length consecutive intervals whose union is $[0, d]$; this is the same as $q$
lambda.star: the computed value of $\lambda^*$
bsvec: the vector $(b(h), \ldots, b((q-1)h), s(0), s(h), \ldots, s((q-1)h))$ that determines the functions $b$ and $s$ that specify the CIUUPI for all possible values of $\sigma$ and observed response vector
comp.time: the computation time in seconds
alpha <- 0.05
rho <- - 1 / sqrt(2)
bs.list <- bs_ciuupi(alpha, rho)
In this example, the dataset described in Table 7.5 of Box et al. (1963) is used. The design matrix $X$ is specified by the command `X <- cbind(rep(1,4), c(-1, 1, -1, 1), c(-1, -1, 1, 1), c(1, -1, -1, 1))`.
A description of the parameter of interest is given in Discussion 5.8, p.3426 of Kabaila and Giri (2009). The parameter of interest is $\theta = a^\top \beta$, where the column vector $a$ is specified by the command `a <- c(0, 2, 0, -2)`.
For this example, we have uncertain prior information that $c^\top \beta = 0$, where the column vector $c$ is specified by the command `c <- c(0, 0, 0, 1)`.
The known correlation $\rho$ between $\hat{\theta}$ and $\hat{\tau}$ is computed using the command `rho <- acX_to_rho(a, c, X)`.
The desired minimum coverage probability of the CIUUPI is $1 - \alpha$, where $\alpha = 0.05$, which is specified by the command `alpha <- 0.05`.
The CIUUPI is determined by $\alpha$ and $\rho$ and is found using the command `bs.list.example <- bs_ciuupi(alpha, rho)`, which takes about 5 minutes to run.
bs.list.example
An object of class `list` of length 8.
Box, G.E.P., Connor, L.R., Cousins, W.R., Davies, O.L., Hinsworth, F.R., Sillitto, G.P. (1963) The Design and Analysis of Industrial Experiments, 2nd edition, reprinted. Oliver and Boyd, London.
Kabaila, P. and Giri, K. (2009) Confidence intervals in regression utilizing prior information. Journal of Statistical Planning and Inference, 139, 3419 - 3429.
Evaluate the functions $b$ and $s$ at `x`
Evaluate the functions $b$ and $s$, as specified by `bsvec`, `alpha`, `d`, `n.ints` and `natural`, at `x`.
bsspline(x, bsvec, alpha, d, n.ints, natural)
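x | A value or vector of values at which the functions $b$ and $s$ are to be evaluated |
bsvec | The $(2q-1)$-vector $(b(h), \ldots, b((q-1)h), s(0), s(h), \ldots, s((q-1)h))$, where $h = d/q$ |
alpha | The desired minimum coverage probability is $1 - \alpha$ |
d | The functions $b$ and $s$ are specified by cubic spline interpolation on the interval $[-d, d]$ |
n.ints | The number of equal-length intervals in $[0, d]$ (this is the same as $q$) |
natural | Equal to 1 (default) if the $b$ and $s$ functions are evaluated by 'natural' cubic spline interpolation, or 0 if they are evaluated by 'clamped' cubic spline interpolation |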
A data frame containing `x` and the corresponding values of the functions $b$ and $s$.
x <- seq(0, 8, by = 1)
alpha <- bs.list.example$alpha
natural <- bs.list.example$natural
d <- bs.list.example$d
n.ints <- bs.list.example$n.ints
bsvec <- bs.list.example$bsvec
bs <- bsspline(x, bsvec, alpha, d, n.ints, natural)
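For a given observed response vector $y$, compute the standard $1 - \alpha$ confidence interval
If $\sigma$ is provided then compute the standard $1 - \alpha$ confidence interval for $\theta$. If $\sigma$ is not provided then, as long as $n - p \ge 30$, replace $\sigma$ by its estimate $\hat{\sigma}$ to compute an approximate $1 - \alpha$ confidence interval for $\theta$.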
ci_standard(a, X, y, alpha, sig = NULL)
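a | The vector used to specify the parameter of interest $\theta = a^\top \beta$ |
X | The known $n \times p$ design matrix, with linearly independent columns |
y | The $n$-vector of observed responses |
alpha | $1 - \alpha$ is the desired coverage probability of the confidence interval |
sig | Standard deviation of the random error. If a value is not specified then, provided that $n - p \ge 30$, `sig` is estimated from the data |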
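Suppose that $y = X \beta + \varepsilon$, where $y$ is a random $n$-vector of responses, $X$ is a known $n \times p$ matrix with linearly independent columns, $\beta$ is an unknown parameter $p$-vector, and $\varepsilon \sim N(0, \sigma^2 I)$, with $\sigma^2$ assumed known. Suppose that the parameter of interest is $\theta = a^\top \beta$. The R function `ci_standard` computes the standard $1 - \alpha$ confidence interval for $\theta$.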
The example below is described in Discussion 5.8 on p.3426 of Kabaila and Giri (2009). This example is obtained by extracting a $2 \times 2$ factorial data set from the factorial data set described in Table 7.5 of Box et al. (1963).
If $\sigma$ is provided then a data frame of the lower and upper endpoints of the standard $1 - \alpha$ confidence interval for $\theta$. If $\sigma$ is not provided then, as long as $n - p \ge 30$, a data frame of the lower and upper endpoints of an approximation to this confidence interval.
Box, G.E.P., Connor, L.R., Cousins, W.R., Davies, O.L., Hinsworth, F.R., Sillitto, G.P. (1963) The Design and Analysis of Industrial Experiments, 2nd edition, reprinted. Oliver and Boyd, London.
Kabaila, P. and Giri, K. (2009) Confidence intervals in regression utilizing prior information. Journal of Statistical Planning and Inference, 139, 3419 - 3429.
y <- c(87.2, 88.4, 86.7, 89.2)
x1 <- c(-1, 1, -1, 1)
x2 <- c(-1, -1, 1, 1)
X <- cbind(rep(1, 4), x1, x2, x1*x2)
a <- c(0, 2, 0, -2)
ci_standard(a, X, y, 0.05, sig = 0.8)
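When the error standard deviation is known, the standard $1 - \alpha$ confidence interval for $\theta = a^\top \beta$ is the textbook interval $\hat{\theta} \pm z_{1-\alpha/2}\, \sigma\, v_\theta^{1/2}$, with $v_\theta = a^\top (X^\top X)^{-1} a$. The base-R sketch below evaluates that expression for the example data; it is meant as a check on the arithmetic and is not the source of `ci_standard` (the column names `lower` and `upper` are an assumption).

```r
# Example data from the ci_standard example
y <- c(87.2, 88.4, 86.7, 89.2)
x1 <- c(-1, 1, -1, 1)
x2 <- c(-1, -1, 1, 1)
X <- cbind(rep(1, 4), x1, x2, x1 * x2)
a <- c(0, 2, 0, -2)
alpha <- 0.05
sig <- 0.8                                   # known error standard deviation

XtX.inv <- solve(crossprod(X))               # (X'X)^{-1}
beta.hat <- XtX.inv %*% crossprod(X, y)      # least squares estimate of beta
theta.hat <- drop(crossprod(a, beta.hat))    # estimate of theta = a'beta
v.theta <- drop(crossprod(a, XtX.inv %*% a)) # Var(theta.hat) / sigma^2

half.width <- qnorm(1 - alpha / 2) * sig * sqrt(v.theta)
ci.sketch <- data.frame(lower = theta.hat - half.width,
                        upper = theta.hat + half.width)
```

Here $\hat{\theta} = 1.2$ and $v_\theta = 2$, so the interval is centred at 1.2 with half-width $z_{0.975} \times 0.8 \times \sqrt{2}$.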
For a given observed response vector $y$, compute the confidence interval that utilizes the uncertain prior information (CIUUPI)
If $\sigma$ is provided then, for given observed response vector $y$, compute the confidence interval, with minimum coverage probability $1 - \alpha$, for the parameter $\theta$ that utilizes the uncertain prior information that the parameter $\tau$ (specified by the vector $c$ and the number $t$) takes the value 0. If $\sigma$ is not provided then, as long as $n - p \ge 30$, replace $\sigma$ by its estimate $\hat{\sigma}$ to compute an approximation to the CIUUPI for $\theta$.
ciuupi_observed_value(a, c, X, alpha, bs.list, t, y, sig = NULL)
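a | The $p$-vector $a$ that specifies the parameter of interest $\theta = a^\top \beta$ |
c | The $p$-vector $c$ that, together with the number $t$, specifies the parameter $\tau = c^\top \beta - t$; the uncertain prior information is that $\tau = 0$ |
X | The known $n \times p$ design matrix, with linearly independent columns |
alpha | $1 - \alpha$ is the desired minimum coverage probability of the confidence interval |
bs.list | A list that includes the following components: natural, d, q and the vector bsvec $(b(h), \ldots, b((q-1)h), s(0), s(h), \ldots, s((q-1)h))$, where $h = d/q$, that specifies the CIUUPI for all possible values of the random error variance and the observed response vector |
t | The number $t$ used to specify the parameter $\tau = c^\top \beta - t$ |
y | The $n$-vector of observed responses |
sig | Standard deviation of the random error. If a value is not specified then, provided that $n - p \ge 30$, `sig` is estimated from the data |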
Suppose that $y = X \beta + \varepsilon$, where $y$ is a random $n$-vector of responses, $X$ is a known $n \times p$ matrix with linearly independent columns, $\beta$ is an unknown parameter $p$-vector and $\varepsilon$ has components that are iid normally distributed with zero mean and known variance. Suppose that $\theta = a^\top \beta$ is the parameter of interest, where $a$ is a specified vector. Let $\tau = c^\top \beta - t$, where $c$ is a specified vector, $t$ is a specified number and $a$ and $c$ are linearly independent vectors. Also suppose that we have uncertain prior information that $\tau = 0$. For given observed response vector `y` and a design matrix `X`, `ciuupi_observed_value` computes the confidence interval, with minimum coverage probability $1 - \alpha$, for $\theta$ that utilizes the uncertain prior information that $\tau = 0$.
The example below is described in Discussion 5.8 on p.3426 of Kabaila and Giri (2009). This example is obtained by extracting a $2 \times 2$ factorial data set from the factorial data set described in Table 7.5 of Box et al. (1963).
If $\sigma$ is provided then a data frame of the lower and upper endpoints of the confidence interval, with minimum coverage probability $1 - \alpha$, for the parameter $\theta$ that utilizes the uncertain prior information that $\tau = 0$. If $\sigma$ is not provided then, as long as $n - p \ge 30$, a data frame of the lower and upper endpoints of an approximation to this confidence interval.
Box, G.E.P., Connor, L.R., Cousins, W.R., Davies, O.L., Hinsworth, F.R., Sillitto, G.P. (1963) The Design and Analysis of Industrial Experiments, 2nd edition, reprinted. Oliver and Boyd, London.
Kabaila, P. and Giri, K. (2009) Confidence intervals in regression utilizing prior information. Journal of Statistical Planning and Inference, 139, 3419 - 3429.
a <- c(0, 2, 0, -2)
c <- c(0, 0, 0, 1)
x1 <- c(-1, 1, -1, 1)
x2 <- c(-1, -1, 1, 1)
X <- cbind(rep(1, 4), x1, x2, x1*x2)
alpha <- 0.05
t <- 0
y <- c(87.2, 88.4, 86.7, 89.2)
sig <- 0.8
ciuupi_observed_value(a, c, X, alpha, bs.list.example, t, y, sig=sig)
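To show how the functions $b$ and $s$ enter the interval, the sketch below assumes the form used in Kabaila and Giri (2009): with $\hat{\gamma} = \hat{\tau} / (\sigma v_\tau^{1/2})$, the interval is $\hat{\theta} - \sigma v_\theta^{1/2}\, b(\hat{\gamma}) \pm \sigma v_\theta^{1/2}\, s(\hat{\gamma})$. That form is an assumption here (it is not spelled out on this help page), the helper `ciuupi_sketch` is hypothetical, and the stand-in functions $b \equiv 0$ and $s \equiv z_{1-\alpha/2}$ are chosen only because they reduce the interval to the standard one.

```r
# Hypothetical helper (not part of ciuupi); b and s are supplied as R functions
ciuupi_sketch <- function(a, cvec, X, b, s, t0, y, sig) {
  XtX.inv <- solve(crossprod(X))
  beta.hat <- XtX.inv %*% crossprod(X, y)
  theta.hat <- drop(crossprod(a, beta.hat))          # estimate of theta
  tau.hat <- drop(crossprod(cvec, beta.hat)) - t0    # estimate of tau
  v.theta <- drop(crossprod(a, XtX.inv %*% a))
  v.tau <- drop(crossprod(cvec, XtX.inv %*% cvec))
  gam.hat <- tau.hat / (sig * sqrt(v.tau))           # standardized tau.hat
  centre <- theta.hat - sig * sqrt(v.theta) * b(gam.hat)
  half <- sig * sqrt(v.theta) * s(gam.hat)
  data.frame(lower = centre - half, upper = centre + half)
}

# Stand-in functions: with b = 0 and s = z the standard interval is recovered
alpha <- 0.05
b0 <- function(x) 0 * x
s0 <- function(x) rep(qnorm(1 - alpha / 2), length(x))

x1 <- c(-1, 1, -1, 1)
x2 <- c(-1, -1, 1, 1)
X <- cbind(rep(1, 4), x1, x2, x1 * x2)
ci.check <- ciuupi_sketch(a = c(0, 2, 0, -2), cvec = c(0, 0, 0, 1), X = X,
                          b = b0, s = s0, t0 = 0,
                          y = c(87.2, 88.4, 86.7, 89.2), sig = 0.8)
```

With functions $b$ and $s$ built from a `bsvec` returned by `bs_ciuupi`, the centre shifts by $\sigma v_\theta^{1/2}\, b(\hat{\gamma})$ and the half-width varies with $\hat{\gamma}$.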
Evaluate the coverage probability of the confidence interval that utilizes uncertain prior information (CIUUPI) at `gam`. The input `bs.list` determines the functions $b$ and $s$ that specify the confidence interval that utilizes the uncertain prior information (CIUUPI), for all possible values of $\sigma$ and observed response vector.
cpciuupi(gam, n.nodes, bs.list)
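gam | A value of $\gamma$, or a vector of $\gamma$ values, at which the coverage probability is evaluated |
n.nodes | The number of nodes for the Gauss Legendre quadrature used for the evaluation of the coverage probability |
bs.list | A list that includes the following components: natural, d, n.ints (the integer $q$) and the vector bsvec $(b(h), \ldots, b((q-1)h), s(0), s(h), \ldots, s((q-1)h))$, where $h = d/q$ |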
The value(s) of the coverage probability of the CIUUPI at `gam`.
gam <- seq(0, 10, by = 0.2)
n.nodes <- 10
cp <- cpciuupi(gam, n.nodes, bs.list.example)
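The argument `n.nodes` controls the number of nodes in a Gauss Legendre quadrature rule. As a generic illustration of that rule (not of the particular integrals evaluated by `cpciuupi`), the sketch below builds the nodes and weights with the Golub-Welsch eigenvalue method in base R. The helper names `gauss_legendre` and `gl_integrate` are hypothetical.

```r
# Nodes and weights of the n-point Gauss Legendre rule on [-1, 1] (n >= 2),
# from the eigen-decomposition of the Jacobi matrix (Golub-Welsch)
gauss_legendre <- function(n) {
  k <- seq_len(n - 1)
  off <- k / sqrt(4 * k^2 - 1)        # off-diagonal entries
  J <- matrix(0, n, n)
  J[cbind(k, k + 1)] <- off
  J[cbind(k + 1, k)] <- off
  e <- eigen(J, symmetric = TRUE)
  list(nodes = e$values, weights = 2 * e$vectors[1, ]^2)
}

# Approximate the integral of f over [lower, upper] using n nodes
gl_integrate <- function(f, lower, upper, n) {
  gl <- gauss_legendre(n)
  half <- (upper - lower) / 2
  mid <- (upper + lower) / 2
  half * sum(gl$weights * f(mid + half * gl$nodes))
}

# Compare 5 and 10 nodes on a smooth integrand with a known answer
exact <- pnorm(1) - pnorm(-1)
err5 <- abs(gl_integrate(dnorm, -1, 1, 5) - exact)
err10 <- abs(gl_integrate(dnorm, -1, 1, 10) - exact)
```

More nodes give a more accurate approximation at a higher cost per evaluation, which is the trade-off behind choosing a larger `n.nodes` when assessing a computed interval than when optimizing it.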
Plot the graph of the odd function $b$ used in the specification of the CIUUPI
The input `bs.list` determines the functions $b$ and $s$ that specify the confidence interval that utilizes the uncertain prior information (CIUUPI), for all possible values of $\sigma$ and observed response vector. The R function `plot_b` plots the graph of the odd function $b$.
plot_b(bs.list)
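bs.list | A list that includes the following components: natural, d, n.ints (the integer $q$) and the vector bsvec $(b(h), \ldots, b((q-1)h), s(0), s(h), \ldots, s((q-1)h))$, where $h = d/q$ |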
A plot of the graph of the odd function $b$ used in the specification of the CIUUPI.
plot_b(bs.list.example)
The input `bs.list` determines the functions $b$ and $s$ that specify the confidence interval that utilizes the uncertain prior information (CIUUPI), for all possible values of $\sigma$ and observed response vector. The coverage probability of the CIUUPI is an even function of the unknown parameter $\gamma = \tau / (\sigma v_\tau^{1/2})$. The R function `plot_cp` plots the graph of the coverage probability of the CIUUPI, as a function of $|\gamma|$.
To provide a stringent assessment of this coverage probability, we use a fine equally-spaced grid `seq(0, (d+4), by = 0.01)` of values of $\gamma$ and Gauss Legendre quadrature using 10 nodes in the relevant integrals. By contrast, for the computation of the CIUUPI, implemented in `bs_ciuupi`, we require that the coverage probability of this confidence interval is greater than or equal to $1 - \alpha$ for the equally-spaced grid `seq(0, (d+2), by = 0.05)` of values of $\gamma$ and we use Gauss Legendre quadrature with 5 nodes in the relevant integrals.
plot_cp(bs.list)
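bs.list | A list that includes the following components: natural, d, n.ints (the integer $q$) and the vector bsvec $(b(h), \ldots, b((q-1)h), s(0), s(h), \ldots, s((q-1)h))$, where $h = d/q$ |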
A plot of the graph of the coverage probability of the CIUUPI as a function of $|\gamma|$, where $\gamma$ denotes the unknown parameter $\tau / (\sigma v_\tau^{1/2})$.
plot_cp(bs.list.example)
Plot the graph of the even function $s$ used in the specification of the CIUUPI
The input `bs.list` determines the functions $b$ and $s$ that specify the confidence interval that utilizes the uncertain prior information (CIUUPI), for all possible values of $\sigma$ and observed response vector. The R function `plot_s` plots the graph of the even function $s$.
plot_s(bs.list)
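bs.list | A list that includes the following components: natural, d, n.ints (the integer $q$) and the vector bsvec $(b(h), \ldots, b((q-1)h), s(0), s(h), \ldots, s((q-1)h))$, where $h = d/q$ |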
A plot of the graph of the even function $s$ used in the specification of the CIUUPI.
plot_s(bs.list.example)
The input `bs.list` determines the functions $b$ and $s$ that specify the confidence interval that utilizes the uncertain prior information (CIUUPI), for all possible values of $\sigma$ and observed response vector. The scaled expected length of the CIUUPI is an even function of the unknown parameter $\gamma = \tau / (\sigma v_\tau^{1/2})$. The R function `plot_squared_sel` plots the graph of the squared scaled expected length (i.e. squared SEL) of the CIUUPI, as a function of $|\gamma|$.
To provide a stringent assessment of this squared SEL, we use a grid `seq(0, (d+4), by = 0.01)` of values of $\gamma$ and Gauss Legendre quadrature with 10 nodes in the relevant integrals. By contrast, for the computation of the CIUUPI, implemented in `bs_ciuupi`, we use Gauss Legendre quadrature with 5 nodes in the relevant integrals.
plot_squared_sel(bs.list)
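bs.list | A list that includes the following components: natural, d, n.ints (the integer $q$) and the vector bsvec $(b(h), \ldots, b((q-1)h), s(0), s(h), \ldots, s((q-1)h))$, where $h = d/q$ |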
A plot of the graph of the squared scaled expected length (i.e. squared SEL) of the CIUUPI as a function of $|\gamma|$, where $\gamma$ denotes the unknown parameter $\tau / (\sigma v_\tau^{1/2})$.
plot_squared_sel(bs.list.example)
Evaluate the scaled expected length of the confidence interval that utilizes uncertain prior information (CIUUPI) at `gam`. This scaled expected length is defined to be the expected length of the CIUUPI divided by the expected length of the standard confidence interval. The input `bs.list` determines the functions $b$ and $s$ that specify the confidence interval that utilizes the uncertain prior information (CIUUPI), for all possible values of $\sigma$ and observed response vector.
selciuupi(gam, n.nodes, bs.list)
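gam | A value of $\gamma$, or a vector of $\gamma$ values, at which the scaled expected length is evaluated |
n.nodes | The number of nodes for the Gauss Legendre quadrature used for the evaluation of the scaled expected length |
bs.list | A list that includes the following components: natural, d, n.ints (the integer $q$) and the vector bsvec $(b(h), \ldots, b((q-1)h), s(0), s(h), \ldots, s((q-1)h))$, where $h = d/q$ |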
The value(s) of the scaled expected length at `gam`.
gam <- seq(0, 10, by = 0.2)
n.nodes <- 10
sel <- selciuupi(gam, n.nodes, bs.list.example)
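The definition of the scaled expected length can be made concrete under one assumption: that the CIUUPI has length $2 \sigma v_\theta^{1/2}\, s(\hat{\gamma})$ with $\hat{\gamma} \sim N(\gamma, 1)$, as in Kabaila and Giri (2009). The standard interval has length $2 \sigma v_\theta^{1/2} z_{1-\alpha/2}$, so the ratio of expected lengths is $E[s(\hat{\gamma})] / z_{1-\alpha/2}$. The sketch below evaluates this with `stats::integrate` rather than Gauss Legendre quadrature; the helper `sel_sketch` and the function `s.demo` are hypothetical and are not output of `bs_ciuupi`.

```r
# Hypothetical helper: SEL(gamma) = E[s(gamma.hat)] / z, gamma.hat ~ N(gamma, 1)
sel_sketch <- function(gam, s, alpha) {
  z <- qnorm(1 - alpha / 2)
  sapply(gam, function(g) {
    # Integrate over gamma +/- 8, where essentially all normal mass lies
    integrate(function(x) s(x) * dnorm(x - g),
              lower = g - 8, upper = g + 8, rel.tol = 1e-8)$value / z
  })
}

alpha <- 0.05
z <- qnorm(1 - alpha / 2)

# Constant s reproduces the standard interval, so the SEL is 1 everywhere
s.const <- function(x) rep(z, length(x))
sel.const <- sel_sketch(c(0, 2, 10), s.const, alpha)

# Made-up even function that dips below z near 0 (illustration only)
s.demo <- function(x) z - 0.3 * exp(-x^2)
sel.demo <- sel_sketch(c(0, 10), s.demo, alpha)
```

A function $s$ that is smaller than $z_{1-\alpha/2}$ near 0 gives a scaled expected length below 1 when $\gamma$ is near 0, which is how the interval can exploit prior information that $\tau = 0$.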