Title: | Analysis of Unreplicated Orthogonal Experiments using All Possible Comparisons |
---|---|
Description: | Analysis of data from unreplicated orthogonal experiments such as 2-level factorial and fractional factorial designs and Plackett-Burman designs using the all possible comparisons (APC) methodology developed by Miller (2005) <doi:10.1198/004017004000000608>. |
Authors: | Arden Miller and Abu Zar Md. Shafiullah. |
Maintainer: | Arden Miller <[email protected]> |
License: | GPL-3 |
Version: | 1.0 |
Built: | 2025-01-30 05:37:14 UTC |
Source: | https://github.com/cran/APCanalysis |
This package provides functions to analyse data from unreplicated orthogonal experiments such as 2-level factorial and fractional factorial designs and Plackett-Burman designs using the all possible comparisons (APC) methodology.
apc() identifies the active effects from an unreplicated orthogonal experiment using a modified version of the all possible comparisons (APC) procedure proposed by Miller (2005). This function has been designed specifically to analyse data from two-level designs including full factorial designs, regular fractional factorial designs and Plackett-Burman designs.
The APC procedure is based on minimizing an AIC-like criterion: APC = log(ResSS) + p where p is a penalty term that increases as the size of the candidate model increases. The APC procedure can be adapted to control either the individual error rate (IER), the experimentwise error rate (EER) or the false discovery rate (FDR). The functions IERpenalties(), EERpenalties() and FDRpenalties() can be used to estimate the penalties used in the APC criterion for each type of error control.
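As a rough illustration of how the three penalty functions relate, the sketch below estimates penalties for the same small experiment under each type of error control and stacks them for comparison. The run size, number of effects and the reduced `reps` value are arbitrary choices made here to keep the run short; the call is guarded so that it only runs when the APCanalysis package is installed.

```r
# Sketch only: compare the three penalty sets for one small design.
# n, k, m and reps are illustrative choices, not recommendations.
if (requireNamespace("APCanalysis", quietly = TRUE)) {
  library(APCanalysis)

  n <- 8      # number of runs
  k <- 5      # number of candidate effects
  m <- 5      # maximum model size
  reps <- 2000  # kept small so the example runs quickly

  penalties <- rbind(
    IER = IERpenalties(n, k, m, ier = 0.05, reps = reps),
    EER = EERpenalties(n, k, m, eer = 0.20, reps = reps),
    FDR = FDRpenalties(n, k, m, fdr = 0.10, reps = reps)
  )
  print(penalties)  # one row per type of error control
}
```

Because the penalties are simulated, the printed values will differ slightly from run to run.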
Miller, A.E. (2005). _The analysis of unreplicated factorial experiments using all possible comparisons_. Technometrics, 47, 51-63.
See also: apc, IERpenalties, EERpenalties, FDRpenalties.
## This example demonstrates the analysis of an artificial data set for a 12-run
## Plackett-Burman design stored in "PB12matrix". The values of "PB12response" were
## generated using the following active effects: B=7, D=5, H=11, I=4 and K=6. The
## remaining columns were all set to be inactive (effects equal 0).
my.apc = apc(PB12response, PB12matrix, maxsize=6, method = 2, level = 0.20, reps = 10000)
summary(my.apc)
plot(my.apc)
apc() applies the all possible comparisons procedure to identify the active effects.
apc(y, x, maxsize, level=0.05, method=1, data=NULL, effnames=NULL, reps=50000, dp=4)
y | Either the response vector or the model formula for the full model. |
x | The model matrix for the full model. Only used when y is a response vector. |
maxsize | The maximum model size. |
level | The level of error control (default is 0.05). |
method | The type of error control: 1 = IER, 2 = EER, 3 = FDR (default is 1). |
data | Optional data frame. |
effnames | Optional vector containing labels for the candidate effects. |
reps | The number of repetitions used by the Monte Carlo simulation algorithm which estimates the set of penalties (default is 50000). |
dp | The number of decimal places returned for estimates of effects (default is 4). |
The APC procedure is based on minimizing an AIC-like criterion: APC = log(ResSS) + p where p is a penalty term that increases as the size of the candidate model increases. The penalties can be selected to control either the individual error rate (IER), the experimentwise error rate (EER) or the false discovery rate (FDR) at a specified level. In addition to the type and level of error control, the penalties also depend on the run size of the experiment, the number of candidate effects and the maximum model size.
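The criterion can be traced by hand on a small orthogonal design. The following base R sketch is not the package's implementation: the design, the response, the use of the mean-corrected total sum of squares and, in particular, the `penalty` values are all assumptions made for illustration (real penalties come from the simulation-based penalty functions). It relies on the fact that, for orthogonal columns, the best model of a given size keeps the effects with the largest sums of squares.

```r
# 8-run full factorial in three two-level factors, coded -1/+1
d <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))
X <- model.matrix(~ A * B * C, data = d)[, -1]  # 7 orthogonal effect columns

set.seed(42)  # arbitrary seed for the illustration
y <- 20 + 6 * d$A + 4 * d$C + rnorm(8)  # made-up response with A and C active

n <- nrow(X)
beta <- drop(crossprod(X, y)) / n   # least-squares coefficients (columns are orthogonal)
ss_effect <- n * beta^2             # sum of squares attributable to each column
total_ss <- sum((y - mean(y))^2)    # mean-corrected total sum of squares

# Best model of size s = the s effects with the largest sums of squares
ord <- order(ss_effect, decreasing = TRUE)
maxsize <- 4
res_ss <- total_ss - c(0, cumsum(ss_effect[ord][1:maxsize]))

# HYPOTHETICAL penalties for model sizes 0..maxsize (increasing with size)
penalty <- c(0, 1.5, 2.8, 4.0, 5.1)

apc_value <- log(res_ss) + penalty
print(data.frame(size = 0:maxsize, ResSS = res_ss, APC = apc_value))

# The selected model is the one that minimizes the criterion
best_size <- which.min(apc_value) - 1
print(names(beta)[ord][seq_len(best_size)])
```

The residual sum of squares can only decrease as effects are added, so the penalty is what stops the criterion from always choosing the largest model.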
A list with the following components:
Results | A data frame that summarizes the results of the APC analysis. The best model of each size is indicated along with its ResSS and value for the APC criterion. |
Penalties | A vector containing the penalties used for the APC procedure. |
level | The level of error control. |
ErrorType | The type of error control used. |
k | The number of candidate effects. |
m | The maximum model size. |
apc | The value of APC for the selected model. |
Ests | A vector containing the estimated effects. |
ActEffs | A vector containing the names of the effects included in the selected model. |
NonActEffs | A vector containing the names of the effects not included in the selected model. |
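The components listed above can be pulled out of the fitted object by name. The sketch below assumes the APCanalysis package is installed and that the returned object can be indexed like an ordinary list; `fit` is a name chosen here, and `reps` is reduced only to shorten the run.

```r
# Sketch only: inspect the documented components of an apc() result.
if (requireNamespace("APCanalysis", quietly = TRUE)) {
  library(APCanalysis)
  data("testdata", package = "APCanalysis")

  fit <- apc(resp ~ x1 * x2 * x3 * x4, maxsize = 6, data = testdata,
             method = 1, level = 0.05, reps = 2000)

  print(fit$ActEffs)     # effects included in the selected model
  print(fit$NonActEffs)  # effects left out of the selected model
  print(fit$Ests)        # estimated effects
  print(fit$Penalties)   # simulated penalties used in the criterion
  print(fit$Results)     # best model of each size with ResSS and APC
}
```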
Penalties are estimated using Monte Carlo simulations and thus the estimates will not be exactly the same each time the function is run. The precision of the estimates can be increased by increasing the number of reps but the function will take longer to run. The amount of time needed to run this programme increases as the values of n, k and m increase. For larger experiments it may be necessary to reduce the number of reps.
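The trade-off between precision and run time is a general property of Monte Carlo estimation. The base R sketch below does not reproduce the package's simulation; it uses a stand-in quantity (a percentile of the largest absolute value among `k` standard normal draws, with `mc_quantile` a helper defined only for this example) to show how the spread of repeated estimates shrinks as the number of repetitions grows.

```r
# Stand-in Monte Carlo estimate: the 95th percentile of max |z| over k normals.
# This is NOT the APC penalty simulation, only an analogous estimation problem.
mc_quantile <- function(reps, k = 7, prob = 0.95) {
  max_abs <- replicate(reps, max(abs(rnorm(k))))
  unname(quantile(max_abs, prob))
}

set.seed(123)  # arbitrary seed for the illustration

# Repeat the whole estimate 20 times at two different numbers of repetitions
small <- replicate(20, mc_quantile(reps = 500))
large <- replicate(20, mc_quantile(reps = 5000))

# Standard deviation across repeated estimates: smaller means more precise
print(c(sd_small = sd(small), sd_large = sd(large)))
```

The standard error of such estimates falls roughly with the square root of the number of repetitions, so a tenfold increase in `reps` buys about a threefold gain in precision.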
See also: IERpenalties, EERpenalties, FDRpenalties.
## This example demonstrates the analysis of an artificial data set for an unreplicated
## factorial design for four two-level factors. The values of "resp" were generated as
## "resp<-round(10+8*x1+5*x3+7*x4+6*x1*x4+rnorm(16), 2)". The data is contained in the
## data frame "testdata". A maximum model size of 6 and an IER of .05 are used.
apc(resp~x1*x2*x3*x4, maxsize=6, data=testdata, method=1, level=.05, reps=9000)
EERpenalties() generates the set of penalties used for the APC criterion so that the experimentwise error rate (EER) is controlled at the specified level. The values of the penalties are estimated using a Monte Carlo simulation procedure and also depend on the run size of the experiment, the number of candidate effects and the maximum model size.
EERpenalties(n, k = n - 1, m = min(n - 2, k), eer = 0.2, reps = 50000, rnd = 3)
n | The number of experimental runs in the study, i.e., the row dimension of the design matrix (for orthogonal 2-level designs n must be a multiple of 4). |
k | The number of candidate effects under study, i.e., the column dimension of the design matrix (1 < k <= n-1). |
m | The maximum size of the candidate models (0 < m <= min(n-2, k)). |
eer | The level (0 < eer < 1) at which the experimentwise error rate will be controlled (default is eer = 0.2). |
reps | The number of repetitions used by the Monte Carlo simulation algorithm which estimates the set of penalties (default is 50000). |
rnd | The number of decimal places returned for the estimated penalties (default is rnd = 3). |
A vector containing the m + 1 penalties for the APC procedure that controls the EER at the specified level.
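The defaults in the usage line mean that only `n` has to be supplied. As a sketch, assuming the APCanalysis package is installed, the call below relies on `k = n - 1` and `m = min(n - 2, k)`, so for 8 runs it should consider 7 candidate effects and models of up to 6 effects; `pen` is a name chosen here and `reps` is reduced to shorten the run.

```r
# Sketch only: EER penalties using the documented defaults for k and m.
if (requireNamespace("APCanalysis", quietly = TRUE)) {
  n <- 8
  pen <- APCanalysis::EERpenalties(n = n, reps = 2000)

  print(pen)

  # The result is documented to hold m + 1 values; with the defaults
  # m = min(n - 2, n - 1) = n - 2, so this is expected to be n - 1.
  print(length(pen))
}
```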
Penalties are estimated using Monte Carlo simulations and thus the estimates will not be exactly the same each time the function is run. The precision of the estimates can be increased by increasing the number of reps but the function will take longer to run. The amount of time needed to run this programme increases as the values of n, k and m increase. For larger experiments it may be necessary to reduce the number of reps.
See also: apc, IERpenalties, FDRpenalties.
## Penalties for an 8-run experiment that has 5 candidate effects are generated.
## The maximum model size is set to 5 and an experimentwise error rate of .25 is used.
EERpenalties(n = 8, k = 5, m = 5, eer = .25, reps = 15000)
FDRpenalties() generates the set of penalties used for the APC criterion so that the false discovery rate (FDR) is controlled at the specified level. The values of the penalties are estimated using a Monte Carlo simulation procedure and also depend on the run size of the experiment, the number of candidate effects and the maximum model size.
FDRpenalties(n, k = n - 1, m = min(n - 2, k), fdr = .1, reps = 50000, rnd = 3)
n | The number of experimental runs in the study, i.e., the row dimension of the design matrix (for orthogonal 2-level designs n must be a multiple of 4). |
k | The number of candidate effects under study, i.e., the column dimension of the design matrix (1 < k <= n-1). |
m | The maximum size of the candidate models (0 < m <= min(n-2, k)). |
fdr | The level (0 < fdr < 1) at which the false discovery rate will be controlled (default is fdr = 0.1). |
reps | The number of repetitions used by the Monte Carlo simulation algorithm which estimates the set of penalties (default is 50000). |
rnd | The number of decimal places returned for the estimated penalties (default is rnd = 3). |
A vector containing the m + 1 penalties for the APC procedure that controls the FDR at the specified level.
Penalties are estimated using Monte Carlo simulations and thus the estimates will not be exactly the same each time the function is run. The precision of the estimates can be increased by increasing the number of reps but the function will take longer to run. The amount of time needed to run this programme increases as the values of n, k and m increase. For larger experiments it may be necessary to reduce the number of reps.
See also: apc, IERpenalties, EERpenalties.
## Penalties for an 8-run experiment that has 5 candidate effects are generated.
## The maximum model size is set to 5 and a false discovery rate of .05 is used.
FDRpenalties(n = 8, k = 5, m = 5, fdr = .05, reps = 12000)
IERpenalties() generates the set of penalties used for the APC criterion so that the individual error rate (IER) is controlled at the specified level. The values of the penalties are estimated using a Monte Carlo simulation procedure and also depend on the run size of the experiment, the number of candidate effects and the maximum model size.
IERpenalties(n, k = n - 1, m = min(n - 2, k), ier = 0.05, reps = 50000, rnd = 3)
n | The number of experimental runs in the study, i.e., the row dimension of the design matrix (for orthogonal 2-level designs n must be a multiple of 4). |
k | The number of candidate effects under study, i.e., the column dimension of the design matrix (1 < k <= n-1). |
m | The maximum size of the candidate models (0 < m <= min(n-2, k)). |
ier | The level (0 < ier < 1) at which the individual error rate will be controlled (default is ier = 0.05). |
reps | The number of repetitions used by the Monte Carlo simulation algorithm which estimates the set of penalties (default is 50000). |
rnd | The number of decimal places returned for the estimated penalties (default is rnd = 3). |
A vector containing the m + 1 penalties for the APC procedure that controls the IER at the specified level.
Penalties are estimated using Monte Carlo simulations and thus the estimates will not be exactly the same each time the function is run. The precision of the estimates can be increased by increasing the number of reps but the function will take longer to run. The amount of time needed to run this programme increases as the values of n, k and m increase. For larger experiments it may be necessary to reduce the number of reps.
See also: apc, EERpenalties, FDRpenalties.
## Penalties for an 8-run experiment that has 5 candidate effects are generated.
## The maximum model size is set to 5 and an individual error rate of .01 is used.
IERpenalties(n = 8, k = 5, m = 5, ier = .01, reps = 15000)
A binary matrix (-1 or +1) for a 12-run Plackett-Burman design.
data("PB12matrix")
PB12matrix has 12 rows and 11 columns labelled A through K.
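To show what orthogonality means for a matrix of this shape, the base R sketch below builds a 12-run Plackett-Burman design from the standard cyclic generator and checks that its columns are balanced and mutually orthogonal. This is an assumption-laden stand-in: the row and column order of the packaged `PB12matrix` may differ from the matrix `X` constructed here.

```r
# Standard generator row for the 12-run Plackett-Burman design
gen <- c(1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1)

# Rows 1..11 are cyclic shifts of the generator; row 12 is all -1
X <- t(sapply(0:10, function(s) gen[((seq_along(gen) - 1 - s) %% 11) + 1]))
X <- rbind(X, rep(-1, 11))
colnames(X) <- LETTERS[1:11]  # columns labelled A through K

print(dim(X))        # 12 runs by 11 columns
print(colSums(X))    # every column has six +1 and six -1 entries

# Orthogonality: X'X equals 12 times the identity matrix
print(all(crossprod(X) == 12 * diag(11)))
```

Orthogonal columns are what allow each of the 11 effects to be estimated independently of the others from only 12 runs.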
data(PB12matrix)
A constructed response vector for a 12-run Plackett-Burman design.
data("PB12response")
PB12response is a vector of length 12.
The values of "PB12response" were generated using the following active effects: B=7, D=5, H=11, I=4 and K=6. The remaining columns were all set to be inactive (effects equal 0).
data(PB12response)
Produces a scatterplot of minimum APC versus model size. This is useful for visualizing the relative values of APC for the best models of each size.
## S3 method for class 'apc'
plot(x, elabs = TRUE, ...)
x | An apc object. |
elabs | If TRUE (the default), use effect labels as plotting characters. |
... | Other arguments. |
Returns no value.
## This example demonstrates the analysis of an artificial data set for an unreplicated
## factorial design for four two-level factors. The values of "resp" were generated as
## "resp<-round(10+8*x1+5*x3+7*x4+6*x1*x4+rnorm(16),2)". The data is contained in the
## data frame "testdata". A maximum model size of 6 and an IER of .05 are used.
my.apc = apc(resp~x1*x2*x3*x4, maxsize=6, data=testdata, method=1, level=.05, reps=9000)
plot(my.apc)
Produces a summary of an apc object.
## S3 method for class 'apc'
summary(object, ...)
object | An apc object. |
... | Other arguments. |
Returns no value.
## This example demonstrates the analysis of an artificial data set for an unreplicated
## factorial design for four two-level factors. The values of "resp" were generated as
## "resp<-round(10+8*x1+5*x3+7*x4+6*x1*x4+rnorm(16), 2)". The data is contained in the
## data frame "testdata". A maximum model size of 6 and an IER of .05 are used.
my.apc = apc(resp~x1*x2*x3*x4, maxsize=6, data=testdata, method=1, level=.05, reps=9000)
summary(my.apc)
A constructed data frame to illustrate the use of the functions in the APCanalysis package.
data("testdata")
The "testdata" data frame has 16 rows and 5 columns:
resp response variable.
x1 binary (-1 or +1) explanatory variable 1.
x2 binary (-1 or +1) explanatory variable 2.
x3 binary (-1 or +1) explanatory variable 3.
x4 binary (-1 or +1) explanatory variable 4.
The values of "resp" were generated as resp = round(10+8*x1+5*x3+7*x4+6*x1*x4+rnorm(16), 2).
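A data set of the same form can be rebuilt in base R from the generating formula above. This is a sketch rather than a reconstruction of the packaged data: the seed and the run order behind `testdata` are not documented, so the seed and the object names `sim` and `full` are assumptions and the simulated values will not match. The last two lines show why an unreplicated design calls for a method such as APC: the full model uses every degree of freedom, leaving nothing to estimate error.

```r
set.seed(2005)  # arbitrary seed; the seed used for the packaged data is unknown

# Full 2^4 factorial in coded units (-1/+1): 16 runs, one per factor combination
sim <- expand.grid(x1 = c(-1, 1), x2 = c(-1, 1), x3 = c(-1, 1), x4 = c(-1, 1))

# Response generated with the formula quoted in the data description
sim$resp <- with(sim, round(10 + 8*x1 + 5*x3 + 7*x4 + 6*x1*x4 + rnorm(16), 2))

# Fit the full model with all main effects and interactions
full <- lm(resp ~ x1 * x2 * x3 * x4, data = sim)

print(length(coef(full)))  # 16 coefficients (intercept plus 15 effects)
print(df.residual(full))   # 0 residual degrees of freedom
```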
data(testdata)