Title: | High Dimensional Bayesian Ridge Regression without MCMC |
---|---|
Description: | Ridge regression provides biased estimators of the regression parameters with lower variance. The HDBRR ("High Dimensional Bayesian Ridge Regression") function fits Bayesian ridge regression without MCMC, using the SVD or QR decomposition for the posterior computation. |
Authors: | Sergio Perez-Elizalde Developer [aut], Blanca Monroy-Castillo Developer [aut, cre], Paulino Perez-Rodriguez User [ctb], Jose Crossa User [ctb] |
Maintainer: | Blanca Monroy-Castillo Developer <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.1.4 |
Built: | 2024-11-15 04:37:09 UTC |
Source: | https://github.com/cran/HDBRR |
Ridge regression provides biased estimators of the regression parameters with lower variance. The HDBRR ("High Dimensional Bayesian Ridge Regression") function fits Bayesian ridge regression without MCMC, using the SVD or QR decomposition for the posterior computation.
HDBRR(y, X, n0 = 5, p0 = 5, s20 = NULL, d20 = NULL, h = 0.5,
      intercept = TRUE, vpapp = TRUE, npts = NULL, c = NULL,
      corpred = NULL, method = c("svd", "qr"), bigmat = TRUE,
      ncores = 2, svdx = NULL)

## S3 method for class 'HDBRR'
summary(object, all.coef = FALSE, crit = log(4), ...)

## S3 method for class 'HDBRR'
plot(x, crit = log(4), var_select = FALSE, post = FALSE, ...)

## S3 method for class 'HDBRR'
predict(object, ...)

## S3 method for class 'summary.HDBRR'
print(x, ...)

## S3 method for class 'HDBRR'
print(x, ...)

## S3 method for class 'HDBRR'
coef(object, all = FALSE, ...)
y |
The data vector (numeric, n). NAs are allowed. |
X |
Design Matrix of dimension |
n0 , p0
|
|
s20 , d20
|
|
h |
(numeric, 0< |
intercept |
Logical value. The default value for intercept is TRUE. |
vpapp |
Logical value. Compute an approximation of the predictive variance. The default value for vpapp is TRUE. |
npts |
Number (integer) of points used to evaluate the density of u in the numerical approach. The default value for npts is NULL. |
c |
Ratio of Gaussian densities (spike/slab) in the prior mixture density of each Beta, used for variable selection. |
corpred |
The method used to compute the correlation. There are two methods, Empirical Bayes ( |
method |
Options for the posterior computation. There are two methods available: |
bigmat |
Use of the bigstatsr package. The default value for bigmat is TRUE. |
ncores |
Number of cores used for computation. The default value for ncores is 2; you can detect your number of cores with |
object |
An HDBRR object, typically generated by a call to |
all.coef |
Logical. Should results be returned for all ridge regression penalty
parameters ( |
crit |
Numerical. The lower bound of the log Bayes factor in favour of including a variable in the model. The default value for crit is log(4). |
... |
Additional arguments to be passed to or from other methods. |
x |
An HDBRR object, typically generated by a call to |
var_select |
Logical. If TRUE, a plot with variable selection is returned. The default value is FALSE. |
post |
Logical. If TRUE, a plot with the marginal posterior of u is returned. The default value is FALSE. |
all |
Logical. All coefficients are returned. If FALSE, then, if |
svdx |
It is possible to supply the SVD. The default value is NULL. |
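As an illustration of choosing a value for ncores, here is a minimal sketch that counts the available cores with the base parallel package. The variable names `n_detected` and `n_cores` are illustrative only, and leaving one core free is a common habit rather than anything HDBRR requires.

```r
# Count the cores visible to R using the base 'parallel' package.
n_detected <- parallel::detectCores()

# detectCores() can return NA on some platforms; fall back to a single core.
if (is.na(n_detected)) n_detected <- 1L

# Leave one core free for the rest of the system, but never go below 1.
n_cores <- max(1L, n_detected - 1L)

# n_cores could then be passed on, e.g. HDBRR(y, X, ncores = n_cores)
n_cores
```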
Ridge regression is a useful tool for dealing with collinearity in the homoscedastic linear regression model, providing biased estimators of the regression parameters with lower variance than the least squares estimators. The model is $y = X\beta + \varepsilon$, where the vector $\varepsilon$ is assumed Normal with mean vector 0 and covariance matrix $\sigma^2 I_n$. For further details see the vignettes in the package.
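To make the role of the SVD concrete, the following base R sketch computes a ridge estimate for a fixed penalty in two ways: by solving the penalized normal equations directly and through the singular value decomposition of X. It is a sketch under stated assumptions, not the package's implementation: the penalty `lambda` is a hypothetical fixed value, whereas HDBRR estimates the amount of shrinkage from the data. With a Normal prior with mean 0 and a common variance on the coefficients, the posterior mean given the variances has this ridge form.

```r
set.seed(1)
n <- 20
p <- 50                               # more predictors than observations
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)
lambda <- 2                           # hypothetical fixed ridge penalty

# Direct ridge solution: (X'X + lambda I)^{-1} X'y, a p x p solve
beta_direct <- drop(solve(crossprod(X) + lambda * diag(p), crossprod(X, y)))

# Same solution through the thin SVD X = U diag(d) V'
s <- svd(X)
shrink <- s$d / (s$d^2 + lambda)      # per-component shrinkage factors
beta_svd <- drop(s$v %*% (shrink * crossprod(s$u, y)))

# Both routes agree up to rounding error
all.equal(beta_direct, beta_svd)
```

The SVD route only needs the min(n, p) singular values and vectors, which is why a decomposition-based approach is attractive when p is much larger than n.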
List containing the following components:
betahat |
Vector (numeric, |
yhat |
Vector (numeric, |
sdyhat |
Vector (numeric, |
sdpred |
Vector (numeric, |
varb |
Vector (numeric, |
sigsqhat |
Value (numeric) of the residual variance estimate. |
sigbsqhat |
Value (numeric) of the Beta's variance estimate. |
u |
Vector (numeric, |
postu |
Vector (numeric, |
uhat |
Value (numeric) of u estimated. |
umode |
Value (numeric) of the posterior mode of u. |
whichNa |
Value (integer) of NAs in the y vector. |
phat |
Vector (numeric, |
delta |
Used in the variable selection. |
edf |
Value (numeric) of the effective degrees of freedom for regression. |
corr |
Vector (numeric, |
svdx |
The svd decomposition. |
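For intuition about the edf component, the effective degrees of freedom of a ridge fit are commonly defined as the trace of the hat matrix, which can be written in terms of the singular values of X. The base R sketch below uses that textbook definition with a hypothetical fixed penalty `lambda`; the exact quantity returned by HDBRR may be computed differently.

```r
set.seed(2)
n <- 15
p <- 40
X <- matrix(rnorm(n * p), n, p)
lambda <- 3                                  # hypothetical fixed penalty

# Effective degrees of freedom from the singular values only
d <- svd(X, nu = 0, nv = 0)$d
edf_svd <- sum(d^2 / (d^2 + lambda))

# Cross-check against the trace of the hat matrix X (X'X + lambda I)^{-1} X'
H <- X %*% solve(crossprod(X) + lambda * diag(p), t(X))
edf_trace <- sum(diag(H))

c(edf_svd = edf_svd, edf_trace = edf_trace)
```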
Sergio Perez-Elizalde, Blanca E. Monroy-Castillo, Paulino Perez-Rodriguez, Jose Crossa.
## Not run:
library(HDBRR)
library(lme4)  # provides lmer() and ranef()

data("phenowheat")  # loads pheno and X
mod <- lmer(pheno$HD ~ pheno$env + (1 | pheno$Line))
y <- unlist(ranef(mod))
n <- length(y)
X <- scale(X, scale = FALSE)  # center the markers without rescaling
fitall <- HDBRR(y, X / sqrt(ncol(X)), intercept = FALSE, corpred = "eb", c = 100)
fitall
summary(fitall, crit = 0)
plot(fitall, crit = 0)
predict(fitall)
## End(Not run)
Compute the SVD or QR decomposition of the matrix X.
matop(y = NULL, X, method = c("svd", "qr"), bigmat = TRUE)
y |
The data vector (numeric, n). NAs are allowed. The default value is NULL; the SVD or QR decomposition can be computed without y. |
X |
Design Matrix of dimension n x p. |
method |
Options for the posterior computation. Two methods, |
bigmat |
Use of the bigstatsr package. The default value for bigmat is TRUE. |
Use the bigstatsr package when p >> n. Auxiliary function used by the HDBRR function.
If the method used is svd, a list containing the following components:
y |
The data vector (numeric, |
X |
Design Matrix of dimension |
D |
A vector containing the singular values of |
L |
A matrix whose columns contain the left singular vectors of |
R |
A matrix whose columns contain the right singular vectors of |
ev |
A vector containing the square of |
Ly |
The cross-product between the matrix |
n |
Number of rows of |
p |
Number of columns of |
If the method used is qr, a list containing the following components:
y |
The data vector (numeric, |
X |
Design Matrix of dimension |
R |
An upper triangular matrix of dimension |
n |
Number of rows of |
p |
Number of columns of |
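The component names above can be related to base R's `svd()` and `qr()`. The sketch below builds two lists with similarly named elements for a small simulated X. It is only an approximation of what the descriptions suggest: the lists `svd_parts` and `qr_parts` are hypothetical, the reading of Ly as the cross-product of the left singular vectors with y is an assumption, and matop itself may rely on bigstatsr rather than base R.

```r
set.seed(3)
n <- 30
p <- 100
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)

# SVD route: X = L diag(D) R'
s <- svd(X)
svd_parts <- list(
  D  = s$d,                 # singular values of X
  L  = s$u,                 # left singular vectors, n x min(n, p)
  R  = s$v,                 # right singular vectors, p x min(n, p)
  ev = s$d^2,               # squared singular values
  Ly = crossprod(s$u, y),   # assumed meaning: t(L) %*% y
  n  = nrow(X),
  p  = ncol(X)
)

# QR route: the R factor is upper triangular (trapezoidal when p > n)
qr_parts <- list(R = qr.R(qr(X)), n = nrow(X), p = ncol(X))

# The SVD pieces reproduce X up to rounding error
recon_err <- max(abs(svd_parts$L %*% diag(svd_parts$D) %*% t(svd_parts$R) - X))
recon_err
```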
Sergio Perez-Elizalde, Blanca E. Monroy-Castillo, Paulino Perez-Rodriguez.
library(HDBRR)

n <- 30
p <- 100
X <- matrix(rnorm(n * (p - 1), 1, 1 / p), nrow = n, ncol = p - 1)
Beta <- sample(1:p, p - 1, replace = FALSE)
Beta <- c(1, Beta)
y <- cbind(rep(1, n), X) %*% Beta + rnorm(n, 0, 1)
matop(y, X, bigmat = TRUE)
The final number of SNPs included in the NCCR linkage map was 7594. The markers were centered and standardized. Phenotypic evaluation of the NCCR population was performed during two growing seasons (2010-2011 and 2011-2012) in locations in the Po Valley representative of the target environments where durum wheat is grown: Cadriano in the 2010-2011 growing season (Cad11) and the 2011-2012 growing season (Cad12); Poggio Renatico in the 2010-2011 growing season (Pr11); and Argelato in the 2011-2012 growing season (Arg12).
International Maize and Wheat Improvement Center (CIMMYT), Mexico.
This dataset contains data from a multiparental durum wheat (Triticum turgidum L. ssp. durum) trial consisting of 334 lines evaluated in a country-year combination. The population is characterized for grain yield (GY), grain volume weight (GVW), 1000-kernel weight (GWT) and heading date (HD) in the four environments. For further details see the vignettes in the package.
data(phenowheat)
pheno
Matrix phenowheat.pheno contains the phenotypic data.
X
The matrix phenowheat.X contains the genotypic data.
It is a matrix (338 x 7594). A balanced, four-way multiparental cross population was developed from four elite durum wheat cultivars (Neodur, Claudio, Colosseo, and Rascon/Tarro) that were chosen as diverse contributors of different alleles of agronomic relevance.
International Maize and Wheat Improvement Center (CIMMYT), Mexico.