Source: COLORADO STATE UNIVERSITY submitted to NRP
IMPLEMENTATION OF CONFIDENCE INTERVAL METHODS FOR MIXED MODELS
Sponsoring Institution
State Agricultural Experiment Station
Project Status
COMPLETE
Funding Source
Reporting Frequency
Annual
Accession No.
0178951
Grant No.
(N/A)
Cumulative Award Amt.
(N/A)
Proposal No.
(N/A)
Multistate No.
(N/A)
Project Start Date
Jul 1, 2001
Project End Date
Jun 30, 2005
Grant Year
(N/A)
Program Code
[(N/A)]- (N/A)
Recipient Organization
COLORADO STATE UNIVERSITY
(N/A)
FORT COLLINS,CO 80523
Performing Department
ADMINISTRATION
Non Technical Summary
Accurate confidence interval methods are either unavailable or not easily computable for some balanced and many unbalanced linear and non-linear mixed models. This project studies existing confidence interval procedures and develops new ones. Computations are made more accessible to applications researchers through the construction of programs and data analysis examples.
Animal Health Component
30%
Research Effort Categories
Basic
40%
Applied
30%
Developmental
30%
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
901 | 7310 | 2090 | 100%
Goals / Objectives
The objective of the project is to make accurate confidence interval procedures for fixed effects parameters and variance components in mixed linear, generalized linear and non-linear models available to agricultural, biological and medical researchers. Many more procedures have been developed than have been adequately studied and evaluated, or implemented in statistical software. Development of additional procedures is needed, particularly for designs that are unbalanced.
Project Methods
Existing procedures will be evaluated and compared in terms of upper and lower confidence interval coverage, as well as expected confidence interval length. When possible, improvements will be made. Procedures will be implemented in terms of macros that run in the SAS statistical programming environment. The macros will be usable by applied researchers with minimal guidance from a statistician. Appropriate confidence interval procedures will also be communicated in the context of data analysis in collaboration with other CSU researchers.
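The evaluation criteria named above (lower and upper coverage, expected interval length) are usually estimated by simulation. The following Python sketch illustrates the idea only; the project itself planned SAS macros, and the known-sigma normal interval, the function names, and all parameter values here are assumptions chosen to keep the example short.

```python
import random
from statistics import NormalDist, fmean


def z_interval(sample, sigma, level=0.95):
    """Known-sigma normal interval for a mean (a deliberately simple stand-in)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half = z * sigma / len(sample) ** 0.5
    centre = fmean(sample)
    return centre - half, centre + half


def evaluate_interval(true_mean, sigma, n, n_rep=4000, seed=1):
    """Estimate lower/upper miss rates, coverage and mean length by simulation."""
    rng = random.Random(seed)
    miss_low = miss_high = 0
    total_length = 0.0
    for _ in range(n_rep):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        lo, hi = z_interval(sample, sigma)
        if true_mean < lo:
            miss_low += 1   # whole interval lies above the true value
        elif true_mean > hi:
            miss_high += 1  # whole interval lies below the true value
        total_length += hi - lo
    return {
        "miss_low": miss_low / n_rep,
        "miss_high": miss_high / n_rep,
        "coverage": 1 - (miss_low + miss_high) / n_rep,
        "mean_length": total_length / n_rep,
    }


# Hypothetical settings: true mean 10, sigma 2, samples of size 8.
result = evaluate_interval(true_mean=10.0, sigma=2.0, n=8)
```

Reporting the two miss rates separately, rather than overall coverage alone, shows whether an interval errs more often on one side, which matters for skewed quantities such as variance components.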

Progress 07/01/01 to 06/30/05

Outputs
Improved small-sample confidence intervals for 50% lethal dose values (LD50) have been developed using saddlepoint approximations. Simulations indicate that this method is a clear improvement over current methods when sample sizes are small, the design is unbalanced, or the true LD50 is near the edge of the data. This new method is also applicable to other dose levels, e.g. LD10 and LD90, where the improvement over current methods is even greater. This work is being prepared for publication. Method R (MR) is a variance component estimation procedure that can be used when data sets are too large to use the preferred maximum likelihood (ML) or restricted maximum likelihood (REML) methods. MR has been used primarily in animal breeding applications in which data sets may have a million or more records. Acceptance of the MR method has been slow, probably because its justification is somewhat ad hoc and its properties are not yet well understood. Two papers on the subject of Method R are being revised for publication. Both papers considered estimation of the intra-class correlation coefficient in the balanced one-way random effects model. The first paper showed that the mean of individual MR sub-sample estimates is asymptotically efficient relative to the ML estimator as the number of sub-samples and the number of classes increase. Although the efficiency of MR in the one-way model does not directly apply to complicated real-world applications, it does lend general support to the use of the method. MR is often compared to REML because the two procedures share a system of estimating equations; however, no clear characterization of MR as an estimation procedure has been offered. The second paper showed that a single sub-sample MR estimate can be characterized as a conditional REML estimate based on the whole-sample and sub-sample group means, given the sub-sample group means. This property does not hold in the unbalanced case.
Simulations of small-sample MR estimates for the one-way balanced random effects model demonstrate that the median of MR estimates is less biased and less variable than the mean of MR estimates and performs nearly as well as REML. These results support the acceptance of MR as a method for estimating variance components. Spatial statistical models have been used to analyze data from a series of wheat field trials, and it has been determined that a substantial reduction in the length of confidence intervals for treatment differences can be obtained by using an anisotropic spatial covariance model rather than the usual randomized complete block analysis. This work has been published. A study of small-sample bootstrap confidence intervals for estimates of bull fertility scores has been completed. Estimates of individual bull fertility scores are obtained by mixing semen and fertilizing eggs in what is essentially a competition experiment. The identity of the sire of each fertilized egg is determined by genetic analysis. This method has the potential to give estimates of bull fertility using far fewer eggs than do traditional methods. This work has been published.
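For readers unfamiliar with the balanced one-way random effects model used in the Method R papers, the intra-class correlation can be illustrated with the classical ANOVA (method-of-moments) estimator. This is a minimal Python sketch of that textbook estimator, not of Method R, ML, or REML; the function name and the toy data are assumptions for illustration.

```python
from statistics import fmean


def anova_icc(groups):
    """ANOVA estimate of the intra-class correlation for balanced groups.

    With a groups of size n, E[MSB] = s2_e + n * s2_a and E[MSW] = s2_e, so
    s2_a / (s2_a + s2_e) is estimated by (MSB - MSW) / (MSB + (n - 1) * MSW).
    The estimate can be negative, and it is undefined (division by zero)
    when every observation is identical.
    """
    a = len(groups)
    n = len(groups[0])
    if a < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need at least 2 balanced groups of size >= 2")
    means = [fmean(g) for g in groups]
    grand = fmean(means)
    msb = n * sum((m - grand) ** 2 for m in means) / (a - 1)
    msw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g) / (a * (n - 1))
    return (msb - msw) / (msb + (n - 1) * msw)


# Toy data: three classes with three records each.
icc = anova_icc([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

For the toy data, MSB is 27 and MSW is 1, so the estimate is 26/29, about 0.897.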

Impacts
The median lethal dose, or LD50, is a commonly used measure of the toxicity of compounds. Currently available methods for computing confidence intervals for the LD50 are unreliable when the sample size is small, the design is unbalanced, or the true LD50 is near the edge of the region over which data were taken. We have developed a new method for LD50 confidence intervals, based on saddlepoint approximations, that is far superior to existing methods. The new method will lead to more reliable assessment of toxicity when toxicity estimates are based on small-sample experimental data. The Method R procedure is used for variance component estimation when data sets are too large to use other methods. Method R allows researchers to use all of the available data, rather than portions or sub-samples. However, acceptance of Method R has been slow because its properties are not well understood. We have characterized Method R's relationship to other methods and shown its efficiency in the balanced one-way model. Although these results do not directly apply to the more complicated models currently in use, they do support the use of MR as a method for estimating variance components.
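As background, a conventional large-sample LD50 interval of the kind this work aims to improve on can be sketched as follows. The Python code assumes a two-parameter logistic dose-response model, logit(p) = b0 + b1 * dose, so that LD50 = -b0 / b1, and uses a delta-method (Wald) interval. It is not the saddlepoint method, which the report does not describe in detail; the function names and the data are hypothetical.

```python
import math
from statistics import NormalDist


def fit_logistic(doses, n_subjects, n_deaths, n_iter=25):
    """Newton-Raphson fit of logit(p) = b0 + b1 * dose; returns (b0, b1, cov)."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = i00 = i01 = i11 = 0.0
        for x, n, y in zip(doses, n_subjects, n_deaths):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = n * p * (1.0 - p)
            g0 += y - n * p          # score for b0
            g1 += x * (y - n * p)    # score for b1
            i00 += w
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        # Inverse of the 2x2 information matrix, used as the covariance estimate.
        cov = [[i11 / det, -i01 / det], [-i01 / det, i00 / det]]
        step0 = cov[0][0] * g0 + cov[0][1] * g1
        step1 = cov[1][0] * g0 + cov[1][1] * g1
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < 1e-12:
            break
    return b0, b1, cov


def ld50_wald_interval(doses, n_subjects, n_deaths, level=0.95):
    """Delta-method interval for LD50 = -b0 / b1 (large-sample approximation)."""
    b0, b1, cov = fit_logistic(doses, n_subjects, n_deaths)
    ld50 = -b0 / b1
    d0, d1 = -1.0 / b1, b0 / b1 ** 2   # gradient of -b0/b1
    var = d0 * d0 * cov[0][0] + 2 * d0 * d1 * cov[0][1] + d1 * d1 * cov[1][1]
    half = NormalDist().inv_cdf(0.5 + level / 2) * math.sqrt(var)
    return ld50, ld50 - half, ld50 + half


# Hypothetical experiment: three (log) dose levels, 10 subjects at each.
ld50, lower, upper = ld50_wald_interval([-1.0, 0.0, 1.0], [10, 10, 10], [2, 5, 8])
```

The interval is symmetric about the estimate by construction, which is one reason such approximations can behave poorly with small samples or when the LD50 lies near the edge of the dose range.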

Publications

  • Butler, J. D., Byrne, P. F., Mohammadi, V., Chapman, P. L., and Haley, S. D. (2005). Agronomic performance of Rht alleles in a spring wheat population across a range of moisture levels. Crop Sci. 45, 739-747.


Progress 01/01/03 to 12/31/03

Outputs
Work has continued in the areas of project emphasis: (1) Small-sample simulation studies of the univariate R-method for estimating breeding values in animal reproduction databases have been completed. The small-sample simulation results are being written up for publication. Theoretical work has characterized the R-method as a form of conditional maximum likelihood estimation. Further theoretical work demonstrates the statistical efficiency of the R-method in the special case of the balanced one-way analysis of variance. Theoretical work on the bivariate R-method has characterized it as a conditional maximum likelihood estimator based on insufficient statistics. Several versions of the bivariate R-method have been studied, and one commonly used version has been determined to be incorrect. The theoretical large-sample work is being written up for publication. (2) Theoretical study of a class of exact methods investigated by Khuri et al. has reached the conclusion that these methods achieve exactness by throwing away information. This work is being written up for publication. This work will discourage the use of inefficient statistical tests in mixed models, which are widely used in many areas of agricultural research. (3) Progress has continued on studies of spatial models for agricultural field trials. A series of wheat trials has been analyzed, and it has been determined that a substantial reduction in the length of confidence intervals for treatment differences can be obtained by using an anisotropic spatial covariance model. This work is being written up for publication. (4) A study of small-sample bootstrap confidence intervals for estimates of bull fertility scores has been completed. Estimates of individual bull fertility scores are obtained by mixing semen and fertilizing eggs in what is essentially a competition experiment. The identity of the sire of each fertilized egg is determined by genetic analysis.
This method has the potential to give estimates of bull fertility using far fewer eggs than do traditional methods. This work has been published. (5) Improved small-sample confidence intervals for 50% lethal dose values (LD50) are being developed using saddlepoint approximations. Simulations are in progress. Preliminary results indicate that the new methods are a clear improvement over current methods.
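The anisotropic spatial covariance model mentioned in item (3) is not specified in the report. One common form, shown here purely as an assumption, is an exponential covariance with geometric anisotropy, in which correlation decays at different rates along and across the rows of a field trial. The function names, plot coordinates, and parameter values in this Python sketch are hypothetical.

```python
import math


def anisotropic_exp_cov(dx, dy, sigma2, range_x, range_y):
    """Exponential covariance with separate range parameters along x and y."""
    scaled_distance = math.hypot(dx / range_x, dy / range_y)
    return sigma2 * math.exp(-scaled_distance)


def plot_covariance_matrix(coords, sigma2, range_x, range_y, nugget=0.0):
    """Covariance matrix of plot errors for plots centred at (x, y) coords."""
    matrix = []
    for i, (xi, yi) in enumerate(coords):
        row = []
        for j, (xj, yj) in enumerate(coords):
            c = anisotropic_exp_cov(xi - xj, yi - yj, sigma2, range_x, range_y)
            # The nugget (independent plot-level noise) is added on the diagonal only.
            row.append(c + nugget if i == j else c)
        matrix.append(row)
    return matrix


# Three plots: a reference plot, its neighbour along x, and its neighbour along y.
coords = [(0, 0), (1, 0), (0, 1)]
cov = plot_covariance_matrix(coords, sigma2=1.0, range_x=4.0, range_y=1.0)
```

With a longer range along x than along y, the neighbour along x is more strongly correlated with the reference plot than the neighbour along y. An isotropic model would force the two covariances to be equal.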

Impacts
One of the important challenges in evaluating the results of any experiment is the assessment of the uncertainty in those results. This project develops statistical methods that aid researchers in attaching the appropriate amount of uncertainty when making estimates based on several types of linear and generalized linear models. Correct assessment of uncertainty will allow decision makers to more effectively use those results.

Publications

  • Flint, A. F., Chapman, P. L., and Seidel, G. A. (2003). Heterospermic insemination of cattle using flow-sorted sperm. J. of Animal Science 81, 1814-1822.


Progress 01/01/02 to 12/31/02

Outputs
Work has continued in the areas of project emphasis: (1) Small-sample simulation studies of the univariate R-method for estimating breeding values in animal reproduction databases have been completed. Theoretical work has characterized the R-method as a form of conditional maximum likelihood estimation. Further theoretical work demonstrates the statistical efficiency of the R-method in the special case of the balanced one-way analysis of variance. Theoretical work on the bivariate R-method has characterized it as a conditional maximum likelihood estimator based on insufficient statistics. Several versions of the bivariate R-method have been studied, and one commonly used version has been determined to be incorrect. This work is being written up for publication. (2) Theoretical study of a class of exact methods investigated by Khuri et al. has reached the conclusion that these methods achieve exactness by throwing away information. This work is being written up for publication. This work will discourage the use of inefficient statistical tests in mixed models, which are widely used in many areas of agricultural research. (3) Progress has continued on studies of spatial models for agricultural field trials. A series of wheat trials has been analyzed, and it has been determined that a substantial reduction in the length of confidence intervals for treatment differences can be obtained by using an anisotropic spatial covariance model. (4) A study of small-sample bootstrap confidence intervals for estimates of bull fertility scores has been completed. Estimates of individual bull fertility scores are obtained by mixing semen and fertilizing eggs in what is essentially a competition experiment. The identity of the sire of each fertilized egg is determined by genetic analysis. This method has the potential to give estimates of bull fertility using far fewer eggs than do traditional methods. This work has been submitted for publication.
(5) Improved small-sample confidence intervals for 50% lethal dose values (LD50) are being developed using saddlepoint approximations.

Impacts
One of the important challenges in evaluating the results of any experiment is the assessment of the uncertainty in those results. This project develops statistical methods that aid researchers in attaching the appropriate amount of uncertainty when making estimates based on statistical models that have several fixed and random components in the same model. Correct assessment of uncertainty will allow decision makers to more effectively use those results.

Publications

  • No publications reported this period


Progress 07/01/01 to 12/31/01

Outputs
Work has continued in the areas of project emphasis: (1) Methods of forming confidence intervals for variance components have been reviewed, and SAS macros are being written and tested for performing these methods. It is our intention to disseminate macros through SAS user groups so that the good methods in existence will be more widely used. (2) Simulation studies of the univariate R-method for estimating breeding values in animal reproduction databases have been performed, together with theoretical work that characterizes the R-method as a form of conditional maximum likelihood estimation. Further theoretical work demonstrates the mathematical efficiency of the R-method in the special case of the balanced one-way analysis of variance. R-method results are being prepared for publication. (3) Theoretical study of a class of exact methods investigated by Khuri et al. has reached the conclusion that these methods essentially achieve exactness by throwing away information. This work is being written up for publication. This work will discourage the use of inefficient statistical tests in mixed models, which are widely used in many areas of agricultural research. (4) Studies of spatial models for agricultural field trials have been initiated. The usual analysis of variance for randomized complete block designs ignores spatial information that is potentially useful in adjusting estimates of treatment differences. Several methods that use fixed and random effects to adjust for spatial variation in the field are being compared and applied to real data.

Impacts
One of the challenges in evaluating the results of any experiment is the assessment of the uncertainty in the results. This project develops statistical methods that aid researchers in attaching the appropriate amount of uncertainty when making estimates based on statistical models that have several fixed and random components in the same model. Correct assessment of uncertainty will allow decision makers to more effectively use those results.

Publications

  • No publications reported this period


Progress 01/01/00 to 12/31/00

Outputs
Work has continued in the four areas of project emphasis: (1) Methods of forming confidence intervals for variance components have been reviewed, and SAS macros are being written and tested for performing these methods. It is our intention to disseminate macros through SAS user groups so that the good methods in existence will be more widely used. (2) Simulation studies of the univariate R-method for estimating breeding values in animal reproduction databases have been performed, together with theoretical work that characterizes the R-method as a form of conditional maximum likelihood estimation. Work on the bivariate R-method has been completed and is being prepared for publication. (3) Theoretical study of a class of exact methods investigated by Khuri et al. has reached the conclusion that these methods essentially achieve exactness by throwing away information. This work is being written up for publication. This work will discourage the use of inefficient statistical tests in mixed models, which are widely used in many areas of agricultural research. (4) Simulation studies are being used to assess the accuracy of a bootstrap confidence interval for the proportion of individuals in a population for which a drug is bioequivalent. Results indicate that the parametric bootstrap fails for this purpose. Results from this work were presented at the Kansas State University - Applied Statistics in Agriculture Conference. That work is now being prepared for submission to journals.
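The procedure evaluated in item (4) can be illustrated with a simplified parametric bootstrap percentile interval. The Python sketch below is a stand-in under stated assumptions, not the project's mixed-model procedure: each individual contributes one test-minus-reference difference, the differences are treated as normal, and an individual counts as bioequivalent when the difference is smaller than a hypothetical limit `delta` in absolute value. The report found that the parametric bootstrap fails for this purpose, so the code shows what is being evaluated rather than a recommended method.

```python
import random
from statistics import NormalDist, fmean, stdev


def proportion_within(sample, delta):
    """Plug-in estimate of P(|D| < delta) under a fitted normal model."""
    fitted = NormalDist(fmean(sample), stdev(sample))
    return fitted.cdf(delta) - fitted.cdf(-delta)


def parametric_bootstrap_interval(sample, delta, level=0.95, n_boot=2000, seed=1):
    """Percentile interval from samples drawn out of the fitted normal model."""
    rng = random.Random(seed)
    mu, sigma = fmean(sample), stdev(sample)
    n = len(sample)
    estimates = sorted(
        proportion_within([rng.gauss(mu, sigma) for _ in range(n)], delta)
        for _ in range(n_boot)
    )
    alpha = 1 - level
    lo_idx = max(0, round(alpha / 2 * n_boot))
    hi_idx = min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)
    return estimates[lo_idx], estimates[hi_idx]


# Hypothetical individual differences (test minus reference) and limit.
differences = [0.12, -0.30, 0.45, 0.05, -0.10, 0.22, -0.25, 0.31, 0.02, -0.18]
estimate = proportion_within(differences, delta=0.5)
lower, upper = parametric_bootstrap_interval(differences, delta=0.5)
```

Because the estimated proportion is bounded by 1 and its sampling distribution is skewed when the true proportion is high, percentile intervals of this kind can miss their nominal coverage in small samples, which is the type of behaviour a coverage simulation is designed to detect.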

Impacts
Variance components are estimated in order to assess the relative contributions of various inputs to a resulting output. For example, a study may assess the relative contributions of genetics and environment to the rate of growth of beef cattle. This project will help researchers assess the accuracy of their variance component estimates by helping them compute more accurate confidence intervals.

Publications

  • No publications reported this period


Progress 01/01/99 to 12/31/99

Outputs
The project has progressed in the four previously described areas: (1) Methods of forming confidence intervals for variance components have been reviewed, and SAS macros are being written and tested for performing these methods. It is our intention to disseminate macros through SAS user groups so that the good methods in existence will be more widely used. (2) Simulation studies of the univariate R-method for estimating breeding values in animal reproduction databases have been performed, together with theoretical work that characterizes the R-method as a form of conditional maximum likelihood estimation. Theoretical work on the multivariate version of the R-method is now underway. (3) Theoretical study of a class of exact methods investigated by Khuri et al. has reached the conclusion that these methods essentially achieve exactness by throwing away information. Simulation studies that demonstrate the superiority of approximate methods are in progress. This work will discourage the use of such inefficient statistical tests in mixed models, which are widely used in many areas of agricultural research. (4) Simulation studies are being used to assess the accuracy of a bootstrap confidence interval for the proportion of individuals in a population for which a drug is bioequivalent. Results indicate that the parametric bootstrap fails for this purpose. Adjustments to the parametric bootstrap are now being developed so that the power of parametric modeling in small-sample situations can be used. These results will contribute to the ability to evaluate bioequivalence of drugs with smaller numbers of test patients. Results from efforts (2) and (3) have been presented in papers at the Kansas State University - Applied Statistics in Agriculture Conference. That work is now being prepared for submission to journals. Results from effort (4) will be presented this year at the same conference, and submitted later this year.

Impacts
Mixed models are widely applied in all areas of agricultural, biological and medical research. This work will increase the usefulness of such models by developing, testing and disseminating improved estimation, testing and confidence interval procedures.

Publications

  • No publications reported this period


Progress 01/01/98 to 12/31/98

Outputs
Work has progressed with emphasis on activity in four areas: (1) review of available methods for forming confidence intervals for variance components, ratios of variance components and contrasts of fixed effects in mixed models, (2) simulation study of the R-method for estimation of ratios of variance components, (3) study of a class of exact methods investigated by Khuri et al., and (4) simulation analysis of a bootstrap method for assessing individual bioequivalence of two drugs (this is a function of variance components and a contrast in a mixed model). In the first area, it has been determined that many methods are available that are superior to the methods now available in statistical program packages. SAS macros for computation of some of these methods for some balanced models have been developed. In the second area, simulation studies comparing the accuracy of the R-method estimates to the accuracy of estimates based on maximum likelihood and restricted maximum likelihood are ongoing. Comparisons of confidence intervals based on these methods are planned. In the third area, study is focused on two topics: the effects of information lost by making the procedure exact, and the effects of subjectivity in selection of the test statistic. In the fourth area, simulation studies are being used to assess the accuracy of a bootstrap confidence interval method for the proportion of individuals in a population for which a drug is bioequivalent.

Impacts
(N/A)

Publications

  • No publications reported this period