 Why the Rousseeuw Yohai Paradigm is One of the Largest
and Longest Running Scientific Hoaxes in History
hoax.pdf

 Talk on prediction regions and intervals
predtalk.pdf

 The following preprints have been submitted.

 The following preprint greatly increases the scope of data splitting for regression, and finds the large sample theory for OPLS.

 Olive, D.J., and Zhang, L. (2023), One Component Partial Least Squares, High Dimensional Regression, Data Splitting, and the Multitude of Models
opls.pdf

 The following preprint gives the large sample theory for some ARMA model selection estimators. The preprint also shows how to use bootstrap confidence regions for hypothesis testing.

 Haile and Olive (2023a), Bootstrapping ARMA Time Series Models after Model Selection
tsboot.pdf

 Jin and Olive (2023), Large Sample Theory for Some Ridge-Type Regression Estimators
ridgetype.pdf

 The following preprint gives a data splitting prediction region
and shows how to predict the random walk.

 Haile, Zhang, and Olive (2023), Prediction Intervals and Regions for Random Walks and Renewal Processes
rwalkpi.pdf


 THE FOLLOWING PREPRINTS HAVE NOT YET BEEN SUBMITTED OR NEED TO BE RESUBMITTED.

 Haile, Welagedara, and Olive (2023), ARIMA Model Selection and Prediction Intervals
tspi.pdf

 The following preprint may be ready for submission by August 2024.

 Olive (2023g), High Dimensional Binary Regression and Classification
hdbreg.pdf

 The following preprint may be ready for submission by August 2024.

 Olive (2023h), High Dimensional Multiple Linear Regression with Heterogeneity
hdwls.pdf

 The following preprint may be ready for submission by August 2024.

 Welagedara and Olive (2023b), Visualizing Some Bootstrap Confidence
Regions
visconfreg.pdf

 The following preprint may be ready for submission by August 2024.

 Olive (2023i), Some Simple High Dimensional One Sample Tests
hd1samp.pdf

 The following preprint had too many ideas to be published in a major journal but part of it resulted in the paper Olive (2018) below.

 Highest Density Region Prediction
hdrpred.pdf

 This preprint shows how to visualize several important survival regression
models in the background of the data.
 Plots for Survival Regression
sreg.pdf

 1D Regression
onedreg.pdf
 Graphical Aids for Regression.
gaid.pdf
 A Simple Plot for Model Assessment
simp.pdf

 THE FOLLOWING PREPRINT HAS BEEN INCORPORATED IN
 Olive, D.J. (2017), Linear Regression and
 Olive, D.J. (2017), Robust Multivariate Analysis, two Springer texts,
 and in the Olive (2018) paper Applications of Hyperellipsoidal Prediction Regions (below).

 This preprint shows the equivalence between a prediction region and a confidence region that can easily be bootstrapped. This method can be used for hypothesis testing, for robust statistics, and after variable selection. See the Pelawa Watagoda and Olive (2021ab) published papers below and the Rathnayake and Olive (2020) preprint for better theory.
 Bootstrapping Hypothesis Tests and Confidence Regions
vselboot.pdf

 THE FOLLOWING 4 PREPRINTS WERE INCORPORATED IN
 Olive, D.J. (2017), Robust Multivariate Analysis, Springer, NY.

 This paper gives the first easily computed estimators of multivariate
location and dispersion that have been shown to be sqrt(n) consistent
and highly outlier resistant.
 Olive, D.J., and Hawkins, D.M. (2010), Robust Multivariate Location and Dispersion
rmld.pdf

 This preprint shows how to improve low breakdown consistent regression estimators
and outlier resistant estimators that do not have theory. The resulting estimator
is the first easily computed regression estimator that has been shown
to be sqrt(n) consistent and high breakdown. The response plot
is very useful for detecting outliers.
 Olive, D.J., and Hawkins, D.M. (2011), Practical High Breakdown Regression
hbreg.pdf

 Olive, D.J. (2013), Robust Multivariate Linear Regression
robmreg.pdf

 Olive, D.J. (2014), Robust Principal Component Analysis
rpca.pdf

 THE FOLLOWING TWO PREPRINTS HAVE BEEN CITED BY OTHER
 AUTHORS, BUT WERE REVISED AND PUBLISHED.

 Chang, J., and Olive, D.J. (2007), Resistant Dimension Reduction
resdr.pdf
 was revised and published as Chang and Olive (2010).
 Applications of a Robust Dispersion Estimator
rcovm.pdf
 was revised and published as Zhang, Olive, and Ye (2012).

 THE FOLLOWING SIX PREPRINTS HAVE BEEN CITED BY OTHER AUTHORS.

 This paper shows that the bootstrap is not first order accurate
unless the number of bootstrap samples B is proportional to the sample size n.
For second order accuracy, B needs to be proportional to n^2. This was published
in Olive, D.J. (2014), Statistical Theory and Inference, Springer, New York, NY, ch. 9.
 Olive, D.J. (2011), The Number of Samples for Resampling Algorithms
resamp.pdf
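
 A minimal Python sketch of tying the number of bootstrap samples B to the sample size n, as the description above suggests. The proportionality constant c, the floor b_min, and the function name are illustrative assumptions, not values or code from the preprint.

```python
import math
import random
import statistics

def bootstrap_means(data, c=1.0, b_min=50, seed=0):
    """Bootstrap the sample mean using B = max(b_min, ceil(c * n)) resamples.

    c and b_min are hypothetical tuning choices made for this sketch.
    """
    rng = random.Random(seed)
    n = len(data)
    B = max(b_min, math.ceil(c * n))  # B grows proportionally to n
    means = []
    for _ in range(B):
        resample = rng.choices(data, k=n)  # sample n cases with replacement
        means.append(statistics.fmean(resample))
    return means

if __name__ == "__main__":
    data = [float(i) for i in range(200)]
    print(len(bootstrap_means(data)))  # number of bootstrap replicates used
```

 For second order accuracy the same sketch would use B proportional to n^2, for example by replacing c * n with c * n * n.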

 This preprint provides some of the most important theory in the field of robust statistics.
The paper shows that a simple modification to the most used but inconsistent algorithms
for robust statistics results in easily computed sqrt(n) consistent highly outlier resistant estimators.
It was converted to the Robust Multivariate Location and Dispersion and Practical High Breakdown Regression preprints
above. The material is in Olive, D.J. (2017), Robust Multivariate Analysis, Springer, NY.

 Olive, D.J., and Hawkins, D.M. (2008), High Breakdown Multivariate Estimators
hbrs.pdf

 The material in the following preprint is in
Olive, D.J. (2017), Robust Multivariate Analysis, Springer, NY.
 Olive, D.J., and Hawkins, D.M. (2007), Robustifying Robust Estimators, preprint
available from
ppconc.pdf
 For location scale families, estimators based on the median and mad have optimal robustness
properties. Use He's cross checking technique to make an asymptotically efficient estimator.
 Olive, D.J. (2006), Robust Estimators for Transformed Location-Scale Families.
robloc.pdf
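
 For readers unfamiliar with the two statistics named above, here is a small Python sketch of the sample median and the median absolute deviation (mad). It shows only the raw estimators; the cross checking step from the preprint is not implemented, and the function name is a hypothetical choice.

```python
import statistics

def median_and_mad(data):
    """Return (median, MAD), where MAD = median(|x_i - median(x)|)."""
    med = statistics.median(data)
    mad = statistics.median(abs(x - med) for x in data)
    return med, mad

if __name__ == "__main__":
    sample = [2.0, 3.0, 4.0, 5.0, 1000.0]  # one gross outlier
    print(median_and_mad(sample))  # the outlier barely moves either statistic
```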
 The material in the following preprint is in
Olive, D.J. (2017), Robust Multivariate Analysis, Springer, NY.
 Olive, D.J. (2005), A Simple Confidence Interval for the Median, preprint
available from
ppmedci.pdf
 The June 2008 ROBUST STATISTICS NOTES are below.
PLEASE CITE THIS WORK IF YOU USE IT. Much of this work is in
 Olive, D.J. (2017), Robust Multivariate Analysis, Springer, NY.
 Olive, D.J. (2008), Applied Robust Statistics,
preprint available from (http://parker.ad.siu.edu/Olive/run.pdf).
robnotes.pdf

 Web page with data sets and programs to go with the course notes.
robust.html

 MORE ONLINE COURSE NOTES.

 The next preprint simplifies large sample theory for elastic net, ridge regression and lasso. A new variable selection estimator with simple theory that is easy to bootstrap is given. Theory for 3 bootstrap confidence regions is given, and the coverage should be near the nominal for the new estimator. Need to update ch. 4 variable selection material as in Rathnayake and Olive (2021).
 Jan. 2023: Webpage for second draft of Olive, D.J. (2023), Prediction and Statistical Learning.
 http://parker.ad.siu.edu/Olive/slearnbk.htm

 Need to incorporate Rathnayake and Olive (2021) variable selection material into Chapter 10.
 Jan. 2023: Webpage for first draft of Math 584 notes Olive, D.J. (2023), Theory for Linear Models.
 http://parker.ad.siu.edu/Olive/linmodbk.htm

 Need to update the variable selection material as in Rathnayake and Olive (2021).
 Jan. 2023: Webpage for first draft of Math 473 notes Olive, D.J. (2023), Survival Analysis. http://parker.ad.siu.edu/Olive/survbk.htm

 Jan. 2023: Webpage for first draft of Olive, D.J. (2023), Large Sample Theory.
http://parker.ad.siu.edu/Olive/lsampbk.htm

 Need to update ch. 10 variable selection material as in Rathnayake and Olive (2021).
 Jan. 2022: Webpage for first draft of Olive, D.J. (2023), Robust Statistics.
http://parker.ad.siu.edu/Olive/robbook.htm

 WEBPAGES AND COURSE NOTES TO GO WITH THREE PUBLISHED BOOKS

 TWO COMPETITORS FOR Casella and Berger (2002), Statistical Inference:
 Olive, D.J. (2008), A Course in Statistical Theory,
preprint available from (http://parker.ad.siu.edu/Olive/infer.htm).
infer.htm
 Olive, D.J. (2014), Statistical Theory and Inference, Springer, New York, NY.
 The Springer eBook is available on SpringerLink, Springer's online platform.
http://dx.doi.org/10.1007/9783319049724

 TWO COMPETITORS FOR Kutner, Nachtsheim, Neter, and Li (2005), Applied Linear Statistical Models:
 Olive, D.J. (2010), Multiple Linear and 1D Regression Models,
preprint available from (http://parker.ad.siu.edu/Olive/regbk.htm).
regbk.htm
 Olive, D.J. (2017a), Linear Regression, Springer, New York, NY.
 The Springer eBook is available on SpringerLink, Springer's online platform.
http://dx.doi.org/10.1007/9783319552521

 A COMPETITOR FOR Johnson and Wichern (2007), Applied Multivariate Statistical Analysis:
 Olive, D.J. (2017b), Robust Multivariate Analysis, Springer, New York, NY.
 The Springer eBook is available on SpringerLink, Springer's online platform.
https://link.springer.com/book/10.1007%2F9783319682532

 Jan. 2013 1st draft of Robust Multivariate Analysis:
 http://parker.ad.siu.edu/Olive/multbk.htm


 Here are some rejected Letters to the Editor. Errata should have been published.

 This slightly revised letter was sent to the Journal of Computational and Graphical Statistics about the latest FakeMCD estimator of Hubert, Rousseeuw and Verdonk (2012). It pointed out that DetMCD is not the MCD estimator, that DetMCD has no theory, and that it will be a massive undertaking to modify the theory for concentration estimators in Olive and Hawkins (2010) to show whether DetMCD has any good properties.
 Fake MCD
fakemcd.pdf

 This letter was sent to the Annals of Statistics regarding the
Bali, Boente, Tyler and Wang (2011) bait and switch paper.
 Fake Projection Estimator
fakeproj.pdf

 This Letter was sent to The Annals of Statistics regarding the
Salibian-Barrera and Yohai (2008) bait and switch paper. After a rejection,
it was revised and sent to the American Statistician as a paper, but rejected.
 The Breakdown of Breakdown
bdbd.pdf

 THE NEXT 11 DOCUMENTS MAY BE OF MILD INTEREST, BUT WILL PROBABLY NEVER BE PUBLISHED.

 The following 5 preprints have been incorporated into the published paper
Olive (2013) "Plots for Generalized Additive Models."
 Response Transformations for Models with Additive Errors
rtrans.pdf
 Response Plots and Related Plots for Regression
rplot.pdf
 Response Plots for Linear Models
lm.pdf
 Response Plots for Experimental Design
rploted.pdf
 Plots for Binomial and Poisson Regression
gfit.pdf

 Comments on Breakdown
bkdn.pdf
 Abuhassan, H. and Olive, D.J. (2008), Inference for the Pareto, Half Normal and
Related Distributions.
std.pdf

 (long version of) Robustifying Robust Estimators
lconc.pdf
 Prediction intervals in the presence of outliers
pi.pdf
 This 1996 result grew into a 2002 JASA discussion paper.
dense.pdf
 This 1997 result on partitioning may be of mild interest.
part.pdf

 THIS IS MY PhD DISSERTATION: Olive, D.J. (1998), Applied Robust Statistics,
Ph.D. Thesis, University of Minnesota. It shows my 1998 ideas on Robust Statistics.
The figures are missing and the page numbers differ from the original dissertation.
arsdiss.pdf

 THE FOLLOWING ARE PREPRINTS OF PUBLISHED OR ACCEPTED PAPERS.


 This paper shows how to get better cutoffs for many common tests,
and gives a new weighted least squares method.

 Rajapaksha, K.W.G.D.H. and Olive, D.J. (2022), Wald Type Tests with the Wrong Dispersion Matrix, Communications in Statistics: Theory and Methods, to appear.
waldtype.pdf

 This paper gives the large sample theory for many variable selection estimators for several important regression models. A new estimator that does not have selection bias is given. The preprint also shows how to use bootstrap confidence regions for hypothesis testing for the usual and new variable selection estimators.

 Rathnayake, R.C. and Olive, D.J. (2023), Bootstrapping Some GLM and Survival Regression Variable Selection Estimators, Communications in Statistics: Theory and Methods, 52, 2625-2645.
bootglm.pdf

R code:
Rcodebootglm.pdf


 This paper shows how to get prediction intervals for a large class of parametric regression models such as GLMs, GAMs, and survival regression models. The PIs can work after variable selection and if the number of predictors is larger than the sample size.

 Olive, D.J., Rathnayake, R.C., and Haile, M.G. (2022), Prediction Intervals for GLMs, GAMs, and Some Survival Regression Models, Communications in Statistics: Theory and Methods, 51, 8012-8026.
pigam.pdf
R code:
Rcodepigam.pdf

 This paper gives prediction intervals that can be useful when
the sample size is less than the number of variables. These prediction intervals are useful for comparing shrinkage estimators like forward selection and lasso. Large sample theory for lasso, the elastic net, and ridge regression is simplified. New large sample theory for many OLS variable selection estimators is given. The theory shows that lasso variable selection is sqrt(n) consistent when lasso is consistent.

 Pelawa Watagoda, L.C.R. and Olive, D.J. (2021b), Comparing Six Shrinkage Estimators With Large Sample Theory and Asymptotically Optimal Prediction Intervals, Statistical Papers, 62, 2407-2431.
picomp.pdf

 This paper gives theory for three useful bootstrap confidence regions. We use betahatImin0 to denote the variable selection estimator, but we are using the usual estimator betahatVS and a new estimator betahatMIX, and the paper would be clearer if we did not use context to decide which estimator betahatImin0 is. The large sample theory for betahatMIX is derived, and is only asymptotically equivalent to that of betahatVS under strong regularity conditions. See the above paper and Rathnayake and Olive (2020).

 Pelawa Watagoda, L.C.R. and Olive, D.J. (2021a), Bootstrapping Multiple Linear Regression After Variable Selection, Statistical Papers, 62, 681-700.
piboottest.pdf

 This paper shows how to bootstrap analogs of the one way MANOVA model where we
do not assume equal covariance matrices.

 Rupasinghe Arachchige Don, H.S., and Olive, D.J. (2019), Bootstrapping Analogs of
the One Way MANOVA Test, Communications in Statistics: Theory and Methods, 48, 5546-5558.
manova.pdf

 This paper shows that applying the Olive (2013b) nonparametric prediction region to
a bootstrap sample can result in a confidence region, and applying the prediction
region to Yhat_f + e_i, where the e_i are residual vectors, results in a nonparametric
prediction region for a future response vector Y_f for multivariate regression.

 Olive, D.J. (2018), Applications of Hyperellipsoidal Prediction Regions,
Statistical Papers, 59, 913-931.
hpred.pdf

 Olive, D.J., Pelawa Watagoda, L.C.R., and Rupasinghe Arachchige Don, H.S. (2015),
Visualizing and Testing the Multivariate Linear Regression Model,
International Journal of Statistics and Probability, 4, 126-137.
vtmreg.pdf

 This paper gives response plots, plots for response transformations
and plots for detecting overdispersion for GAMs and GLMs.
 Olive, D.J. (2013a), Plots for Generalized Additive Models, Communications in
Statistics: Theory and Methods, 42, 2610-2628.
gam.pdf
R/Splus code:
gamcode.txt

 Olive, D.J. (2013b), Asymptotically Optimal Regression Prediction Intervals and Prediction Regions
for Multivariate Data, International Journal of Statistics and Probability, 2, 90-100.
apred.pdf

 This paper describes the sqrt(n) consistent highly outlier resistant
FCH, RFCH and RMVN estimators and gives an application for canonical correlation analysis.
 Zhang, J., Olive, D.J., and Ye, P. (2012), Robust Covariance Matrix Estimation with
Canonical Correlation Analysis, International Journal of Statistics and Probability, 1, 119-136.
rcca.pdf

 This paper shows that OLS partial F tests, originally meant for multiple linear
regression, are useful for exploratory purposes for a much larger class of
models, including generalized linear models and single index models.
 Chang, J. and Olive, D.J. (2010), OLS for 1D Regression Models, Communications in
Statistics: Theory and Methods, 39, 1869-1882.
sindx.pdf

 Olive, D.J. and Hawkins, D.M. (2007), Behavior of Elemental Sets in Regression,
Statistics and Probability Letters, 77, 621-624.
elem.pdf

 This paper shows how to construct asymptotically optimal prediction intervals
for regression models of the form Y = m(x) + e. The errors need to be iid unimodal
and emphasis is on linear regression.
 Olive, D.J. (2007), Prediction Intervals for Regression Models, Computational Statistics
and Data Analysis, 51, 3115-3122.
spi.pdf
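
 A simplified Python sketch of a residual-based prediction interval for a model of the form Y = m(x) + e. It shifts the fitted value by the shortest interval that covers about 100(1 - delta)% of the residuals. This is an illustration under stated assumptions only: it omits any finite sample correction factors, the function names are hypothetical, and it should not be read as the paper's exact method.

```python
import math

def shorth_interval(residuals, delta=0.05):
    """Shortest interval containing ceil(n * (1 - delta)) of the residuals."""
    r = sorted(residuals)
    n = len(r)
    c = min(n, math.ceil(n * (1 - delta)))  # number of residuals to cover
    # Slide a window of c order statistics and keep the narrowest one.
    best = min(range(n - c + 1), key=lambda i: r[i + c - 1] - r[i])
    return r[best], r[best + c - 1]

def prediction_interval(y_hat, residuals, delta=0.05):
    """Shift the shorth interval of the residuals by the fitted value y_hat."""
    lo, hi = shorth_interval(residuals, delta)
    return y_hat + lo, y_hat + hi

if __name__ == "__main__":
    res = [-3.0, 0.0, 1.0, 10.0]
    print(prediction_interval(5.0, res, delta=0.5))
```

 Using the shortest covering interval rather than symmetric quantiles is what lets the interval adapt to skewed unimodal error distributions.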

 This paper shows that the variable selection software originally meant
for multiple linear regression gives useful results for a much larger class of
models, including generalized linear models and single index models, if the Mallows' Cp criterion is used.
For models I with k predictors, the screen Cp(I) < 2k is much
more effective than the screen Cp(I) < k. Use response plots to show that the
final submodel is similar to the original full model.
 Olive, D.J. and Hawkins, D.M. (2005), Variable Selection for 1D Regression Models,
Technometrics, 47, 43-50.
varsel.pdf
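
 The two screens mentioned above can be compared with a short Python sketch. It uses the standard definition Cp(I) = SSE(I)/MSE(full) + 2k - n, where k counts the terms in submodel I and MSE(full) = SSE(full)/(n - p) for a full model with p terms. The function names and the example SSE values are hypothetical, and the SSE values are assumed to have been computed elsewhere by OLS.

```python
def mallows_cp(sse_sub, k, sse_full, p, n):
    """Mallows' Cp for a submodel with k terms, full model with p terms."""
    mse_full = sse_full / (n - p)
    return sse_sub / mse_full + 2 * k - n

def screen_submodels(submodels, sse_full, p, n):
    """Return the names of submodels I that pass the screen Cp(I) < 2k.

    submodels maps a name to a (sse_sub, k) pair.
    """
    keep = []
    for name, (sse_sub, k) in submodels.items():
        if mallows_cp(sse_sub, k, sse_full, p, n) < 2 * k:
            keep.append(name)
    return keep

if __name__ == "__main__":
    # Made-up SSE values purely for illustration.
    candidates = {"x1": (18.0, 2), "x1,x2": (40.0, 3), "full": (16.0, 4)}
    print(screen_submodels(candidates, sse_full=16.0, p=4, n=20))
```

 In this made-up example the submodel "x1" has Cp = 2 with k = 2, so it passes Cp(I) < 2k but would be discarded by the stricter screen Cp(I) < k.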

 Olive, D.J. (2005), Two Simple Resistant Regression Estimators, Computational Statistics
and Data Analysis, 49, 809-819.
mba.pdf

 The MBA estimator is not as good as the FCH estimator in "High Breakdown Robust
Estimators," but was the first easily computed estimator of multivariate
location and dispersion shown (in 2004) to be sqrt(n) consistent and highly outlier resistant.
See "Robustifying Robust Estimators" or "Applied Robust Statistics" for proofs.
 Olive, D.J. (2004a), A Resistant Estimator of Multivariate Location and Dispersion,
Computational Statistics and Data Analysis, 46, 99-102.
rcov.pdf

 The following paper suggests ways to robustify regression techniques for single index models
and sliced inverse regression.
 Olive, D.J. (2004b), Visualizing 1D Regression, in
Theory and Applications of Recent Robust Methods, edited by M. Hubert, G. Pison, A. Struyf and
S. Van Aelst, Series: Statistics for Industry and Technology, Birkhauser, Basel, 221-233.
vreg.pdf

 Olive, D.J., and Hawkins, D.M. (2003), Robust Regression with High Coverage,
Statistics and Probability Letters, 63, 259-266.
hcov.pdf

 The following paper provides a simultaneous diagnostic for whether the data
follows a multivariate normal distribution or some other elliptically contoured distribution.
It also provides a nice way to estimate and visualize single index models.
 Olive, D.J. (2002), Applications of Robust Distances for Regression, Technometrics, 44, 64-71.
rdist.pdf

 The following paper gives extremely important theoretical results.
It shows that software implementations for estimators of robust regression and
robust multivariate location and dispersion tend to be inconsistent with zero breakdown value.
The commonly used elemental basic resampling algorithm draws K elemental sets. Each
elemental fit is inconsistent, so the final estimator is inconsistent, regardless
of how the algorithm chooses the elemental fit.
The CM, GS, LMS, LQD, LTS, maximum depth, MCD, MVE, one step GM and GR, projection, S, tau,
t type, and many other robust estimators are of little applied interest because they are
impractical to compute. The "Robustifying Robust Estimators" paper shows how to modify some
algorithms so that the resulting regression estimators are easily computed sqrt(n) consistent
high breakdown estimators and the resulting multivariate location and dispersion estimators
are sqrt(n) consistent with high outlier resistance.
 Hawkins, D.M., and Olive, D.J. (2002), Inconsistency of Resampling Algorithms for High
Breakdown Regression Estimators and a New Algorithm (with discussion), Journal of the American
Statistical Association, 97, 136-148.
incon.pdf
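
 To make the elemental basic resampling algorithm described above concrete, here is a small Python sketch for simple linear regression. Each elemental set is 2 cases, which determine a line exactly, and the algorithm keeps the elemental fit with the smallest median squared residual (the LMS criterion is an assumption chosen for illustration). This shows the kind of algorithm the paper criticizes, not the consistent modification; the function name and defaults are hypothetical.

```python
import random
import statistics

def elemental_resampling_lms(x, y, K=100, seed=0):
    """Draw K elemental sets of 2 cases; return (criterion, intercept, slope)
    for the elemental fit with the smallest median squared residual."""
    rng = random.Random(seed)
    n = len(x)
    best = None
    for _ in range(K):
        i, j = rng.sample(range(n), 2)  # one elemental set
        if x[i] == x[j]:
            continue  # the two cases do not determine a unique line
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        crit = statistics.median(
            (yk - intercept - slope * xk) ** 2 for xk, yk in zip(x, y)
        )
        if best is None or crit < best[0]:
            best = (crit, intercept, slope)
    return best

if __name__ == "__main__":
    xs = [float(i) for i in range(10)]
    ys = [1.0 + 2.0 * xi for xi in xs]
    ys[9] = 100.0  # a single outlier
    print(elemental_resampling_lms(xs, ys))
```

 Because the final estimator is always one of the K exact fits to 2 cases, it inherits the variability of an elemental fit no matter how large n is when K is fixed, which is the source of the inconsistency described above.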

 This paper gives a graphical method for estimating response transformations
that can be used to complement or replace the numerical Box-Cox method.
 Cook, R.D., and Olive, D.J. (2001), A Note on Visualizing Response Transformations,
Technometrics, 43, 443-449.
resp.pdf

 Olive, D.J. (2001), High Breakdown Analogs of the Trimmed Mean, Statistics and Probability
Letters, 51, 87-92.
rloc.pdf

 Hawkins, D.M., and Olive, D.J. (1999a), Improved Feasible Solution Algorithms for
High Breakdown Estimation, Computational Statistics and Data Analysis, 30, 1-11.
ifsa.pdf

 Hawkins, D.M., and Olive, D. (1999b), Applications and Algorithms for Least Trimmed Sum
of Absolute Deviations Regression, Computational Statistics and Data Analysis, 32, 119-134.
lta.pdf
