Cross-Validation for Optimal Surrogate Selection and Combination


fche...@gmail.com

Jan 5, 2009, 3:23:40 PM
to Surrogates and Simple Toolboxes
Dear all,

Here is a reference on how to get the most out of multiple surrogates
using cross-validation:

F. A. C. Viana, R.T. Haftka, and V. Steffen Jr, "MULTIPLE SURROGATES:
HOW CROSS-VALIDATION ERRORS CAN HELP US TO OBTAIN THE BEST PREDICTOR,"
Structural and Multidisciplinary Optimization, 2009
(DOI: 10.1007/s00158-008-0338-0).

Surrogate models are commonly used to replace expensive simulations of
engineering problems. Frequently, a single surrogate is chosen based
on past experience. This approach has generated a collection of papers
comparing the performance of individual surrogates. Previous work has
also shown that fitting multiple surrogates and picking one based on
cross-validation errors (PRESS in particular) is a good strategy, and
that cross-validation errors may also be used to create a weighted
surrogate. In this paper, we discussed how PRESS (obtained either from
the leave-one-out or from the k-fold strategies) is employed to
estimate the RMS error, and whether to use the best PRESS solution or
a weighted surrogate when a single surrogate is needed. We also
studied the minimization of the integrated square error as a way to
compute the weights of the weighted average surrogate. We found that
it pays to generate a large set of different surrogates and then use
PRESS as a criterion for selection. We found that (i) in general,
PRESS is good for filtering out inaccurate surrogates; and (ii) with a
sufficient number of points, PRESS may identify the best surrogate of
the set. Hence the use of cross-validation errors for choosing a
surrogate and for calculating the weights of weighted surrogates
becomes more attractive in high dimensions (when a large number of
points is naturally required). However, it appears that the potential
gains from using weighted surrogates diminish substantially in high
dimensions. We also examined the utility of using all the surrogates
for forming the weighted surrogates versus using a subset of the most
accurate ones. This decision is shown to depend on the weighting
scheme. Finally, we also found that PRESS as obtained through the
k-fold strategy successfully estimates the RMSE.
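
As a rough illustration of the selection idea described above, here is a
minimal Python sketch. It computes a k-fold PRESS-based RMS estimate for a
few toy surrogates, picks the one with the lowest value, and builds a
weighted average surrogate. Everything in it is an assumption made for
illustration: the three toy surrogates (mean, straight line, nearest
neighbor), the sine test function, the way points are assigned to folds,
and the inverse-squared-PRESS weights. In particular, the weights are a
simple heuristic and not the integrated-square-error minimization studied
in the paper.

```python
import math

# Toy 1-D surrogates (illustrative stand-ins, not the models from the paper).
# Each "fit" function takes training data and returns a predictor f(x).
def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: ybar + slope * (x - xbar)

def fit_nearest(xs, ys):
    pts = list(zip(xs, ys))
    return lambda x: min(pts, key=lambda p: abs(p[0] - x))[1]

def press_rms(fit, xs, ys, k=None):
    """RMS of the cross-validation errors; k=None means leave-one-out."""
    n = len(xs)
    k = n if k is None else k
    sq_sum = 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))  # every k-th point forms one fold
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        model = fit(train_x, train_y)      # refit without the held-out points
        sq_sum += sum((ys[i] - model(xs[i])) ** 2 for i in held_out)
    return math.sqrt(sq_sum / n)

# Hypothetical sample data: 12 points of sin(x).
xs = [0.5 * i for i in range(12)]
ys = [math.sin(x) for x in xs]

fitters = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}

# PRESS-based RMS estimate for each surrogate (4-fold here).
press = {name: press_rms(fit, xs, ys, k=4) for name, fit in fitters.items()}

# Selection: keep the surrogate with the smallest PRESS.
best = min(press, key=press.get)

# Weighted average surrogate with heuristic weights proportional to
# 1 / PRESS^2 (an assumption for illustration; the tiny constant only
# guards against division by zero).
inv = {name: 1.0 / (p ** 2 + 1e-12) for name, p in press.items()}
total = sum(inv.values())
weights = {name: v / total for name, v in inv.items()}

# Final models are fitted on all points.
models = {name: fit(xs, ys) for name, fit in fitters.items()}

def weighted_predict(x):
    return sum(weights[name] * models[name](x) for name in models)

print("PRESS RMS:", press)
print("best surrogate:", best)
print("weights:", weights)
print("weighted prediction at x=1.2:", weighted_predict(1.2))
```

Calling press_rms(fit, xs, ys) without k gives the leave-one-out variant.
Because the weights are non-negative and sum to one, the weighted
prediction always lies between the smallest and largest individual
predictions.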


You can find more about it online:
http://fchegury.googlepages.com

All the best,
Felipe Viana