by Eric Schulz, Quentin J. M. Huys, Dominik R. Bach, Maarten Speekenbrink, Andreas Krause
Abstract:
Exploration-exploitation of functions, that is, learning and optimizing a mapping between inputs and expected outputs, is ubiquitous in many real-world situations. These situations sometimes require us to avoid certain outcomes at all costs, for example because they are poisonous, harmful, or otherwise dangerous. We test participants' behavior in scenarios in which they have to find the optimum of a function while at the same time avoiding outputs below a certain threshold. In two experiments, we find that Safe-Optimization, a Gaussian Process-based exploration-exploitation algorithm, describes participants' behavior well, and that participants seem to care first about whether a point is safe and then try to pick the optimal point from among all such safe points. This means that their trade-off between exploration and exploitation can be seen as an intelligent, approximate, and homeostasis-driven strategy.
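The two-step rule described in the abstract (first decide whether a point is safe, then pick the best point among the safe ones) can be illustrated with a small sketch. This is not the authors' model or the full Safe-Optimization algorithm; it is a simplified, assumed variant in which a point counts as safe when a Gaussian Process lower confidence bound lies above the threshold, and the choice among safe points maximizes the upper confidence bound. The function names, the unit-variance RBF kernel, the noise level, and the `beta` parameter are all illustrative assumptions.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1-D input arrays (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_obs, y_obs, x_grid, noise=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP on x_grid."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_go = rbf_kernel(x_grid, x_obs)
    mean = k_go @ np.linalg.solve(k_oo, y_obs)
    # Prior variance is 1.0 for this kernel; subtract the explained part.
    var = 1.0 - np.sum(k_go * np.linalg.solve(k_oo, k_go.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def choose_safe_point(x_obs, y_obs, x_grid, threshold, beta=2.0):
    """Simplified safe choice: filter by safety first, then optimize.

    Step 1: keep grid points whose lower confidence bound is >= threshold.
    Step 2: among those, return the point with the highest upper bound.
    Returns None when no grid point is considered safe.
    """
    mean, std = gp_posterior(x_obs, y_obs, x_grid)
    lower = mean - beta * std
    upper = mean + beta * std
    safe_idx = np.flatnonzero(lower >= threshold)
    if safe_idx.size == 0:
        return None
    return x_grid[safe_idx[np.argmax(upper[safe_idx])]]
```

A hypothetical call would be `choose_safe_point(np.array([0.0]), np.array([5.0]), np.linspace(-5, 5, 101), threshold=1.0)`, which only considers grid points close enough to the single observation for the lower bound to stay above the threshold.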
Reference:
Better safe than sorry: Risky function exploitation through safe optimization. E. Schulz, Q. J. M. Huys, D. R. Bach, M. Speekenbrink, A. Krause. In Proc. 38th Annual Meeting of the Cognitive Science Society (CogSci), 2016
Bibtex Entry:
@inproceedings{schulz16better,
	author = {Eric Schulz and Quentin J. M. Huys and Dominik R. Bach and Maarten Speekenbrink and Andreas Krause},
	booktitle = {Proc. 38th Annual Meeting of the Cognitive Science Society (CogSci)},
	month = {August},
	title = {Better safe than sorry: Risky function exploitation through safe optimization},
	year = {2016}}