Many physical systems have underlying safety considerations that require that the strategy deployed ensures the satisfaction of a set of constraints. Further, often we have only partial information on the state of the system.
Mar 29, 2022
In this work, we consider a conservative and stochastic linear bandit setting with context distribution and unknown contexts, i.e., a bandit problem with ...
Nov 19, 2016 · We develop a safe contextual linear bandit algorithm, called conservative linear UCB (CLUCB), that simultaneously minimizes its regret and satisfies the safety ...
In this paper, we formulate a conservative stochastic contextual bandit formulation for real-time decision making when an adversary chooses a distribution on ...
Abstract. Safety is a desirable property that can immensely increase the applicability of learning algorithms in real-world decision-making problems.
A safe contextual linear bandit algorithm, called conservative linear UCB (CLUCB), is developed that simultaneously minimizes its regret and satisfies the ...
In this paper, we study the issue of safety in contextual linear bandits that have application in many different fields including personalized recommendation.
This paper studies a kind of contextual linear bandits with a conservative constraint, which enforces the player's cumulative reward at any time t to be at ...
In this paper, we study the stage-wise conservative linear stochastic bandit problem. Specifically, we consider safety constraints that require the action ...
We study a constrained contextual linear bandit setting, where the goal of the agent is to produce a sequence of policies, whose ex-.