Abstract:
SAMWMIX, a Stochastic Multi-Armed Bandit (SMAB) algorithm which obtains a regret of $O(\log T)$, where $T$ is the number of steps in the time horizon, has been proposed in the literature.
A blind-SAMWMIX, which incorporates an input parameter, has better empirical performance but obtains a regret of the order $O(\log^{1+2\alpha} T)$. The current work proposes an efficient version of SAMWMIX which not only obtains a regret of $O(\log T)$ but also exhibits better performance. A proof for the same is
given in this work. The proposed effSAMWMIX algorithm is
compared with the KL-UCB and Thompson Sampling (TS) algorithms over rewards drawn from Exponential, Poisson, Bernoulli, Triangular, and Truncated Normal distributions, as well as a synthetic distribution designed to stress-test SMAB algorithms with closely spaced reward means. It is shown that effSAMWMIX performs better than both KL-UCB and TS in terms of both regret and execution time.