Abstract:
The Portfolio Selection Problem (PSP) is actively discussed in financial research. Choosing among the
available assets requires exploration, and the objective of maximizing portfolio payoffs makes the
PSP an explore-exploit decision-making problem. Multi-armed bandit (MAB) algorithms are well suited
to such problems when applied as the decision engines in Naïve Bandit Portfolio (NBP) algorithms.
An NBP’s performance depends on the MAB used inside the algorithm. In this work we test a
Stochastic Multi-Armed Bandit (SMAB) algorithm named effSAMWMIX, which we proposed in previous
work, to solve the PSP. We compare the performance of effSAMWMIX against KL-UCB,
Thompson Sampling, and the benchmark Market Buy & Hold strategy. We tested the
algorithms on simulated and real-world market datasets. Our results show that
effSAMWMIX, applied as the decision-making engine of an NBP, achieved higher cumulative
wealth for all portfolios than the competing SMAB algorithms.