
		<paper>
			<loc>https://jjcit.org/paper/24</loc>
			<title>BAT Q-LEARNING ALGORITHM</title>
			<doi>10.5455/jjcit.71-1480540385</doi>
			<authors>Bilal H. Abed-alguni</authors>
			<keywords>Q-learning,Bat algorithm,Optimization,Cooperative reinforcement learning</keywords>
			<citation>48</citation>
			<views>5664</views>
			<downloads>1723</downloads>
			<received_date>2016-11-30</received_date>
			<revised_date>2017-02-01</revised_date>
			<accepted_date>2017-02-23</accepted_date>
			<abstract>The cooperative Q-learning approach allows multiple learners to learn independently and then share their
Q-values with each other using a Q-value sharing strategy. A main problem with this approach is that
the learners' solutions may not converge to optimality, because the optimal Q-values may not be
found. Another problem is that some cooperative algorithms perform very well on single-task problems
but quite poorly on multi-task problems. This paper proposes a new cooperative Q-learning algorithm,
called the Bat Q-learning algorithm (BQ-learning), that implements a Q-value sharing strategy based on
the Bat algorithm. The Bat algorithm is a powerful optimization algorithm that increases the possibility of
finding the optimal Q-values by balancing between the exploration and exploitation of actions through
tuning of the algorithm's parameters. The BQ-learning algorithm was tested using two problems: the shortest
path problem (a single-task problem) and the taxi problem (a multi-task problem). The experimental results
suggest that BQ-learning performs better than single-agent Q-learning and some well-known cooperative
Q-learning algorithms.</abstract>
		</paper>


