In this work, a Nadaraya-Watson kernel-based learning system with a general regression neural network topology is adapted to the Q-learning method to obtain a fast and efficient action selection policy for reinforcement learning problems. The proposed method generalizes the Q-value function and accelerates the learning of the Q agent. The training data for the developed neural network are obtained from a standard Q-learning agent in a closed-loop simulation system. The proposed method is tested on popular reinforcement learning benchmarks, and its performance is compared with that of other popular regression methods and Q-learning-based methods. On the selected benchmarks, QLRNN improved learning performance and learned faster than the other methods. The test results demonstrate the efficiency and importance of the proposed network.
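The core idea of generalizing Q values with a Nadaraya-Watson estimator can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the fixed bandwidth, the per-action filtering of samples, and the names `nw_q_value`, `greedy_action`, and `samples` are all assumptions made for illustration. The estimate for a query state is a kernel-weighted average of the Q values that a standard Q-learning agent has already recorded for nearby states.

```python
import math


def gaussian_kernel(sq_dist, bandwidth):
    """Gaussian weight for a squared Euclidean distance (assumed kernel choice)."""
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def nw_q_value(state, action, samples, bandwidth=0.5):
    """Nadaraya-Watson estimate of Q(state, action).

    samples: list of (state_tuple, action, q_value) triples, assumed to have
    been collected by a standard Q-learning agent (hypothetical data layout).
    """
    numerator = 0.0
    denominator = 0.0
    for s_i, a_i, q_i in samples:
        if a_i != action:
            continue  # only samples recorded for the queried action contribute
        sq_dist = sum((x - y) ** 2 for x, y in zip(state, s_i))
        weight = gaussian_kernel(sq_dist, bandwidth)
        numerator += weight * q_i
        denominator += weight
    # With no usable samples, fall back to 0.0 (an arbitrary default).
    return numerator / denominator if denominator > 0.0 else 0.0


def greedy_action(state, actions, samples, bandwidth=0.5):
    """Pick the action with the largest estimated Q value."""
    return max(actions, key=lambda a: nw_q_value(state, a, samples, bandwidth))


# Toy one-dimensional example with two actions (made-up numbers).
samples = [
    ((0.0,), 0, 1.0),
    ((1.0,), 0, 3.0),
    ((0.0,), 1, 2.0),
    ((1.0,), 1, 0.0),
]
print(nw_q_value((0.5,), 0, samples))
print(greedy_action((0.5,), [0, 1], samples))
```

In this sketch the query state `(0.5,)` was never visited, yet it still receives a Q estimate interpolated from its neighbours, which is the sense in which a kernel regressor generalizes a tabular Q function. The bandwidth controls how far that interpolation reaches.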