Developing sophisticated conversational agents is an important area of natural language processing for commercial enterprises and academic departments alike. A major line of research in dialogue generation concerns open-domain chatbot models. In my thesis I follow this line of work by investigating neural network based chatbot architectures that are capable of imitating the features of human conversations. I focus on a major weakness of the maximum likelihood objective in these end-to-end dialogue models, which is a potential source of their inability to maintain diverse and interesting conversations. My goal in this project is to find a solution to this phenomenon by using a two-stage training process to obtain a chatbot model that is free from this problem. I first create a baseline model through maximum likelihood estimation (MLE) on dialogue data and then employ reinforcement learning to tune its parameters, thereby counteracting the undesired effects of the MLE objective. My work is available at the following link: https://github.com/Mrpatekful/ParlAI.