In this paper, we present a reinforcement learning model of the shepherding of a flock of sheep by a dog. The shepherding task, a heuristic model originally proposed by Strömbom et al., describes the dynamics of a flock of sheep being herded by a dog toward a predefined target. This study recreates the proposed model using SARSA, an on-policy algorithm for learning an optimal policy in reinforcement learning. Results show that, with a discretized state and action space, the dog successfully herds a flock of sheep to the target position by first learning to reach a subgoal. A reward is awarded when the dog reaches the neighbourhood of a subgoal, while a penalty is incurred for each time step in which the shepherding task remains incomplete. The stochasticity of the interactions between the sheep and the dog, together with the existence of multiple subgoals, affects the learning time of the agent. Finally, we present an example of the learned shepherding task in which the agent succeeds consistently after the 350th episode.
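As a minimal sketch of the learning rule referenced above, the following shows the tabular SARSA update on a discretized state and action space. The states, actions, learning rate, and discount factor here are hypothetical placeholders and are not taken from the paper; only the update equation Q(s,a) ← Q(s,a) + α[r + γQ(s',a') − Q(s,a)] is the standard SARSA rule itself.

```python
from collections import defaultdict

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD update: Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a)).

    Q is a table over discretized (state, action) pairs; a_next is the
    action actually chosen by the current policy in s_next (on-policy).
    """
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])
    return Q[(s, a)]

# Hypothetical example: all Q-values start at 0; the dog receives a
# reward of +1.0 for reaching the neighbourhood of a subgoal.
Q = defaultdict(float)
v = sarsa_update(Q, s=0, a=1, r=1.0, s_next=2, a_next=0)
# With Q initialized to zero, the update is 0.1 * (1.0 + 0.9*0 - 0) = 0.1
```

In the full task, a small negative reward per time step (the penalty for an incomplete episode) would be fed to the same update, driving the agent toward policies that complete the herding quickly.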