Firefighters are frequently killed in accidents at fire scenes. A firefighting robot can take the place of humans at a fire scene and thereby reduce such accidents. A critical function of such a robot is to autonomously detect a fire spot and shoot water at it. In this research, the deep learning model YOLOv7 was applied to thermal images to recognize fire from its shape and temperature information. On test images that were not used for training, a recognition rate of 99% was obtained. To track the recognized fire spot, a 2-DOF pan-tilt actuation system with cameras was developed. With this system, a moving target was tracked with an error of 5%, and a variable-target tracking test, in which two target braziers were covered alternately, showed that it takes about 1.5 seconds to switch to the new target. In extinguishment experiments with a water sprayer mounted on the pan-tilt system, the temperature of the brazier dropped from 600 degrees to 13 degrees. These results confirm the feasibility of a robotic firefighting system based on image recognition.
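To illustrate how a detected fire spot could drive a 2-DOF pan-tilt unit, the following is a minimal sketch and not the controller used in this research. It assumes a pinhole camera model, a bounding box given as pixel corners (x1, y1, x2, y2), and a simple proportional update. The `Camera` class, the function names, the field-of-view values, and the gain are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Camera:
    """Hypothetical camera parameters (assumed, not taken from the paper)."""
    width: int        # image width in pixels
    height: int       # image height in pixels
    hfov_deg: float   # horizontal field of view in degrees
    vfov_deg: float   # vertical field of view in degrees


def pixel_to_angle_error(cx: float, cy: float, cam: Camera) -> tuple[float, float]:
    """Convert a target pixel position into pan/tilt angle errors in degrees.

    Uses a pinhole model: the focal length in pixels follows from the
    field of view, and the angle to a pixel is atan(offset / focal_length).
    Positive pan is to the right; positive tilt is up (image y grows down).
    """
    fx = (cam.width / 2) / math.tan(math.radians(cam.hfov_deg) / 2)
    fy = (cam.height / 2) / math.tan(math.radians(cam.vfov_deg) / 2)
    pan_err = math.degrees(math.atan((cx - cam.width / 2) / fx))
    tilt_err = -math.degrees(math.atan((cy - cam.height / 2) / fy))
    return pan_err, tilt_err


def step(pan: float, tilt: float, box: tuple[float, float, float, float],
         cam: Camera, gain: float = 0.5) -> tuple[float, float]:
    """Return new pan/tilt commands that move the box center toward the image center.

    `gain` is an assumed proportional gain; a value below 1 moves only part
    of the way per frame, which damps overshoot.
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    pan_err, tilt_err = pixel_to_angle_error(cx, cy, cam)
    return pan + gain * pan_err, tilt + gain * tilt_err


if __name__ == "__main__":
    cam = Camera(width=640, height=480, hfov_deg=60.0, vfov_deg=45.0)
    # A detection whose center lies right of and above the image center.
    print(step(0.0, 0.0, (400, 100, 480, 180), cam))
```

In a real system, the box would come from the detector on each frame, and the returned angles would be sent to the pan and tilt actuators. The gain would need tuning against actuator speed and frame rate.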