Vision-based fire detection systems have been significantly improved by deep models; however, high false-alarm rates and slow inference speeds still hinder their practical applicability in real-world scenarios. To achieve a balanced trade-off between computational cost and accuracy, we introduce the dual fire attention network (DFAN) for effective yet efficient fire detection. The first attention mechanism highlights the most informative channels in the features of an existing backbone model, yielding strongly emphasized feature maps. A modified spatial attention mechanism is then employed to capture spatial details and enhance the discrimination between fire and non-fire objects. We further optimize the DFAN for real-world applications by pruning a significant number of redundant parameters with a meta-heuristic approach, which yields around 50% higher FPS. Finally, we contribute a challenging medium-scale fire classification dataset comprising highly diverse yet visually similar fire/non-fire images, imbalanced classes, and many other complexities. The proposed dataset advances traditional fire detection datasets by covering multiple classes to answer the question: what is on fire? We perform experiments on four widely used fire detection datasets, where the DFAN achieves the best results against 21 state-of-the-art methods. Consequently, our work provides a baseline for fire detection on edge devices with higher accuracy and better FPS, and the proposed dataset extension adds indoor fire classes and a greater number of outdoor fire classes; both contributions can support future research. Our code and dataset will be publicly available at https://github.com/tanveer-hussain/DFAN.
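As an illustration of the dual attention design described above, the sketch below shows one plausible PyTorch realization: a squeeze-and-excitation style channel attention followed by a CBAM-style spatial attention applied to backbone feature maps. The class names, reduction ratio, kernel size, and feature dimensions are our own assumptions for demonstration, not the authors' exact DFAN implementation.

```python
# Illustrative sketch only (assumed architecture, not the official DFAN code):
# channel attention emphasizes informative channels, then spatial attention
# highlights fire-relevant regions in the backbone feature maps.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):  # reduction is an assumption
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze spatial dims to 1x1
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),  # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # emphasize the most informative channels

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):  # kernel size is an assumption
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)   # channel-wise average map
        mx, _ = x.max(dim=1, keepdim=True)  # channel-wise max map
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w  # highlight spatial regions useful for fire/non-fire discrimination

class DualAttention(nn.Module):
    """Channel attention followed by spatial attention over backbone features."""
    def __init__(self, channels: int):
        super().__init__()
        self.channel_att = ChannelAttention(channels)
        self.spatial_att = SpatialAttention()

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.spatial_att(self.channel_att(features))

# Usage: refine hypothetical 512-channel backbone features before a classifier head.
feats = torch.randn(1, 512, 7, 7)
refined = DualAttention(512)(feats)
print(refined.shape)  # torch.Size([1, 512, 7, 7])
```

In a pipeline like the one the abstract describes, such a module would sit between the backbone and the classification head, so the refined feature maps feed the final fire/non-fire decision.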