Recognizing and extracting visualizable information from text is the foundation of Text-to-Scene Conversion. Nouns are the main carriers of entity, event, orientation, action, and other information in text, but not all nouns are suitable for visualization, so determining the visibility of nouns is a key problem in Text-to-Scene Conversion. Based on a definition and classification of noun visualization, this paper builds a keyword visibility dictionary. To address the static limitations of the dictionary, a dynamic expansion method based on semantic similarity is studied, and an optimized method for noun visibility annotation that incorporates unsupervised clustering is proposed. Experimental results show that the proposed method is effective and feasible for noun visibility annotation of common texts, and that the accuracy of the optimized method is 16% higher than that of the plain semantic-similarity method.
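As a rough illustration of dictionary expansion by semantic similarity, the following minimal sketch labels an unseen noun with the visibility of its most similar dictionary entry. It is not the paper's implementation: the toy embedding vectors, the `annotate` function, the seed dictionary, and the `threshold` value are all assumptions made for this example, and the clustering-based optimization is left out.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy embeddings; a real system would load trained word vectors.
EMBEDDINGS = {
    "table": (0.9, 0.1, 0.0),
    "chair": (0.8, 0.2, 0.1),
    "idea": (0.0, 0.1, 0.9),
    "desk": (0.85, 0.15, 0.05),
    "thought": (0.05, 0.1, 0.95),
}

# Hypothetical seed visibility dictionary: True means the noun can be visualized.
VISIBILITY = {"table": True, "chair": True, "idea": False}


def annotate(noun, dictionary, embeddings, threshold=0.8):
    """Return the visibility label for `noun`, or None if it cannot be decided.

    A noun already in the dictionary keeps its label. Otherwise the label of
    the most similar dictionary entry is used when the similarity reaches
    `threshold`, and the dictionary is extended with the new noun.
    """
    if noun in dictionary:
        return dictionary[noun]
    if noun not in embeddings:
        return None

    best_word, best_sim = None, -1.0
    for known in dictionary:
        if known not in embeddings:
            continue
        sim = cosine(embeddings[noun], embeddings[known])
        if sim > best_sim:
            best_word, best_sim = known, sim

    if best_word is None or best_sim < threshold:
        return None

    # Dynamic expansion: remember the inferred label for later lookups.
    dictionary[noun] = dictionary[best_word]
    return dictionary[noun]
```

With these toy vectors, `annotate("desk", dict(VISIBILITY), EMBEDDINGS)` inherits the label of a furniture noun, while `annotate("thought", dict(VISIBILITY), EMBEDDINGS)` inherits the label of "idea". A noun with no embedding is left unlabeled rather than guessed.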