This special issue consists of extended papers selected from the 2007 International Conference on Information, Communications and Signal Processing (ICICS07). ICICS07 was held in Singapore in December 2007 and received a positive response from the research community, attracting more than 700 paper submissions. The papers in this special issue were selected based on their ICICS07 reviews, their relevance to content analysis for media computing, and the timeliness of the research.

The volume of media data (image, video, audio, etc.) has grown tremendously in recent years due to the increasing popularity of consumer electronics such as video cameras, the prevalence of low-cost, high-capacity storage devices, and the proliferation of media data over the Internet and wireless networks. Coupled with rapidly improving wired and wireless network infrastructure, the demand for intelligent and efficient solutions for media creation, indexing, search, transmission, and display is growing fast. This special issue presents some recent advances in content analysis for media computing. In particular, it focuses on the following topics: (i) image indexing and retrieval, (ii) video search and classification, and (iii) a combination of works involving audio processing, sign language understanding, and visual quality assessment of videos.

This special issue contains two papers that present recent advances in image indexing and retrieval. The paper “A review of region-based image retrieval” provides an in-depth survey of conventional and state-of-the-art region-based image retrieval algorithms. In contrast to global approaches, region-based techniques partition an image into regions and extract various visual features from these regions to represent them. The paper focuses in particular on three issues: (a) local region-based features, (b) similarity measures, and (c) region-based relevance feedback. The paper “Knowledge propagation in collaborative tagging for image retrieval” proposes a new knowledge propagation scheme that propagates keywords from a subset of annotated images to the unannotated ones. The method is based on content analysis of the images and the training of keyword classifiers: the salient regions of each image are detected, and their importance is estimated using a support vector machine (SVM). Once the previously unannotated images are tagged with propagated keywords, text-based image search can then be performed.
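As a rough illustration of the general idea behind keyword propagation (and not the authors' actual pipeline, which additionally involves salient-region detection and region-importance estimation), the Python sketch below trains one binary SVM per keyword on annotated images and propagates positive predictions to unannotated ones. All features, labels, and keyword names here are synthetic placeholders.

```python
# Minimal sketch of keyword propagation via per-keyword SVM classifiers.
# Synthetic features stand in for the paper's salient-region descriptors;
# the region-importance estimation step is not reproduced here.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Assume each image is already summarized by a fixed-length feature vector.
annotated_feats = rng.normal(size=(100, 64))   # features of tagged images
unannotated_feats = rng.normal(size=(20, 64))  # features of untagged images
keywords = ["beach", "sunset"]                 # hypothetical tag vocabulary
labels = {k: rng.integers(0, 2, size=100) for k in keywords}  # per-keyword tags

# Train one binary SVM per keyword, then propagate positive predictions
# to the unannotated images.
for k in keywords:
    clf = SVC().fit(annotated_feats, labels[k])
    predicted = clf.predict(unannotated_feats)
    print(k, "propagated to images:", np.nonzero(predicted)[0])
```

Once every image carries a (human-assigned or propagated) keyword set, standard text retrieval can be applied over the whole collection.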

Video search and classification is another important area of media content analysis. Unlike image data, videos contain temporal information and comprise both audio and visual components, so video search and classification methods need to fully exploit this information. This special issue comprises two papers on video search and classification. Together, they explore how audio and visual features can be combined effectively to represent a video and how user interaction can be integrated to improve search performance. In the paper “A new learning algorithm for the fusion of adaptive audio-visual features for the retrieval and classification of movie clips”, a new learning algorithm for audio-visual fusion is proposed. The system uses perceptual features for content characterization: an adaptive video indexing scheme represents the visual features, while a statistical model represents the audio features. These features are then combined and fed into an SVM to learn concepts from a video database. The paper “A cooperative learning scheme for interactive video search”, in turn, aims to improve video search performance by integrating user interaction. Building on a text-driven video search engine, a cooperative training strategy is developed to learn from the feedback data. An advantage of the proposed method is that it can mine training samples from previous answer sets and combine multiple modalities to learn users’ query intent.
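To make the fusion step concrete, here is a minimal sketch of early (feature-level) audio-visual fusion with an SVM, assuming each clip has already been reduced to fixed-length audio and visual descriptors. The adaptive video indexing and statistical audio modeling of the paper are replaced by synthetic placeholders, so this illustrates only the general fusion pattern.

```python
# Minimal sketch of early (feature-level) audio-visual fusion with an SVM.
# Synthetic per-clip descriptors stand in for the paper's adaptive video
# index and statistical audio model; names are illustrative only.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_clips = 200
visual = rng.normal(size=(n_clips, 32))      # per-clip visual descriptor
audio = rng.normal(size=(n_clips, 16))       # per-clip audio descriptor
concept = rng.integers(0, 2, size=n_clips)   # binary concept label

# Concatenate the two modalities into a single fused representation,
# then train a concept classifier on the fused vectors.
fused = np.hstack([visual, audio])
X_train, X_test, y_train, y_test = train_test_split(fused, concept, random_state=0)

clf = SVC().fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```

Early fusion of this kind lets a single classifier learn cross-modal correlations, at the cost of requiring both modalities to be available for every clip.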

This special issue also contains three papers addressing different aspects of content analysis: language identification, sign language phoneme transcription, and visual quality assessment of videos. The paper “Language-dependent contribution measuring and weighting for combining likelihood scores in language identification systems” investigates existing fusion techniques for language identification systems. It proposes a new language-dependent weighting method and explores various contribution measures, including the likelihood ratio and the Kullback–Leibler divergence. In the paper “Sign language phoneme transcription with rule-based hand trajectory segmentation”, an effective approach to extracting phonemes from sentences of American Sign Language is proposed. A rule-based segmentation technique divides the hand motion trajectories into segments, feature descriptors are extracted from the segments using principal component analysis, and the segment features are then clustered to obtain the phonemes. Finally, hidden Markov models are trained to recognize the sequence of phonemes in the sentences. The paper “Visual quality assessment of video and image sequences – a human-based approach” presents a technique for quantitatively assessing the quality of videos and image sequences without requiring a reference. The proposed technique correlates well with human perception of video quality.
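As an illustration of the segment-and-cluster stage of such a phoneme transcription pipeline (the rule-based trajectory segmentation and the HMM recognizer themselves are not reproduced), the sketch below reduces hypothetical segment descriptors with PCA and clusters them into a phoneme codebook. Segment data, dimensions, and cluster counts are all assumed for illustration.

```python
# Minimal sketch of the PCA-then-cluster stage of phoneme extraction.
# Synthetic trajectory segments stand in for the output of the paper's
# rule-based hand trajectory segmentation.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)

# Assume each trajectory segment is resampled to a fixed number of
# (x, y) points and flattened into one descriptor vector.
segments = rng.normal(size=(300, 40))   # 300 segments, 20 points each

# Reduce descriptor dimensionality, then cluster the reduced descriptors
# so that each cluster index serves as a phoneme label.
reduced = PCA(n_components=8).fit_transform(segments)
phoneme_ids = KMeans(n_clusters=12, n_init=10, random_state=0).fit_predict(reduced)

print("first ten segment-to-phoneme assignments:", phoneme_ids[:10])
```

The resulting phoneme label sequences would then serve as training observations for the HMM-based sentence recognizer described in the paper.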

As the guest editors of this special issue, we would like to thank all the authors for their great efforts in preparing the articles, and all the reviewers for their insightful comments and the time they devoted to reviewing the articles. We are also grateful to the Editor-in-Chief, Prof. S. Y. Kung, for his wonderful support in making this special issue possible.