University of Feira de Santana
Feira de Santana - Bahia - Brazil
Data Fusion and Machine Learning for Information Retrieval
Abstract: Technological advances in data capture, storage, and processing have enabled the construction of large databases, especially for multimedia data sharing. These data are used in many contexts, such as education, medicine, biometry, social networks, entertainment, and news. Given the huge volume of data, providing efficient and effective access is critical. This project explores the use of modern machine learning techniques for multimedia information retrieval.
Machine Learning for Plant Species Recognition
Abstract: Knowledge of the biodiversity of a region is fundamental for developing effective productive processes while minimizing damage to the environment. At the same time, knowing the characteristics of the species allows the definition of proper preservation policies, and the identification and recognition of flora species is a very important task in the activities of many sectors of society. Consequently, the study and application of modern techniques for feature representation and for building recognition models are necessary to allow the development of practical tools. In line with the international research community, this project aims to contribute to flora recognition by developing effective and efficient methods, with special interest in deep learning approaches.
Diversity-oriented Multimodal and Interactive Information Retrieval (PhD Thesis, IC/Unicamp)
Abstract: Information retrieval methods, especially for multimedia data, have evolved towards integrating multiple sources of evidence when analyzing the relevance of items for a given user search task. In this context, to attenuate the semantic gap between low-level features extracted from the content of digital objects and high-level semantic concepts (objects, categories, etc.), and to make systems adaptive to different user needs, interactive models have brought the user closer to the retrieval loop, allowing user-system interaction mainly through implicit or explicit relevance feedback. Analogously, diversity promotion has emerged as an alternative for tackling ambiguous or underspecified queries. Additionally, several works have addressed the issue of minimizing the user effort required to provide relevance assessments while keeping an acceptable overall effectiveness. This thesis discusses, proposes, and experimentally analyzes multimodal and interactive diversity-oriented information retrieval methods. It comprehensively covers the interactive information retrieval literature and discusses recent advances, major research challenges, and promising research opportunities. We have proposed and evaluated two workflows for enhancing the relevance-diversity trade-off, which integrate multiple sources of information about images, such as visual features, textual metadata, geographic information, and user credibility descriptors. In turn, as an integration of interactive retrieval and diversity promotion techniques, aimed at maximizing the coverage of multiple query interpretations/aspects and speeding up the information transfer between the user and the system, we have proposed and evaluated a multimodal learning-to-rank method trained with relevance feedback over diversified results.
Our experimental analysis shows that the joint use of multiple information sources positively impacted the relevance-diversity balancing algorithms. Our results also suggest that integrating multimodal-relevance-based filtering and reranking is effective in improving result relevance and also boosts diversity promotion methods. In addition, through a thorough experimental analysis, we have investigated several research questions related to the possibility of improving result diversity while keeping or even improving relevance in interactive search sessions. Moreover, we analyze how much the diversification effort affects overall search session results and how different diversification approaches behave for the different data modalities. By analyzing the overall and per-feedback-iteration effectiveness, we show that introducing diversity may harm initial results, whereas it significantly enhances the overall session effectiveness, considering not only relevance and diversity but also how early the user is exposed to the same amount of relevant items and diversity.
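The relevance-diversity trade-off discussed in this abstract can be illustrated with a generic greedy re-ranker. The sketch below uses maximal marginal relevance (MMR), a well-known diversification technique chosen here only for illustration; it is not claimed to be one of the methods proposed in the thesis. The function name `mmr_rerank`, the `trade_off` parameter, and all scores are hypothetical.

```python
from typing import Dict, FrozenSet, List


def mmr_rerank(relevance: Dict[str, float],
               similarity: Dict[FrozenSet[str], float],
               trade_off: float = 0.5,
               k: int = 3) -> List[str]:
    """Greedy MMR: pick items that are relevant but dissimilar to those already picked.

    trade_off = 1.0 ranks by relevance only; lower values favor diversity.
    Missing pairs in `similarity` are treated as similarity 0.0 (assumption).
    """
    selected: List[str] = []
    candidates = set(relevance)

    def score(item: str) -> float:
        # Redundancy is the highest similarity to any already selected item.
        redundancy = max(
            (similarity.get(frozenset((item, s)), 0.0) for s in selected),
            default=0.0,
        )
        return trade_off * relevance[item] - (1.0 - trade_off) * redundancy

    while candidates and len(selected) < k:
        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical scores: "a" and "b" are near-duplicates, "c" covers another aspect.
relevance = {"a": 0.9, "b": 0.85, "c": 0.5}
similarity = {
    frozenset(("a", "b")): 0.95,
    frozenset(("a", "c")): 0.1,
    frozenset(("b", "c")): 0.1,
}

diversified = mmr_rerank(relevance, similarity, trade_off=0.5)
relevance_only = mmr_rerank(relevance, similarity, trade_off=1.0)
```

With these toy scores, the diversified ranking promotes "c" above the near-duplicate "b", while the relevance-only ranking keeps the original order. This mirrors the observation above that diversification can lower the relevance of early results while exposing the user to more query aspects sooner.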
Multimodal Image Retrieval with Relevance Feedback based on Genetic Programming (Master's Thesis, IC/Unicamp)
Abstract: This project presents an approach for multimodal content-based image retrieval with relevance feedback based on genetic programming. We assume that there is textual information (e.g., metadata, textual descriptions) associated with the images in the collection. Furthermore, image content properties (e.g., color and texture) are characterized by image descriptors. Given the information obtained over the relevance feedback iterations, genetic programming is used to create effective functions that combine the similarities associated with different features. Using these new functions, the different similarities are combined into a single measure that better meets the user's needs. The main contribution of this project is the proposal and implementation of two frameworks. The first one, RFCore, is a generic framework for relevance feedback tasks over digital objects. The second one, MMRF-GP, is a framework for digital object retrieval with relevance feedback based on genetic programming, built on top of RFCore. We have validated the proposed multimodal image retrieval approach on two datasets, one from the University of Washington and another from the ImageCLEF Photographic Retrieval Task. Our approach has yielded better results for multimodal image retrieval than single-modality approaches. Furthermore, it has achieved better results for visual and multimodal image retrieval than the best submissions to the ImageCLEF Photographic Retrieval Task 2008.
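To make the idea of combining per-feature similarities concrete, the following minimal sketch shows one candidate combination function of the kind a genetic programming search might evolve, together with a simple fitness measure computed from user feedback. It does not reproduce RFCore or MMRF-GP; the descriptor names, the formula, the precision-at-k fitness, and all values are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Set

# Hypothetical per-descriptor similarities in [0, 1] between the query and one image.
Similarities = Dict[str, float]


def candidate_combination(s: Similarities) -> float:
    """One illustrative individual: sqrt(color * texture) + text."""
    return (s["color"] * s["texture"]) ** 0.5 + s["text"]


def rank(collection: Dict[str, Similarities],
         combine: Callable[[Similarities], float]) -> List[str]:
    """Order image ids by decreasing combined similarity."""
    return sorted(collection, key=lambda image_id: combine(collection[image_id]),
                  reverse=True)


def precision_at_k(ranking: List[str], relevant: Set[str], k: int) -> float:
    """Assumed fitness: fraction of the top-k images the user marked as relevant."""
    top_k = ranking[:k]
    return sum(1 for image_id in top_k if image_id in relevant) / k


# Toy collection with made-up similarity values.
collection = {
    "img_a": {"color": 0.9, "texture": 0.4, "text": 0.1},
    "img_b": {"color": 0.5, "texture": 0.5, "text": 0.8},
    "img_c": {"color": 0.2, "texture": 0.1, "text": 0.3},
}

ranking = rank(collection, candidate_combination)
# Suppose the user marked only img_b as relevant in a feedback iteration.
fitness = precision_at_k(ranking, relevant={"img_b"}, k=1)
```

In a genetic programming setting, many such candidate functions would be generated, scored by a fitness measure over the feedback collected so far, and recombined across generations; the sketch evaluates a single hand-written individual.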