Executive Summary : | Multi-label learning is the task of associating a data point with the semantically relevant labels from a given vocabulary. It is considered more challenging than conventional classification because the output space is large. When the label vocabulary is very large, the task is called large-scale multi-label learning or eXtreme Multi-label Learning (XML). XML tasks include online product recommendation, online seller recommendation, predicting keywords for an image, and retrieving semantically similar documents. Research in XML is primarily driven by tech giants and premier academic institutions, because existing algorithms are technically complex and require large-scale computational resources. Small and medium-scale enterprises, with limited technical expertise and compute resources, are mostly users of XML services. Academics from tier-2/3 institutes have hardly any significant involvement in XML research, even though they could contribute to the growth of this area by proposing innovative, creative, and diverse solutions.
This project aims to increase the reach and impact of XML research by developing and analyzing machine learning algorithms for XML that are easy to understand and implement, can be executed on large datasets in reasonable time, and do not compromise prediction accuracy. The project will examine, adapt, and integrate streaming, few-pass, randomized, and approximation algorithms that are popular in web-scale data mining and learning tasks but have not been fully adopted by existing XML methods. |