This project proposes the multi-class smoothed hinge (MCSH) loss function, designed to produce pre-trained weights with a flat loss landscape that are well suited for transfer learning.
Multi-class classification is essential in various machine learning applications, but it often suffers from overfitting when trained with cross-entropy (CE) loss over softmax outputs. A key limitation of CE loss is that it never reaches zero, even as the model's output for the target class approaches infinity, which drives models to become overly sensitive to the training data. We address this issue with a novel multi-class smoothed hinge loss function that introduces a threshold: once a network score exceeds the threshold, it no longer changes the loss. Our method is particularly effective in transfer learning, outperforming traditional approaches with higher post-transfer accuracy and flatter loss landscapes. Both theoretical and empirical analyses validate the effectiveness of our approach in producing high-quality pre-trained weights for transfer learning.
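The thresholding idea can be illustrated with a minimal numpy sketch. This is a hypothetical formulation based on the classic smoothed hinge (zero loss above a margin threshold, quadratic near it, linear below), applied to the margins between the target-class score and every other class; the paper's exact MCSH definition may differ.

```python
import numpy as np

def smoothed_hinge(z):
    """Smoothed hinge on a margin z: zero for z >= 1 (scores past the
    threshold contribute nothing, unlike cross-entropy), quadratic for
    0 < z < 1, and linear for z <= 0."""
    return np.where(z >= 1.0, 0.0,
           np.where(z <= 0.0, 0.5 - z, 0.5 * (1.0 - z) ** 2))

def multiclass_smoothed_hinge(scores, target):
    """Illustrative multi-class variant (assumed, not the paper's exact
    definition): sum the smoothed hinge over the margins between the
    target score and each non-target score."""
    margins = scores[target] - np.delete(scores, target)
    return float(smoothed_hinge(margins).sum())

scores = np.array([3.0, 0.5, -1.0])
print(multiclass_smoothed_hinge(scores, 0))  # 0.0: all margins >= 1
```

Pushing the target score even higher leaves the loss at exactly zero, which is the saturation property the abstract contrasts with CE loss.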
Multi-class Smoothed Hinge Loss Function in Pre-training for Transfer Learning
Wonjik Kim, Masayuki Tanaka, Masatoshi Okutomi, Hirokazu Nosato
IEEE International Conference on Image Processing, IEEE ICIP 2025.