Electroencephalography (EEG) plays a pivotal role in the diagnosis of various neurological conditions, most notably major depressive disorder (MDD). However, current deep learning-based methods for MDD detection generalize poorly, particularly across different EEG electrode channels, and have limited feature representation capacity. In this paper, we present adaptive feature learning (AFL), which uses kernel embedding to learn domain-invariant features across subjects in a reproducing kernel Hilbert space. AFL aims to improve the model's ability to generalize across EEG signals from multiple subjects. Furthermore, we find that batch normalization (BN) layers in existing MDD detection networks frequently suppress feature channels, which can weaken the representation power of the features. To address this issue, we propose channel activation (CA), which employs decorrelation to reactivate suppressed feature maps, thereby strengthening the model's feature representation capability, particularly for subtle EEG changes. We evaluate the proposed methods under the leave-one-subject-out protocol on the MODMA and PRED+CT datasets, obtaining detection accuracies of 90.56\% (MODMA) and 96.51\% (PRED+CT). The experimental results show that our method outperforms state-of-the-art (SOTA) methods in MDD recognition.