Two papers, “Joint Adversarial Learning for Domain Adaptation in Semantic Segmentation” by Yixin Zhang and “Progressive Boundary Refinement Network for Temporal Action Detection” by Qinying Liu, have recently been accepted for publication at the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020).
1. Joint Adversarial Learning for Domain Adaptation in Semantic Segmentation
Abstract: Unsupervised domain adaptation in semantic segmentation exploits pixel-level annotated samples in the source domain to aid the segmentation of unlabeled samples in the target domain. The key to this task is learning domain-invariant representations, and adversarial learning is commonly used for this purpose: a discriminator tries to distinguish which domain the input comes from, while the segmentation model aims to deceive the domain discriminator. In this work, we first propose a novel joint adversarial learning (JAL) scheme that boosts the domain discriminator in the output space by introducing information from the domain discriminator on low-level features. Consequently, the training of the high-level decoder is enhanced. We then propose a weight transfer module (WTM) to alleviate the inherent bias of the trained decoder towards the source domain. Specifically, WTM transforms the original decoder into a new decoder, which is learned only under the supervision of the adversarial loss and thus mainly focuses on reducing domain divergence. Extensive experiments on two widely used benchmarks show that our method brings considerable performance improvements over different baseline methods, demonstrating its effectiveness in output space adaptation.
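The adversarial objective described in the abstract can be illustrated with a minimal sketch. This is the generic two-player setup only (a discriminator that separates domains and a segmentation model that tries to fool it), not the JAL or WTM components of the paper. The function names, the domain label convention, and the example probabilities are all assumptions made for illustration.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12  # keeps log() finite at the extremes
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

# Domain labels: an arbitrary convention chosen for this sketch.
SOURCE, TARGET = 1, 0

def discriminator_loss(d_source, d_target):
    """Discriminator objective: tell source inputs from target inputs."""
    return bce(d_source, SOURCE) + bce(d_target, TARGET)

def adversarial_loss(d_target):
    """Segmentation-model objective: make target outputs look like source."""
    return bce(d_target, SOURCE)

# d_source / d_target are hypothetical discriminator outputs,
# read as P(input comes from the source domain).
separable = discriminator_loss(d_source=0.9, d_target=0.1)
confused = discriminator_loss(d_source=0.5, d_target=0.5)
print(separable < confused)                            # True
print(adversarial_loss(0.1) > adversarial_loss(0.9))   # True
```

The two losses pull in opposite directions: the discriminator's loss is low when the domains are easy to separate, while the segmentation model's adversarial loss is low only when target predictions are mistaken for source ones.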
2. Progressive Boundary Refinement Network for Temporal Action Detection
Abstract: Temporal action detection is a challenging task due to the vagueness of action boundaries. To tackle this issue, we propose an end-to-end progressive boundary refinement network (PBRNet). PBRNet belongs to the family of one-stage detectors and is equipped with three cascaded detection modules that localize action boundaries with increasing precision. Specifically, PBRNet consists mainly of coarse pyramidal detection, refined pyramidal detection, and fine-grained detection. The first two modules build two feature pyramids to perform anchor-based detection, and the third explores frame-level features to refine the boundaries of each action instance. In the fine-grained detection module, three frame-level classification branches are proposed to augment the frame-level features and update the confidence scores of action instances. PBRNet thus integrates anchor-based and frame-level methods. We experimentally evaluate the proposed PBRNet and comprehensively investigate the effect of its main components. The results show that PBRNet achieves state-of-the-art detection performance on two popular benchmarks, THUMOS’14 and ActivityNet, while maintaining a high inference speed.
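The idea of refining a coarse, anchor-based segment with frame-level evidence can be sketched as follows. This is a hand-written heuristic for illustration only: PBRNet's fine-grained module is a learned network, and the abstract does not specify its refinement rule. The function names, the search radius, and the per-frame scores below are all hypothetical.

```python
def refine_boundary(coarse_index, frame_scores, radius):
    """Snap a coarse boundary to the best-scoring frame within +/- radius."""
    lo = max(0, coarse_index - radius)
    hi = min(len(frame_scores) - 1, coarse_index + radius)
    return max(range(lo, hi + 1), key=lambda i: frame_scores[i])

def refine_segment(segment, start_scores, end_scores, radius=2):
    """Refine both ends of a (start, end) segment given in frame indices."""
    start, end = segment
    new_start = refine_boundary(start, start_scores, radius)
    new_end = refine_boundary(end, end_scores, radius)
    # Fall back to the coarse segment if refinement would invert or empty it.
    if new_end <= new_start:
        return segment
    return (new_start, new_end)

# Hypothetical per-frame scores for "an action starts here" / "ends here".
start_scores = [0.1, 0.2, 0.9, 0.3, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
end_scores = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.2, 0.8, 0.3, 0.1]

# A coarse detection spanning frames 3..6 is pulled to the score peaks.
print(refine_segment((3, 6), start_scores, end_scores))  # (2, 7)
```

Restricting the search to a small window around the coarse boundary reflects the cascaded design: earlier stages give the rough location, and frame-level evidence only adjusts it locally.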