Pub/sub systems form the underlying framework for many distributed applications, including large social networking applications. In this paper, we consider the optimization of the end-to-end latency of a pub/sub system in which the publisher, the broker, and the subscriber are in different administrative domains. While general pub/sub systems provide reliable message delivery, achieving good end-to-end latency in a multi-domain environment requires that the pub/sub system adapt to workload changes and to bottlenecks in the different sub-systems. This study is motivated by two applications: first, the pub/sub-based Simple Lookup Service (sLS) that is used in perfSONAR to provide information about network performance in Research and Education (R&E) networks; second, the pub/sub system that distributes alerts generated in the data pipeline of the Zwicky Transient Facility (ZTF). In this multi-domain pub/sub performance study, we consider a publisher with a multi-threaded architecture that uses batching to coalesce messages over a variable polling period. We propose a control algorithm that auto-tunes the batch-processing parameters, namely the batch size and the polling interval, to the input message load and to broker-side congestion. Using a detailed simulation model, we demonstrate the performance of the control algorithm in different scenarios. We then study its performance using a real trace obtained from the sLS. We show that the proposed algorithm quickly adapts to rapid changes in the workload and yields a lower mean end-to-end delay than the current deployment.