Online data collection has become a prominent option due to the COVID-19 pandemic. It is therefore crucial to understand to what extent online studies are comparable to face-to-face studies, particularly in multimodal language research, in which the mode of communication has a crucial effect. This study investigated multimodal communication across face-to-face and videoconferencing settings, focusing on gesture production and speech disfluency in a daily routine description task (N = 64). Results suggested that the overall disfluency rate was higher for participants who communicated via videoconferencing than for those who communicated face-to-face. The use of specific disfluency types also differed across the two settings, signaling an interplay between cognitive and communicative strategies. Overall gesture frequency and iconic gesture use were comparable across the two settings. Iconic gesture use negatively predicted the overall disfluency rate, regardless of the setting. Further research using different contexts is required to understand whether multimodal language differs between face-to-face and online communication.