Advancing earthquake nowcasting, the real-time forecasting of seismic activity, remains crucial for reducing casualties. This multifaceted challenge has recently gained attention in the deep learning community, facilitated by the availability of extensive earthquake datasets. Despite significant advances, the literature on earthquake nowcasting lacks comprehensive evaluations of pre-trained foundation models and modern deep learning architectures, each of which focuses on a different aspect of the data, such as spatial relationships, temporal patterns, or multi-scale dependencies. This paper addresses that gap by analyzing different architectures and introducing two new approaches, Multi Foundation Quake and GNNCoder. We formulate earthquake nowcasting as a time series forecasting problem for the next 14 days within 0.1-degree spatial bins in Southern California. The earthquake time series are generated from the logarithm of the energy released by quakes and span 1986 to 2024. Our evaluations demonstrate that the introduced models outperform other custom architectures by effectively capturing the spatiotemporal relationships inherent in seismic data. The performance of existing foundation models varies significantly with the pre-training dataset, emphasizing the need for careful dataset selection. Multi Foundation Quake achieves the best overall performance by combining a bespoke pattern with foundation model results handled as auxiliary streams.
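
As a rough illustration of the problem formulation above, the following sketch shows one way an event catalog could be turned into a log-energy time series on a 0.1-degree grid. It is a minimal sketch under stated assumptions, not the paper's pipeline: the function name `log_energy_series` and its parameters are hypothetical, the 14-day aggregation window is assumed to match the forecast horizon, the energy is taken from the standard Gutenberg-Richter relation log10(E [J]) = 1.5 M + 4.8, and cells with no events are assigned 0.

```python
import numpy as np

def log_energy_series(times_days, lats, lons, mags,
                      lat_min, lon_min, n_lat, n_lon, n_steps,
                      bin_deg=0.1, step_days=14.0):
    """Grid a catalog into log10 summed energy, shape (n_steps, n_lat, n_lon).

    Hypothetical helper: names, window length, and the empty-cell value
    are illustrative assumptions, not taken from the paper.
    """
    times_days = np.asarray(times_days, dtype=float)
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    mags = np.asarray(mags, dtype=float)

    # Assumed energy-magnitude relation: log10(E [J]) = 1.5 * M + 4.8
    energy = 10.0 ** (1.5 * mags + 4.8)

    # Integer indices of the time window and spatial bin of each event
    t_idx = np.floor(times_days / step_days).astype(int)
    i_idx = np.floor((lats - lat_min) / bin_deg).astype(int)
    j_idx = np.floor((lons - lon_min) / bin_deg).astype(int)

    # Drop events that fall outside the grid or the time range
    keep = ((t_idx >= 0) & (t_idx < n_steps) &
            (i_idx >= 0) & (i_idx < n_lat) &
            (j_idx >= 0) & (j_idx < n_lon))

    # Sum energy per (window, lat bin, lon bin); np.add.at handles repeats
    total = np.zeros((n_steps, n_lat, n_lon))
    np.add.at(total, (t_idx[keep], i_idx[keep], j_idx[keep]), energy[keep])

    # Take log10 of the summed energy; empty cells stay at 0 (assumption)
    out = np.zeros_like(total)
    mask = total > 0
    out[mask] = np.log10(total[mask])
    return out
```

Summing the energy before taking the logarithm keeps large events dominant within a cell. A real pipeline would also need to choose how to represent empty cells, since the logarithm of zero energy is undefined.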