MRI is an advanced imaging modality whose main disadvantage is its long data acquisition time. To accelerate MR image acquisition while maintaining high image quality, image reconstruction from sparsely sampled MRI data has been investigated extensively. Recently, deep convolutional neural networks have achieved promising results, yet the local receptive field of convolutional neural networks raises concerns regarding signal synthesis and artifact compensation. In this study, we proposed a deep learning-based reconstruction framework to provide improved image fidelity for accelerated MRI. We integrated the self-attention mechanism, which captures long-range dependencies across image regions, into a volumetric hierarchical deep residual convolutional neural network. Specifically, a self-attention module was integrated into every convolutional layer, where the signal at each position was calculated as a weighted sum of the features at all positions. Furthermore, relatively dense shortcut connections were employed, and data consistency was enforced. The proposed network, referred to as SAT-Net, was applied to cartilage MRI acquired using an ultrashort echo time (TE) sequence and retrospectively undersampled in a pseudo-random Cartesian pattern. The network was trained using 336 three-dimensional images (each containing 32 slices) and tested on 24 images, for which it yielded improved outcomes. The framework is generic and can be extended to various applications.
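As a rough illustration of the three ingredients named above (self-attention as a weighted sum over all positions, pseudo-random Cartesian undersampling, and data consistency), the following is a minimal NumPy sketch and not the SAT-Net implementation. Several details are assumptions not stated in the text: the attention weights are a softmax over dot products of linearly projected features, the mask keeps whole rows chosen at random, data consistency is a hard replacement of predicted k-space values by measured ones, and the example works in 2D rather than on volumes. The function names `self_attention`, `cartesian_mask`, and `data_consistency` and the projection matrices `w_q`, `w_k`, `w_v` are hypothetical.

```python
import numpy as np


def self_attention(features, w_q, w_k, w_v):
    """Return attended features and the attention weights.

    features: (n_positions, channels), one row per flattened image position.
    w_q, w_k, w_v: hypothetical projection matrices (assumed, not from the paper).
    """
    q = features @ w_q
    k = features @ w_k
    v = features @ w_v
    scores = q @ k.T                                       # (n_positions, n_positions)
    scores = scores - scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True) # softmax over all positions
    # Each output row is a weighted sum of the value features at all positions.
    return weights @ v, weights


def cartesian_mask(shape, keep_fraction, rng):
    """Pseudo-random Cartesian mask: keep whole rows (assumed phase-encode lines)."""
    n_rows = shape[0]
    n_keep = max(1, int(round(keep_fraction * n_rows)))
    rows = rng.choice(n_rows, size=n_keep, replace=False)
    mask = np.zeros(shape, dtype=bool)
    mask[rows, :] = True
    return mask


def data_consistency(image, measured_kspace, mask):
    """Hard data consistency (assumed): keep measured k-space samples where
    the mask is True, and the network's prediction everywhere else."""
    predicted = np.fft.fft2(image)
    merged = np.where(mask, measured_kspace, predicted)
    return np.fft.ifft2(merged)
```

In a sketch like this, `data_consistency` would be applied to a network output so that the reconstruction never contradicts the acquired samples, while `self_attention` lets each position draw on features from the whole image rather than only a local neighborhood.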