High-resolution poverty mapping supports evidence-based policy and research, yet about half of countries lack the survey data needed to generate useful poverty maps. To overcome this challenge, non-traditional data sources and deep learning techniques are increasingly used to create small-area estimates of poverty in low- and middle-income countries (LMICs). Convolutional neural networks (CNNs) trained on satellite imagery are among the most popular and effective approaches in this literature. However, the spatial resolution of poverty estimates has remained coarse, particularly in rural areas, which are critical for governments to support. To address this, we use an ensemble transfer learning approach involving three CNN models to predict chronic poverty at a finer 1 km² scale in rural Sindh, Pakistan. We train the model on a spatially noisy, georeferenced household survey containing poverty scores for 1.9 million anonymized households in Sindh Province, using publicly available inputs including daytime and nighttime satellite imagery and accessibility data. Results from rigorous cross-validation and ground-truthing of predictions against an original survey suggest that the model performs well in identifying the chronic poor in both arid and non-arid regions, outperforming previous studies on key accuracy metrics. Our inexpensive and scalable approach could be used to improve poverty targeting in LMICs.
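As a rough illustration of the ensemble idea, the sketch below combines per-cell predictions from three models by simple averaging. The abstract does not state how the three CNN outputs are combined, so the averaging rule is an assumption; the function name `ensemble_mean`, the arrays `model_a`, `model_b`, and `model_c`, and all numbers are hypothetical placeholders, not results from the study.

```python
import numpy as np


def ensemble_mean(predictions):
    """Average per-cell predictions from several models.

    predictions: list of 1-D arrays, one per model, all of equal length
    (one entry per 1 km^2 grid cell). Simple averaging is an assumption;
    the source does not specify the combination rule.
    """
    stacked = np.stack(predictions, axis=0)  # shape: (n_models, n_cells)
    return stacked.mean(axis=0)              # shape: (n_cells,)


# Hypothetical outputs of three CNNs for four grid cells (made-up numbers).
model_a = np.array([0.2, 0.8, 0.5, 0.1])
model_b = np.array([0.4, 0.6, 0.5, 0.3])
model_c = np.array([0.3, 0.7, 0.8, 0.2])

combined = ensemble_mean([model_a, model_b, model_c])
print(combined)
```

Averaging is only one option; weighted averaging or a stacked meta-model would fit the same interface by replacing the body of `ensemble_mean`.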