Weak signal preservation is critical in seismic data denoising, especially in deep seismic exploration. Weak signals are usually important for seismic data analysis, yet they are hard to separate from random noise because they are less compressible or sparsifiable. Conventional sparse coding models exploit local sparsity by learning a union of bases, but they do not take into account any prior information about the internal correlation of patches. Motivated by the observation that data patches within a group are expected to share the same sparsity pattern in the transform domain, so-called group sparsity, we propose a novel transform learning with group sparsity (TLGS) method, which jointly exploits local sparsity and internal patch self-similarity. Further, for weak signal preservation, we extend the TLGS method and propose transform learning with external reference (TLGSR). External clean or denoised patches serve as anchored references and are grouped together with similar corrupted patches. Each group is then jointly modeled under an adaptively learned sparse transform, which is achieved by jointly learning a subset of the transform for each group of data. The proposed method achieves better denoising performance than existing denoising methods, in terms of both signal-to-noise ratio and visual preservation of weak signals. Comparisons with the f-x deconvolution (FX-Decon) and data-driven tight frame (DDTF) methods on one synthetic dataset and three field datasets are also provided.
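To make the notion of group sparsity concrete, the following is a minimal sketch, not the TLGS or TLGSR algorithm itself. It assumes a fixed orthonormal DCT-II matrix as a stand-in for the learned transform, and it assumes that the shared sparsity pattern is enforced by keeping the transform rows with the largest energy across the whole group. The names `dct_matrix`, `group_hard_threshold`, `denoise_group`, and the parameter `keep` are hypothetical and introduced only for illustration.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix (n x n); an assumed stand-in for a learned transform."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def transpose(a):
    return [list(r) for r in zip(*a)]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def group_hard_threshold(coeffs, keep):
    """Keep the `keep` rows with the largest energy summed over the group.

    Every patch (column) therefore ends up with the same support,
    which is the shared sparsity pattern referred to as group sparsity.
    """
    energy = [sum(c * c for c in row) for row in coeffs]
    ranked = sorted(range(len(coeffs)), key=lambda r: energy[r], reverse=True)
    support = set(ranked[:keep])
    sparse = [row[:] if r in support else [0.0] * len(row)
              for r, row in enumerate(coeffs)]
    return sparse, sorted(support)


def denoise_group(patches, keep):
    """Denoise a group of similar vectorized patches with a shared support.

    patches: list of equal-length lists, one per patch in the group.
    Returns the reconstructed patches and the indices of the kept rows.
    """
    n = len(patches[0])
    w = dct_matrix(n)
    x = transpose(patches)                 # n x m, one patch per column
    a = matmul(w, x)                       # transform-domain coefficients
    a_sparse, support = group_hard_threshold(a, keep)
    x_hat = matmul(transpose(w), a_sparse)  # inverse of an orthonormal transform
    return transpose(x_hat), support
```

In this sketch a group is simply a list of vectorized patches. Under the same assumptions, external reference patches would be appended to the group as extra columns, so that their strong coefficients help select the shared support for the noisy patches.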