A RAM access takes a single Tcy (instruction cycle), maybe even 0.5 Tcy.
DMA might take 2 Tcy for each transfer (as each transfer consists of a read and a write).
Even if DMA takes 2 cycles and uses cycle stealing, you will hardly notice it unless you implement block transfers.
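To put a rough number on "hardly noticeable", here is a small sketch of the worst-case arithmetic. It assumes every DMA transfer really stalls the CPU for its full duration, which is pessimistic. The function name and the example figures (Fcy = 40 MHz, 100,000 transfers per second) are made up for illustration and are not taken from any datasheet.

```c
#include <assert.h>

/* Worst-case fraction of CPU instruction cycles lost to DMA cycle stealing.
 * Assumption: every transfer stalls the CPU for all of its Tcy.
 *   fcy_hz           instruction clock in Hz (hypothetical value in the note below)
 *   transfers_per_s  DMA transfers per second
 *   tcy_per_transfer Tcy consumed per transfer (2 = one read + one write)
 */
double dma_stolen_fraction(double fcy_hz,
                           double transfers_per_s,
                           double tcy_per_transfer)
{
    assert(fcy_hz > 0.0);
    return transfers_per_s * tcy_per_transfer / fcy_hz;
}
```

With those made-up figures, dma_stolen_fraction(40e6, 100e3, 2.0) gives 0.005, i.e. about 0.5 % of the CPU cycles, and that is the pessimistic case.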
Accessing SFRs doesn't cost an extra cycle, as the DMA controller uses peripheral bus cycles not required by the CPU. (Unless you initialize all SFRs with the same value, the peripheral bus will be available for DMA at least about every 2nd cycle.)
For more, see the DMA chapter of the datasheet and the DMA section of the Family Reference Manual (FRM).