I think L3 is much faster; perhaps 1/3rd of main memory, assuming the line is available and not in another core. Here are some numbers for L3 cache, from Intel (probably specific to the 5500 series)[1]:
L3 CACHE hit, line unshared ~40 cycles
L3 CACHE hit, shared line in another core ~65 cycles
L3 CACHE hit, modified in another core ~75 cycles
remote L3 CACHE ~100-300 cycles
Local Dram ~60 ns
Remote Dram ~100 ns
1: http://software.intel.com/sites/products/collateral/hpc/vtun...