Many of the things people want to do these days are memory bandwidth limited: editing or recoding video, tweaking still pictures, and playing games all qualify. GPUs have far better memory bandwidth than CPUs, largely because they are sold differently.
The extra bandwidth comes from five advantages that GPUs enjoy:
- GPU and memory come together on one board (faster, more pins)
- point-to-point memory interface (faster, lower power)
- cheap GPU silicon real estate means more pins
- occasional bit errors in GPU memory are considered acceptable
- GPUs typically have less memory than CPUs
CPU memory interfaces are expected to be expandable. Expandability has dropped somewhat, so that currently you get two slots, one of which starts out populated and the other of which may or may not be. The consequence is that the CPU-to-DRAM connection has multiple drops on each pin.
GPUs always connect one DRAM pin to each GPU pin. If they use more DRAM chips, those chips have narrower interfaces. Because the interfaces are guaranteed point-to-point, they can run at higher speed, generally about twice the rate of CPU interfaces.
CPU silicon is optimized for single-thread performance -- both Intel and AMD ship very high performance silicon. As a result, it costs more per unit area than the commodity silicon GPUs are built with. The "programs" that run on GPUs are much more amenable to parallelization, which is why GPUs can be competitive even when built on lower-performance silicon.
It turns out that I/O pins require drivers and ESD protection structures that have not scaled down with logic transistors over time. As a result, pins on CPUs cost more than pins on GPUs, and so GPUs have more pins. That means they can talk to more DRAM pins and get more bandwidth.
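To see how pin count and interface speed compound, here is a back-of-the-envelope calculation. The bus widths and transfer rates are illustrative assumptions (roughly dual-channel DDR2-class vs. a 256-bit GDDR-class interface at twice the rate), not figures for any particular part:

```python
def bandwidth_gb_s(data_pins, transfers_per_sec):
    """Peak bandwidth in GB/s: one bit per data pin per transfer."""
    return data_pins * transfers_per_sec / 8 / 1e9

# Hypothetical CPU: two 64-bit channels on a multi-drop bus at 800 MT/s.
cpu = bandwidth_gb_s(data_pins=128, transfers_per_sec=800e6)

# Hypothetical GPU: a 256-bit point-to-point interface at twice that rate.
gpu = bandwidth_gb_s(data_pins=256, transfers_per_sec=1600e6)

print(f"CPU: {cpu:.1f} GB/s  GPU: {gpu:.1f} GB/s")
# CPU: 12.8 GB/s  GPU: 51.2 GB/s
```

Doubling the pins and doubling the rate multiplies out to a 4x bandwidth advantage.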
The first three of these advantages would apply to a CPU if you sold it the same way a GPU is sold. The final two would not apply, but they are easy to work around.
The first is the acceptability of bit errors. GPUs do not have ECC. It would be easy to make a CPU/GPU that had a big wide interface with ECC.
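For a sense of the cost, a standard Hamming SECDED code (the scheme behind 72-bit ECC DIMMs) needs only logarithmically many check bits, and the relative overhead shrinks as the interface gets wider. A quick sketch of the arithmetic:

```python
def secded_check_bits(data_bits):
    """Check bits for Hamming SECDED: smallest r with 2**r >= data_bits + r + 1,
    plus one extra overall-parity bit for double-error detection."""
    r = 1
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r + 1

for width in (64, 128, 256):
    extra = secded_check_bits(width)
    print(f"{width}b data: {extra} check bits ({100 * extra / width:.1f}% overhead)")
# 64b data: 8 check bits (12.5% overhead)   <- the familiar 72-bit ECC DIMM
# 128b data: 9 check bits (7.0% overhead)
# 256b data: 10 check bits (3.9% overhead)
```

The wider the interface, the cheaper ECC is in relative terms, so a big wide interface is actually the easy case.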
The second is the memory size. GPUs typically connect to 8 or 16 DRAM chips with 32b interfaces each. It would be straightforward to connect instead to 64 DRAM chips with 8b interfaces each. If fanout to the control pins of the DRAMs becomes a problem, dedicated off-chip drivers would be cheap to add.
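To make the trade concrete, here is how bus width and capacity scale with chip count and per-chip interface width; the 1 Gb per-chip capacity is an assumption for illustration:

```python
def config(chips, bits_per_chip, chip_gbit=1):
    """Total bus width (bits) and capacity (GB) for a DRAM configuration."""
    return chips * bits_per_chip, chips * chip_gbit / 8

for chips, width in ((8, 32), (16, 32), (64, 8)):
    bus, cap = config(chips, width)
    print(f"{chips:2d} chips x {width:2d}b -> {bus}-bit bus, {cap:.0f} GB")
#  8 chips x 32b -> 256-bit bus, 1 GB
# 16 chips x 32b -> 512-bit bus, 2 GB
# 64 chips x  8b -> 512-bit bus, 8 GB
```

The 64-chip organization has the same processor-side pin count as the 16-chip one, but four times the capacity.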
So, I think integrated CPU/GPU combinations will be interesting for the market, but they will be more interesting once they are sold the way GPUs are sold today. Essentially, you will buy a motherboard from Iwill with an AMD CPU/GPU and 2 to 8 GB of memory, and neither the memory nor the processor will be upgradable.
For servers, I think AMD is going in the right direction: very low power (very cheap) mux chips which connect perhaps 4 or even 8 DRAM pins to each GPU/CPU pin. This solution can maintain point-to-point electrical connections to DIMM-mounted DRAMs, and get connectivity to 512 DRAM chips for 64 GB per GPU/CPU chip.
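The arithmetic behind those numbers, again assuming 1 Gb DRAM parts for illustration:

```python
direct_chip_positions = 64  # the 64 chip-wide positions from the example above
mux_fanout = 8              # each GPU/CPU pin fans out to 8 DRAM pins via a mux
chip_gbit = 1               # assumed 1 Gb DRAM chips

chips = direct_chip_positions * mux_fanout  # 512 DRAM chips
capacity_gb = chips * chip_gbit / 8         # 64 GB
print(chips, capacity_gb)                   # 512 64.0
```

With 4:1 muxes instead of 8:1, the same scheme reaches 256 chips and 32 GB, still with point-to-point electrical connections everywhere.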