Re: PowerBasic rocks!



On Sat, 16 May 2009 16:34:27 -0700, bill.sloman wrote:

Pre-fetching doesn't seem to make sense in this context. The example
was just read, add and store.

Modern DRAMs are essentially wide-word block-transfer devices, and
modern caches are very smart. The inner "add" loop is a few (five,
actually) pipeline-locked instructions working on blocks of input and
output values located in data cache. Really smart programmers doing
really time-critical stuff - like video games - take cache
architecture into account when planning their code.

None of this gets around the fact that if you are looking at 128M
words of data, the memory is outside the cache, on the other side of a
word-wide bus.

When you're reading and writing cache lines using burst transfers, there's
no need for the bus width to match the word size. The Pentium and up use a
64-bit memory bus; many graphics cards use a 256-bit memory bus.
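
To put a rough number on that, here is a minimal Python sketch of how many bus transfers ("beats") one cache-line burst needs at different bus widths. The 64-byte line size and the name beats_per_line are assumptions for illustration only, not anything taken from the thread or from a particular CPU.

```python
def beats_per_line(line_bytes, bus_bits):
    """Number of bus transfers (beats) needed to move one cache line."""
    bus_bytes = bus_bits // 8
    # Ceiling division, in case the line is not a whole multiple of the bus width.
    return -(-line_bytes // bus_bytes)


LINE_BYTES = 64  # assumed cache-line size, purely illustrative

for bus_bits in (32, 64, 256):
    print(bus_bits, "bit bus:", beats_per_line(LINE_BYTES, bus_bits), "beats per line")
```

The word size of the CPU never enters the calculation; only the line size and the bus width do.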

Memory access time is the bottleneck here, and the fastest solution
has to be three blocks of memory - one for the new data and two for
the accumulated data (which you ping-pong between update cycles) with
three separate paths to a DSP processor that can add fast enough to
match the memory transfer rate.
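
The ping-pong scheme described above can be sketched in a few lines of Python. This only models the data flow (read old total, read new sample, add, store into the other block); the function name and the list-based buffers are hypothetical, and it says nothing about real memory timing.

```python
def accumulate_ping_pong(frames, length):
    """Sum successive data frames, alternating between two accumulator blocks."""
    acc = [[0] * length, [0] * length]  # the two accumulated-data blocks
    src = 0  # index of the block holding the current totals
    for new_data in frames:
        dst = 1 - src
        # Read the old total and the new sample, add, store into the other block.
        for i in range(length):
            acc[dst][i] = acc[src][i] + new_data[i]
        src = dst  # swap roles for the next update cycle
    return acc[src]


totals = accumulate_ping_pong([[1, 2, 3], [10, 20, 30], [100, 200, 300]], 3)
print(totals)
```

Each update cycle reads from one accumulator block and writes to the other, so no block is read and written in the same cycle.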

Sure; a wider bus or multiple buses will transfer data faster. For raw
throughput it doesn't actually matter whether you have 1x 16-bit bus +
2x 32-bit buses or a single 80-bit bus, except that a single 80-bit bus
would be more flexible.
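
As a back-of-the-envelope check, here is a small Python sketch comparing the two arrangements. It assumes every bus runs at the same clock and transfers once per clock with no idle cycles; the 100 MHz figure and the function name are made up for the example.

```python
def peak_bytes_per_second(bus_widths_bits, clock_hz):
    """Peak transfer rate when every bus moves one word per clock."""
    return sum(bus_widths_bits) * clock_hz // 8


CLOCK_HZ = 100_000_000  # hypothetical 100 MHz bus clock

split = peak_bytes_per_second([16, 32, 32], CLOCK_HZ)  # 1x 16-bit + 2x 32-bit
single = peak_bytes_per_second([80], CLOCK_HZ)         # one 80-bit bus
print(split, single)
```

Both come to the same peak rate, since the total width is 80 bits either way. The difference is that the single wide bus can spend all 80 bits on whichever stream needs them at the moment.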
