If I Had A Hammer....


By Nils Dahl

Date: July 26, 2002

Lots of people have seen the block diagram of the 8-way Hammer systems. So how about those little details that make it work?

Consider this. There are 8 blocks of local memory, each directly connected to a processor. And each processor can address the other processors' memory blocks via HyperTransport links. Stop yawning now.

So just how does this scheme work? Well, let's get into our own way-back machine and land in the days of the Intel 8086. Now some people recall that the 8086 was a 16-bit processor (more or less) with 20-bit addressing. The segment architecture has driven many a good person mad, so we'll skip the details.
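For anyone who never met the 8086 scheme, here is a minimal sketch in Python of how a 16-bit segment and a 16-bit offset combine into a 20-bit address. The function name is my own, made up for illustration.

```python
def physical_address_8086(segment: int, offset: int) -> int:
    """Real-mode 8086 address: segment * 16 + offset, wrapped to 20 bits."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Shifting left by 4 multiplies the segment by 16; the mask models
    # the 20 address lines of the 8086, so sums past 1 MB wrap around.
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    # Two different segment:offset pairs can name the same byte.
    print(hex(physical_address_8086(0x1234, 0x0010)))
    print(hex(physical_address_8086(0x1235, 0x0000)))
```

Both calls land on the same physical byte, which is a good part of why the scheme drove people mad.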

Hammer must use a segment identifier inside some register/logic that is in the port to the memory block. And that segment identifier has to be set during bootup in a way that identifies each local memory block as having a unique address range. Now I could be wrong, but I'll take that chance.
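To make the guess concrete, here is a rough sketch of how such a scheme might behave. It is speculation in code form, not AMD's actual logic: the function names, the contiguous layout, and the block sizes in the usage note are all my assumptions.

```python
GIB = 1 << 30


def assign_ranges(block_sizes):
    """Give each node's local memory block a unique [base, limit) range,
    the way boot-time setup might (assumed: blocks laid out back to back)."""
    ranges = []
    base = 0
    for size in block_sizes:
        ranges.append((base, base + size))
        base += size
    return ranges


def owning_node(address, ranges):
    """Return (node index, offset within that node's local block)."""
    for node, (base, limit) in enumerate(ranges):
        if base <= address < limit:
            return node, address - base
    raise ValueError("address is not backed by any memory block")
```

With eight hypothetical 8-gigabyte blocks, `owning_node(9 * GIB, assign_ranges([8 * GIB] * 8))` points at node 1, one gigabyte into its local memory. A request for an address owned by another node would be the one that has to travel over a link.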

Why bother talking about an architectural feature of Hammer? Well, that leads to an idea. Now be kind. I am, after all, just an old man.

Let us make an assumption that even this first round Hammer has a greater addressing range capability than is represented by eight blocks of memory, each 8 gigabytes in size. I'll not try doing the math because I have absolutely no details on this topic. Just a strong interest.

Suppose, just for example, that Hammer as delivered could address sixteen (16) blocks of memory, each 8 gigabytes in size. Or even more. The 8 processor nodes remain a specific architectural limit. But we could, if we were clever, graft in additional 8 gigabyte blocks of memory simply by using a memory controller chip that contained a loadable segment register. At least this seems like a useful idea to me.
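The arithmetic behind that supposition is simple enough to sketch. The physical address widths below are assumed values chosen for illustration, not figures from AMD, and the function name is my own.

```python
GIB = 1 << 30


def max_blocks(physical_address_bits: int, block_size: int = 8 * GIB) -> int:
    """How many blocks of block_size fit in a physical address space
    of the given width (the width is an assumed input)."""
    return (1 << physical_address_bits) // block_size


if __name__ == "__main__":
    for bits in (36, 37, 40):  # hypothetical widths
        print(bits, "bits ->", max_blocks(bits), "blocks of 8 GB")
```

Eight blocks of 8 gigabytes need 36 address bits. A single extra bit would already make room for sixteen.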

Such a very simple addition to the architecture, easily implemented by a fairly simple memory controller design with a HyperTransport interface, could be used in many interesting ways - as a huge buffer of database tables and data, as a convenient and precisely mapped set of locations for exchanging data between active software tasks, or other things. I'll skip some of the wilder ideas that float around inside this old brain. I mean, who would consider an 8 gigabyte hard drive cache or ramdrive? Or maybe even 'several' ramdrive type storage subsystems?

Now I can only speculate on this detail. AMD knows just how its Hammer internal logic works. The Linux people at SuSE likely have to deal with this feature already, so they know. But disclosure is, as always, at the discretion of AMD and its partners. It sure does beat 16 segments of 64k bytes each. Now just how are they handling the remapping of ports into memory? Gee, I had almost given up on this level of my hobby.

So if there are any good papers on the fine details of Hammer, please let me know. Papers posted for the public, of course. I don't do NDAs.

nils dahl

just an old man

