Oct 7, 2001:  AMD: long term options 

(by Hans de Vries)

Megahertz is king. We all know it, but it remains odd to see price lists in which a 1.4 GHz Athlon is sold for the same price as a 1.4 GHz Pentium 4. With the upcoming Athlon XP it is clear that AMD can deliver equivalent performance in a comparable process and with a much smaller die size. It is still a mystery what happened to the Mustang core, which was supposed to have reached 1.5 GHz ten months ago with the help of a different type of transistor. Maybe the equally absent Motorola Apollo, the 180 nm SOI version of the G4+ that was introduced at the MPF in October 2000, explains why. Motorola is AMD's semiconductor technology partner. Pure speculation of course, but something did not go as planned, that's for sure.

Intel's future processes and processors

The worst is yet to come for AMD in terms of frequency. Intel has made long-term announcements about its semiconductor and processor roadmap. A year ago it mentioned 10 GHz processors by the year 2005, implemented in the company's P1264 65 nm process, and more recently a figure of 20 GHz was given for processors implemented in the projected P1266 45 nm process. These frequencies were published with the approval of top management, which makes them more of a company commitment.



Expanding the hyper-pipelined NetBurst core?

The P6 core from the original Pentium Pro survived six years, and it is likely that the current Pentium 4 core will stay with us for a comparable time, given also that it took seven years to develop. A 2005 processor running at 10 GHz may well be closely related to the current Pentium 4 core. If we assume purely linear frequency scaling with feature size for these future processors and compare them with the current Pentium 4, we discover something remarkable: a Pentium-X running at 10 GHz at 65 nm would run at 3.6 GHz at 180 nm, while a Pentium-X running at 20 GHz at 45 nm would run at about 7 GHz at 130 nm. If we relate this to the public Pentium 4 projections and demonstrations from Intel at the time it came up with these numbers, we notice a ratio of around two: 2 GHz for the 180 nm Pentium 4 one year ago and 3.5 GHz for the 130 nm Pentium 4 recently (although super-cooled).
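The arithmetic behind these numbers can be checked with a few lines of Python. This is only a sketch of the linear-scaling assumption used above (frequency taken as inversely proportional to feature size); the function name `scale_frequency` and the table layout are illustrative choices, not anything from Intel.

```python
def scale_frequency(freq_ghz, from_nm, to_nm):
    """Rescale a clock frequency from one process node to another.

    Assumes pure linear scaling: frequency is inversely proportional
    to the feature size. This is the article's simplification, not a
    physical law.
    """
    return freq_ghz * from_nm / to_nm


# (projected GHz, projected node in nm, reference node in nm,
#  Pentium 4 GHz projected or demonstrated at the reference node)
projections = [
    (10.0, 65, 180, 2.0),
    (20.0, 45, 130, 3.5),
]

for target_ghz, target_nm, ref_nm, p4_ghz in projections:
    equivalent = scale_frequency(target_ghz, target_nm, ref_nm)
    ratio = equivalent / p4_ghz
    print(f"{target_ghz:.0f} GHz at {target_nm} nm is about "
          f"{equivalent:.1f} GHz at {ref_nm} nm, "
          f"{ratio:.1f}x the Pentium 4 figure of {p4_ghz} GHz")
```

Both rows come out with a ratio close to two (roughly 1.8 and 2.0), which is the observation the rest of the argument builds on.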

Now, as we all know, some parts of the Pentium 4 already run at double frequency, namely all the units in the hyper-pipelined NetBurst core: the integer schedulers, the integer register files, the fast integer ALUs and the address generators. The Pentium 4 was designed to be hyper-pipelined as much as possible. It is not the case that the NetBurst core was added later, with the double clock speed as an extra feature. The NetBurst clock frequency was there from the beginning, and the rest of the processor operates at lower clock speeds. It was seemingly too much work to do it all at once. Future Pentiums may well bring more units into the hyper-pipeline: the floating point schedulers, the floating point register file and the floating point, MMX and SSE functional units, to name the most important. That would be more than enough to use the NetBurst frequency as the processor's frequency.

SMT to justify frequency doubling without significant performance improvements?

Such a doubled frequency is ideal for Intel's marketing, but there is a serious problem: it comes with hardly any performance improvement. Intel didn't need to do anything to sell the megahertz gap with the current Pentium 4, but this time it's not so easy. There is only one way to improve performance, as we all know, and that is Simultaneous Multi-Threading. Marketing may come up with a Pentium "TwinSpeed" processor trademark: it runs at double the frequency and it can do the work of two processors at the same time. This is probably enough to satisfy the majority of consumers, and it all comes down to megahertz again. (Although the term megahertz may sound a bit outdated by that time...) The public will use the 10 and 20 GHz numbers to compare Intel processors with those from the competition.

Options for AMD

Looking at all the "K-8" patents issued, we must conclude that considerable work has gone into this architecture. The real Hammers to be released at the end of next year may include only a subset of these features. This doesn't mean that the designers of Hammer's successor, the K9 (code-named GreyHound?), won't continue on the basis of this work. The double core concept shown in the patents is not new. The Alpha EV6 also has a double integer core, for much the same reasons as given in the "K-8" patents. The contents of both register files are kept identical so that both cores can work on the same (single-threaded) program. Such a double core by itself doesn't say anything about SMT. It is no real surprise to see this concept return, since "K-8" architects like Dirk Meyer were previously Alpha processor architects.

Intel has shown with its 130 nm, 6 GHz, 256-word pipelined register file that it is not really necessary to split the register file in two. The number of ports needed can be halved by doubling the frequency. This would allow an interesting strategy for AMD: instead of two cores with a split register file, a single double-frequency core with twice the pipeline stages could be used. In the simplest case it would do the work of the first core in the even cycles and the work of the second core in the odd cycles. This would reduce the die size considerably and preserve much of the logic design. A more elaborately redesigned double-frequency core would eliminate the remains of the dual core concept altogether. Scheduling has to take some differences into account, but in the end it would even be slightly faster, because the cycle of delay between the split register files introduces more latency than the extra pipelining does. The rest of the pipeline can stay unmodified: instruction fetching, branch prediction, decoding and the look-ahead unit at one end, and instruction retirement at the other.
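The even/odd cycle idea can be made concrete with a toy model in Python. This is a minimal sketch under simplifying assumptions: each core's work is just a list with one entry per base-frequency cycle, and register file accesses are spread evenly over the fast cycles. The names `interleave` and `ports_needed` and the port counts used below are hypothetical illustrations, not figures from the patents or from Intel.

```python
from itertools import chain


def interleave(core_a_ops, core_b_ops):
    """Map two per-base-cycle work streams onto one double-frequency core.

    Fast cycle 2*i carries core A's work for base cycle i,
    fast cycle 2*i + 1 carries core B's work for the same base cycle.
    """
    if len(core_a_ops) != len(core_b_ops):
        raise ValueError("both streams must cover the same number of base cycles")
    return list(chain.from_iterable(zip(core_a_ops, core_b_ops)))


def ports_needed(accesses_per_base_cycle, frequency_multiplier):
    """Register file ports required when the file is clocked faster.

    Assumes accesses can be spread evenly over the fast cycles,
    so the port count is the ceiling of the division.
    """
    return -(-accesses_per_base_cycle // frequency_multiplier)


timeline = interleave(["a0", "a1", "a2"], ["b0", "b1", "b2"])
print(timeline)

# Illustrative only: 12 accesses per base cycle is an assumed figure.
print(ports_needed(12, 1), "ports at base frequency")
print(ports_needed(12, 2), "ports at double frequency")
```

In this model the unified register file needs half the ports of a same-frequency design, and the two logical cores alternate on the fast clock. The more elaborate redesign mentioned above would replace the fixed even/odd assignment with a scheduler that fills any fast cycle with whatever work is ready.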

So besides reducing the die size, it would also largely solve the all-important megahertz marketing issue. The "K-8" as presented in the patents most likely has a higher degree of pipelining than the K7. This and the application of SOI may bridge the remaining 30% frequency gap with these future 10 to 20 GHz Intel processors. Another issue is SMT. AMD would need SMT just as much as Intel to justify a doubling of the frequency. Most of the extra design work would go into SMT, especially if both threads were to have their own memory-management-and-protection context, which is of course highly desirable.

Otherwise AMD may decide that it all becomes too complicated, given its limited engineering resources and time, and "simply" implement two identical, existing and less complicated processors on a single die. It could multiply the individual frequency by two and use that number for sales purposes... and it would not surprise me at all if they could get away with that... :^) The latter remains a backup alternative anyway if more complicated projects are delayed too much.