Networking Engineer's Guide

PROCESSORS

The processor (really a short form for microprocessor, and also often called the CPU or central processing unit) is the central component of the PC. This vital component is in some way responsible for every single thing the PC does.

Basic structure

A processor's major functional components are:

	Core: The heart of a modern processor is the execution unit. The Pentium has two parallel integer pipelines enabling it to read, interpret, execute and despatch two instructions simultaneously.
	Branch Predictor: The branch prediction unit tries to guess which sequence will be executed each time the program contains a conditional jump, so that the Prefetch and Decode Unit can get the instructions ready in advance.
	Floating Point Unit: The third execution unit in a Pentium, where non-integer calculations are performed.
	Primary Cache: The Pentium has two on-chip caches of 8KB each, one for code and one for data, which are far quicker than the larger external secondary cache.
	Bus Interface: This brings a mixture of code and data into the CPU, separates the two ready for use, and then recombines them and sends them back out.

Many instructions involve the arithmetic and logic unit (ALU). This works in conjunction with the General Purpose Registers - temporary storage areas which can be loaded from memory or written to memory. A typical ALU instruction might be to add the contents of a memory location to a general purpose register. The ALU also alters the bits in the Status Register (SR) as each instruction is executed; this holds information on the result of the previous instruction. Typically, the SR has bits to indicate a zero result, an overflow, a carry and so forth. The control unit uses the information in the SR to execute conditional instructions such as ‘jump to address 7410 if the previous instruction overflowed’.
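
The interplay between the ALU, a general purpose register and the Status Register can be sketched in a few lines of Python. This is an illustrative model only: the 8-bit word size, the flag names and the function name alu_add are assumptions made for the example, not a description of any particular processor.

```python
# Illustrative model only: the 8-bit word size and flag names are assumptions.
WORD_BITS = 8
WORD_MASK = (1 << WORD_BITS) - 1
SIGN_BIT = 1 << (WORD_BITS - 1)


def alu_add(register, operand):
    """Add a memory operand to a register; return (result, status flags)."""
    total = register + operand
    result = total & WORD_MASK          # keep only the bits that fit in the word
    status = {
        "zero": result == 0,
        "carry": total > WORD_MASK,
        # Signed overflow: both inputs share a sign that the result lacks.
        "overflow": bool(~(register ^ operand) & (register ^ result) & SIGN_BIT),
    }
    return result, status


result, status = alu_add(0x7F, 0x01)    # 127 + 1 does not fit in a signed 8-bit word
print(hex(result), status)
```

A control unit modelled on top of this would inspect status["overflow"] to decide whether a conditional jump is taken.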

This is about all there is as far as a very basic processor is concerned. All the elements of the processor stay in step by use of a ‘clock’ which dictates how fast it operates. The very first microprocessor, the Intel 4004, had a 740kHz clock, whereas the Pentium Pro uses a 200MHz clock, which is to say it ‘ticks’ 200 million times per second. As the clock ‘ticks’, various things happen. The Program Counter (PC) is an internal memory location which contains the address of the next instruction to be executed. When the time comes for it to be executed, the Control Unit transfers the instruction from memory into its Instruction Register (IR). At the same time, the PC is incremented so that it points to the next instruction in sequence; now the processor executes the instruction in the IR. Some instructions are handled by the Control Unit itself, so if the instruction says ‘jump to location 2749’, the value of 2749 is written to the PC so that the processor executes that instruction next.
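
The fetch and execute sequence described above can be sketched as a toy machine. The instruction names (LOAD, ADD, JUMP, HALT), the tuple encoding and the single accumulator are assumptions made for illustration, and one loop iteration stands in for what would be several clock ticks on real hardware.

```python
# Toy machine: the instruction names and tuple encoding are assumptions.
def run(program, max_steps=100):
    """Execute a list of (opcode, operand) tuples; return (accumulator, trace)."""
    pc = 0            # Program Counter: address of the next instruction
    acc = 0           # a single general purpose register
    trace = []
    for _ in range(max_steps):
        ir = program[pc]             # fetch: copy the instruction into the IR
        pc += 1                      # the PC now points to the next instruction
        opcode, operand = ir         # decode
        trace.append(opcode)
        if opcode == "LOAD":
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "JUMP":
            pc = operand             # a jump simply overwrites the PC
        elif opcode == "HALT":
            break
    return acc, trace


program = [
    ("LOAD", 1),    # address 0
    ("JUMP", 3),    # address 1: skip the instruction at address 2
    ("ADD", 100),   # address 2: never executed
    ("ADD", 5),     # address 3
    ("HALT", 0),    # address 4
]
acc, trace = run(program)
print(acc, trace)
```

Because the jump writes 3 into the PC, the instruction at address 2 is never fetched.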

Principles
The underlying principles of all computer processors are the same. Fundamentally, they all take signals in the form of 0s and 1s (thus binary signals), manipulate them according to a set of instructions, and produce output in the form of 0s and 1s. The voltage on the line at the time a signal is sent determines whether the signal is a 0 or a 1. On a 3.3-volt system, an application of 3.3 volts means that it's a 1, while an application of 0 volts means it's a 0. Processors work by reacting to an input of 0s and 1s in specific ways and then returning an output based on the decision. The decision itself happens in a circuit called a logic gate, each of which requires at least one transistor, with the inputs and outputs arranged differently by different operations. The fact that today's processors contain millions of transistors offers a clue as to how complex the logic system is. The processor's logic gates work together to make decisions using Boolean logic. Logic gates operate via hardware known as a switch - in particular, a digital switch.
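
As a rough sketch of how gates combine into something useful, the Python below builds the common gates from a single NAND function and wires them into a half adder. The threshold of half the supply voltage and all the function names are assumptions made for the example; real logic families define their own input thresholds.

```python
# Assumption: on a 3.3-volt system, anything above half the supply reads as 1.
SUPPLY_VOLTS = 3.3


def to_bit(volts):
    """Interpret a line voltage as a binary signal."""
    return 1 if volts > SUPPLY_VOLTS / 2 else 0


def nand(a, b):
    return 0 if (a and b) else 1


# The other gates, expressed purely in terms of NAND.
def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    return nand(not_(a), not_(b))


def xor(a, b):
    return and_(or_(a, b), nand(a, b))


def half_adder(a, b):
    """Add two bits; return (sum, carry)."""
    return xor(a, b), and_(a, b)


print(half_adder(to_bit(3.3), to_bit(3.3)))
```

Chaining adders of this kind, one per bit, is one way an ALU can add whole binary numbers.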

Modern-day microprocessors contain tens of millions of microscopic transistors. Used in combination with resistors, capacitors and diodes, these make up logic gates. Logic gates make up integrated circuits, and ICs make up electronic systems. Intel's first claim to fame lay in its high-level integration of all the processor's logic gates into a single complex processor chip - the Intel 4004 - released in late 1971. This was a 4-bit microprocessor, intended for use in a calculator. It processed data in 4 bits, but its instructions were 8 bits long. Program and data memory were separate, 1KB and 4KB respectively. There were also sixteen 4-bit (or eight 8-bit) general purpose registers. The 4004 had 46 instructions, used only 2,300 transistors in a 16-pin DIP and ran at a clock rate of 740kHz (eight clock cycles per CPU cycle of 10.8 microseconds).
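
The 4004 timing figures quoted above can be checked with a short calculation. The variable names are chosen for this example only.

```python
CLOCK_HZ = 740_000          # 740kHz clock rate
CLOCKS_PER_CPU_CYCLE = 8    # eight clock cycles per CPU cycle

cycle_us = CLOCKS_PER_CPU_CYCLE / CLOCK_HZ * 1_000_000
cycles_per_second = CLOCK_HZ / CLOCKS_PER_CPU_CYCLE

print(f"{cycle_us:.1f} microseconds per CPU cycle")
print(f"{cycles_per_second:.0f} CPU cycles per second")
```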

For some years two families of microprocessor have dominated the PC industry - Intel's Pentium and Motorola's PowerPC. These CPUs are also prime examples of the two competing CPU architectures of the last two decades - the former being classed as a CISC chip and the latter as a RISC chip.

CISC
CISC (complex instruction set computer) is the traditional architecture of a computer, in which the CPU uses microcode to execute a very comprehensive instruction set. The instructions may be variable in length and use all addressing modes, requiring complex circuitry to decode them.

RISC
RISC (reduced instruction set computer) CPUs keep instruction size constant, ban the indirect addressing mode and retain only those instructions that can be overlapped and made to execute in one machine cycle or less. One advantage of RISC CPUs is that they can execute their instructions very fast because the instructions are so simple. Another, perhaps more important, advantage is that RISC chips require fewer transistors, which makes them cheaper to design and produce.

Type/ Generation	Year	Data/ Address bus width	Level 1 Cache (KB)	Memory bus speed (MHz)	Internal clock speed (MHz)
8088/ First	1979	8/20 bit	None	4.77-8	4.77-8
8086/ First	1978	16/20 bit	None	4.77-8	4.77-8
80286/ Second	1982	16/24 bit	None	6-20	6-20
80386DX/ Third	1985	32/32 bit	None	16-33	16-33
80386SX/ Third	1988	16/24 bit	None	16-33	16-33
80486DX/ Fourth	1989	32/32 bit	8	25-50	25-50
80486SX/ Fourth	1989	32/32 bit	8	25-50	25-50
80486DX2/ Fourth	1992	32/32 bit	8	25-40	50-80
80486DX4/ Fourth	1994	32/32 bit	8+8	25-40	75-120
Pentium/ Fifth	1993	64/32 bit	8+8	60-66	60-200
MMX/ Fifth	1997	64/32 bit	16+16	66	166-233
Pentium Pro/ Sixth	1995	64/36 bit	8+8	66	150-200
Pentium II/ Sixth	1997	64/36 bit	16+16	66	233-300
Pentium II/ Sixth	1998	64/36 bit	16+16	66/100	300-450
Pentium III/ Sixth	1999	64/36 bit	16+16	100	450-600
AMD Athlon/ Seventh	1999	64/36 bit	64+64	100-200+	500-600+

The third generation chips, based on Intel’s 80386SX and DX processors, were the first 32-bit processors to appear in a PC. The main difference between these was that the 386SX was a 32-bit processor only on the inside, because it interfaced to the outside world through a 16-bit data bus. This meant that data moved between an SX processor and the rest of the system at half the speed of a 386DX.

Fourth generation processors were also 32-bit. However, they all offered a number of enhancements. First, the entire design was overhauled for Intel’s 486 range, making them inherently more than twice as fast. Secondly, they all had 8KB of cache memory on the chip itself, right beside the processor logic. This cached data transfers from main memory, meaning that on average the processor needed to wait for data from the motherboard for only 4% of the time because it was usually able to get the information it required from the cache.

The 486DX model differed from the 486SX only in that it brought the maths co-processor on board as well. This was a separate processor designed to take over floating-point calculations. It had little impact on everyday applications but transformed the performance of spreadsheets, statistical analysis, CAD and so forth.

An important innovation was the clock doubling introduced on the 486DX2. This meant that the circuits inside the chip ran at twice the speed of the external electronics. Data was transferred between the processor, the internal cache and the maths co-processor at twice the speed, considerably enhancing performance. The 486DX4 took this technique further, tripling the clock speed to run internally at 75 or 100MHz and also doubling the amount of primary cache to 16KB.
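
Clock doubling and tripling amount to multiplying the external bus speed by a fixed factor. The sketch below reproduces some of the figures from the table above; the function name is an assumption for this example.

```python
def internal_clock_mhz(bus_mhz, multiplier):
    """Internal clock of a clock-multiplied chip, given its external bus speed."""
    return bus_mhz * multiplier


# 486DX2: clock doubled, on 25MHz and 40MHz buses
print(internal_clock_mhz(25, 2), internal_clock_mhz(40, 2))

# 486DX4: clock tripled, on 25MHz and 40MHz buses
print(internal_clock_mhz(25, 3), internal_clock_mhz(40, 3))
```

The quoted 100MHz DX4 corresponds to tripling a nominal 33MHz bus.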

The Pentium is the defining processor of the fifth generation and provides greatly increased performance over the 486 chips that preceded it, due to several architectural changes, including a doubling of the data bus width to 64 bits. The P55C MMX processor made further significant improvements by doubling the size of the on-board primary cache to 32KB and by an extension to the instruction set to optimise the handling of multimedia functions.

The Pentium Pro, introduced in 1995 as the successor to the Pentium, was the first of the sixth generation of processors and introduced several architectural features that had never been seen in a PC processor before. The Pentium Pro was the first mainstream CPU to radically change how it executes instructions, by translating them into RISC-like micro-instructions and executing these on a highly advanced internal core. It also featured a dramatically higher-performance secondary cache compared to all earlier processors. Instead of using motherboard-based cache running at the speed of the memory bus, it used an integrated Level 2 cache with its own bus, running at full processor speed, typically three times the speed that the cache runs at on the Pentium.

Intel's first new chip since the Pentium Pro took almost a year and a half to produce, and when it finally appeared the Pentium II proved to be very much an evolutionary step from the Pentium Pro. This fuelled the speculation that one of Intel's primary goals in making the Pentium II was to get away from the expensive integrated Level 2 cache that was so hard to manufacture on the Pentium Pro. Architecturally, the Pentium II is not very different from the Pentium Pro, with a similar x86 emulation core and most of the same features.

The Pentium II improved on the Pentium Pro architecturally by doubling the size of the Level 1 cache to 32KB, using special caches to improve the efficiency of 16-bit code processing (the Pentium Pro was optimised for 32-bit processing and did not deal with 16-bit code quite as well) and increasing the size of the write buffers. However, the most talked about aspect of the new Pentium II was its packaging. The integrated Pentium Pro secondary cache, running at full processor speed, was replaced on the Pentium II with a special small circuit board containing the processor and 512KB of secondary cache, running at half the processor's speed. This assembly, termed a single-edge cartridge (SEC), was designed to fit into a 242-pin slot (Slot 1) on the new style Pentium II motherboard.

Date	Codename	Transistors	Fabrication (µm)	Speed (MHz)
1993	P5	3,100,000	0.80	60/66
1994	P54	3,200,000	0.50	75/90/100/120
1995	P54	3,300,000	0.35	120/133
1996	P54	3,300,000	0.35	150/166/200

Pentium II Xeon

In June 1998 Intel introduced its Pentium II Xeon processor, rated at 400MHz. Technically, Xeon represents a combination of Pentium Pro and Pentium II technology and is designed to offer outstanding performance in critical applications for workstations and servers. Using the new Slot 2 interface, Xeon is nearly twice the size of Pentium II, primarily because of the increased Level 2 cache.

Pentium III

Intel's successor to the Pentium II, formerly codenamed Katmai, came to market in the spring of 1999. With the introduction of MMX came the process called Single Instruction Multiple Data (SIMD). This enabled one instruction to perform the same function on several pieces of data simultaneously, improving the speed at which sets of data requiring the same operations could be processed. The new processor introduces 70 new Streaming SIMD Extensions (SSE) instructions but makes no other architectural improvements.

Fifty of the new SIMD Extensions are intended to improve floating-point performance. In order to assist data manipulation there are eight new 128-bit floating-point registers. In combination, these enhancements can lead to up to four floating-point results being returned at each cycle of the processor. There are also 12 New Media instructions to complement the existing 57 integer MMX instructions by providing further support for multimedia data processing. The final 8 instructions are referred to by Intel as the New Cacheability instructions. They improve the efficiency of the CPU's Level 1 cache and allow sophisticated software developers to boost the performance of their applications or games.
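
The idea of one instruction operating on several values at once can be modelled in Python. This is a sketch under stated assumptions: the function names pack_register and simd_add are hypothetical, each 128-bit register is assumed to hold four 32-bit single-precision values, and Python processes the lanes one after another where real hardware would handle them together.

```python
import struct

# The three groups of new instructions described above add up to the quoted total.
NEW_INSTRUCTIONS = {"floating-point": 50, "new media": 12, "cacheability": 8}
total_new = sum(NEW_INSTRUCTIONS.values())


def pack_register(values):
    """Pack four single-precision floats into 16 bytes (128 bits)."""
    return struct.pack("<4f", *values)


def simd_add(reg_a, reg_b):
    """Model one SIMD add: the same operation applied to every lane."""
    a = struct.unpack("<4f", reg_a)
    b = struct.unpack("<4f", reg_b)
    return struct.pack("<4f", *(x + y for x, y in zip(a, b)))


reg_a = pack_register([1.0, 2.0, 3.0, 4.0])
reg_b = pack_register([0.5, 0.5, 0.5, 0.5])
result = struct.unpack("<4f", simd_add(reg_a, reg_b))

print(total_new, len(reg_a) * 8, result)
```

A single call to simd_add yields four results, which mirrors the claim of up to four floating-point results per processor cycle.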

Other than this, the Pentium III makes no architectural improvements. It still fits into Slot 1 motherboards, albeit with simplified packaging - the new SECC2 cartridge allows a heatsink to be mounted directly onto the processor card and uses less plastic in the casing. The CPU still has 32KB of Level 1 cache and will initially ship in 450MHz and 500MHz models with a frontside bus speed of 100MHz and 512KB of half-speed Level 2 cache, as in the Pentium II. This means that unless a user is running a 3D/games application that has been specifically written to take advantage of Streaming SIMD Extensions, or is using the 6.1 version of Microsoft's DirectX API, they're unlikely to see a significant performance benefit over a similarly clocked Pentium II.

Close on the heels of the Pentium III came the Pentium III Xeon, formerly codenamed Tanner. This is basically a Pentium II Xeon with the new Streaming SIMD Extensions (SSE) instruction set added. Targeted at the server and workstation markets, the Pentium III Xeon was initially shipped as a 500MHz processor with either 512KB, 1MB or 2MB of Level 2 cache.

October 1999 saw the launch of Pentium III processors, codenamed Coppermine, built using Intel's advanced 0.18-micron process technology. This features structures that are smaller than 1/500th the thickness of a human hair - smaller than bacteria and smaller than the (human-) visible wavelength of light. The associated benefits include smaller die sizes and lower operating voltages, facilitating more compact and power-efficient system designs and making possible clock speeds of 1GHz and beyond. The desktop part is available in two forms, with either 100MHz or 133MHz FSBs at speeds ranging from 500MHz to 700MHz and 733MHz respectively. The part notation used differentiates 0.18-micron from 0.25-micron processors at the same frequency by the suffix 'E' and versions with the 133MHz FSB by the suffix 'B'.

Although the size of the Level 2 cache on the new Pentium IIIs has halved to 256KB, it's been placed on the die to run at the same speed as the processor, rather than half the speed as before - the full-speed operation more than making up for the missing 256KB. Intel refers to the enhanced cache as 'Advanced Transfer Cache'. In real terms ATC means the cache is connected to the CPU via a 256-bit wide bus - four times wider than the 64-bit bus of a Katmai-based Pentium III. Overall system performance is further enhanced by Intel's Advanced System Buffering technology, which increases the number of 'buffers' between the processor and its system bus resulting in a consequent increase in information flow.

The announcement of the 850MHz and 866MHz Pentium IIIs in the spring of 2000 appeared to confirm Intel's intention to rationalise CPU form factors across the board - signalled earlier by the announcement of the first 0.18-micron Celerons in a new FC-PGA (flip-chip pin grid array) packaging - with these versions being available in both SECC2 and FC-PGA packaging. The limited availability of FC-PGA compatible motherboards in the first half of 2000 created a market for the 'slot-to-socket adapter' (SSA). This, however, resulted in something of a minefield for consumers, with some SSA/motherboard combinations causing CPUs to operate out of specification - thereby voiding the Intel processor limited warranty - and potentially damaging the processor and/or motherboard!

Date	Codename	Transistors	L2 Cache	Fabrication (µm)	Speed (MHz)
1999	Katmai	9,500,000	512KB	0.25	450/500/550
1999	Tanner	9,500,000	512KB/1MB/2MB	0.25	500/550
1999	Coppermine	28,100,000	256KB (on-die)	0.18	500-800
2000	Coppermine	28,100,000	256KB (on-die)	0.18	850-1130

Soon after the launch of its 1.13GHz Pentium III on 31 July 2000, Intel faced the embarrassment of having to recall all of the shipped processors after it was discovered that the chip caused systems to hang when running certain applications. Many linked the problem with the increasing competition from rival chipmaker AMD - who had succeeded in beating Intel to the 1GHz barrier a few weeks earlier - believing that Intel may have been forced into introducing faster chips earlier than it had originally planned.

Itanium
It was in June 1994 that Intel and Hewlett-Packard announced their joint research-and-development project aimed at providing advanced technologies for end-of-the-millennium workstation, server and enterprise-computing products, and in October 1997 that they revealed the first details of their 64-bit computing architecture. At that time the first member of Intel's new family of 64-bit processors - codenamed Merced, after a Californian river - was slated for production in 1999, using Intel's 0.18-micron technology. In the event the Merced development programme slipped badly and was estimated at still nearly a year from completion when Intel announced the selection of the brand name Itanium at the October 1999 Intel Developer Forum.

A major benefit of a 64-bit computer architecture is the amount of memory that can be addressed. In the mid-1980s, the 4GB addressable memory of 32-bit platforms was more than sufficient. However, by the end of the millennium large databases exceeded this limit. The time taken to access storage devices and load data into virtual memory has a significant impact on performance. 64-bit platforms are capable of addressing an enormous 16EB (exabytes) of memory - 4 billion times more than 32-bit platforms are capable of handling. In real terms this means that whilst a 32-bit platform can handle a database large enough to contain the name of every inhabitant of the USA since 1977, a 64-bit one is sufficiently powerful to store the name of every person who's lived since the beginning of time! However, notwithstanding the impact that its increased memory addressing will have, it is its Explicitly Parallel Instruction Computing (EPIC) technology - the foundation for a new 64-bit Instruction Set Architecture (ISA) - that represents Itanium's biggest technological advance.
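
The arithmetic behind the comparison between 32-bit and 64-bit addressing is easy to check. The sketch assumes byte-addressable memory and binary units (1GB taken as 2 to the power 30 bytes, 1EB as 2 to the power 60 bytes); the function name is chosen for this example.

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses available with the given address width."""
    return 2 ** address_bits


GB = 2 ** 30
EB = 2 ** 60

space_32 = addressable_bytes(32)
space_64 = addressable_bytes(64)

print(space_32 // GB, "GB")               # 32-bit address space
print(space_64 // EB, "EB")               # 64-bit address space
print(space_64 // space_32, "times larger")
```

The ratio is 2 to the power 32, a little over 4 billion.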

EPIC, incorporating an innovative combination of speculation, prediction and explicit parallelism, advances the state of the art in processor technologies, specifically addressing the performance limitations found in RISC and CISC technologies. Whilst both of these architectures already use various internal techniques to try to process more than one instruction at once where possible, the degree of parallelism in the code is only determined at run-time by parts of the processor that attempt to analyse and re-order instructions on the fly. This approach takes time and wastes die space that could be devoted to executing, rather than organising, instructions. EPIC breaks through the sequential nature of conventional processor architectures by allowing software to communicate explicitly to the processor when operations can be performed in parallel.

The result is that the processor can simply grab as large a chunk of instructions as possible and execute them simultaneously, with minimal pre-processing. Increased performance is realised by reducing the number of branches and branch mis-predicts, and reducing the effects of memory-to-processor latency. The IA-64 Instruction Set Architecture - published in May 1999 - applies EPIC technology to deliver massive resources with inherent scalability not possible with previous processor architectures. For example, systems can be designed to slot in new execution units whenever an upgrade is required, similar to plugging in more memory modules on existing systems. According to Intel the IA-64 ISA represents the most significant advancement in microprocessor architecture since the introduction of its 386 chip in 1985.

IA-64 processors will have massive computing resources including 128 integer registers, 128 floating-point registers, and 64 predicate registers along with a number of special-purpose registers. Instructions will be bundled in groups for parallel execution by the various functional units. The instruction set has been optimised to address the needs of cryptography, video encoding and other functions that will be increasingly needed by the next generation of servers and workstations. Support for Intel's MMX technology and Internet Streaming SIMD Extensions is maintained and extended in IA-64 processors.

Pentium 4

In early 2000, Intel unveiled details of its first new IA-32 core since the Pentium Pro - introduced in 1995. Previously codenamed Willamette - after a river that runs through Oregon - it was announced a few months later that the new generation of microprocessors would be marketed under the brand name Pentium 4 and be aimed at the advanced desktop market rather than servers.

Representing the biggest change to Intel's 32-bit architecture since the Pentium Pro in 1995, the Pentium 4's increased performance is largely due to architectural changes that allow the device to operate at higher clock speeds and logic changes that allow more instructions to be processed per clock cycle. Foremost amongst these is the Pentium 4 processor's internal pipeline - referred to as Hyper Pipeline - which comprises 20 pipeline stages versus the ten for the P6 microarchitecture.

A typical pipeline has a fixed amount of work that is required to decode and execute an instruction. This work is performed by individual logical operations called "gates." Each logic gate consists of multiple transistors. By increasing the stages in a pipeline, fewer gates are required per stage. Because each gate requires some amount of time (delay) to provide a result, decreasing the number of gates in each stage allows the clock rate to be increased. It allows more instructions to be "in flight" or at various stages of decode and execution in the pipeline. Although these benefits are offset somewhat by the overhead of additional gates required to manage the added stages, the overall effect of increasing the number of pipeline stages is a reduction in the number of gates per stage, which allows a higher core frequency and enhances scalability.

In absolute terms, the maximum frequency that can be achieved by a pipeline in an equivalent silicon production process can be estimated as:

1/(pipeline time in ns/number of stages) * 1,000 (to convert to megahertz) = maximum frequency

Accordingly, the maximum frequency achievable by a five-stage, 10-ns pipeline is: 1/(10/5) * 1,000 = 500MHz

In contrast, a 15-stage, 12-ns pipeline can achieve: 1/(12/15) * 1,000 = 1,250MHz or 1.25GHz

Additional frequency gains can be achieved by changing the silicon process and/or using smaller transistors to reduce the amount of delay caused by each gate.
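
The frequency estimate above can be written as a small function that reproduces both worked examples. The function and parameter names are assumptions for this sketch, and the result is only the rough estimate given by the formula, not a prediction for any real chip.

```python
def max_frequency_mhz(pipeline_ns, stages):
    """Estimate the maximum clock rate from total pipeline time and stage count."""
    stage_delay_ns = pipeline_ns / stages    # time available to each stage
    return 1 / stage_delay_ns * 1000         # 1/ns is GHz; multiply by 1,000 for MHz


five_stage = max_frequency_mhz(10, 5)
fifteen_stage = max_frequency_mhz(12, 15)

print(round(five_stage), "MHz")
print(round(fifteen_stage), "MHz")
```

Although the 15-stage pipeline takes longer in total, each stage has less to do, so the clock can run faster.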

Other new features introduced by the Pentium 4's new micro-architecture - dubbed NetBurst - include:

	an innovative Level 1 cache implementation comprising - in addition to an 8KB data cache - an Execution Trace Cache, that stores up to 12K of decoded x86 instructions (micro-ops), thus removing the latency associated with the instruction decoder from the main execution loops
	a Rapid Execution Engine that pushes the processor's ALUs to twice the core frequency resulting in higher execution throughput and reduced latency of execution - the chip actually uses three separate clocks: the core frequency, the ALU frequency and the bus frequency
	a very deep, out-of-order speculative execution engine - referred to as the Advanced Dynamic Execution engine - that avoids the stall that can occur while instructions are waiting for dependencies to resolve by providing a very large window of instructions from which the execution units can choose
	a 256KB Level 2 Advanced Transfer Cache that provides a 256-bit (32-byte) interface that transfers data on each core clock, thereby delivering a much higher data throughput channel - 44.8 GBps (32 bytes x 1 data transfer per clock x 1.4 GHz) - for a 1.4GHz Pentium 4 processor
	SIMD Extensions 2 (SSE2) - the latest iteration of Intel's Single Instruction Multiple Data technology - which integrate 76 new SIMD instructions and improvements to 68 integer SIMD instructions, allowing the chip to grab 128 bits of data at a time in both floating-point and integer and thereby accelerate CPU-intensive encoding and decoding operations such as streaming video, speech, 3D rendering and other multimedia procedures
	the industry's first 400MHz system bus, providing a 3-fold increase in throughput compared with Intel's current 133MHz system bus.
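
The throughput figures in the list above follow from simple multiplication, sketched here in Python. The function name is an assumption for this example, and the result is a peak figure in decimal gigabytes per second rather than a measured one.

```python
def bandwidth_gbps(bytes_per_transfer, transfers_per_clock, clock_ghz):
    """Peak throughput in GB/s: bytes moved per transfer times transfers per second."""
    return bytes_per_transfer * transfers_per_clock * clock_ghz


# Level 2 Advanced Transfer Cache on a 1.4GHz Pentium 4: 256-bit (32-byte) interface
l2_bandwidth = bandwidth_gbps(32, 1, 1.4)

# 400MHz system bus compared with the 133MHz system bus
bus_ratio = 400 / 133

print(round(l2_bandwidth, 1), "GB/s")
print(round(bus_ratio, 2), "times the 133MHz bus rate")
```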

Based on Intel's ageing 0.18-micron process, the new chip comprises a massive 42 million transistors. Indeed, the chip's original design would have resulted in a significantly larger chip still - and one that was ultimately deemed too large to build economically at 0.18 micron. Features that had to be dropped from the Willamette's original design included a larger 16KB Level 1 cache, two fully functional FPUs and 1MB of external Level 3 cache. What this reveals is that the Pentium 4 really needs to be built on 0.13-micron technology - something that Intel has planned for 2001.

The first Pentium 4 shipments - at speeds of 1.4GHz and 1.5GHz - occurred in November 2000. Early indications were that the new chip offered the best performance improvements on 3D applications - such as games - and on graphics intensive applications such as video encoding. On everyday office applications - such as word processing, spreadsheets, Web browsing and e-mail - the performance gain appeared much less pronounced.

A 1.7GHz version of the Pentium 4 chip and volume production are scheduled for the first half of 2001 with a 2GHz version planned for around mid-2001.

Copper interconnect

Every chip has a base layer of transistors, with layers of wiring stacked above to connect the transistors to each other and, ultimately, to the rest of the computer. The transistors at the first level of a chip are a complex construction of silicon, metal, and impurities precisely located to create the millions of minuscule on-or-off switches that make up the brains of a microprocessor. Breakthroughs in chip technology have most often been advances in transistor-making. As scientists kept making smaller, faster transistors and packing them closer together, the interconnect started to present problems.

Aluminium has long been the conductor of choice, but it is approaching its technological and physical limits. Pushing electrons through smaller and smaller conduits becomes harder to do - aluminium just isn't fast enough at these new, smaller sizes. Scientists had seen this problem coming for years and sought to find a way to replace aluminium with one of the three metals that conduct electricity better: copper, silver, or gold. However, after many years of trying, no one had succeeded in making a marketable copper chip.

Celeron

In an attempt to better address the low-cost PC sector, hitherto the province of the cloners, AMD and Cyrix, who were continuing to develop the legacy Socket 7 architecture, Intel launched its Celeron range of processors in April 1998. The initial 266MHz and 300MHz Celerons, with no Level 2 cache, met with a less-than-enthusiastic market response, carrying little or no price advantage over clone-based Socket 7 systems, yet failing to deliver a compelling performance advantage. In August 1998 Intel beefed up its Celeron range with the processor family formerly codenamed Mendocino. Consequently, starting with the 300A, all Celerons have come equipped with 128KB of on-die Level 2 cache running at full CPU speed and communicating externally via a 66MHz bus. This has made the newer Celerons far more capable than their sluggish predecessors.

Somewhat confusingly, all Celeron processors from the 300A up until the 466MHz were available in two versions - in the SEPP (single edge processor package) form factor or in a plastic pin grid array (PPGA) form factor. The former is the mainstream version, compatible with Intel's existing Slot 1 architecture, while the latter fits a proprietary 370-pin socket (Socket 370), which is neither Socket 7 nor Slot 1. The use of a socket, rather than a slot, gives more flexibility to motherboard designers as a socket has a smaller footprint as well as better heat dissipation characteristics. Consequently, it provides OEMs with more potential to lower system design costs. The 500MHz version was available in PPGA packaging only.

NON-INTEL MICROPROCESSORS

Intel has enjoyed a comfortable position as the PC processor manufacturer of choice in recent years. Since the time of its 486 line of processors, launched in 1989, it has been Cyrix, together with fellow long-time Intel cloner Advanced Micro Devices (AMD), that has posed the most serious threat to Intel’s dominance.

AMD's involvement in personal computing spans the entire history of the industry, the company having supplied every generation of PC processor, from the 8088 used in the first IBM PCs to the new, seventh-generation AMD Athlon processor. In fact, the commonly held view that the Athlon represents the first occasion in the history of the x86 CPU architecture that Intel had surrendered the technological lead to a rival chip manufacturer is not strictly true. A decade earlier AMD's 386DX-40 CPU bettered Intel's 486SX chip in terms of speed, performance and cost.

In the early 1990s both AMD and Cyrix made their own versions of the 486DX, but their products became better known with their 486DX2 clones, one copying the 486DX2-66 (introduced by Intel in 1992) and another upping the ante to 80MHz for internal speed. The 486DX2-80 was based on a 40MHz system bus, and unlike the Intel DX2 chips (which ran hot at 5V) it ran at the cooler 3.3V. AMD and Cyrix both later introduced clock-tripled versions of their 40MHz 486 processors, which ran at 120MHz. Both AMD and Cyrix offered power management features beginning with their clock-doubled processors, with Intel finally following suit with its DX4, launched a couple of years later.

Although Intel stopped improving the 486 with the DX4-100, AMD and Cyrix kept going. In 1995, AMD offered the clock-quadrupled 5x86, a 33MHz 486DX that ran internally at 133MHz. AMD marketed the chip as comparable in performance to Intel's new Pentium/75, and thus the company called it the 5x86-75. But it was a 486DX in all respects, including the addition of the 16K Level 1 cache (the cache built into the processor), which Intel had introduced with the DX4. Cyrix followed suit with its own 5x86, called the M1sc, but this chip was much different from AMD's. In fact, the M1sc offered Pentium-like features, even though it was designed for use on 486 motherboards. Running at 100MHz and 120MHz, the chip included a 64-bit internal bus, a six-stage pipeline (as opposed to the DX4's five-stage pipeline), and branch-prediction technology to improve the speed of instruction execution. It's important to remember, however, that the Cyrix 5x86 appeared after Intel had introduced the Pentium, so these features were more useful in upgrading 486s than in pioneering new systems.

In the post-Pentium era, designs from both manufacturers have met with reasonable levels of market acceptance, especially in the low-cost, basic PC market segment. With Intel now concentrating on its Slot 1 and Slot 2 designs, the target for its competitors is to match the performance of Intel's new designs as they emerge, without having to adopt the new processor interface technologies. As a consequence the lifespan of the Socket 7 form factor has been considerably extended, with both motherboard and chipset manufacturers co-operating with Intel's competitors to allow Socket 7 based systems to offer advanced features such as 100MHz frontside bus and AGP support.

Mid-1999 saw some important developments, likely to have a significant bearing on the competitive position in the processor market in the coming years. In August, Cyrix finally bowed out of the PC desktop business when National Semiconductor sold the rights to its x86 CPUs to Taiwan-based chipset manufacturer VIA Technologies. The highly integrated MediaGX product range remained with National Semiconductor - to be part of the new Geode family of system-on-a-chip solutions the company is developing for the client devices market.

A matter of days later, VIA announced its intention to purchase IDT's Centaur Technology subsidiary - responsible for the design and production of its WinChip x86 range of processors. It is unclear if these moves signal VIA's intention to become a serious competitor in the CPU market, or whether its ultimate goal is to compete with National Semiconductor in the system-on-a-chip market. Hitherto the chipset makers have lacked any x86 design technology to enable them to take the trend for low-cost chipsets incorporating increasing levels of functionality on a single chip to its logical conclusion.

The other significant development was AMD seizing the technological lead from Intel with the launch of its new Athlon (formerly codenamed K7) processor. With Intel announcing delays to its 'Coppermine' 0.18-micron Pentium III at around the same time as AMD's new processor's launch, it's going to be interesting to see whether the company can capitalise on its unprecedented opportunity to dominate in the high-performance arena and what impact the Athlon has on the company's fortunes in the longer term.

Cyrix 6x86

Unveiled in October 1995, the 6x86 was the first Pentium-compatible processor to reach the market and the result of a collaboration with IBM’s Microelectronics Division. Acceptance of the 6x86 was initially slow because Cyrix priced it too high, mistakenly thinking that since the chip's performance was comparable to Intel's, its price could be too. Once Cyrix readjusted its sights and accepted its position as a low-cost, high-performance alternative to the Intel Pentium series, the chip made a significant impact in the budget sector of the market.

Since a 6x86 processor was capable of an equivalent level of performance to a Pentium chip at a lower clock speed, Cyrix collaborated with a number of other companies to develop an alternative to the traditional clock speed-based rating system. The resulting Processor Performance rating, or P-rating, is an application-based standardised performance measure and Cyrix processors traditionally run at a slower clock speed than their P-rating with no apparent performance degradation. For example, the P133+ runs at a clock speed of 110MHz, while the P150+ and P166+ run at 120MHz and 133MHz respectively.
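The gap between P-rating and actual clock speed quoted above can be tabulated with a short Python sketch. The figures come from the text; the dictionary and function names are illustrative only.

```python
# P-rating label -> actual core clock in MHz, as quoted in the text.
P_RATINGS = {"P133+": 110, "P150+": 120, "P166+": 133}

def rating_gap(label):
    """Return by how many MHz the P-rating exceeds the real clock speed."""
    rated = int(label.rstrip("+").lstrip("P"))
    return rated - P_RATINGS[label]

for label, actual in P_RATINGS.items():
    print(f"{label}: runs at {actual}MHz, {rating_gap(label)}MHz below its rating")
```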

The 6x86’s superior performance was due to improvements in the chip’s architecture which allowed the 6x86 to access its internal cache and registers in one clock cycle (a Pentium typically takes two or more for a cache access). Furthermore, the 6x86’s primary cache was unified, rather than comprising two separate 8KB sections for instructions and data. This unified model was able to store instructions and data in any ratio, allowing an improved cache hit rate in the region of 90%.
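Why a unified cache can hold instructions and data "in any ratio" is easiest to see with a deliberately simplified capacity model. The Python sketch below ignores associativity, line size and replacement policy, so it illustrates the idea only and is not a model of either chip. The function names and the example working set are hypothetical.

```python
def resident_kb_split(code_kb, data_kb, half_kb=8):
    """Split cache: code and data each get a fixed half and cannot borrow space."""
    return min(code_kb, half_kb) + min(data_kb, half_kb)

def resident_kb_unified(code_kb, data_kb, total_kb=16):
    """Unified cache: code and data share one pool in whatever ratio is needed."""
    return min(code_kb + data_kb, total_kb)

# Hypothetical working set: a small loop (3KB of code) over 12KB of data.
code_kb, data_kb = 3, 12
print("split  :", resident_kb_split(code_kb, data_kb), "KB resident")
print("unified:", resident_kb_unified(code_kb, data_kb), "KB resident")
```

In this example the split design leaves 5KB of its instruction half unused while 4KB of data does not fit; the unified design holds the whole 15KB working set.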

Indeed, the 6x86 has a number of similarities to the Pentium Pro. It’s a sixth-generation superscalar, superpipelined processor, able to fit a Pentium P54C socket (Socket 7). It contains 3.5 million transistors, initially manufactured on a 0.5 micron five-layer process. It has a 3.3V core with 5V I/O protection.

Like the Pentium, the 6x86 features a superscalar architecture, an 80-bit FPU, a 16KB primary cache and System Management Mode (SMM). However, it has a number of important differences. The 6x86 is superpipelined, meaning there are seven, instead of five, pipeline stages (Prefetch, two Decode, two Address Generation, Execute, and Write-back) to keep information flowing faster and avoid execution stalls. Also present is Register Renaming, providing temporary data storage for instant data availability without waiting for the CPU to access the on-chip cache or system memory.

Other new features include data dependency removal, multi-branch prediction, speculative execution and 'out-of-order' completion. The presence of these architectural components prevents pipeline stalling by continually providing instruction results: predicting requirements, executing instructions with a high level of accuracy and allowing faster instructions to exit the pipeline out of order, without disrupting the program flow. All this boosts 6x86 performance to a level beyond that of a similarly-clocked Pentium.
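To give a feel for what branch prediction involves, here is a minimal Python sketch of a textbook two-bit saturating-counter predictor. This is a generic scheme chosen for illustration; the text does not describe the 6x86's actual prediction logic, and the class name, function name and starting state are assumptions.

```python
class TwoBitPredictor:
    """Textbook 2-bit saturating counter.

    States 0 and 1 predict 'not taken'; states 2 and 3 predict 'taken'.
    """

    def __init__(self):
        self.state = 2  # assumption: start in the 'weakly taken' state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step towards the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy(outcomes):
    """Fraction of branch outcomes the predictor guessed correctly."""
    predictor = TwoBitPredictor()
    hits = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            hits += 1
        predictor.update(taken)
    return hits / len(outcomes)


# A loop branch that is taken nine times and then falls through once.
print(accuracy([True] * 9 + [False]))
```

Only the final loop exit is mispredicted here, which is why loop-heavy code benefits so much from even simple prediction.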

The real key to the 6x86 is its processing of code. It handles code in 'native mode'; it fully optimises the x86 CISC instruction set. This applies to both 16- and 32-bit code. The Pentium does this, too, but by contrast a Pentium Pro requires the conversion of CISC instructions to RISC (or micro) operations before they enter the pipelines. Consequently, the 6x86 execution engine, unlike the Pentium Pro, doesn’t take a performance hit when handling 16- or 32-bit applications because no code conversion is required. The Pentium Pro, on the other hand, is known to be designed as a pure 32-bit processor and 16-bit instructions can stall considerably while in its pipeline.

All of these additional architectural features add up to one thing for the Cyrix 6x86: better performance at a lower clock speed. Compared with a Pentium on a clock-for-clock basis, the 6x86 is a more efficient chip.

However, early 6x86s in particular were plagued by a number of problems, notably overheating, poor floating-point performance and Windows NT incompatibilities. These adversely impacted the processor’s success and the 6x86’s challenge to the Pentium proved short-lived, being effectively ended by the launch of Intel’s MMX-enabled Pentiums at the start of 1997.

Cyrix MediaGX

The introduction of the MediaGX processor in February 1997 defined the first new PC architecture in a decade, and ignited a new market category - the low-cost 'Basic PC'. The growth of this market has been explosive, and Cyrix's processor technology and system-level innovation have been a critical component.

The more processing that occurs on a PC's CPU itself, the more efficient the overall system performance. In traditional computer designs, the CPU processes data at the megahertz of the chip, while the bus that moves data to and from other components operates at only half that speed, or even less. This means that data movement to and from the CPU takes more time - and the potential for data 'stalls' increases. Cyrix eliminated this bottleneck with MediaGX technology.

The MediaGX architecture integrates the graphics and audio functions, the PCI interface and the memory controller into the processor unit, thereby eliminating potential system conflicts and end-user configuration problems. It consists of two chips - the MediaGX processor and the MediaGX Cx5510 companion chip. The processor uses a proprietary socket requiring a specially designed motherboard.

The MediaGX processor is an x86-compatible processor which directly interfaces to a PCI bus and EDO DRAM memory over a dedicated 64-bit data bus. Cyrix claims that the compression technique used over the data bus obviates the need for a Level 2 cache. There is 16KB unified Level 1 cache on the CPU - the same amount as on a standard Pentium chip.

Graphics are handled by a dedicated pipeline on the CPU itself and the display controller is also on the main processor. There is no video memory, the frame buffer being stored in main memory without the performance degradation associated with traditional Unified Memory Architecture (UMA), using instead Cyrix's own Display Compression Technology (DCT). VGA data operations are handled in hardware, but VGA registers and controls are controlled through Cyrix’s Virtual System Architecture (VSA) software.

The companion chip, the MediaGX Cx5510, houses the audio controller and again uses VSA software to mimic the functionality of industry standard audio chips. It also bridges the MediaGX processor over the PCI bus to the ISA bus, interfaces to the IDE and I/O ports, and performs traditional chipset functions.

After Cyrix's acquisition by National Semiconductor in November 1997, the new company reasserted its intention to compete with Intel and to focus on driving down the price of PCs by continuing to develop its 'PC on a chip' MediaGX technology. By the summer of 1998 MediaGX processors, based on 0.25-micron fabrication, had reached speeds of 233MHz and 266MHz, with higher speed grades expected by the end of the year.

Cyrix 6x86MX

Cyrix’s response to Intel's MMX technology was the 6x86MX, launched in mid-1997, shortly before the company was acquired by National Semiconductor. The company stuck with the Socket 7 format for its new chip, a decision which held down costs to system builders and ultimately consumers by extending the life of existing chipsets and motherboards.

The architecture of the new chip remains essentially the same as that of its predecessor, with the addition of MMX instructions, a few enhancements to the Floating Point Unit, a larger 64KB unified primary cache and an enhanced memory-management unit. Its dual-pipeline design is similar to the Pentium’s but simpler and more flexible than the latter’s RISC-based approach.

The 6x86MX was well-received in the marketplace, with a 6x86MX/PR233 (running at a clock speed of 187MHz) proving faster than both a 233MHz Pentium II and K6. The MX was also the first leading processor capable of running on a 75MHz external bus, which provides obvious bandwidth advantages and boosts overall performance. On the downside, and in common with previous Cyrix processors, the 6x86MX’s floating-point performance was significantly weaker than that of its competitors, adversely affecting 3D graphics performance.

Cyrix MII

The MII is an evolution of the 6x86MX, operating at higher frequencies. By the summer of 1998, 0.25-micron MII-300 and MII-333 processors were being produced out of National Semiconductor's new manufacturing facility in Maine, and the company claimed to have already seen shrinks of its 0.25-micron process to produce 0.22-micron geometries on its way to its stated goal of 0.18 micron in 1999.

AMD K6

For many years Advanced Micro Devices (AMD), like Cyrix, had made 286, 386 and 486 CPUs that were directly derived from Intel’s designs. The K5 was the company's first independently created x86 processor, and one for which AMD had held high hopes. In the event, however, it met with only limited success, more as a result of missing its window of opportunity than any particular problems with the processor itself.

However, its purchase of a California-based competitor in the spring of 1996 appears to have enabled AMD to prepare better for its next assault on Intel. The K6 began life as the Nx686, being renamed after the acquisition of NexGen. The K6 range of MMX-compatible processors was launched in mid-1997, some weeks ahead of the Cyrix 6x86MX, and met with immediate critical acclaim.

Manufactured on a 0.35-micron five-layer-metal process, the K6 is almost 20% smaller than a Pentium Pro yet contains 3.3 million more transistors (8.8 million to 5.5 million). Most of these additional transistors reside in the chip’s 64KB Level 1 cache, consisting of 32KB of instruction cache and 32KB of writeback dual-ported cache. This is four times as much as the Pentium Pro and twice as much as the Pentium MMX and Pentium II.

The K6 supports Intel's MMX Technology, including 57 new x86 instructions designed to enhance and accelerate multimedia software. Like the Pentium Pro, the K6 owes a great deal to classic Reduced Instruction Set Computer (RISC) designs. Using AMD's RISC86 superscalar microarchitecture, the chip decodes each x86 instruction into a series of simpler operations that can then be processed using typical RISC principles - such as out-of-order execution, register renaming, branch prediction, data forwarding, and speculative execution.
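The idea of decoding one x86 instruction into a series of simpler operations can be sketched in Python with a toy decoder. This is purely illustrative: the operation names (`load`, `store`), the temporaries `tmp0` and `tmp1`, and the text format are hypothetical and bear no relation to AMD's actual RISC86 encoding. The sketch handles only two-operand arithmetic instructions.

```python
def decode(instr):
    """Split a toy two-operand x86-style arithmetic instruction into simpler steps.

    Memory operands are written in square brackets, e.g. "add [0x1000], eax".
    """
    op, dst, src = instr.replace(",", " ").split()
    micro_ops = []
    if src.startswith("["):
        # A memory source must be loaded into a temporary first.
        micro_ops.append(f"load tmp1, {src}")
        src = "tmp1"
    if dst.startswith("["):
        # Read-modify-write on memory becomes load, compute, store.
        micro_ops.append(f"load tmp0, {dst}")
        micro_ops.append(f"{op} tmp0, {src}")
        micro_ops.append(f"store {dst}, tmp0")
    else:
        micro_ops.append(f"{op} {dst}, {src}")
    return micro_ops


for step in decode("add [0x1000], eax"):
    print(step)
```

Once an instruction is broken into independent register-to-register steps like these, techniques such as out-of-order execution and register renaming can be applied to each step separately.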

The K6 was launched in 166MHz, 200MHz and 233MHz versions. Its level of performance is very similar to a similarly clocked Pentium Pro with its maximum 512KB Level 2 cache. In common with Cyrix’s MX chip, but to a somewhat lesser extent, floating-point performance is an area of relative weakness compared with Intel’s Pentium Pro and Pentium II processors. However, the processor's penetration of the marketplace in late 1997/early 1998 was hampered by problems AMD had in migrating its new 0.25-micron manufacturing process from its development labs to its manufacturing plant. As well as causing a shortage of 200MHz and 233MHz parts, this also delayed the introduction of the 266MHz chip and led to the cancellation of the 300MHz chip.

Super7

When Intel stopped making its MMX processor in mid-1998 it effectively left the Socket 7 field entirely to its competitors, principally AMD and Cyrix. And whilst their latest ideas need co-operation from motherboard and chipset manufacturers to reach their full potential, both companies have ambitious plans for extending the life of the 'legacy' form factor. AMD's determination to match Intel's proprietary Slot 1 architecture on Socket 7 boards is amply illustrated by their 0.25-micron K6-2 processor, launched at the end of May 1998, which marks a significant development of the architecture.

AMD refers to this as the 'Super7' platform initiative, and its aim is to keep the platform viable throughout 1999 and into the year 2000. Developed by AMD and key industry partners, the Super7 platform supercharges Socket 7 by adding support for 100MHz and 95MHz bus interfaces and the Accelerated Graphics Port (AGP) specification and by delivering other leading-edge features, including 100MHz SDRAM, USB, Ultra DMA and ACPI. The latest enhancements to the AMD-K6 processor family include support for a full-speed backside Level 2 cache and an optional frontside Level 3 cache.

AMD K6-2

The 9.3-million-transistor AMD K6-2 processor is manufactured on AMD's 0.25-micron, five-layer-metal process technology using local interconnect and shallow trench isolation at AMD's Fab 25 wafer fabrication facility in Austin, Texas. The AMD-K6-2 processor is packaged in a 100MHz Super7 platform-compatible, 321-pin ceramic pin grid array (CPGA) package.

The K6-2 incorporates the innovative and efficient RISC86 microarchitecture, a large 64KB Level 1 cache (32KB dual-ported data cache, 32KB instruction cache with an additional 20KB of predecode cache) and an improved floating-point execution unit. The MMX unit's execution speed has also been tweaked, addressing one of the criticisms of the K6. At its launch in mid-1998 the entry-level version of the CPU was rated at 300MHz - by early 1999 the fastest processor available was a 450MHz version.

The K6-2's 3D capabilities represent the other major breakthrough. These are embodied in AMD's 3DNow! technology, a new set of 21 instructions that work to enhance the standard MMX instructions already included in the K6 architecture, dramatically speeding up the sort of operations required in 3D applications.

3DNow!

With the launch of K6-2, in May 1998, AMD stole something of a march on Intel, whose similar Katmai technology was not due for release until up to a year later, in the first half of 1999. By the end of March 1999 the installed base of 3DNow! technology-enhanced PCs was estimated to have reached about 14 million systems worldwide.

By improving the processor's ability to handle floating-point calculations, 3DNow! technology closes the growing performance gap between processor and graphics accelerator performance - and eliminates the bottleneck at the beginning of the graphics pipeline. This clears the way for dramatically improved 3D and multimedia performance.

Processing in the graphics pipeline can be viewed as comprising four stages:

	Physics: The CPU performs floating-point-intensive physics calculations to create simulations of the real world and the objects in it
	Geometry: Next, the CPU transforms mathematical representations of objects into three-dimensional representations, using floating point intensive 3D geometry
	Setup: The CPU starts the process of creating the perspective required for a 3D view, and the graphics accelerator completes it
	Rendering: Finally, the graphics accelerator applies realistic textures to computer-generated objects, using per-pixel calculations of colour, shadow, and position.

Each 3DNow! instruction handles two floating-point operands, and the K6-2 micro-architecture allows it to execute two 3DNow! instructions per clock cycle, giving a total of four floating-point operations per cycle. The K6-2's multimedia units combine the existing MMX instructions, which accelerate integer-intensive operations, with the new 3DNow! instructions, and both types can execute simultaneously. Of course, with graphics cards which accelerate 3D in hardware, a great deal of 3D rendering is already being done off the CPU. However, with many 3D hardware solutions, that still leaves a lot of heavily floating-point intensive work at the 'front-end' stages of the 3D graphics pipeline - scene generation and geometry mainly, but also triangle setup. Intel's P6 architecture, as used in Pentium II and Celeron, has always been particularly strong in this area, leaving AMD, Cyrix and IBM behind. The new 3DNow! instructions redress the balance with Single Instruction Multiple Data (SIMD) floating-point operations to enhance 3D geometry setup and MPEG decoding.
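The SIMD idea, one instruction operating on two single-precision values at once, can be emulated in a few lines of Python. The sketch below is an illustration only: the function `pfadd` is named after the 3DNow! packed-add mnemonic but is a plain software emulation, and the constant and helper names are assumptions.

```python
import struct

def to_f32(x):
    """Round a Python float to 32-bit single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def pfadd(a, b):
    """Emulate a packed add: both single-precision lanes are added in one step."""
    return (to_f32(a[0] + b[0]), to_f32(a[1] + b[1]))

# Throughput arithmetic from the text.
OPERANDS_PER_INSTRUCTION = 2
INSTRUCTIONS_PER_CYCLE = 2
FLOPS_PER_CYCLE = OPERANDS_PER_INSTRUCTION * INSTRUCTIONS_PER_CYCLE

print(pfadd((1.0, 2.0), (0.5, 0.25)))
print(FLOPS_PER_CYCLE, "floating-point operations per cycle")
```

A scalar FPU would need two separate add instructions to produce the same pair of results.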

A wide range of application types stand to benefit from 3DNow! technology, which is also being licensed by Cyrix and IDT/Centaur for use in their forthcoming processors. As well as games, these include VRML web sites, CAD, speech recognition and software DVD decoding. Performance is further boosted by use with Microsoft's DirectX 6.0, released in the summer of 1998, which includes routines to recognise and get the most out of the new instruction set. Future versions of the OpenGL API will also be optimised for 3DNow!

AMD K6-III

In February 1999 AMD announced that it had begun volume shipments of the 400MHz AMD K6-III processor, codenamed 'Sharptooth', and was sampling the 450MHz version to OEM customers. The key feature of this new processor is its innovative 'TriLevel Cache' design.

Traditionally, PC processors have relied on two levels of cache:

	Level 1 (L1) cache, which is usually located internally on the silicon die, and
	Level 2 (L2) cache, which can reside either externally on a motherboard or in a slot module, or internally in the form of an 'on-chip' backside L2 cache.

In designing a cache subsystem, the general rule of thumb is that the larger and faster the cache, the better the performance (the more quickly the CPU core can access instructions and data). Recognising the benefits of a large and fast cache design in feeding today's power-hungry PC applications, AMD's 'TriLevel Cache' introduces a number of cache design architectural innovations, designed to enhance the performance of PCs based on the Super7 platform:

	An internal 256KB L2 write-back cache operating at the full speed of the AMD-K6-III processor and complementing the 64KB L1 cache, which is standard in all AMD-K6 family processors
	A multiport internal cache design, enabling simultaneous 64-bit reads and writes to both the L1 cache and the L2 cache
	A 4-way set associative L2 cache design enabling optimal data management and efficiency
	A 100MHz frontside bus to a Super7 motherboard-resident external cache, scaleable from 512KB to 2048KB.

The AMD-K6-III processor's multiport internal cache design enables both the 64KB L1 cache and the 256KB L2 cache to perform simultaneous 64-bit read and write operations in a clock cycle. This multiport capability allows data to be processed faster and more efficiently than non-ported designs. In addition to this multiport cache design, the AMD-K6-III processor core can access both L1 and L2 caches simultaneously, which further enhances overall CPU throughput.
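What "4-way set associative" means for a 256KB cache can be sketched in Python by splitting an address into tag, set index and byte offset. The cache size and associativity come from the text; the 32-byte line size is an assumption, since the text does not give one, and the names are illustrative.

```python
CACHE_BYTES = 256 * 1024  # L2 size, from the text
WAYS = 4                  # associativity, from the text
LINE_BYTES = 32           # assumption: the line size is not stated in the text

# Each set holds WAYS lines, so the number of sets follows from the totals.
NUM_SETS = CACHE_BYTES // (WAYS * LINE_BYTES)

def locate(address):
    """Split an address into (tag, set index, byte offset within the line)."""
    offset = address % LINE_BYTES
    set_index = (address // LINE_BYTES) % NUM_SETS
    tag = address // (LINE_BYTES * NUM_SETS)
    return tag, set_index, offset

print("sets:", NUM_SETS)
print(locate(0x12345))
# Addresses 0 and 65536 fall in the same set but carry different tags,
# so with four ways both (and two more such lines) can be cached at once.
print(locate(0), locate(65536))
```

In a direct-mapped cache those two addresses would evict each other; four ways let up to four lines that share a set index coexist.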

AMD claims that with a fully-configured Level 3 cache, the K6-III has a 435% cache size advantage over a Pentium III and, consequently, a significant performance advantage.
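One plausible reading of how the 435% figure arises is a simple ratio of total cache sizes, sketched below in Python. The K6-III figures come from the text. The Pentium III figures (32KB of Level 1 and 512KB of Level 2 cache) are an assumption, as is the idea that AMD derived the number this way.

```python
# K6-III: L1 + on-die L2 + maximum external L3, all from the text.
K6_III_KB = 64 + 256 + 2048

# Assumption: Pentium III with 32KB of L1 and 512KB of L2 cache.
PENTIUM_III_KB = 32 + 512

ratio_percent = round(100 * K6_III_KB / PENTIUM_III_KB)
print(f"{K6_III_KB}KB vs {PENTIUM_III_KB}KB = {ratio_percent}%")
```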

AMD Athlon

The launch of the Athlon processor, in the summer of 1999, represented a major coup for AMD. It allowed them to boast not only of having produced the first seventh-generation processor - there are enough radical architectural differences between the Athlon core and that of the Pentium II/III and K6-III to earn it the title of a next-generation processor - but it also meant that they had wrested technological leadership from the mighty Intel at the same time.

The word Athlon derives from ancient Greek, where it can mean 'trophy' or 'of the games', and the Athlon is the processor with which AMD is looking to add a real competitive presence in the corporate sector to its traditionally strong performance in the consumer and 3D games markets. With a processor die size of 102mm² and approximately 22 million transistors, the Athlon's principal elements include:

	Multiple Decoders: Three full x86 instruction decoders translate x86 instructions into fixed-length MacroOPs for higher instruction throughput and increased processing power. Instead of executing x86 instructions, which have lengths of 1 to 15 bytes, the Athlon processor executes the fixed-length MacroOPs, while maintaining the instruction coding efficiencies found in x86 programs.
	Instruction Control Unit: Once MacroOPs are decoded, up to three MacroOPs per cycle are dispatched to the instruction control unit (ICU). The ICU is a 72-entry MacroOP reorder buffer (ROB) that manages the execution and retirement of all MacroOPs, performs register renaming for operands, and controls any exception conditions and instruction retirement operations. The ICU dispatches the MacroOPs to the processor’s multiple execution unit schedulers.
	Execution Pipeline: The Athlon contains an 18-entry integer/address generation MacroOP scheduler and a 36-entry floating-point unit (FPU)/multimedia scheduler. These schedulers issue MacroOPs to the nine independent execution pipelines - three for integer calculations, three for address calculations, and three for execution of MMX, 3DNow!, and x87 floating-point instructions.
	Superscalar FPU: AMD's previous CPUs were poor floating-point performers compared with Intel's. This previous weakness has been more than adequately addressed in the Athlon, which features an advanced three-issue superscalar engine based on three pipelined out-of-order execution units (FMUL, FADD, and FSTORE). The term 'superscalar' refers to a CPU's ability to execute more than one instruction per clock cycle, and while such processors have existed for some time now, the Athlon represents the first application of the technology to an FPU subsystem. The superscalar performance characteristic of the Athlon's FPU is partly down to pipelining - the process of pushing data and instructions into a virtual pipe so that the various segments of this pipe can process the operations simultaneously. The bottom line is that the Athlon is capable of delivering as many as four 32-bit, single-precision floating-point results per clock cycle, resulting in a peak performance of 2.4Gflops at 600MHz.
	Branch Prediction: The AMD Athlon processor offers sophisticated dynamic branch prediction logic to minimise or eliminate the delays due to the branch instructions (jumps, calls, returns) common in x86 software.
	System Bus: The Athlon system bus is the first 200MHz system bus for x86 platforms. Based on Digital's Alpha EV6 bus protocol, the frontside bus (FSB) is potentially scaleable to 400MHz and beyond and, unlike the shared bus SMP (Symmetric Multi-Processing) design of the Pentium III, uses a point-to-point architecture to deliver superior bandwidth for uniprocessor and multiprocessor x86 platforms.
	Cache Architecture: Athlon's cache architecture is a significant leap forward from that of conventional sixth-generation CPUs. The total Level 1 cache is 128KB - four times that of the Pentium III - and the high-speed 64-bit backside Level 2 cache controller supports between 512KB and a massive 8MB.
	Enhanced 3DNow!: In response to Intel's Pentium III Streaming SIMD Extensions, the 3DNow! implementation in the Athlon has been upgraded, adding 24 new instructions to the original 21 3DNow! instructions - 19 to improve MMX integer math calculations and enhance data movement for Internet streaming applications and 5 DSP extensions for soft modem, soft ADSL, Dolby Digital, and MP3 applications.
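The peak floating-point figure quoted in the list above is simply results per cycle multiplied by clock speed. A short Python sketch of that arithmetic follows; the function name is illustrative.

```python
def peak_gflops(results_per_cycle, clock_mhz):
    """Peak rate in Gflops = results per clock cycle x clock speed in MHz / 1000."""
    return results_per_cycle * clock_mhz / 1000.0

# Four single-precision results per cycle at 600MHz, as quoted in the text.
print(peak_gflops(4, 600), "Gflops")
```

This is a theoretical ceiling; it assumes every cycle delivers the maximum four results.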

The Athlon uses AMD's Slot A module design, which is mechanically compatible with Slot 1 motherboards but uses a different electrical interface - meaning that Athlon CPUs will not work with Slot 1 motherboards. Slot A is designed to connect electrically to a 200MHz system bus based on the Alpha EV6 bus protocol, thus delivering a significant performance advantage over the Slot 1 infrastructure. As well as providing its own optimised chipset solution - the AMD-750 chipset - the company is working with leading third-party chipset suppliers to assist them in delivering their own Athlon-optimised solutions.
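The bandwidth advantage of a 200MHz system bus can be estimated with a short Python sketch. Two inputs are assumptions rather than figures from the text: a 64-bit data path on both buses, and a 100MHz bus clock for the Slot 1 comparison. The function name is illustrative.

```python
def peak_bandwidth_mb_s(clock_mhz, width_bits):
    """Peak transfer rate in MB/s = transfers per second (in millions) x bytes per transfer."""
    return clock_mhz * width_bits / 8

# Assumption: both buses are 64 bits wide; Slot 1 is taken at 100MHz.
athlon_bus = peak_bandwidth_mb_s(200, 64)
slot1_bus = peak_bandwidth_mb_s(100, 64)
print(f"Athlon system bus: {athlon_bus:.0f}MB/s")
print(f"100MHz Slot 1 bus: {slot1_bus:.0f}MB/s")
```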

The Athlon was initially available in speed grades of 650, 600, 550 and 500MHz, fabricated using AMD's 0.25-micron process technology. By the end of 1999 AMD had increased speeds further, its 750MHz model being the first processor built using AMD's aluminium 0.18-micron, six-layer metal, manufacturing process technology. Whether this can claim to have been the fastest x86 CPU of the millennium is debatable, as Intel was quick to respond with the announcement of an 800MHz Pentium III. However, AMD re-took the lead in the speed stakes early in 2000 with the announcement of 800MHz and 850MHz versions and succeeded in beating Intel to the coveted 1GHz barrier by a matter of days some weeks later. AMD Athlon processors featuring copper interconnect technology are slated for availability around mid-2000.

AMD-750 chipset

The AMD-750 chipset consists of two physical devices: the AMD-751 system controller and the AMD-756 peripheral bus controller.

The key features of the AMD-751 system controller are:

	Support for the AMD Athlon system bus interface, the first 200MHz system bus for x86 system platforms
	System logic architecture optimised for the seventh-generation AMD Athlon processor
	PCI 2.2 compliant bus interface with support for 6 PCI masters
	Support for up to 768MB of PC-100 SDRAM DIMM memory
	Compliant with AGP 2.0 specs for 1x and 2x AGP modes
	Optimised to deliver enhanced AMD Athlon processor system performance

The key features of the AMD-756 peripheral bus controller are:

	Enhanced master mode IDE controller with Ultra DMA-33/66 support
	Support for Plug-n-Play, ACPI 1.0 and APM 1.2 power management standards
	PC97 compliant PCI to ISA bridge and integrated ISA bus controller
	Integrated OHCI-compliant USB controller with root hub and four ports
	Support for legacy style mouse/keyboard controller

Duron

Ever since AMD's repositioning of its Socket 7 based K6-III processor for exclusive use in mobile PCs in the second half of 1999, Intel's Celeron range of processors had enjoyed a position of dominance in the low-cost market segment. In mid-2000 AMD sought to reverse this trend with the announcement of its Duron brand - a new family of processors targeted at value-conscious business and home users. The Duron is based on its more powerful sibling, the Athlon, and takes its name from a Latin derivative - 'durare' meaning 'to last' and 'on' meaning 'unit'. It has 128KB/64KB of Level 1/2 cache - both on-die - a 200MHz front side system bus, and enhanced 3DNow! technology. The 64KB of Level 2 cache compares with the 256KB of its Athlon sibling and the 128KB of its Celeron rival. AMD believes this is sufficient to provide acceptable performance in its target market whilst giving it a cost advantage over its Intel rival.

September 15, 2010

How To Install an Exchange Server 2003 (Part 1)