Computer lessons

What underlies the operation of the processor. How does a computer processor work? Principle of operation

In order to understand how a microprocessor works, let's first ask ourselves: how should it work? There is a theory (mostly created after the fact, once the first computers had already been built and were functioning) that specifies exactly how algorithms should be constructed and what the processor must do to execute them. Naturally, we will not go deeper into this; we will simply state that any algorithm is a sequence of certain actions written as a set of sequentially executed commands (instructions, operators). Among such commands there may be jump commands, which in some cases break the original strict order of executing operators one after another. There must also be commands for data input and output (after all, the program must somehow communicate with the outside world), as well as commands for performing arithmetic and logical operations.

Instructions must be stored somewhere, so a program memory device must be an integral part of the entire system. The data, both the initial values and the results of the program, must also be stored somewhere, so there must be a data memory as well. Since commands and data are ultimately just numbers, the memory can be shared; we only need to be able to distinguish where we have commands and where we have data. This is one of the von Neumann principles, although microcontrollers, which we will talk about later, traditionally use not the von Neumann but the so-called Harvard architecture, in which data and program memory are separate (this separation, however, can be violated within certain limits). A processor built according to von Neumann is more universal: for example, it allows you to expand memory without any problems, build it hierarchically, and redistribute it more efficiently during operation. For example, Windows always assumes that the computer has a virtually unlimited amount of memory (measured in terabytes), and if there really is not enough, a swap file on the hard drive gets involved. Microcontrollers, on the other hand, do not particularly require such flexibility: as a rule, they are used to build units that perform a specific task and run a specific program, so it costs nothing to provide the required system configuration in advance.

MP and MK

By the way, why do we keep talking about microprocessors (MP) or microcontrollers (MK)? A microcontroller differs from a microprocessor in that it is designed to control other devices, and therefore has a built-in, well-developed input-output system but, as a rule, a relatively weaker ALU. Microcontrollers fit very well the term that in Soviet times had a slightly different meaning, "micro-computer"; the English term "computer-on-chip", a single-chip computer, is even more accurate. In fact, to build the simplest computing device that could do something useful, a conventional microprocessor, from the i4004 to the Pentium and Core Duo, has to be supplemented with memory, a ROM with the BIOS written into it, input-output devices, an interrupt controller, a clock generator with timers, etc. - everything that has now come to be combined into so-called "chipsets". A "naked" MP is capable of only one thing: turning on correctly; it does not even have anywhere to load a boot program from.

In a microcontroller, by contrast, the microprocessor is only the core, and not even the largest part of the chip. To build a complete system on a standard MK, nothing is required at all except a power supply and peripheral actuator devices that would let a person see that the system is working. A regular MK can, without additional components, communicate with other MKs, external memory and special chips (such as a real-time clock or flash memory), control small (and sometimes large) matrix panels, and have sensors of physical quantities connected to it directly (including purely analog ones - ADCs are also often built into microcontrollers), as well as buttons, keyboards, LEDs and indicators. In short, everything in microcontrollers has been done so that you have to solder as little as possible and spend as little time as possible selecting components. You pay for this with reduced performance (which, however, is not so important in typical MK tasks) and with certain limitations of individual functions - compared to universal, but hundreds of times more expensive and complex, systems built on "real" MPs. You may not believe it, but processors for personal computers (PCs), which we hear so much about, account for only 5-6% of the total number of processors produced - the rest are microcontrollers for various purposes.

In accordance with what has been said, the main cycle of the processor's operation should look like this: fetch the next command (from memory); if necessary, fetch the source data for it; execute the command; and place the results in memory (again, if necessary). All of this must happen automatically, at the direction of some control device containing a clock generator - the system clock by which everything is synchronized. In addition, all of this has to happen somewhere: data has to be stored, the command code held, actions performed, and so on. So the processor must contain a certain set of working registers (essentially a small amount of ultra-fast memory), connected in a certain way to each other, as well as to the control device and the ALU, which must inevitably be present.

The program counter plays a decisive role in the operation of the processor. At the start of operation it is automatically set to zero, which corresponds to the first command, and it is automatically incremented (that is, increased by one) with each command executed. If the order of commands is broken along the way - for example, a jump (branch) command is encountered - then the address of the target command, its number from the beginning of the program, is loaded into the counter. If this is not just a branch but a subroutine call, which implies a later return to the main sequence of commands (to the command following the call), then before moving on to the subroutine the current value of the program counter is saved in a memory area specially allocated for this purpose - the stack. On the subroutine-return command, the saved address is popped from the stack and execution of the main program continues. Fortunately, we do not have to deal with the program counter ourselves: all the necessary information is contained in the commands, and the processor does everything automatically.
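
To make this mechanism more concrete, here is a schematic sketch in C of how the program counter and the stack behave on a jump, a subroutine call and a return. The opcodes, the instruction encoding and the toy program are invented purely for illustration; no real processor uses exactly this scheme.

#include <stdio.h>
#include <stdint.h>

/* Toy model: a call at address 0 enters a subroutine at address 3,
   which returns to the halt instruction at address 2. */
enum { OP_NOP, OP_JUMP, OP_CALL, OP_RET, OP_HALT };

int main(void)
{
    uint8_t program[] = { OP_CALL, 3, OP_HALT, OP_NOP, OP_RET };
    uint8_t stack[16];
    int     sp = 0;                         /* stack pointer                   */
    uint8_t pc = 0;                         /* program counter, starts at zero */

    for (;;) {
        uint8_t op = program[pc];
        if (op == OP_JUMP) {
            pc = program[pc + 1];           /* load the target address         */
        } else if (op == OP_CALL) {
            stack[sp++] = pc + 2;           /* push the return address         */
            pc = program[pc + 1];           /* go to the subroutine            */
        } else if (op == OP_RET) {
            pc = stack[--sp];               /* pop the return address          */
        } else if (op == OP_HALT) {
            break;
        } else {
            pc += 1;                        /* ordinary command: just move on  */
        }
        printf("program counter is now %d\n", pc);
    }
    return 0;
}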

Fig. 18.2. Block diagram of a simple microcontroller

A block diagram of the simplest MK, containing a processor core and a minimum of components for "communicating" with the outside world, is shown in Fig. 18.2. Here we have included program memory in the system, which in PC processors is always located separately (except for a relatively small amount of high-speed cache memory) - you know yourself how large programs for personal computers can be. In most modern microcontrollers, the read-only memory (ROM) for programs is built into the chip and typically ranges from 1-2 to 8-32 KB. Although there are models with 256 KB of built-in memory, 2-8 KB is quite enough for the vast majority of applications. Built-in random access memory (RAM) for storing data, in one amount or another, is also available in all modern microcontrollers; its typical size ranges from 128-256 bytes to 1-4 KB. Most universal controllers also have a certain amount of built-in non-volatile memory for storing constants - usually about as much as the data RAM. We will return to memory later in this chapter, but for now let us continue with processors.

Details

In the first models of microprocessors (including Intel's PC processors, from the 8086 to the 80386), commands were executed strictly sequentially: load the command, determine that it needs operands, load those operands (from the registers that should contain them; these register addresses are usually stored immediately after the command code itself or are defined in advance), then perform the necessary actions and store the results... The architecture of the hugely popular 8051 microcontrollers, still produced to this day by various companies (Atmel, Philips), has survived from that era; it executed one command in as many as 12 clock cycles (in some modern analogues, however, this number is smaller). To speed things up, designers began dividing clock cycles into parts (for example, triggering on both the rising and falling edges), but the real breakthrough came with the introduction of the pipeline. Since the days of Henry Ford it has been known that the throughput of a pipeline depends only on the execution time of its longest stage: if you divide commands into stages and execute the stages simultaneously in different hardware units, you can achieve a significant speedup (although not in all cases). In the Atmel AVR microcontrollers discussed below, the pipeline is two-stage: while the next command is being fetched and decoded, the previous one is already being executed and writing its results. In the AVR this made it possible to execute most commands in one clock cycle (except for program branch commands).

The main element of the MP that ties all its units into a single system is the internal data bus. All devices exchange signals through it. For example, if the MP needs to access external, additional memory, then when the corresponding command is executed, the required address is placed on the bus and a request from the control device to access the required I/O ports is transmitted over it. If the ports are ready, the address is sent to the port outputs (that is, to the corresponding controller pins); then, when ready, the receiving port places the data received from the external memory onto the bus, from which it is loaded into the desired register, after which the data bus is free again. So that the devices do not interfere with one another, all of this is strictly synchronized, and each device, firstly, has its own address and, secondly, can be in one of three states: working as an input, working as an output, or sitting in a third state, not interfering with the others.

The bit width of an MP usually means the width of the numbers the ALU operates on; the working registers have the same width. For example, all PC processors from the i386 to the later incarnations of the Pentium were 32-bit; some more recent models from Intel and AMD have become 64-bit. Most general-purpose microcontrollers are 8-bit, but there are also 16- and 32-bit ones. The internal data bus, however, can be wider - for example, so that addresses and data can be transmitted simultaneously.

The distribution of the MK market in the first years of the millennium was as follows: slightly less than half of the products produced were 8-bit chips, and the other half was divided between 16- and 32-bit chips, with the share of the latter steadily growing at the expense of the 16-bit ones. Even 4-bit controllers, descendants of the first i4004, are still produced; they occupy no more than 10% of the market, but, curiously, this share is shrinking very slowly.

Notes in the margins

Typically the clock frequency of universal MKs is low (although to an engineer of the 1980s, when PCs ran at no more than 6 MHz, it would have seemed huge) - about 8-16 MHz, sometimes up to 20 MHz or slightly more. And this suits everyone, because ordinary MKs are not intended for building high-speed circuits. If performance is needed, a different class of integrated circuits is used - FPGAs, "programmable logic integrated circuits". The simplest FPGA is a set of logic elements that are not interconnected in any way (the most complex ones may also include complete units such as flip-flops and generators) and that are wired into the desired circuit when the chip is programmed. Combinational logic works much faster than clocked controllers, and nowadays only FPGAs are used to build various logic circuits; assembling them from individual discrete chips ("loose" logic) on a mass scale was abandoned long ago. Another advantage of FPGAs is that the static power consumption of some series is only a few microwatts, in contrast to MKs, which consume quite a lot when running (unless they are in a power-saving mode). Together with the more versatile and much easier to use, but slower and less economical, microcontrollers, FPGAs form the basis of most mass-produced electronic products that you see on store shelves. In this book, of course, we will not consider FPGAs: in amateur practice they are not used, mainly because of the high cost of the corresponding tools and the high barrier to mastering them, and it is not advisable to use them for designing one-off devices even in professional applications.

If the details of the internal workings of the MP do not concern us very much (we already "invented" its central unit, the ALU, in Chapter 15, and that is enough to understand what is happening inside the processor core), the exchange with the outside world interests us in every detail. Input/output ports (I/O ports) are used for this purpose. There is some ambiguity in this term: those who programmed PCs in assembler will remember that in PCs, input/output ports were registers for controlling all devices other than the processor core itself. In microcontrollers the same thing is called "input/output registers": registers for accessing the built-in controller units external to the computing core. These are all the units directly controlled by the user - from timers and serial ports to the flag register and interrupt control. Apart from RAM, which is accessed by special commands, everything else in the controller is controlled through the I/O registers, and they should not be confused with the I/O ports proper.

The I/O ports in the MK are used for exchange with the "environment" (they are, naturally, also controlled through internal I/O registers). The diagram in Fig. 18.2 shows three I/O ports - A, B and C; real MKs may have more or fewer. Even more important is the number of lines in these ports, which most often matches the processor's bit width (but not always - recall the 8086, which had an internal 16-bit structure but looked 8-bit from the outside). If we make 8-bit ports "communicate", for example, with external memory, then two of them can be assigned to a 16-bit address and the remaining one can carry the data. But what if there are only two ports, or just one? (For example, the ATxxxx2313 microcontroller formally has two ports, but one is truncated, so the total number of lines is 15.) To make this possible even in such a situation, all external ports in an MP are always bidirectional. Say, if there are two ports, you can first output the address and then switch the ports to input and receive the data. Naturally, for this the ports must be able to work on a shared bus - that is, either have a third state, or have an open-collector output so they can be combined into a "wired OR".

Options for both ways of organizing a port output line are shown in Fig. 18.3, which gives simplified diagrams of the output lines of the 8048 family of microcontrollers - the once widely used predecessor of the popular 8051 (the 8048, for example, served as the keyboard controller in the IBM PC). In modern MCUs the ports are built somewhat differently (in particular, a field-effect transistor takes the place of the resistor), but for understanding the principle of operation this is not essential.

Ports 1 and 2 of the 8048 MK are built according to the first option (Fig. 18.3, a). When a write to the port is performed, the logic level from the direct output of the latch (a static D flip-flop) goes to the input of the AND gate, and from the inverted output to the gate of transistor VT2. If this level is logic zero, transistor VT1 is off and VT2 is on, and the output is also at logic zero. If the level is logic one, then for the duration of the "Write" pulse transistor VT1 opens and transistor VT2 closes (they are of the same polarity). If there is capacitance at the output (and there always is, in the form of the distributed capacitance of the conductors and the input capacitance of other components), a fairly large charging current flows through the open VT1, producing a clean transition edge from 0 to 1. As soon as the "Write" pulse ends, both transistors are turned off, and the logic one at the output is maintained by resistor R1. The output resistance of the open transistor VT1 is about 5 kOhm, and the resistor is 50 kOhm. Any other device connected to this bus, when working as an output, can only either maintain the logic one by connecting its own similar resistor in parallel with R1, or occupy the line with its logic zero - this, as you can see, is the "wired OR" circuit. When working as an input, the line state is simply read from the input buffer (element "B" in Fig. 18.3, a) during the "Read" pulse.

The second option (Fig. 18.3, b), used for port 0, is a conventional CMOS output stage with a third state. Such a port can also work as an output, only it occupies the line completely, while the other devices connected to the line must humbly listen to the monopolist and just receive its signals. This usually does not create any particular difficulties and is even preferable from a circuit-design point of view because of the symmetrical output signals and the high input resistance. The only difficulty arises when such a port shares a line with one built according to the first option: with a logic one at the output, electrical conflicts can arise if someone tries to drive the line to logic zero (current from the supply would then flow through two open transistors).

Fig. 18.3. Simplified diagrams of the MK 8048 input/output ports: a - ports 1 and 2; b - port 0

To make a three-state port work in a "wired OR" configuration, a simple trick is used: the entire line is "pulled up" to the supply voltage by an external resistor (many microcontrollers have a built-in switchable resistor, connected similarly to R1 in the circuit of Fig. 18.3, a), and the normal state of all participating three-state ports is input operation, i.e. the third state. In this mode there is always a logic one on the line. A line is switched to output only when a logic zero needs to be driven. In this case, even if several ports become active at the same time, no conflicts arise.
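
As an illustration of this trick, here is a hedged C sketch for an AVR microcontroller; the choice of pin PB0 is arbitrary, and an external pull-up resistor on the line is assumed. The output latch is kept at 0, and the pin is only ever switched between a high-impedance input (the resistor gives a logic one) and an output driving zero.

#include <avr/io.h>
#include <stdint.h>

/* Emulating an open-drain ("wired OR") line on pin PB0.
   Assumption: an external pull-up resistor holds the line at logic 1
   whenever nobody is actively pulling it low. */

static inline void line_release(void)       /* let the pull-up give a 1        */
{
    DDRB  &= ~(1 << PB0);                   /* pin becomes an input (3rd state) */
    PORTB &= ~(1 << PB0);                   /* keep the output latch at 0       */
}

static inline void line_drive_low(void)     /* actively pull the line to 0      */
{
    PORTB &= ~(1 << PB0);                   /* latch = 0 ...                    */
    DDRB  |=  (1 << PB0);                   /* ...and enable the output driver  */
}

static inline uint8_t line_read(void)       /* read the actual line level       */
{
    return (PINB >> PB0) & 1;
}

int main(void)
{
    line_release();                         /* idle state: the line sits at 1   */
    if (line_read() == 0) {
        /* some other device on the bus is currently holding the line low */
    }
    line_drive_low();                       /* occupy the line with our 0       */
    line_release();                         /* and let it go again              */
    for (;;) { }
}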

You are reading these lines on a smartphone, tablet or computer. Every one of these devices is based on a microprocessor. The microprocessor is the "heart" of any computing device. There are many types of microprocessors, but they all solve the same kinds of problems. Today we will talk about how the processor works and what tasks it performs. At first glance all this seems obvious, yet many users would be interested in deepening their knowledge of the most important component that makes the computer work. We will learn how technology based on simple digital logic allows your computer not only to solve math problems but also to be an entertainment center. How do just two digits - one and zero - turn into colorful games and movies? Many people have asked themselves this question and will be glad to get an answer. After all, even at the heart of the AMD Jaguar processor we recently covered, on which the latest game consoles are based, lies the same age-old logic.

In English-language literature a microprocessor is often called a CPU (central processing unit). The reason for this name is that a modern processor is a single chip. The first microprocessor in history was created by Intel Corporation back in 1971.

Intel's role in the history of the microprocessor industry


We are talking about the Intel 4004. It was not powerful and could only perform addition and subtraction. It processed only four bits of information at a time (that is, it was 4-bit). But for its time its appearance was a significant event: the entire processor fit into a single chip. Before the Intel 4004, computers were based on whole sets of chips or on discrete components (transistors). The 4004 microprocessor formed the basis of one of the first portable calculators.

The first microprocessor for home computers was the Intel 8080, introduced in 1974. All the processing power of an 8-bit computer was contained in one chip. But the announcement of the Intel 8088 processor was of truly great importance. It appeared in 1979 and from 1981 began to be used in the first mass-produced personal computers, the IBM PC.

Then processors began to evolve and grow more powerful. Anyone even slightly familiar with the history of the microprocessor industry remembers that the 8088 was replaced by the 80286, then came the 80386, followed by the 80486. Then there were several generations of Pentiums: Pentium, Pentium II, III and Pentium 4. All of these Intel processors were based on the basic 8088 design and were backwards compatible. This means that the Pentium 4 could process any piece of code written for the 8088, but did it about five thousand times faster. Not many years have passed since then, yet several more generations of microprocessors have come and gone.


Since 2004 Intel has been offering multi-core processors. The number of transistors used in them has grown by millions. But even now the processor obeys the same general rules that were created for the early chips. The table reflects the history of Intel microprocessors up to and including 2004. A few explanations of what its columns mean:
  • Name. Processor model
  • Date. The year in which the processor was first introduced. Many processors were introduced multiple times, each time their clock speed was increased. Thus, the next modification of the chip could be re-announced even several years after its first version appeared on the market
  • Transistors (Number of transistors). The number of transistors in the chip. You can see that this figure has been steadily increasing
  • Microns (width in microns). One micron is one millionth of a meter. This figure gives the width of the thinnest wire on the chip. For comparison, a human hair is about 100 microns thick
  • Clock speed. The maximum clock rate of the processor
  • Data Width. The "bit width" of the processor's arithmetic logic unit (ALU). An 8-bit ALU can add, subtract, multiply, and perform other operations on two 8-bit numbers, while a 32-bit ALU can handle 32-bit numbers. To add two 32-bit numbers, an 8-bit ALU has to execute four instructions, whereas a 32-bit ALU can do it in one (see the sketch right after this list). In many (but not all) cases the width of the external data bus coincides with the width of the ALU. The 8088 had a 16-bit ALU but an 8-bit bus, while the later Pentiums typically had a 64-bit bus with a 32-bit ALU
  • MIPS (million instructions per second). A rough measure of processor performance. Modern processors perform so many different tasks that this indicator has lost its original meaning and is useful mainly for comparing the computing power of several processors (as in this table)
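
To make the "Data Width" point concrete, here is a short C sketch (didactic only, not real 8088 code) that adds two 32-bit numbers the way an 8-bit ALU has to: as four byte-wide additions, each passing its carry on to the next.

#include <stdio.h>
#include <stdint.h>

uint32_t add32_with_8bit_alu(uint32_t x, uint32_t y)
{
    uint32_t result = 0;
    unsigned carry = 0;
    for (int byte = 0; byte < 4; byte++) {           /* four 8-bit steps        */
        unsigned a = (x >> (8 * byte)) & 0xFF;       /* next byte of x          */
        unsigned b = (y >> (8 * byte)) & 0xFF;       /* next byte of y          */
        unsigned sum = a + b + carry;                /* 8-bit addition w/ carry */
        carry = sum >> 8;                            /* carry out of this byte  */
        result |= (uint32_t)(sum & 0xFF) << (8 * byte);
    }
    return result;
}

int main(void)
{
    printf("%lu\n", (unsigned long)add32_with_8bit_alu(300000123u, 987654321u));
    /* prints 1287654444, the same result as an ordinary 32-bit addition */
    return 0;
}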

There is a direct relationship between the clock speed, the number of transistors, and the number of operations the processor performs per second. For example, the clock speed of the 8088 reached 5 MHz, while its performance was only 0.33 million operations per second - that is, it took about 15 processor cycles to execute one instruction. By 2004, processors could already execute two instructions per clock cycle. This improvement was achieved by putting more transistors - and hence more execution hardware - on the chip.

A chip is also called an integrated circuit (or simply an IC). Most often it is a small, thin silicon wafer into which the transistors are "imprinted". A chip with a side of two and a half centimeters can contain tens of millions of transistors, while the simplest processors can be squares only a few millimeters on a side - and even that is enough for several thousand transistors.

Microprocessor logic


To understand how a microprocessor works, you should study the logic on which it is based, as well as become familiar with assembly language. This is the native language of the microprocessor.

The microprocessor is capable of executing a specific set of machine instructions (commands). Operating with these commands, the processor performs three main tasks:

  • Using its arithmetic-logical unit, the processor performs mathematical operations: addition, subtraction, multiplication and division. Modern microprocessors fully support floating point operations (using a dedicated floating point arithmetic processor)
  • The microprocessor is capable of moving data from one type of memory to another
  • The microprocessor has the ability to make a decision and, based on the decision it makes, “jump”, that is, switch to executing a new set of instructions

The microprocessor contains:

  • Address bus. The width of this bus can be 8, 16 or 32 bits. It sends an address to memory
  • Data bus: 8, 16, 32 or 64 bits wide. This bus can send data to memory or receive data from memory. When people talk about the "bit width" of a processor, they mean the width of the data bus
  • RD (read) and WR (write) lines, which tell the memory whether data is being read from it or written to it
  • Clock line, which carries the clock pulses that pace the processor
  • Reset line, which resets the program counter to zero and restarts instruction execution

Since the information is quite complex, we will assume that the width of both buses - the address and data buses - is only 8 bits. Let's take a quick look at the components of this relatively simple microprocessor:

  • Registers A, B and C are logic chips used for intermediate data storage
  • The address latch is similar to registers A, B and C
  • The program counter is a logic chip (latch) capable of incrementing its value by one in a single step (when it receives the corresponding command) and of resetting it to zero (when commanded to do so)
  • The ALU (arithmetic logic unit) can perform addition, subtraction, multiplication and division on 8-bit numbers, or it can act as an ordinary adder
  • The test register is a special latch that stores the results of comparison operations performed by the ALU. Typically, the ALU compares two numbers and determines whether they are equal or whether one is greater than the other. The test register can also store the carry bit from the adder's last operation. It holds these values in flip-flops, and the command decoder can later use them to make decisions
  • Six blocks in the diagram are labeled "3-State". These are tri-state buffers. Several output sources can be connected to the same wire, but a tri-state buffer lets only one of them (at a time) put its value - "0" or "1" - onto the wire. Thus a tri-state buffer can either pass its value through or disconnect its source from the line
  • The instruction register and the instruction decoder keep all of the above components under control

This diagram does not show the control lines of the command decoder, which can be expressed in the form of the following “orders”:

  • "Register A accepts the value currently coming from the data bus"
  • "Register B accepts the value currently coming from the data bus"
  • "Register C accepts the value currently coming from the arithmetic logic unit"
  • "The program counter register takes the value currently coming from the data bus"
  • "The address latch accepts the value currently coming from the data bus"
  • "The instruction register accepts the value currently coming from the data bus"
  • "Increase the value of the program counter [by one]"
  • "Reset the program counter to zero"
  • "Activate one of the six tri-state buffers" (six separate control lines)
  • "Tell the arithmetic logic unit which operation to perform"
  • "The test register accepts the test bits from the ALU"
  • "Activate RD (the read line)"
  • "Activate WR (the write line)"

The command decoder receives data bits from the test register, from the clock line, and from the instruction register. Simplifying its task as much as possible, we can say that the instruction decoder is the module that "tells" the processor what needs to be done at any given moment.

Microprocessor memory


Familiarity with information related to computer memory and its hierarchy will help you better understand the contents of this section.

Above we wrote about the buses (address and data) and about the read (RD) and write (WR) lines. These buses and lines connect to memory: random access memory (RAM) and read-only memory (ROM). In our example we are considering a microprocessor in which each bus is 8 bits wide. This means it can address 256 bytes (two to the eighth power) and can read or write 8 bits of data at a time. Let's assume that this simple microprocessor has 128 bytes of ROM (starting at address 0) and 128 bytes of RAM (starting at address 128).
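
A minimal C sketch of this address map (purely illustrative: ROM at addresses 0-127, RAM at 128-255) could look like this:

#include <stdio.h>
#include <stdint.h>

static const uint8_t rom[128] = { 0 };      /* the program bytes would go here */
static uint8_t ram[128];

uint8_t mem_read(uint8_t addr)
{
    return (addr < 128) ? rom[addr] : ram[addr - 128];
}

void mem_write(uint8_t addr, uint8_t value)
{
    if (addr >= 128)                        /* writes to ROM are simply ignored */
        ram[addr - 128] = value;
}

int main(void)
{
    mem_write(200, 42);
    printf("%d\n", mem_read(200));          /* prints 42 */
    return 0;
}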

A read-only memory module contains a specific preset persistent set of bytes. The address bus requests a specific byte from the ROM to be transferred to the data bus. When the read channel (RD) changes state, the ROM module supplies the requested byte to the data bus. That is, in this case, only reading data is possible.

The processor can not only read information from RAM, it can also write data to it. Depending on whether reading or writing is being performed, the signal enters either the read channel (RD) or the write channel (WR). Unfortunately, RAM is volatile. When the power is turned off, it loses all data stored in it. For this reason, a computer needs a non-volatile read-only storage device.

Moreover, theoretically, a computer can do without RAM at all. Many microcontrollers allow the necessary data bytes to be placed directly on the processor chip. But it is impossible to do without ROM. In personal computers, ROM is called the basic input and output system (BIOS, Basic Input/Output System). When starting up, the microprocessor begins its work by executing the commands it found in the BIOS.

BIOS commands perform tests on the computer's hardware, and then they access the hard drive and select the boot sector. This boot sector is a separate small program that the BIOS first reads from disk and then places in RAM. After this, the microprocessor begins to execute commands from the boot sector located in RAM. The boot sector program tells the microprocessor what data (intended for subsequent execution by the processor) should be additionally moved from the hard disk to RAM. This is how the processor loads the operating system.

Microprocessor instructions


Even the simplest microprocessor is capable of processing a fairly large set of instructions. The instruction set is a kind of template: each instruction loaded into the command register has its own meaning. People find it hard to remember sequences of bits, so each instruction is described by a short word, and each such word stands for a specific command. These words make up the processor's assembly language. An assembler translates these words into the binary code that the processor understands.

Here is a list of assembly language command words for a conventional simple processor, which we are considering as an example for our story:

  • LOADA mem — Load register A from some memory address
  • LOADB mem — Load register B from some memory address
  • CONB con — Load a constant value into register B
  • SAVEB mem — Save the value of register B in memory at a specific address
  • SAVEC mem — Save the value of register C in memory at a specific address
  • ADD — Add the values of registers A and B. Store the result in register C
  • SUB — Subtract the value of register B from the value of register A. Store the result in register C
  • MUL — Multiply the values of registers A and B. Store the result in register C
  • DIV — Divide the value of register A by the value of register B. Store the result in register C
  • COM - Compare the values of registers A and B. Transfer the result to the test register
  • JUMP addr - Jump to the specified address
  • JEQ addr - If the two compared values are equal, jump to the specified address
  • JNEQ addr - If the two compared values are not equal, jump to the specified address
  • JG addr - If the value is greater, jump to the specified address
  • JGE addr - If value is greater than or equal to, jump to the specified address
  • JL addr - If the value is less, jump to the specified address
  • JLE addr — If the value is less than or equal to, jump to the specified address
  • STOP - Stop execution

The mnemonics are abbreviated English words for a reason: assembly language (like many other programming languages) is based on English, the everyday language of the people who created digital technology.

Microprocessor operation using the example of factorial calculation


Let's consider the operation of a microprocessor using a specific example of its execution of a simple program that calculates the factorial of the number “5”. First, let’s solve this problem “in a notebook”:

factorial of 5 = 5! = 5 * 4 * 3 * 2 * 1 = 120

In the C programming language, the piece of code performing this calculation looks like this:

a = 1;
f = 1;
while (a <= 5)
{
    f = f * a;
    a = a + 1;
}

When this program finishes, the variable f will contain the value of the factorial of five.

The C compiler compiles (that is, translates) this code into a set of assembly language instructions. In the processor we are considering, RAM starts at address 128, and read-only memory (which will hold the program) starts at address 0. So in this processor's language the program looks like this:

// Assume that a is at address 128
// Assume that f is at address 129
0  CONB 1      // a=1;
1  SAVEB 128
2  CONB 1      // f=1;
3  SAVEB 129
4  LOADA 128   // if a > 5 then jump to 17
5  CONB 5
6  COM
7  JG 17
8  LOADA 129   // f=f*a;
9  LOADB 128
10 MUL
11 SAVEC 129
12 LOADA 128   // a=a+1;
13 CONB 1
14 ADD
15 SAVEC 128
16 JUMP 4      // loop back to the "if"
17 STOP

Now the next question arises: what do all these commands look like in permanent memory? Each of these instructions must be represented as a binary number. To simplify the understanding of the material, we will assume that each of the assembly language commands of the processor we are considering has a unique number:

  • LOADA - 1
  • LOADB - 2
  • CONB - 3
  • SAVEB - 4
  • SAVEC mem - 5
  • ADD - 6
  • SUB - 7
  • MUL - 8
  • DIV - 9
  • COM - 10
  • JUMP addr - 11
  • JEQ addr - 12
  • JNEQ addr - 13
  • JG addr - 14
  • JGE addr - 15
  • JL addr - 16
  • JLE addr - 17
  • STOP - 18

// Assume that a is at address 128
// Assume that f is at address 129
Addr  machine instruction/value
0     3    // CONB 1
1     1
2     4    // SAVEB 128
3     128
4     3    // CONB 1
5     1
6     4    // SAVEB 129
7     129
8     1    // LOADA 128
9     128
10    3    // CONB 5
11    5
12    10   // COM
13    14   // JG 17
14    31
15    1    // LOADA 129
16    129
17    2    // LOADB 128
18    128
19    8    // MUL
20    5    // SAVEC 129
21    129
22    1    // LOADA 128
23    128
24    3    // CONB 1
25    1
26    6    // ADD
27    5    // SAVEC 128
28    128
29    11   // JUMP 4
30    8
31    18   // STOP

As you'll notice, seven lines of C code have been converted into 18 lines of assembly language. They took up 32 bytes in ROM.
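
To tie the example together, here is a small C sketch - my own illustrative model of the hypothetical machine described above, not of any real chip - that places exactly these 32 bytes into a 256-byte memory and interprets them. When it reaches STOP it prints the contents of address 129, which is 120.

#include <stdio.h>
#include <stdint.h>

/* Registers A, B, C, a program counter and a test register with
   "equal" and "greater" bits, as in the block diagram above. */
enum { LOADA = 1, LOADB, CONB, SAVEB, SAVEC, ADD, SUB, MUL, DIV,
       COM, JUMP, JEQ, JNEQ, JG, JGE, JL, JLE, STOP };

int main(void)
{
    uint8_t mem[256] = {
        /* the 32 bytes of machine code from the listing above */
        3,1, 4,128, 3,1, 4,129, 1,128, 3,5, 10, 14,31,
        1,129, 2,128, 8, 5,129, 1,128, 3,1, 6, 5,128, 11,8, 18
    };
    uint8_t a = 0, b = 0, c = 0;
    int eq = 0, gt = 0;                     /* test register bits */
    uint8_t pc = 0;                         /* program counter    */

    for (;;) {
        uint8_t op = mem[pc++];             /* fetch the next command */
        switch (op) {                       /* decode and execute it  */
        case LOADA: a = mem[mem[pc++]]; break;
        case LOADB: b = mem[mem[pc++]]; break;
        case CONB:  b = mem[pc++];      break;
        case SAVEB: mem[mem[pc++]] = b; break;
        case SAVEC: mem[mem[pc++]] = c; break;
        case ADD:   c = a + b;          break;
        case MUL:   c = a * b;          break;
        case COM:   eq = (a == b); gt = (a > b); break;
        case JUMP:  pc = mem[pc];                break;
        case JEQ:   pc = eq ? mem[pc] : pc + 1;  break;
        case JG:    pc = gt ? mem[pc] : pc + 1;  break;
        case STOP:  printf("f = %d\n", mem[129]);   /* prints f = 120 */
                    return 0;
        default:    return 1;   /* opcodes this program never uses */
        }
    }
}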

Decoding


The conversation about decoding has to begin with a philological question. Alas, not all computer terms have a one-to-one correspondence in Russian. The translation of terminology often happened spontaneously, so the same English term can be rendered in Russian in several ways. That is what happened with the most important component of the microprocessor's logic, the instruction decoder: specialists call it both a command decoder and an instruction decoder, and neither variant can be called more or less "correct" than the other.

An instruction decoder is needed to translate each machine code into a set of signals that drive the various components of the microprocessor. Simplifying the essence of its work, we can say that it is the unit that coordinates the "software" with the "hardware".

Let's look at the operation of the command decoder using the example of the ADD instruction, which performs an addition action:

  • During the first processor clock cycle, the instruction is loaded. For this, the command decoder needs to: activate the tri-state buffer for the program counter; activate the read (RD) line; activate the tri-state buffer that passes the incoming data into the instruction register
  • During the second processor clock cycle, the ADD instruction is decoded: the arithmetic logic unit is set to addition and its output is latched into register C
  • During the third processor clock cycle, the program counter is incremented by one (in theory this action can overlap with what happens during the second cycle)

Each instruction can be represented as a set of sequentially executed operations that manipulate the components of the microprocessor in a certain order. That is, software instructions lead to completely physical changes: for example, changing the position of a latch. Some instructions may require two or three processor clock cycles to execute. Others may require even five or six cycles.
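
As a purely speculative sketch (the control-line names and their grouping by clock cycle are my own, chosen only to mirror the ADD walk-through above), the decoder's "orders" can be modeled as bit flags issued cycle by cycle:

#include <stdio.h>

enum {
    CTRL_PC_TO_ADDR_BUS = 1 << 0,   /* enable the program counter's tri-state buffer    */
    CTRL_RD             = 1 << 1,   /* activate the read line                           */
    CTRL_LATCH_INSTR    = 1 << 2,   /* latch the data bus into the instruction register */
    CTRL_ALU_ADD        = 1 << 3,   /* tell the ALU to add                              */
    CTRL_LATCH_C        = 1 << 4,   /* latch the ALU output into register C             */
    CTRL_PC_INCREMENT   = 1 << 5    /* increment the program counter                    */
};

int main(void)
{
    /* control words the decoder would issue for ADD, one per clock cycle */
    unsigned add_microsequence[3] = {
        CTRL_PC_TO_ADDR_BUS | CTRL_RD | CTRL_LATCH_INSTR,   /* cycle 1: fetch   */
        CTRL_ALU_ADD | CTRL_LATCH_C,                        /* cycle 2: execute */
        CTRL_PC_INCREMENT                                   /* cycle 3: advance */
    };
    for (int cycle = 0; cycle < 3; cycle++)
        printf("cycle %d: control word = 0x%02X\n",
               cycle + 1, add_microsequence[cycle]);
    return 0;
}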

Microprocessors: Performance and Trends


The number of transistors in a processor is an important factor affecting its performance. As shown earlier, the 8088 needed 15 clock cycles to execute one instruction, and about 80 cycles to perform a single 16-bit operation - that is simply how this processor's ALU multiplier was designed. The more transistors there are, and the more powerful the multiplier in the ALU, the more the processor gets done in one clock cycle.

Many transistors support pipelining technology. Within the framework of a pipeline architecture, executable instructions partially overlap each other. An instruction may require the same five cycles to execute, but if the processor simultaneously processes five instructions (at different stages of completion), then on average, one processor clock cycle will be required to execute one instruction.

Many modern processors have more than one command decoder. And each of them supports pipelining. This allows more than one instruction to be executed in one processor cycle. Implementing this technology requires an incredible number of transistors.

64-bit processors


Although 64-bit processors became widespread only a few years ago, they have been around for a relatively long time: since 1992. Both Intel and AMD currently offer such processors. A 64-bit processor can be considered to be one that has a 64-bit arithmetic logic unit (ALU), 64-bit registers, and 64-bit buses.

The main reason processors need 64 bits is that this architecture expands the address space. A 32-bit processor can address only two to four gigabytes of RAM. Those numbers once seemed gigantic, but years have passed and today such an amount of memory surprises no one. A few years ago the memory of a typical computer was 256 or 512 megabytes; back then the four-gigabyte limit bothered only servers and machines running large databases.

But it quickly turned out that even ordinary users sometimes lack two or even four gigabytes of RAM. This annoying limitation does not apply to 64-bit processors. The address space available to them these days seems infinite: two to the sixty-fourth bytes, or something like a billion gigabytes. Such gigantic RAM is not expected in the foreseeable future.

The 64-bit address bus, as well as the wide and high-speed data buses of the corresponding motherboards, allow 64-bit computers to increase the speed of data input and output when interacting with devices such as the hard drive and video card. These new features significantly increase the performance of modern computing machines.

But not all users will notice the benefits of a 64-bit architecture. It matters above all to those who edit video and photos or work with other large images, and 64-bit computers are also appreciated by avid gamers. Users who simply use a computer for social networks, web surfing and editing text files will most likely not feel any advantage from these processors at all.

Based on materials from computer.howstuffworks.com

Nowadays there is a lot of information on the Internet about processors; you can find plenty of articles about how they work, mostly mentioning registers, clock cycles, interrupts and so on. But for a person unfamiliar with all these terms and concepts it is difficult to grasp the subject "on the fly"; you need to start small - namely, with a basic understanding of how the processor works and what main parts it consists of.

So, what will be inside the microprocessor if you disassemble it:

The number 1 denotes the metal surface (cover) of the microprocessor, which serves to remove heat and protect from mechanical damage what is behind this cover (that is, inside the processor itself).

Number 2 is the crystal (die) itself - in fact the most important part of the microprocessor and the most expensive to manufacture. It is this crystal that performs all the calculations (the most important function of the processor), and the more complex and advanced it is, the more powerful the processor is - and, accordingly, the more expensive. The crystal is made of silicon; the manufacturing process is very complex and involves dozens of steps.

Number 3 is a special textolite (fiberglass) substrate to which all the other parts of the processor are attached; in addition, it serves as a contact pad - on its reverse side there are a large number of gold-plated "dots", which are the contacts (they can be partly seen in the figure). The contact pad (substrate) provides the link to the crystal, since it is impossible to act on the crystal directly in any way.

The cover (1) is attached to the substrate (3) using an adhesive-sealant that is resistant to high temperatures. There is no air gap between the crystal (2) and the cover; thermal paste takes its place; when it hardens, it forms a “bridge” between the processor crystal and the cover, which ensures very good heat transfer.

The crystal is attached to the substrate with solder and sealant, and the contacts of the substrate are connected to the contacts of the crystal. The figure clearly shows how the crystal's contacts are connected to the substrate's contacts by very thin wires (170x magnification in the photo):

In general, the design of processors from different manufacturers and even models from the same manufacturer can vary greatly. However, the principle of operation remains the same - they all have a contact substrate, a crystal (or several located in one case) and a metal cover for heat removal.

For example, this is what the contact substrate of an Intel Pentium 4 processor looks like (the processor is upside down):

The shape of the contacts and their arrangement depend on the processor and the computer's motherboard (the sockets must match). For example, in the picture just above, the processor's contacts have no "pins", since the pins are located in the motherboard socket itself.

And there is another situation where the “pins” of the contacts stick out directly from the contact substrate. This feature is typical mainly for AMD processors:

As mentioned above, the design of different processor models from the same manufacturer can differ; a striking example is the quad-core Intel Core 2 Quad, which is essentially two dual-core processors of the Core 2 Duo line combined in one package:

Important! The number of crystals inside a processor and the number of processor cores are not the same thing.

Many modern Intel processor models contain two chips (crystals) at once. The second chip is the processor's graphics core, which essentially plays the role of a video card built into the processor. That is, even if the system has no separate video card, the graphics core will take on that role - and quite a powerful one at that (in some processor models the computing power of the graphics core allows you to play modern games at medium graphics settings).

That, in short, is how a central microprocessor is constructed.

The heart of a personal computer is the CPU (central processing unit). It is an electronic digital device that operates according to a given program.

Let's take a closer look at this device. First, let's decipher the adjectives "electronic" and "digital" separately.

The adjective “electronic” means that the computer processor runs on electrical energy and all signals that are processed by this device are electrical. However, in radio electronics, electronic devices are divided into 2 large classes: analog and digital. The adjective “digital” means that the computer processor belongs to the class of digital rather than analog devices.

The analog devices just mentioned predominated in electronic equipment 20-30 years ago. They appeared when radio engineers learned to record and transmit sound and images in the form of analog signals: radios, televisions, tape recorders and so on.

Analog devices ceded their leading position only at the end of the last century, when the development of digital technology made it possible to record and transmit any information - including the aforementioned sounds and images - using digital codes.

Digital signals, unlike analog ones, are far less susceptible to interference and can be transmitted over distances without distortion; they are also easier to record and store, and they do not "deteriorate" over time.

The computer processor is one of the most complex devices among digital electronic devices. This is a kind of apotheosis of the development of digital technology.

Externally, it is a silicon wafer mounted in a housing that has many electrical terminals for connecting to the power supply and other computer devices.

Because the processor is made on silicon wafers, in computer geek jargon it is sometimes called “rock,” since silicon is a very durable material.

On this plate, by very precise deposition of a substance (accuracy is measured in angstroms) in a vacuum and while maintaining ideal cleanliness of production, a very complex and extremely miniature electrical circuit is reproduced, consisting of tens and hundreds of thousands of tiny elements (mainly transistors) connected to each other in a special way.

The production of such devices is so high-tech that only countries with the most developed economies have mastered it. Interestingly, in processor production they do not measure the defect rate, as is customary in almost all other industries, but rather the so-called yield - the percentage of usable products - since very few processor blanks end up as working devices.

High-quality silicon wafers are placed in a package with leads and equipped with cooling devices (a heatsink and a fan), since the hundreds of thousands of miniature transistors give off a fair amount of heat as they work.

If you look at the internal logical structure of a computer processor, it is a collection of interconnected devices:

– an arithmetic logic unit (ALU), in which the information is actually transformed,

– a control unit (CU), which directs the operation of the arithmetic logic unit,

– and memory registers (cells), in which the input data, intermediate values and results are stored.

Commands intended to control the operation of the processor are transferred from RAM to the control device. This device controls the operation of the arithmetic logic unit according to the received commands.

In turn, the ALU, in accordance with the commands received from the control unit, carries out

  • entering information from registers,
  • information processing and
  • recording processed information into registers.

Processor registers can also exchange information with RAM cells (likewise on commands from the control unit). So, ultimately, the computer processor

  • processes data received from RAM,
  • and the processed data is also placed in RAM.

The following brief description of the operation of a computer processor illustrates that data processing by a processor is a sequence of very “small” steps:

  • reading data from RAM into processor registers,
  • processing of this data and
  • writing back data from processor registers to RAM cells.

The compensation for this is the extremely high speed of computation: hundreds of thousands and millions of such "small" operations every second. This, in turn, ensures the high speed of information processing that makes the computer an indispensable assistant in work, study, recreation and entertainment.

A tool is simpler than a machine. Often the tool is operated by hand and the machine is powered by steam power or an animal.

Charles Babbage

A computer can also be called a machine, only instead of steam power there is electricity. But programming has made the computer as simple as any tool.

The processor is the heart/brain of any computer. Its main purpose is to perform arithmetic and logical operations, and before diving into the jungle of the processor, we need to get acquainted with its main components and the principles of their operation.

Two main components of the processor

Control unit

The control unit (CU) helps the processor control and execute instructions: it tells the other components exactly what to do. Based on the instructions, it coordinates with the other parts of the computer, including the second major component, the arithmetic logic unit (ALU). All instructions first arrive at the control unit.

There are two types of control implementation:

  • Control units based on hard-wired logic (English: hardwired control units). Their behavior is determined by the internal electrical structure - the design of the printed circuit board or the chip - so such a control unit cannot be modified without physical intervention.
  • Control units with microprogram control (English: microprogrammable control units). These can be programmed for specific purposes; the microprogram is stored in the control unit's memory.

A hardwired control unit is faster, but a microprogrammed control unit is more flexible.

Arithmetic logic unit

This device, oddly enough, performs all arithmetic and logical operations, such as addition, subtraction, logical OR, and so on. The ALU is built from logic gates that perform these operations.

Most logic gates have two inputs and one output.

Below is the circuit of a half adder, which has two inputs and two outputs. Here A and B are the inputs, S is the sum output, and C is the carry (into the next, more significant digit).

Arithmetic half-adder circuit
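
The half adder's behavior is easy to model in software: the sum bit is the exclusive OR of the inputs and the carry bit is their AND. The little C program below (an illustration added here, not part of the original article) prints its full truth table.

#include <stdio.h>

int main(void)
{
    for (int a = 0; a <= 1; a++) {
        for (int b = 0; b <= 1; b++) {
            int s = a ^ b;          /* sum bit   */
            int c = a & b;          /* carry bit */
            printf("A=%d B=%d -> S=%d C=%d\n", a, b, s, c);
        }
    }
    return 0;
}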

Information storage - registers and memory

As mentioned earlier, the processor executes the commands it receives. Commands in most cases work with data, which can be intermediate, input or output. All this data, along with instructions, is stored in registers and memory.

Registers

A register is the smallest unit of data storage. Registers are made up of flip-flops (also called latches or triggers). Flip-flops, in turn, are built from logic gates and can each store 1 bit of information.

Translator's note: flip-flops can be synchronous or asynchronous. Asynchronous ones can change state at any moment, while synchronous ones change only on the rising or falling edge at the clock input.

According to their functional purpose, flip-flops are divided into several groups:

  • RS flip-flop: holds its state when both inputs are at zero and changes it when one of the inputs (Reset/Set) is set to one.
  • JK flip-flop: identical to the RS flip-flop, except that when ones are applied to both inputs at once, it toggles its state (counting mode).
  • T flip-flop: toggles its state on every clock pulse at its single input.
  • D flip-flop: stores the state of its input at the moment of the clock edge. Asynchronous D flip-flops make little sense.

RAM is not suitable for storing intermediate data, as that would slow the processor down. Intermediate data is sent to the registers over the bus. Registers can store commands, output data, and even the addresses of memory cells.

Operating principle of an RS flip-flop
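
Here is a tiny illustrative C model (my own sketch) of that RS behavior: S=1 sets the stored bit, R=1 resets it, and S=R=0 holds the previous state; S=R=1 is the forbidden combination for a simple RS latch and is simply ignored here.

#include <stdio.h>
#include <stdbool.h>

typedef struct { bool q; } rs_latch;

void rs_update(rs_latch *latch, bool s, bool r)
{
    if (s && r) return;             /* forbidden input combination */
    if (s) latch->q = true;         /* set                         */
    if (r) latch->q = false;        /* reset                       */
    /* s == 0 and r == 0: hold the previous state                  */
}

int main(void)
{
    rs_latch bit = { false };
    rs_update(&bit, true,  false); printf("after set:   q=%d\n", bit.q);
    rs_update(&bit, false, false); printf("after hold:  q=%d\n", bit.q);
    rs_update(&bit, false, true);  printf("after reset: q=%d\n", bit.q);
    return 0;
}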

Memory (RAM)

RAM (random access memory) is a large group of these same registers connected together. Such storage is volatile: the data disappears when the power is turned off. RAM receives the address of the memory cell where data should be placed, the data itself, and a read/write flag that activates the flip-flops.

Translator's note: RAM can be static or dynamic - SRAM and DRAM, respectively. In static memory the cells are flip-flops; in dynamic memory they are capacitors. SRAM is faster, DRAM is cheaper.

Commands (instructions)

Commands are the actual actions that the computer must perform. They come in several types:

  • Arithmetic: addition, subtraction, multiplication, etc.
  • Logical: AND (logical multiplication/conjunction), OR (logical addition/disjunction), negation, etc.
  • Information: move, input, output, load and store.
  • Jump commands: goto, if ... goto, call and return.
  • Stop command: halt.

Translator's note: in fact, all arithmetic operations in an ALU can be built from just two: addition and shift. However, the more basic operations an ALU supports natively, the faster it is.
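
To illustrate this note, here is a classic shift-and-add multiplication written in C: nothing but additions and shifts. It is a didactic sketch, not a description of how any particular ALU is wired.

#include <stdio.h>
#include <stdint.h>

uint32_t mul_shift_add(uint16_t x, uint16_t y)
{
    uint32_t a = x;                 /* widened so the left shifts cannot overflow */
    uint32_t result = 0;
    while (y != 0) {
        if (y & 1)                  /* if the lowest bit of y is set...           */
            result += a;            /* ...add the shifted multiplicand            */
        a <<= 1;                    /* shift the multiplicand left                */
        y >>= 1;                    /* move on to the next bit of y               */
    }
    return result;
}

int main(void)
{
    printf("13 * 11 = %lu\n", (unsigned long)mul_shift_add(13, 11));  /* 143 */
    return 0;
}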

Instructions are provided to the computer in assembly language or generated by a high-level language compiler.

In a processor, instructions are implemented in hardware. In one clock cycle, a single-core processor can execute one elementary (basic) instruction.

A group of instructions is usually called an instruction set.

CPU clocking

The speed of a computer is determined by the clock frequency of its processor. The clock frequency is the number of clock cycles (and, correspondingly, executed commands) per second.

The frequency of today's processors is measured in GHz (gigahertz). 1 GHz = 10⁹ Hz, that is, a billion clock cycles (operations) per second.

To reduce the execution time of a program, you need either to optimize (shorten) the program or to increase the clock frequency. Some processors allow the frequency to be raised (overclocking), but such actions physically stress the processor and often lead to overheating and failure.

Executing Instructions

Instructions are stored in RAM in sequential order. For our hypothetical processor, an instruction consists of an operation code (opcode) and a memory or register address. Inside the control unit there are two instruction registers, into which the instruction code and the address of the currently executing instruction are loaded. The processor also has additional registers that hold the last 4 bits of the executed instruction.

Below is an example of a set of commands that sums two numbers:

  1. LOAD_A 8. This command is stored in RAM as, say, <1100 1000>. The first 4 bits are the operation code; it is this code that identifies the instruction. The value is placed in the instruction registers of the control unit, and the command is decoded into a load_A instruction: put the data 1000 (the last 4 bits of the command, i.e. the number 8) into register A.
  2. LOAD_B 2. The situation is similar to the previous one: this places the number 2 (0010) in register B.
  3. ADD B A. The command adds two numbers (more precisely, it adds the value of register B to register A). The control unit tells the ALU to perform the addition and to place the result back into register A.
  4. STORE_A 23. We save the value of register A into the memory cell with address 23.

These are the operations needed to add two numbers.
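
Since the instruction format here is a 4-bit opcode followed by a 4-bit operand, decoding such a command in software takes just two bit operations. The byte value below is taken from the LOAD_A example above; nothing beyond that is implied.

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint8_t instruction = 0xC8;             /* binary 1100 1000          */
    uint8_t opcode  = instruction >> 4;     /* upper 4 bits: 1100 = 12   */
    uint8_t operand = instruction & 0x0F;   /* lower 4 bits: 1000 = 8    */
    printf("opcode=%d operand=%d\n", opcode, operand);
    return 0;
}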

Buses

All data between the processor, registers, memory, and I/O (input/output) devices is transferred over buses. To write newly processed data to memory, the processor places the address on the address bus and the data on the data bus, and then issues a write-enable signal on the control bus.

Cache

The processor has a mechanism for storing instructions in a cache. As we found out earlier, a processor can execute billions of instructions per second, so if every instruction had to be fetched from RAM, fetching it would take longer than processing it. To speed things up, the processor therefore keeps part of the instructions and data in a cache.

If the data in the cache and in main memory do not match, the corresponding cache entries are marked with dirty bits.
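
As an illustration of these two ideas - keeping recently used data close to the processor and marking modified entries with dirty bits - here is a toy direct-mapped cache in C. The sizes and the write-back policy are arbitrary choices for the sketch, not a description of any real CPU cache.

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define CACHE_LINES 8

typedef struct {
    bool    valid;
    bool    dirty;                  /* cached value differs from RAM */
    uint8_t tag;                    /* which address this line holds */
    uint8_t data;
} cache_line;

static uint8_t    ram[256];         /* slow main memory              */
static cache_line cache[CACHE_LINES];

uint8_t cached_read(uint8_t addr)
{
    cache_line *line = &cache[addr % CACHE_LINES];
    if (!(line->valid && line->tag == addr)) {      /* miss                 */
        if (line->valid && line->dirty)
            ram[line->tag] = line->data;            /* write back old data  */
        line->valid = true;
        line->dirty = false;
        line->tag   = addr;
        line->data  = ram[addr];                    /* fetch from slow RAM  */
    }
    return line->data;                              /* hit: fast path       */
}

void cached_write(uint8_t addr, uint8_t value)
{
    cache_line *line = &cache[addr % CACHE_LINES];
    if (line->valid && line->dirty && line->tag != addr)
        ram[line->tag] = line->data;                /* evict and write back  */
    line->valid = true;
    line->tag   = addr;
    line->data  = value;
    line->dirty = true;                             /* RAM copy is now stale */
}

int main(void)
{
    cached_write(42, 7);
    printf("%d\n", cached_read(42));                /* 7, served from the cache */
    return 0;
}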

Instruction flow

Modern processors can process multiple instructions in parallel. While one instruction is in the decoding stage, the processor may have time to receive another instruction.

However, this solution is only suitable for those instructions that are independent of each other.

If a processor is multi-core, it means that it actually houses several separate processors with some shared resources, such as a cache.