Remote Procedure Call (RPC), Messaging, REST: Concepts and Terminology

Lecture 4

4.1 Remote Procedure Call Concept

The idea behind the Remote Procedure Call (RPC) is to extend the well-known and well-understood mechanism for transferring control and data within a program running on one machine to the transfer of control and data over a network. RPC tools are designed to simplify the organization of distributed computing. RPC is most effective in applications that feature interactive communication between remote components, fast response times, and relatively small amounts of transferred data. Such applications are called RPC-oriented.

The characteristic features of calling local procedures are: asymmetry, that is, one of the interacting parties is the initiator; synchronicity, that is, execution of the calling procedure stops from the moment the request is issued and is resumed only after the called procedure returns.

Implementing remote calls is much more complicated than implementing local procedure calls. To begin with, because the calling and called procedures execute on different machines, they have different address spaces, which creates problems when passing parameters and results, especially if the machines are not identical. Since RPC cannot rely on shared memory, RPC parameters must not contain pointers to non-stack memory locations, and parameter values must be copied from one computer to the other. The next difference from a local call is that RPC necessarily uses the underlying communication system, yet this should not be explicitly visible either in the definitions of the procedures or in the procedures themselves.

Remoteness introduces additional problems. The calling program and a called local procedure on the same machine run within a single process, but an RPC involves at least two processes, one on each machine. If one of them crashes, the following situations may arise: if the calling procedure crashes, the remotely called procedures become "orphans", and if the remote procedures crash, the calling procedures become "orphaned parents", waiting in vain for a response from the remote procedures.

In addition, there are a number of problems associated with the heterogeneity of programming languages and operating environments: the data structures and procedure call conventions supported in one programming language are not supported in the same way in all other languages.


These and some other problems are solved by the widespread RPC technology, which underlies many distributed operating systems.

Basic RPC Operations

To understand how RPC works, let's first consider a local procedure call on an ordinary standalone machine. Take, for example, the system call

count = read(fd, buf, nbytes);

where fd is an integer, buf is an array of characters, and nbytes is an integer.

To make the call, the calling procedure pushes the parameters onto the stack in reverse order. When read finishes, it places the return value in a register, pops the return address, and returns control to the calling procedure, which removes the parameters from the stack, restoring it to its original state. Note that in C, parameters can be passed either by reference or by value. From the called procedure's point of view, value parameters are initialized local variables. The called procedure can change them without affecting the original values of these variables in the calling procedure.

If a pointer to a variable is passed to the called procedure, then changing the value of this variable by the called procedure entails changing the value of this variable for the calling procedure. This fact is very significant for RPC.

There is also another parameter passing mechanism, which is not used in C. It is called call-by-copy/restore: the caller copies the variables onto the stack as values and, after the call completes, copies them back over the original values in the calling procedure.

The decision about which parameter passing mechanism to use is made by the language designers. Sometimes it depends on the type of data being passed. In C, for example, integers and other scalar data are always passed by value, and arrays are always passed by reference (as a pointer to the first element).
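
The difference between call-by-reference and call-by-copy/restore becomes visible when the same variable is passed twice. The following Python sketch emulates both mechanisms, with one-element lists standing in for memory cells. All function names are illustrative and do not come from the lecture.

```python
def increment_both(a, b):
    """Callee: add 1 to each parameter cell (a one-element list)."""
    a[0] += 1
    b[0] += 1


def call_by_reference(x):
    """Both parameters alias the caller's single variable."""
    cell = [x]                    # the caller's variable
    increment_both(cell, cell)    # the callee sees the same cell twice
    return cell[0]


def call_by_copy_restore(x):
    """Each parameter is copied in, then copied back after the call."""
    a, b = [x], [x]               # copy in: two independent copies
    increment_both(a, b)
    x = a[0]                      # copy out the first parameter
    x = b[0]                      # copy out the second; it overwrites the first
    return x


print(call_by_reference(0))       # 2: both increments hit one variable
print(call_by_copy_restore(0))    # 1: each copy was incremented once
```

A stub that ships a buffer to the server and copies the result back behaves like copy/restore, so it matches call-by-reference in the common case but not when parameters alias each other.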

The idea behind RPC is to make a remote procedure call look as similar as possible to a local procedure call. In other words, make RPC transparent: the calling procedure does not need to know that the called procedure is on another machine, and vice versa.

RPC achieves transparency in the following way. When the called procedure is actually remote, a different version of the procedure, called a client stub, is placed in the library instead of the local procedure. Like the original procedure, the stub is invoked using the normal calling sequence, and a trap to the kernel occurs. Unlike the original procedure, however, it does not place the parameters in registers and does not ask the kernel for data; instead, it builds a message to be sent to the kernel of the remote machine.
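
A client stub of this kind can be sketched in Python as follows. This is only an illustration under stated assumptions: the procedure number READ_PROC, the message layout, and the fake_kernel_send function (which stands in for the kernel, the network, and the remote server) are all hypothetical.

```python
import struct

READ_PROC = 1  # hypothetical procedure number for read()


def fake_kernel_send(message):
    """Stand-in for kernel + network + server: decode the request, reply."""
    proc, fd, nbytes = struct.unpack("!III", message)
    assert proc == READ_PROC
    data = b"hello world"[:nbytes]             # pretend file contents
    return struct.pack("!i", len(data)) + data


def read(fd, nbytes, send=fake_kernel_send):
    """Client stub: called like a local read(), but it builds a message."""
    request = struct.pack("!III", READ_PROC, fd, nbytes)  # pack parameters
    reply = send(request)                                  # hand to transport
    (count,) = struct.unpack("!i", reply[:4])              # unpack the result
    return count, reply[4:4 + count]


count, buf = read(3, 5)
print(count, buf)   # 5 b'hello'
```

Because buf cannot be passed as a pointer across machines, the sketch returns the data by value alongside the count.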

RPC Execution Stages

The interaction of software components when performing a remote procedure call is illustrated in Figure 2.

Figure 2. Remote Procedure Call

After the client program calls the client stub, the stub's first task is to fill a buffer with the message to be sent. In some systems, the client stub has a single fixed-length buffer that is filled from scratch for each new request. In other systems, the message buffer is a pool of buffers for individual message fields, some of which are already filled in. This method is especially suitable when the packet format consists of a large number of fields, many of whose values do not change from call to call.

The parameters must then be converted to the appropriate format and inserted into the message buffer. At this point the message is ready to be sent, so a trap to the kernel is executed.

When the kernel gains control, it switches contexts, saves the processor registers and the memory map (page descriptors), and installs the new memory map that will be used in kernel mode. Because the kernel and user contexts are different, the kernel must copy the message into its own address space so that it can access it, record the destination address (and possibly other header fields), and pass the message to the network interface. This completes the work on the client side. The retransmission timer is started, and the kernel can either poll in a loop for a response or pass control to the scheduler, which selects some other process to run. In the first case the request completes faster, but there is no multiprogramming.

On the server side, incoming bits are placed by the receiving hardware either in an on-chip buffer or in RAM. When all the information has been received, an interrupt is generated. The interrupt handler checks the correctness of the packet data and determines which stub it should be passed to. If no stub is waiting for this packet, the handler must either buffer it or discard it altogether. If there is a waiting stub, the message is copied to it. Finally, a context switch is performed, as a result of which the registers and memory map are restored to the values they had at the moment the stub made the receive call.

Now the server stub starts working. It unpacks the parameters and pushes them onto the stack in the appropriate way. When everything is ready, the call to the server procedure is made. After executing the procedure, the server returns the results to the client. This is done by performing all the steps described above in reverse order.
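
The order of the steps above can be traced with a small Python sketch. It is a simplification: the kernel and network are collapsed into a direct function call, and the names multiply, server_stub, client_stub, and trace are illustrative assumptions.

```python
import struct

trace = []  # records the order in which the steps run


def multiply(a, b):
    """The server procedure itself: an ordinary local function."""
    trace.append("server procedure runs")
    return a * b


def server_stub(message):
    """Server side: unpack parameters, call the procedure, pack the result."""
    trace.append("server stub unpacks parameters")
    a, b = struct.unpack("!ii", message)
    result = multiply(a, b)
    trace.append("server stub packs result")
    return struct.pack("!i", result)


def client_stub(a, b, transport=server_stub):
    """Client side: pack parameters, send, then unpack the reply."""
    trace.append("client stub packs parameters")
    reply = transport(struct.pack("!ii", a, b))  # stands in for kernel + network
    trace.append("client stub unpacks result")
    return struct.unpack("!i", reply)[0]


print(client_stub(6, 7))   # 42
print(trace)
```

The trace shows that the reply path mirrors the request path: packing on one side is always matched by unpacking on the other.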

Figure 3 shows the sequence of commands that must be executed for each RPC call.

Figure 3. RPC procedure steps

Remote procedure call (RPC) is a class of technologies that allow computer programs to call functions or procedures in another address space (usually on remote computers). Typically, an implementation of RPC technology includes two components: a network protocol for client-server communication and a serialization language for objects (or for structures, in non-object RPC). RPC implementations differ widely in architecture and capabilities: some follow a service-oriented architecture (SOA), others are based on CORBA or DCOM. At the transport layer, RPC implementations mainly use TCP and UDP; however, some are built on top of HTTP (which departs from the ISO/OSI layering, since HTTP is not natively a transport protocol).

Implementations

There are many technologies that provide RPC:

  • Sun RPC, also known as ONC RPC (binary protocol based on TCP and UDP, with XDR data encoding); see RFC 1831 and RFC 1833
  • .NET Remoting (binary protocol based on TCP, UDP, HTTP)
  • SOAP, Simple Object Access Protocol (HTTP-based text protocol); see RFC 4227, which covers SOAP over BEEP
  • XML-RPC (HTTP-based text protocol); see RFC 3529, which covers XML-RPC over BEEP
  • Java RMI, Java Remote Method Invocation; see the specification: http://java.sun.com/j2se/1.5.0/docs/guide/rmi/index.html
  • JSON-RPC, JavaScript Object Notation Remote Procedure Calls (HTTP-based text protocol); see RFC 4627, which defines the JSON format itself
  • DCE/RPC, Distributed Computing Environment / Remote Procedure Calls (binary protocol based on various transport protocols, including TCP/IP and Named Pipes from the SMB/CIFS protocol)
  • DCOM, Distributed Component Object Model, also known as "Network OLE" (an object-oriented extension of DCE/RPC, built on MSRPC, Microsoft's implementation of DCE/RPC, that allows you to pass references to objects and call object methods through such references)

Principle

The idea behind the Remote Procedure Call (RPC) is to extend the well-known and well-understood mechanism for transferring control and data within a program running on one machine to the transfer of control and data over a network. RPC tools are designed to simplify the organization of distributed computing and the creation of distributed client-server information systems. RPC is most effective in applications that feature interactive communication between remote components, fast response times, and relatively small amounts of transferred data. Such applications are called RPC-oriented.

Implementing remote calls is much more complicated than implementing local procedure calls. We can identify the following problems and tasks that need to be solved when implementing RPC:

  • Because the calling and called procedures run on different machines, they have different address spaces, which creates problems when passing parameters and results, especially if the machines run different operating systems or have different architectures (for example, little-endian versus big-endian byte order). Since RPC cannot rely on shared memory, RPC parameters must not contain pointers to non-stack memory locations, and parameter values must be copied from one computer to the other. To copy procedure parameters and results over the network, they are serialized.
  • Unlike a local call, a remote procedure call necessarily uses the transport layer of the network architecture (for example, TCP), but this remains hidden from the developer.
  • The calling program and a called local procedure on the same machine run within a single process, but an RPC involves at least two processes, one on each machine. If one of them crashes, the following situations may arise: if the calling procedure crashes, the remotely called procedures become "orphans", and if the remote procedures crash, the calling procedures become "orphaned parents", waiting in vain for a response from the remote procedures.
  • There are a number of problems associated with the heterogeneity of programming languages and operating environments: the data structures and procedure call conventions supported in one programming language are not supported in the same way in all other languages. Thus, there is a compatibility problem that has not yet been solved either by the introduction of one generally accepted standard or by the implementation of several competing standards on all architectures and in all languages.
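
The byte-order problem from the first item can be shown with Python's standard struct module. The sketch below is illustrative only; marshal_int and unmarshal_int are hypothetical helper names. It shows why both sides must agree on one wire format instead of copying raw memory.

```python
import struct

value = 1

little = struct.pack("<I", value)   # layout on a little-endian machine
big = struct.pack(">I", value)      # layout on a big-endian machine
print(little.hex())                 # 01000000
print(big.hex())                    # 00000001

# Interpreting big-endian bytes as little-endian yields a wrong value.
misread = struct.unpack("<I", big)[0]
print(misread)                      # 16777216


def marshal_int(n):
    """Serialize a signed 32-bit integer in network (big-endian) order."""
    return struct.pack("!i", n)


def unmarshal_int(data):
    """Inverse of marshal_int; gives the same value on any architecture."""
    return struct.unpack("!i", data)[0]


print(unmarshal_int(marshal_int(-42)))   # -42
```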

Subsystems

  • Transport subsystem. Manages outgoing and incoming connections, supports the concept of a "message boundary" for transport protocols that do not support it directly (TCP), and supports guaranteed delivery for transport protocols that do not support it directly (UDP).
  • Thread pool (callee only). Provides an execution context for code invoked over the network.
  • Marshalling (also called "serialization"). Packing call parameters into a byte stream in a standard way that is independent of the architecture (in particular, of the byte order within a word). In particular, it can apply to arrays, strings, and structures referenced by pointer parameters.
  • Encryption and digital signing of packets.
  • Authentication and authorization. Transmission over the network of information identifying the subject making the call.

In some RPC implementations (.NET Remoting), the subsystem boundaries are open polymorphic interfaces, and it is possible to write your own implementation of almost every subsystem listed. In other implementations (DCE RPC on Windows) this is not the case.

Application

A significant portion of Windows operating system remote management tools (Event Viewer, Server Manager, print management, user list management) use DCE RPC as a means of network communication between the managed service and the user interface management application. DCE RPC support has been present in Windows NT since the very first version 3.1. DCE RPC clients were also supported in the lightweight line of Windows 3.x/95/98/Me operating systems.

The Windows system libraries that provide such control capabilities and serve as the base layer for user interface control applications (netapi32.dll and partly advapi32.dll) actually contain client code for the DCE RPC interfaces that perform this control.

This architectural decision drew active criticism of Microsoft. The generic marshalling procedures in DCE RPC are very complex and leave considerable room for defects that can be exploited over the network by sending a deliberately malformed DCE RPC packet. A significant portion of the Windows security defects discovered from the late 1990s to the mid-2000s were errors in the DCE RPC marshalling code.

In addition to DCE RPC, Windows actively uses DCOM technology. For example, it is used as a means of communication between IIS web server management tools and the server itself being managed. A fully functional interface for communicating with the MS Exchange Server mail system - MAPI - is also based on DCOM.

The purpose of this article is to discuss terminology. The article is not about how and why, but only about the use of terminology. The article reflects the opinion of the author and does not pretend to be scientific.

Introduction

If you work in programming distributed systems or in systems integration, then most of what is presented here is not new to you.

The problem comes when people who use different technologies meet, and when those people start having technical conversations. In this case, mutual misunderstandings often arise due to terminology. Here I will try to bring together the terminologies used in different contexts.

Terminology

There is no clear terminology and classification in this area. The terminology used below is a reflection of the author’s model, that is, it is strictly subjective. Any criticism and any discussions are welcome.

I've divided the terminology into three areas: RPC (Remote Procedure Call), Messaging and REST. These areas have historical roots.

RPC

RPC technologies are the oldest of the three. The most prominent representatives of RPC are CORBA and DCOM.

In those days, systems mostly had to be connected over fast and relatively reliable local networks. The main idea behind RPC was to make calling remote systems look much like calling functions within a program. The entire mechanics of remote calls was hidden from the programmer. At least, that was the intent. In many cases programmers were forced to work at a deeper level, where the terms marshalling and unmarshalling appeared, which essentially meant serialization and deserialization. Ordinary in-process function calls were handled on the caller's side by a Proxy and, on the side of the system executing the function, by a Dispatcher. Ideally, neither the calling system nor the processing system dealt with the intricacies of transferring data between systems. All these details were concentrated in the Proxy-Dispatcher pair, whose code was generated automatically.

So you would not notice, and were not supposed to notice, any difference between calling a local function and calling a remote function.

Now there is a kind of RPC renaissance, whose most prominent representatives are Google ProtoBuf, Thrift, and Avro.

Messaging

Over time, it turned out that the attempt to shield the programmer from the fact that a remote function differs from a local one did not lead to the desired result. The implementation details and fundamental differences of distributed systems were too great to be handled by automatically generated Proxy code. Gradually it became understood that the fact that systems are connected by an unreliable, slow medium must be explicitly reflected in the program code.

Web services technologies appeared. People started talking about the ABC: Address, Binding, Contract. It is not entirely clear why contracts appeared, since they are essentially envelopes for input arguments. Contracts often complicate the overall model rather than simplify it. But... it doesn't matter.

Now the programmer explicitly created a service (Service) or a client (Client) calling the service. The service consisted of a set of operations (Operation), each of which took a request (Request) as input and produced a response (Response). The client explicitly sent (Send) the request; the service explicitly received (Receive) it and replied (Send) with the response. The client received the response, and the call ended.

Just like in RPC, there was a Proxy and Dispatcher running somewhere. And as before, their code was generated automatically and the programmer did not need to understand it. Unless the client explicitly used classes from Proxy.

Requests and responses are explicitly converted to a format intended for transmission over the wire. Most often this is a byte array. The conversion is called Serialization and Deserialization, and it is sometimes hidden in the Proxy code.
The culmination of messaging was the emergence of the ESB (Enterprise Service Bus) paradigm. No one can really articulate what it is, but everyone agrees that data moves through the ESB in the form of messages.
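
The explicit request/response style described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions: JSON is chosen arbitrarily as the wire format, the service is called in-process instead of over a network, and every class and function name is hypothetical.

```python
import json
from dataclasses import dataclass, asdict
from typing import Any, Optional


@dataclass
class Request:
    operation: str
    arguments: dict


@dataclass
class Response:
    result: Any
    error: Optional[str] = None


def serialize(message):
    """Convert a message object to a byte array for the wire."""
    return json.dumps(asdict(message)).encode("utf-8")


def deserialize(cls, data):
    """Rebuild a message object of type cls from wire bytes."""
    return cls(**json.loads(data.decode("utf-8")))


def service_receive(wire_bytes):
    """Service: explicitly receive a request and send back a response."""
    request = deserialize(Request, wire_bytes)
    if request.operation == "add":
        args = request.arguments
        response = Response(result=args["a"] + args["b"])
    else:
        response = Response(result=None, error="unknown operation")
    return serialize(response)


def client_send(request):
    """Client: explicitly send a request and wait for the response."""
    return deserialize(Response, service_receive(serialize(request)))


reply = client_send(Request("add", {"a": 2, "b": 3}))
print(reply.result)   # 5
```

Unlike the RPC style, the caller here handles message objects directly and must inspect reply.error itself.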

REST

In the constant struggle with code complexity, programmers took the next step and created REST.

The main principle of REST is that the set of operations is sharply restricted to CRUD: Create, Read, Update, Delete. In this model, all operations are always applied to some data. The operations available in CRUD are sufficient for most applications. Since REST technologies in most cases imply the use of HTTP, the CRUD operations are mapped onto the HTTP methods (POST, GET, PUT, DELETE). It is constantly stated that REST is not necessarily tied to HTTP. In practice, however, mapping operation signatures onto the syntax of HTTP requests is widely used. For example, calling the function

EntityAddress ReadEntityAddress(string param1, string param2)

Expressed like this:

GET: entityAddress?param1=value1&param2=value2
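
The mapping from a function signature to a query string can be sketched with Python's standard urllib.parse module. The helper name read_entity_address_url is hypothetical; only the resource name and parameter names come from the example above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def read_entity_address_url(param1, param2):
    """Map the ReadEntityAddress(param1, param2) call onto a GET target."""
    query = urlencode({"param1": param1, "param2": param2})
    return "entityAddress?" + query


url = read_entity_address_url("value1", "value2")
print(url)   # entityAddress?param1=value1&param2=value2

# The receiving side recovers the arguments from the query string.
print(parse_qs(urlsplit(url).query))
```

Using urlencode instead of string concatenation also escapes characters such as spaces and ampersands inside the values.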

Conclusion

Before starting a discussion on distributed systems or integration, agree on the terminology. While Proxy means the same thing in every context, request, for example, means little in RPC terms, and marshalling causes confusion when discussing REST technologies.



Programs communicating over a network need a communication mechanism. A low-level mechanism might send a signal when packets arrive, which is then processed by a network signal handler. A high-level mechanism might be the rendezvous adopted in the Ada language. NFS uses the remote procedure call (RPC) mechanism, in which the client communicates with the server (see Figure 1). In this scheme, the client first calls a procedure that sends a request to the server. When the request packet arrives, the server calls a procedure that unpacks it, performs the requested service, and sends a response, after which control returns to the client.

The RPC interface can be thought of as consisting of three layers:

  1. The top level is completely transparent. A program at this level may, for example, contain a call to the rnusers() procedure, which returns the number of users on a remote machine. You do not need to know that the RPC mechanism is being used, since you simply make the call in your program.
  2. The middle level is intended for the most common applications. RPC calls at this level are handled by the registerrpc() and callrpc() routines: registerrpc() obtains a unique system-wide number, and callrpc() executes a remote procedure call. The rnusers() call is implemented using these two routines.
  3. The lowest level is used for more complex tasks that change the default values of the procedure parameters. At this level, you can explicitly manipulate the sockets used to transmit RPC messages.

As a general rule, you should use the top level and avoid using the lower levels unless absolutely necessary.

Although in this tutorial we only consider the C interface, calls to remote procedures can be made from any language. The operation of the RPC mechanism for organizing communication between processes on different machines is no different from its operation on one machine.

RPC (Remote Procedure Call) is an interface between remote users and certain host programs that are launched at the request of these users. The RPC service of any host, as a rule, provides clients with a set of programs. Each of these programs consists, in turn, of one or more remote procedures. For example, an NFS remote file system service that is built on RPC calls may consist of only two programs: for example, one program interacts with high-level user interfaces and the other with low-level I/O functions.

Each remote procedure call involves two parties: the active client, which sends the procedure call request to the server, and the server, which sends the response to the client.

Note. It should be kept in mind that the terms "client" and "server" in this case refer to a specific transaction. A particular host or software (process or program) can act as both a client and a server. For example, a program that provides a remote procedure service can at the same time be a client for working with a network file system.

The RPC protocol is built on a remote procedure call model, similar to the local procedure call mechanism. When you call a local procedure, you place arguments at a specific location in memory, the stack, or environment variables, and transfer control of the process to a specific address. Once the job is complete, you read the results at a specific address and continue with your process.

When working with a remote procedure, the main difference is that the remote function call is handled by two processes: a client process and a server process.

The client process sends a message to the server that includes the parameters of the called procedure and waits for a response message with the results. When the response is received, the result is read and the process continues. On the server side, the call handler process is in a waiting state; when a message arrives, it reads the parameters of the procedure, executes it, sends a response, and returns to the waiting state until the next call.

The RPC protocol imposes no requirements on additional communication between the processes and does not require the functions performed to be synchronous; that is, calls can be asynchronous and mutually independent, so the client can perform other work while waiting for a response. The RPC server can allocate a separate process or virtual machine for each function and can therefore accept new requests immediately, without waiting for previous ones to complete.

However, there are several important differences between local and remote procedure calls:

  1. Error handling. The client must always be notified of errors that occur on the server or in the network during a remote procedure call.
  2. Global variables. Because the server does not have access to the client's address space, remote procedure calls cannot use hidden parameters in the form of global variables.
  3. Performance. The speed of executing remote procedures is usually one or two orders of magnitude lower than the speed of executing similar local procedures.
  4. Authentication. Because remote procedure calls occur over the network, client authentication mechanisms must be used.

Principles of protocol construction.

The RPC protocol can use several different transport protocols. The RPC protocol itself is responsible only for defining the standards for messages and their interpretation. The reliability and integrity of message transmission are ensured entirely by the transport layer.

However, RPC can control the choice of the transport protocol and some of its functions. As an example of the interaction between RPC and the transport protocol, consider Portmapper, the procedure that assigns a port to an application process working through RPC.

This service dynamically (on demand) assigns an RPC connection to a specific port. Portmapper is used quite often because the set of transport ports reserved for RPC is limited, while the number of processes that can potentially run simultaneously is very large. Portmapper is called, for example, when selecting ports for interaction between a client and an NFS server.

The Portmapper service uses a mechanism for broadcasting RPC messages to a specific port, 111. The client sends a broadcast message to this port requesting the port of a specific RPC service. The Portmapper service processes this message, determines the address of the local RPC service, and sends a response to the client. The Portmapper service can work over both TCP and UDP.

RPC can work with various transport protocols, but never duplicates their functions, i.e. if RPC runs on top of TCP, all concerns about the reliability and validity of the RPC connection are assigned to TCP. However, if the RPC protocol is installed on top of UDP, it can provide additional features of its own to ensure message delivery is guaranteed.
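
One common way for an RPC layer to add delivery guarantees on top of an unreliable transport is to retransmit a request when no reply arrives in time. The Python sketch below simulates this with a deterministic lossy transport instead of real UDP sockets. Everything in it is an assumption made for illustration: the function names, the retry limit, and the use of None to represent a timeout.

```python
def make_lossy_transport(handler, drop_first):
    """Return a transport that loses the first `drop_first` requests."""
    state = {"calls": 0}

    def transport(request):
        state["calls"] += 1
        if state["calls"] <= drop_first:
            return None                 # simulate a timeout: no reply arrived
        return handler(request)

    return transport, state


def call_with_retry(transport, request, max_attempts=3):
    """Resend the same request until a reply arrives or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        reply = transport(request)
        if reply is not None:
            return reply, attempt
    raise TimeoutError("no reply after %d attempts" % max_attempts)


transport, state = make_lossy_transport(lambda req: req.upper(), drop_first=2)
print(call_with_retry(transport, b"ping"))   # (b'PING', 3)
```

In a real system the reply, not the request, may be the message that was lost, so a retransmitted call can execute more than once on the server. This is one reason responses need to be matched to requests.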

Note. Applications can regard the RPC protocol as a network version of the jump-to-subroutine (JSR, Jump Subroutine) instruction.

For the RPC protocol to work, the following conditions must be met:

  1. Unique identification of all remotely called procedures on a given host. RPC requests contain three identifier fields: the number of the remote program (service), the version number of the remote program, and the number of the remote procedure within that program. The program number is assigned by the service manufacturer, and the procedure number indicates the specific function of this service.
  2. Identification of the RPC protocol version. RPC messages contain an RPC protocol version field. It is used to coordinate the formats of passed parameters when the client is working with different versions of RPC.
  3. Providing client authentication mechanisms to the server. The RPC protocol provides a procedure for authenticating the client to the service, and, if necessary, each time a request is made or a response is sent to the client. In addition, RPC allows the use of various additional security mechanisms.

RPC can use four types of authentication mechanisms:

  • AUTH_NULL - no authentication required
  • AUTH_UNIX - authentication according to the UNIX standard
  • AUTH_SHORT - UNIX standard authentication with its own encoding structure
  • AUTH_DES - authentication according to the DES standard
  4. Identification of response messages to corresponding requests. RPC response messages contain the identifier of the request from which they were constructed. This identifier can be called the RPC call transaction identifier. This mechanism is especially necessary when working in asynchronous mode and when performing a sequence of several RPC calls.
  5. Identification of protocol errors. All network or server errors have unique identifiers, by which each of the connection participants can determine the cause of the failure.
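The identifier fields listed above can be illustrated with a minimal sketch that writes the fixed part of a call message into a byte buffer. The field order (transaction identifier, message type, RPC version, program, version, procedure) and the constant values follow RFC 1057; the function names `encode_call_header` and `reply_matches` and the helper functions are assumptions made for this illustration, and the credentials and verifier that follow in a real message are omitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Message type and protocol version values as defined by RFC 1057. */
enum { RPC_MSG_CALL = 0, RPC_MSG_REPLY = 1, RPC_PROTOCOL_VERSION = 2 };

/* Hypothetical helper: store a 32-bit value in big-endian order. */
static void put_u32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Hypothetical helper: read a big-endian 32-bit value. */
static uint32_t get_u32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Write the fixed part of a call message. The buffer must hold at
 * least 24 bytes. Returns the number of bytes written.
 */
size_t encode_call_header(uint8_t *buf, uint32_t xid, uint32_t prog,
                          uint32_t vers, uint32_t proc)
{
    put_u32(buf + 0, xid);                   /* transaction identifier */
    put_u32(buf + 4, RPC_MSG_CALL);          /* this message is a call */
    put_u32(buf + 8, RPC_PROTOCOL_VERSION);  /* RPC protocol version   */
    put_u32(buf + 12, prog);                 /* remote program number  */
    put_u32(buf + 16, vers);                 /* program version number */
    put_u32(buf + 20, proc);                 /* procedure number       */
    return 24;
}

/* A reply belongs to a request when it starts with the same xid. */
int reply_matches(const uint8_t *reply, uint32_t expected_xid)
{
    return get_u32(reply) == expected_xid;
}
```

A client that has several calls outstanding could keep the transaction identifiers it has sent and use a check like `reply_matches` to pair each incoming reply with its request.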

Protocol message structures

When sending RPC messages over a transport protocol, multiple RPC messages can be located within a single transport packet. In order to separate one message from another, a record marker (RM - Record Marker) is used. Each RPC message is "marked" with exactly one RM.

An RPC message can consist of several fragments. Each fragment consists of a four-byte header followed by 0 to 2**31-1 bytes of data. The first (most significant) bit of the header indicates whether the fragment is the last one, and the remaining 31 bits give the length of the fragment data.
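The fragment header described above can be sketched as a few bit operations. This is a minimal illustration, not library code: the names `rm_make_header`, `rm_is_last`, `rm_length`, `rm_write_header` and `rm_read_header` are hypothetical, and the header is assumed to be transmitted most significant byte first.

```c
#include <stdint.h>

#define RM_LAST_FRAGMENT 0x80000000u  /* top bit: last fragment flag */
#define RM_LENGTH_MASK   0x7fffffffu  /* low 31 bits: fragment length */

/* Build a fragment header from a data length and a "last" flag. */
uint32_t rm_make_header(uint32_t length, int is_last)
{
    return (length & RM_LENGTH_MASK) | (is_last ? RM_LAST_FRAGMENT : 0u);
}

/* Return 1 if the header marks the last fragment of a record. */
int rm_is_last(uint32_t header)
{
    return (header & RM_LAST_FRAGMENT) != 0;
}

/* Extract the fragment data length in bytes. */
uint32_t rm_length(uint32_t header)
{
    return header & RM_LENGTH_MASK;
}

/* Serialize the header into four bytes, most significant byte first. */
void rm_write_header(uint8_t out[4], uint32_t header)
{
    out[0] = (uint8_t)(header >> 24);
    out[1] = (uint8_t)(header >> 16);
    out[2] = (uint8_t)(header >> 8);
    out[3] = (uint8_t)header;
}

/* Parse four received bytes back into a header value. */
uint32_t rm_read_header(const uint8_t in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}
```

A receiver reading a byte stream would read four bytes, call `rm_read_header`, read `rm_length` bytes of data, and repeat until `rm_is_last` reports the end of the record.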

The RPC structure is formally described in the language for describing and representing data formats - XDR with additions regarding the description of procedures. You could even say that the RPC description language is an extension of XDR, complemented by work with procedures.

The structure of the RPC packet looks like this:


The response structure (reply_body) can contain either an error structure (in which case it contains the error code) or a successful request processing structure (in which case it contains the return data).

High level software interface.

Using subroutines in a program is a traditional way to structure a task and make it clearer. The most frequently used routines are collected in libraries where they can be used by various programs. In this case, we are talking about a local call, i.e. both the calling and the called objects work within the same program on the same computer.

In the case of a remote call, a process running on one computer starts a process on the remote computer (that is, it actually runs procedure code on the remote computer). Obviously, a remote procedure call differs significantly from a traditional local one, but from the programmer’s point of view, such differences are practically absent, i.e., the architecture of a remote procedure call allows you to simulate a local procedure call.

However, if in the case of a local call the program passes parameters to the called procedure and receives the result of the work through the stack or shared memory areas, then in the case of a remote call the transfer of parameters turns into the transmission of a request over the network, and the result of the work is in the received response.

This approach is a possible basis for creating distributed applications, and although many modern systems do not use this mechanism, the basic concepts and terms are retained in many cases. When describing the RPC mechanism, we will traditionally call the calling process a client, and the remote process that implements the procedure a server.

A remote procedure call involves the following steps:

  1. The client program makes a local call to a procedure called a stub. To the client it “seems” that, by calling the stub, it is actually calling the server procedure. Indeed, the client passes the necessary parameters to the stub, and the stub returns the result. However, things are not quite as the client imagines. The stub's job is to accept the arguments intended for the remote procedure, possibly convert them into some standard format, and build a network request. Packaging the arguments and creating the network request is called marshalling.
  2. The network request is forwarded across the network to the remote system. To do this, the stub uses appropriate calls, for example, those discussed in the previous sections. Note that various transport protocols can be used, and not only the TCP/IP family.
  3. On the remote host, everything happens in the reverse order. The server stub listens for a request and, upon receipt, retrieves the parameters - the arguments to the procedure call. Unmarshalling may involve necessary transformations (for example, byte order changes).
  4. The stub makes a call to the real server procedure to which the client's request is addressed, passing it the arguments received over the network.
  5. After the procedure is completed, control returns to the server stub, which receives the returned values. Like the client stub, the server stub converts the values returned by the procedure and generates a network response message that is transmitted over the network to the system from which the request came.
  6. The operating system passes the received message to the client stub, which, after the necessary transformation, passes the values (those returned by the remote procedure) to the client, which treats this as a normal return from the procedure.
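The six steps above can be sketched in a single program in which the network is replaced by a direct function call. This is only an illustration under that assumption: the procedure `add`, the stub names and the eight-byte request layout are all hypothetical, and a real stub would send the request buffer through a socket instead.

```c
#include <stddef.h>
#include <stdint.h>

/* The "real" server procedure that the client ultimately wants to run. */
static int32_t add_impl(int32_t a, int32_t b)
{
    return a + b;
}

/* Marshal a 32-bit integer into four bytes, most significant first. */
static void put_i32(uint8_t *p, int32_t v)
{
    uint32_t u = (uint32_t)v;
    p[0] = (uint8_t)(u >> 24);
    p[1] = (uint8_t)(u >> 16);
    p[2] = (uint8_t)(u >> 8);
    p[3] = (uint8_t)u;
}

/* Unmarshal four bytes back into a 32-bit integer. */
static int32_t get_i32(const uint8_t *p)
{
    uint32_t u = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    return (int32_t)u;
}

/*
 * Server stub: unmarshal the arguments, call the real procedure,
 * marshal the result. Returns the reply length, or 0 on a bad request.
 */
size_t add_server_stub(const uint8_t *request, size_t request_len,
                       uint8_t *reply)
{
    if (request_len != 8)
        return 0;
    int32_t a = get_i32(request);
    int32_t b = get_i32(request + 4);
    put_i32(reply, add_impl(a, b));
    return 4;
}

/*
 * Client stub: has the signature of a local procedure, but marshals the
 * arguments into a request and unmarshals the result from the reply.
 */
int32_t add(int32_t a, int32_t b)
{
    uint8_t request[8];
    uint8_t reply[4];

    put_i32(request, a);
    put_i32(request + 4, b);

    /* Stand-in for the network: hand the bytes to the server stub. */
    size_t n = add_server_stub(request, sizeof request, reply);

    return n == 4 ? get_i32(reply) : 0;
}
```

The caller simply writes `add(2, 3)` and never sees the request and reply buffers, which is the effect the stubs are meant to achieve.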

Thus, from the client's point of view, it makes a call to a remote procedure as it would for a local one. The same can be said about the server: a procedure is called in a standard way, a certain object (server stub) calls a local procedure and receives the values returned by it. The client treats the stub as a callable server procedure, and the server treats its own stub as a client.

Thus, stubs form the core of the RPC system, responsible for all aspects of message generation and transmission between the client and the remote server (procedure), although both client and server believe that the calls are occurring locally. This is the basic concept of RPC - to completely hide the distributed (network) nature of the interaction in the stub code. The advantages of this approach are obvious: both the client and the server are independent of the network implementation, they both operate within a distributed virtual machine, and procedure calls have a standard interface.

Passing parameters

Passing value parameters does not cause any particular difficulties. In this case, the client stub places the parameter value in the network request, possibly performing conversions to the standard form (for example, reversing the byte order). The situation is much more complicated with the passing of pointers, when the parameter represents the address of the data, and not its value. Passing the address in the request is meaningless, since the remote procedure is executed in a completely different address space. The simplest solution used in RPC is to prohibit clients from passing parameters other than by value, although this, of course, imposes serious restrictions.

Binding

Before a client can call a remote procedure, it must be associated with a remote system that hosts the required server. Thus, the binding task breaks down into two:

  1. Finding a remote host with the required server
  2. Finding the required server process on a given host

Various approaches can be used to find the host. A possible option is to create some kind of centralized directory in which hosts advertise their servers, and where the client, if desired, can select the host and procedure address that are suitable for him.

Each RPC procedure is uniquely identified by a program and procedure number. The program number identifies a group of remote procedures, each of which has its own number. Each program is also assigned a version number, so that if you make minor changes to the program (for example, adding a procedure), there is no need to change its number. Typically, several functionally similar procedures are implemented in one software module, which, when launched, becomes the server of these procedures, and which is identified by the program number.

Thus, when a client wants to call a remote procedure, it needs to know the program, version, and procedure numbers that provide the required service.

To transmit a request, the client also needs to know the host network address and the port number associated with the server program providing the required procedures. The portmap(1M) daemon is used for this (called rpcbind(1M) on some systems). The daemon runs on the host that provides the remote procedure service and uses a well-known port number. When a server process is initialized, it registers its procedures and port numbers with portmap(1M). Now, when a client needs to know the port number to call a specific procedure, it sends a request to the portmap(1M) server, which in turn either returns the port number or forwards the request directly to the remote procedure server and, after it has been executed, returns a response to the client. In any case, if the required procedure exists, the client receives the procedure port number from the portmap(1M) server, and further requests can be made directly to this port.
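The registration and lookup just described can be sketched as a small in-memory table. This is a simplified model, not the portmap implementation: the names `demo_pmap_set` and `demo_pmap_getport` and the fixed table size are assumptions, and the protocol numbers 6 (TCP) and 17 (UDP) are the standard IP protocol numbers. Returning port 0 for an unregistered program mirrors how a port lookup reports "not registered".

```c
#include <stddef.h>
#include <stdint.h>

#define PMAP_MAX_ENTRIES 16

/* One registration: which program/version/transport lives on which port. */
struct pmap_entry {
    uint32_t prog;
    uint32_t vers;
    uint32_t proto;  /* 6 = TCP, 17 = UDP */
    uint16_t port;
    int used;
};

static struct pmap_entry pmap_table[PMAP_MAX_ENTRIES];

/* Look up the port for a program; returns 0 if it is not registered. */
uint16_t demo_pmap_getport(uint32_t prog, uint32_t vers, uint32_t proto)
{
    for (size_t i = 0; i < PMAP_MAX_ENTRIES; i++) {
        const struct pmap_entry *e = &pmap_table[i];
        if (e->used && e->prog == prog && e->vers == vers &&
            e->proto == proto)
            return e->port;
    }
    return 0;
}

/*
 * Called by a server at start-up to announce its port.
 * Returns 1 on success, 0 if already registered or the table is full.
 */
int demo_pmap_set(uint32_t prog, uint32_t vers, uint32_t proto,
                  uint16_t port)
{
    if (demo_pmap_getport(prog, vers, proto) != 0)
        return 0;
    for (size_t i = 0; i < PMAP_MAX_ENTRIES; i++) {
        if (!pmap_table[i].used) {
            pmap_table[i].prog = prog;
            pmap_table[i].vers = vers;
            pmap_table[i].proto = proto;
            pmap_table[i].port = port;
            pmap_table[i].used = 1;
            return 1;
        }
    }
    return 0;
}
```

In this model the server side calls `demo_pmap_set` once when it starts, and each client calls `demo_pmap_getport` before its first request and then talks to the returned port directly.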

Handling exceptional situations (exceptions)

Handling exceptions when calling local procedures is not particularly problematic. UNIX provides processing for process errors such as division by zero, access to an invalid memory area, etc. In the case of a remote procedure call, the likelihood of error situations increases. In addition to server and stub errors, errors associated, for example, with receiving an erroneous network message are added.

For example, when using UDP as a transport protocol, messages are retransmitted after a certain timeout. An error is returned to the client if, after a certain number of attempts, a response from the server has not been received. In the case where the TCP protocol is used, an error is returned to the client if the server has closed the TCP connection.
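The retransmission behavior over UDP can be sketched as a bounded retry loop. In this illustration the transport is abstracted behind a callback that sends the request and waits for one timeout period; the names `call_with_retries`, `send_and_wait_fn` and `lossy_transport` are hypothetical, and `lossy_transport` is only a test double that pretends to lose a given number of messages.

```c
/* Result codes returned to the caller of the sketch. */
enum { CALL_OK = 0, CALL_TIMED_OUT = -1 };

/*
 * Hypothetical transport callback: sends the request, waits for one
 * timeout period, and returns 1 if a reply arrived, 0 otherwise.
 */
typedef int (*send_and_wait_fn)(void *ctx);

/*
 * Retransmit the request until a reply arrives or the attempt limit
 * is reached; in the latter case an error is reported to the client.
 */
int call_with_retries(send_and_wait_fn send_and_wait, void *ctx,
                      int max_attempts)
{
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (send_and_wait(ctx))
            return CALL_OK;
    }
    return CALL_TIMED_OUT;
}

/* Test double: "loses" the first *ctx messages, then delivers. */
int lossy_transport(void *ctx)
{
    int *remaining_losses = (int *)ctx;
    if (*remaining_losses > 0) {
        (*remaining_losses)--;
        return 0;
    }
    return 1;
}
```

Note that a timeout here tells the client nothing about whether the server executed the procedure; it only says that no reply was seen.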

Call semantics

Calling a local procedure unambiguously leads to its execution, after which control returns to the main program. The situation is different when calling a remote procedure. It is impossible to determine when exactly the procedure will be performed, whether it will be performed at all, and if so, how many times? For example, if the request is received by the remote system after the server program has crashed, the procedure will not be executed at all. If the client, when not receiving a response after a certain period of time (timeout), resends the request, then a situation may arise when the response is already transmitted over the network, and the repeated request is again accepted for processing by the remote procedure. In this case, the procedure will be performed several times.

Thus, the execution of a remote procedure can be characterized by the following semantics:

  • Once and only once (exactly once). This behavior (in some cases the most desirable) is difficult to guarantee because of possible server crashes.
  • At most once. This means that the procedure was either not executed at all or was executed only once. This conclusion can be drawn when an error is received instead of a normal response.
  • At least once. The procedure was executed at least once, and possibly more. To work correctly in such a situation, the remote procedure must be idempotent. A procedure has this property if its repeated execution does not cause cumulative changes. For example, reading a file is idempotent, but appending text to a file is not.
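The difference between the last two semantics can be sketched with a non-idempotent procedure (a counter increment) and a small table of already seen transaction identifiers. This is an illustration only: the handler names, the table size and the slot selection are assumptions, and the table lives in memory, so this sketch says nothing about what happens after a server crash.

```c
#include <stdint.h>

#define DEDUP_SLOTS 8

/* State changed by the non-idempotent procedure. */
static int counter;

/* Replies remembered per transaction identifier. */
static struct {
    uint32_t xid;
    int reply;
    int used;
} seen[DEDUP_SLOTS];

/* Non-idempotent procedure: every execution changes the state. */
static int increment(void)
{
    return ++counter;
}

/* At least once: every arriving request is executed, duplicates too. */
int handle_at_least_once(uint32_t xid)
{
    (void)xid;  /* the identifier is ignored on purpose */
    return increment();
}

/* At most once: a repeated xid gets the cached reply, not a re-run. */
int handle_at_most_once(uint32_t xid)
{
    unsigned slot = xid % DEDUP_SLOTS;

    if (seen[slot].used && seen[slot].xid == xid)
        return seen[slot].reply;

    seen[slot].xid = xid;
    seen[slot].reply = increment();
    seen[slot].used = 1;
    return seen[slot].reply;
}

/* Expose the state so the effect of duplicates can be observed. */
int current_counter(void)
{
    return counter;
}
```

With the first handler a retransmitted request increments the counter twice; with the second, the retransmission returns the remembered reply and the counter changes only once.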

Data presentation

When the client and server are running on the same system on the same computer, there are no data incompatibility problems. For both the client and the server, data in binary form is represented in the same way. In the case of a remote call, the matter is complicated by the fact that the client and server may be running on systems with different architectures, having different data representations (for example, floating point representation, byte order, etc.).

Most RPC system implementations define some standard data representation to which all values passed in requests and responses must be converted.

For example, the data representation used in RPC from Sun Microsystems (XDR) is as follows:

  1. Byte order: big-endian (most significant byte first)
  2. Floating-point representation: IEEE
  3. Character representation: ASCII
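As a minimal sketch of such a standard representation, the functions below encode an integer and a string the way XDR lays them out: integers as four bytes with the most significant byte first, and strings as a four-byte length followed by the characters, zero-padded to a multiple of four bytes. The names `demo_xdr_put_int` and `demo_xdr_put_string` are hypothetical and are not the functions of the XDR library; the caller is assumed to supply a large enough buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode a 32-bit integer, most significant byte first. Returns 4. */
size_t demo_xdr_put_int(uint8_t *out, int32_t value)
{
    uint32_t u = (uint32_t)value;
    out[0] = (uint8_t)(u >> 24);
    out[1] = (uint8_t)(u >> 16);
    out[2] = (uint8_t)(u >> 8);
    out[3] = (uint8_t)u;
    return 4;
}

/*
 * Encode a string: length word, then the bytes, then zero padding up
 * to the next multiple of four. Returns the number of bytes written.
 */
size_t demo_xdr_put_string(uint8_t *out, const char *s)
{
    size_t len = strlen(s);
    size_t padded = (len + 3) & ~(size_t)3;  /* round up to 4 bytes */

    demo_xdr_put_int(out, (int32_t)len);
    memcpy(out + 4, s, len);
    memset(out + 4 + len, 0, padded - len);
    return 4 + padded;
}
```

For the five-character string "hello" this produces twelve bytes: the length word 00 00 00 05, the five characters, and three zero bytes of padding.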

Network

In terms of functionality, the RPC system occupies an intermediate place between the application layer and the transport layer. According to the OSI model, this provision corresponds to the presentation and session layers. Thus, RPC is theoretically independent of the network implementation, in particular, of the transport layer network protocols.

Software implementations of the system, as a rule, support one or two protocols. For example, the RPC system developed by Sun Microsystems supports message transmission using the TCP and UDP protocols. The choice of one protocol or another depends on the application requirements. The choice of the UDP protocol is justified for applications that have the following characteristics:

  • Called procedures are idempotent
  • The size of the transmitted arguments and the returned result is less than the size of the UDP packet - 8 KB.
  • The server provides work with several hundred clients. Since when working with TCP protocols, the server is forced to maintain a connection with each of the active clients, this takes up a significant part of its resources. The UDP protocol is less resource-intensive in this regard

On the other hand, TCP provides efficient operation of applications with the following characteristics:

  • The application requires a reliable transfer protocol
  • Called procedures are not idempotent
  • The size of the arguments or return result exceeds 8 KB

The choice of protocol is usually left to the client, and the system organizes the generation and transmission of messages in different ways. Thus, when using the TCP protocol, for which the transmitted data is a stream of bytes, it is necessary to separate messages from each other. For this purpose, for example, the record marking protocol described in RFC1057 "RPC: Remote Procedure Call Protocol specification version 2" is used, in which a 32-bit integer is placed at the beginning of each message, defining the size of the message in bytes.

The situation is different with the semantics of the call. For example, if RPC is performed over an unreliable transport protocol (UDP), the system retransmits the message at short intervals (timeouts). If the client application does not receive a response, it can only conclude that the procedure has been executed zero or more times. If a response is received, the application can conclude that the procedure was executed at least once. When a reliable transport protocol (TCP) is used and a response is received, the procedure can be said to have been executed exactly once. If a response is not received, it is impossible to say definitely that the procedure was not executed.

How does it work?

Essentially, the RPC system itself is built into the client program and the server program. It's nice that when developing distributed applications, you don't have to delve into the details of the RPC protocol or program message processing. The system assumes the existence of an appropriate development environment, which greatly simplifies the life of application software creators. One of the key points in RPC is that the development of a distributed application begins with the definition of an object interface - a formal description of the server's functions, written in a special language. Based on this interface, client and server stubs are then automatically generated. The only thing you need to do after this is write the actual code for the procedure.

As an example, consider RPC from Sun Microsystems. The system consists of three main parts:

  • rpcgen(1) is an RPC compiler that, based on the description of the remote procedure interface, generates client and server stubs in the form of C programs.
  • The XDR (eXternal Data Representation) library, which contains functions for converting various data types into a machine-independent form that allows information exchange between heterogeneous systems.
  • A library of modules that ensure the operation of the system as a whole.

Let's look at an example of a simple distributed event logging application. When the client starts, it calls a remote procedure to write a message to the log file of the remote computer.

To do this, you will have to create at least three files: the specification of the remote procedure interfaces, log.x (in the interface description language); the actual text of the remote procedures, log.c; and the text of the main client program containing main(), client.c (both in C).

The rpcgen(1) compiler creates three files based on the log.x specification: the text of the client and server stubs in C (log_clnt.c and log_svc.c) and the header file log.h, used by both stubs.

So, let's look at the source codes of the programs.

The log.x file specifies the registration parameters of the remote procedure (program, version and procedure numbers) and also defines the call interface (input arguments and return values). Thus, an RLOG procedure is defined that takes a string as an argument (which will be written to the log), and the return value indicates, in the usual way, the success or failure of the requested operation.


program LOG_PROG {
    version LOG_VER {
        int RLOG(string) = 1;
    } = 1;
} = 0x31234567;

The rpcgen(1) compiler creates a header file log.h, where, in particular, the procedures are defined:


Let's look at this file carefully. The compiler translates the RLOG name defined in the interface description file into rlog_1, replacing uppercase characters with lowercase ones and appending the program version number after an underscore. The return type has changed from int to int*. This is the rule: RPC allows you to transmit and receive only the addresses of the parameters declared in the interface description. The same rule applies to the string passed as an argument. Although it is not apparent from log.h, the rlog_1() function actually receives the address of the string as its argument.

In addition to the header file, the rpcgen(1) compiler produces client stub and server stub modules. Essentially, the text of these files contains all the code for the remote call.

The server stub is the main program that handles all network interactions with the client (more precisely, with its stub). To perform the operation, the server stub makes a local function call, the text of which must be written:
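A minimal sketch of what such a server procedure might look like is given below. It is an illustration under several assumptions rather than the listing of log.c: the log file name `rlog_demo.log` is hypothetical, the function uses the older rpcgen calling convention with a single argument (newer versions generate a different name and an extra request argument), and the RPC headers are left out so that the fragment compiles on its own. It does follow the two rules described above: the argument arrives as the address of the string, and the result is returned as the address of an int.

```c
#include <stdio.h>

/* Hypothetical log file name used only for this sketch. */
#define LOG_FILE "rlog_demo.log"

/*
 * Server procedure called by the server stub. Appends the message to
 * the log file and returns the address of a status value:
 * 0 on success, 1 on failure.
 */
int *rlog_1(char **arg)
{
    /* Static, because the stub reads the value after we return. */
    static int result;
    FILE *fp;

    fp = fopen(LOG_FILE, "a");
    if (fp == NULL) {
        result = 1;
        return &result;
    }

    /* Write the message passed by the client on its own line. */
    result = (fprintf(fp, "%s\n", *arg) < 0) ? 1 : 0;

    if (fclose(fp) != 0)
        result = 1;

    return &result;
}
```

The status value is what the client later checks through the pointer returned by its own stub.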


The client stub accepts the argument passed to the remote procedure, does the necessary conversions, issues a request to the portmap(1M) server, communicates with the remote procedure server, and finally passes the return value to the client. For the client, a call to a remote procedure is reduced to a call to a stub and is no different from a regular local call.

client.c


#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <rpc/rpc.h>
#include "log.h"

int main(int argc, char *argv[])
{
    CLIENT *cl;
    char *server, *mystring, *clnttime;
    time_t bintime;
    int *result;

    if (argc != 2) {
        fprintf(stderr, "Call format: %s Host_Address\n", argv[0]);
        exit(1);
    }
    server = argv[1];

    /* Get the client handle. If unsuccessful, report that a
       connection with the server cannot be established */
    if ((cl = clnt_create(server, LOG_PROG, LOG_VER, "udp")) == NULL) {
        clnt_pcreateerror(server);
        exit(2);
    }

    /* Allocate a buffer for the line */
    mystring = (char *)malloc(100);
    if (mystring == NULL)
        exit(2);

    /* Determine the time of the event */
    bintime = time((time_t *)NULL);
    clnttime = ctime(&bintime);
    snprintf(mystring, 100, "%s - Client started", clnttime);

    /* Send a message for the log: the time the client started working.
       If unsuccessful, report an error */
    if ((result = rlog_1(&mystring, cl)) == NULL) {
        fprintf(stderr, "error2\n");
        clnt_perror(cl, server);
        exit(3);
    }

    /* In case of failure on the remote computer, report an error */
    if (*result != 0)
        fprintf(stderr, "Error writing to log\n");

    /* Free the handle */
    clnt_destroy(cl);
    exit(0);
}

The client stub log_clnt.c is compiled with the client.c module to produce an executable client program.


Now on some host server.nowhere.ru you need to start a server process:


$ logger

Then, when you run the rlog client on another machine, the server will add a corresponding entry to the log file.

The RPC operation diagram in this case is shown in Fig. 1. Modules interact as follows:

  1. When the server process starts, it creates a UDP socket and binds it to an arbitrary local port. Next, the server calls the library function svc_register(3N) to register its program and version numbers. To do this, the function contacts the portmap(1M) process and passes the required values. The portmap(1M) server is usually started when the system is initialized and binds to a well-known port. Now portmap(1M) knows the port number for our program and version, and the server waits for requests. Note that all the described actions are performed by the server stub created by the rpcgen(1) compiler.
  2. When rlog runs, the first thing it does is call the library function clnt_create(3N), giving it the remote system address, the program and version numbers, and the transport protocol. The function sends a request to the portmap(1M) server of the remote system server.nowhere.ru and obtains the remote port number for the log server.
  3. The client calls the rlog_1() procedure defined in the client stub and transfers control to the stub. The stub, in turn, builds a request (converting the arguments to XDR format) in the form of a UDP packet and sends it to the remote port obtained from the portmap(1M) server. It then waits for a response for some time and, if none is received, resends the request. Under favorable circumstances, the request is accepted by the logger server (the server stub module). The stub determines which function was called (by procedure number) and calls the rlog_1() function of the log.c module. After control returns to the stub, the latter converts the value returned by the rlog_1() function into XDR format and generates a response, also in the form of a UDP packet. Upon receiving the response, the client stub extracts the returned value, transforms it, and returns it to the client's main program.

The idea of a Remote Procedure Call (RPC) is to extend the well-known and understood mechanism for transferring control and data within a program running on a single machine to transferring control and data over a network. That is, the client application accesses procedures stored on the server. Remote procedure call tools are designed to facilitate the organization of distributed computing. The greatest efficiency of using RPC is achieved in those applications in which there is interactive communication between remote components with fast response times and a relatively small amount of data transferred. Such applications are called RPC-oriented.

The characteristic features of RPC are:

Asymmetry, that is, one of the interacting parties is the initiator;

Synchronicity, that is, the execution of the calling procedure is suspended from the moment the request is issued and is resumed only after the called procedure returns.

There are several implementations of remote procedure calls on different operating systems. The UNIX operating system uses a mechanism of the same name (Remote Procedure Call, RPC). It is built into the core of the system, and its operation is supported by the RPC protocol. In Windows operating systems, remote procedure calls developed on the basis of OLE mechanisms, which gradually evolved into DCOM (Distributed Component Object Model) technology. This technology makes it possible to build fairly powerful distributed network computing environments. It uses proprietary Microsoft protocols.

How RPC works

Before the direct call, special structures (procedures, files) must be created on the client and server sides - these are the so-called client stub and server skeleton, which are necessary for the correct operation of RPC. Most often, they are generated automatically by special utilities using the main program code.

When a remote procedure is called in a distributed system, the following actions occur:

1. The client procedure calls the stub as a normal procedure. The stub packs the parameters (marshalling).

2. The stub calls the OS kernel.

3. The kernel sends a message to the remote machine (to the kernel of the remote computer).

4. The remote kernel passes the received message to the skeleton of the server process.

5. The skeleton unpacks the parameters (unmarshalling) and calls the required procedure.

6. The procedure executes on the server and returns its results to the skeleton.

7. The skeleton packs the result.

8. The skeleton passes the result to the kernel.

9. The server kernel sends the message over the network to the client kernel.

10. The client kernel passes the message to the stub. The stub unpacks the result.

11. The stub returns the result to the client process.

Remote Procedure Call (RPC) service in Windows OS

In order to understand the importance of the remote procedure call mechanism, you can at least consider the list of utilities and services that do not work without RPC in Windows 2000. In fact, disabling the RPC service in the specified environment leads to the crash of the entire system. So, the following depend on the Remote Procedure Call (RPC) service:

1. Telnet - allows a remote user to log in and run console programs using the command line.

2. Windows Installer - installs, uninstalls or repairs software according to the instructions of the MSI files.

3. IPSEC Policy Agent - Manages the IP security policy and runs the ISAKMP/Oakley (IKE) and IP security driver.

4. Print spooler - loads files into memory for subsequent printing.

5. Protected Storage - provides secure storage of sensitive data such as private keys to prevent unauthorized access by services, processes or users.

6. Windows Management Instrumentation - provides information about system management.

7. Distributed Link Tracking Client - sends notifications about files moved between NTFS volumes in a network domain.

8. Distributed Transaction Coordinator - Coordinating transactions distributed across multiple databases, message queues, file systems, or other secure transaction resource managers.

9. Routing and Remote Access - Offers routing services to organizations on local and global networks.

10. Task scheduler - allows you to execute programs at the scheduled time.

11. Network Connections - manages objects in the “Network and Dial-up Connections” folder, which displays the properties of local network and remote access connections.

12. COM+ event system - automatic distribution of events to subscribed COM components.

13. Indexing service - indexing for quick search.

14. Messenger - sends and receives messages sent by administrators or by the Alerter service.

15. Fax service - helps you send and receive fax messages.

16. Removable storage - manages removable media, disks and libraries.

17. Telephony - provides support for the Telephony API (TAPI) for programs that manage telephone equipment and voice IP connections on this computer, as well as through the LAN - on servers where the corresponding service is running.

RMI applications

Remote Method Invocation (RMI) is an implementation of RPC ideas for the Java programming language.

RMI is a JavaSoft product developed for Java and integrated into JDK 1.1 and higher. RMI implements a distributed computing model and provides a means of communication between Java programs (Java virtual machines) running on one or more remote computers. RMI allows client and server applications to call, over the network, methods of objects running in other Java virtual machines. The main advantage of RMI is that it provides the programmer with a higher-level programming interface that allows a reference to a remote object to be passed as an argument or returned as a result. RMI requires Java programs to be running on both ends of the connection. The network connection is made using the TCP/IP protocol. The RMI architecture is shown in Fig. "RMI Architecture".

The Client Stub (an adapter for the client: an entity on the client that provides reception/transmission functions) and the Server Skeleton (an adapter for the server: an entity on the server that processes remote calls) are derived from a common interface, but differ in that the Client Stub is simply used to connect to the RMI Registry, while the Server Skeleton is used to communicate directly with server functions.

RMI is actually a new kind of object request broker that is built on the Java object model. Like ORB, RMI introduces five key points:

1. Allows you to move code in addition to data.

2. Practically ensures the safety of execution of loaded code.

3. Allows you to pass objects by value.

4. Uses Java as both an interface definition language and an implementation language.

5. Uses a naming scheme based on the Uniform Resource Locator (URL).

To pass objects by value, RMI converts them into serial form: a stream of bytes transmitted as a parameter in a message over the TCP/IP protocol.

RMI interfaces can be divided into 4 categories:

RMI Core - defines the interfaces needed to make remote method calls;

RMI naming service - defines interfaces and classes that allow you to obtain references to server objects by name;

RMI security - defines a new RMI security manager and class loader interfaces (RMI extends the on-demand Java class loading mechanism to stub loading);

Marshalling (packaging a request, including its parameters, the return value and the request itself, into a standard format suitable for transmission over a network) - RMI defines low-level interfaces for marshalling remote objects, which are used to write Java objects to a stream and to read an object from a stream.

JavaSoft and OMG are working to bring the RMI and CORBA object models closer together. This convergence occurs in two areas:

RMI via IIOP. JavaSoft is developing a version of RMI that runs on top of the IIOP transport. IIOP provides the following benefits to RMI:

1. Built-in support for transaction distribution.

2. ORB based firewall support using IIOP proxy (no HTTP tunneling).

3. Interaction with objects written in other languages through a subset of RMI/IDL.

4. Open standard for distributed objects.

RMI/IDL. The CORBA Java standard in IDL is a CORBA/RMI convergence standard. It allows Java programmers to define CORBA interfaces using Java RMI semantics instead of CORBA IDL. The compiler uses these semantics to automatically generate CORBA IDL, stubs, and skeletons. The RMI/IDL subset allows RMI programs to be called by multilingual CORBA clients using IIOP; it also allows RMI programs to call CORBA objects written in other languages.

RMI over IIOP seems to be a good solution for a CORBA/Java system because it combines two powerful technologies. The main advantage of RMI is that it allows you to quickly and easily create a small distributed system in a purely Java environment. The main disadvantage of RMI is that it cannot be integrated with existing applications.

Comparison of Distributed and Undistributed Java Programs

The developers of RMI aimed to make using distributed Java objects the same as using local objects. The following table lists some important differences.

Interfaces in RMI

The RMI architecture is based on one important principle: defining behavior and implementing that behavior are considered different concepts. RMI makes it possible to separate and execute on different JVMs the code that defines the behavior and the code that implements the behavior.

This meets the requirements of distributed systems in which clients are aware of service definitions and servers provide those services. Specifically, in RMI, the definition of a remote service is encoded using a Java interface. The remote service implementation is coded in a class. So the key to understanding RMI is to remember that interfaces define behavior and classes define implementation.

Remember that Java interfaces do not contain executable code. RMI therefore uses two classes that implement the same interface. The first class is the implementation of the behavior and runs on the server. The second class acts as a proxy for the remote service and runs on the client machine.

The client program calls the proxy object's methods, RMI passes the request to the remote JVM and forwards it to the object's implementation. Any values ​​returned from the implementation are passed back to the proxy object and then to the client program.
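The two-classes-one-interface arrangement can be imitated inside a single JVM with java.lang.reflect.Proxy. The sketch below is an analogy only: the Greeter interface, GreeterImpl, and ProxySketch are hypothetical names, and the handler calls the implementation directly at the point where a real RMI stub would send the request over the network.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Service definition: behavior only, no executable code.
interface Greeter {
    String greet(String name);
}

// Implementation class; in RMI this would run in the server JVM.
class GreeterImpl implements Greeter {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

class ProxySketch {
    // Builds a client-side stand-in that forwards every call to the target.
    // A real RMI stub would marshal the call and send it over the network
    // at this point instead of invoking the target directly.
    static Greeter proxyFor(Greeter target) {
        InvocationHandler forward = (proxy, method, args) -> {
            System.out.println("forwarding " + method.getName());
            return method.invoke(target, args);
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                forward);
    }

    public static void main(String[] args) {
        Greeter client = proxyFor(new GreeterImpl());
        System.out.println(client.greet("RMI"));
    }
}
```

The client code only ever sees the Greeter type, so it cannot tell whether it holds the implementation or the stand-in; that is the property RMI relies on.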

RMI Architecture Layers

An RMI implementation essentially consists of three abstraction layers. The first is the stub and skeleton layer, which lies directly beneath the developer's code. This layer intercepts method calls made by the client through an interface reference variable and forwards them to the remote RMI service.

The next layer is the remote reference layer. This layer understands how to interpret and manage references to remote service objects. In JDK 1.1, this layer connects clients to remote service objects that are running on the server. This connection is a one-to-one (unicast) connection. In the Java 2 SDK, this layer was extended to support activation of dormant remote objects using Remote Object Activation.

The transport layer is based on TCP/IP connections between networked machines. It provides basic connectivity and some firewall penetration strategies. With a layered architecture, each layer can be changed or replaced without affecting the rest of the system. For example, the transport layer could be replaced by one based on UDP/IP without changing the other layers.

Locating remote objects

When considering the RMI architecture, the question arises: "How does the client find the remote RMI service?" Clients find remote services using a naming or directory service. This may sound circular: how can a client find a service by using a service? The circle is broken because the naming or directory service runs on a well-known host and port number (well-known means that everyone in the organization knows about it).

RMI can use many different directory services, including the Java Naming and Directory Interface (JNDI). RMI itself includes a simple service called the RMI registry (rmiregistry). The RMI registry runs on every machine that hosts remote service objects and accepts service requests, using port 1099 by default. On the host, the server program creates a remote service by first creating a local object that implements that service. It then exports this object to RMI. Once the object is exported, RMI creates a listening service that waits for client connections and service requests. After export, the server registers the object in the RMI registry under a public name.

On the client side, the RMI registry is accessed through the static Naming class. It provides a lookup() method that the client uses to query the registry. The lookup() method accepts a URL pointing to the hostname and the name of the required service. The method returns a remote reference to the serving object. The URL takes the following form:

rmi://<host_name>[:<name_service_port>]/<service_name>
where host_name is a name recognized on a local area network (LAN), or a DNS name on the Internet. You only need to specify name_service_port if the name service is running on a port other than the default 1099.
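To make the parts of such a URL concrete, the sketch below takes one apart with java.net.URI and fills in the default registry port when none is given. It only illustrates the URL layout; RmiUrlSketch and describe() are hypothetical names, and the hosts in main() are placeholders.

```java
import java.net.URI;
import java.net.URISyntaxException;

class RmiUrlSketch {
    // Same value as java.rmi.registry.Registry.REGISTRY_PORT.
    static final int DEFAULT_REGISTRY_PORT = 1099;

    // Returns "host:port name" for an rmi:// URL, filling in the default port.
    static String describe(String url) throws URISyntaxException {
        URI uri = new URI(url);
        int port = uri.getPort() == -1 ? DEFAULT_REGISTRY_PORT : uri.getPort();
        String name = uri.getPath().substring(1); // drop the leading '/'
        return uri.getHost() + ":" + port + " " + name;
    }

    public static void main(String[] args) throws URISyntaxException {
        System.out.println(describe("rmi://remotehost/CalculatorService"));
        System.out.println(describe("rmi://localhost:2001/CalculatorService"));
    }
}
```

Run as written, this prints "remotehost:1099 CalculatorService" followed by "localhost:2001 CalculatorService".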

Using RMI

A working RMI system consists of several parts: interface definitions for the remote services, implementations of the remote services, stub and skeleton files, a server that hosts the remote services, an RMI naming service that allows clients to find remote services, a class file provider (an HTTP or FTP server), and a client program that needs the remote services.

Assuming that the RMI system has already been designed, the following steps must be followed to create it:

1. Write and compile Java code for the interfaces.

2. Write and compile Java code for implementation classes.

3. Create stub and skeleton class files from the implementation classes.

4. Write Java code for the program that hosts the remote service.

5. Develop Java code for the RMI client program.

6. Install and launch the RMI system.

Example RMI application

The first step is to write and compile the Java code for the service interfaces. The Calculator interface defines all remote capabilities offered by the service:

public interface Calculator extends java.rmi.Remote {
    public long add(long a, long b) throws java.rmi.RemoteException;
    public long sub(long a, long b) throws java.rmi.RemoteException;
    public long mul(long a, long b) throws java.rmi.RemoteException;
    public long div(long a, long b) throws java.rmi.RemoteException;
}

Note that this interface extends the Remote interface, and each method's signature declares that it can throw a RemoteException. In general, an object is called remote if it implements the Remote interface. Here "implements" refers only to the declaration in the header (public interface Calculator extends java.rmi.Remote): Remote itself declares no methods and serves purely as a marker. Now you need to write an implementation of the remote service. Below is the CalculatorImpl class:

public class CalculatorImpl extends java.rmi.server.UnicastRemoteObject
        implements Calculator {
    // Implementations must have an explicit constructor in order to declare
    // the RemoteException exception
    public CalculatorImpl() throws java.rmi.RemoteException {
        super();
    }

    public long add(long a, long b) throws java.rmi.RemoteException {
        return a + b;
    }

    public long sub(long a, long b) throws java.rmi.RemoteException {
        return a - b;
    }

    public long mul(long a, long b) throws java.rmi.RemoteException {
        return a * b;
    }

    public long div(long a, long b) throws java.rmi.RemoteException {
        return a / b;
    }
}

The implementation class uses UnicastRemoteObject to attach to the RMI system. In this example, the implementation class directly extends UnicastRemoteObject. This is not a requirement. A class that does not extend UnicastRemoteObject can use its static exportObject() method to attach to RMI. If a class extends UnicastRemoteObject, it must provide a constructor that declares that it can throw a RemoteException. When this constructor calls super(), it invokes code in UnicastRemoteObject that connects the object to RMI and initializes it as a remote object.
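The exportObject() alternative might look like the following minimal sketch. It makes several assumptions for the sake of a self-contained demo: the Adder interface is a reduced, hypothetical stand-in for Calculator, the registry is created inside the same JVM with LocateRegistry.createRegistry() rather than started as a separate rmiregistry process, and the loopback address is advertised so that everything runs on one machine.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Reduced, hypothetical version of the Calculator interface from the text.
interface Adder extends Remote {
    long add(long a, long b) throws RemoteException;
}

// Plain class: it does not extend UnicastRemoteObject.
class AdderImpl implements Adder {
    public long add(long a, long b) {
        return a + b;
    }
}

class ExportSketch {
    static long addViaStub(long a, long b) throws Exception {
        // Assumption for a single-machine demo: advertise the loopback address.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");

        AdderImpl impl = new AdderImpl();
        // Port 0 asks RMI to pick any free port for incoming calls.
        Adder stub = (Adder) UnicastRemoteObject.exportObject(impl, 0);

        // In-process registry on a free port, instead of a separate rmiregistry.
        Registry registry = LocateRegistry.createRegistry(0);
        try {
            registry.rebind("AdderService", stub);
            Adder found = (Adder) registry.lookup("AdderService");
            return found.add(a, b); // the call travels through the stub
        } finally {
            // Unexport both objects so the JVM can exit.
            UnicastRemoteObject.unexportObject(registry, true);
            UnicastRemoteObject.unexportObject(impl, true);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(addViaStub(4, 5));
    }
}
```

Exporting by hand leaves the implementation class free to extend some other base class, at the cost of having to keep track of the stub and to unexport the object explicitly.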

Remote RMI services must be hosted in a server process. The CalculatorServer class is a very simple server that provides the bare essentials for hosting.

import java.rmi.Naming;

public class CalculatorServer {
    public CalculatorServer() {
        try {
            // Create the remote object and register it under a public name.
            Calculator c = new CalculatorImpl();
            Naming.rebind("rmi://localhost:1099/CalculatorService", c);
        } catch (Exception e) {
            System.out.println("Trouble: " + e);
        }
    }

    public static void main(String[] args) {
        new CalculatorServer();
    }
}

The client source code, for example, could be as follows:

import java.rmi.Naming;
import java.rmi.RemoteException;
import java.net.MalformedURLException;
import java.rmi.NotBoundException;

public class CalculatorClient {
    public static void main(String[] args) {
        try {
            // Look up the remote service; the result is a stub for it.
            Calculator c = (Calculator)
                Naming.lookup("rmi://remotehost/CalculatorService");
            System.out.println(c.sub(4, 3));
            System.out.println(c.add(4, 5));
            System.out.println(c.mul(3, 6));
            System.out.println(c.div(9, 3));
        } catch (MalformedURLException murle) {
            System.out.println();
            System.out.println("MalformedURLException");
            System.out.println(murle);
        } catch (RemoteException re) {
            System.out.println();
            System.out.println("RemoteException");
            System.out.println(re);
        } catch (NotBoundException nbe) {
            System.out.println();
            System.out.println("NotBoundException");
            System.out.println(nbe);
        } catch (java.lang.ArithmeticException ae) {
            System.out.println();
            System.out.println("java.lang.ArithmeticException");
            System.out.println(ae);
        }
    }
}

Now you can start the system. This can be done (after receiving the appropriate class files and placing them on the same or different PCs) like this:

1. Launch the RMI registry ("rmiregistry").

2. Start the server ("java CalculatorServer").

3. Launch the client ("java CalculatorClient").

If everything goes well, you will see the following information:

1
9
18
3

That's it - a working RMI system is ready. Even if you run three consoles on the same computer, RMI uses your network's TCP/IP protocol stack to communicate between the three separate JVMs. This is a fully functional RMI system.

Distribution of RMI classes

To run an RMI application, supporting class files must be located in places where they can be found by the server and clients.

The following classes must be available to the server (for the class loader):

Remote Services Implementations

Skeletons for implementation classes (only for servers based on JDK 1.1)

Stubs for implementation classes

All other server classes

The following classes must be available to the client (for the class loader):

Remote Service Interface Definitions

Stubs for classes that implement a remote service

Server classes for objects used by the client (such as the return value)

All other client classes

If you know which files should be located on different nodes on the network, then making them available to each JVM class loader is easy.

Distributed Garbage Collection

One of the benefits of programming for the Java platform is that you don't have to worry about freeing memory. The JVM has an automatic garbage collector that frees memory occupied by any object that is no longer in use by the running program. One of the design requirements for RMI was seamless integration with the Java programming language, including garbage collection. Developing an efficient garbage collector for a single machine is a difficult task; developing a distributed garbage collector is a very difficult task. The RMI system provides a reference-counting distributed garbage collection algorithm modeled on Modula-3's Network Objects. During operation, the system keeps track of which clients hold references to remote objects running on the server. When a client obtains a reference, the server marks the object as "dirty", and when the client drops the reference, the object is marked as "clean".

The interface to the DGC (distributed garbage collector) is hidden at the stub and skeleton level. However, a remote object can implement the java.rmi.server.Unreferenced interface and be notified via the unreferenced() method when there are no longer any clients holding a live reference. In addition to the reference counting mechanism, a live reference held by a client has a lease with a specified duration. If the client does not renew the lease before it expires, the reference is considered dead and the remote object may be garbage collected. The lease time is controlled by the java.rmi.dgc.leaseValue system property. Its value is specified in milliseconds and defaults to 10 minutes. Because of these garbage collection semantics, the client must be prepared to deal with objects that can "disappear."
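The combination of dirty/clean bookkeeping and leases can be pictured with a small toy model for a single remote object. This is a sketch under stated assumptions, not RMI's internal DGC code: LeaseTableSketch, its method names, and the client identifiers are invented for illustration, and time is passed in explicitly so that the behavior is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of lease-based reference counting for ONE remote object.
// It is an illustration only and does not reproduce RMI's internal DGC code.
class LeaseTableSketch {
    private final long leaseMillis;
    // client id -> time at which that client's lease expires
    private final Map<String, Long> leases = new HashMap<>();

    LeaseTableSketch(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    // A client obtained or renewed a reference: (re)start its lease.
    void dirty(String clientId, long now) {
        leases.put(clientId, now + leaseMillis);
    }

    // A client dropped its reference.
    void clean(String clientId) {
        leases.remove(clientId);
    }

    // True when no client holds an unexpired lease, so the object may be collected.
    boolean collectable(long now) {
        leases.values().removeIf(expiry -> expiry <= now);
        return leases.isEmpty();
    }

    public static void main(String[] args) {
        // Use the configured lease length if the property is set; 600000 ms is 10 minutes.
        long lease = Long.getLong("java.rmi.dgc.leaseValue", 600_000L);
        LeaseTableSketch table = new LeaseTableSketch(lease);
        table.dirty("clientA", 0);
        table.dirty("clientB", 0);
        table.clean("clientA");
        System.out.println(table.collectable(1_000)); // clientB may still hold a lease
        System.out.println(table.collectable(lease)); // clientB never renewed
    }
}
```

The lease is what protects the server from clients that crash without ever sending a clean notification: their entries simply expire.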

Conclusion

Remote Method Invocation (RMI), first introduced in JDK 1.1, took network programming to a higher level. RMI is relatively easy to use and, although not without its shortcomings, is a remarkably powerful technology that exposes the average Java programmer to an entirely new paradigm: the world of distributed object computing.