Introduction to open distributed computing
  1. Distributed Computing

  2. In this section, we will analyze the characteristics that make a system distributed. We will look at several issues that arise in such systems and explain the benefits that make overcoming the challenges worthwhile. Furthermore, we will introduce several computing paradigms that distributed computing encompasses. Finally, we will list the fundamental technologies that are critical for implementing distributed computing systems.

    1. Characteristics of distributed computing environment
      1. Physical separation of computing units

      2. The computing space is divided into physically separate execution units. A network connecting all of the units allows data to be exchanged during the computation process.

      3. Scalability

      4. Distributed computing space can be extended by adding new execution units or by adding new services that they provide. In the first case, the network has to be enlarged. Extending the spectrum of services on any of the computing units involves activating new entry points and advertising them globally.

      5. Administrative autonomy

      6. Regional parts of a distributed system are administered autonomously. There is no global management center coordinating the computing and maintenance activity.

      7. Heterogeneity

      8. The morphology of a distributed computing medium is highly diversified. Various software and hardware platforms are tied together by standardized modules.
    2. Challenges
      1. Naming

      2. Naming strategy is critical for locating cooperating computing units. It is easy to resolve local references, but in a global environment, the process is complex.

      3. Security

      4. Employing services provided by external resources might be risky, as the security and integrity of the requesting party can be compromised. On the other hand, the use of certain services might be restricted to a selected set of clients.
         

      5. Manageability

      6. Distributed systems are difficult to manage, because they are large, geographically dispersed, heterogeneous, dynamic and there is no central authority. Coordinating all cooperating units and regions is a difficult task.
      7. Indeterminacy

      8. A distributed environment constitutes a non-deterministic computing platform with inherent data inconsistency, latency, asynchrony and event ordering.

      9. Benefits
      10. Resource sharing

      11. In traditional computing, local resources must accommodate all processing requirements of the applications. This model is wasteful, because the computing nodes have exclusive access to resources that are not always used. In a distributed computing environment, individual computing units may make their resources available to others. In that way, not every computing unit needs to hold all of the required resources.

      12. Performance

      13. In distributed computing, processes can run in parallel on different processing units. Time-consuming tasks can be performed by modular systems with certain modules executing at the same time.

      14. Availability

      15. If a centralized service is in use, then it is not available to others, who have to queue for the service. In a distributed system, services might be duplicated in several locations. If a request cannot be satisfied by a provider, then the client can be directed to others.

      16. Agility

      17. Distributed computing systems do not age, because their aging components can be taken out of service. At the same time, new services can be added freely.

      18. Scalability
      Very often centralized servers suffer from inability to accommodate new services. Such problems are addressed by adding resources, but there are limits to this approach. In a distributed computing environment, new computing units can be added to provide new services.
    3. Distributed computing paradigms

    4. Several programming paradigms provide indispensable support for distributed applications. The degree of transparency varies depending on the specific platform.

      1. Global execution space

      2. Applications execute in a global execution space; i.e., they utilize the resources of any participating computing unit.

      3. Concurrent programming

      4. Processes run in parallel on different execution units.

      5. Global communication

      6. Any two entities can exchange data independently of their locations.

      7. Secure access to objects

      8. Access to services and resources is controlled.

      9. Persistent storage

      10. Applications and services may restore data from previous execution sessions.

      11. Event ordering
      Events can be ordered at any location, independently of their origin.
    5. Fundamental technologies
      1. Remote Procedure Call (RPC)

      2. RPC is the mechanism for service invocation.

      3. Global naming

      4. Naming and trading services provide means for obtaining and resolving global identifiers.

      5. Object lifecycle

      6. The lifecycle of objects providing services must be controlled, so they can be installed, activated, deactivated and deleted. Objects are created by specialized modules called factories. Removal of components that are no longer used is called garbage collection.
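        The following minimal Java sketch illustrates the factory idea: clients never construct service objects directly, so creation and removal stay under the factory's control. The Service, EchoService and ServiceFactory names are hypothetical and do not refer to any particular framework.

import java.util.ArrayList;
import java.util.List;

interface Service {
    String handle(String request);
}

class EchoService implements Service {
    public String handle(String request) { return "echo: " + request; }
}

class ServiceFactory {
    private final List<Service> active = new ArrayList<>();

    // Clients obtain service objects only through the factory.
    public Service create() {
        Service s = new EchoService();
        active.add(s);              // the factory tracks the objects it has created
        return s;
    }

    // Explicit removal; objects that are no longer used can then be reclaimed.
    public void destroy(Service s) {
        active.remove(s);
    }
}

public class FactoryDemo {
    public static void main(String[] args) {
        ServiceFactory factory = new ServiceFactory();
        Service s = factory.create();
        System.out.println(s.handle("ping"));
        factory.destroy(s);
    }
}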

      7. Security

      8. Objects can be protected from unauthorized use by security services.

      9. Threads

      10. Threading services allow running multi-threaded distributed applications.

      11. Distributed time
    Time can be synchronized between the members of a distributed computing platform by the means provided by timing services.
  3. Remote Procedure Call (RPC)

  4. All distributed frameworks are based on the Remote Procedure Call (RPC) technology. We will review this fundamental technology in the next few sections.

    1. Regular procedure call review
The procedure call is a common concept in programming languages. The compiler provides the support for a procedure call. The process involves a call statement, in which the name of the procedure and the parameters are specified. There are input parameters to pass data to the procedure and output parameters to obtain the results of the call. Very often, an assignment statement is used instead of output parameters to obtain data from a typed procedure (a function) that defines a return value.

MyProc(input1, input2, out1, out2);

RetVal = MyFunc(input1, input2);

The mechanism of such a call is as follows. The code currently executed is suspended, the execution context is saved, the parameters are put on the stack, the program pointer is set to point to the code of the procedure and control is passed to this code. The code of the procedure uses the parameters on the stack to obtain the input and generate the output. When the procedure terminates, the return value is put on the stack, together with the parameters. Then, the context of the previously executing code is recreated and it starts executing again. The values of the output parameters and the return value are accessible on the stack, so the calling code can make use of them.

To recap, let's look at the stages of the invocation process:

1. The currently executing code is suspended and its execution context is saved.
2. The parameters are placed on the stack.
3. The program pointer is set to the code of the procedure and control is passed to it.
4. The procedure executes, using the parameters on the stack to obtain its input and to deposit its output.
5. When the procedure terminates, the return value is placed on the stack together with the parameters.
6. The context of the caller is restored and it resumes execution, retrieving the output parameters and the return value from the stack.

The procedure that is called must be a part of the same execution context (process). Its name has to be resolved statically by the compiler.

In an object-oriented world, not procedures, but methods of objects are called. The invocation process requires that the object be known in addition to the method (procedure). The object can be determined dynamically by a late binding mechanism. Some modern languages have facilities to discover object behavior, so the name of the method to call can be obtained dynamically as well. For example, Java’s reflection package provides a capability to query classes about their methods at runtime and to use the obtained information to construct objects and invoke their methods.
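As a brief illustration of this capability, the following sketch loads a class by name at runtime, lists its methods, and invokes one of them dynamically. The class name acme.Greeter and its greet method are hypothetical; any class available on the classpath could be used in their place.

import java.lang.reflect.Method;

public class ReflectiveCall {
    public static void main(String[] args) throws Exception {
        // Load a class by name at runtime; acme.Greeter is a hypothetical class.
        Class<?> cls = Class.forName("acme.Greeter");

        // Construct an instance using the no-argument constructor.
        Object target = cls.getDeclaredConstructor().newInstance();

        // Discover the methods the class supports.
        for (Method m : cls.getMethods()) {
            System.out.println("available: " + m.getName());
        }

        // Obtain a method by name and parameter types, then invoke it dynamically.
        Method greet = cls.getMethod("greet", String.class);
        Object result = greet.invoke(target, "world");
        System.out.println(result);
    }
}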

    1. Remote procedure call

    2. Remote procedure call (RPC) is a method of invoking processing services that reside outside of the execution environment of the caller. It provides transparency to the programmer, because there is no difference in the code between a regular and a remote call.

      The term remote suggests that the server resides on a remote computer, but it does not have to be so. The request might be directed to another process running locally. The essential difference is that the execution context of the called procedure is separate from the execution context of the caller.

      1. RPC Runtime System

      2. Nothing comes free, so several components are needed to provide the RPC capability. An underlying RPC framework that includes several components must be in place. On the client side (by the client side we mean the process that invokes the call), we need a stub procedure for each remote procedure. The role of the stub is to take the parameters passed in a regular procedure call and pass them, together with an identifier of the procedure being invoked, to the RPC runtime system (RTS). The RTS is either a separate process or a part of both the server and the client. In any event, the RPC services are available to any process through a runtime library that can be attached to any application. The services are invoked by the stub in a way that is transparent to the client. The client's RTS contacts its counterpart on the remote system running the server and delivers the request along with the parameters. The target RTS contacts the skeleton corresponding to the requested object and passes it the parameters. The skeleton invokes the corresponding procedure in a regular way and obtains the results. The results are passed back to the client using the same indirect route through the two cooperating RTSes and the client's stub. The client is handed the result values in the same way as if they were returned from a regular procedure call.

        The RPC server has to register each available procedure with the RTS, so the RTS is able to direct its invocations.
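        To make the role of the stub concrete, here is a minimal hand-written sketch in Java. A generated stub would be produced by an IDL compiler and would talk to the RPC runtime system rather than to a raw socket; the procedure name add, the host and port, and the wire format below are assumptions made for illustration only.

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.Socket;

public class CalculatorStub {
    private final String host;
    private final int port;

    public CalculatorStub(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // To the calling code this looks like a regular local procedure call.
    public int add(int a, int b) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            DataOutputStream out = new DataOutputStream(socket.getOutputStream());
            out.writeUTF("add");   // identify the requested procedure
            out.writeInt(a);       // marshal the input parameters
            out.writeInt(b);
            out.flush();
            DataInputStream in = new DataInputStream(socket.getInputStream());
            return in.readInt();   // unmarshal the result produced by the skeleton
        }
    }
}

        On the server side, a matching skeleton would read the procedure name and the parameters, call the real add procedure, and write the result back.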

      3. Marshalling: passing parameters

      4. In a regular procedure call, the values of actual and formal parameters are ensured to be the same, because they exist in the same execution context. That might not be the case if the values were passed between different computers, because the computers may use different internal representations for data types. To avoid type mismatches, the parameters are encoded with their types in a process called marshalling (and unmarshalling on the other side of the communication link). In that way, the receiver is able to use the same semantics as the sender. The data might have different representations at the two ends, but they have the same meaning.
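        A minimal sketch of the idea: each value is written together with a type tag in a machine-independent encoding, so the receiving side can reconstruct it regardless of its native representation. The tag values and the two example parameters are assumptions for illustration.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class Marshalling {
    static final byte TAG_INT = 1;
    static final byte TAG_STRING = 2;

    // Marshal two parameters into a byte stream with explicit type tags.
    static byte[] marshal(int count, String name) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeByte(TAG_INT);
        out.writeInt(count);       // written in a fixed, big-endian representation
        out.writeByte(TAG_STRING);
        out.writeUTF(name);        // written in a length-prefixed UTF encoding
        return bytes.toByteArray();
    }

    // Unmarshal on the receiving side, checking the type tags.
    static void unmarshal(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        if (in.readByte() == TAG_INT)    System.out.println("int: " + in.readInt());
        if (in.readByte() == TAG_STRING) System.out.println("string: " + in.readUTF());
    }

    public static void main(String[] args) throws IOException {
        unmarshal(marshal(42, "hello"));
    }
}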

      5. RPC compiler: generating stubs
The developer of the system must create the stubs and skeletons for every RPC call. To ease that task, RPC toolkits include special tools. The developer provides the details of an RPC call in the form of specifications encoded in an Interface Definition Language (IDL). An IDL compiler is used to generate the stubs and the skeleton automatically from the IDL specifications. The stubs can then be included with clients, and the skeletons can be linked with the server. The IDL specifications have to include the names of the procedures in the package, the number of parameters they take, the data type of each parameter and the direction in which each parameter is transferred.
  1. Distributed object frameworks

  2. All frameworks for distributed objects build upon the basic notion of a remote procedure call. They provide many supporting facilities, but the fundamental technology is similar. The goal is to provide a relay mechanism for exchanging data between components that can reside anywhere in the network. The elements of this mechanism are unified under one umbrella name, Object Request Broker or ORB. In an apparent analogy to hardware architectures, an ORB is called a data bus. Continuing the analogy, many components can connect to the bus through a standard interface and simultaneously exchange data in standard ways. The data bus, ORB, synchronizes the flow of data and ensures the sanity of the interactions.

    All of the frameworks are component-based, but CORBA and RMI are both object-oriented, while DCE and DCOM are not. We will analyze each of them in the next few sections. We will study programming CORBA and RMI in detail in the later sections. We include short discussions of ODP and TINA, which are attempts to standardize aspects of a distributed computing environment.

    1. Open Software Foundation's Distributed Computing Environment (DCE)

    2. OSF's Distributed Computing Environment is a software layer between the operating system and the network on one side and distributed applications on the other. DCE provides the services that allow a distributed application to interact with a collection of heterogeneous computers, operating systems, and networks as if they were a single system.

      DCE is not object-oriented, so the use of the term ORB could seem strange. However, if we view DCE components as objects, then the supporting functions might be considered request brokerage services.

      1. DCE Architecture

      2. The DCE Remote Procedure Call (RPC) facility consists of a runtime service and development tools. They include an IDL compiler and a generator of unique identifiers (UUIDs), which are used to identify service interfaces. The runtime service implements the protocols for the client-server communication.

        The DCE Directory Service holds information about the users of the system, the machines comprising the system and the services that are being provided somewhere in the system. The information consists of the name of the resource and its attributes (for example, user's home directory, or the location of an RPC-based server).

        The DCE Security Service provides user authentication, secure communications, authorization and auditing. DCE security is based on the Kerberos system.

        The DCE Directory Service has several parts. The Cell Directory Service (CDS) manages a database of information about the resources in a DCE cell. A DCE cell is a grouping of a number of machines, users, and resources managed as a single unit. The Global Directory Agent (GDA) links a cell to global directory services. CDS is accessed using the X/Open Directory Service (XDS) API, which is used as an API for the DCE directory service.

        DCE Threads support the creation, management, and synchronization of threads for multi-threaded processes. If the underlying operating system supports multithreading, then its native thread library can be used rather than the DCE’s.

        The DCE Distributed Time Service (DTS) is used to synchronize time on the computers in the network. Each DCE host has its time synchronized with Coordinated Universal Time (UTC).

        The DCE Distributed File Service (DFS) allows users to share files. Any user can access a file stored on a File Server anywhere on the network, without having to know the physical location of the file. DCE DFS includes a physical file system, the DCE Local File System (LFS), which supports special features that are useful in a distributed environment like crash recovery, data replication, access control and access tracing.

        Each DCE service contains a management and administrative component, so it can be managed over the network.

      3. DCE RPC

      4. DCE RPC consists of several components that work together to implement this facility. This includes the Interface Definition Language (IDL) and its compiler, a Universal Unique Identifier (UUID) generator, and the RPC Runtime. The RPC Runtime is a library that can be attached to clients and servers, so they can call its functions. The runtime supports data transport over two protocol implementations. One uses TCP and the other uses UDP.

        RPC is hidden from the user. Only a minimal amount of administration is required: the server has to advertise the services it provides in the DCE Directory Service.

      5. DCE components
        1. The Interface Definition Language (IDL) and its Compiler

        2. An RPC interface is described in DCE IDL. The IDL file is compiled into object code using the IDL compiler. The object code is in two main parts -- one for the client side of the application, and one for the server side.

        3. The RPC Runtime Library

        4. This library consists of a set of routines, linked with both the client and server sides of an application, which implement the communications between them. This involves the client finding the server in the distributed system, getting messages back and forth, managing any state that exists between requests, and processing any errors that occur.

        5. Authenticated RPC

        6. DCE RPC is integrated with the DCE Security Service component to provide secure communications. Levels of security can be controlled by the RPC application programmer through the Authenticated RPC API.

        7. Name Service Interface (NSI) API

        8. DCE RPC is integrated with the DCE Directory Service component to facilitate the location of RPC-based servers by their clients. The NSI routines allow a programmer to control the association, or binding, of a client to a server during RPC.

        9. The DCE Host Daemon

        10. The DCE Host daemon (dced) is a program that runs on every DCE machine. It includes (among other things) an RPC-specific name server called the endpoint mapper service, which manages a database that maps RPC servers to the transport endpoints (in IP, the ports) that the server is listening for requests on.

        11. The DCE Control Program

        12. The DCE control program (dcecp) is a tool for administering DCE.

        13. UUID Facilities
      These are ancillary commands and routines for generating Universal Unique Identifiers (UUIDs), which uniquely identify an RPC interface or any other resource. The uuidgen program can optionally generate an IDL template for a service interface, along with a unique identifier for the interface.
       
    3. CORBA

    4. Common Object Request Broker Architecture (CORBA) is a distributed computing framework designed by a consortium of companies known as the Object Management Group (OMG). Like DCE, it is middleware in a three-tier client/server system. The main difference between DCE and CORBA is that CORBA is object-oriented, while DCE is not. CORBA allows for a uniform use of objects residing anywhere in a network. Any object can be a client, a server or both at the same time. CORBA has its own data bus, the CORBA ORB. Anybody adhering to the CORBA ORB Interface can connect to an object residing on a server and utilize the processing services offered as the object's methods.

      1. CORBA ORB
      CORBA ORB is implemented in part as a library that is used by the clients and servers, and in part as supporting processes that are accessible from the ORB. Many elements of the ORB are in fact parts of the distributed components themselves. A number of processes are needed to implement certain CORBA services. The actual communication services that transfer data between objects constitute the core of the ORB. The standard protocol for this is the Internet Inter-ORB Protocol (IIOP), but for historical reasons providers of CORBA ORBs used to implement proprietary protocols.

      Similarly to an RPC call, the invocation mechanism requires the use of stubs by the clients and skeletons by the servers. The CORBA Interface Definition Language (IDL) is used to specify the interfaces to remote objects. CORBA is language-independent because IDL is purely a specification language for object interfaces, not an implementation language. The objects can be implemented in any language. There are IDL compilers capable of generating stubs and skeletons in various programming languages (C++, Java, Smalltalk and a few others). Additionally, every ORB is required to provide bindings for several languages, so there are no language constraints on creating CORBA applications. It is also easy to augment existing applications so that they become CORBA-enabled.
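      As a small illustration, the sketch below shows the client side of a static invocation through a generated stub, using the standard Java language binding. Hello and HelloHelper stand for classes that an IDL compiler would generate from a hypothetical Hello interface, and the naming-service entry "Hello" is likewise an assumption.

import java.util.Properties;
import org.omg.CORBA.ORB;
import org.omg.CosNaming.NamingContextExt;
import org.omg.CosNaming.NamingContextExtHelper;

public class HelloClient {
    public static void main(String[] args) throws Exception {
        // Initialize the ORB; the ORB runtime library is linked into the client.
        ORB orb = ORB.init(args, new Properties());

        // Obtain the Naming Service and resolve the object reference.
        NamingContextExt naming =
            NamingContextExtHelper.narrow(orb.resolve_initial_references("NameService"));
        Hello hello = HelloHelper.narrow(naming.resolve_str("Hello"));

        // From here on the remote object is used like a local one; the stub
        // marshals the request and the ORB delivers it, typically over IIOP.
        System.out.println(hello.sayHello());
    }
}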

      In addition to static method invocation through stubs and skeletons, CORBA provides mechanisms necessary to resolve object references dynamically. A client can use the Dynamic Invocation Interface (DII) to invoke methods that were added after the client was created. References to all such methods, the method signatures, are stored in the Interface Repository. The Interface Repository API (Application Programmer Interface) provides means to access, store and modify method signatures, so any object can advertise its services to the world during its initialization. On the server side, the Dynamic Skeleton Interface (DSI) handles calls to methods that do not have static skeletons. That can happen if a method was added to the server after the server was installed. The Object Adapter handles requests on behalf of the server's objects. CORBA specifies that each ORB must implement the Basic Object Adapter (BOA). The Object Adapter registers the objects with the Implementation Repository, which stores runtime information about the objects supported by the server. It also controls garbage collection of objects.

      We will discuss CORBA in more detail later.

    5. RMI

    6. Remote Method Invocation (RMI) is a Java-only ORB implemented by JavaSoft (Sun's subsidiary). The original implementation only provided an invocation mechanism between objects that live in the Java world. However, due to the need for deeper integration with heterogeneous environments, RMI will be able to use IIOP as an optional transmission protocol. In that way, any CORBA-compliant server will be available to RMI clients, and vice versa.

      For a Java programmer, RMI is simpler to use than CORBA. Additionally, it is included free of charge in every Java Development Kit (JDK). In contrast, CORBA implementations are expensive. On the other hand, RMI provides only basic services and lacks CORBA’s sophistication in many areas.

      RMI comprises several Java libraries (packages) and a set of tools. The basic communication mechanism is, as in other frameworks, RPC. RMI supports only static method invocation, but with a twist. Owing to the capability of transferring code, the stubs that are needed for an invocation can be downloaded from a remote location, for example from the server. The client still needs to know the specifics of the interface, but it does not need to link with the stubs statically. If the client obtained the object reference and the location of the server, then a call could be created dynamically, and the stubs would be downloaded. A client can utilize the functionality of Java Reflection to acquire the information about a remote object dynamically, including its interface. This is far from having a full-blown support for dynamic method invocation, because there is no support for discovering new remote services.

      Usually, a client obtains a reference to the server object through the naming service packaged in the java.rmi.Naming class. The naming service in RMI is a simple way of resolving remote references. It is delivered by a daemon called rmiregistry, which has to run on the same host as the server. Several RMI servers that run on the same host can share the same registry. They have to publish their objects after the registry daemon is started, because the registry starts empty. The database is not persistent, so all references are lost if the service restarts.

      RMI uses reference counting to control server objects. Each client holding a remote reference to a given object increases its count. If the reference count drops to zero, then the object can be garbage collected.

      A client can locate a registry on a given host. A specific port that the registry is listening on can also be queried. To get access to a remote object a client requests the registry by sending a URL in the following format:

      rmi://host:port/ObjectName

      The RMI registry supports only flat names (no hierarchy). This might be considered a form of support for global identifiers.

      RMI does not require a specialized IDL. Remote interfaces are encoded directly in Java. After being compiled with a Java compiler (javac) into so-called class files (intermediate code which is interpreted by a Java Virtual Machine, JVM), they are compiled yet again with the RMI compiler, rmic. Java stubs and skeletons are produced in this process. They can either be statically linked to clients and servers or dynamically downloaded when they are needed. This decreases the size of the applications or applets, which might be critical for use on the Web. Stubs can only be downloaded from a location specified by the server.
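      A minimal sketch of the round trip follows, assuming a hypothetical Greeter interface; the registry port and URL are illustrative. The interface and the implementation are written directly in Java, compiled with javac, and (for older JDKs) processed with rmic to produce the stub and skeleton.

import java.rmi.Naming;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

// The remote interface: it extends Remote and every method declares RemoteException.
interface Greeter extends Remote {
    String greet(String who) throws RemoteException;
}

// The server-side implementation; exporting through UnicastRemoteObject makes
// the object reachable via RMI.
class GreeterImpl extends UnicastRemoteObject implements Greeter {
    GreeterImpl() throws RemoteException { super(); }
    public String greet(String who) throws RemoteException {
        return "Hello, " + who;
    }
}

public class GreeterServer {
    public static void main(String[] args) throws Exception {
        // rmiregistry must already be running on this host (it starts empty).
        Naming.rebind("rmi://localhost:1099/Greeter", new GreeterImpl());
        System.out.println("Greeter bound in the registry");
    }
}

// A client in a separate JVM resolves the reference through the registry:
//   Greeter g = (Greeter) Naming.lookup("rmi://somehost:1099/Greeter");
//   System.out.println(g.greet("world"));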

      There is a security risk in downloading executable code, so RMI provides a simple security manager that controls downloading of Java classes.

    7. DCOM

    8. Distributed Component Object Model (DCOM) is Microsoft's framework for distributed objects; its remote communication builds on DCE RPC. The roots of DCOM go back to Object Linking and Embedding (OLE), which did not have anything to do with distributed processing. OLE was a technology for embedding documents. When Microsoft released OLE2, the Component Object Model (COM) was created by extracting and combining the communication facilities of OLE. DCOM is an extension of COM to a distributed environment.

      DCOM is a Windows-only environment, although there are attempts to put it in a standardized framework by an ActiveX Consortium. ActiveX is an umbrella name for DCOM and a number of other technologies that support distributed components. ActiveX controls are just DCOM components.

      DCOM supports both static and dynamic methods of invocation.

        1. Static invocation

        2. DCOM uses RPC for remote invocations, but not every DCOM call is considered remote. If the referenced component is an in-process server implemented as a Dynamic Link Library (DLL), then it is attached to the process executing the call. If the called component is a local server (.EXE, an executable) residing on the same host, then DCOM uses Lightweight RPC (LRPC). Only if the called component in fact resides on a remote host is the full-blown RPC engaged. In each case, DCOM makes the decision transparently. If the called code is outside of the execution environment of the caller, a proxy is needed. A proxy is a client-side facility for remote calls (in CORBA and RMI it is called a stub). The term stub is used to refer to a server-side facilitator of remote calls (in CORBA and RMI it is a skeleton). The Service Control Manager (SCM, or "scum") is responsible for locating servers and managing invocation through RPC.

          Static DCOM interfaces are declared through IDL. A description of an interface consists of a header and a body. The header includes a Globally Unique Identifier (GUID, pronounced "goo-id"), which can be generated by DCOM utilities. Microsoft IDL (MIDL) is an IDL compiler that generates the proxies and stubs from the declarations. They are linked with clients and servers.

          All references to server components are done through interface pointers. An interface pointer is a pointer to a virtual table (vtable) that contains the pointers to all functions defined by the interface and implemented by the component. A component might support many interfaces. In such a case, it will have a vtable for every interface. If a call is made to an in-process server, then no proxies are needed. The pointer in the vtable contains the pointer to the function residing in the same address space as the calling code, so a regular procedure call is executed.

          DCOM components are identified by unique class identifiers (CLSIDs). For convenience, Microsoft refers to both the component classes and currently active components as objects. Strictly speaking, DCOM components are not objects, because it is not possible to obtain a reference to them. All references are done through vtables. It is not possible to preserve the state of a component for future invocations. However, a special kind of proxy called a moniker provides a mechanism for storing a component's state in a file and reloading the stored information as needed.

          Each DCOM component has to implement the IUnknown interface (all names that refer to interfaces start with I; all names that refer to COM or DCOM start with Co). It defines three functions: QueryInterface, which the caller uses to query for support of a specific interface, and AddRef and Release, which increase and decrease the count of references. The count is used by the server component to destroy itself when the count drops to zero. This is a very error-prone form of garbage collection, especially in the Web environment. QueryInterface provides the calling client with a pointer to the vtable corresponding to the requested interface.

          DCOM components cannot encapsulate other components, but they can contain or aggregate several interfaces under one umbrella. A component is considered contained if another component accepts calls on behalf of the former and re-invokes methods on the inner component. The IUnknown interface of the outer component has to be aware of the interfaces of all contained components, but the pointers to the vtables of the inner components are not exposed to the callers. In the aggregation mode, the details of the inner components are directly exposed by the outer IUnknown interface (i.e., it provides their vtable pointers to the clients).

          A DCOM server is a process that manages a number of DCOM components. Each supported component CLSID has to be registered with the Windows Registry by specifying the path to its implementation (a .DLL or .EXE). When a client calls a function from an interface supported by one of the server's components, DCOM executes the server and instructs it to create the requested component. Each server has to implement a class factory, which provides functions for component management and support for licensing and security.

        3. Dynamic Invocation
      DCOM uses another specification language, the Object Definition Language (ODL), to define dynamic interfaces. ODL has been embedded in IDL and is considered a subset of IDL. One compiler (MIDL) is used for compiling IDL and ODL files. The main structure in an ODL file is a library with a unique identifier, which defines a collection of interfaces. MIDL creates a Type Library from an ODL file. The library (a .TLB file) can be attached to a component as its resource or installed in some other way on the network. The installer has to register the library using the OLE API call RegisterTypeLib.

      Type Libraries are listed in the Windows Registry, so any potential client (called an automation controller) can find and look them up. To find a library, a client needs to provide its ID to the OLE API call QueryPathOfRegTypeLib. The API returns a path to the library, which can then be loaded using the OLE API call LoadRegTypeLib or LoadTypeLibFromResource. The client requesting to load a library receives a reference to it (ITypeLib), which can be used to navigate through the whole library.

      Clients can query the library for its contents and obtain references to functions and properties (attributes). These references, called dispatch IDs (dispIDs), are specified in the ODL file. Each DCOM dynamic server (called an automation server) implements the IDispatch interface, which provides the basic mechanism for late (dynamic) binding of functions. The interfaces containing functions for late binding are called dispatch interfaces, or dispinterfaces. A client can pass a dispID to the Invoke function (in the IDispatch interface) to request execution of the corresponding function. Parameters can also be included in the call. The server knows how to map (dispatch) the dispID to a corresponding function, so it satisfies the request.

    9. Open Distributed Processing (ODP)

    10. Open Distributed Processing (ODP) is a set of standards set by ISO and ITU-T that regulate the distributed computing environment. The standards regulate interfaces between various functional modules and components. The implementation details are left to the vendors adopting the standard. ODP provides a big picture of distributed processing, so other frameworks like CORBA and DCOM can fit into it. In fact, CORBA's Trading Service is based on an ODP standard. ODP and CORBA also share the IDL.

      1. ODP Reference Model (ODP-RM)
ODP-RM consists of the following three elements:
      1. ODP Transparencies
        1. Access Transparency

        2. Access transparency masks the differences between two or more communicating objects. The differences could be in the data representation, in which case conversion will take place, or in the invocation mechanisms between the objects. Access transparency is provided by an ODP engineering channel, which provides stubs for suitable data conversions and for marshalling of parameters passed in an invocation.

        3. Failure Transparency

        4. Failure transparency enables fault tolerance in an object or shields an object from failures in the object's environment. Failure transparency can be implemented by several functions. One such function is the replication function, which replicates an object and guards against the case where one copy fails. Another function is the checkpoint and recovery function. Recovery in turn depends on the relocation transparency.

        5. Location Transparency

        6. Location transparency masks the location of an object in space. As an example, WWW URLs are not location transparent, as they contain the name of the host where an object (a Web page) resides. Location transparency depends on choosing a location-independent naming scheme. It enables named entities to be moved without having to notify all parties that hold a reference to the entity.

        7. Migration Transparency

        8. Migration transparency masks changes of location of an object. Migration transparency depends on the migration function. Before migration, an object will be checkpointed, and deleted from its original location. Once the object is moved, other objects depend on the relocation transparency to find the object again.

        9. Persistence Transparency

        10. Persistence transparency masks from an object the deactivation and reactivation of objects including itself. In this case, an object does not need to be concerned with loading an object from persistent store before using it. This is analogous to operating systems managing objects in a virtual store, but is more general. Persistence transparency depends on the deactivation and reactivation function.

        11. Relocation Transparency

        12. Relocation transparency masks the relocation of an object from other objects that are referring to it. This means that if objects are connected via a channel, and one object is relocated, the channel is reconfigured to the new location of the object. Relocation transparency is provided by the relocation function.

        13. Replication Transparency

        14. Replication transparency masks the fact that objects are replicated in different locations to provide fault tolerance and enhanced performance through better access to data. Replication transparency is provided by the replication function.

        15. Transaction Transparency
        The transaction transparency is provided by the transaction function. It masks the coordination between a set of objects required to achieve consistency properties of the objects.
      2. ODP Functions

      3. ODP functions provide building blocks for the construction of distributed systems. They deliver the transparencies that ease the programmer's task of implementing distributed applications. They make the use of the system simple by hiding its complexity.

        1. Management Functions
        2. Coordination Functions
        3. Repository Functions
        4. Security Functions
      1. ODP Viewpoints

      2. ODP viewpoints provide a framework for specifying ODP systems. They can be used to specify other ODP component standards and for system development in general. ODP viewpoints can address many issues that arise in the development of any system, so they constitute a general framework for system specification. In fact, TINA-C, which we will discuss later, uses the ODP model in the telecommunication area.

        Note that the end user's view of the system is delivered by the ODP transparencies, not by the ODP viewpoints.

        1. Enterprise viewpoint

        2. The focus of the enterprise viewpoint is the purpose, scope, and policies of a system. The enterprise viewpoint defines the basic objectives of a system by stating the purpose and scope of the enterprise, which can be of any kind. The viewpoint also specifies the policies that regulate the permissions, obligations and prohibitions that apply to the system, as well as policies called environment contracts that relate to interactions of the enterprise with the external environment.

          The enterprise viewpoint defines constraints on all other viewpoints through a definition of the context and overall environment for the system.

        3. Information viewpoint

        4. The information viewpoint focuses on the information and associated processing of a system. The information viewpoint defines the semantics of information and semantics of processing of information in the system.

        5. Computational viewpoint

        6. The computational viewpoint provides functional decomposition of a system into objects that interact at interfaces. These objects will provide natural lines along which a system may be partitioned for distribution. ODP defines three different kinds of interfaces, which are used for different purposes: operation interfaces, stream interfaces and signal interfaces.

        7. Engineering viewpoint

        8. The engineering viewpoint focuses on the deployment aspects of a system. It specifies how object interaction is achieved in the distributed environment. In contrast to the computational viewpoint, which merely implicitly enables distribution, distribution is explicit in the engineering viewpoint.

          An engineering specification is concerned with how the object is created, on which node, capsule and cluster it exists, how to dynamically track the object through activation, deactivation and relocation, and how objects interact via channels.

        9. Technology viewpoint
The technology viewpoint details specific technologies, both hardware and software, which will be used to implement a system. The technology viewpoint fills in specifics for particular implementations of a system. As there may be many sets of technology chosen to implement a system, either at one time, or as future technology becomes available, there could be many different technology viewpoint specifications for any one system. For example, a system could be implemented on many different styles of processor and operating systems, and in many different programming languages.
    1. Telecommunication Information Networking Architecture (TINA)

    2. TINA has been defined by a group of telecommunication companies called TINA Consortium or TINA-C. The goal is to deliver a consistent and open architecture for distributed telecommunications software applications. TINA provides a set of principles that should be applied in the specification, design, implementation, deployment, execution and operation of software for telecommunication systems. A telecommunication system comprises hardware and software resources that are involved in delivery of services by and for stakeholders. In TINA, stakeholders are customers, end users, service providers and network operators.

      TINA architecture uses concepts of layering and separation to increase the precision of specifications. Management layering and computational layering are the two separation categories identified by TINA. The management layering applies generalized principles defined by the Telecommunications Management Network (TMN). The computational layering divides the system along horizontal lines separating the hardware, the Native Computing and Communications Environment (NCCE), the Distributed Processing Environment (DPE, which is TINA's ORB), and TINA applications.

      1. TINA architecture

      2. TINA architecture is decomposed into several categories that address certain aspects of a distributed telecommunication system.

        Overall Architecture defines a set of generic concepts and principles that should be applied to the design, specification and implementation of any type of software system.

        Computing Architecture provides a set of concepts and principles for designing and building distributed applications and the supporting software. This is the most interesting aspect of TINA from the perspective of this course. TINA defines Distributed Processing Environment (DPE), which is an extension of OMG CORBA that accommodates the needs of telecommunications applications. The computing architecture defines several modeling concepts, which are based on the ODP Reference Model’s viewpoints.

        Network Architecture provides concepts and principles for the design, specification, implementation and management of transport networks.

        Management Architecture addresses the design, specification and implementation of software for service and network management.

        Service Architecture describes a universal platform for distribution of a wide range of services in a multi-provider environment. It is based on the Session concept.

        In the next section, we take a closer look at the computing architecture of TINA. Other aspects of the TINA architecture are of lesser interest in this course and will not be analyzed any further.

      3. Computing Architecture Modeling Concepts

      4. TINA applies the guidelines specified by ODP viewpoints to define modeling strategies for computing architectures of telecommunication systems.

        Enterprise Model includes stakeholders, who play roles of actors or agents with certain obligations. The model sets constraints on the use of the system by specifying the policies. The features and capabilities desired in the system are defined by requirements.

        Information Model consists of information objects classified into object types, relationships between the objects, and the constraints and rules governing object behavior. This includes object creation and deletion. Guidelines for the Definition of Managed Objects (GDMO) and the General Relationship Model (GRM) have been selected for information modeling because of their widespread use in the telecommunications management community. A subset of GDMO and GRM comprising the elements suitable for information processing is called quasi-GDMO-GRM, or qGDMO-GRM. The Object Modeling Technique (OMT) is used as a notation for diagrammatic representation of information specifications.

        Computational Model consists of computational objects and the interfaces between them. A computational object is a unit of programming and encapsulation. An application is realized by a number of computational objects interacting with one another. An object may have many interfaces of several types. An operational interface defines the operations that can be invoked on an object. A stream interface provides channels for streaming data (streams), such as a video bit stream, that are established by service relationships. TINA Object Definition Language (ODL) is used for computational specifications. It is an enhanced version of CORBA IDL.

        The use of ODL and qGDMO-GRM is enforced internally within TINA-C. Any other notation can be used by external parties.

        Engineering Model describes a realization of a distributed TINA application through a deployment of computational objects on a network of computing nodes and the infrastructure providing the mechanisms for their execution and interaction.

        Technology Model defines the implementation selections for the engineering model.

      5. Distributed Processing Environment (DPE)
DPE is TINA’s framework for distributed applications (in other words, it is TINA’s ORB). Computational objects reside on DPE nodes, which are abstract units of resource administration providing support for DPE. The objects are grouped into capsules and clusters. A capsule is a unit of resource allocation and encapsulation (a process). Objects that belong to different clusters are considered remote objects. A cluster is a grouping of computational objects that constitute a logical unit. All objects of a cluster have to be managed together (created, activated, migrated, deactivated, etc.).

The DPE architecture consists of DPE kernel, kernel transport network and DPE services. The DPE kernel provides support for object lifecycle and inter-object communication. The kernel transport network (KTN) provides mechanisms for communication between computational objects. DPE services support the execution and cooperation of computational objects. For example, DPE provides dynamic trading and notification services.

  1. Componentware

  2. Componentware is the next great wave in software development. The basic idea revolves around the concept of a component that can be used as a building block for applications. The idea is analogous to the way electronic components are assembled into hardware systems by interconnecting them through published interfaces. Similarly, a number of software modules can be assembled dynamically into a software application. The components interact through publicly known interfaces. Much as an electrical engineer builds an electronic device from known parts, a software developer can use known components with published interfaces. Software allows for additional, more sophisticated assembly of components. An application can be designed to discover new components and their interfaces dynamically. We saw that such capabilities are provided by CORBA and DCOM. Additionally, components do not have to reside locally, as is the case with electronic components on a PCB. Remote software components can be utilized through an ORB, or they can be downloaded for local execution, which is the case with Java applets or ActiveX controls.

    In the next two sections, we will briefly review ActiveX controls and JavaBeans. We will analyze the details of JavaBeans later in the course. For several years, there was another competing technology called OpenDoc. Recently, two main proponents, IBM and Apple, withdrew their support and the technology has been pronounced dead.

    1. ActiveX controls

    2. ActiveX controls are the building blocks for distributed DCOM applications. Very often, an ActiveX control is mistakenly called ActiveX, a name reserved for the conglomerate of technologies that realize Microsoft's view of the distributed computing environment.

      An ActiveX control is a COM component that follows certain standards of inter-component interactions. ActiveX controls expose their interfaces through the automation mechanism. It implies the use of dispinterfaces that we discussed before.

      Currently, ActiveX controls are Windows-specific, but there are attempts to accommodate them on other computing platforms (UNIX, Macintosh). An ActiveX control is implemented as a Dynamic Link Library (DLL). It requires a container that provides an execution context. The original container was VisualBasic, but currently a Web browser is the primary container on the client side of a client/server model. Currently, dozens of vendors offer a spectrum of ready-to-use ActiveX controls. They are mostly visual components that can be used to compose user interfaces of Windows applications. A number of tools from various vendors provide development environments that allow combining ActiveX controls from different sources. Some of the tools require minimal programming, or none at all.

      In principle, there is nothing that would prevent the use of ActiveX controls on servers, but this capability is far less utilized. One good choice for a server-side container is a transaction server. MTS, the Microsoft Transaction Server, is a concrete example of a server-side container for components.

      ActiveX controls are implemented as platform-specific, native code. When they are downloaded, they are activated as any other application. If allowed to execute, they have access to all local resources. Microsoft designed a security scheme called authenticode that is based on signing code with a private key. Public keys that are distributed to the general public can be used to verify the source of the incoming code. If the source is trusted, then the control is allowed to execute. We will study security issues later in the course.

      Accommodating ActiveX controls in a heterogeneous environment is not easy, because a platform-specific version of a control is required. For example, a Windows-based control implemented as a DLL will not execute on a Macintosh or under UNIX. Several versions of an ActiveX control have to be available, which is a major software development problem. Alternatively, ActiveX containers could emulate the Windows environment, sacrificing execution efficiency.

    3. JavaBeans
    JavaBeans is a componentware technology developed by JavaSoft, which is supported by several major players. It is in direct competition with ActiveX. Enterprise JavaBeans is an extension of the basic technology that covers server-side components.

    A Java bean is a Java class that conforms to JavaBeans Specifications, a set of rules set by Sun. The specifications define a standard contract between a bean developer and its user. The user of a bean can make certain decisions using implicit information defined by the contract.

    JavaBeans specifications require that, to become a bean, a class has to be serializable; i.e., it has to implement an interface that allows the class to be transmitted over a network or stored persistently in a file. A bean has to use the Java delegation event model, which maintains a list of registered dependents (listeners) that receive messages when the state of the bean changes. A bean either has to follow certain design patterns that set constraints on the naming of variables and methods, or has to provide information about itself in the form of an instance of a bean information class. The users of the bean utilize this information to determine the bean's properties, its event model and the services (methods) it supports. A bean that follows the design patterns can be introspected with the Java class reflection facility. Simply put, variables and methods are discovered dynamically, and names that follow the design patterns are used to deduce the required information about properties, events and methods. Alternatively, all of this information can be obtained from the accompanying information class, if one is present. This is the usual route for transforming existing classes into Java beans. A minimal bean following these conventions is sketched below.
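    The sketch is serializable, exposes a property through a getter/setter pair that matches the design patterns, and uses the delegation event model to notify registered listeners of changes. The Thermometer class and its temperature property are hypothetical.

import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.io.Serializable;

public class Thermometer implements Serializable {
    private double temperature;
    private final PropertyChangeSupport listeners = new PropertyChangeSupport(this);

    // The getTemperature/setTemperature pair is recognized as a "temperature" property.
    public double getTemperature() {
        return temperature;
    }

    public void setTemperature(double value) {
        double old = temperature;
        temperature = value;
        // Notify all registered listeners that the bean's state has changed.
        listeners.firePropertyChange("temperature", old, value);
    }

    public void addPropertyChangeListener(PropertyChangeListener l) {
        listeners.addPropertyChangeListener(l);
    }

    public void removePropertyChangeListener(PropertyChangeListener l) {
        listeners.removePropertyChangeListener(l);
    }
}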

    The information about a bean can be used to assemble applications. Currently, the assembly is usually done statically inside development tools. For example, a development tool may allow for creating event dependencies that utilize the implicit event models of the participating beans. Another tool may be a property editor, which may utilize the implicit information on the properties of the bean to provide a customization environment. A bean with modified properties can be stored in persistent storage, so the customized version is used in the future. In that way, vendors can provide generic beans that later obtain the behavior desired by the customers. The sketch below illustrates how a tool discovers a bean's properties through introspection.
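    Behind the scenes, a builder tool asks the introspector for the bean's properties, which are deduced from the design patterns or taken from an accompanying bean information class if one is provided. Thermometer refers to the hypothetical bean sketched above.

import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;

public class InspectBean {
    public static void main(String[] args) throws Exception {
        // Analyze the bean class and report the properties it exposes.
        BeanInfo info = Introspector.getBeanInfo(Thermometer.class);
        for (PropertyDescriptor p : info.getPropertyDescriptors()) {
            System.out.println(p.getName() + " : " + p.getPropertyType());
        }
    }
}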

    From the perspective of this course, dynamic assembly of beans is far more interesting, but it is not well understood yet. Java beans for dynamically composed applications (I call them composable applications) would probably use the specifications for Enterprise JavaBeans. These specifications are aimed at beans destined for modeling business objects that could reside on servers, rather than on clients. Business objects residing on servers will be accessed remotely using CORBA, RMI or another ORB. Alternatively, a compact business object could be downloaded to the requesting party, where it could be used alone or as part of a dynamically assembled application to process local data or provide services locally.

    Java beans, like other Java classes, are relatively small, but they can be made even more compact by compressing them into Java archive files. Java archive files, so-called JAR files, can include several components including Java code, HTML documents, graphics, audio, etc. They provide a very convenient and efficient means of transporting components over a network.

    In contrast to ActiveX controls, Java bean components are portable, because they execute on a uniform interpreter, the Java Virtual Machine (JVM). The only requirement is that the underlying platform runs a JVM. Fortunately, this is the case with almost any operating environment. A JVM can also be implemented in hardware (e.g., the picoJava chip from Sun), making it possible to run Java code on miniature platforms like telephones and household items.

    Java beans are also portable from a developer's perspective. A bean that has been created on one platform can be reused on another platform and in a different development environment.

    JavaBeans security is another feature that contrasts with its ActiveX counterpart. A Java bean runs inside a so-called security sandbox, which determines (limits) the scope of the operations that the bean is allowed to perform locally. The level of security can be controlled by the user.