From Internet to ActiveNet

D.L. Tennenhouse[*], S.J. Garland, L. Shrira and M.F. Kaashoek
Laboratory for Computer Science, MIT

Abstract

The ActiveNet will address the mismatch between the rate at which user requirements can change, i.e., overnight, and the pace at which physical assets can be deployed. As the Internet grows, it is increasingly difficult to maintain, let alone accelerate, the pace of innovation. Today, after a concept is prototyped, its large-scale deployment takes about 8 years. The ActiveNet will accelerate the pace of innovation by decoupling network services from the underlying hardware and by allowing new services to be demand-loaded into the infrastructure. In the same way that IP enabled a range of upper layer protocols and transmission substrates, the ActiveNet will facilitate the development of new network services and hardware platforms.

Active Networks [1] represent a new approach to network architecture that incorporates interposed computation. These networks are "active" in two ways: routers and switches within the network can act on, i.e., perform computations on, user data flowing through them; furthermore, users can "program" the network, by supplying their own programs to perform these computations.

We propose that interested researchers work towards the deployment of a wide area ActiveNet, based on the active network approach. This experimental infrastructure will be overlaid on existing transmission facilities, such as the Internet, using techniques similar to those used by the prototype MBONE, i.e., by "tunneling" through existing networks. The connectivity available through existing substrates will enable the parallel deployment of a few different programming models, providing an opportunity to explore alternatives. Researchers at Bellcore, BBN, CMU, Columbia, Johns Hopkins, U. Arizona, UCLA, U. Penn and U. Washington have agreed to work towards this goal and other organizations, in both industry and academia, have expressed interest in participating.

Our work is motivated by user "pull", as well as technology "push". The "pull" comes from the ad hoc collection of firewalls, Web proxies, multicast routers, mobile proxies, video gateways, etc. that perform user-driven computation at nodes "within" the network. These nodes are flourishing, suggesting user and management demand for their services. One goal of our work is to replace the present collection of ad hoc approaches with a generic capability that allows users to program their networks.

The technology "push" is the emergence of "active technologies", supporting the encapsulation, transfer, interposition, and safe and efficient execution of program fragments. Today, active technologies are applied above the end-to-end network layer; for example, to allow clients and servers to exchange program fragments. Our innovation is to leverage and extend these technologies for use within the network - in ways that will fundamentally change today's model of what is "in" the network.

1. Active Networks

Active networks will allow users to deploy new services by tailoring components of the shared infrastructure to suit their requirements. For example, users could request that a router execute an application-specific compression algorithm during the processing of their packets. There are three advantages to basing the network architecture on the exchange of active programs, rather than static packets:

Our approach replaces the passive packets of today's networks with active "capsules" - miniature programs that are executed at each router / switch they traverse. Every message, or capsule, that passes between nodes contains a program fragment that may include embedded data. When a capsule arrives at an active node, its contents are evaluated, in much the same way that a PostScript printer interprets files that are sent to it.



Figure 1. Active Node Organization

Figure 1 provides a conceptual view of how an active node could be organized. Bits arriving on incoming links are "framed", using traditional link layer techniques to identify capsule boundaries. The capsule's contents are then dispatched to a transient execution environment where they are evaluated. We hypothesize that programs are composed of "primitive" instructions, that perform basic computations on the capsule contents, and can also invoke external "methods", which provide access to programs and resources external to the transient environment. The execution of a capsule may result in the scheduling of zero or more capsules for transmission on the outgoing links and/or changes to the non-transient state of the node. The transient environment is destroyed when capsule evaluation terminates.
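The evaluation cycle described above can be illustrated with a small sketch. It is a toy model under stated assumptions, not a proposed design: the names Capsule, ActiveNode, forward and count, the three "primitive" instructions, and the stack-based transient environment are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Capsule:
    # A capsule is a miniature program (opcode, argument pairs) plus embedded data.
    program: List[Tuple[str, object]]
    data: bytes = b""


@dataclass
class ActiveNode:
    name: str
    # External "methods": the only route out of the transient environment (hypothetical API).
    methods: Dict[str, Callable[..., None]] = field(default_factory=dict)
    # Non-transient state that survives across capsule evaluations.
    state: Dict[str, object] = field(default_factory=dict)
    # Capsules scheduled for transmission, keyed by outgoing link name.
    outgoing: Dict[str, List[Capsule]] = field(default_factory=dict)

    def evaluate(self, capsule: Capsule) -> None:
        # The transient environment is created per capsule...
        env = {"capsule": capsule, "data": capsule.data, "stack": []}
        for op, arg in capsule.program:
            if op == "push":        # primitive: push a constant
                env["stack"].append(arg)
            elif op == "len":       # primitive: push the length of the embedded data
                env["stack"].append(len(env["data"]))
            elif op == "invoke":    # call an external method registered at the node
                self.methods[arg](self, env)
            else:
                raise ValueError(f"unknown instruction: {op}")
        # ...and discarded when evaluation terminates (env goes out of scope here).


def forward(node: ActiveNode, env: dict) -> None:
    # Schedule the capsule on the link whose name is on top of the stack.
    link = env["stack"].pop()
    node.outgoing.setdefault(link, []).append(env["capsule"])


def count(node: ActiveNode, env: dict) -> None:
    # Change non-transient node state: count the capsules evaluated so far.
    node.state["seen"] = node.state.get("seen", 0) + 1


node = ActiveNode("n1", methods={"forward": forward, "count": count})
capsule = Capsule(
    program=[("invoke", "count"), ("push", "link0"), ("invoke", "forward")],
    data=b"hello",
)
node.evaluate(capsule)
```

After the call, the node has recorded one capsule in its non-transient state and has queued the capsule on "link0"; nothing from the transient environment remains.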

2. Lead Users / Applications

We are encouraged by the observation that a number of "lead users" have pressing requirements for the transparent interposition of computation within the network. These include the developers of firewalls, Web proxies, and mobile/nomadic gateways. In the absence of architectural support for interposition, these users have adopted a variety of ad hoc interposition services, some of which pose as network layer routers even though they perform application/user-specific functions.

Firewalls implement customized filters that determine which packets are allowed to cross administrative boundaries. The manual process of updating firewalls, to enable the use of new applications, is an impediment to the adoption of new technology that could be automated in an active infrastructure.

Web proxies are an application-specific service interposed between browsers and servers. Harvest [2] employs a hierarchical caching scheme that reduces latencies experienced by individual users and the aggregate bandwidth that is consumed. Using the ActiveNet, this system could be extended, allowing nodes of the hierarchy to be located at strategic points within the network.

Interposition is leveraged by many researchers addressing mobility and nomadic computing. Kleinrock [3] suggests the interposition of "nomadic routers", between end systems and the network. Similarly, "nomadic agents" [3] and "gateways" are placed near discontinuities in the available bandwidth, e.g., at wireless base stations. Services performed at these gateways include file caching and image transcoding [4]. The InfoPad [5] takes this process further, instantiating user-specific "pad servers" at intermediate nodes. Finally, researchers have investigated TCP "snooping" [6], in which per-connection state information is retained at wireless base stations.

3. Active Technologies

One of the reasons it will be possible to build and secure active networks is the availability of active technologies - mechanisms that allow users to inject customized programs into shared resources. Our plan is to use these technologies to ensure that each capsule is evaluated within a restricted environment - and to control the actions that can be performed on resources that lie outside of that environment. Our use of active technologies within the network is quite novel - until now active technologies have only been used on an end-to-end basis, e.g., shipping "applets" from web servers to browsers.

Active technologies have been emerging in the fields of operating systems and programming languages for over ten years. Early work addressed mobility, efficiency or safety. PostScript is an example of an effort that stressed mobility over safety. Safe-Tcl [7] stressed safety over efficiency, using a source language that can be interpreted safely. In the parallel processing arena, "active messages" [8, 9] favored efficiency, by reducing the "program" to a single instruction that invokes an application-specific handler. The pace of research accelerated with the advent of distributed systems and internetworking. The x-kernel [10] supports the composition of protocol handlers by providing an architecture for stacking them and by automating the dispatch process. Other efforts [11-13] have focused on less friendly environments by improving both the safety and efficiency with which handlers can be implemented.

Recent work has considered all three attributes in concert. Java [14] relies on an intermediate instruction set [15] that has been carefully designed to reduce the operand validation the interpreter must perform as each instruction is executed. The most aggressive of the active technologies support the "safe" execution of binary programs that are directly executed by the underlying hardware: SPIN [12] relies on a "trustworthy" compiler to generate programs that will not stray beyond a restricted environment; whereas the approach described in [16, 17] prescribes a set of rules that instruction sequences must adhere to. An important aspect of the latter work is that the "rules" are defined in such a way that conformance to them can be statically and automatically verified when an instruction sequence is presented for execution.

In summary, variability in network applications and traffic patterns suggests that there is no "right" technology. Accordingly, the ActiveNet initiative will extend a range of programming languages and active technologies, and explore their use within the network infrastructure.

4. The ActiveNet Initiative

The ActiveNet effort will involve the synthesis of work in programming languages, operating systems and networking. An approach to the organization of such an effort is depicted in Table 1, which illustrates how activities can be organized along the lines of:

An additional activity, not listed in the table, involves the deployment and day-to-day operations of the research platforms, including the development of the tools and techniques to support network operation, monitoring and measurement. Although most of the nodes will be located at participating research sites, provision will be made to locate nodes at strategic locations, such as Internet NAPs.



Table 1. ActiveNet Participants / Activities.

* indicates additional CMU activities, in support of their principal activities.
** Bellcore may also participate in the deployment and operation of the network.

5. ActiveNet Research at MIT

Specific ActiveNet activities to be undertaken at MIT include:

Architectural Guidelines (Programming models)

This activity is focused on architectural support for interoperability. As part of this effort, we will coordinate architectural activities within the ActiveNet community, by convening workshops and working with others to publish a set of guidelines, similar to early Internet RFCs. We will also develop a suite of "foundation components" that will provide the basis of an ActiveNet API. Finally, we will investigate specification techniques that can be used to reason about capsules and the security of network resources.

Interoperability. Packet networks achieve interoperability by standardizing the syntax and semantics of packets. For example, Internet routers all support the agreed IP specifications - they perform the same computation on every packet. In contrast, active nodes can perform different computations on the packets that flow through them. Interoperability is achieved at a higher level of abstraction - instead of standardizing the computation, we standardize the computational model, i.e., the instruction set and resources available to capsules.

To enable evaluation across the ActiveNet, it will be necessary to agree on a common model for the encoding of capsule "instructions". Our objectives for this programming model include: mobility, the ability to transfer capsules to a range of platforms; safety, the ability to restrict the resources that capsules access; and efficiency, enabling the above without compromising network performance. Our plan is to adopt an intermediate instruction set, such as [18], as the primary vehicle for interoperability. One of the benefits of the IP architecture is that it enables an "hourglass" architecture in which a variety of upper layer protocols can operate over a wide range of network substrates. Our intermediate instruction set will provide an analogous "hourglass" that facilitates mobility - a range of programming languages and compilers can be used to generate intermediate code for execution on diverse ActiveNet platforms. We will also provide extensions that allow implementors to directly leverage source and binary encodings, while retaining the intermediate encoding as the "fallback" that ensures interoperability.

Foundation Components. We are setting out to create a "Smalltalk of networking", through which we can apply a programming language perspective to networks and their protocols. We are interested not just in the "language" itself but also in the class hierarchy, etc. that extends it. In place of protocol "stacks", we will develop software "components" that leverage the tools of the programming trade - encapsulation, polymorphism and inheritance.

The node instruction set will be extended through a set of "foundation components" that provide secure access to the node's embedded operating system. This ActiveNet API will include services tailored to the network environment, such as the efficient copying of capsules and control over the scheduling of transmissions. In addition to controlling access to the physical resources of nodes, foundation components secure access to logical resources, such as routing tables.

Foundation components will provide "soft" storage for objects that need not survive the re-initialization of the node or can be deleted without notice. Connections can be realized by arranging for a capsule to traverse a specific path, and leave a small amount of state in each node that evaluates it. A similar approach could be used to implement "flows" [19] that tolerate the deletion of information from soft storage - if the flow state is not found at a node the capsule dynamically generates the information and leaves it in the node for the convenience of the next capsule.
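The soft-state flow idea can be sketched as follows. This is a minimal illustration, assuming a bounded store with least-recently-used eviction as the stand-in for "deletion without notice"; SoftStore, compute_next_hop and forward_flow_capsule are hypothetical names, and the routing lookup is a placeholder for whatever work a real capsule would do to regenerate flow state.

```python
from collections import OrderedDict


class SoftStore:
    """Bounded store whose entries may vanish at any time (here: LRU eviction)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as recently used
            return self._items[key]
        return None

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the oldest entry without notice

    def clear(self):
        # Models re-initialization of the node: all soft state is lost.
        self._items.clear()


def compute_next_hop(routes, destination):
    # Placeholder for the (possibly expensive) work of setting up flow state.
    return routes[destination]


def forward_flow_capsule(store, routes, flow_id, destination):
    """Return (next_hop, regenerated) for a capsule belonging to flow_id."""
    next_hop = store.get(flow_id)
    if next_hop is not None:
        return next_hop, False  # flow state found in soft storage
    # Flow state missing: regenerate it and leave it for the next capsule.
    next_hop = compute_next_hop(routes, destination)
    store.put(flow_id, next_hop)
    return next_hop, True


store = SoftStore(capacity=2)
routes = {"host-b": "link1"}
first = forward_flow_capsule(store, routes, "flow-7", "host-b")
second = forward_flow_capsule(store, routes, "flow-7", "host-b")
store.clear()  # the node loses its soft state
third = forward_flow_capsule(store, routes, "flow-7", "host-b")
```

The first and third capsules regenerate the flow state, while the second finds it in the store. Correctness never depends on the state being present; only the cost of processing a capsule does.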

Unless programs are short relative to the data they encapsulate, it will prove inefficient to carry them in every message. The programming environment will be extensible, so that capsules can reference methods defined by other capsules. In this way, most capsules can be concise - as short as a single instruction that invokes a user-specified method. For example, a capsule carrying an Internet datagram could contain a single instruction that invokes the "IPv4" method on its payload. We will also investigate a scheme in which nodes maintain "caches" of external methods and use a dynamic mechanism to locate and load methods on demand.
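A minimal sketch of such a method cache follows, assuming a capsule reduced to a (method name, payload) pair. MethodCache, REPOSITORY, load_from_repository and ipv4_destination are hypothetical names; the in-memory dictionary stands in for whatever dynamic mechanism would locate and load methods, and the toy "IPv4" method merely reads the destination address from bytes 16 to 19 of the datagram header.

```python
import ipaddress


def ipv4_destination(payload):
    # Toy "IPv4" method: extract the destination address (header bytes 16..19).
    return str(ipaddress.IPv4Address(payload[16:20]))


# Stand-in for a remote source of method code (an assumption of this sketch).
REPOSITORY = {"IPv4": ipv4_destination}


def load_from_repository(name):
    return REPOSITORY[name]


class MethodCache:
    """Per-node cache of external methods, filled on demand."""

    def __init__(self, loader):
        self._loader = loader
        self._methods = {}
        self.loads = 0  # number of on-demand loads performed so far

    def lookup(self, name):
        method = self._methods.get(name)
        if method is None:
            # Cache miss: locate and load the method, then remember it.
            method = self._loader(name)
            self._methods[name] = method
            self.loads += 1
        return method


def evaluate(cache, capsule):
    # The capsule is as short as a single instruction: invoke a named method.
    method_name, payload = capsule
    return cache.lookup(method_name)(payload)


cache = MethodCache(load_from_repository)
datagram = bytes(16) + bytes([10, 0, 0, 1])  # 20-byte header, destination 10.0.0.1
result_one = evaluate(cache, ("IPv4", datagram))
result_two = evaluate(cache, ("IPv4", datagram))
```

Both evaluations yield the same destination address, but the method is loaded only once; later capsules that name "IPv4" pay only for the cache lookup.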

ActiveNet Specifications. Active networks will provide the building blocks for a shared infrastructure that transcends many administrative domains. Their design must address a range of security issues that are often brushed over in systems intended for less public environments. We will investigate models that allow implementors and administrators to reason about the delegation of resources. We will also develop methods and tools to describe, contract for, and reason about the deployment of ActiveNet software. The principal idea is to provide different specifications for different customers and purposes: client specifications, for those writing capsules; environment specifications, that constrain the environment in which capsules are executed; and sub-component specifications, that impose requirements on other computational elements in the network.

In particular, we will investigate specification techniques that describe: (i) what users can expect from a computation performed in the network; and (ii) the impact that a computation has on the network. Capsules will interact with software components supplied by network providers and other users, and these components may evolve or be replaced at a rapid rate. Hence the first type of specification, which serves as a contract between the implementors and clients of a component, will need to provide acceptable ranges of behavior, rather than precise single behaviors (e.g., because cached components behave differently from non-cached components). The second type of specification requires a departure from traditional research. In this case, the primary issue is not to determine whether enough resources exist for a computation (because they probably will in a large network) but to determine whether, and at what cost, the network is willing to provide them. From this point of view, performance specifications become yet another aspect of the contract to be negotiated between the supplier and consumer of a service.

Active Work Flow and Storage (Middleware services)

Active networks will leverage middleware services that support the coordination of network management functions, such as the installation of new software and hardware, load balancing, fault tolerance, accounting and auditing. We will investigate the application of work flow concepts (presently used in business environments and in the human genome project) to the design of middleware that tracks dependencies and supports loosely synchronized activities. Since these services must be reliable in the presence of node crashes and network partitions, they will use recoverable storage services to reliably store the state of activities in progress and generate audit trails that can be analyzed to detect possible performance and consistency problems. These active storage services may also be used by other applications, especially those that implement asynchronous multicast services, such as news distribution and other notification services.

Expertise developed in the Thor [20] project will be used to address the shortcomings of current work flow systems. In particular, we can replace their centralized servers and databases with a scalable active storage system that: supports user defined work flow objects; uses efficient client caching and coherency techniques; and provides recoverability guarantees for work flow state. Furthermore, we can apply techniques developed within the Harp [21] project to support replicated servers and provide highly available access.

On-the-fly Compilation (Enabling technologies)

IP router performance is dependent on the careful tuning of "fast paths", i.e., the identification of a streamlined instruction sequence that processes the vast majority of packets, and relegates the complex (and less frequent) cases to other modules. An active node might achieve a similar performance boost by monitoring its traffic and dynamically generating fast path programs that streamline the execution of the most common capsules. We will leverage and extend recent work [22] that has demonstrated "on-the-fly" compilation using `C (tick C), a superset of ANSI C that allows the dynamic generation of efficient, machine-independent code. `C provides many of the performance benefits of pure partial evaluation in the context of a statically-typed and widely-used language.
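The idea of monitoring traffic and generating fast paths can be illustrated with a small sketch. It uses Python source generation in place of `C, which is a substitution made purely for illustration; the three-operation capsule language, the threshold policy, and the names interpret, compile_fast_path and FastPathNode are assumptions of this sketch.

```python
from collections import Counter

# Toy capsule language: each instruction applies an integer operation to a value.
OPS = {"add": "+", "mul": "*", "xor": "^"}


def interpret(program, value):
    # Slow path: decode and dispatch every instruction on every capsule.
    for op, arg in program:
        if op == "add":
            value = value + arg
        elif op == "mul":
            value = value * arg
        elif op == "xor":
            value = value ^ arg
        else:
            raise ValueError(f"unknown instruction: {op}")
    return value


def compile_fast_path(program):
    # Generate a specialized function for this exact program, once.
    expr = "value"
    for op, arg in program:
        if op not in OPS:
            raise ValueError(f"unknown instruction: {op}")
        expr = f"({expr} {OPS[op]} {int(arg)})"
    namespace = {}
    exec(f"def fast(value):\n    return {expr}\n", namespace)
    return namespace["fast"]


class FastPathNode:
    def __init__(self, threshold=3):
        self.threshold = threshold  # sightings required before compiling
        self.seen = Counter()       # traffic monitor: program -> count
        self.fast_paths = {}        # program -> generated function

    def process(self, program, value):
        fast = self.fast_paths.get(program)
        if fast is not None:
            return fast(value)      # common case: run the generated code
        self.seen[program] += 1
        if self.seen[program] >= self.threshold:
            self.fast_paths[program] = compile_fast_path(program)
        return interpret(program, value)


node = FastPathNode(threshold=2)
program = (("add", 3), ("mul", 2), ("xor", 1))  # tuples, so usable as a dict key
results = [node.process(program, 5) for _ in range(4)]
```

The first two capsules take the interpreted path; the second one triggers code generation, and the remaining capsules run the specialized function. Both paths compute the same result, ((5 + 3) * 2) ^ 1 = 17.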

6. Impact

Active networks present an opportunity to change the structure of the networking industry, from a "mainframe" mind-set, in which hardware and software are bundled together, to a "virtualized" approach in which hardware and software innovation are decoupled [23]. The ability to download new services into the infrastructure will lead to a user-driven innovation process, in which the availability of new services will be dependent on their acceptance in the marketplace, and not be delayed by vendor consensus and standardization activities.

As the lead users cited in section 2 demonstrate, computation within the network is already happening - and network architecture must deal with this trend towards interposed computation. A community-wide infrastructure, such as the ActiveNet, will accelerate the pace of research and enable new generations of flexible networks that can be tailored to suit changing user requirements.

References

1. Tennenhouse, D. and D. Wetherall. Towards an Active Network Architecture. in Multimedia Computing and Networking (MMCN 96). 1996. San Jose, CA.

2. Chankhunthod, A., P.B. Danzig, and C. Neerdaels. A Hierarchical Internet Object Cache. in Proceedings of 1996 USENIX. 1996.

3. Kleinrock, L. Nomadic Computing (Keynote Address). in Int. Conf. on Mobile Computing and Networking. 1995. Berkeley, CA.

4. Amir, E., S. McCanne, and H. Zhang. An Application Level Video Gateway. in ACM Multimedia '95. 1995. San Francisco, CA.

5. Le, M.T., F. Burghardt, and J. Rabaey. Software Architecture of the Infopad System. in Mobidata Workshop on Mobile and Wireless Information Systems. 1994. New Brunswick, NJ.

6. Balakrishnan, H., et al. Improving TCP/IP Performance over Wireless Networks. in Int. Conf. on Mobile Computing and Networking. 1995. Berkeley, CA.

7. Borenstein, N. Email with a Mind of its Own: The Safe-Tcl Language for Enabled Mail. in IFIP International Conference. 1994. Barcelona, Spain.

8. von Eicken, T., et al. Active Messages: A Mechanism for Integrated Communication and Computation. in 19th Int. Symp. on Computer Architecture. 1992. Gold Coast, Australia.

9. Agarwal, A., et al. The MIT Alewife Machine: Architecture and Performance. in 22nd Int. Symp. on Computer Architecture (ISCA '95). 1995.

10. O'Malley, S.W. and L.L. Peterson, A Dynamic Network Architecture. ACM Transactions on Computer Systems, 1992. 10(2) p. 110-143.

11. Jones, M. Interposition Agents: Transparently Interposing User Code at the System Interface. in 14th ACM Symp. on Operating Systems Principles. 1993. Asheville, NC.

12. Bershad, B., et al. Extensibility, Safety and Performance in the SPIN Operating System. in 15th ACM Symp. on Operating Systems Principles. 1995.

13. Engler, D.R., M.F. Kaashoek, and J. O'Toole Jr. Exokernel: An Operating System Architecture for Application-Level Resource Management. in 15th ACM Symp. on Operating Systems Principles. 1995.

14. Gosling, J. and H. McGilton, The Java Language Environment: A White Paper, 1995, Sun Microsystems.

15. Gosling, J. Java Intermediate Bytecodes. in SIGPLAN Workshop on Intermediate Representations (IR95). 1995. San Francisco, CA.

16. Wahbe, R., et al. Efficient Software-Based Fault Isolation. in 14th ACM Symp. on Operating Systems Principles. 1993. Asheville, NC.

17. Colusa Software, Omniware: A Universal Substrate for Mobile Code, 1995, Colusa Software.

18. Sun Microsystems Inc., The Java Virtual Machine Specification, 1995,

19. Clark, D.D. The Design Philosophy of the DARPA Internet Protocols. in ACM Sigcomm Symposium. 1988. Stanford, CA.

20. Liskov, B., M. Day, and L. Shrira, Distributed Object Management in Thor, in Distributed Object Management, T. Ozsu, et al., Editor. 1993, Morgan Kaufmann: San Mateo. p. 79-91.

21. Liskov, B., et al. Replication in the Harp File System. in 13th ACM Symp. on Operating Systems Principles. 1991. Pacific Grove, CA.

22. Engler, D.R., W.C. Hsieh, and M.F. Kaashoek. `C: A Language for High-Level, Efficient, and Machine-Independent Dynamic Code Generation. in 23rd Annual ACM Symp. on Principles of Programming Languages (to appear). 1996. St. Petersburg, FL.

23. Tennenhouse, D., et al., Virtual Infrastructure: Putting Information Infrastructure on the Technology Curve. Computer Networks and ISDN Systems (to appear), 1996.