DEVS Library for Layered Queueing Networks
by Dorin Petriu (dorin@sce.carleton.ca)

Overview

  1. This report covers the implementation of a DEVS library for the simulation of Layered Queueing Networks. The section "Layered Queueing Networks" provides an overview of Layered Queueing Networks, the section "LQN Simulation Library for DEVS" describes the design, implementation, and testing of the DEVS library, and the section "Conclusions" concludes the report.

Layered Queueing Networks

Background

  1. Queueing Networks are based on a customer-server paradigm: customers make service requests of servers, and these requests are queued at the server until they can be serviced. Traditional Queueing Networks model only a single layer of customer-server relationships. Layered Queueing Networks (LQN) allow an arbitrary number of client-server levels [5][7]. LQNs can thus model intermediate software servers and be used to detect software deadlocks and software as well as hardware performance bottlenecks [4]. The layered aspect of LQNs makes them very suitable for evaluating the performance of distributed systems [8][9]. For a tutorial on LQNs, please refer to [10].
  2. LQNs model both software and hardware resources. The basic software resource is a task, which runs in the context of a hardware processor; a task is any software object that has its own thread of execution. Tasks have entries which act as service access points. The workload of an entry can be decomposed either into phases, with a first phase executed before sending a reply and a second phase executed after sending the reply, or into activities, which are operations connected in sequence or in parallel.
  3. Service calls can be made from entries in one task to entries in other tasks. LQNs support three types of calls: asynchronous (non-blocking), synchronous (blocking), and forwarding (a chain of calls in which only the original caller blocks). The figure "Time semantics of LQN asynchronous, synchronous, and forwarding calls" shows the time semantics of these call types.
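The blocking behaviour of the three call types can be sketched with a small helper. This is an illustration only, not part of the library; it simplifies forwarding to the sum of the service times along the chain and ignores any phase-2 overlap:

```python
def caller_blocked_time(call_type, service_time, forward_service_time=0.0):
    """Return how long the caller is blocked, in ms (simplified sketch).

    - asynchronous: the caller continues immediately (never blocks)
    - synchronous:  the caller blocks until the server replies
    - forwarding:   the caller blocks until the last server in the
                    forwarding chain replies to the original caller
    """
    if call_type == "asynchronous":
        return 0.0
    if call_type == "synchronous":
        return service_time
    if call_type == "forwarding":
        return service_time + forward_service_time
    raise ValueError(f"unknown call type: {call_type}")
```
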

Solving

  1. LQNs can be solved using either the Layered Queueing Network Solver (LQNS) or the Layered Queueing Simulator (LQSim).
  2. LQNS is an analytic solver developed at Carleton University by Greg Franks as part of his Ph.D. research [1]. LQNS breaks the LQN layers down into separate queueing network sub-models. Each queueing network can be solved analytically using mean value analysis (MVA). The MVA results for each sub-model are then used to update the parameters of the sub-models it is connected to, and the MVA is performed anew. This process is repeated either for a maximum number of iterations or until the results converge within a convergence tolerance specified by the user.
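The per-sub-model computation can be illustrated with the classic exact MVA recursion for a closed queueing network. This is a sketch only; LQNS itself applies approximate MVA across its layered sub-models:

```python
def exact_mva(service_demands, n_customers, think_time=0.0):
    """Exact MVA for a closed network of FIFO queueing stations.

    service_demands: per-station service demand D_k in ms
    Returns (throughput, per-station mean queue lengths).
    """
    queue = [0.0] * len(service_demands)            # Q_k(0) = 0
    throughput = 0.0
    for n in range(1, n_customers + 1):
        # residence time at each station: R_k(n) = D_k * (1 + Q_k(n-1))
        resid = [d * (1.0 + q) for d, q in zip(service_demands, queue)]
        throughput = n / (think_time + sum(resid))  # X(n)
        queue = [throughput * r for r in resid]     # Q_k(n) = X(n) * R_k(n)
    return throughput, queue
```

The throughput is always bounded by the bottleneck station, 1/max(D_k), which is a handy sanity check on the recursion.
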
  3. LQSim uses the ParaSol simulation environment. ParaSol can simulate multithreaded systems that support transactions and provides built-in statistics for monitoring simulation objects [2][3]. LQNs are simulated by creating tokens for each call and following those tokens through the system. The performance metrics are obtained by recording the wait times and other statistics for each token. LQSim requires a large number of token samples in order to gather statistically meaningful data, which makes it appreciably slower than LQNS: long simulations are needed to generate accurate results.
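The token-following idea can be sketched for a single FIFO server: follow each call token through the queue, record how long it waited, and average the waits. Illustrative code, not LQSim's implementation:

```python
def simulate_fifo_server(arrivals, service_times):
    """Average queueing wait for tokens at one FIFO server.

    arrivals: token arrival times in ms, sorted ascending
    service_times: matching per-token service times in ms
    """
    free_at = 0.0                     # time at which the server next idles
    waits = []
    for t, s in zip(arrivals, service_times):
        start = max(t, free_at)       # token waits if the server is busy
        waits.append(start - t)       # recorded statistic for this token
        free_at = start + s
    return sum(waits) / len(waits)
```
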
  4. Both LQNS and LQSim generate results that show entry average service times, average waiting time, throughput, and utilization, as well as processor throughput and utilization.
  5. DEVS models for the LQN simulation library.

    Processor
      DEVS atomic model: Processor
        • receives a call and executes it for the specified time
        • replies when done
        • calculates utilization and throughput
      DEVS coupled model: Processor
        • combines Gather, Queue, and the atomic Processor for full LQN processor functionality

    Entry with phases
      DEVS atomic model: Entry
        • receives a call, executes the associated workloads (phase 1 and phase 2 processing, making calls), and replies when done
        • processor demands for phase 1 and phase 2 must first be initialized through the initproc port
        • server calls for phase 1 and phase 2 must first be initialized through the initserv port
      DEVS coupled model: Entry
        • combines Gather, Queue, the atomic Entry, and Distribute for full LQN entry functionality

    implied queue
      DEVS atomic model: Queue
        • adds an incoming call to the queue
        • sends the first element in the queue to the attached idle Processor or Entry
        • passes the reply back up to the call source

    aggregating calls from multiple sources
      DEVS atomic model: Gather
        • aggregates calls from multiple input ports and sends them out the single output port
        • adds a message with the input port index
        • passes the reply from the reply port at the "output end" through to the appropriate response port at the "input end"

    distributing calls to different entries
      DEVS atomic model: Distribute
        • receives calls on the single input port and distributes them to the appropriate output port
        • sends the reply from the reply port at the "output end" to the single response port at the "input end"

    Task
      DEVS coupled model: Task
        • coupled model composed of multiple Entries

    Disk
      model: Processor (reused)
        • reuses the functionality of a Processor

    Activity
      not represented

LQN Simulation Library for DEVS

Design

  1. The DEVS LQN library must represent processors, tasks, and entries with phases. Additionally, the library might also represent disks and activities. The DEVS simulation should provide results for:
  2. entry average service time, throughput, and utilization
  3. phase average service time
  4. processor throughput and utilization
  5. queue average waiting time and average queue length (these are not provided outright by the existing LQN solver and simulator; although they can be calculated from the existing results, it would be desirable to provide them directly)
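For reference, the metrics listed above reduce to simple ratios of raw simulation counters. A minimal sketch with hypothetical variable names, not the library's actual bookkeeping:

```python
def metrics(busy_ms, jobs_done, elapsed_ms, total_wait_ms, queue_len_area):
    """Derive the reported metrics from raw counters (illustrative names).

    queue_len_area: integral of queue length over time (element-ms),
    the usual accumulator for a time-averaged queue length.
    """
    return {
        "utilization": busy_ms / elapsed_ms,            # fraction of time busy
        "throughput": jobs_done / elapsed_ms,           # jobs per ms
        "avg_service_ms": busy_ms / jobs_done,
        "avg_wait_ms": total_wait_ms / jobs_done,
        "avg_queue_len": queue_len_area / elapsed_ms,   # time-averaged length
    }
```
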
  6. The main design issue for the DEVS LQN simulation library was to decide which LQN elements or artifacts to model as DEVS atomic models and which ones to model as DEVS coupled models. Processors, tasks, and entries with phases definitely needed to be included in the library. Activities were initially left out since they are optional in the LQN notation.
  7. LQN elements have queues that are implicitly supported by LQNS and LQSim. FIFO queues were incorporated explicitly into the DEVS LQN library. Since queues behave the same way for software or hardware elements, a decision was made to implement a universal queue as a separate atomic model to be coupled with the processor or entry atomic models.
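The universal FIFO queue's behaviour can be sketched as a small state machine in Python. This is a hand-rolled illustration of the queue's logic, not the actual DEVS atomic model code:

```python
from collections import deque

class QueueModel:
    """FIFO queue coupled to one server (a Processor or an Entry)."""

    def __init__(self):
        self.buffer = deque()
        self.server_idle = True

    def on_call(self, call):
        """External transition on the "in" port: enqueue a call."""
        self.buffer.append(call)
        return self._dispatch()

    def on_ready(self):
        """External transition on the "ready" port: server finished."""
        self.server_idle = True
        return self._dispatch()

    def _dispatch(self):
        """Output on the "out" port: head of queue, if the server is idle."""
        if self.server_idle and self.buffer:
            self.server_idle = False
            return self.buffer.popleft()
        return None
```
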
  8. LQN calls are made using entry names to identify the call target. Therefore a mechanism was needed to deal with addressing the calls in DEVS. The solution was to implement a DEVS version of a multiplexer and demultiplexer to either gather calls into a given queue (either for an entry or a processor) or to distribute calls from an entry to other entries.
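The addressing scheme can be sketched as follows: Gather tags each call with its input-port index so the reply can be routed back, while Distribute selects the output port matching a target index. These are hypothetical helper functions for illustration only:

```python
def gather(port_index, call_value):
    """Merge a call arriving on in[port_index] into the single output
    stream, prepending a message that carries the source index."""
    return [("source", port_index), ("call", call_value)]

def distribute(target_index, num_ports=10):
    """Pick the output port for a call addressed to target_index."""
    if not 0 <= target_index < num_ports:
        raise ValueError("target index outside out[0...9]")
    return f"out{target_index}"
```
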
  9. The table "DEVS models for the LQN simulation library" lists the DEVS models for the different LQN elements, while the figures "FSMs for the DEVS queue and processor atomic models" and "FSM for the DEVS entry atomic model" show FSMs for the behaviour of the Queue, Processor, and Entry atomic models. The Gather and Distribute models have non-blocking, single-state FSMs that instantly pass messages through and route them to the correct ports.

DEVS Coupled Model Structure

  1. The figure "ROOM structure for the LQN processor and LQN entry DEVS coupled models" shows the structure, using ROOM [6] actor notation, of the DEVS coupled models for LQN Processors and Entries, incorporating queues and message-routing multiplexers/demultiplexers. The in ports of the Processor and Entry atomic models are connected to the out ports of their dedicated Queue atomic models. The in port of each Queue is connected to the output port of a Gather multiplexer model. For entries, the servcall output port is connected to the in port of the Distribute demultiplexer, which sends each call out the appropriate out port for its intended target. The same sort of connections are repeated for the reply ports, with the reply messages travelling in the opposite direction. It is these coupled models that fully represent LQN processors and entries.
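The wiring described above can be written out as (source port, destination port) pairs. This is an illustrative summary of the coupled Entry model's internal call-path couplings, not the simulator's actual coupling specification:

```python
# Forward call path and its reply counterparts inside a coupled Entry.
ENTRY_COUPLINGS = [
    ("Gather.out",       "Queue.in"),        # aggregated calls enter the queue
    ("Queue.out",        "Entry.in"),        # head of queue goes to the entry
    ("Entry.servcall",   "Distribute.in"),   # outgoing service calls fan out
    # replies travel the corresponding paths in the opposite direction
    ("Queue.reply",      "Gather.response"),
    ("Distribute.reply", "Entry.response"),
]

def targets_of(source):
    """All destination ports coupled to a given source port."""
    return [dst for src, dst in ENTRY_COUPLINGS if src == source]
```
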

Structural Limitations

  1. The coupled Entry and Processor models are the building blocks that can be composed into layered models. The only limit on the LQN models that can be built is the ten input ports of the Gather multiplexers and the ten output ports of the Distribute demultiplexers. Coupled Processor models can therefore have at most ten different entries running on them, and coupled Entry models are limited to at most ten different clients and can call at most ten other servers. This limitation can easily be overcome by implementing Gather and Distribute models with more ports if that becomes desirable.

Messaging

  1. LQN messages can be thought of as having a source field denoting the entity making the call, a destination or target field denoting the entry for which the call is destined, and a demand field denoting the workload associated with the call. Simple DEVS messages carry only a single value each, making it necessary to send and receive sets of two or three messages in order to transmit all of the required fields. The table "DEVS LQN simulation library messages" lists the messages sent between atomic models in the DEVS LQN simulation library, the order in which they are sent, and how they should be interpreted.
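The ordered-message convention can be sketched for an initserv-style triple. The helper names are hypothetical; the point is that the position of each single-value message carries the field meaning:

```python
def pack_initserv(phase, num_calls, target_index):
    """Encode a three-field logical message as an ordered set of
    single-value DEVS messages."""
    return [phase, num_calls, target_index]   # order carries the meaning

def unpack_initserv(messages):
    """Decode the ordered set back into named fields; the messages
    must arrive in the agreed order."""
    phase, num_calls, target_index = messages
    return {"phase": phase, "calls": num_calls, "target": target_index}
```
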
  2. DEVS LQN simulation library messages.

    Processor (reply) → Queue (response) | LQN equivalent: done
      • reply: notifies the source entry that the processing is done; the message value is the actual processing time in ms

    Processor (ready) → Queue (ready) | LQN equivalent: done
      • ready: ready for another job; the message value is irrelevant

    Processor (throughput) | observation output
      • throughput: the message value is the processor throughput in jobs per ms

    Processor (utilization) | observation output
      • utilization: the message value is the fraction/percentage of time that the processor has been busy

    Entry (proccall) → Gather (in[0...9]) | LQN equivalent: processor call
      • processor service demand: the message value is the processor demand in ms

    Entry (servcall) → Distribute (in) | LQN equivalent: service call
      • service call: the message value is the index of the target server

    Entry (avservtime) | observation output
      • average entry service time: the message value is the average entry service time in ms

    Entry (avph1time) | observation output
      • average phase 1 service time: the message value is the average phase 1 service time in ms

    Entry (avph2time) | observation output
      • average phase 2 service time: the message value is the average phase 2 service time in ms

    Entry (throughput) | observation output
      • throughput: the message value is the entry throughput in jobs per ms

    Entry (utilization) | observation output
      • utilization: the message value is the fraction/percentage of time that the entry has been busy

    Queue (out) → Processor (in) | LQN equivalent: processor call
      • processor service demand: the message value is the service demand in ms

    Queue (out) → Entry (in) | LQN equivalent: service call
      • service call: the message value is irrelevant

    Queue (reply) → Gather (response) | LQN equivalent: reply
      • reply: the message value is the index of the source that must be replied to

    Queue (averagesize) | observation output
      • average queue size: the message value is the average number of elements in the queue at the time the message was sent

    Queue (averagewait) | observation output
      • average queueing wait: the message value is the average number of milliseconds a message spent in the queue at the time the message was sent

    Gather (out) → Queue (in) | LQN equivalent: service call
      • source of service call: the message value is the index of the call source
      • service call demand: if attached to a processor, the message value is the processor service demand in ms; otherwise the message value is irrelevant

    Gather (reply[0...9]) → Distribute (resp[0...9]) | LQN equivalent: reply
      • reply: the message value is irrelevant

    Distribute (out[0...9]) → Gather (in[0...9]) | LQN equivalent: service call
      • service call: if attached to a processor, the message value is the processor service demand in ms; otherwise the message value is irrelevant

    Distribute (reply) → Entry (response) | LQN equivalent: reply
      • reply: the message value is the index of the call target returning the reply

    Entry (initproc) | initialization input
      • phase number: the message value is the phase number to initialize
      • processor demand: the message value is the processor demand in ms

    Entry (initserv) | initialization input
      • phase number: the message value is the phase number to initialize
      • calls: the message value is the number of calls to make to the target server
      • call target: the message value is the index of the target server
DEVS LQN Limitations

  1. DEVS LQN Tasks are formed by aggregating, but not interconnecting, coupled Entry models. In the current implementation, the resulting tasks end up having an individual queue for each entry. This differs from the queueing model of single-threaded LQN tasks, where the task has a single queue shared by all its entries. However, this is not an issue for single-entry tasks or for tasks with as many threads as entries. The solution to this potential problem is to add a Distribute layer between the atomic Queue and atomic Entry models so that a single queue is shared among the multiple entries in a task. Implementing this also requires refining the messages to have two target fields: one for the target task and another for the target entry within that task.
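The proposed two-field addressing can be sketched as follows, with hypothetical names: a call identifies the target task, whose single shared queue it joins, and the entry within that task, which the added Distribute layer then selects:

```python
def route_shared_queue_call(task_index, entry_index, entries_per_task=10):
    """Resolve a two-field target to (shared task queue, entry port).

    Illustrates the refinement only; the current library addresses
    calls with a single entry index.
    """
    if not 0 <= entry_index < entries_per_task:
        raise ValueError("entry index outside out[0...9]")
    # one queue per task, shared by all its entries; the per-task
    # Distribute layer picks the entry's port from the second field
    return (f"task{task_index}.queue", f"task{task_index}.out{entry_index}")
```
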
  2. The DEVS environment does not provide any finer time granularity than milliseconds. This is a limitation for evaluating systems that require more precise time scales. A possible solution would be to reconfigure the DEVS environment so that time is counted in unitless "ticks" that can be interpreted to have whatever precision a user desires. This is the approach currently used in the LQSim simulator.
  3. Another limitation of the current DEVS LQN implementation is that the random number routines in randlib.c (Cygwin) do not generate properly varying random numbers: the generator is seeded with the same value every time, so the numbers generated are identical from run to run. As well, the genexp method provided with the DEVS package was observed, over long runs, to generate exponentially distributed numbers with an ever-increasing mean. Both of these issues should be easy to address by incorporating better mathematical libraries into the code.
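Both fixes can be sketched with Python's standard library: seed each stream explicitly (and differently per run when variation is wanted) and draw exponential variates whose mean stays fixed over arbitrarily long runs. This illustrates the fix in principle, not the interface of randlib.c:

```python
import random

def make_exp_stream(mean_ms, seed):
    """Yield exponentially distributed delays with a stationary mean.

    An explicit per-stream seed gives reproducibility when desired;
    passing a different seed per run restores run-to-run variation.
    """
    rng = random.Random(seed)          # independent, explicitly seeded stream
    lam = 1.0 / mean_ms                # rate parameter of the exponential
    while True:
        yield rng.expovariate(lam)     # mean stays 1/lam = mean_ms forever
```
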

Conclusions

  1. The DEVS LQN simulation library provides a starting point for creating simple LQN performance models in the DEVS environment. It makes a contribution to the LQN modeling paradigm by extending it to a simulation platform that supports interactions between different models and different simulation platforms, something that the existing LQNS and LQSim solvers cannot do.
  2. The current implementation has some weaknesses: the generation of random numbers, the fixed simulation timescale, and the individual queues given to each entry. Additional work should also be undertaken to add support for asynchronous and forwarding calls to the library - the current version only supports synchronous calls. Eventually the library should also include support for LQN activities.

References

  1. Greg Franks, "Performance Analysis of Distributed Server Systems", Report OCIEE-00-01, Ph.D. thesis, Carleton University, Ottawa, Jan. 2000
  2. E. Mascarenhas, "A System for Multithreaded Parallel Simulation and Computation with Migrant Threads and Objects", Ph.D. Thesis, Department of Computer Sciences, Purdue University, West Lafayette, USA, 1996
  3. E. Mascarenhas, F. Knop, and V. Rego, "ParaSol: A Multithreaded System for Parallel Simulation Based on Mobile Threads", Winter Simulation Conference, 1995
  4. J.E. Neilson, C.M. Woodside, D.C. Petriu and S. Majumdar, "Software Bottlenecking in Client-Server Systems and Rendez-vous Networks", IEEE Trans. On Software Engineering, Vol. 21, No. 9, pp. 776-782, September 1995
  5. J. A. Rolia, K. C. Sevcik, "The Method of Layers", IEEE Transactions on Software Engineering, Vol. 21, No. 8, 1995, pp. 682-688
  6. B. Selic, G. Gullekson, P.T. Ward, "Real-Time Object-Oriented Modeling", John Wiley & Sons, New York, 1994
  7. C. M. Woodside, "Throughput Calculation for Basic Stochastic Rendezvous Networks", Performance Evaluation, Vol. 9, No. 2, Apr 1988, pp. 143-160
  8. C. M. Woodside, J. E. Neilson, D. C. Petriu and S. Majumdar, "The Stochastic Rendezvous Network Model for Performance of Synchronous Client-Server-Like Distributed Software", IEEE Transactions on Computers, Vol. 44, No. 1, Jan 1995, pp. 20-34
  9. C. M. Woodside, S. Majumdar, J. E. Neilson, D. C. Petriu, J. A. Rolia, A. Hubbard and R. B. Franks, "A Guide to Performance Modeling of Distributed Client-Server Software Systems with Layered Queueing Networks", Department of Systems and Computer Engineering, Carleton University, Ottawa, Canada, Nov 1995
  10. C. M. Woodside, "Tutorial Introduction to Layered Performance Modeling of Software Performance", http://www.sce.carleton.ca/rads/lqn/lqn-documentation/tutorialf.pdf, May 2002