![]() |
![]() |
![]() |
![]() |
![]() |
Consider the following problem: you need a data-stream analysis program—further on, we will discuss a variant in which the problem is a realtime data-analysis program. Incoming streams of data are to be converted into a high-level synopsis or overview of the situation being described by the data streams. For example: data streams generated by medical monitoring equipment in an intensive care unit describe the patient's blood pressure, heart rate, temperature and many other vital signs; what does it all mean? What is the patient's condition? Is he getting better or worse? Data streams generated by avionics equipment describe an airplane's current situation—is the current situation satisfactory, optimal? Are there problems that the pilot should be worrying about? Data streams describe the condition of a complex market (a stock exchange, for example)—where is the market going? And incidentally, is insider trading going on? Data streams describe current atmospheric conditions; what's the weather, and which way is it tending? There are many other examples. Software to solve this kind of data-analysis problem is likely to become extremely important in the near future.
Main themes:
Load balance again is a main issue, and motivates our transformation from a specialist- to an agenda-parallel strategy.
Special considerations:
Agenda-parallel programs may be simpler than specialist-parallel programs even where network-structured programs are involved. Using abstraction to transform a process graph into a set of node-states that can be updated by workers in parallel may allow us to localize all interconnections to a single process, instead of distributing them throughout the program.
A realtime constraint—the necessity of assuring that a computation can be completed within some fixed period—imposes special scheduling requirements. We may be forced to seek a static task schedule, instead of relying on dynamic tasking.
Our main focus in this chapter is a particular software architecture designed with the data-analysis problem in mind. A software architecture is a structural framework for software—a plan that shows the basic module structure of a program, and the strategy for connecting the modules together. The software architecture we'll use to solve the data-analysis problem is the process trellis.
The trellis architecture is based on a hierarchical graph of decision processes. Each process concentrates on one part of the problem. We can break most data-analysis problems into pieces in such a way that some pieces are "low-level", others "high-level" and others in between. Low-level pieces generally focus on a particular data stream; high-level pieces attempt to answer a general question about the situation. If we are monitoring patients in an intensive care unit, for example, a low-level process might concentrate on figuring out whether heart rate is stable or is trending higher or lower; a high-level process might concentrate on a question like "is the patient suffering from septic shock (or some other disease)—or moving closer to this condition?" Medium-level processes concentrate on the recognition of patterns that are relevant to several full-fledged diagnoses, without constituting a complete diagnosis in themselves.
The trellis architecture requires that we break our data-analysis problems into pieces of this sort; then we organize the pieces into a hierarchical graph, an "information flow" hierarchy that mirrors the conceptual hierarchy of the problem. Conceptually low-level processes are at the bottom of the graph, high-level processes at the top. We can connect the bottom level of processes directly to data sources in the outside world. Data flows into the program at the bottom, and percolates upwards.
All processes in the graph execute continuously and concurrently, and they communicate among themselves and with the outside world using a simple and uniform protocol. Each process can be regarded as driving a meter that displays, at all times, the current status of the sub-problem it's focused on—the latest input value in some data stream, the probability that a given diagnosis correctly describes the situation, and so on. The parallelism in this architecture is important for two reasons. First, it will make it possible to apply lots of computational power to the problem—in many cases, the outputs of our analysis will be useless if they aren't available quickly. Second, it will allow us to build a clean, modular program out of conceptually-independent pieces.
Clearly, we've described a specialist-parallel problem. We noted in chapter 3 that "specialist parallelism involves programs that are conceived in terms of a logical network. They arise when an algorithm or a system to be modeled is best understood as a network in which each node executes a relatively autonomous computation, and inter-node communication follows predictable paths. The network may reflect a physical model, or the logical structure of an algorithm...".
The network in our case clearly reflects logical as opposed to physical problem structure.
In the following section we discuss the trellis architecture in greater detail, and a natural specialist-parallel approach. We go on to discuss refinements in the interest of efficiency—raising, in the final section, the important question of realtime execution. Realtime is a highly specialized topic that we can't discuss in detail, but we introduce several important points. Again, the real topic isn't the trellis architecture; it's the highly significant type of problem that the trellis represents.
The experiments with the trellis we discuss below were conducted by Michael Factor. His thesis [Fac90] is the definitive guide to this work.
Each node in the trellis (where a node is simply a decision process) continuously attempts to calculate a state based upon the states of inferior nodes. When sufficient information is available from a node's inferiors—each decision process defines for itself what "sufficient information" means—it generates a new state. As a result, nodes one level up may recalculate their own states, and so on.
We supply two types of "logic probe" with the system, "write probes" and "read probes," in order to accomplish input and output. Write probes are used to pump values into the system; read probes suck results out.
Touching a node with a write probe allows us to set the state of that node to any value we choose. If we squeeze the write probe's trigger (every probe comes equipped with a trigger), we cause the node whose state has just been reset to recompute its state. Touching a node with a read probe causes the node's current state to be returned. This current state can then be displayed somehow on a terminal connected to the program. The node's current state might be unknown -- insufficient data. If so, we can squeeze the read probe's trigger; this causes a query to be sent to the node we are touching. When a node is queried, it attempts to compute a new, current state. If it fails, it identifies which data values it's missing—which of its inferiors, in other words, have failed to report values that are required by its state-computing function. It then sends queries downwards to each one of these inferior nodes.
The interface between the trellis and the human user is constructed using probes; probes can also be used to wire separate trellises together into a larger ensemble.
For concreteness, figure 8.1 shows a diagram representing the functions of each level of a trellis for patient monitoring in an intensive care unit. Figure 8.2 shows the complete structure of this program as it existed in Fall 1989.


Massive parallelism: One of the most interesting aspects of the trellis architecture is its inherent scalability. It's easy to conceive of applications for ten-thousand node trellises (and beyond); and it's fairly easy to conceive of building such programs.
In very large trellises, many nodes will be similar or identical in structure. This is true, for example, of a current trellis design that focused on a stock market in an attempt to detect insider trading; the design encompasses thousands of nodes, but many are identical in structure. A transportation network might have a large number of segments all of which must be monitored separately and continuously, but the logic required will be basically the same in all cases; hence we may have, say, a thousand separate segments, each involving an identical 10-node sub-trellis. The complete program, of course, would involve other differentiated nodes as well. Scientific data gathering (in astronomy or weather domains, for example) may involve hundreds of data sources, to most of which essentially the same set of trends-detection filtering nodes may be attached.
Of course, many trellis domains will require a large collection of separate and distinct nodes. Our medical collaborators estimate that the ICU trellis will incorporate several hundred nodes, most distinct in design, when it's finished; but this trellis focuses only on hemodynamic monitoring. A general-purpose trellis for monitoring intensive care patients in all areas might require thousands of distinct nodes. The trellis architecture and its development environment is designed with the future needs of very large programs specifically in mind.
This discussion omits some fairly significant details. But it's sufficient to allow us to study the programming methodology issues that the trellis raises.
The specialist-parallel implementation is natural and suggests itself immediately. Basically, our strategy will be to implement each node in the trellis by a separate process. If our trellis has 100 nodes, we execute 100 eval operations to create 100 separate processes.
In this scheme of things, each process has the following structure:
while (TRUE)
if ( any of my inferiors has transmitted a new state) {
recompute my own state;
if ( my state has changed)
transmit the new state upwards to my superiors;
}
else if ( any of my superiors has transmitted a query) {
try to recompute my state;
if (I succeed)
transmit the new state upwards to my superiors;
else
transmit the query downwards to each inferior
from whom I need more information;
}
This outline ignores several points. First, it doesn't deal with probes; they are handled in the same fashion as upward-filtering values (write probes) or downward-filtering queries (read probes). If a node is top- or bottom-level, a new state or query is (clearly) propagated no farther. If a higher-level node appeals for data to the bottom level, and the bottom level doesn't have any data, the buck stops there, so far as the trellis itself is concerned. If a read-probe is attached to this bottom-level node, though, the node's current status gets reported (via the probe) to a display. It becomes the user's duty to supply the necessary data (via write-probe).
There are two reasonable, closely-related strategies for implementing this scheme.
First, we might provide one input stream for each node: in this case, whether I have new data to pass upwards to node Q, or a query to pass downwards to Q, I attach a tuple to Q 's unique input stream. In this case, each process will work roughly as shown in figure 8.3.
Figure 8.3
Process outline for a specialist-parallel trellis
|
Each process's input stream is a multiple-source one-sink in-stream of the sort discussed in chapter 3. Each node maintains a head index locally; the tail index floats in a tuple. Accordingly, we implement
get the next tuple from my input stream
using a statement like
in("Q-stream", ++head, ?information);
information is probably a struct; one field of this struct indicates whether this tuple holds a state or a query. To attach a tuple to R's stream, we use a sequence like
in("R-stream tail", ?index);
out("R-stream tail", index+1);
out("R-stream", index, NewInformation);
The other strategy is (perhaps) slightly cleaner and more modular. We implement each trellis node by means of two processes; one handles upward-filtering data, the other downward-filtering queries. Node Q has two streams associated with it, a query stream and a data stream. Each of the two processes associated with Q reads one stream. Q's current state is stored in tuple space, where each of the two processes can access it. The query and data streams are each in-streams of exactly the sort described above.
To complete the package, we need some computational routines to install in our coordination framework. One way to understand the computation is in terms of a set of state-computing procedures, one for each trellis node. A state-computing routine could accept a single struct as an argument. The first component of the struct holds the previous state of this node; the next components specify the current state of the other nodes upon which this node's state depends. The state-computing routine attempts to compute a new state, and if it succeeds, it assigns the new state value to the first field of its argument struct, TRUE. If it can't compute a new state, because of insufficient data, it returns FALSE. In this case, it assigns to the last field of the struct a value indicating which inferior nodes are responsible for the state-computing failure—which nodes, in other words, must be queried for more data.
For concreteness, let's say that some node is intended to calculate the likelihood that a patient is about to get a headache: when headache-likelihood rises to the "highly probable" level, we can give the patient some aspirin. Let's say that headache-likelihood depends on the state of three inferior nodes: the fatigue node, the hungry node, and the arguing with flat-earth activists, unreconstructed Trotskyites or functional programmers node. The headache-likelihood node also depends on its own previous state: even if all irritants suddenly disappear, a headache doesn't dissipate instantly.
We implement headache-likelihood by a routine that accepts a single five-field struct. The routine looks at the input struct and attempts to compute the current headache-likelihood. Suppose that the currently arguing node's state is FALSE, but that we have no current values for fatigue and hungry (their slots in the argument struct are filled by a NO-VALUE token). In this case, the headache node might decide that it has insufficient data to compute a new state. It assigns NO-VALUE to its own current state, assigns headache and fatigue to the last ("values needed") field of the argument STRUCT, and returns FALSE. (Note, on the other hand, that if arguing were TRUE, and had been true for—say—the last ten minutes, the headache node might well have decided that its current state was "highly probable" or even "certain" despite the lack of current data from the fatigue and the hungry nodes.)
Once we've specified each node's state function, we can build the coordination envelope that wraps around it. Suppose we are using the one-process-per-node coordination framework. Each node f has two corresponding routines: f_state() and f_envelope(). f_state implements f 's decision logic: it produces a new state given the appropriate input values. f_envelope handles inter-node communication, and invokes f_state as appropriate. See figure 8.4.
Figure 8.4
|
To create the trellis, we simply execute an
eval("Node f", f_envelope());
for every trellis node.
The main performance problem with our specialist-parallel starting-point is, again, load balancing. If we're going to create one process for every node in a trellis, the result is (potentially) lots of processes. We might in principle execute the trellis on a parallel machine that has one processor for every trellis process, but that's unlikely. In most cases, it's also unnecessary and undesirable. In order to give adequate performance (to handle information fast enough to be useful), trellis programs will in general need to run on many processors simultaneously. But when each trellis node performs a relatively modest computation, it doesn't need a whole processor to itself; it can perfectly well share a processor with other trellis nodes. Thus we might have, say, a one-hundred node trellis (hence, let's say, a one-hundred process parallel program, in the specialist-parallel version) that executes too slowly on a single processor, but whose computational requirements can easily be met, in principle, by a five-processor parallel workstation.
Our problem in the latter case is again, in line with the previous chapter, to deploy one hundred processes onto five processors in a reasonable way. We can't simply dump twenty randomly-chosen processes onto each processor: unless every process does the same thing (recall that a trellis is specifically a heterogeneous process ensemble), some processors will wind up with a lot more work than other processors. As we've discussed, this kind of poor load balance is bad for performance. Performance is constrained by the speed of the slowest (i.e., the most heavily burdened) processor. And we've discussed the disadvantages in making assumptions about any automatic process-reshuffling capabilities that the runtime support system might have.
As an alternative, a software architecture like the trellis lends itself to some (fairly sophisticated) "static scheduling" techniques—techniques to assign work to processors in such a way that load is well-balanced at runtime. We discuss these techniques in the next section. Here we describe a simpler approach, which also lays the groundwork for the next variation.
We can use exactly the sort of abstraction transformation discussed in chapter 2 to transform our specialist- to an agenda-parallel solution, in the process solving our load-balance problem.
As discussed in chapter 2, abstraction requires that we "raise the processes one level in the conceptual scheme." In our case, each process will compute many trellis states on each iteration, rather than embodying a single trellis node. A node's current state will be represented by a state tuple. Hence a thousand-node trellis implies the existence of a thousand state-tuples in tuple space, not 1000 separate processes. The "agenda" is simple:
while (TRUE) {
accept and install any new input data values
or queries;
update every node's state;
}
Clearly, the second agenda item entails many tasks; we can throw as many workers as we want (up to a maximum of one per trellis node) into the updating process. In other words, we'll do the following. We execute an unbounded series of iterations. On each iteration, we accept input (data or queries) from the outside world; then we send a collection of identical worker processes sweeping over the trellis. Each worker grabs a node-describing tuple, updates the corresponding node's state and repeats, until no more nodes remain to be updated. Then we go on to the next iteration.
We've solved our load-balancing problem: tasks are distributed dynamically to worker processes on each sweep. We no longer face the problem of allocating processes to processors: each processor executes a single worker process. Note also that under our new scheme, all trellis interconnection information is localized in the master process. Rewiring the trellis—adding or deleting a node, or changing decision procedures and, consequently, the interconnection pattern—require changes to a single table in the master process only.
In the implementation of this scheme, our basic data structure will be a task bag of the type discussed several times previously. This time, though, we need to implement an iterative bag structure—once every task in a given bag has been processed, workers must proceed to the next bag (or, alternatively, the single bag must be refilled with the next iteration's tasks).
There are (as usual) many ways to implement this scheme. We'll describe one simple version. We have a master and n worker processes. The master maintains a data structure that describes the current state of every node in the trellis. At the start of each iteration, the master accepts new data values and queries. Some nodes will have to be updated on this sweep and others won't. Nodes whose inferiors have changed state, whose superiors have sent them a query or who have been touched by a triggered probe will need to updated; others won't. The master creates one task-describing tuple for each node that requires updating, and dumps the tuples into a bag.
The master executes the following loop:
while (TRUE) {
/* for the indefinite lifespan of this program */
accept input values or queries;
identify nodes that need to be updated this iteration;
for ( each node that needs updating) {
++TasksThisRound;
out("task", TaskDescription);
};{
for (; TasksThisRound--; )
in("result", ?NewStateValue);
record NewStateValue;
}
}
The worker's coordination framework is simple:
while (TRUE) {
in("task", ?TaskDescription);
compute a new state for this node;
out("result", NewStateValue);
}
The computational part of the program is essentially the same as what we described in the previous section. A task in this agenda-parallel version corresponds to the arrival of a new input-stream tuple in the specialist-parallel program. The only significant difference is that, in this new version, state- and query- tuples aren't attached to the input streams of the relevant nodes. Instead, the fact that a new state has been computed or that a query must be propagated is simply recorded in the NewStateValue that is returned to the master in the result tuple. The master is responsible for updating the trellis data structure and scheduling the relevant nodes for execution during the next sweep.
Note that we've achieved not only dynamic load balancing but, as usual in our agenda-parallel schemes, control over granularity. If the update computation at a given node is too minor to cover the overhead of tasking, the master process can group updates into clumps, dynamically if necessary.
Variants of the scheme are possible under which there is no master, merely a collection of identical workers. We discuss such a scheme in the exercises.
Now we come to the hard part. Suppose that we require not merely that a trellis perform "well," but that it perform in realtime? Specifically: input values arrive from the external world at a certain maximum frequency. How can we guarantee that each new set of values is fully processed before the next crop arrives? That each new set of values entering at the bottom has percolated throughout the trellis before the next set is pumped in?
"Realtime" is a word that lacks a canonical definition. We don't require our program to meet so-called "hard realtime" constraints—it needn't deliver reliably predictable performance on a time scale that is comparable to a processor's clock frequency. Hard realtime performance would require a realtime operating system and a realtime Linda implementation. The operating system would have to avoid all sources of unpredictable runtime at any scale (virtual memory, for example). The Linda system would be required to report detailed expected-performance estimates for all Linda operations, based on the runtime data structures chosen by the optimizer for each tuple class. Having provided such a foundation, we could then implement a hard realtime system using the techniques we'll describe below. But the system we'll describe does meet the relevant realtime constraints for its domain without this added complexity. In the medical monitoring domain, we need to be able to guarantee performance on a time scale of seconds, not of microseconds. This kind of constraint is typical of many problems in the software-monitor domain.
Our agenda-parallel trellis implementation allows us to make certain statements that are germane to the realtime question. We know that, in the worst case, assimilating a completely new set of data values will require that every node in the trellis be updated. These updates will be distributed over a series of iterations—in the first iteration the bottom-level of nodes is updated, then the next level and so on. We know, too, how long it takes to update each node. The state-computing functions associated with each node are fixed, of course; we can simply time arbitrarily many executions of each node-update step and choose worst-case values. (Obviously, we must be able to guarantee that each state function behaves in a fairly predictable way and that it always converges. Our goal in general is the following: given a collection of well-behaved, predictable state functions, build a parallel program that is also well-behaved and predictable.)
Because we know the worst-case amount of work that is required in order to process a new set of input values, we can predict approximately how long a complete update of the trellis will require, given some number of workers among whom the tasks are being shared.
Unfortunately, though, the dynamic nature of task-assignment in our agenda-parallel program introduces some uncertainty into the timing: we don't know exactly which tasks each worker will grab, and so we can't predict exactly how long each iteration will take. Dynamic task-assignment introduces a measure of overhead as well as some uncertainty about timings. Dumping task tuples into tuple space, and removing them, takes time. If each worker knew a priori exactly what tasks it would be required to perform on each iteration, we wouldn't need to distribute task tuples.
Ordinarily we're happy to pay the costs of such a distribution, because arriving at a good static distribution of tasks would be difficult or impossible, and the costs of dynamic tasking are often small. But the realtime trellis case is different. The fixed structure of the trellis, and the predictable nature of trellis execution, make it possible to arrive at a good static schedule. Furthermore, when we are dealing with realtime requirements, predictability is crucial.
8.6.1 A realtime scheduling algorithm
The actual strategy followed in achieving realtime execution of the trellis is complicated. What is important here is the general flavor of the approach.
First, we need to define exactly what we're trying to achieve. If no process generates a new state without receiving new information (either from an inferior or a write probe), any trellis program upon which no probes are executed for a sufficiently long period of time eventually stabilizes—that is, reaches a state in which there are no enabled or currently executing processes (assuming all processes terminate whenever they execute). Under the iterative execution scheme, any appropriately constrained trellis program stabilizes in no more than 2Η + 1 sweeps, where H is the height of the trellis (that is, the length of the longest directed path). For H + 1 sweeps, states can flow upward to the top of the trellis; for the next H sweeps, queries can flow downward to the bottom of the trellis. The stability property gives the trellis a predictable, well-defined, worst-case behavior.
To make a real-time guarantee, we must know how frequently new data values can arrive; we assume that the minimum interval that can pass between the entry of a new set of data values is τ, the trellis's period. Every τ seconds, one data value or many values simultaneously may be entered. We can now state our realtime execution requirement: the program must run on enough processors so that it is guaranteed to be able to analyze a set of inputs completely within its period, τ . What does it mean to "analyze a set of inputs completely?" One obvious definition is that the program must stabilize within τ . Hence, we require that our realtime execution scheme be able to complete 2H + 1 sweeps within τ seconds. Our goal is to find the smallest number of processors that can meet this requirement, and to discover how they can meet it—how we can map trellis nodes to processors in such a way that the trellis will stabilize within τ seconds.
Our general approach will be as follows. Starting with the set of all trellis nodes, we distribute node-descriptions onto processors. During a sweep, each processor will update all trellis nodes that it holds locally; then it will send information about the results of the sweep to other processors, in preparation for the next sweep.
Factor's heuristic scheduler [Fac90] is a technique for placing node-descriptions onto processors in a way that's likely to use a minimal number of processors in meeting the goal—namely, trellis stabilization within τ seconds. ("Likely" because this is a scheduling heuristic; the scheduling problem can be shown to be NP-complete, thus not amenable to a provably optimal scheduling algorithm. As a trellis grows in size, such an algorithm would tend to run so slowly as to be useless.)
In order to perform this static assignment of trellis nodes to processors, we need an analytic model that allows us to predict how long it takes to execute a trellis until it stabilizes. We need such a model because our general procedure will be to assign trellis nodes incrementally to processors, evaluating at each stage how much time would be required to achieve stabilization given the configuration that exists so far. The analytic model will be expressed in terms of a collection of well-defined costs, most significantly the cost required to read a tuple from tuple space, to update a tuple (by removing and then re-inserting it), and to execute each node's update routine. We can then express the time required for a trellis to stabilize in terms of a combination of these costs, given a particular number of processors and some assignment of trellis nodes to processors.
Once we have a way of evaluating execution time given an assignment of trellis nodes to processors, we can execute the heuristic scheduling algorithm. The algorithm starts by assuming some fixed number of processors; then it tries to map trellis nodes onto those processors in such a way that the timing goal is achieved. If it fails, number-of-processors is incremented, and the procedure repeats.
Figure 8.5 gives some speedup results for several trellis programs running under the scheduler we've described.

The fact that we can design and assemble diverse, complex and potentially enormous parallel applications is reassuring. The fact that (having done so) we can make accurate predictions about their performance is remarkable. Parallel programming is a workable, practical, manageable reality.
Among problems that lend themselves to network-shaped solutions, circuit simulation is one of the most interesting. It requires that we step a circuit through a series of iterations as we trace signals and record the circuit's behavior. Such problems tend to be highly compute-intensive and highly parallelizable. Neural network emulators are another example. Many graph problems lend themselves to programming in this style; we discuss one simple example in the exercises.
Preliminary. The first three questions ask you to build trellis programs of various kinds. First, though, we need to address the obvious preliminary question: a trellis program that does what? To build a complete trellis, you need a hierarchical understanding of some domain, and a set of decision procedures in which this understanding is captured precisely. (In the experiments we've conducted, we built the coordination framework but other people supplied the domain expertise.)
These questions don't assume a fixed size trellis, but you should assemble roughly 50 decision procedures at least (arranged, perhaps, in 5 levels) in order to attack them seriously. There are three reasonable ways to get these decision procedures; we'll list them in order, from less to more interesting.
First, you could use synthetic decision procedures that do nothing: they read inferior-node states (which you can assume are arbitrary integers), spin in a loop for some number of iterations to kill time, and then generate their own new, arbitrary "state." (This kind of synthetic trellis is actually very useful for performance experiments, because it can be parameterized. You can test many different sorts of trellises by varying the height-vs.-width of the trellis, the average number of inferior states each state depends on, and the average granularity of the decision procedures.)
Second, you can make up your own set of decision procedures. Choose an arbitrary domain (or one you find interesting), and improvise something. (We carried out initial trellis experiments in this way, targeted at automobile traffic networks. The point was to infer the location of traffic blockages, and to reprogram traffic lights and highway-entrance-control lights dynamically for better throughput.)
Third, you can find a real domain expert. Most universities employ a fair number of such people. Rumor has it that domain expertise can sometimes be found in the non-academic world as well. You might talk to medical people, to experts on finance, markets or economics generally, to people with knowledge of factories, or of multi-factory manufacturing structures, to networks people (at telephone companies, for example), to experts on any kind of transportation system, to scientists (particularly physicists) with large experimental setups to monitor, and so on. The potential applicability of powerful expert monitors is nearly unlimited.
1. Build a specialist-parallel trellis of the sort described in section 2; then build an agenda-parallel version of the master-worker variety discussed in section 3. Compare the expressivity and efficiency of the two versions.
Suppose that your trellis problem were non-compute intensive: the computing power of a single processor is ample to run it effectively. Is the trellis nonetheless a good software architecture for the monitoring problem? Which implementation strategy is better in this case—specialist or agenda?
2. The master process in the agenda-parallel trellis you've just built is a potential bottleneck for large trellises. Build a new version that supports multiple master processes (run your version with three masters, let's say, for concreteness). Clearly, workers should be able to grab tasks generated by any master. There are (at least) two possible ways to handle the result tuples. They might be sent not to any master but to the right one; masters under this scheme may need to exchange some information, so that they are aware of state information describing trellis nodes that are adjacent to any nodes they own. Or, every master might read each result tuple, updating its own part of the trellis graph as appropriate. If you use this second scheme, you should implement a "garbage collected" read stream—result tuples are removed from the read stream once all master processes have seen them.
3. Another way to deal with the master-as-bottleneck problem is to eliminate the master altogether. Modify your trellis program to make all processes worker processes.
4. Graph algorithms of various kinds can be programmed using the techniques discussed in this chapter. Developing a cost-ordered list of all paths between two nodes in a graph is a simple example. Write such a program using the "bug trudger" algorithm: create a bug and place it on the origin node. Arrange for the bug to trudge outwards through the graph, generating new bugs whenever more than one path is available. Each bug remembers the path it's taken so far. The destination node welcomes each incoming bug, squashes it, retrieves the information it contains and inserts this data in a cost-ordered list. You must make sure, of course, that bugs don't wander in circles.
Program this algorithm either in specialist or in agenda style.
° (Turingware:) 5. Human organizations are generally arranged in tree-form: each person has a unique boss and (potentially) many boss-ees. What are the implications of arranging a human organization as a "person trellis"?—each individual has a place in a trellis graph; this trellis represents the information-flow hierarchy within the organization. Hence, each individual has, potentially, many semi-bosses and semi-boss-ees. Are there organizations for which such a strategy might be appropriate? What is the role of read and write probes in such a structure? Would a (software) process trellis be a useful inter-person communication device in such an environment? Are there any uses for a mixed process-person trellis, in which some trellis nodes are processes and others are people? (To put these questions in another way: would it be possible to make all interactions with such an organization systematic, all mediated by the same software support tools? Might inquiries from outside the organization be "attached" automatically to the right trellis node? Might many sub-organizations be coordinated systematically, using probes, into a single multi-trellis? Might directors or accountants or investors monitor such an organization by attaching probes as needed? Might any employee get the opportunity to take in the "big picture" by attaching his own probes?)
![]() |
![]() |
![]() |
![]() |
![]() |