The Simple Times (tm) is an openly-available publication devoted to the promotion of the Simple Network Management Protocol. In each issue, The Simple Times presents technical articles and featured columns, along with a standards summary and a list of Internet resources. In addition, some issues contain summaries of recent publications and upcoming events.
The Simple Times is openly-available. You are free to copy, distribute, or cite its contents; however, any use must credit both the contributor and The Simple Times. (Note that any trademarks appearing herein are the property of their respective owners.) Further, this publication is distributed on an ``as is'' basis, without warranty. Neither the publisher nor any contributor shall have any liability to any person or entity with respect to any liability, loss, or damage caused or alleged to be caused, directly or indirectly, by the information contained in The Simple Times.
The Simple Times is available via both electronic mail and hard copy. For information on subscriptions, see the end of this issue.
Nonetheless, the need for a standard base from which to respond to the growing demand for more flexible and dynamic management of multi-component systems -- as opposed to single devices -- continued to rise in importance. It became common, for example, to see computers of the server variety as a collection of manageable components, including interfaces (e.g., MIB-II), operating system and associated resources (Host Resources MIB), peripheral devices (e.g., Printer MIB, Modem MIB, RAID MIB), and key software applications (e.g., RDBMS MIB, MADMAN MIB). And users want -- indeed, expect -- to be able to manage this collection as a single system. Further, the population of possible managed components continues to grow rapidly (e.g., the emerging Applications MIB will open new floodgates). What were once relatively simple devices (from the SNMP perspective, at least) are now complex collections of manageable components. Consider, as but a single example, any modern product which integrates bridge, hub, router, and switch functionality, which also supports a variety of interfaces and protocols (e.g., serial, 10Mb/100Mb Ethernet, FDDI, Token Ring, ISDN, and ATM). The surge in demand to manage the desktop as part of the enterprise and, eventually the requirement to manage a myriad of home appliances and personal accessories as integrated components of an individual's ``techno-life support system'' both serve to quash any semblance of doubt with respect to the need for and value of a standards-based solution to the problem.
The keys to success in meeting this burgeoning demand mirror those of the undeniable success of the first generation of SNMP itself: deploy low-cost agents on all the components as quickly as possible; ensure that those agents do not interfere with the components' principal functions; and give those agents a standard way to interoperate with higher-order software elements.
The IESG chartered the AgentX working group accordingly. Its goal is to define standards-track technology for SNMP agent extensibility. The resulting technology specification(s) must allow independently developed subagents to communicate with a master agent running on an Internet device. (This last sentence merits re-reading.)
The charter stipulates that the AgentX technology specification(s) will consist of at least one to as many as three parts:
Furthermore, the working group was directed to use good engineering judgment in developing an approach with the smallest reasonable footprint to achieve intra-agent communication. As a consequence of that, the working group may choose to avoid complete transparency, if, at its discretion, this proves to be the more effective approach. In that case, however, the working group is obliged to document its decision criteria for this engineering trade-off.
Finally, the IESG stipulated that:
``Although the working group will solicit existing specifications and experience in this area, it will produce a vendor-neutral technology specification.''
Given that directive, and in light of the sizable amount of time spent previously on analysis of the problems, issues, and requirements, the chair sought to orient the working group toward an initial review of the solution-space (the set of published extensible agent protocols and product specifications that were, individually, meeting users' needs in some fashion). The first article in this special issue, then, is by Dale Francisco of StrataCom, who also serves as the AgentX Editor. This article will give you a brief introduction to the major primary sources and pointers to them so that you can follow up in more depth.
The attempt to focus the working group's attention on this solution-space was not intended to avoid having to deal with the problems, issues, and requirements; however, it was a recognition that the community has collectively spent a lot of time and effort on almost all of those issues. At some point in such a process, you have to be able to weigh the arguments and then make some hard choices. Many people have helped us identify the problem areas over the years and quite a few have helped to formulate a decision and build a consensus around it. However, no one has done more to help us all get to the crux of almost every issue than Randy Presuhn of BMC/Peer. His article is more than just a masterful summary of those past contributions. It is also a new tool for the working group to use in achieving closure on some more of these issues. I am sure that you will have a good appreciation for both the complexity of this problem-space and for the possibilities for effective and efficient solutions after reading Randy's article.
One of the technologies in the AgentX solution-space is Digital's eSNMP, which can be seen as one possible evolution of DPIv2, which is itself another element of the solution-space. In his article, Mike Daniele of Digital Equipment Corporation, implementer of eSNMP, provides a detailed overview of the protocol itself and explains some of its differences from DPIv2 and its possible relevance to the eventual AgentX protocol. Those readers who would like a concrete idea of what an extensible agent protocol looks like and how it operates will get that from Mike's article. And I should observe that Mike is teamed with Bert Wijnen of IBM as ``point men'' on AgentX protocol design. Their job in that role is to take the consensus positions as they begin to emerge via the working group deliberations and turn them into AgentX protocol specifications via appropriate modifications to the existing DPIv2 specifications. This is how we are mapping the problem-space resolutions back on to the solution-space foundation.
Finally, Dave Bridgham of Epilogue presents an uncommonly readable and persuasive analysis of the proxy approach to agent extensibility. Subtly, but clearly, Dave hits on the major design difference between the proxy model and the extensible agent model being pursued in the AgentX effort -- transparency to the management applications. The goal, in that respect, for the extensible agent model, is to allow the management applications to remain unaware of the fact that the apparently monolithic native agent that is responding to their requests at the well-known service location (such as UDP port 161) is actually an agent system of sorts, consisting of a master agent and potentially many subagents, each representing a particular managed component. Dave's article in this issue outlines the cost/benefit outcome of the information hiding in the agent. The proponents of the extensible agent model and the proxy model have plugged different factors and formulas into that calculation and, not surprisingly, each model will produce a positive cost/benefit ratio in certain scenarios. Hence, both technologies have a role to play in the marketplace. As time goes on, as we all gain more experience, as the general technology base continues to improve, and as end-user sophistication continues to increase, it may well be that there will be a pronounced convergence of the two models into a single solution.
This problem is harder than it may first appear. One of the things that makes it hard is that users of network management systems stubbornly persist in viewing the devices on the network -- routers, telco switches, workstations, and so on -- as unitary entities that, in principle, can be meaningfully reduced to green, yellow, or red icons on a topology map. Builders of routers, telco switches, and workstations are equally adamant in viewing these devices as small ecosystems (or perhaps bioregions) of sometimes cooperating, sometimes antagonistic, software and hardware entities.
The designer of an SNMP agent is somewhere in the lonely middle between these two camps, aware of the reasonableness and even the necessity of both points of view. Viewing the agent as a provider of management information, not only does it make conceptual sense for one agent to provide all the information for one device, it seems to be enshrined in SNMP itself, for it is written that the agent shalt listen to the manager on UDP port 161, of which there is exactly one per managed device. But viewing the agent as a gatherer of management information, the designer is forced to face the messy world of agent instrumentation. In a workstation, the knowledge of management information is quite likely distributed among many user processes; in a telco switch, it may reside in different modules with different processors and different operating systems. This real-world messiness would seem to be best handled by lots of little agents rather than one big one.
When confronted with such a dilemma, SNMP agent designers, having read the same books and gone to the same schools, soon think ``divide and conquer'', and not long after that they think of interposing a new layer or protocol. And as is often the case with ideas whose time has come, different people come to similar designs independently. Two of the earliest such designs for extensible agents were DPI and SMUX.
Let's pause here to define some terms. An extensible SNMP agent is one which allows the addition, at run-time, of support for arbitrary MIB modules that were not known at the time the agent was built. This is quite different from the first generation of SNMP agents, known as monolithic agents. An extensible agent is made up of what is known as a master agent (SNMP agent in SMUX), a single entity that is responsible for receiving management requests and sending management responses; and zero or more subagents that are responsible for instrumenting various parts of a MIB. A subagent takes responsibility for a part of the device's MIB by registering with the master agent. The division of labor between the master agent and its subagents allows the extensible agent to appear to be a single entity to the management station; in fact, from the outside, it is indistinguishable from a monolithic agent. But internally, the extensible agent has the flexibility to adapt to changed circumstances by registering new subagents that are responsible for new functionality.
Despite its simplicity and conceptual integrity, SMUX had its problems. Notable among these were:
Like SMUX, DPI separated the agent functionality into an SNMP agent and one or more subagents. Again as with SMUX, the SNMP agent was responsible for the interface with the management application, while the subagents were responsible for different subtrees of managed objects. An important difference, and one that has appeared in many extensible agent implementations since, was that DPI, instead of using SNMP PDUs between the SNMP agent and the subagent, specified its own lightweight protocol for agent-subagent communication. (In addition, the original version of the DPI specification included an example subagent API.) In 1994, three years after the publication of DPIv1, based on their considerable implementation experience, IBM researchers offered a much more detailed version of the DPI specification, DPIv2, in RFC 1592.
The original version of DPI, though it offered improved performance relative to SMUX, suffered from some of the same problems as SMUX with regard to sysUpTime semantics and hung subagents. And perhaps even more than with SMUX, which included a two-phase commit, there were problems satisfying the SNMP requirement that set requests appear to take effect ``as if simultaneously'' when the processing of set requests was distributed across multiple subagents.
In parallel, as it became clear that there was a large market for extensible agents, other commercial products such as Envoy (Epilogue Technologies) and EMANATE (SNMP Research) appeared on the scene. Typically, the advantages that commercial products offered over some of their publicly-specified predecessors were increased performance (with versions optimized for particular hardware and particular operating systems) and greater flexibility; for instance, increased options for subagents to register the parts of a MIB module, even down to the instance level, for which they were responsible.
Though there's no question that commercial extensible agents have improved the state-of-the-art, it's clear that a standard for interoperability is needed if SNMP agent extensibility is to achieve its true promise. At present, the dream of having a single network device that comprises modules from different vendors, each with one or more subagents cooperating with a device-wide master agent, remains unattainable unless all the vendors happen to be using the same extensible agent product.
What level of QoS (quality of service) is required to carry the subagent protocol? Are specific transports needed?
Subagent protocols can be used both within and between systems. This leads to the requirement that at least one standard transport mechanism be defined. This does not preclude vendors from employing additional transports optimized for specific environments. For example, UNIX domain sockets or IPC mailboxes might be attractive on some platforms. It is generally agreed that vendors should be free to support such transports in addition to any standardized ones. In pre-AgentX discussions, there seemed to be strong resistance to defining a specific transport. Within the AgentX effort, there appears to be general recognition of the value of defining a standard transport, even if vendors will define additional ones of their own.
The minimal quality of service required by most subagent protocols appears to be an 8-bit clean, connection-oriented byte-stream. (There is also one proposal that is based on intra-system UDP, assuming a very low probability of packet loss or duplication.) TCP/IP is recognized as a widely-available transport meeting these requirements. For a subagent protocol to be carried over something not meeting these requirements, such as an unreliable datagram transport or noisy byte-stream, a convergence layer would be needed.
The strongest argument for using a connectionless transport like UDP with a convergence layer is given in RFC 1592: per-process limitations on the number of open file descriptors in some environments. By separating the design of a convergence layer from that of the subagent protocol, we avoid having to add the complexity of data integrity and retransmission mechanisms to the subagent protocol.
A very special case of the transport issue is the possible choice of DLL as a communication mechanism. In this case, however, the level of interoperability is really at the API level, and will not meet inter-system requirements. It is possible to define an API that makes the choice of communication mechanism transparent to the subagent developer.
An issue that straddles the border between transport and association requirements is the handling of confidentiality and authentication. In all the proposals to date, any confidentiality has to be provided by the underlying transport, with no standardized transport mapping to support confidentiality defined. In the existing proposals that provide some level of authentication, weak authentication is built into the agent-subagent protocol. The requirements for authentication and confidentiality between the master agent and its subagents have not received much discussion.
What information is needed to establish, maintain, and terminate an association between master agent and subagent?
Five major sub-areas of association requirements are:
The association establishment phases of the published subagent protocols provide for the identification of the subagent using a trivial form of authentication. This minimal identification is required in order to maintain the sysORTable and various subagent MIB modules. Whether stronger authentication is needed (in either or both directions) remains an open issue. Whether strong authentication should be left to the underlying transport or included as part of the subagent protocol is also an open issue.
For most subagent protocols, association establishment is a simple two-way handshake. This allows limited negotiation of parameters for the association and transfer of useful bits of information. The parameters that have generated the most interest are limits on the number of varbinds per PDU, timeouts used to detect lockup, and identification of the naming scope. The additional bits of information that have been considered include the time base used to compute sysUpTime.
The significance of the requirement for negotiation of the number of varbinds per PDU is currently being debated. The arguments for are primarily in terms of providing some level of compatibility with existing DPI instrumentation; the arguments against are in terms of efficiency and protocol complexity.
For the remaining details of association establishment, the recurring issue is whether specific parameters should be considered properties of a particular registration or of the association as a whole, or whether the association level parameters would serve as defaults for registrations. Since there is a strong requirement to support multiple naming contexts over a single association, and since the timeout characteristics of naming scopes may differ, it looks like the design will be left with a choice between treating these as part of the registration dialog or handling them in both places. A subtle point is that there must be a sysUpTime for each naming context, and that these don't necessarily all have the same value.
Association maintenance can take several forms. Depending on the characteristics of the underlying transport, some form of keep-alive may be useful to detect subagent lockup. In fault-tolerant configurations, it may also be desirable for subagents to detect loss of connectivity to the master agent. Previous discussions of the impact of transport outages led to the conclusion that there was no requirement for an association to be maintained across successive transport connections. In environments with highly unreliable transports, a convergence layer providing session maintenance for the association could be used.
It may be convenient to handle sysUpTime maintenance (e.g., notification of discontinuities) as part of a keep-alive mechanism, but, since sysUpTime discontinuities may be asynchronous to subagent operations, the protocol elements to support sysUpTime maintenance need to be decoupled from management-initiated protocol operations.
Most subagent protocols have some provision for an orderly association termination procedure. Only two issues have surfaced here: whether the extensive status codes provided by SMUX and DPIv2 have actual value, and whether termination of a subagent association should affect sysUpTime. The emerging consensus seems to be that sysORTable is the place to record the comings and goings, and that sysUpTime should be unaffected.
All SNMP and SNMPv2 operations must be supported (at least in the agent role), and the behavior must, from a protocol perspective, be indistinguishable from a monolithic agent's behavior. That the boundaries between subagents should not be visible at this level is taken as a fundamental requirement.
These requirements raise a number of issues:
Discussion of the requirements for multi-phase commit protocols to support set semantics has to begin with the recognition, formalized in the error codes of SNMPv2, that there are cases where the ``as if simultaneously'' requirement simply cannot be met, even in monolithic implementations. Consider, for example, Keith McCloghrie's discussion of this topic in Volume 2, Number 6 of The Simple Times.
The changes required to existing protocols to support the SNMPv2 error codes are relatively minor: adding a response to the second phase of the SMUX commit, or a transaction end on the DPIv2 commit. However, additional phases help in handling important, if seemingly pathological, cases.
The protocol needs to take into account the various resource reservation and release strategies that are possible, and must not assume that all subagents have been implemented using the same allocation discipline, since that discipline may be inalterably embedded in the design of the system of which the subagent is a minor component.
The most difficult case to handle is where the acceptability of a proposed value for a variable is dependent on a variable supported by some other subagent, which is to be modified in the same request. To a certain extent, one might argue that this is poor implementation strategy, poor MIB design, or both.
In general, rollback may not be possible for sets with action semantics. Although adding additional phases can bound the problem, the consensus seems to be that these cases are better handled as MIB design issues.
The requirements appear to boil down to a four-phase procedure:
As an optimization of the high frequency case, where all the varbinds in a request will be handled by a single subagent, it has been suggested that the protocol should employ a single exchange, rather than a multi-phase transaction.
A final nasty aspect of set processing, which is really a general SNMP issue, is whether it is permissible to concurrently process set operations for different naming scopes. It is fairly clear that concurrent processing of set operations within a single naming scope would be risky, since there is no reasonable way to predict the side-effects of an operation. Whether an operation (like reset) must be assumed to potentially affect multiple naming scopes requires additional discussion.
Another area with operational implications is the handling of access control. The current consensus is that all access control handling is the responsibility of the master agent. It should be noted that some MIB modules, such as the RMON alarm group, may require knowledge of a system's access control policy.
The final area of operational issues has been the handling of informs. Part of the problem is coming to an understanding of what informs are. Even this has been subject to considerable debate within the SNMP community. There may be a requirement for subagents to subscribe to, and to issue, informs.
How much power and flexibility are needed for subagents to identify their area(s) of responsibility?
The registration process (and, as a result, the details of operation dispatch) has been the subject of extensive discussion. The following requirements have emerged, based on the capabilities of existing solutions:
Additional registration issues include:
Discussions of handling registration priority led to the conclusion that registration priority is needed for handling redundant and fault-tolerant configurations, that the complexity is equivalent to that of handling registration-time based precedence, and that the notion is needed within an implementation to handle collisions anyway. Registration overlap occurs when an OID is a subtree of two registrations at the same priority. The most useful way to handle this case is to treat the longer registration as having the better priority. This conclusion is the result of implementation experience with protocols using different resolution strategies. The deciding case is where one subagent is responsible for handling requests for the creation of arbitrary new rows, and the new rows, once created, will be the responsibility of separate subagents. (For example, consider application processes forked off as a result of create operations -- if the shorter subtree registration took priority, the table entries for the forked processes would not be manageable.)
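The resolution rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the dispatcher of any existing subagent protocol: the `Registration` type, the `pick_subagent` function, the subagent names, the example enterprise OID, and the convention that a numerically lower priority value is the better one are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

OID = Tuple[int, ...]


@dataclass(frozen=True)
class Registration:
    subtree: OID      # root of the registered subtree
    priority: int     # assumption: a lower number means a better priority
    subagent: str     # hypothetical subagent identifier


def covers(subtree: OID, oid: OID) -> bool:
    """True when oid lies within (or is the root of) the registered subtree."""
    return oid[:len(subtree)] == subtree


def pick_subagent(regs: Sequence[Registration], oid: OID) -> Optional[str]:
    """Choose the best priority first; among equal priorities, the longer subtree wins."""
    candidates = [r for r in regs if covers(r.subtree, oid)]
    if not candidates:
        return None
    best = min(candidates, key=lambda r: (r.priority, -len(r.subtree)))
    return best.subagent


# Hypothetical table whose rows are created by one subagent ("spawner")
# and then handed over to per-row subagents.
table = (1, 3, 6, 1, 4, 1, 99999, 1)
regs = [
    Registration(table, 10, "spawner"),
    Registration(table + (7,), 10, "proc-7"),
]
print(pick_subagent(regs, table + (7, 1)))  # the row owned by "proc-7"
print(pick_subagent(regs, table + (8, 1)))  # not yet created: goes to "spawner"
```

With the shorter registration preferred instead, every request under the table would go to the spawning subagent, which is the failure described in the parenthetical example above.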
What support for information access between subagents is needed (e.g., can one subagent search another's tables)?
The requirements for inter-subagent communications include:
The requirement for index reservation and coordination has found vocal and convincing support. The problem, simply stated, is that when different subagents implement different rows of a table, there is a need for a coherent index reservation policy. For some indexes, this policy is inherent in the index semantics, such as the use of a process ID. For others, such as an ifIndex, more sophisticated infrastructure is needed to ensure consistency of index values and references.
A key requirement is that the protocol state machine for index reservation and query must be able to function independently of the state machine for processing set and get operations, since row creation can happen due to local action as well as due to management request.
Index reservation is distinct from the registration protocol for several reasons. The most important one is that the lifetime of a reservation may be far greater than that of a registration. For example, the reservation may be a result of system configuration or provisioning, determined long before a specific subagent is activated. Another reason is that the same reservation may be of interest to a number of entities, as in the case of ifIndex.
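To make the separation concrete, here is a minimal sketch of an index reservation store that knows nothing about registrations or connected subagents. It is an assumption-laden illustration rather than a proposed design: the `IndexReservations` class, its method names, the holder names, and the "lowest free integer" allocation policy are all hypothetical, and only integer indexes are handled.

```python
from typing import Dict, FrozenSet, Set, Tuple


class IndexReservations:
    """Tracks reserved index values per table, independently of registrations."""

    def __init__(self) -> None:
        # (table name, index value) -> parties interested in the reservation
        self._reserved: Dict[Tuple[str, int], Set[str]] = {}

    def reserve(self, table: str, index: int, holder: str) -> None:
        """Record that holder has an interest in this index value."""
        self._reserved.setdefault((table, index), set()).add(holder)

    def allocate(self, table: str, holder: str) -> int:
        """Reserve and return the lowest unused positive index for the table."""
        used = {i for (t, i) in self._reserved if t == table}
        index = 1
        while index in used:
            index += 1
        self.reserve(table, index, holder)
        return index

    def holders(self, table: str, index: int) -> FrozenSet[str]:
        """Return everyone with an interest in the given reservation."""
        return frozenset(self._reserved.get((table, index), ()))


reservations = IndexReservations()
# A reservation made at provisioning time, before any subagent exists.
reservations.reserve("ifTable", 1, "provisioning")
# A subagent that starts later is given the next free value.
new_index = reservations.allocate("ifTable", "subagent-a")
# A second party can share an existing reservation.
reservations.reserve("ifTable", 1, "subagent-b")
```

Because the store is not keyed by association, a subagent can disconnect and reconnect without its index values being handed to someone else in the meantime.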
There appears to be general agreement that the full range of index syntaxes should be supported by the solution, although merely handling ifIndex alone would have significant value.
Subagent access to another subagent's MIB variables is not fully settled as a requirement. If this is accepted as a requirement, it will be necessary to define the access protocol so as to avoid deadlock situations with respect to SNMP-initiated operations. A special case of this is support for MIB modules like the RMON alarm group, which also require knowledge of access control information.
What information should appear in a subagent MIB to meet the requirements of managers that need to know the internal structure of the managed system?
Although a manager will generally not be interested in what specific constellation of subagents is used to instrument a system, there are cases, especially in debugging scenarios, where a manager may need to find out which subagent is responsible for what. MIB modules have been defined for existing protocols. They are remarkably similar; most of the discussion, product deployment, and research experience leads to the conclusion that these MIB modules are probably overkill, recording more information than is actually useful. Trimming the excess from these MIB modules remains to be done.
What abstract syntax notation should be used for the protocol definition, and which set of encoding rules should be used?
Two major debates have occurred in the area of specification requirements. The first is the choice of protocol definition language. Some subagent protocols have been defined using ASN.1; others have used ad hoc notations. The AgentX effort is, at this time, basing its work on a document which uses an ad hoc notation. The issues are ones of clarity and rigor. The second debate, quite distinct from the choice of specification language, is the choice of transfer syntax. The choices are between the formally defined encoding rules, such as BER, PER, DER, or XDR, and various ad hoc schemes. The AgentX effort is, at this time, basing its work on a document which uses an ad hoc encoding scheme. At issue are clarity, implementation complexity, and verifying the correctness of implementations, as well as performance.
The difficult part of the AgentX working group's task will be to reach agreement where there's no clear leader among the technical alternatives, or where the group is not able to agree on metrics for evaluating alternatives. Here we must stick to principles of clarity and simplicity, tempered by practical experience and sensitivity to our customers' needs.
The following diagram illustrates the major components of the framework:
[Figure: the eSNMP framework. The master agent box contains a registry (drawn as a tree of nodes), a dispatcher and a registrar, an SNMP engine over UDP, and an eSNMP/DPI engine over AF_UNIX. Each subagent box contains blocks labeled Method Routines, API client, Object Type Table, and API routines, and an eSNMP/DPI engine over AF_UNIX. SNMP requests arrive at, and SNMP responses leave, the master agent. Between the master agent and the subagents, the {Control} exchanges are OPEN, REG, UNREG, TRAP, and PING, with RESP and CLOSE; the {Data} exchanges are GET, NEXT, BULK, SET, COMMIT, UNDO, and CLEANUP, with RESP.]
eSNMP/DPI runs over UNIX domain sockets, so both sides need to support this transport and know where to rendezvous.
The master agent's dispatcher and registrar are the algorithms by which it accepts registrations, associates requested MIB variables with a particular subagent, and communicates with those subagents. The object-type table in the subagent is emitted by a MIB compiler back-end tool (we used mosy and snmpi from ISODE, with some modifications).
SNMP is transmitted between the master agent and the management application only. Communication between the master agent and subagents is via eSNMP/DPI only. DPI is not encoded using the BER, and we kept the same general encoding and packet formats. As a result, both sides contain an engine for eSNMP/DPI handling. Each packet has some explicit (header) format and some variable length fields. The variable length fields (including OIDs) are encoded as null-terminated ASCII strings.
Each eSNMP/DPI packet starts with a fixed header:
.--------------------------------------------------------.
| Layout of eSNMP packet header. Present in all packets  |
+--------+------+----------------------------------------+
| OFFSET | SIZE | FIELD                                  |
+--------+------+----------------------------------------+
| 0      | 2    | packet length                          |
+--------+------+----------------------------------------+
| 2      | 2    | packet ID                              |
+--------+------+----------------------------------------+
| 4      | 1    | protocol major version                 |
+--------+------+----------------------------------------+
| 5      | 1    | protocol minor version                 |
+--------+------+----------------------------------------+
| 6      | 1    | packet-flags                           |
+--------+------+----------------------------------------+
| 7      | 1    | packet type                            |
+========+======+========================================+
This permits code to assign a header pointer to the start of a suitably aligned receive buffer. The remainder of the packets are dependent on packet type. We opted for native-byte ordering in the headers (which is possible since eSNMP is restricted to a single host system), but eSNMP uses network-byte ordering of data lengths and values. This doesn't reduce performance as these fields fall on arbitrary byte boundaries and must be parsed byte-by-byte. (We are also hedging against future off-host subagents.)
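As a rough illustration of the fixed header, the following Python sketch packs and unpacks the eight header bytes in native byte order, matching the offsets and sizes given in the layout. The `Header` type, the function names, and the example field values are assumptions made for the sketch; in particular, the text does not say whether the packet length counts the header itself, and no packet type codes are given here, so the numbers used are placeholders.

```python
import struct
from typing import NamedTuple

# "=" selects native byte order with standard sizes and no padding:
# two 16-bit fields followed by four 8-bit fields, 8 bytes in total.
HEADER = struct.Struct("=HHBBBB")


class Header(NamedTuple):
    length: int      # packet length (offset 0, 2 bytes)
    packet_id: int   # packet ID (offset 2, 2 bytes)
    major: int       # protocol major version (offset 4)
    minor: int       # protocol minor version (offset 5)
    flags: int       # packet-flags (offset 6)
    ptype: int       # packet type (offset 7)


def pack_header(header: Header) -> bytes:
    """Serialize the header fields into their 8-byte wire form."""
    return HEADER.pack(*header)


def unpack_header(buf: bytes) -> Header:
    """Read a header from the start of a receive buffer."""
    return Header(*HEADER.unpack_from(buf, 0))


# Placeholder values: the version numbers and type code are hypothetical.
example = Header(length=8, packet_id=7, major=1, minor=0, flags=0, ptype=1)
raw = pack_header(example)
decoded = unpack_header(raw + b"rest of packet")
```

`unpack_from` plays the role of the header pointer mentioned above: it reads the header in place from the front of the buffer without copying the rest of the packet.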
To build a subagent the developer must:
+--------+------+----------------------------------------+
| OFFSET | SIZE | FIELD                                  |
+--------+------+----------------------------------------+
   ... header ...
+========+======+========================================+
| 8      | 2    | timeout                                |
+--------+------+----------------------------------------+
| 10     | 2    | max-vbs = 0                            |
+--------+------+----------------------------------------+
| 12     | var  | subagent description (null-terminated) |
+--------+------+----------------------------------------+

The timeout field permits a subagent to indicate longer than normal latency. The max-vbs field is unused. The description identifies the subagent, and must be unique among connected subagents (it is typically the fully qualified path of the command executing the subagent). There is no on-disk configuration of subagents. This protocol exchange is the only way the master agent is made aware of available subagents.
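The body that follows the header in this packet could be built and parsed roughly as shown here. This is only a sketch: the function names and the sample path are hypothetical, and it assumes the two fixed fields use native byte order like the header, which the text does not state explicitly.

```python
import struct


def build_open_body(timeout, description):
    """Build the part of the packet that follows the 8-byte header."""
    # Assumption: fixed fields use native byte order, like the header.
    fixed = struct.pack("=HH", timeout, 0)  # max-vbs is unused, sent as 0
    return fixed + description.encode("ascii") + b"\x00"


def parse_open_body(body):
    """Return (timeout, max_vbs, description) from a packet body."""
    timeout, max_vbs = struct.unpack_from("=HH", body, 0)
    # The description starts at body offset 4 (packet offset 12).
    end = body.index(b"\x00", 4)
    return timeout, max_vbs, body[4:end].decode("ascii")
```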
Once so connected, a subagent may send any of the control PDUs; in particular, it may register. Data requests may be sent to a subagent whenever the dispatching policy chooses any of its registered subtrees.
If the master agent receives a CLOSE packet from a subagent, or detects from the underlying transport that the subagent cannot receive data, the master agent unregisters all of its subtrees and destroys the logical connection.
The PING PDU is available for subagents to verify the status of the master agent.
+--------+------+-----------------------------------------+
| OFFSET | SIZE | FIELD                                   |
+--------+------+-----------------------------------------+
   ... header ...
+========+======+=========================================+
| 8      | 2    | priority                                |
+--------+------+-----------------------------------------+
| 10     | 2    | timeout                                 |
+--------+------+-----------------------------------------+
| 12     | 2    | view-sel                                |
+--------+------+-----------------------------------------+
| 14     | 2    | bulk-sel                                |
+--------+------+-----------------------------------------+
| 16     | var  | subtree OID (null-terminated, length L) |
+--------+------+-----------------------------------------+
| 16+L   | var  | sub-tree description (null-terminated)  |
+--------+------+-----------------------------------------+

The unit of registration is a single OID. By registering an OID, the subagent indicates it will handle management requests for objects within the subtree named by that OID. A subagent may register any OID at any priority; there is no policy to limit subagents. The OID may represent an entire MIB, a MIB group, a table entry, a partial instance, or a full instance. This is left entirely to the discretion of the subagent developer.
The master agent handles overlapping registration by splitting affected subtrees into smaller ranges that are either exact duplicates, or no longer overlapping. For instance, suppose component agent A registers ip (1.3.6.1.2.1.4) and subsequently component agent B registers ipNetToMediaTable (1.3.6.1.2.1.4.22). The master agent splits the registry into 3 OID ranges:
(1.3.6.1.2.1.4    upto 1.3.6.1.2.1.4.22)  subagent A
(1.3.6.1.2.1.4.22 upto 1.3.6.1.2.1.4.23)  subagents A, B
(1.3.6.1.2.1.4.23 upto 1.3.6.1.2.1.5)     subagent A

Here ``upto'' means ``up to but not including''. If component agent C now registers mib-2 (1.3.6.1.2.1) this is split into OID ranges, resulting in:
(1.3.6.1.2.1      upto 1.3.6.1.2.1.4)     subagent C
(1.3.6.1.2.1.4    upto 1.3.6.1.2.1.4.22)  subagents A, C
(1.3.6.1.2.1.4.22 upto 1.3.6.1.2.1.4.23)  subagents A, B, C
(1.3.6.1.2.1.4.23 upto 1.3.6.1.2.1.5)     subagents A, C
(1.3.6.1.2.1.5    upto 1.3.6.1.2.2)       subagent C

The policy we used for overlapping registrations is that only 1 of them is active at a given time. That's the one with the highest priority, or, in the case of a tie, the one most recently registered. Hence, in the example above, assuming all subagents used the default priority, subagent C would have the active ranges.
Subsequent UNREGISTER packets of course cause inactive ranges to bubble up. If subagent C unregistered mib-2, then B would own the active range from 1.3.6.1.2.1.4.22 upto 1.3.6.1.2.1.4.23, and A would own two ranges, 1.3.6.1.2.1.4 upto 1.3.6.1.2.1.4.22, and 1.3.6.1.2.1.4.23 upto 1.3.6.1.2.1.5.
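The splitting and tie-breaking rules above can be sketched as follows. This is not the master agent's actual registrar: the Registry class and its methods are hypothetical, OIDs are modeled as tuples of integers, and the sketch assumes that a numerically larger priority value means higher priority.

```python
def parse_oid(text):
    """Convert dotted text such as '1.3.6.1.2.1.4' to a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def subtree_end(oid):
    """First OID past the subtree: increment the last sub-identifier."""
    return oid[:-1] + (oid[-1] + 1,)


class Registry:
    def __init__(self):
        # Each entry: (start, end, priority, sequence, subagent)
        self._registrations = []
        self._sequence = 0

    def register(self, subagent, oid_text, priority=0):
        start = parse_oid(oid_text)
        self._sequence += 1
        self._registrations.append(
            (start, subtree_end(start), priority, self._sequence, subagent)
        )

    def unregister(self, subagent, oid_text):
        start = parse_oid(oid_text)
        self._registrations = [
            r for r in self._registrations
            if not (r[4] == subagent and r[0] == start)
        ]

    def ranges(self):
        """Return [(start, end, owners)], with the active owner listed first."""
        points = sorted({p for r in self._registrations for p in r[:2]})
        result = []
        for low, high in zip(points, points[1:]):
            owners = [
                r for r in self._registrations if r[0] <= low and high <= r[1]
            ]
            if owners:
                # Highest priority wins; ties go to the latest registration.
                owners.sort(key=lambda r: (r[2], r[3]), reverse=True)
                result.append((low, high, [r[4] for r in owners]))
        return result
```

Registering A, B, and C as in the example yields five ranges with C active in each; after C unregisters mib-2, the active owners become A, B, and A.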
We didn't include any way for a subagent to be aware of what ranges it is active for, mainly because we couldn't think of what subagent code could do about it. When signaled, the master agent dumps the current registry to a file. This enables local operator intervention when a configuration is unsatisfactory.
Priorities are viewed as a mechanism used by cooperating subagents to provide precedence and backup coordination without explicitly having to check for each other's existence.
.----------------------------------------------------------.
| Layout of a varbind section                              |
+--------+------+------------------------------------------+
| OFFSET | SIZE | FIELD                                    |
+--------+------+------------------------------------------+
   ...
+--------+------+------------------------------------------+
| L      | 1    | varbind-flags                            |
+--------+------+------------------------------------------+
| L+1    | var  | Starting OID (null-terminated string)    |
+--------+------+------------------------------------------+
| L+M+2  | var  | Ending OID (null-terminated string)      |
+========+======+==========================================+
  The following only on SET and RESPONSE to GET, GETNEXT,
  and GETBULK:
+--------+------+------------------------------------------+
| L+M+N+3| 1    | varbind variable type                    |
+--------+------+------------------------------------------+
| L+M+N+4| 2    | varbind data length (network-byte order) |
+--------+------+------------------------------------------+
| L+M+N+6| var  | varbind data value                       |
+--------+------+------------------------------------------+
   ...
+----------------------------------------------------------+
| Notes: L - current position in message                   |
|        M - strlen(Starting OID), without terminator      |
|        N - strlen(Ending OID), without terminator        |
`----------------------------------------------------------'
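A rough sketch of encoding and decoding one varbind in this layout follows. The function names are hypothetical, the type code used in any example is arbitrary (the text does not list eSNMP's type codes), and the sketch assumes the 2-byte data length immediately follows the 1-byte type.

```python
import struct


def _cstring(text):
    """Encode text as a null-terminated ASCII string."""
    return text.encode("ascii") + b"\x00"


def encode_varbind(flags, start_oid, end_oid="", value=None):
    """value is None for requests, or a (type_code, data_bytes) pair."""
    out = bytes([flags]) + _cstring(start_oid) + _cstring(end_oid)
    if value is not None:
        type_code, data = value
        # The data length is a 16-bit integer in network byte order.
        out += bytes([type_code]) + struct.pack("!H", len(data)) + data
    return out


def decode_varbind(buffer, offset, has_value):
    """Return ((flags, start_oid, end_oid, value), next_offset)."""
    flags = buffer[offset]
    offset += 1
    oids = []
    for _ in range(2):  # starting OID, then ending OID
        end = buffer.index(b"\x00", offset)
        oids.append(buffer[offset:end].decode("ascii"))
        offset = end + 1
    value = None
    if has_value:
        type_code = buffer[offset]
        (length,) = struct.unpack_from("!H", buffer, offset + 1)
        data = buffer[offset + 3:offset + 3 + length]
        value = (type_code, data)
        offset += 3 + length
    return (flags, oids[0], oids[1], value), offset
```

Because the OID strings have variable length, every later field lands on an arbitrary byte boundary, which is why parsing proceeds field by field.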
eSNMP supports both SNMPv1 and SNMPv2 SMI. GET, GETNEXT, and GETBULK packets use the following format:
+--------+------+------------------------------------------+
| OFFSET | SIZE | FIELD                                    |
+--------+------+------------------------------------------+
   ... header ...
+========+======+==========================================+
| 8      | 4    | non-repeaters                            |
+--------+------+------------------------------------------+
| 12     | 4    | max-repetitions                          |
+--------+------+------------------------------------------+
| 16     | 4    | sec-len                                  |
+--------+------+------------------------------------------+
| 20     | var  | security data (octet sec-len bytes long) |
+--------+------+------------------------------------------+
| 20+M   | var  | varbinds, (see above for layout)         |
+--------+------+------------------------------------------+
|                                                          |
| Notes: M is size of the security data which is value of  |
|        sec-len.                                          |
`----------------------------------------------------------'

For these PDUs, the packet-flags field of the header contains the SNMP version of the original request. non-repeaters and max-repetitions are both 0 for GET and GETNEXT packets. For GETBULK packets, non-repeaters may be initially adjusted in light of the varbinds sent to each agent, and max-repetitions may be adjusted subsequently if endOfMibView is returned for any varbind and more GETBULK packets are issued.
Currently the security field is always empty, although there is a separate sec-len field in anticipation of passing formatted data for security/naming scope information.
For a GET packet, the starting OID is the requested OID. It must be within the range of a subtree registered by the subagent. For the GETNEXT and GETBULK packets, the starting OID is either the requested OID (if that falls within a registered subtree), or the OID of the registered subtree that is lexi-next after the requested OID. In this latter case the varbind-flags field is set to indicate this.
The ending OID is empty for GET packets, and on GETNEXT or GETBULK packets it is set to the end (noninclusive) of the range of OIDs that this subagent may respond for this varbind. (Due to overlaps and splitting by the master agent, this search range does not necessarily include the entire registered subtree; however, the search range will NEVER span multiple registered subtrees.)
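The choice of starting OID, ending OID, and flag for a GETNEXT varbind might be sketched as below. The function dispatch_next and its data model (a sorted list of non-overlapping active ranges, with OIDs as integer tuples) are assumptions made for illustration, not the master agent's dispatcher.

```python
import bisect


def dispatch_next(active_ranges, requested):
    """Pick the subagent and search range for one GETNEXT varbind.

    active_ranges: sorted, non-overlapping (start, end, subagent) tuples.
    Returns (subagent, starting_oid, ending_oid, flag), or None when the
    requested OID lies beyond the last registered range.
    """
    starts = [r[0] for r in active_ranges]
    # Index of the last range whose start is <= the requested OID.
    i = bisect.bisect_right(starts, requested) - 1
    if i >= 0 and requested < active_ranges[i][1]:
        _, end, subagent = active_ranges[i]
        return subagent, requested, end, False  # inside a registered range
    if i + 1 < len(active_ranges):
        start, end, subagent = active_ranges[i + 1]
        return subagent, start, end, True  # moved to the lexi-next subtree
    return None
```

The returned ending OID never extends past the chosen range, matching the rule that a search range does not span multiple registered subtrees.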
When the subagent receives these requests, it searches its registered subtrees for the one containing starting OID, and then through a linkage to the object type table finds the correct object type. It then calls the indicated method routine, passing the value of starting OID and an indication of the request type (along with other information specific to the API).
If the varbind-flags bit is set, the subagent modifies this procedure slightly to perform a get operation, regardless of the actual request. This is how possible instance level registrations are handled, since the master agent is unaware of what the registered OIDs mean. If the GET processing fails, the subagent proceeds with normal NEXT/BULK processing if that was the original request. (Recall that the subagent developer is unaware of this processing.)
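The "try GET first, then fall back to NEXT" behavior can be illustrated with a small sketch. Here a plain dictionary of instance OIDs stands in for the object type table and its method routines; handle_next and that data model are hypothetical simplifications.

```python
def handle_next(table, start_oid, end_oid, flag_set):
    """Answer one GETNEXT varbind inside a subagent.

    table maps instance OID tuples to values. Returns (oid, value),
    or None when nothing lies within the search range.
    """
    if flag_set and start_oid in table:
        # The starting OID may itself be a registered instance,
        # so an exact GET is attempted before any NEXT processing.
        return start_oid, table[start_oid]
    # Normal NEXT processing: first instance after start_oid,
    # bounded by the (noninclusive) ending OID.
    candidates = [oid for oid in table if start_oid < oid < end_oid]
    if not candidates:
        return None  # the caller would report endOfMibView
    nxt = min(candidates)
    return nxt, table[nxt]
```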
The subagent processes all varbinds in the request packet in this manner with the inclusion of a step that checks the packet-flags field of the header for the SNMP version of the request before calling any method routine. Object types in the table with SNMPv2-only syntax are ignored on SNMPv1 requests.
The data or error information for each varbind is eSNMP/DPI encoded and a RESPONSE packet sent back to the master agent. In these packets the starting OID and associated data are the varbind to return to the original requester. The varbinds in this response packet are returned in the order they were sent in the request packet.
The SET, COMMIT, UNDO, and CLEANUP packets use the same format as the GET* packets; the only difference is that the varbind sections contain data in a SET packet.
The dispatch processing at the master agent is the same as for GET* requests except that another bit in the header's packet-flags field is set if all requested varbinds are dispatched to the same subagent. In this case, the subagent performs the SET, COMMIT, potentially UNDO, and CLEANUP phases itself, and returns a single RESPONSE. Otherwise, the master agent must itself initiate the SET, COMMIT, optionally the UNDO, and the CLEANUP phases by sending these packets to each involved subagent.
Note that no check-consistency PDU needs to be issued by the master agent, since the subagent receives ALL varbinds destined for it in any request PDU. Consistency checking (and other very useful aids to help subagent developers perform set-related operations) then becomes an API issue.
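When a set request spans several subagents, the master agent drives the phases itself. The following sketch shows one plausible ordering; run_set and the send callback are hypothetical, and the exact failure handling (which subagents receive UNDO and CLEANUP) is an assumption, since the text does not spell it out.

```python
def run_set(subagents, send):
    """Drive SET, COMMIT, UNDO, and CLEANUP across several subagents.

    send(subagent, phase) returns True when the subagent reports success.
    Returns True only if every subagent staged and committed its varbinds.
    """
    staged = []
    ok = True
    # Phase 1: each subagent checks and stages its own varbinds.
    for s in subagents:
        staged.append(s)
        if not send(s, "SET"):
            ok = False
            break
    if ok:
        # Phase 2: commit in order, stopping at the first failure.
        committed = []
        for s in subagents:
            if send(s, "COMMIT"):
                committed.append(s)
            else:
                ok = False
                break
        if not ok:
            # Roll back the subagents that had already committed.
            for s in reversed(committed):
                send(s, "UNDO")
    # Final phase: everyone that received a SET releases its resources.
    for s in staged:
        send(s, "CLEANUP")
    return ok
```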
+--------+------+----------------------------------+
| OFFSET | SIZE | FIELD                            |
+--------+------+----------------------------------+
   ... header ...
+========+======+==================================+
| 8      | 4    | error-code                       |
+--------+------+----------------------------------+
| 12     | 4    | error-index                      |
+--------+------+----------------------------------+
| 16     | var  | additional data (see discussion) |
+--------+------+----------------------------------+

Responses to requests use SNMPv2 compatible error codes, and use the error-index field. The master agent maps the code to SNMPv1 if required, and maps the error index for all requested varbinds (not just those sent to this subagent). Successful response varbinds can contain endOfMibView, indicating the master agent needs to roll over to the next registered subtree. SNMPv2 support in the master agent is being added at the time of this writing.
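The error-index mapping mentioned above amounts to remembering where each dispatched varbind sat in the original request. A minimal sketch, with a hypothetical function name and a 1-based index convention as in SNMP:

```python
def map_error_index(sent_positions, subagent_error_index):
    """Translate a subagent's error-index to the original request.

    sent_positions[i] is the 1-based position, in the original request,
    of the (i+1)-th varbind sent to this subagent.
    """
    if subagent_error_index == 0:
        return 0  # no error was reported
    return sent_positions[subagent_error_index - 1]
```

For example, if a subagent received the 2nd and 5th varbinds of a request and reports an error at index 2, the manager should see error-index 5.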
Responses to the OPEN PDU contain a UNIX-style timestamp, from which an API routine can calculate sysUpTime-like timetick values. Responses to other PDUs contain no additional data.
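A sketch of how an API routine might derive a sysUpTime-like value from that timestamp follows. It assumes the timestamp is the master agent's start time in seconds since the UNIX epoch; sys_uptime_ticks is a hypothetical name.

```python
import time


def sys_uptime_ticks(master_start, now=None):
    """Hundredths of a second elapsed since master_start.

    The result wraps at 2**32, as a 32-bit TimeTicks value does.
    """
    if now is None:
        now = time.time()
    return int((now - master_start) * 100) & 0xFFFFFFFF
```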
eSNMP permits ``table sharing by brute force''. Subagent A can register ifIndex.1, ifDescr.1, etc., and subagent B can register ifIndex.2, ifDescr.2, and so on. It works, but AgentX needs to do better in 2 areas. First, the registration syntax should probably allow registering an entire row in one operation. Second, a bigger problem is that of index collision in shared tables. AgentX will need to provide mechanisms for subagents to reserve indexes, so that subagents that share tables can be truly independent, relying only on AgentX services (as opposed to allocating indexes amongst themselves).
AgentX may contain explicit support for augmenting tables, and may provide an index subscription service, so subagents may learn about existing rows in tables of interest, and be notified of changes. Since eSNMP does not have special registration syntax for sharing tables, and its dispatching policy limits each varbind to at most 1 subagent, it is impossible to create rows in shared tables. This is unacceptable for AgentX. In contrast, eSNMP exists on a UNIX platform in which network interfaces are rationalized within the kernel, and the operating system implements MIB support for MIB-II, media-specific MIB modules, and Host Resources MIB. This removed any need to share ifTable between subagents, and so removed most of the impetus to provide more explicit support of table sharing. Finally, eSNMP does not carry context/naming scope information, nor does it carry access control information (community names, views). AgentX is likely to require support for naming scope at least.
From the perspective of providing an application development environment, feedback has reinforced this division of labor and general framework. Issues typically involve the clarity of the documentation, particularly at the method routine interface.
My personal belief is that subagent API/toolkit development is best left to the vendors who specialize in that area, and that differentiation in this area is not a ``Bad Thing''. The challenge before the AgentX working group is to provide a standard subagent protocol that is functionally equivalent to, though not binary compatible with, currently deployed protocols. This would enable the wide variety of independently-developed subagents and master agents to interoperate, which is the ultimate goal of our efforts.
I look forward to the day I can support AgentX in our operating system!
``How do I get my management information into the SNMP agent that another group provided?''
And from someone who was producing the SNMP agent:
``How do I allow all these separate projects (separate processes, cards, devices, whatever) to get their management information to me?''
These are natural enough questions given the circumstances. Unfortunately they are not the correct questions. The questions, as asked, specify the answer. If you back up a little and look at the larger management picture, rather than concentrating solely on the agent, then you get a different question and possibly different answers. The question we should be asking is:
``How do we get management information from a variety of sources on a host, back to the manager?''
This article looks at using SNMP itself as the means for doing this communication. After all, moving management information around is what SNMP was designed to do. The technique is called SNMP proxy. Using proxy to extend agents requires no new protocol design, though a proxy registration MIB would be helpful. It would allow for smoother, automatic management station configuration and dynamic reconfiguration of agents.
Many people I've talked to seem astonished even to hear ``proxy'' and ``agent extensibility'' in the same sentence. Apparently, they think the two are completely unrelated. So let's look at how something as simple as proxy can help.
As I use the term in this article, proxy is essentially what SNMPv2 classic meant by proxy. An SNMP packet arrives at a well known service location (such as UDP port 161) where it is received by the SNMP proxy agent. This agent determines, from the community string in the packet, that the packet is intended for a different SNMP agent called the proxy target. The proxy agent relays the SNMP packet, selecting a community string for the proxy target along with a new request-ID. The proxy agent also keeps a bit of state around, referenced by the new request-ID, containing the original requester's addressing information, community string, and request-ID so when the proxy target responds to the proxy agent, it is able to relay the response back to the original requester.
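The relay state described above can be sketched in a few lines. The ProxyRelay class, its method names, and the shape of the target table are illustrative assumptions; real packet encoding and transport are omitted.

```python
import itertools


class ProxyRelay:
    def __init__(self, targets):
        # targets maps the community string used by the manager to
        # (proxy target address, community string for that target).
        self._targets = targets
        self._pending = {}
        self._ids = itertools.count(1)

    def forward(self, source_addr, community, request_id):
        """Choose the proxy target and record state for the reply."""
        target_addr, target_community = self._targets[community]
        new_id = next(self._ids)
        # Keep what is needed to relay the response to the requester.
        self._pending[new_id] = (source_addr, community, request_id)
        return target_addr, target_community, new_id

    def relay_response(self, new_id):
        """Return the original requester's details, or None if unknown."""
        return self._pending.pop(new_id, None)
```

A production proxy would also expire entries in the pending table when a target never answers.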
The most common use of extensible agents is the splitting of MIB modules over several agents. Maybe it's different cards in a backplane, each with its own processor. You don't want to burn the MIB module into a single processor, thereby limiting flexibility in adding new MIB support as you come out with new cards for your system. Maybe it's different processes on a UNIX system, written in some cases by different companies.
On the agent end, just run one agent as the proxy agent and management stations talk to all the other agents in the system through it. When the management station first contacts the agents, it reads sysObjectID and other similar information from each agent to discover what that agent does. Remember, each proxy target is an SNMP agent and has its own system group. Then, when asked to show some aspect of the managed device, the management station directs its requests to the right proxy target by selecting the right community string and retrieves information as usual with SNMP.
What if the information is spread across multiple proxy targets? This isn't likely to be a common situation, as any information that's likely to be grouped on the manager's screen is also likely to be grouped into a single proxy target, but it's easily handled. The manager simply makes requests of each proxy target, collects all the information available from the lot of them, and displays the results in a unified manner.
Another common situation is the splitting of an SNMP table across several agents, most often the interface table, e.g., consider a backplane system where each interface card wants to run its own entry in the interfaces table and its own transmission group MIB module. Each interface card runs its own proxy target with those MIBs. Obviously, each interface needs to pick values for ifIndex, and they're very likely going to all pick 1. That's not a problem, since the management station knows they're different interfaces because it talked to each proxy target separately to retrieve the entire interface table. If the management station were to simply display the values of ifIndex, it would be rather confusing. So, instead of displaying the raw information, a very simple transformation of the data gives each interface a unique number, or even a name, which is likely to be more palatable to most human users.
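The "very simple transformation" could look something like this on the management station. The function name and the table representation (one dictionary of ifIndex to ifDescr per proxy target) are assumptions for illustration.

```python
def merge_interface_tables(tables):
    """Give every interface a unique display number.

    tables maps a proxy target name to {ifIndex: ifDescr}.
    Returns (display_number, target, ifIndex, ifDescr) tuples.
    """
    merged = []
    for target in sorted(tables):
        for if_index in sorted(tables[target]):
            display_number = len(merged) + 1
            merged.append(
                (display_number, target, if_index, tables[target][if_index])
            )
    return merged
```

Each row keeps the original target and ifIndex, so the station can still address the right proxy target when the user selects an interface.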
The final common example we'll examine here is multiple instances of the same MIB module. You have a MIB module which instruments some part of your system and one day you get a second instance of that part of your system and so need a second instance of that MIB. This may be a duplicate board plugged into a backplane, or some instrumented UNIX application that was run twice. With proxy, operation on the agent's end is just like any other extended agent: same MIB module, different MIB module, it just doesn't matter.
The management station has a couple of choices, though. It could show the two instances of the same MIB module as if there were two devices, or it could try to merge the two into one, like merging multiple ifTables into a single large ifTable. The choice comes down to which provides the clearest picture for the human user, and the choice can be left to the management station rather than trying to pre-decide in the agent.

Since each of these proxy targets is an SNMP agent in its own right, why bother with proxy at all? Why not have the managers just talk directly to the targets? There are two main reasons: security and configuration.
Even though it might be possible to talk directly to proxy targets, rather than through the proxy agent, concentration of security configuration makes proxy still useful.
One possibility is to design a proxy registration protocol, but a simpler solution is simply to write a registration MIB module to be implemented by the proxy agent. To register, proxy targets write themselves into the table in the proxy agent. The information in this MIB need only be contact information for the proxy target: the community string to use and the transport type and address. No information about the proxy target itself is needed, because management stations can query the proxy target directly for anything they want to know. This MIB module also then provides the means whereby a management station can discover all the proxy targets it might want to talk to.
Actually, one additional piece of information would be useful in this proxy registration MIB. Some proxy targets are agent extensions, and others are just other agents being accessed through a proxy. For a management station which is trying to do a good job of displaying an extended agent as a single device, it must distinguish the two cases.
Another issue with this registration table is how it gets garbage collected. If a proxy target goes away willingly, it can obviously just remove itself from the registration table. But what about otherwise? Since a proxy agent needs to keep state around to eventually relay the response, it also needs to time out this state in the event that the proxy target disappears. After a few occurrences of this, the proxy agent could mark this proxy target in the proxy table as having a problem. At that point the proxy agent could actively interrogate the proxy target, or it could wait for a few more timeouts and then remove the entry from its table.
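One way to sketch this timeout-driven garbage collection is a per-target counter of consecutive timeouts. The TargetHealth class and both thresholds are hypothetical; the text says only "a few" timeouts.

```python
class TargetHealth:
    def __init__(self, suspect_after=3, remove_after=6):
        self.suspect_after = suspect_after
        self.remove_after = remove_after
        self.timeouts = {}  # target -> consecutive timeout count

    def on_response(self, target):
        """A relayed response arrived, so the target is alive."""
        self.timeouts[target] = 0

    def on_timeout(self, target):
        """Record a timed-out request; return 'ok', 'suspect', or 'remove'."""
        count = self.timeouts.get(target, 0) + 1
        self.timeouts[target] = count
        if count >= self.remove_after:
            del self.timeouts[target]
            return "remove"  # drop the entry from the registration table
        if count >= self.suspect_after:
            return "suspect"  # mark the entry as having a problem
        return "ok"
```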
If the only part of the system you control is an SNMP agent, then you have to do whatever it is you're trying to do within that agent. Stepping back and looking at a larger network management system lets you ask,
``Where's the best place to aggregate information from a variety of sources?''
You may get a different answer than when you restrict your vision to just the agent.
The rules for defining MIB modules are generally referred to as the Structure of Management Information (the SMI). There are three documents which specify these rules:
Nevertheless, this interval issue has been debated many times. For example, what happens with the table totaling all intervals? Unfortunately, there is only so much you can do to convey correct information, and the law of diminishing returns is looming. The problem may be partly solved by mandating that the value 0 be used for gaps in the table when calculating the totals, and by defining a separate object indicating potentially missing intervals.
Of course, this practice is redundant and therefore should be avoided. But is it wrong? In the interests of simple agents this practice should be avoided as general policy. But one can argue that a large network is not helped by managers having to retrieve large numbers of counters continuously, and is better off storing some history in the agents (for example, this capability could be made configurable, allowing probing in particular areas of interest in a network).
The good news is that I've received only positive comments on the previous issue. One reader exclaimed:
``More technical content per square of paper than I've seen in a long time...''
Similarly, the unofficial index of IETF MIB modules has also proven popular, with many asking why all RFCs couldn't be published in HTML.
Reader feedback was positive on this change, and, as luck would have it, a major topic in the SNMP community, agent extensibility, is now reaching a consensus point. So, this is a special issue of The Simple Times focusing on that special topic, with coverage of five articles!
Historically, I have long opposed a standards-based approach to agent extensibility. I felt that the issues are too implementation-specific to favor standardization; further, I felt that an imperfect standardized resolution of these issues would diminish the correct behavior of an SNMP implementation. Over time, however, the issues have become clearly understood and there is now sufficient experience for a standards-based effort to proceed and succeed. As such, I am pleased that the major contributors to the IETF effort on agent extensibility were able to contribute to The Simple Times.
I think it fair to speculate that we will have another special issue later this year, dealing with another major topic as it approaches consensus. I'll leave that to the reader to guess which topic that might be. (Hint: it doesn't, thankfully, deal with security!) Of course, The Simple Times still needs your help: please consider contributing a technical article to the community! The publication schedule is quarterly, so that's plenty of time for you to do some serious writing.
Full Standards:
Full Standards:
MIB module checking:
MIB module conversion:
Agents:
For more information: +1 408 459 9817.
The Simple Times also solicits terse announcements of products and services, publications, and events. These contributions are reviewed only to the extent required to ensure commonly-accepted publication norms.
Submissions are accepted only via electronic mail, and must be formatted in HTML version 1.0. Each submission must include the author's full name, title, affiliation, postal and electronic mail addresses, telephone, and fax numbers. Note that by initiating this process, the submitting party agrees to place the contribution into the public domain.
Back issues are available, either via the Web or anonymous FTP.
In addition, The Simple Times has several hard copy distribution outlets. Contact your favorite SNMP vendor and see if they carry it.