How to use ExoGENI

Various GENI aggregates are managed by different control framework software elements, e.g. ProtoGENI/Emulab, PlanetLab, ORCA, FOAM, etc.

ExoGENI is managed through a combination of control framework software, namely ORCA and FOAM. ORCA is used to create integrated slices of compute elements and Layer2 links, with optional OpenFlow integration. FOAM is used to manage standalone OpenFlow experiments. A critical aspect of ExoGENI is that it can be viewed both as a collection of independently managed racks and as a single aggregate capable of creating complex inter-rack topologies.

ExoGENI offers two basic classes of compute slivers: VMs and bare-metal servers. VMs are offered in a number of subclasses, which differ from one another by the number of CPU cores, amount of RAM and amount of disk made available in each; VM classification is based on the EC2 instance type hierarchy adapted to the ExoGENI hardware and OpenStack.

VM images can be built by experimenters off-line. In contrast, a limited number of bare-metal images are periodically synchronized out-of-band between the racks and a central repository. These are provided as-is, with no customization options other than post-installing needed packages in a booted sliver.

Experimental slices can have complex topologies of compute instances/slivers that span one or more ExoGENI (and non-ExoGENI) sites. Experimenters can get resources from specific racks; however, the true power of ExoGENI lies in its ability to automatically tie resources of multiple racks into Layer2 topologies. This is accomplished by submitting slice requests to ExoSM.

Experimenters can submit their slice requests using GENI tools (and GENI AM APIs) to specific SMs, or they can use Flukes – ORCA's GUI, which operates on ORCA's native interface and semantic-web resource representations. Both types of requests are processed in the same pipeline within the SM, so the same authorization policies are respected and both operate on the same credential sets.

If you are coming from Emulab/InstaGENI/ProtoGENI/PlanetLab environment

This is a summary of noteworthy facts comparing ExoGENI to other testbeds. It is important to understand that ExoGENI presents different types of resources than other testbeds, as well as a different partitioning of responsibilities between the control framework (ORCA) and user tools (Flukes, Flack, omni). The details can be found further in this document:

  1. Your credentials (the .pem file you've downloaded from the portal) will work.
  2. There is a separate mailing list for ExoGENI users (a Google group called 'geni-orca-users'; be sure to provide the name of your project and institution when signing up) for getting assistance. Asking questions on help@geni.net can be a start, but you will generally be redirected to the Google group.
  3. There is a broad choice of experimenter tools, some specific to ExoGENI, some applicable to ExoGENI and other GENI testbeds. They vary in their expressiveness and in their ability to accomplish different tasks.
  4. ExoGENI offers two broad categories of compute nodes. You can get KVM virtual machines of multiple sizes, as well as bare-metal nodes. This contrasts with e.g. OpenVZ or other container types available in other testbeds. Compute instances in ExoGENI can behave differently than in other testbeds:
    • They are well-isolated from other slices and represent completely separate compute nodes with their own operating system, on which the slice owner is root.
    • Command-line reboots are not guaranteed to succeed (though they generally do if the boot process completes without errors).
  5. ExoGENI does not oversubscribe its resources: each rack site has a certain number of CPU cores dedicated to launching virtual machines. Each VM can have multiple cores. When all the cores in the rack are exhausted, no more compute instances can be created on this rack. This is done to improve performance isolation between experiments.
  6. ExoGENI is a 'Bring Your Own Image' (BYOI) testbed. You can build your own image outside the testbed, post it on any webserver (or use Dropbox), create an image description file and launch your slice with this image. ExoGENI does not have its own image hosting service. While we offer a number of images to the community, these are merely examples. We encourage experimenters to build their own images, upload them to their own webservers and list them with us.
    • Compute images from other testbeds generally are not portable to ExoGENI.
    • ExoGENI offers scripts to snapshot the running VM (or even a VirtualBox VM) that you can then retrieve and turn into a new image.
    • Once the image is built, it can be used across the entire testbed - there is no association between images and ExoGENI sites.
  7. ExoGENI is designed as an Infrastructure-as-a-Service testbed, on which experiments act as applications. This means that the default behavior of ExoGENI slivers is minimal. ExoGENI tries not to make assumptions about what elements of the slice should or should not do, and leaves it up to experimenter tools to configure the correct behavior. Common examples of differences from other testbeds include:
    • Experimenters log in to the provisioned slivers directly (there is no need to go through any other nodes)
    • The IP address through which experimenters access their slivers (referred to as the 'management plane address') cannot be externally specified and is instead returned as part of the slice manifest
    • If no IP address is specified for a dataplane interface (as opposed to the management plane) in a slice, the network interface will be created in the sliver but left in an unconfigured/down state. ExoGENI's native experimenter tool Flukes offers an 'automatic IP address assignment' option; other tools may not
    • Nodes do not forward traffic by default; however, this can be easily enabled with post-boot scripts, if desired
  8. ExoGENI offers several unique features (check the tools section to see which tools are compatible):
    • Rich scripting and templating capabilities
    • Ability to provision large compute clusters as NodeGroups
    • Ability to modify NodeGroups dynamically (add/remove nodes) once the slice is up
    • Ability to create complex inter-rack topologies (using ExoSM)
    • Ability to create storage slivers (iSCSI volumes) as part of the slice topology
  9. ExoGENI has its own very robust stitching engine that predates GENI stitching by several years. For this reason ExoGENI can be viewed as a collection of disconnected racks or as a single interconnected aggregate, depending on whether you choose to acquire your resources directly from a rack or via ExoSM. When using ExoSM, ExoGENI does its own stitching of slices across multiple racks. You simply need to indicate the embedding of pieces of the slice in the required racks in the slice request (either in the Flukes GUI or in your slice RSpec) and let ExoGENI do the rest.
  10. If you are used to 'hunting' for resources on other testbeds, that isn't necessary on ExoGENI. Binding of slivers to racks (i.e. explicitly assigning resources to specific racks) is optional and is only needed if there is a specific reason to use a specific rack. If you want a slice that fits in one rack (up to several dozen nodes), leave your request unbound (don't indicate which rack it should go to) and let the ExoGENI control software locate a free rack for you. The resource splitting and placement algorithm can, in fact, locate available resources and automatically split a slice that is too large to fit in one rack across several racks.
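
To illustrate point 7 above: in a GENI RSpec v3 request, a dataplane interface can be given an explicit IP address with an element like the following sketch. The client_id names, the address, and the 'XOSmall' size name are placeholder assumptions, not values from this document:

```xml
<node client_id="node0">
  <sliver_type name="XOSmall"/>
  <interface client_id="node0:if0">
    <!-- Without an ip element, the interface is created but left unconfigured/down -->
    <ip address="172.16.1.1" netmask="255.255.255.0" type="ipv4"/>
  </interface>
</node>
```

Tools such as Flukes can generate this assignment automatically; with other tools the ip element may need to be written by hand.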

Read the ExoBlog for guides on how to do interesting things with ExoGENI.

The Tools

ExoGENI exposes a number of APIs that a variety of tools can speak. Today the primary user-facing tools are:

  • Flukes - a graphical tool developed specifically for ORCA/ExoGENI to take advantage of its unique features. It uses ORCA's native APIs and resource descriptions and is not compatible with other testbeds.
  • Omni - a command-line tool developed by the GPO. It uses GENI APIs and GENI RSpecs and can be used with other GENI aggregates.
  • Flack - a Flash-based tool developed for Emulab and built into the GENI Portal.

Feature comparison matrix

Feature | Flukes | Omni | Flack/Jacks | GENI Portal
--- | --- | --- | --- | ---
Installation | JNLP (downloadable), requires Java | Requires Python | Browser-based (Flash) | Browser-based (Flash)
Interface | Graphical | Command-line | Graphical | Graphical
Description language | ORCA NDL-OWL | GENI RSpec | GENI RSpec | GENI RSpec
Compatibility | ExoGENI only | GENI | GENI | GENI
Creating unbound slices 1) | Yes | Yes | No | No
ExoGENI any-to-any VLAN stitching between racks | Yes | Yes | Yes | Yes
ExoGENI nodegroups | Yes | No | No | No
ExoGENI slice modify | Yes | No | No | No
Using ExoGENI slice-to-slice stitching capabilities | Yes | No | No | No
ExoGENI templated post-boot scripts | Yes | Yes 2) | Partial 3) | Partial 4)
Stitching to campus networks | Yes | No | No | No
Storage slivering 5) | Yes | No | No | No
Multi-point Layer2 connections | Yes | Partial 6) | No | No
Automatic IP address assignment | Yes | No | No 7) | No
iRods support | Yes | No | No | Yes

Gaining access and using the testbed

ExoGENI delegates user authorization to GENI (GPO); its current authorization policy, as implemented in the ORCA SM, matches that of ProtoGENI. ExoGENI honors identity certificates issued by the GENI Project Office and Emulab, which means valid users must present certificates issued by either of those authorities.

To gain access to ExoGENI you must:

  • Obtain credentials from the GPO. Take the time to read this.
  • Join the geni-orca-users Google group. Please indicate the project and institution you are with when subscribing. Subscription requests without this information will not be approved.
    • This is where you ask for help for ExoGENI-specific issues/problems.
  • Decide on the tools you are going to use.
  • Check the availability of the testbed via the calendar. You can even subscribe to the calendar to see maintenance events.
  • Read the tutorials and ExoBlog for examples of what is possible.

Communications with Operations Team/Testbed status

Please look at this page describing the various notification/communication mechanisms.

Acquiring resources from ExoGENI

Compute Resources

ExoGENI offers two basic classes of compute slivers: VMs and bare-metal servers. VMs are offered in a number of subclasses, which differ from one another in the number of CPU cores, amount of RAM and amount of disk made available in each; VM classification is based on the EC2 instance type hierarchy adapted to the ExoGENI hardware and OpenStack (i.e. the allocations of CPU cores, RAM and disk do not match EC2's). The class.subclass name can be specified in the sliver_type element of a node's RSpec.
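
For example, a VM subclass is requested in a node's sliver_type element roughly as sketched below. The client_id is a placeholder, and 'XOSmall' is one commonly cited ExoGENI VM size name; consult the rack's advertisement for the exact sliver_type names:

```xml
<node client_id="vm0" exclusive="false">
  <!-- class.subclass of the compute sliver; here, a small KVM VM -->
  <sliver_type name="XOSmall"/>
</node>
```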

Since ExoGENI permits experimenters to use images of their own creation in slices (rather than only a set of pre-approved images pulled from a common repository), images must be specified differently: by the URL of an ImageProxy metafile and its hash (used to detect malicious modifications to the metafile). See more on using VM images and the relevant RSpec conversion conventions.
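
Following that convention, an image reference in a request can be sketched as below, where the disk_image name carries the metafile URL and the version carries its SHA-1 hash. The URL, hash and size name shown are placeholders:

```xml
<sliver_type name="XOSmall">
  <!-- name: URL of the ImageProxy metafile; version: SHA-1 hash of that metafile -->
  <disk_image name="http://www.example.com/images/my-image.xml"
              version="0123456789abcdef0123456789abcdef01234567"/>
</sliver_type>
```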

For bare-metal instances, ExoGENI uses the 'ExoGENI-M4' type name (or 'raw-pc' for compatibility with GENI) and conforms to the existing RSpec conventions used in ProtoGENI for specifying the name/version of a pre-approved bare-metal image that must exist in each rack. These images are periodically synchronized out-of-band between the racks and a small central repository.
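
A bare-metal node request can therefore be sketched as follows (the client_id is a placeholder):

```xml
<node client_id="bm0" exclusive="true">
  <!-- 'ExoGENI-M4' is the ExoGENI bare-metal type; 'raw-pc' is accepted for GENI compatibility -->
  <sliver_type name="ExoGENI-M4"/>
</node>
```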

Selecting which ORCA actor to request resources from

ORCA actors named 'SM' (Service Manager) are available from multiple locations. They implement the same logic and present a similar API (the ORCA native API, usable by Flukes, and the GENI AM API, usable by Omni and Flack), but have different levels of resource visibility. Each rack presents an SM actor that has visibility only into the resources of that rack. An actor called 'ExoSM' has global visibility across all racks and all interconnecting networks between them.

Rack SMs can create topologies only within a single rack. ExoSM can create topologies consisting of resources from many ExoGENI racks. Bare-metal nodes are available only through ExoSM.

Reasons to use ExoSM:

  • You have a topology in mind that needs multiple racks. ORCA native stitching will connect racks together.
  • You need a topology that includes bare metal nodes
  • You have a single-rack topology in mind and either
  1. don't care which rack it goes into (ExoSM will automatically find available resources if your request is not bound to a specific rack).
  2. do care which rack it goes into (you can bind your request to a specific rack)
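
Binding a node to a specific rack is done by naming that rack's aggregate in the request; in GENI RSpec terms this is the component_manager_id attribute. A sketch follows; the URN shown imitates the ExoGENI rack naming pattern but is a placeholder (check the rack list for actual values), and the client_id and size name are likewise assumptions:

```xml
<node client_id="node0"
      component_manager_id="urn:publicid:IDN+exogeni.net:bbnvmsite+authority+am">
  <sliver_type name="XOSmall"/>
</node>
```

Omitting the attribute leaves the node unbound, letting ExoSM select an available rack.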

Reasons to use individual rack SMs:

  • You have a topology that spans only one rack and you wish to use that specific rack for some reason
  • There are no resources available from ExoSM
  • You have a complex topology in mind, but you wish to use GENI stitching instead of ORCA's native slice stitching

At different times some of the racks and controllers may not be available due to e.g. maintenance events. To see which racks are available, you can visit the status page.

Compute images

Experimenters can build their own images to launch ExoGENI VMs. Images must be placed on some webserver along with an XML meta-file describing the image properties. The URL of the metafile (and a SHA-1 hash of it, for security) is what is used to identify an image in ExoGENI. ExoGENI racks then automatically download the images and launch virtual machines with them.
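
The SHA-1 hash of the metafile can be computed locally before listing the image; a minimal sketch in Python (the metafile content here is a stand-in, not a real ExoGENI metafile):

```python
import hashlib

def metafile_signature(metafile_bytes: bytes) -> str:
    """Return the SHA-1 hex digest that identifies an image metafile."""
    return hashlib.sha1(metafile_bytes).hexdigest()

# Stand-in metafile content; in practice, read the real XML metafile from disk.
metafile = b'<images><image><url>http://example.com/img.tgz</url></image></images>'
signature = metafile_signature(metafile)
print(signature)  # 40-character hex string
```

Equivalently, `sha1sum metafile.xml` on the command line produces the same digest.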

For bare-metal instances there are currently a few images available. At present there is no ability for the experimenter to build their own bare-metal image.

Useful links:

InstaGENI and ExoGENI images are currently not portable across testbeds.

Intra-rack slices

The AM interface in each rack can allocate resources only from that rack; however, slivers within it will still be stitched together (i.e. compute slivers will be connected to each other by VLANs in the user-specified topology within the rack). The SM will then produce an RSpec manifest with stitching schema extensions that external tools can use to operate on the slice or, potentially, stitch it to other slivers via e.g. FOAM.

Requests for intra-rack slices can be placed with individual rack SMs or with ExoSM. When placing requests with a rack SM, they do not have to be bound, since the rack SM can only see resources in one rack. When placing a request with ExoSM, the experimenter must explicitly bind the request to a specific site; otherwise ORCA's internal binding engine inside the SM will select an available site for the topology embedding.

Inter-rack slices

Experimenters can request inter-rack slices by submitting their requests to ExoSM, which exposes the GENI AM API. This SM has visibility into a wider pool of available resources and will perform all needed bindings of requests to available substrate, as in the case of an unbound or partially bound user-specified topology. In the case of an explicitly bound topology, ExoSM will simply attempt to fulfill the request as stated, performing the necessary stitching along the way if resources are available.

This request must be submitted either through the GENI compatibility interface or via Flukes to ExoSM (using RSpec conventions). Thanks to the global visibility of resources, ExoSM will make appropriate decisions about stitching the slice together from multiple edge resource providers and any necessary transit network providers. Rack SMs cannot fulfill this function because their visibility of resources is limited only to those contained in a single rack.

The alternative to using ExoSM is to use GENI tools to do the stitching external to ORCA using the manifest produced by the individual rack AMs.

OpenFlow slices

In ExoGENI, OpenFlow is an integrated capability of some ExoGENI network providers, including the ExoGENI sites, that is available as a slice feature on top of the basic VLAN slicing mechanism, rather than as a distinct aggregate. ExoGENI slices may designate OpenFlow controllers to direct network traffic within the virtual network topology that makes up the slice's dataplane. The ExoGENI aggregates authorize the controller automatically based on their assignments of virtual network resources to the slice's virtual topology. As an option, ExoGENI may also allow GENI experimenters to program the OpenFlow datapaths as separate aggregates (using FOAM), with manual approval by GENI administrators.

The user can specify the URL of their external OF controller in the RSpec request (the specific RSpec extension to support this is under discussion) or via Flukes. A slice will typically be created with a single broadcast domain for the entire slice (with VLAN tag remapping as needed to connect slivers from multiple ExoGENI domains). The user's controller will be able to forward packets using OpenFlow rules in the OF datapaths included in the slice by matching on header fields other than the VLAN tag, which is 'locked in' to the slice via the FlowVisor running in the rack. ORCA communicates with FlowVisor via its XMLRPC interface to convey slice information (the ports and VLAN tags involved in the slice) and the controller URL.

It is important to understand that all slices in ExoGENI are based on VLANs. Some slices have OpenFlow enabled for explicit user control of packet forwarding within those VLANs. This behavior is optional and is specified in the request.

MesoScale slices

To create a slice that includes GENI MesoScale resources (namely OpenFlow), the experimenter must attach edge resources (VM or bare-metal) to well-defined VLANs in a specific rack. These VLANs are statically configured to lead to the GENI MesoScale OpenFlow deployment. The values of the VLAN tags are site-dependent.

Experimenters must use the FOAM AM on the rack to acquire the necessary flowspace on the selected MesoScale OpenFlow VLAN. Please contact the GENI Project Office for details on using MesoScale aggregates.

Performance isolation

ExoGENI relies on industry-standard mechanisms for resource provisioning (KVM-based virtual machines, bare-metal instances, quality-of-service provisioned VLANs), all of which have well-documented performance isolation properties. Several practical factors may weaken those properties:

  • Using pools of static best-effort VLANs for connecting a rack to a dynamic circuit provider. In our experience, while pools of static VLANs provide a practical means of connecting a rack to other racks (or intermediate service providers), the reality is that these VLANs are usually best-effort, because campus and RON providers don't want to permanently reserve bandwidth for each VLAN that may remain unused. Even if a group of VLANs is given some portion of link bandwidth, they may still compete with each other for bandwidth. This means that portions of the intra-rack slices relying on such a pool of VLANs may lose performance isolation and retain only namespace isolation, affecting the repeatability of experiments. The solution is to opt as much as possible for a direct (dark-fiber) connection between the rack's dataplane switch and a dynamic circuit provider, which can then provide VLANs with reserved bandwidth.
  • The current (as of April 2012) implementation of OpenFlow in the IBM G8264R switch is high-performance and does flow matching and forwarding on the ASIC at 10Gbps speeds with thousands of flows simultaneously, but it does not support a hybrid mode in which some ports operate in OpenFlow mode while the rest operate as regular switching ports. Hybrid mode will become available as a firmware upgrade from IBM later in 2012. The OpenFlow implementation currently does not provide any performance isolation between flows (creating a virtual interface with a queue is an optional OF 1.0 spec feature and is not supported). When hybrid mode becomes available, we will use the fact that each worker node is dual-homed into the dataplane OpenFlow switch to connect one of the worker 10G interfaces to a port with OpenFlow enabled and the other to a port with traditional VLAN switching. This way, if a user wants a slice without OpenFlow, we will provide strong performance isolation by provisioning traditional QoS-enabled VLANs with specified bandwidth in the switch to the compute instances running within the node. If a user wants an OpenFlow slice, we will provision it on the OpenFlow-enabled port. We will continue working with the vendor on future implementations throughout the lifetime of ExoGENI to improve isolation properties in OpenFlow slices as the OpenFlow spec evolves and new implementations become available.

Error handling, troubleshooting and getting help

For issues with GENI credentials and other GENI aggregates please send email to help@geni.net

All errors in ExoGENI slices should be reported via the geni-orca-users mailing list. Before reporting a problem, visit these pages to see if it is a common scenario:

  • Common error scenarios and how to deal with them

When reporting a problem please include a detailed description of what you were attempting to do and attach the following to the message:

  • Slice name, which SM you were talking to and when
  • The request file (either saved request from Flukes, or an XML RSpec file)
  • If relevant (when experiencing configuration problems with e.g. interfaces or boot scripts), the output of the 'neuca-user-data' command from the virtual machine instance(s). Please indicate which node in the request the output is from.
  • Any other relevant debugging output (e.g. screenshots) that can help the operations team address your concern.

Whenever reasonable, do not delete the problem slice to give the ExoGENI operators a chance to inspect it.

Further reading

For more detailed information about how experimenters can take advantage of ORCA features, please look at these documents:

Footnotes from the feature comparison matrix:

1) Letting the control framework find available resources for your slice
2) Manual RSpec edits and use of the <pbs:services_post_boot_script type="velocity"> service
3) The 'execute' service allows variable substitutions, but does not allow macros or automatic code generation
4) The 'execute' service as applied in ExoGENI allows variable substitutions, but does not allow macros or automatic code generation
5) Creating slivers of up to several TB of iSCSI storage in your slice
6) It is possible to express a request in RSpec for a multipoint connection, but it is not possible to produce a manifest
7) The InstaGENI and Emulab substrates are capable of automatic IP address assignment; however, this is not a function of Flack.