Pseudowires (PWE3)

References:
RFC4385 - Pseudowire Emulation Edge-to-Edge (PWE3) Control Word for Use over an MPLS PSN
RFC4446 - IANA Allocations for Pseudowire Edge to Edge Emulation (PWE3)
RFC4447 - Pseudowire Setup and Maintenance Using the Label Distribution Protocol (LDP)
RFC4448 - Encapsulation Methods for Transport of Ethernet over MPLS Networks
RFC4905 - Encapsulation Methods for Transport of Layer 2 Frames over MPLS Networks
RFC4906 - Transport of Layer 2 Frames Over MPLS
RFC4761 - Virtual Private LAN Service (VPLS) Using BGP for Auto-Discovery and Signaling
RFC4762 - Virtual Private LAN Service (VPLS) Using Label Distribution Protocol (LDP) Signaling
RFC5586 - MPLS Generic Associated Channel
RFC6624 - Layer 2 Virtual Private Networks Using BGP for Auto-Discovery and Signaling
Cisco Press - Layer 2 VPN Architectures

Draft Martini and Draft Kompella
These two IETF drafts addressed two different Layer 2 scaling topologies. Neither was standardised, but between them they spurred two separate working groups to develop a series of standards that are now official and implemented by vendors. The draft names have become everyday shorthand for the two methods used to extend Layer 2 connectivity using MPLS VPNs. Both drafts use the same encapsulation stack: Service Provider’s Layer 2 header + Tunnel Label + VC Label + Control Word + Service Payload (L2 frame).

Loosely speaking, the main difference is that the Martini draft addresses point-to-point pseudowires using LDP for signalling with no auto-discovery support, whereas the Kompella draft uses BGP for both signalling and auto-discovery to provide fully meshed Layer 2 VPN services (such as VPLS). Classic Cisco IOS uses LDP for signalling without any auto-discovery, whilst Junos has traditionally used the Kompella approach with BGP signalling. Cisco supports RFC 4762 and Juniper RFC 4761. One can, for example, provide an LDP pseudowire between a Cisco PE and a Juniper PE with minimal configuration.

Draft Martini:
This draft describes how to establish a pseudowire between two attachment circuits that are located on two peering PE devices. It also specifies the encapsulation methods for each Layer 2 service. The Label Distribution Protocol (LDP) distributes MPLS labels for various MPLS applications, including pseudowire emulation. The Draft Martini architecture is concerned with creating and managing individual point-to-point pseudowires, which have no correlation to one another. It supports neither point-to-multipoint topologies nor auto-discovery (via either LDP or BGP); it uses LDP for signalling only.

Before initiating a pseudowire to a remote PE, one needs to provision the local PE with a virtual circuit (VC) ID or pseudowire ID shared by both the local and remote attachment circuits, and an IP address of the remote PE. Because the baseline LDP definition lacks the protocol elements needed for pseudowire signalling, the draft defines a pseudowire extension for LDP. A pseudowire is considered established when the peering PE devices have exchanged label information for the pseudowire. In LDP terminology this means that each PE device sends and receives a label mapping message for a given pseudowire.
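The establishment rule above can be sketched as a small state model. This is only an illustration, not a rendering of the LDP message formats: the class name, field names, label values and the 192.0.2.2 address are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pseudowire:
    """Toy model of one Martini-style pseudowire as seen from the local PE."""
    vc_id: int                          # provisioned identically on both PEs
    remote_pe: str                      # IP address of the remote PE
    local_label: Optional[int] = None   # label this PE advertised to its peer
    remote_label: Optional[int] = None  # label the peer advertised to this PE

    def send_label_mapping(self, label: int) -> None:
        """Record the label advertised to the peer for this VC ID."""
        self.local_label = label

    def receive_label_mapping(self, vc_id: int, label: int) -> None:
        """Accept a peer's mapping only if it is for this pseudowire's VC ID."""
        if vc_id == self.vc_id:
            self.remote_label = label

    @property
    def established(self) -> bool:
        """Up only once a label mapping has been both sent and received."""
        return self.local_label is not None and self.remote_label is not None


pw = Pseudowire(vc_id=100, remote_pe="192.0.2.2")
pw.send_label_mapping(16)
pw.receive_label_mapping(vc_id=200, label=99)  # different VC ID, ignored
half_open = pw.established                     # still waiting for the peer
pw.receive_label_mapping(vc_id=100, label=17)
```

After the final call `pw.established` is true, because a mapping for VC ID 100 has now been both sent and received.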

The lack of auto-discovery through a protocol like BGP means that pseudowire setup is manual (or done via an automated provisioning system) and each pseudowire must be provisioned individually. Draft Martini is thus not as scalable as Draft Kompella in terms of provisioning, for example in a full-mesh topology where one must provision P2P pseudowires between every pair of PEs.

Draft Kompella:
Unlike Draft Martini, Draft Kompella involves complex signalling procedures and algorithms, and its provisioning scheme, which is somewhat tricky, works better with some Layer 2 services than others. One major objective of Draft Kompella is to tackle the inherent scalability problem of traditional Layer 2 VPNs.

A pseudowire is needed to connect two CE devices that attach to two different PE devices. To have full connectivity among the CE devices as the number of CE devices increases, the number of pseudowires that need to be established and managed grows quadratically (n(n-1)/2 pseudowires for n CE devices). In addition, every time you add a new CE device or move an existing CE device to a different PE device, you must reconfigure all the PE devices that are participating in this VPN to maintain the full-mesh connectivity. This can become a labour-intensive task for network operators. The draft attempts to solve the scaling problem by over-provisioning the number of attachment circuits needed for current CE devices so that the existing CE and its PE devices do not need to be reconfigured when adding a new CE to a VPN. The basic premise for over-provisioning is that the attachment circuits between CE and PE devices are relatively cheap.
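To put numbers on the full-mesh growth: n CE devices need n(n-1)/2 point-to-point pseudowires, and adding one CE to an existing mesh of n devices adds n new pseudowires. A minimal sketch follows; the function names are illustrative only.

```python
def full_mesh_pseudowires(ce_count: int) -> int:
    """Point-to-point pseudowires needed to fully mesh ce_count CE devices."""
    if ce_count < 0:
        raise ValueError("ce_count must be non-negative")
    return ce_count * (ce_count - 1) // 2


def pseudowires_added_by_new_ce(existing_ce_count: int) -> int:
    """Extra pseudowires needed when one more CE joins an existing full mesh."""
    return full_mesh_pseudowires(existing_ce_count + 1) - full_mesh_pseudowires(existing_ce_count)


for n in (5, 10, 50):
    print(f"{n} CEs -> {full_mesh_pseudowires(n)} pseudowires")
```

Going from 10 to 50 CE devices multiplies the device count by 5 but the pseudowire count by roughly 27 (45 to 1225). This is the provisioning burden the draft sets out to reduce.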

For each CE, the PE advertises its own router ID, VPN ID, CE range, and label base through BGP update messages, which are broadcast to all other PE devices. Even though some PE devices might not be part of the VPN, they can receive and keep this information just in case a CE that is connected to the PE joins the VPN in the future. Because the baseline BGP standard does not readily have the necessary protocol element for pseudowire signalling, the draft defines a pseudowire extension for BGP.
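As a rough illustration of how a receiving PE could turn an advertised CE range and label base into a pseudowire label, the sketch below uses the label-block arithmetic described in RFC6624. The block offset parameter is an assumption: it comes from that RFC and is not mentioned in the text above. The function name and the numeric values are hypothetical.

```python
from typing import Optional


def label_towards_advertising_ce(
    label_base: int,
    block_offset: int,
    ce_range: int,
    own_ce_id: int,
) -> Optional[int]:
    """Pick the label a PE would use to send traffic from its own CE
    (own_ce_id) to the CE whose label block was advertised.

    Returns None when own_ce_id falls outside the advertised block, which
    means the advertising PE has not reserved a label for that CE.
    """
    if block_offset <= own_ce_id < block_offset + ce_range:
        return label_base + (own_ce_id - block_offset)
    return None


# A block over-provisioned for 8 CE IDs (1..8), even if fewer CEs exist today.
label_for_ce3 = label_towards_advertising_ce(1000, 1, 8, own_ce_id=3)
label_for_ce9 = label_towards_advertising_ce(1000, 1, 8, own_ce_id=9)
```

With these made-up values, CE 3 maps to label 1002 while CE 9 gets no label. A new CE can therefore join without touching the advertising PE only if its ID fits inside the over-provisioned range.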

This architecture (using BGP for both signalling and auto-discovery) solves the scaling problem by making the provisioning task of adding a new CE device a local matter. Remote PE devices can learn about the new CE through BGP update messages (think route reflectors). The broadcast nature of BGP makes it easy to automatically discover PE devices that are participating in Layer 2 VPNs, which further reduces the configuration on PE devices.

The weakness of this architecture comes from the validity of the assumptions it is based on, for example the low cost of attachment circuits. Where individual attachment circuits are expensive, over-provisioning becomes impractical. Also, the typical Layer 2 VPNs deployed today are rarely fully meshed, because a fully meshed flat network creates scaling problems for Layer 3 routing, where hierarchy is desired. If a Layer 2 VPN consists only of sparse point-to-point connections, advertising the information of a CE to all other PE devices and keeping it on these PE devices wastes network resources, because such information is only of interest to a single remote PE.


Pseudowire Signalling & VC Types
RFC4448 uses two MPLS pseudowire types for Ethernet, VC type 4 and VC type 5, whose values are registered in RFC4446 along with the interface parameter sub-TLVs. The ones most commonly used for EoMPLS are below.

PW type   Description
0x0004    Ethernet Tagged Mode
0x0005    Ethernet

Parameter  Length     Description
0x01       4          Interface MTU in octets
0x03       up to 82   Optional Interface Description string
0x06       4          Requested VLAN ID

RFC4448 defines two modes of Ethernet pseudowire: Ethernet Tagged Mode and Ethernet Raw Mode.

4.1. Ethernet Tagged Mode

This mode uses service-delimiting tags to map input Ethernet frames to respective PWs and corresponds to PW type 0x0004 "Ethernet Tagged Mode" as registered in RFC4446.

4.2. Ethernet Raw Mode

In this mode, all Ethernet frames received on the attachment circuit of PE1 will be transmitted to PE2 on a single PW (*tagged or untagged). This service corresponds to PW type 0x0005 "Ethernet" as registered in RFC4446.

4.3. Ethernet-Specific Interface Parameter LDP Sub-TLV

RFC4446 defines the Ethernet-specific interface parameters:

- 0x06 Requested VLAN ID Sub-TLV (from RFC4448)

An Optional 16-bit value indicating the requested VLAN ID. This parameter MUST be used by a PE that is incapable of rewriting the 802.1Q Ethernet VLAN tag on output. If the ingress PE receives this request, it MUST rewrite the VLAN ID contained inside the VLAN Tag at the input to match the requested VLAN ID. If this is not possible, and the VLAN ID does not already match the configured ingress VLAN ID, the PW MUST NOT be enabled. This parameter is applicable only to PW type 0x0004.

4.4.1. Raw Mode vs. Tagged Mode

When the PE receives an Ethernet frame, and the frame has a VLAN tag, we can distinguish two cases:

1. The tag is service-delimiting (* such as S-VLAN tag). This means that the tag was placed on the frame by some piece of service provider-operated equipment, and the tag is used by the service provider to distinguish the traffic. For example, LANs from different customers might be attached to the same service provider switch, which applies VLAN tags to distinguish one customer's traffic from another's, and then forwards the frames to the PE.

2. The tag is not service-delimiting (* such as C-VLAN tag). This means that the tag was placed in the frame by a piece of customer equipment, and is not meaningful to the PE.

Whether or not the tag is service-delimiting is determined by local configuration on the PE.

If an Ethernet PW is operating in raw mode, service-delimiting tags are NEVER sent over the PW. If a service-delimiting tag is present when the frame is received from the attachment circuit by the PE, it MUST be stripped (by the NSP) from the frame before the frame is sent to the PW.

If an Ethernet PW is operating in tagged mode, every frame sent on the PW MUST have a service-delimiting VLAN tag. If the frame as received by the PE from the attachment circuit does not have a service-delimiting VLAN tag, the PE must prepend the frame with a dummy VLAN tag before sending the frame on the PW. This is the default operating mode. This is the only REQUIRED mode.

In both modes, non-service-delimiting tags are passed transparently across the PW as part of the payload. It should be noted that a single Ethernet packet may contain more than one tag. At most, one of these tags may be service-delimiting. In any case, the NSP function may only inspect the outermost tag for the purpose of adapting the Ethernet frame to the pseudowire.

In both modes, the service-delimiting tag values have only local significance, i.e., are meaningful only at a particular PE-CE interface. When tagged mode is used, the PE that receives a frame from the PW may rewrite the tag value, or may strip the tag entirely, or may leave the tag unchanged, depending on its configuration. When raw mode is used, the PE that receives a frame may or may not need to add a service-delimiting tag before transmitting the frame on the attachment circuit; however, it MUST NOT rewrite or remove any tags that are already present.

The following table illustrates the operations that might be performed at input from the attachment circuit:

| Tag ->      | service-delimiting   | non-service-delimiting |
| Raw Mode    | 1st VLAN Tag Removed | no operation performed |
| Tagged Mode | NO OP or Tag Added   | Tag Added              |
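The table above can be expressed as a small function over a frame's VLAN tag stack. This is a simplified sketch, not an NSP implementation. The tag stack is modelled as a list of VLAN IDs with the outermost first. The dummy VLAN value of 0 is a hypothetical placeholder. Tagged mode with a service-delimiting tag is modelled as "no operation", although the table also allows a tag to be added in that case.

```python
from enum import Enum
from typing import List

DUMMY_VLAN = 0  # hypothetical placeholder for the dummy tag value


class PwMode(Enum):
    RAW = "raw"
    TAGGED = "tagged"


def tags_sent_on_pw(
    mode: PwMode,
    tags: List[int],
    outer_is_service_delimiting: bool,
) -> List[int]:
    """Return the VLAN tag stack (outermost first) carried over the PW.

    Only the outermost tag is inspected, and whether it is
    service-delimiting is a matter of local configuration on the PE.
    """
    has_sd_tag = bool(tags) and outer_is_service_delimiting
    if mode is PwMode.RAW:
        # Raw mode never carries a service-delimiting tag over the PW.
        return tags[1:] if has_sd_tag else list(tags)
    # Tagged mode: every frame on the PW must carry a service-delimiting tag.
    return list(tags) if has_sd_tag else [DUMMY_VLAN] + list(tags)


raw_sd = tags_sent_on_pw(PwMode.RAW, [100, 10], True)           # S-tag stripped
raw_plain = tags_sent_on_pw(PwMode.RAW, [10], False)            # untouched
tagged_sd = tags_sent_on_pw(PwMode.TAGGED, [100, 10], True)     # untouched
tagged_plain = tags_sent_on_pw(PwMode.TAGGED, [10], False)      # dummy prepended
```

In every branch the non-service-delimiting (customer) tags pass through unchanged, which matches the statement that they are carried transparently as payload.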


AToM Overview

Any Transport over MPLS (AToM) transports Layer 2 packets over an MPLS backbone. AToM (on Cisco) uses a directed Label Distribution Protocol (LDP) session between edge routers for setting up and maintaining connections. Forwarding uses a two-level label stack that provides switching between the edge routers. The outer label (tunnel label) is imposed at the ingress PE and routes the packet over the MPLS backbone to the egress PE. The VC label is a demultiplexing label that determines the connection at the tunnel endpoint (the particular egress interface on the egress PE as well as the VLAN identifier for an Ethernet frame).
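For illustration, the two-level label stack can be built from standard 32-bit MPLS label stack entries (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL). The function names, the label values and the TTL of 255 are assumptions for the example, not values taken from any platform.

```python
import struct


def label_stack_entry(label: int, tc: int = 0, bottom: bool = False, ttl: int = 255) -> bytes:
    """Encode one MPLS label stack entry as 4 bytes in network byte order."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8:
        raise ValueError("traffic class must fit in 3 bits")
    if not 0 <= ttl < 256:
        raise ValueError("ttl must fit in 8 bits")
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def atom_label_stack(tunnel_label: int, vc_label: int) -> bytes:
    """Tunnel label on top, VC label at the bottom of the stack."""
    return label_stack_entry(tunnel_label) + label_stack_entry(vc_label, bottom=True)


stack = atom_label_stack(tunnel_label=16, vc_label=17)
```

Only the VC label carries the bottom-of-stack bit. P routers swap the tunnel label hop by hop and never need to look at the VC label beneath it.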

When the pseudowire sequence numbers reveal out-of-order frames on a Cisco AToM tunnel, the frames are typically discarded to let the higher layer protocols signal a retransmit.


EoMPLS Overview

EoMPLS is one of the AToM transport types. EoMPLS works by encapsulating Ethernet PDUs in MPLS packets and forwarding them across the MPLS network. Each PDU is transported as a single packet. Cisco IOS supports two EoMPLS modes:

VLAN mode: Transports Ethernet traffic from a source 802.1Q VLAN to a destination 802.1Q VLAN through a single VC over an MPLS network. VLAN mode uses VC type 5 by default (no dot1q tag) and VC type 4 (transports the dot1q tag) if the remote PE does not support VC type 5 for sub-interface (VLAN) based EoMPLS.

Port mode: Allows all traffic on a port (untagged, single tagged, multiply tagged and so on) to share a single VC across an MPLS network. Port mode uses VC type 5.

VC type 4 or VC type 5 support is platform dependent; by default on Cisco IOS the capability is auto-sensed at the control plane level. VC type 5 is the default for newer platforms, such as MEF 2.0 certified Cisco platforms, which can support VC type 5 for port-tunnelled pseudowires or service instance based (VLAN based) pseudowires. VC type 4 can be hard coded using a pseudowire class map. Older platforms like a 7200 will auto-negotiate to VC type 4 when xconnecting a sub-interface, for example.

VC Type 5: In EoMPLS VLAN mode configuration, only the MPLS label is added to the packet transmitted to the MPLS core. On ingress at the remote (receiving) PE, the MPLS label stack is popped before transmission on the internal bus. If Port mode is configured instead, the 802.1Q tag is also carried along with the MPLS label. No VLAN tag is imposed if there is none on the frame received on the ingress AC.

VC Type 4: The original 802.1Q tag is inserted in the EoMPLS payload (along with the MPLS label) before forwarding it to the MPLS core. On ingress at the remote (receiving) PE, the 802.1Q tag is stripped off before transmission to the internal bus. If a packet is received from the MPLS core without a tag (the EtherType of the packet is other than 0x8100), the packet is dropped. Some platforms will insert a dummy VLAN tag onto the ingress Ethernet frame VLAN stack even if a VLAN tag is present, and the egress PE is required to pop the dummy VLAN tag. VC type 4 pseudowires are generally being phased out as they only support tagged frames. Newer platforms like the Cisco ASR1000 series routers don't support VC type 4 xconnects (even though they will come "up") because they can't pop the dummy tag imposed by the ingress PE.
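The VC type 4 receive behaviour described above (drop anything without an 802.1Q tag, otherwise strip the tag) can be sketched as follows. This is a simplification that assumes the pseudowire payload starts at the Ethernet destination MAC and only recognises TPID 0x8100. The function name and sample frames are hypothetical, and this is not a description of any particular platform's data path.

```python
import struct
from typing import Optional

TPID_8021Q = 0x8100
MAC_HEADER_LEN = 12   # destination MAC + source MAC
VLAN_TAG_LEN = 4      # TPID (2 bytes) + TCI (2 bytes)


def vc_type4_receive(frame: bytes) -> Optional[bytes]:
    """Return the frame with its outer 802.1Q tag removed, or None to drop it."""
    if len(frame) < MAC_HEADER_LEN + VLAN_TAG_LEN + 2:
        return None  # too short to hold a tag plus an inner EtherType
    (tpid,) = struct.unpack_from("!H", frame, MAC_HEADER_LEN)
    if tpid != TPID_8021Q:
        return None  # untagged frame on a tagged-mode pseudowire: drop
    return frame[:MAC_HEADER_LEN] + frame[MAC_HEADER_LEN + VLAN_TAG_LEN:]


macs = bytes.fromhex("ffffffffffff" "020000000001")
tagged = macs + bytes.fromhex("81000064" "0800") + b"payload"
untagged = macs + bytes.fromhex("0800") + b"payload"

stripped = vc_type4_receive(tagged)
dropped = vc_type4_receive(untagged)
```

Here `stripped` is the original frame minus the four tag bytes and `dropped` is None, which illustrates why VC type 4 cannot carry untagged traffic.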


Pseudowire Control Word
RFC4385 defines the Pseudowire MPLS Control Word (PWMCW) and the Pseudowire Associated Channel Header (PWACH). Loosely summarising, the control word sits between the VC MPLS label and the pseudowire payload, and provides a mechanism to stop ECMP load-balancing from reordering traffic that is sensitive (read: intolerant) to packet/frame mis-ordering.

The first nibble of an L3 VPN payload might be a four or a six to indicate the start of an IPv4 or IPv6 packet header. When carrying Layer 2 traffic over a pseudowire it could be anything. When the PWMCW is present and the first nibble is zero, this indicates to LSRs along the LSP that the MPLS payload is potentially not IPv4 or IPv6 directly above the MPLS label stack. The LSRs should not look into the MPLS payload for a key to use in a load-balancing hash for ECMP processing.

The control word contains a sequence number field so that, where supported, PEs can provide basic sequencing features such as detecting whether transmitted frames are being received in order.

If the first nibble in the control word is 1, the control word is a Pseudowire Associated Channel Header, which indicates that the MPLS payload belongs to an associated channel along the LSP rather than the main service channel. This header allows multiple "channels" to be multiplexed over the LSP; the PWACH contains a channel type which acts as the separation mechanism.
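The first-nibble logic can be sketched as a parser for the 32-bit word that follows the VC label. The field layout follows RFC4385 as understood here: 4 flag bits, 2 fragmentation bits, a 6-bit length and a 16-bit sequence number for the control word, and a 4-bit version, 8 reserved bits and a 16-bit channel type for the associated channel header. The class and function names are illustrative.

```python
import struct
from dataclasses import dataclass
from typing import Union


@dataclass
class ControlWord:
    flags: int
    frg: int
    length: int
    sequence: int


@dataclass
class AssociatedChannelHeader:
    version: int
    channel_type: int


def parse_first_word(data: bytes) -> Union[ControlWord, AssociatedChannelHeader]:
    """Classify and decode the 4 bytes that follow the bottom-of-stack label."""
    (word,) = struct.unpack_from("!I", data, 0)
    first_nibble = word >> 28
    if first_nibble == 0:
        return ControlWord(
            flags=(word >> 24) & 0xF,
            frg=(word >> 22) & 0x3,
            length=(word >> 16) & 0x3F,
            sequence=word & 0xFFFF,
        )
    if first_nibble == 1:
        return AssociatedChannelHeader(
            version=(word >> 24) & 0xF,
            channel_type=word & 0xFFFF,
        )
    # A 4 or 6 here would look like a bare IPv4/IPv6 header to an LSR.
    raise ValueError(f"no control word present (first nibble {first_nibble})")


data_cw = parse_first_word(bytes.fromhex("00000005"))
ach = parse_first_word(bytes.fromhex("10000021"))
```

A transit LSR performing ECMP only needs the first nibble check. The remaining fields matter to the PEs at either end of the pseudowire.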
